Random test failed: System.NotSupportedException : The WriteAsync method cannot be called when another write operation is pending. #46

Closed
yyjdelete opened this issue May 22, 2021 · 5 comments · Fixed by #47

Comments

@yyjdelete
Collaborator

Test failed

This looks like a separate issue, unrelated to #41.
net_core_tests_mac_1015> DotNetty.Handlers.Tests.TlsHandlerTest.TlsWrite(frameLengths: [14465, 9801, 16911, -1, 7057, ...], isClient: False, serverProtocol: Tls11, clientProtocol: Tls11)

System.AggregateException : One or more errors occurred. ( The WriteAsync method cannot be called when another write operation is pending.)
---- System.NotSupportedException :  The WriteAsync method cannot be called when another write operation is pending.


Stack trace
   at DotNetty.Transport.Channels.Embedded.EmbeddedChannel.CheckException(IPromise promise) in /Users/runner/work/1/s/src/DotNetty.Transport/Channels/Embedded/EmbeddedChannel.cs:line 631
   at DotNetty.Transport.Channels.Embedded.EmbeddedChannel.CheckException() in /Users/runner/work/1/s/src/DotNetty.Transport/Channels/Embedded/EmbeddedChannel.cs:line 638
   at DotNetty.Transport.Channels.Embedded.EmbeddedChannel.WriteOutbound(Object[] msgs) in /Users/runner/work/1/s/src/DotNetty.Transport/Channels/Embedded/EmbeddedChannel.cs:line 448
   at DotNetty.Handlers.Tests.TlsHandlerTest.TlsWrite(Int32[] frameLengths, Boolean isClient, SslProtocols serverProtocol, SslProtocols clientProtocol) in /Users/runner/work/1/s/test/DotNetty.Handlers.Tests/TlsHandlerTest.cs:line 203
   at DotNetty.Handlers.Tests.TlsHandlerTest.TlsWrite(Int32[] frameLengths, Boolean isClient, SslProtocols serverProtocol, SslProtocols clientProtocol) in /Users/runner/work/1/s/test/DotNetty.Handlers.Tests/TlsHandlerTest.cs:line 232
--- End of stack trace from previous location ---
----- Inner Stack Trace -----
   at System.Net.Security.SslStream.WriteAsyncInternal[TIOAdapter](TIOAdapter writeAdapter, ReadOnlyMemory`1 buffer)
   at System.Net.Security.SslStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.Write(ReadOnlySpan`1 buffer)
   at DotNetty.Buffers.UnpooledHeapByteBuffer.GetBytes(Int32 index, Stream destination, Int32 length) in /Users/runner/work/1/s/src/DotNetty.Buffers/UnpooledHeapByteBuffer.cs:line 173
   at DotNetty.Buffers.AbstractByteBuffer.ReadBytes(Stream destination, Int32 length) in /Users/runner/work/1/s/src/DotNetty.Buffers/AbstractByteBuffer.cs:line 940
   at DotNetty.Handlers.Tls.TlsHandler.Wrap(IChannelHandlerContext context) in /Users/runner/work/1/s/src/DotNetty.Handlers/Tls/TlsHandler.Writer.cs:line 148
   at DotNetty.Handlers.Tls.TlsHandler.WrapAndFlush(IChannelHandlerContext context) in /Users/runner/work/1/s/src/DotNetty.Handlers/Tls/TlsHandler.Writer.cs:line 108
   at DotNetty.Handlers.Tls.TlsHandler.Flush(IChannelHandlerContext context) in /Users/runner/work/1/s/src/DotNetty.Handlers/Tls/TlsHandler.Writer.cs:line 66
--- End of stack trace from previous location ---
   at DotNetty.Handlers.Tls.TlsHandler.Flush(IChannelHandlerContext context) in /Users/runner/work/1/s/src/DotNetty.Handlers/Tls/TlsHandler.Writer.cs:line 72
   at DotNetty.Transport.Channels.AbstractChannelHandlerContext.InvokeFlush0() in /Users/runner/work/1/s/src/DotNetty.Transport/Channels/AbstractChannelHandlerContext.cs:line 871
@yyjdelete
Collaborator Author

I'm not sure, but it seems TlsHandler.Wrap can only be called from TlsHandler.HandleHandshakeCompleted and Flush. The latter already runs on the loop thread, so HandleHandshakeCompleted should probably be executed on the loop thread as well to avoid concurrent writes. At the moment it runs on whichever thread AuthenticateAsServerAsync/AuthenticateAsClientAsync completes on.
https://github.com/cuteant/SpanNetty/blob/7e2252b2dfe5cbaf36bf6c15b8219b98685c2bdb/src/DotNetty.Handlers/Tls/TlsHandler.Handshake.cs
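
The idea of moving the handshake-completed callback onto the loop thread can be sketched as follows. This is only a self-contained illustration, not SpanNetty code: `SingleThreadLoop`, `HandshakeMarshalling` and `CompleteOnLoop` are hypothetical names standing in for the channel's event loop and for whatever hook runs after the authenticate task finishes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a channel's event loop: one thread drains a queue.
public sealed class SingleThreadLoop : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _thread;

    public SingleThreadLoop()
    {
        _thread = new Thread(() =>
        {
            foreach (var action in _queue.GetConsumingEnumerable()) action();
        }) { IsBackground = true };
        _thread.Start();
    }

    // True only when the caller is already running on the loop thread.
    public bool InEventLoop => Thread.CurrentThread == _thread;

    public void Execute(Action action) => _queue.Add(action);

    public void Dispose()
    {
        _queue.CompleteAdding();
        _thread.Join();
    }
}

public static class HandshakeMarshalling
{
    // Runs onCompleted on the loop thread once the handshake task finishes,
    // regardless of which thread the task happens to complete on.
    // The returned task reports whether the callback really ran on the loop.
    public static Task<bool> CompleteOnLoop(Task handshake, SingleThreadLoop loop, Action onCompleted)
    {
        var done = new TaskCompletionSource<bool>();

        void Run()
        {
            try { onCompleted(); done.TrySetResult(loop.InEventLoop); }
            catch (Exception ex) { done.TrySetException(ex); }
        }

        handshake.ContinueWith(_ =>
        {
            if (loop.InEventLoop) Run();   // already on the loop: run inline
            else loop.Execute(Run);        // otherwise hand over to the loop
        }, TaskContinuationOptions.ExecuteSynchronously);

        return done.Task;
    }
}
```

With this shape, a flush issued from the loop thread and the post-handshake wrap can no longer overlap, because both are serialized through the same queue.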

@cuteant
Owner

cuteant commented May 22, 2021

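I really can't figure out what limitations Azure Pipelines has, and I couldn't find any relevant documentation.
The tests listed in #19 all pass locally on Windows. Locally on Ubuntu, some of the Suite.Tests fail because of libuv.
I have already disabled DotNetty.Suite.Tests and Transport.Tests in Azure Pipelines, even though the same tests run without any problem on AppVeyor.
Besides TlsHandlerTest, End2EndTests.MqttServerAndClient also fails frequently.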

@yyjdelete
Collaborator Author

yyjdelete commented May 22, 2021

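  1. For the TLS test, I think we could first run the native SslStream once to check which protocol versions the system supports and then filter the test cases accordingly, or simply run only Tls12. (After all, these test combinations only verify that the specified SslProtocols value is actually passed to SslStream; verifying that SslStream itself works is the job of dotnet's own tests.)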

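  2. However, when I ran the tests locally on Ubuntu 20.04, the handshake also failed for tls12 and tls13 (tls10 and tls11 were already disabled and not expected to work).
    With Ubuntu as the client and Windows as the server, it works.
    With Ubuntu as both server and client, it reports Interop+Crypto+OpenSslCryptographicException: error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version.
    With Windows as the client and Ubuntu as the server, it reports Using SSL certificate failed with OpenSSL error - ee key too small.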

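I then generated a new certificate with dotnet dev-certs https -ep localhost.pfx, and with that new certificate everything works.
I suspect dotnetty.com.pfx and contoso.com.pfx need to be regenerated as well. Perhaps the SHA-1 or RSA-1024 used by these old certificates has been deprecated, which would explain why OpenSSL fails to load them.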
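
One way to check that suspicion is to inspect a certificate's RSA key size and signature algorithm before handing it to SslStream. The sketch below assumes the failure really does come from a short RSA key or a SHA-1 signature; `CertificateCheck`, `DescribeWeakness` and the 2048-bit default threshold are illustrative choices, not part of the test suite. The helper `CreateSelfSigned` only exists so the check can be tried without a .pfx file.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;

public static class CertificateCheck
{
    // OID of sha1WithRSAEncryption.
    private const string Sha1RsaOid = "1.2.840.113549.1.1.5";

    // Returns a short description of an obvious weakness (short RSA key or
    // SHA-1 signature), or null if neither was found.
    public static string DescribeWeakness(X509Certificate2 cert, int minRsaBits = 2048)
    {
        using (RSA rsa = cert.GetRSAPublicKey())
        {
            if (rsa != null && rsa.KeySize < minRsaBits)
                return $"RSA key is {rsa.KeySize} bits (< {minRsaBits})";
        }

        Oid algorithm = cert.SignatureAlgorithm;
        string name = algorithm.FriendlyName ?? string.Empty;
        if (algorithm.Value == Sha1RsaOid ||
            name.IndexOf("sha1", StringComparison.OrdinalIgnoreCase) >= 0)
            return $"signature algorithm is {(name.Length > 0 ? name : algorithm.Value)}";

        return null;
    }

    // Builds a throwaway self-signed certificate for trying the check out.
    public static X509Certificate2 CreateSelfSigned(int rsaBits, HashAlgorithmName hash)
    {
        using (RSA rsa = RSA.Create(rsaBits))
        {
            var request = new CertificateRequest("CN=localhost", rsa, hash, RSASignaturePadding.Pkcs1);
            return request.CreateSelfSigned(DateTimeOffset.UtcNow.AddDays(-1),
                                            DateTimeOffset.UtcNow.AddDays(30));
        }
    }
}
```

For the existing test certificates, the same check could be applied to an `X509Certificate2` loaded from the .pfx file with its password.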

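  3. The MqttServerAndClient timeout probably depends on machine performance or the number of CPU cores. I remember running the tests on a single-core VM once, and a whole bunch of them failed...
    Adding the following as the first line of MqttServerAndClient should reproduce the same effect:
    System.Diagnostics.Process.GetCurrentProcess().ProcessorAffinity = new IntPtr(0x01);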

@cuteant
Owner

cuteant commented May 22, 2021

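My Ubuntu 18.04 VM crashed; I'll set up a new one tomorrow and test again. For MqttServerAndClient I'm thinking of adding a conditional compilation symbol so that it only runs locally and is skipped in Azure Pipelines.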

@cuteant cuteant reopened this May 23, 2021
@yyjdelete
Collaborator Author

yyjdelete commented May 23, 2021

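@cuteant I debugged MqttServerAndClient locally, and it looks like the problem that my PR on the DotNetty side was meant to fix: when TlsHandler.MediationStream.ResetSource is called, the data in _input has not been fully read yet, so the unread encrypted data is discarded permanently, and the read eventually times out because of the missing data.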

            public void ResetSource()
            {
                Debug.Assert(SourceReadableBytes == 0); // once added to the netcore variant of the file, this Assert fires

                _input = null;
                _inputLength = 0;
                _inputOffset = 0;
            }

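I had always assumed this had already been merged. When I have time I can look into how to port it over. The use of ownerBuffer in that PR doesn't look very clean either, and I'm not sure whether the code for all three targets (netcore/netfx/netstandard2.0) needs to change...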

Azure/DotNetty#374
The specific commit is
Azure/DotNetty@1a203bc
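
To make the data-loss scenario concrete, here is a minimal, self-contained sketch of an input holder that keeps unread bytes when its source is reset. It is not the change from the PR above and not the real MediationStream: `PendingInput` and all of its members are hypothetical, and copying the leftover bytes is just one possible way to avoid dropping them.

```csharp
using System;

// Hypothetical, simplified stand-in for the input side of a mediation stream.
public sealed class PendingInput
{
    private byte[] _input = Array.Empty<byte>();
    private int _offset;
    private int _length;

    public int ReadableBytes => _length;

    public void SetSource(byte[] source, int offset, int length)
    {
        if (_length == 0)
        {
            _input = source; _offset = offset; _length = length;
            return;
        }
        // Unread bytes remain: append the new data instead of overwriting.
        var merged = new byte[_length + length];
        Buffer.BlockCopy(_input, _offset, merged, 0, _length);
        Buffer.BlockCopy(source, offset, merged, _length, length);
        _input = merged; _offset = 0; _length = merged.Length;
    }

    public int Read(byte[] destination, int offset, int count)
    {
        int n = Math.Min(count, _length);
        Buffer.BlockCopy(_input, _offset, destination, offset, n);
        _offset += n; _length -= n;
        return n;
    }

    // Releases the caller's buffer. Any unread bytes are first copied into a
    // private array, so resetting never throws data away.
    public void ResetSource()
    {
        if (_length > 0)
        {
            var leftover = new byte[_length];
            Buffer.BlockCopy(_input, _offset, leftover, 0, _length);
            _input = leftover;
            _offset = 0;
            return;
        }
        _input = Array.Empty<byte>();
        _offset = 0;
    }
}
```

The trade-off in this sketch is an extra copy on the slow path in exchange for never losing encrypted bytes; a real fix would also have to respect the buffer ownership rules the handler already follows.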
