Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exit code 139 on 'dotnet new' #34231

Closed
BrennanConroy opened this issue Mar 28, 2020 · 32 comments · Fixed by #35925
Closed

Exit code 139 on 'dotnet new' #34231

BrennanConroy opened this issue Mar 28, 2020 · 32 comments · Fixed by #35925
Labels
area-System.Net.Security bug os-linux Linux OS (any supported distro)
Milestone

Comments

@BrennanConroy
Copy link
Member

Looks like the same issue as #28932 which was fixed in 2.1 and 2.2?

Core dump: https://dev.azure.com/dnceng/internal/_build/results?buildId=574320&view=ms.vss-test-web.build-test-results-tab&runId=18013826&resultId=120463&paneView=attachments
and
https://dev.azure.com/dnceng/internal/_build/results?buildId=564240&view=ms.vss-test-web.build-test-results-tab&runId=17753940&resultId=120476&paneView=attachments

I took a look at the first core dump:
dotnet dump analyze shows:

00007FB47A9A46A0                  [InlinedCallFrame: 00007fb47a9a46a0] System.Net.Security.dll!Interop+Crypto.ErrReasonErrorString(UInt64)
00007FB47A9A46A0                  [InlinedCallFrame: 00007fb47a9a46a0] System.Net.Security.dll!Interop+Crypto.ErrReasonErrorString(UInt64)
00007FB47A9A4690 00007FB40C94DB9A System.IO.FileSystem.dll!ILStubClass.IL_STUB_PInvoke(UInt64) + 106
00007FB47A9A4720 00007FB409FBB487 System.Net.Security.dll!Interop+OpenSsl.CreateSslException(System.String) + 39 [/_/src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs @ 625]
00007FB47A9A4750 00007FB409FB4D69 System.Net.Security.dll!Interop+OpenSsl.AllocateSslContext(System.Security.Authentication.SslProtocols, Microsoft.Win32.SafeHandles.SafeX509Handle, System.Security.Cryptography.SafeEvpPKeyHandle, System.Net.Security.EncryptionPolicy, System.Net.Security.SslAuthenticationOptions) + 185 [/_/src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs @ 69]
00007FB47A9A47D0 00007FB409FB4C5D System.Net.Security.dll!System.Net.Security.SafeDeleteSslContext..ctor(System.Net.Security.SafeFreeSslCredentials, System.Net.Security.SslAuthenticationOptions) + 141 [/_/src/libraries/Common/src/System/Net/Security/Unix/SafeDeleteSslContext.cs @ 30]
00007FB47A9A4810 00007FB409FD7C5B System.Net.Security.dll!System.Net.Security.SslStreamPal.HandshakeInternal(System.Net.Security.SafeFreeCredentials, System.Net.Security.SafeDeleteSslContext ByRef, System.ReadOnlySpan`1<Byte>, Byte[] ByRef, System.Net.Security.SslAuthenticationOptions) + 155 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SslStreamPal.Unix.cs @ 112]

gdb shows:
#0 0x00007fb482f382c2 in __pthread_mutex_trylock (mutex=0x0) at ../nptl/pthread_mutex_trylock.c:172 for the first frame

Runtime version 5.0.0-preview.2.20119.9

@bartonjs @janvorli

@wfurt
Copy link
Member

wfurt commented Mar 31, 2020

cc: @janvorli

@wfurt wfurt added os-linux Linux OS (any supported distro) and removed os-alpine labels Mar 31, 2020
@ryanbrandenburg
Copy link
Contributor

Possibly related, ran into 139 when running dotnet nuget list source. Ctr

'/home/helixbot/work/A22508F7/w/BE2309D4/e/.dotnet18061/dotnet nuget list source' completed with exit code '139'
Exception in InstallAspNetAppIfNeeded: System.InvalidOperationException: Command /home/helixbot/work/A22508F7/w/BE2309D4/e/.dotnet18061/dotnet nuget list source returned exit code 139 output: 
   at RunTests.ProcessUtil.RunAsync(String filename, String arguments, String workingDirectory, Boolean throwOnError, IDictionary`2 environmentVariables, Action`1 outputDataReceived, Action`1 errorDataReceived, Action`1 onStart, CancellationToken cancellationToken) in /home/helixbot/work/A22508F7/w/BE2309D4/e/RunTests/ProcessUtil.cs:line 144
   at RunTests.ProcessUtil.RunAsync(String filename, String arguments, String workingDirectory, Boolean throwOnError, IDictionary`2 environmentVariables, Action`1 outputDataReceived, Action`1 errorDataReceived, Action`1 onStart, CancellationToken cancellationToken) in /home/helixbot/work/A22508F7/w/BE2309D4/e/RunTests/ProcessUtil.cs:line 144
   at RunTests.TestRunner.InstallAspNetAppIfNeededAsync() in /home/helixbot/work/A22508F7/w/BE2309D4/e/RunTests/TestRunner.cs:line 134

@Pilchie
Copy link
Member

Pilchie commented Apr 27, 2020

@karelz @danmosemsft - can we prioritize investigation here? This is causing infrastructure flakiness for ASP.NET Core.

@wfurt
Copy link
Member

wfurt commented Apr 27, 2020

What is base OS version? Is there some better way how to reproduce beside CI?

@BrennanConroy
Copy link
Member Author

What is base OS version?

Well the two builds I linked are no longer around because it's been a month. Luckily I think I still have one of the core dumps and it loads on my Ubuntu machine.

Is there some better way how to reproduce beside CI?

Nope, very random and infrequent.

@ryanbrandenburg
Copy link
Contributor

What is base OS version?

the one I linked had QueueId ubuntu.1804.amd64.open

@janvorli
Copy link
Member

@BrennanConroy can you please share the core so that I can take a deeper look?

@BrennanConroy
Copy link
Member Author

Sent an email.

@ghost
Copy link

ghost commented Apr 28, 2020

Tagging subscribers to this area: @dotnet/ncl
Notify danmosemsft if you want to be subscribed.

@karelz
Copy link
Member

karelz commented Apr 28, 2020

@Pilchie do you have more details how often it happens?
@wfurt can you please take a look with @janvorli?

@karelz karelz added this to the 5.0 milestone Apr 28, 2020
@karelz karelz added bug and removed untriaged New issue has not been triaged by the area owner labels Apr 28, 2020
@karelz
Copy link
Member

karelz commented Apr 28, 2020

@ryanbrandenburg why do you think your callstack in #34231 (comment) is related to this issue? It is because the symptom (exit code) is the same or is there more to it?

@ryanbrandenburg
Copy link
Contributor

The exit code is the only connection I know of. That call-stack is purely on our end and was included only in case looking at the site that calls it might be helpful to the investigator.

@BrennanConroy was the one who originally called out the possible connection, so he might have done a more in-depth analysis.

@janvorli
Copy link
Member

Exit code 139 means "access violation", so it can be effectively anything. But this is what makes it point to the old issue fixed in 2.1 / 2.2:

gdb shows:
#0 0x00007fb482f382c2 in __pthread_mutex_trylock (mutex=0x0) at ../nptl/pthread_mutex_trylock.c:172 for the first frame

@janvorli
Copy link
Member

Here is the top of stack trace for the old fixed issue:

        Child SP               IP Call Site
00007FA39E11F0E8 00007FA3A290F3AD ld-musl-x86_64.so.1!pthread_rwlock_tryrdlock at /home/buildozer/aports/main/musl/src/musl-1.1.20/src/thread/pthread_rwlock_tryrdlock.c:7
00007FA39E11F0F0 00007FA3A290F2BF ld-musl-x86_64.so.1!pthread_rwlock_timedrdlock + 14 at /home/buildozer/aports/main/musl/src/musl-1.1.20/src/thread/pthread_rwlock_timedrdlock.c:8
00007FA39E11F110 00007FA39DA1AC71 libcrypto.so.1.1!CRYPTO_THREAD_read_lock + 6
00007FA39E11F120 00007FA39D9A94EB libcrypto.so.1.1!___lldb_unnamed_symbol1122$$libcrypto.so.1.1 + 16
00007FA39E11F130 00007FA39D9A9940 libcrypto.so.1.1!ERR_reason_error_string + 93

And here is the full stack trace from the current dump:

(lldb) clrstack -f
OS Thread Id: 0x55b4 (1)
        Child SP               IP Call Site
00007FB47A9A45F0 00007FB482F382C2 libpthread.so.0!__GI___pthread_rwlock_rdlock + 18
00007FB47A9A4630 00007FB4062E5869 libcrypto.so.1.1!CRYPTO_THREAD_read_lock + 9
00007FB47A9A4640 00007FB40625D820 libcrypto.so.1.1!___lldb_unnamed_symbol1102$$libcrypto.so.1.1 + 16
00007FB47A9A4650 00007FB40625DD39 libcrypto.so.1.1!ERR_reason_error_string + 89
00007FB47A9A4690 00007FB40C94DB9A
00007FB47A9A46A0                  [InlinedCallFrame: 00007fb47a9a46a0] System.Net.Security.dll!Interop+Crypto.ErrReasonErrorString(UInt64)
00007FB47A9A46A0                  [InlinedCallFrame: 00007fb47a9a46a0] System.Net.Security.dll!Interop+Crypto.ErrReasonErrorString(UInt64)
00007FB47A9A4690 00007FB40C94DB9A System.IO.FileSystem.dll!ILStubClass.IL_STUB_PInvoke(UInt64) + 106
00007FB47A9A4720 00007FB409FBB487 System.Net.Security.dll!Interop+OpenSsl.CreateSslException(System.String) + 39 [/_/src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs @ 625]
00007FB47A9A4750 00007FB409FB4D69 System.Net.Security.dll!Interop+OpenSsl.AllocateSslContext(System.Security.Authentication.SslProtocols, Microsoft.Win32.SafeHandles.SafeX509Handle, System.Security.Cryptography.SafeEvpPKeyHandle, System.Net.Security.EncryptionPolicy, System.Net.Security.SslAuthenticationOptions) + 185 [/_/src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs @ 69]
00007FB47A9A47D0 00007FB409FB4C5D System.Net.Security.dll!System.Net.Security.SafeDeleteSslContext..ctor(System.Net.Security.SafeFreeSslCredentials, System.Net.Security.SslAuthenticationOptions) + 141 [/_/src/libraries/Common/src/System/Net/Security/Unix/SafeDeleteSslContext.cs @ 30]
00007FB47A9A4810 00007FB409FD7C5B System.Net.Security.dll!System.Net.Security.SslStreamPal.HandshakeInternal(System.Net.Security.SafeFreeCredentials, System.Net.Security.SafeDeleteSslContext ByRef, System.ReadOnlySpan`1<Byte>, Byte[] ByRef, System.Net.Security.SslAuthenticationOptions) + 155 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SslStreamPal.Unix.cs @ 112]
00007FB47A9A48D0 00007FB409FCF8D9 System.Net.Security.dll!System.Net.Security.SecureChannel.GenerateToken(System.ReadOnlySpan`1<Byte>, Byte[] ByRef) + 505 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SecureChannel.cs @ 802]
00007FB47A9A4980 00007FB409FCF504 System.Net.Security.dll!System.Net.Security.SecureChannel.NextMessage(System.ReadOnlySpan`1<Byte>) + 132 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SecureChannel.cs @ 726]
00007FB47A9A4A10 00007FB409FDA43A System.Net.Security.dll!System.Net.Security.SslStream+<ForceAuthenticationAsync>d__177`1[[System.Net.Security.SslStream+AsyncSslIOAdapter, System.Net.Security]].MoveNext() + 298 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs @ 251]
00007FB47A9A4C00 00007FB40C6BB31A System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.Net.Security.SslStream+<ForceAuthenticationAsync>d__177`1[[System.Net.Security.SslStream+AsyncSslIOAdapter, System.Net.Security]], System.Net.Security]](<ForceAuthenticationAsync>d__177`1<AsyncSslIOAdapter> ByRef) + 122 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs @ 63]
00007FB47A9A4C70 00007FB40C6BB27B System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder.Start[[System.Net.Security.SslStream+<ForceAuthenticationAsync>d__177`1[[System.Net.Security.SslStream+AsyncSslIOAdapter, System.Net.Security]], System.Net.Security]](<ForceAuthenticationAsync>d__177`1<AsyncSslIOAdapter> ByRef) + 27 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilder.cs @ 34]
00007FB47A9A4C90 00007FB409FC8ED7 System.Net.Security.dll!System.Net.Security.SslStream.ForceAuthenticationAsync[[System.Net.Security.SslStream+AsyncSslIOAdapter, System.Net.Security]](AsyncSslIOAdapter, Boolean, Byte[], Boolean) + 135
00007FB47A9A4D30 00007FB409FC7506 System.Net.Security.dll!System.Net.Security.SslStream.ProcessAuthentication(Boolean, Boolean, System.Threading.CancellationToken) + 134 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.Implementation.cs @ 204]
00007FB47A9A4D80 00007FB409FB2F6F System.Net.Security.dll!System.Net.Security.SslStream.AuthenticateAsClientAsync(System.Net.Security.SslClientAuthenticationOptions, System.Threading.CancellationToken) + 95 [/_/src/libraries/System.Net.Security/src/System/Net/Security/SslStream.cs @ 372]
00007FB47A9A4DB0 00007FB409DB046F System.Net.Http.dll!System.Net.Http.ConnectHelper+<EstablishSslConnectionAsyncCore>d__4.MoveNext() + 143 [/_/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ConnectHelper.cs @ 152]
00007FB47A9A4E30 00007FB40C6BAB4A System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.Net.Http.ConnectHelper+<EstablishSslConnectionAsyncCore>d__4, System.Net.Http]](<EstablishSslConnectionAsyncCore>d__4 ByRef) + 122 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncMethodBuilderCore.cs @ 63]
00007FB47A9A4EA0 00007FB40C6BAAAF System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib]].Start[[System.Net.Http.ConnectHelper+<EstablishSslConnectionAsyncCore>d__4, System.Net.Http]](<EstablishSslConnectionAsyncCore>d__4 ByRef) + 31 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncValueTaskMethodBuilderT.cs @ 44]
00007FB47A9A4ED0 00007FB409DB03BB System.Net.Http.dll!System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(System.IO.Stream, System.Net.Security.SslClientAuthenticationOptions, System.Threading.CancellationToken) + 139
00007FB47A9A4F60 00007FB409DB031B System.Net.Http.dll!System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(System.Net.Security.SslClientAuthenticationOptions, System.Net.Http.HttpRequestMessage, System.IO.Stream, System.Threading.CancellationToken) + 219 [/_/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ConnectHelper.cs @ 138]
00007FB47A9A4FB0 00007FB409E08F3B System.Net.Http.dll!System.Net.Http.HttpConnectionPool+<ConnectAsync>d__75.MoveNext() + 1611 [/_/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnectionPool.cs @ 1074]
00007FB47A9A5190 00007FB40C6BAA70 System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.ValueTuple`4[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]], System.Private.CoreLib],[System.Net.Http.HttpConnectionPool+<ConnectAsync>d__75, System.Net.Http]].ExecutionContextCallback(System.Object) + 128 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilderT.cs @ 291]
00007FB47A9A51D0 00007FB407B53B91 System.Private.CoreLib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) + 129 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs @ 172]
00007FB47A9A5220 00007FB40C6BA84A System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.ValueTuple`4[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]], System.Private.CoreLib],[System.Net.Http.HttpConnectionPool+<ConnectAsync>d__75, System.Net.Http]].MoveNext(System.Threading.Thread) + 202 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilderT.cs @ 366]
00007FB47A9A5290 00007FB40C6BA759 System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.ValueTuple`4[[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib],[System.__Canon, System.Private.CoreLib]], System.Private.CoreLib],[System.Net.Http.HttpConnectionPool+<ConnectAsync>d__75, System.Net.Http]].MoveNext() + 25 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilderT.cs @ 307]
00007FB47A9A52B0 00007FB407B70D6A System.Private.CoreLib.dll!System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Runtime.CompilerServices.IAsyncStateMachineBox, Boolean) + 218 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/TaskContinuation.cs @ 785]
00007FB47A9A52F0 00007FB407B6A2BB System.Private.CoreLib.dll!System.Threading.Tasks.Task.RunContinuations(System.Object) + 155 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 3236]
00007FB47A9A53A0 00007FB407B6A219 System.Private.CoreLib.dll!System.Threading.Tasks.Task.FinishContinuations() + 57 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 3208]
00007FB47A9A53C0 00007FB407B61F8D System.Private.CoreLib.dll!System.Threading.Tasks.Task`1[[System.__Canon, System.Private.CoreLib]].TrySetResult(System.__Canon) + 157 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs @ 398]
00007FB47A9A53F0 00007FB407BC80F8 System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncValueTaskMethodBuilder`1[[System.__Canon, System.Private.CoreLib]].SetResult(System.__Canon) + 152 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncValueTaskMethodBuilderT.cs @ 66]
00007FB47A9A5430 00007FB409DBC8B9 System.Net.Http.dll!System.Net.Http.ConnectHelper+<ConnectAsync>d__1.MoveNext() + 1017 [/_/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ConnectHelper.cs @ 81]
00007FB47A9A54E0 00007FB40C6BA5C0 System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.__Canon, System.Private.CoreLib],[System.Net.Http.ConnectHelper+<ConnectAsync>d__1, System.Net.Http]].ExecutionContextCallback(System.Object) + 128 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilderT.cs @ 291]
00007FB47A9A5520 00007FB407B53B91 System.Private.CoreLib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) + 129 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs @ 172]
00007FB47A9A5570 00007FB40C6BA37D System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.__Canon, System.Private.CoreLib],[System.Net.Http.ConnectHelper+<ConnectAsync>d__1, System.Net.Http]].MoveNext(System.Threading.Thread) + 205 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilderT.cs @ 366]
00007FB47A9A55E0 00007FB40C6BA289 System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.__Canon, System.Private.CoreLib],[System.Net.Http.ConnectHelper+<ConnectAsync>d__1, System.Net.Http]].MoveNext() + 25 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilderT.cs @ 307]
00007FB47A9A5600 00007FB407B70D6A System.Private.CoreLib.dll!System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Runtime.CompilerServices.IAsyncStateMachineBox, Boolean) + 218 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/TaskContinuation.cs @ 785]
00007FB47A9A5640 00007FB407B6A2BB System.Private.CoreLib.dll!System.Threading.Tasks.Task.RunContinuations(System.Object) + 155 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 3236]
00007FB47A9A56F0 00007FB407B6A219 System.Private.CoreLib.dll!System.Threading.Tasks.Task.FinishContinuations() + 57 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Task.cs @ 3208]
00007FB47A9A5710 00007FB407CEEAA7 System.Private.CoreLib.dll!System.Threading.Tasks.Task`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib]].TrySetResult(System.Threading.Tasks.VoidTaskResult) + 151 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs @ 402]
00007FB47A9A5740 00007FB407BC6AC0 System.Private.CoreLib.dll!System.Runtime.CompilerServices.AsyncTaskMethodBuilder.SetResult() + 80 [/_/src/libraries/System.Private.CoreLib/src/System/Runtime/CompilerServices/AsyncTaskMethodBuilder.cs @ 108]
00007FB47A9A5770 00007FB409DBCAF3 System.Net.Http.dll!System.Net.Http.ConnectHelper+ConnectEventArgs.OnCompleted(System.Net.Sockets.SocketAsyncEventArgs) + 83 [/_/src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/ConnectHelper.cs @ 109]
00007FB47A9A57A0 00007FB40A1D6420 System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs.FinishWrapperConnectSuccess(System.Net.Sockets.Socket, Int32, System.Net.Sockets.SocketFlags) + 208 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEventArgs.cs @ 663]
00007FB47A9A57C0 00007FB40A1D6202 System.Net.Sockets.dll!System.Net.Sockets.MultipleConnectAsync.Succeed() + 82 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/MultipleConnectAsync.cs @ 289]
00007FB47A9A57E0 00007FB40A1D614D System.Net.Sockets.dll!System.Net.Sockets.MultipleConnectAsync.InternalConnectCallback(System.Object, System.Net.Sockets.SocketAsyncEventArgs) + 301 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/MultipleConnectAsync.cs @ 219]
00007FB47A9A5820 00007FB40A1D427B System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs.OnCompleted(System.Net.Sockets.SocketAsyncEventArgs) + 43 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEventArgs.cs @ 203]
00007FB47A9A5830 00007FB40A1D423F System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs.ExecutionCallback(System.Object) + 31 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEventArgs.cs @ 423]
00007FB47A9A5840 00007FB407B53B91 System.Private.CoreLib.dll!System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) + 129 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs @ 172]
00007FB47A9A5890 00007FB407B53AFF System.Private.CoreLib.dll!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) + 15 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs @ 131]
00007FB47A9A58A0 00007FB40A1D5FB0 System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncEventArgs.ConnectCompletionCallback(System.Net.Sockets.SocketError) + 96 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncEventArgs.Unix.cs @ 95]
00007FB47A9A58C0 00007FB40A1D5F46 System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncContext+ConnectOperation.InvokeCallback(Boolean) + 38 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs @ 618]
00007FB47A9A58D0 00007FB40A1F8CDD System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncContext+OperationQueue`1[[System.__Canon, System.Private.CoreLib]].ProcessAsyncOperation(System.__Canon) + 93 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs @ 888]
00007FB47A9A5900 00007FB40A1D5E07 System.Net.Sockets.dll!System.Net.Sockets.SocketAsyncContext+WriteOperation.System.Threading.IThreadPoolWorkItem.Execute() + 39 [/_/src/libraries/System.Net.Sockets/src/System/Net/Sockets/SocketAsyncContext.Unix.cs @ 339]
00007FB47A9A5910 00007FB407B5D7B9 System.Private.CoreLib.dll!System.Threading.ThreadPoolWorkQueue.Dispatch() + 473 [/_/src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPool.cs @ 659]
00007FB47A9A5990 00007FB407B4BEC7 System.Private.CoreLib.dll!System.Threading._ThreadPoolWaitCallback.PerformWaitCallback() + 7 [/_/src/coreclr/src/System.Private.CoreLib/src/System/Threading/ThreadPool.CoreCLR.cs @ 29]
00007FB47A9A59C0 00007FB481359C9B libcoreclr.so!MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 1643 at /__w/2/s/src/coreclr/src/pal/inc/pal.h:4637
00007FB47A9A59C0 00007FB481359C40 libcoreclr.so!MethodDescCallSite::CallTargetWorker(unsigned long const*, unsigned long*, int) + 1552 at /__w/2/s/src/coreclr/src/vm/threads.inl:34
00007FB47A9A5BA0 00007FB4814B67F0 libcoreclr.so!QueueUserWorkItemManagedCallback(void*) + 176 at /__w/2/s/src/coreclr/src/vm/callhelpers.h:459
00007FB47A9A5BA0 00007FB4814B67E1 libcoreclr.so!QueueUserWorkItemManagedCallback(void*) + 161 at /__w/2/s/src/coreclr/src/vm/callhelpers.h:459
00007FB47A9A5CC0 00007FB481328B9A libcoreclr.so!ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 330 at /__w/2/s/src/coreclr/src/vm/threads.inl:34
00007FB47A9A5CC0 00007FB481328B98 libcoreclr.so!ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 328 at /__w/2/s/src/coreclr/src/vm/threads.cpp:7357
00007FB47A9A5CC0 00007FB481328B53 libcoreclr.so!ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 259 at /__w/2/s/src/coreclr/src/vm/threads.inl:34
00007FB47A9A5CC0 00007FB481328B53 libcoreclr.so!ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 259 at /__w/2/s/src/coreclr/src/vm/threads.inl:34
00007FB47A9A5CC0 00007FB481328AE4 libcoreclr.so!ManagedThreadBase_DispatchOuter(ManagedThreadCallState*) + 148 at /__w/2/s/src/coreclr/src/vm/threads.cpp:7559
00007FB47A9A5D10                  [DebuggerU2MCatchHandlerFrame: 00007fb47a9a5d10]
00007FB47A9A5DE0 00007FB48132921D libcoreclr.so!ManagedThreadBase::ThreadPool(void (*)(void*), void*) + 45 at /__w/2/s/src/coreclr/src/vm/threads.cpp:0
00007FB47A9A5DE0 00007FB481329205 libcoreclr.so!ManagedThreadBase::ThreadPool(void (*)(void*), void*) + 21 at /__w/2/s/src/coreclr/src/vm/threads.cpp:7331
00007FB47A9A5E10 00007FB4814A1D11 libcoreclr.so!ManagedPerAppDomainTPCount::DispatchWorkItem(bool*, bool*) + 321 at /__w/2/s/src/coreclr/src/vm/threadpoolrequest.cpp:643
00007FB47A9A5E60 00007FB481349939 libcoreclr.so!ThreadpoolMgr::WorkerThreadStart(void*) + 1337 at /__w/2/s/src/coreclr/src/vm/win32threadpool.cpp:1977
00007FB47A9A5E60 00007FB48134990E libcoreclr.so!ThreadpoolMgr::WorkerThreadStart(void*) + 1294 at /__w/2/s/src/coreclr/src/vm/win32threadpool.cpp:0
00007FB47A9A5EE0 00007FB48169F922 libcoreclr.so!CorUnix::CPalThread::ThreadEntry(void*) + 418 at /__w/2/s/src/coreclr/src/pal/src/include/pal/thread.hpp:469
00007FB47A9A5F00 00007FB482F336DB libpthread.so.0!start_thread + 219
00007FB47A9A5FC0 00007FB48211988F libc.so.6!clone + 63

As you can see from the fact that the clone is at the bottom of the stack, this has happened on a secondary thread. I can see that the primary thread is the native thread #12, which has stack trace like this:

    frame #0: 0x00007fb482a293f0 libstdc++.so.6`__cxxabiv1::__enum_type_info::~__enum_type_info()
    frame #1: 0x00007fb481b54480 libhostpolicy.so`deps_entry_t::~deps_entry_t() [inlined] std::string::_Rep::_M_dispose(this=<unavailable>, __a=0x00007ffd031b7e50) at basic_string.h:3251
    frame #2: 0x00007fb481b5443c libhostpolicy.so`deps_entry_t::~deps_entry_t() [inlined] std::string::~(this=error: summary string parsing error) at basic_string.h:3621
    frame #3: 0x00007fb481b54434 libhostpolicy.so`deps_entry_t::~deps_entry_t(this=0x000055e50cea8ef0) at deps_entry.h:30
    frame #4: 0x00007fb481b563b8 libhostpolicy.so`deps_json_t::~deps_json_t() [inlined] void std::_Destroy<deps_entry_t>(__pointer=0x000055e50cea8ef0) at stl_construct.h:98
    frame #5: 0x00007fb481b563ae libhostpolicy.so`deps_json_t::~deps_json_t() [inlined] void std::_Destroy_aux<false>::__destroy<deps_entry_t*>(__first=0x000055e50cea8ef0, __last=0x000055e50cea91f0) at stl_construct.h:108
    frame #6: 0x00007fb481b563a9 libhostpolicy.so`deps_json_t::~deps_json_t() [inlined] void std::_Destroy<deps_entry_t*>(__first=0x000055e50cea8ef0, __last=0x000055e50cea91f0) at stl_construct.h:136
    frame #7: 0x00007fb481b563a9 libhostpolicy.so`deps_json_t::~deps_json_t() [inlined] void std::_Destroy<deps_entry_t*, deps_entry_t>(__first=0x000055e50cea8ef0, __last=0x000055e50cea91f0, (null)=0x000055e50cde75a8) at stl_construct.h:206
    frame #8: 0x00007fb481b563a9 libhostpolicy.so`deps_json_t::~deps_json_t() [inlined] std::vector<deps_entry_t, std::allocator<deps_entry_t> >::~vector(this=0x000055e50cde75a8) at stl_vector.h:434
    frame #9: 0x00007fb481b56335 libhostpolicy.so`deps_json_t::~deps_json_t(this=0x000055e50cde7578) at deps_format.h:16
    frame #10: 0x00007fb481b5dd52 libhostpolicy.so`fx_definition_t::~fx_definition_t(this=0x000055e50cde7470) at fx_definition.h:12
    frame #11: 0x00007fb481b5cd30 libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] std::default_delete<fx_definition_t>::operator(__ptr=0x000055e50cde7470)(fx_definition_t*) const at unique_ptr.h:78
    frame #12: 0x00007fb481b5cd28 libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> >::~unique_ptr(this=0x000055e50cde8d00) at unique_ptr.h:268
    frame #13: 0x00007fb481b5cd20 libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] void std::_Destroy<std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> > >(__pointer=0x000055e50cde8d00) at stl_construct.h:98
    frame #14: 0x00007fb481b5cd20 libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] void std::_Destroy_aux<false>::__destroy<std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> >*>(__first=0x000055e50cde8d00, __last=0x000055e50cde8d10) at stl_construct.h:108
    frame #15: 0x00007fb481b5ccfc libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] void std::_Destroy<std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> >*>(__first=0x000055e50cde8d00, __last=0x000055e50cde8d10) at stl_construct.h:136
    frame #16: 0x00007fb481b5ccfc libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] void std::_Destroy<std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> >*, std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> > >(__first=0x000055e50cde8d00, __last=0x000055e50cde8d10, (null)=0x00007fb481d93110) at stl_construct.h:206
    frame #17: 0x00007fb481b5ccfc libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t() [inlined] std::vector<std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> >, std::allocator<std::unique_ptr<fx_definition_t, std::default_delete<fx_definition_t> > > >::~vector(this=0x00007fb481d93110) at stl_vector.h:434
    frame #18: 0x00007fb481b5ccf4 libhostpolicy.so`hostpolicy_init_t::~hostpolicy_init_t(this=0x00007fb481d930b8) at hostpolicy_init.h:12
    frame #19: 0x00007fb48203b041 libc.so.6`___lldb_unnamed_symbol108$$libc.so.6 + 369
    frame #20: 0x00007fb48203b13a libc.so.6`exit + 26
    frame #21: 0x00007fb482019b9e libc.so.6`__libc_start_main + 238
    frame #22: 0x000055e50af06a0a dotnet`_start + 41

So the process is exiting. From this point of view, it looks pretty similar to the old fixed issue.

@janvorli
Copy link
Member

@bartonjs any idea why the openssl shutdown could get triggered even though we are passing the OPENSSL_INIT_NO_ATEXIT flag? (I've verified that we are passing that flag in there). The openssl on Ubuntu 18.04 is openssl 1.1.1 (I've verified that).

@bartonjs
Copy link
Member

Nope. Git-blame says that that code hasn't ever changed. https://github.com/openssl/openssl/blame/OpenSSL_1_1_1-stable/crypto/init.c#L650-L657.

... unless something in the process loaded OpenSSL before we did, so our NO_ATEXIT didn't matter?

@janvorli
Copy link
Member

@BrennanConroy do you happen to know the full command line used for running the crashed process? GDB shows just a part of it:

Core was generated by `/home/helixbot/work/93A00859/w/AFD10972/e/.dotnet7093/dotnet new mvc -au MultiO'.

@BrennanConroy
Copy link
Member Author

I don't know 100%, but my guess is:
dotnet new mvc -au MultiOrg --aad-instance https://login.microsoftonline.com/ --client-id bac81cbb-9fab-4a4e-9521-be07290c3a51

@janvorli
Copy link
Member

It seems to be the one, I've found all these as strings in the dump.

@janvorli
Copy link
Member

@bartonjs I've ran the command above, set a breakpoint at the exit function, and then just stepped through the exit code. And I've ended up being in OPENSSL_cleanup. I've also tried to set a breakpoint at all functions in the libssl.so.1.1 and I could see that the first hit came from our CryptoNative_EnsureOpenSslInitialized and that the OPENSSL_INIT_NO_ATEXIT was passed in.

@janvorli
Copy link
Member

So after installing the matching openssl sources on the Ubuntu 18.04, I've found a surprising thing. The sources don't contain the lines https://github.com/openssl/openssl/blame/OpenSSL_1_1_1-stable/crypto/init.c#L650-L657 at all! And examining the disassembly of the function confirmed that. While I can see references to symbols ossl_init_base_ossl_ or ossl_init_load_crypto_nodelete_ossl_, there is no reference to ossl_init_no_register_atexit_ossl_ or ossl_init_register_atexit_ossl_ (the macros add the _ossl_ at the end of the symbols you can see in the source). And these symbols don't exist anywhere in the source.

Unfortunately this version registers the OPENSSL_cleanup unconditionally.

@bartonjs
Copy link
Member

If the only real error we ever see is in ERR_get_error_string, maybe we want to register our own atexit after openssl init that sets a static and we just return NULL from the shim? (Since they're called in reverse order of registration we need to do it after) Presumably other things were happening more gracefully, since we tried to throw an exception.

@janvorli
Copy link
Member

It seems like a workaround we could possibly do, however there could be a race between the call to ERR_get_error_string and the new atexit handler you suggest. If the new atexit handler was called while the ERR_get_error_string was already running and the openssl's atexit handler was called before the ERR_get_error_string accessed the lock, we would still crash. So it seems we would need to place a lock around the ERR_get_error_string call and take the same lock for setting the static you've mentioned.

@wtgodbe
Copy link
Member

wtgodbe commented May 5, 2020

@janvorli
Copy link
Member

janvorli commented May 5, 2020

@bartonjs are you planning to implement the workaround you've suggested?

@bartonjs
Copy link
Member

bartonjs commented May 5, 2020

@janvorli It's on my todo list, but if you have some free time and want to jump on it first, that's good with me.

@janvorli
Copy link
Member

janvorli commented May 6, 2020

@bartonjs I don't have a free time now, but I'll ping you if I find some before you get to that.

@wfurt
Copy link
Member

wfurt commented May 6, 2020

I can take a look. Should we do that only for OpenSSL 1.1.1?

@bartonjs
Copy link
Member

bartonjs commented May 6, 2020

When it rains, it pours, eh? 😄 When Jan said he didn't have time I pushed this up to my current task. Hopefully I'll have a PR soon (though I don't know how to test to see that it covered everything we cared about).

@bartonjs
Copy link
Member

bartonjs commented May 6, 2020

OK, I have a fix... but every weird threading test I've thrown at it hasn't reproduced the problem (all of my threads seem to exit before my atexit handler gets called, whether they're background threads or foreground threads). So I don't get the satisfaction of empirical verification.

@bartonjs
Copy link
Member

bartonjs commented May 6, 2020

And... so 6 minutes later I finally get one that works (it required background threads, and apparently just luck).

@bartonjs
Copy link
Member

bartonjs commented May 6, 2020

Terrible code of race-induction:

using System;
using System.Reflection;
using System.Runtime.InteropServices;
using System.Security.Cryptography;
using System.Security.Cryptography.X509Certificates;
using System.Threading;

namespace RaceIt
{
    internal static class Program
    {
        private static int Main()
        {
            int count = 0;
            bool shutdown = false;
            Type type = typeof(X509Certificate2).Assembly.GetType("Interop").GetNestedType("Crypto", BindingFlags.NonPublic);
            MethodInfo info = type.GetMethod("CreateOpenSslCryptographicException", BindingFlags.NonPublic | BindingFlags.Static);
            Func<Exception> func = (Func<Exception>)info.CreateDelegate(typeof(Func<Exception>));
            try { new X509Certificate2(new byte[5]).Dispose(); } catch {}

            for (int i = 0; i < 50; i++)
            {
                Thread t = new Thread(
                    () =>
                    {
                        string last = null;
                        var stopwatch = new System.Diagnostics.Stopwatch();

                        while (stopwatch.ElapsedMilliseconds < 2000)
                        {
                            ERR_put_error(1, 2, 3, "this file", 17);
                            Exception e = func();
                            string cur = $"{e.GetType().Name}: {e.Message}";

                            if (cur != last)
                            {
                                Console.WriteLine(cur);
                                last = cur;
                            }

                            Interlocked.Increment(ref count);

                            if (shutdown) { stopwatch.Start(); }
                        }

                        Console.WriteLine($"{count} exceptions, says thread {Thread.CurrentThread.ManagedThreadId}.");
                    });

                t.IsBackground = true;
                t.Start();
            }

            for (int i = 10; i >= 1; i--)
            {
                Console.WriteLine(i);
                Thread.Sleep(1000);
            }

            shutdown = true;
            Console.WriteLine($"{count} exceptions, roughly.");
            return 2;
        }

        [DllImport("libcrypto.so.1.1")]
        private extern static void ERR_put_error(int lib, int func, int reason, string file, int line);
    }
}

@karelz karelz assigned bartonjs and unassigned wfurt May 7, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
@bartonjs bartonjs removed their assignment Jul 26, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Security bug os-linux Linux OS (any supported distro)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants