Description
See initial thread: dotnet/diagnostics#3600
While developing CLR profilers, I encountered a segfault that occurs systematically when attaching a profiler for the second time to a dotnet process on Linux (Debian 11). I don't know whether running in Docker is a factor.
I managed to take a core dump that you can get here: segfault_coredump.zip
Using lldb + sos, I was able to load the symbols. Here is the stack trace of the thread that ends in the segfault:
* thread #1, name = 'dotnet', stop reason = signal SIGSEGV
* frame #0: 0x00007f8c755eb06c libcoreclr.so`EESocketCleanupHelper(bool) [inlined] InterlockedOr(Destination=0x0000000000000008, Value=64) at pal.h:3748:1
frame #1: 0x00007f8c755eb06c libcoreclr.so`EESocketCleanupHelper(bool) [inlined] Thread::SetThreadState(this=0x0000000000000000, ts=TS_ExecutingOnAltStack) at threads.h:1060
frame #2: 0x00007f8c755eb06c libcoreclr.so`EESocketCleanupHelper(bool) [inlined] Thread::SetExecutingOnAltStack(this=0x0000000000000000) at threads.h:1257
frame #3: 0x00007f8c755eb06c libcoreclr.so`EESocketCleanupHelper(isExecutingOnAltStack=<unavailable>) at ceemain.cpp:560
frame #4: 0x00007f8c756020b8 libcoreclr.so`sigsegv_handler(int, siginfo_t*, void*) [inlined] invoke_previous_action(action=<unavailable>, code=11, siginfo=0x00007f8c73fd1bf0, context=0x00007f8c73fd1ac0, signalRestarts=true) at signal.cpp:430:5
frame #5: 0x00007f8c7560204e libcoreclr.so`sigsegv_handler(code=11, siginfo=0x00007f8c73fd1bf0, context=0x00007f8c73fd1ac0) at signal.cpp:639
frame #6: 0x00007f8c75cf1140 libpthread.so.0`__restore_rt
frame #7: 0x00007f8c75d0de7a ld-linux-x86-64.so.2`___lldb_unnamed_symbol38$$ld-linux-x86-64.so.2 + 10
frame #8: 0x00007f8c75d0e3a4 ld-linux-x86-64.so.2`___lldb_unnamed_symbol39$$ld-linux-x86-64.so.2 + 932
frame #9: 0x00007f8c75d0ece1 ld-linux-x86-64.so.2`___lldb_unnamed_symbol40$$ld-linux-x86-64.so.2 + 289
frame #10: 0x00007f8c7590e3a4 libc.so.6`___lldb_unnamed_symbol1115$$libc.so.6 + 116
frame #11: 0x00007f8c75cd93b4 libdl.so.2`___lldb_unnamed_symbol6$$libdl.so.2 + 20
frame #12: 0x00007f8c7590ea90 libc.so.6`_dl_catch_exception + 128
frame #13: 0x00007f8c7590eb4f libc.so.6`_dl_catch_error + 47
frame #14: 0x00007f8c75cd9a65 libdl.so.2`___lldb_unnamed_symbol11$$libdl.so.2 + 101
frame #15: 0x00007f8c75cd941c libdl.so.2`dlsym + 92
frame #16: 0x00007f8c7560e44c libcoreclr.so`::GetProcAddress(hModule=0x00007f8be8002790, lpProcName="") at module.cpp:333:33
frame #17: 0x00007f8c754d56de libcoreclr.so`FakeCoCreateInstanceEx(_GUID const&, char16_t const*, _GUID const&, void**, void**) [inlined] (anonymous namespace)::FakeCoCallDllGetClassObject(rclsid=<unavailable>, wszDllPath=u"/tmp/dr-dotnet/libprofilers.so", riid=<unavailable>, ppv=0x00007f8c747d13c8, phmodDll=<unavailable>) at util.cpp:224:68
frame #18: 0x00007f8c754d5639 libcoreclr.so`FakeCoCreateInstanceEx(rclsid=0x00007f8be800794c, wszDllPath=u"/tmp/dr-dotnet/libprofilers.so", riid=<unavailable>, ppv=0x00007f8c747d1678, phmodDll=0x00007f8c747d1658) at util.cpp:292
frame #19: 0x00007f8c75319bea libcoreclr.so`EEToProfInterfaceImpl::CreateProfiler(_GUID const*, char const*, char16_t const*) [inlined] CoCreateProfiler(pClsid=<unavailable>, szClsid=<unavailable>, wszProfileDLL=<unavailable>, ppCallback=<unavailable>, phmodProfilerDLL=0x00007f8c747d1658) at eetoprofinterfaceimpl.cpp:293:10
frame #20: 0x00007f8c75319b9e libcoreclr.so`EEToProfInterfaceImpl::CreateProfiler(this=0x00007f8be8007590, pClsid=<unavailable>, szClsid="{805a308b-061c-47f3-9b30-f785c3186e82}", wszProfileDLL=<unavailable>) at eetoprofinterfaceimpl.cpp:667
frame #21: 0x00007f8c75319908 libcoreclr.so`EEToProfInterfaceImpl::Init(this=0x00007f8be8007590, pProfToEE=0x00007f8be8007380, pClsid=0x00007f8be800794c, szClsid="{805a308b-061c-47f3-9b30-f785c3186e82}", wszProfileDLL=u"/tmp/dr-dotnet/libprofilers.so", fLoadedViaAttach=<unavailable>, dwConcurrentGCWaitTimeoutInMs=10) at eetoprofinterfaceimpl.cpp:581:14
frame #22: 0x00007f8c75388d21 libcoreclr.so`ProfilingAPIUtility::DoPreInitialization(pEEProf=0x00007f8be8007590, pClsid=0x00007f8be800794c, szClsid="{805a308b-061c-47f3-9b30-f785c3186e82}", wszProfilerDLL=u"/tmp/dr-dotnet/libprofilers.so", loadType=kAttachLoad, dwConcurrentGCWaitTimeoutInMs=10) at profilinghelper.cpp:989:19
frame #23: 0x00007f8c75388779 libcoreclr.so`ProfilingAPIUtility::LoadProfiler(loadType=kAttachLoad, pClsid=0x00007f8be800794c, szClsid="{805a308b-061c-47f3-9b30-f785c3186e82}", wszProfilerDLL=u"/tmp/dr-dotnet/libprofilers.so", pvClientData=0x00007f8be80014aa, cbClientData=<unavailable>, dwConcurrentGCWaitTimeoutInMs=10) at profilinghelper.cpp:1133:10
frame #24: 0x00007f8c755e0a74 libcoreclr.so`ds_profiler_protocol_helper_handle_ipc_message(_DiagnosticsIpcMessage*, _DiagnosticsIpcStream*) [inlined] ProfilingAPIUtility::LoadProfilerForAttach(pClsid=<unavailable>, wszProfilerDLL=u"/tmp/dr-dotnet/libprofilers.so", pvClientData=0x00007f8be80014aa, cbClientData=37, dwConcurrentGCWaitTimeoutInMs=10) at profilinghelper.inl:183:12
frame #25: 0x00007f8c755e0a47 libcoreclr.so`ds_profiler_protocol_helper_handle_ipc_message(_DiagnosticsIpcMessage*, _DiagnosticsIpcStream*) at ds-rt-coreclr.h:294
frame #26: 0x00007f8c755e0a47 libcoreclr.so`ds_profiler_protocol_helper_handle_ipc_message(_DiagnosticsIpcMessage*, _DiagnosticsIpcStream*) [inlined] profiler_protocol_helper_attach_profiler(message=<unavailable>, stream=0x00007f8be8007570) at ds-profiler-protocol.c:129
frame #27: 0x00007f8c755e0938 libcoreclr.so`ds_profiler_protocol_helper_handle_ipc_message(message=<unavailable>, stream=0x00007f8be8007570) at ds-profiler-protocol.c:269
frame #28: 0x00007f8c755dbfb1 libcoreclr.so`server_thread(data=<unavailable>) at ds-server.c:167:4
frame #29: 0x00007f8c7563bfee libcoreclr.so`CorUnix::CPalThread::ThreadEntry(pvParam=0x000055ec7ccbb580) at thread.cpp:1829:16
frame #30: 0x00007f8c75ce5ea7 libpthread.so.0`start_thread + 215
frame #31: 0x00007f8c758d4a2f libc.so.6`__clone + 63
Reproduction Steps
Reproduction is currently a little complex, but I believe the crash can be reproduced by attaching twice on Linux (e.g. by tweaking the CLR GC profiler test to attach, detach, and reattach).
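Since the trace above faults inside `dlsym` (called through `GetProcAddress` while the profiler library is being loaded), one way to narrow things down might be to check whether the library survives a load, resolve, unload, reload cycle outside the runtime. The following is only a minimal sketch under stated assumptions: `load_resolve_unload` is a hypothetical helper, it does not reproduce the runtime's attach path, and whether the runtime actually unloads the library on detach is not established here.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper (not part of the runtime): loads `path`, resolves
 * `symbol`, and unloads the library, up to `rounds` times in a row.
 * Returns the number of rounds that completed successfully. */
int load_resolve_unload(const char *path, const char *symbol, int rounds)
{
    assert(path != NULL && symbol != NULL);

    int ok = 0;
    for (int i = 0; i < rounds; i++) {
        void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
        if (handle == NULL) {
            fprintf(stderr, "round %d: dlopen failed: %s\n", i + 1, dlerror());
            break;
        }

        dlerror(); /* clear any stale error before calling dlsym */
        void *fn = dlsym(handle, symbol);
        const char *err = dlerror();
        if (err != NULL || fn == NULL) {
            fprintf(stderr, "round %d: dlsym failed: %s\n", i + 1,
                    err != NULL ? err : "symbol resolved to NULL");
            dlclose(handle);
            break;
        }

        if (dlclose(handle) != 0) {
            fprintf(stderr, "round %d: dlclose failed: %s\n", i + 1, dlerror());
            break;
        }
        ok++;
    }
    return ok;
}
```

Calling `load_resolve_unload("/tmp/dr-dotnet/libprofilers.so", "DllGetClassObject", 2)` and getting `2` back would suggest the library itself tolerates being reloaded. The symbol name is an assumption: `DllGetClassObject` is the usual profiler entry point, but the trace shows `lpProcName` as empty. On Debian 11 the program needs to be linked with `-ldl`.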
Expected behavior
The second attach succeeds without a segfault.
Actual behavior
A segfault occurs and the application crashes.
Regression?
No response
Known Workarounds
No response
Configuration
The bug was observed under .NET 6.0 and .NET 7.0. I haven't tested earlier versions.
The Linux version is Debian 11 (the one shipped with the runtime in Microsoft's Docker repository).
Other information
No response