I set up continuous integration on several Linux distros, and it turned out that our tests don't work under .NET Core 1.1 on Fedora 30/31/32, but they do work on CentOS, Debian, Ubuntu, Fedora 29, etc.
Sometimes it fails due to abort() from here (if interested):
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50 return ret;
[Current thread is 1 (Thread 0x7fa75c48e700 (LWP 16915))]
Missing separate debuginfos, use: dnf debuginfo-install compat-libicu60-60.2-2.fc29.x86_64 libgcc-9.2.1-1.fc32.x86_64 libstdc++-9.2.1-1.fc32.x86_64 libunwind-1.3.1-3.fc31.x86_64 libuuid-2.35-0.5.fc32.x86_64 sssd-client-2.2.2-1.fc32.x86_64
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x00007fa76197e899 in __GI_abort () at abort.c:79
#2 0x00007fa760f11f4b in PROCEndProcess (hProcess=<optimized out>, uExitCode=2148734214, bTerminateUnconditionally=1) at /root/coreclr/src/pal/src/thread/process.cpp:1385
#3 0x00007fa760b8ea3b in SafeExitProcess (exitCode=2148734214, fAbort=1, sca=SCA_ExitProcessWhenShutdownComplete) at /root/coreclr/src/vm/eepolicy.cpp:579
#4 0x00007fa760b8ff88 in EEPolicy::HandleFatalError (exitCode=2148734214, address=140356859828821, pszMessage=0x0, pExceptionInfo=0x7fa75c48bad0) at /root/coreclr/src/vm/eepolicy.cpp:1506
#5 0x00007fa760c59a66 in LazyMachState::unwindLazyState (baseState=<optimized out>, unwoundState=0x7fa75c48c1f0, threadId=<optimized out>, funCallDepth=<optimized out>, hostCallPreference=AllowHostCalls) at /root/coreclr/src/vm/amd64/gmsamd64.cpp:69
#6 0x00007fa760ac5d5e in HelperMethodFrame::InsureInit (this=0x7ffeab5528f0, initialInit=<optimized out>, unwindState=<optimized out>, hostCallPreference=AllowHostCalls) at /root/coreclr/src/vm/frames.cpp:1890
#7 0x00007fa760ac5b62 in HelperMethodFrame::GetFunction (this=0x7ffeab5528f0) at /root/coreclr/src/vm/frames.cpp:1808
#8 0x00007fa760b2332b in StackFrameIterator::ProcessCurrentFrame (this=0x7fa75c48c3d0) at /root/coreclr/src/vm/stackwalk.cpp:2993
#9 0x00007fa760b24d4d in StackFrameIterator::NextRaw (this=0x7fa75c48c3d0) at /root/coreclr/src/vm/stackwalk.cpp:2743
#10 0x00007fa760b22ab8 in StackFrameIterator::Next (this=<optimized out>) at /root/coreclr/src/vm/stackwalk.cpp:1615
#11 Thread::StackWalkFramesEx (this=0x121a600, pRD=<optimized out>, pCallback=0x7fa760ba1020 <GcStackCrawlCallBack(CrawlFrame*, void*)>, pData=0x7fa75c48d890, flags=34048, pStartFrame=0x0) at /root/coreclr/src/vm/stackwalk.cpp:966
#12 0x00007fa760b22ead in Thread::StackWalkFrames (this=0x121a600, pCallback=0x7fa760ba1020 <GcStackCrawlCallBack(CrawlFrame*, void*)>, pData=0x7fa75c48d890, flags=34048, pStartFrame=0x0) at /root/coreclr/src/vm/stackwalk.cpp:1043
#13 0x00007fa760ba1975 in ScanStackRoots (fn=<optimized out>, sc=<optimized out>, pThread=<optimized out>) at /root/coreclr/src/vm/gcenv.ee.cpp:544
#14 GCToEEInterface::GcScanRoots (fn=<optimized out>, condemned=<optimized out>, max_gen=<optimized out>, sc=0x7fa75c48d940) at /root/coreclr/src/vm/gcenv.ee.cpp:573
#15 0x00007fa760d9f74a in WKS::gc_heap::mark_phase (condemned_gen_number=2, mark_only_p=0) at /root/coreclr/src/gc/gc.cpp:19490
#16 0x00007fa760d9cb74 in WKS::gc_heap::gc1 () at /root/coreclr/src/gc/gc.cpp:15233
#17 0x00007fa760da709d in WKS::gc_heap::garbage_collect (n=<optimized out>) at /root/coreclr/src/gc/gc.cpp:16751
#18 0x00007fa760d9979f in WKS::GCHeap::GarbageCollectGeneration (this=<optimized out>, gen=<optimized out>, reason=WKS::reason_induced) at /root/coreclr/src/gc/gc.cpp:35231
#19 0x00007fa760dc1b89 in WKS::GCHeap::GarbageCollectTry (generation=<optimized out>, mode=<optimized out>, this=<optimized out>, low_memory_p=<optimized out>) at /root/coreclr/src/gc/gc.cpp:34846
#20 WKS::GCHeap::GarbageCollect (this=<optimized out>, generation=<optimized out>, low_memory_p=<optimized out>, mode=<optimized out>) at /root/coreclr/src/gc/gc.cpp:34786
#21 0x00007fa760c396ca in ETW::GCLog::ForceGCForDiagnostics () at /root/coreclr/src/vm/eventtrace.cpp:1036
#22 0x00007fa760be13ab in ProfToEEInterfaceImpl::ForceGC (this=<optimized out>) at /root/coreclr/src/vm/proftoeeinterfaceimpl.cpp:4823
#23 0x00007fa75e83c948
#24 0x00007fa75e83eabe
#25 0x00007fa75e83e972
#26 0x00007fa75e83c561
#27 0x00007fa75e83b384
#28 0x00007fa761e86482 in start_thread (arg=<optimized out>) at pthread_create.c:477
#29 0x00007fa761a5a583 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
This means CoreCLR was unable to unwind the stack. It happens regularly on Fedora 30 and later, which ship libunwind-1.3.1 as the default version, but not on distros that use libunwind-1.2.1.
The cause now looks pretty clear: msync()/mincore() validation was added 9 years ago (28f33c8), but due to a bug in mincore() detection, msync() was always used until v1.3. That bug was then fixed in v1.3-stable by ot [PATCH] x86_64: fix mincore_validate (bc8698f), and from that point on access_mem was backed by mincore_validate, which fails if the address has been swapped out.
I have checked that v1.3.1 with cherry-picked 05d814b works fine and solves my issue. @djwatson Is there anything I can help with to get it fixed in v1.3?
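To make the difference between the validation strategies concrete, here is a minimal sketch for Linux. It is not libunwind's code: the names probe_msync, probe_mincore_strict, probe_mincore_relaxed and page_start are hypothetical, and the sketch only assumes what the issue describes, namely that the msync() path accepts any mapped page, while the mincore() path as of v1.3.1 also requires the page to be resident.

```c
#define _DEFAULT_SOURCE /* exposes mincore() and MAP_ANONYMOUS in glibc */
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: round addr down to the start of its page,
 * because both msync() and mincore() require a page-aligned address. */
static void *page_start(const void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    return (void *)((uintptr_t)addr & ~((uintptr_t)page_size - 1));
}

/* msync()-style probe: returns 1 for any mapped page, resident or not,
 * and 0 when the page is not mapped (msync fails with ENOMEM). */
static int probe_msync(const void *addr)
{
    return msync(page_start(addr), 1, MS_ASYNC) == 0;
}

/* Strict mincore() probe: the page must be mapped AND resident in RAM.
 * This mirrors the behaviour the issue blames for the failed unwinds. */
static int probe_mincore_strict(const void *addr)
{
    unsigned char vec[1]; /* one status byte per page in the range */
    if (mincore(page_start(addr), 1, vec) != 0)
        return 0;                 /* not mapped */
    return (vec[0] & 1) != 0;     /* low bit set means "in core" */
}

/* Relaxed mincore() probe: only checks that the page is mapped and
 * ignores the residency bit, so swapped-out pages are still accepted. */
static int probe_mincore_relaxed(const void *addr)
{
    unsigned char vec[1];
    return mincore(page_start(addr), 1, vec) == 0;
}
```

All three probes reject an unmapped address; they differ only for pages that are mapped but not currently in RAM. On Linux, a freshly mapped anonymous page that has never been touched is typically reported as non-resident by mincore(), so the strict probe rejects it while the other two accept it, which may be a convenient way to observe the difference without forcing pages out to swap.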
@djwatson Sorry for the delay. Yes, I'd still be happy to port this to v1.3-stable, because some distros will continue to use this version. Here is PR #167.
While trying to figure out what the problem is, I found that the issue has already been fixed in v1.4-rc1 by @ShutterQuick in "Don't check if the memory is in core" #64 (05d814b), after hitting the same issue: https://github.com/dotnet/coreclr/issues/15840.