Flaky crashes on arm64 server HW (with weak memory model) #52333

Closed
mkustermann opened this issue May 10, 2023 · 2 comments
Labels
area-vm: Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.
crash: Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash.
gardening

Comments

@mkustermann
Member

See the crash during GC in this log:

Running: retainers -n10 filter (closure global) String

stderr:
===== CRASH =====
si_signo=Illegal instruction(4), si_code=1, si_addr=0xffffbf6b6000
version=3.1.0-edge.e98d824b5ee84d6e21d005f4e26e94725ef3ad4e (be) (Wed May 10 03:57:31 2023 +0000) on "linux_arm64"
pid=9796, thread=12903, isolate_group=main(0xaaaadf979cf0), isolate=(nil)((nil))
os=linux, arch=arm64, comp=no, sim=no
isolate_instructions=aaaad30ca100, vm_instructions=aaaad30ca100
fp=ffffbdffab90, sp=ffffbdffab90, pc=ffffbf6b6000
  pc 0x0000ffffbf6b6000 fp 0x0000ffffbdffab90 Unknown symbol
  pc 0x0000aaaad3207814 fp 0x0000ffffbdffabd0 dart::IsolateGroup::VisitWeakPersistentHandles(dart::HandleVisitor*)+0x5c
  pc 0x0000aaaad335d794 fp 0x0000ffffbdffaca0 dart::Scavenger::Scavenge(dart::Thread*, dart::GCType, dart::GCReason)+0x31c
  pc 0x0000aaaad334e1a8 fp 0x0000ffffbdffadc0 dart::Heap::CollectNewSpaceGarbage(dart::Thread*, dart::GCType, dart::GCReason)+0x104
  pc 0x0000aaaad334d750 fp 0x0000ffffbdffae00 dart::Heap::AllocateNew(dart::Thread*, long)+0x110
  pc 0x0000aaaad32458a0 fp 0x0000ffffbdffae40 dart::Object::Allocate(long, long, dart::Heap::Space, bool)+0x50
  pc 0x0000aaaad327a08c fp 0x0000ffffbdffae90 dart::TypedData::New(long, long, dart::Heap::Space)+0x168
  pc 0x0000aaaad32f8658 fp 0x0000ffffbdffb3f0 dart::DRT_AllocateTypedData(dart::NativeArguments)+0x198
  pc 0x0000ffffbef02f4c fp 0x0000ffffbdffb470 Unknown symbol
  pc 0x0000ffffbef018fc fp 0x0000ffffbdffb4a8 Unknown symbol
  pc 0x0000ffff9ee1eaa0 fp 0x0000ffffbdffb508 Unknown symbol
  pc 0x0000ffffbe27b2ac fp 0x0000ffffbdffb560 Unknown symbol
  pc 0x0000ffff9ee1b684 fp 0x0000ffffbdffb5e8 Unknown symbol
...

--- Re-run this test:
python3 tools/test.py -n vm-linux-release-arm64 vm/dart/heapsnapshot_cli_test

Or the crash during isolate shutdown in this log:

00:02 +37: All tests passed!

stderr:
===== CRASH =====
si_signo=Illegal instruction(4), si_code=1, si_addr=0xffff8a25e000
version=3.1.0-edge.e98d824b5ee84d6e21d005f4e26e94725ef3ad4e (be) (Wed May 10 03:57:31 2023 +0000) on "linux_arm64"
pid=39066, thread=39066, isolate_group=main(0xaaaabb0443c0), isolate=(nil)((nil))
os=linux, arch=arm64, comp=no, sim=no
isolate_instructions=ffff89cbd6c0, vm_instructions=ffff89cb7780
fp=ffffe075b8e0, sp=ffffe075b8e0, pc=ffff8a25e000
  pc 0x0000ffff8a25e000 fp 0x0000ffffe075b8e0 Unknown symbol
  pc 0x0000aaaab4b8b6ac fp 0x0000ffffe075b940 dart::Isolate::LowLevelCleanup(dart::Isolate*)+0x1b0
  pc 0x0000aaaab4b8c450 fp 0x0000ffffe075be90 dart::Isolate::Shutdown()+0xf4
  pc 0x0000aaaab4ef1d18 fp 0x0000ffffe075c3d0 Dart_ShutdownIsolate+0xc8
  pc 0x0000aaaab4a54c90 fp 0x0000ffffe075c470 dart::bin::RunMainIsolate(char const*, char const*, bool, dart::bin::CommandLineOptions*)+0x258
  pc 0x0000aaaab4a55678 fp 0x0000ffffe075d580 dart::bin::main(int, char**)+0x50c
  pc 0x0000aaaab4a54a14 fp 0x0000ffffe075d5e0 main+0x10
  pc 0x0000ffff8a001720 fp 0x0000ffffe075d5f0 __libc_start_main+0xe0
-- End of DumpStackTrace

--- Re-run this test:
python3 tools/test.py -n vm-aot-linux-release-arm64 vm/dart/heapsnapshot_cli_test

It's unclear why the VM would hit an illegal instruction in runtime code.

We should get access to the bot and try to reproduce it there.
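
One detail worth noting in both crash headers above: `si_addr` equals the top-frame `pc`, and that address is 4 KiB aligned with no known symbol. For SIGILL, `si_addr` is the address of the faulting instruction, so this pattern suggests control jumped to the start of some mapped page rather than faulting partway through VM code. A minimal sketch of that check, assuming the header format shown in the logs above (the helper names and the 4 KiB page size are assumptions for illustration, not part of any Dart tooling):

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages; some arm64 kernels use 16 or 64 KiB


def parse_crash_header(text):
    """Extract si_addr and pc (as ints) from a Dart VM crash header."""
    si_addr = re.search(r"si_addr=0x([0-9a-fA-F]+)", text)
    # "\bpc=" matches the header field only, not the "pc 0x..." frame lines.
    pc = re.search(r"\bpc=([0-9a-fA-F]+)", text)
    if si_addr is None or pc is None:
        raise ValueError("crash header is missing si_addr or pc")
    return int(si_addr.group(1), 16), int(pc.group(1), 16)


def jumped_to_page_start(text):
    """True if the faulting instruction is the pc itself and is page-aligned."""
    si_addr, pc = parse_crash_header(text)
    return si_addr == pc and pc % PAGE_SIZE == 0


# Header lines copied from the first crash log in this issue.
sample = """\
si_signo=Illegal instruction(4), si_code=1, si_addr=0xffffbf6b6000
fp=ffffbdffab90, sp=ffffbdffab90, pc=ffffbf6b6000
"""

if __name__ == "__main__":
    print(jumped_to_page_start(sample))
```

A positive result does not identify the culprit by itself, but it narrows the search to code pointers handed to the VM (callbacks, finalizers) rather than miscompiled runtime code.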

/cc @rmacnak-google

@mkustermann mkustermann added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. gardening crash Process exits with SIGSEGV, SIGABRT, etc. An unhandled exception is not a crash. labels May 10, 2023
@rmacnak-google
Contributor

I can reproduce a crash on Linux (but not Mac) running on an M1. The VM is trying to invoke a finalizable handle callback.

../../runtime/vm/dart_api_state.h: 337: error: expected: (void*)callback != (void*)0xfffff7ff6000
version=3.1.0-edge.28c2491f8db4934e0a67b6ee9ab79d2743c0af93 (be) (Wed May 10 16:12:28 2023 +0000) on "linux_arm64"
pid=34129, thread=34138, isolate_group=main(0xaaaaad351070), isolate=main(0xaaaaad3503c0)
os=linux, arch=arm64, comp=no, sim=no
isolate_instructions=aaaaacbf7c80, vm_instructions=aaaaacbf7c80
fp=ffffd1bfc2f0, sp=ffffd1bfb1c8, pc=aaaaacde57c8
  pc 0x0000aaaaacde57c8 fp 0x0000ffffd1bfc2f0 dart::Profiler::DumpStackTrace(void*)+0x80
-- End of DumpStackTrace
  pc 0x0000000000000000 fp 0x0000ffffd1bfc590 sp 0x0000000000000000 Cannot find code object
  pc 0x0000ffffcaa405a0 fp 0x0000ffffd1bfc630 sp 0x0000ffffd1bfc5a0 [Optimized] FfiTrampoline_Dart_NewExternalTypedDataWithFinalizer
  pc 0x0000ffffcaa3ddd0 fp 0x0000ffffd1bfc6a0 sp 0x0000ffffd1bfc640 [Unoptimized] toExternalDataWithFinalizer
  pc 0x0000ffffcaa37b98 fp 0x0000ffffd1bfc738 sp 0x0000ffffd1bfc6b0 [Unoptimized] mmapFile
  pc 0x0000ffffcaa37118 fp 0x0000ffffd1bfc788 sp 0x0000ffffd1bfc748 [Unoptimized] mmapOrReadFileSync
  pc 0x0000ffffd7773508 fp 0x0000ffffd1bfc820 sp 0x0000ffffd1bfc798 [Unoptimized] LoadCommand.executeInternal
  pc 0x0000ffffd776f984 fp 0x0000ffffd1bfc8c8 sp 0x0000ffffd1bfc830 [Unoptimized] Command.execute
  pc 0x0000ffffd776dc34 fp 0x0000ffffd1bfc948 sp 0x0000ffffd1bfc8d8 [Unoptimized] CommandRunner.run
  pc 0x0000ffffd77619a8 fp 0x0000ffffd1bfc9a8 sp 0x0000ffffd1bfc958 [Unoptimized] main.<anonymous closure>.run

And it looks like pkg:mmap ships only x64 machine code for its finalizer, so on arm64 the callback lands on bytes the CPU cannot decode :)
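
To illustrate the failure mode: if a finalizer is supplied as raw machine code, the bytes have to match the host architecture, and an unsupported architecture should fail loudly instead of silently reusing x64 bytes. The sketch below is hypothetical and is not pkg:mmap's code. The table holds only a bare `ret` instruction per architecture as a stand-in for a real finalizer body, and `RETURN_STUBS`, `ALIASES`, and `select_stub` are names invented for this example.

```python
import platform

# Hypothetical table: a bare "return" instruction per architecture, standing
# in for a real finalizer body. Illustrative only, not pkg:mmap's bytes.
RETURN_STUBS = {
    "x86_64": bytes([0xC3]),                      # x64: ret
    "aarch64": bytes([0xC0, 0x03, 0x5F, 0xD6]),   # arm64: ret (0xD65F03C0, little-endian)
}

# platform.machine() reports different spellings depending on the OS.
ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def select_stub(machine=None):
    """Return the stub for the given (or host) architecture, or raise."""
    machine = (machine or platform.machine()).lower()
    machine = ALIASES.get(machine, machine)
    try:
        return RETURN_STUBS[machine]
    except KeyError:
        raise NotImplementedError(
            f"no finalizer stub for architecture {machine!r}"
        ) from None
```

Raising at selection time turns what would otherwise be a flaky SIGILL inside GC or isolate shutdown into an immediate, attributable error at the point where the finalizer is registered.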

@mkustermann
Member Author

Makes perfect sense. Thanks @rmacnak-google for trying this on arm64 HW (I don't have access to any at the moment).

copybara-service bot pushed a commit that referenced this issue May 11, 2023
TEST=pkg/mmap, local qemu
Bug: #52333
Bug: #51591
Change-Id: I939f225b2e0e08a20a07cd7bf3147b2ad8a737c8
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/302453
Reviewed-by: Martin Kustermann <kustermann@google.com>
Commit-Queue: Ryan Macnak <rmacnak@google.com>