Install Abseil failure signal handler in distributor/proton daemons #30873

Merged (1 commit), Apr 10, 2024


vekterli

@toregge and @baldersheim please review.

This will attempt to dump a stack trace for the offending thread to stderr, which should greatly improve crash visibility and debug-ability for everyone running Vespa on systems with core dumps disabled.

Signal handler chaining is explicitly enabled to allow sanitizer handlers to be called as expected.
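As a rough illustration of what chaining means, the sketch below saves the previously installed disposition and forwards the signal to it after doing its own work. It uses plain POSIX `sigaction` rather than Abseil, and every name in it is hypothetical. In Abseil this behaviour presumably corresponds to the `call_previous_handler` field of `absl::FailureSignalHandlerOptions`, but the code is not taken from this PR.

```cpp
#include <csignal>
#include <signal.h>

// All names below are hypothetical; this is not Vespa's or Abseil's code.
volatile std::sig_atomic_t previous_calls = 0;  // bumped by the pre-existing handler
volatile std::sig_atomic_t failure_calls = 0;   // bumped by our failure handler

// Disposition that was active before our handler was installed.
static struct sigaction saved_previous;

// Stand-in for a handler that is already present, e.g. from a sanitizer runtime.
void previous_handler(int) { previous_calls = previous_calls + 1; }

// Stand-in for the failure handler: do our own work, then forward the signal.
void chaining_failure_handler(int signo) {
    failure_calls = failure_calls + 1;  // a real handler would dump a stack trace here
    // Simplification: only forward to plain one-argument (non-SA_SIGINFO) handlers.
    if ((saved_previous.sa_flags & SA_SIGINFO) != 0) return;
    if (saved_previous.sa_handler == SIG_DFL || saved_previous.sa_handler == SIG_IGN) return;
    saved_previous.sa_handler(signo);
}

// Installs the chaining handler for `signo`, remembering the old disposition.
bool install_chaining_handler(int signo) {
    struct sigaction sa{};
    sa.sa_handler = chaining_failure_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, &saved_previous) == 0;
}
```

If `previous_handler` is installed for a signal first and `install_chaining_handler` is then called for the same signal, delivering that signal runs both handlers, ours first.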

Note that we install our own signal handlers after the Abseil handlers to avoid noisy stack dumping on SIGTERM. The failure handler treats SIGTERM as a fatal signal, but the config sentinel uses it as a friendly "please shut down now, or else 😘" nudge in the common case.
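The effect of the installation order can be sketched with plain POSIX `sigaction`. The code below is a minimal model, not the code from this PR: `noisy_failure_handler` stands in for the Abseil failure handler, the signal list is an illustrative subset, and all function names are hypothetical. Because the last `sigaction` call for a signal wins, installing the shutdown handler second replaces the failure handler for `SIGTERM` only.

```cpp
#include <csignal>
#include <signal.h>

// All names below are hypothetical stand-ins, not Vespa or Abseil code.
volatile std::sig_atomic_t stack_dumps = 0;         // times the "failure handler" ran
volatile std::sig_atomic_t shutdown_requested = 0;  // set by the graceful handler

void noisy_failure_handler(int) { stack_dumps = stack_dumps + 1; }
void graceful_shutdown_handler(int) { shutdown_requested = 1; }

bool install_handler(int signo, void (*handler)(int)) {
    struct sigaction sa{};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, nullptr) == 0;
}

// Step 1: the failure handler claims a set of fatal signals, SIGTERM included.
// (Illustrative subset; the real set is decided by the library.)
bool install_failure_handlers() {
    const int fatal_signals[] = {SIGSEGV, SIGABRT, SIGTERM};
    for (int signo : fatal_signals) {
        if (!install_handler(signo, noisy_failure_handler)) return false;
    }
    return true;
}

// Step 2: the application's own SIGTERM handler is installed afterwards and
// therefore replaces the failure handler for that one signal.
bool setup_signal_handlers() {
    return install_failure_handlers() &&
           install_handler(SIGTERM, graceful_shutdown_handler);
}

// Returns the handler currently registered for `signo`.
void (*current_handler(int signo))(int) {
    struct sigaction current{};
    sigaction(signo, nullptr, &current);
    return current.sa_handler;
}
```

After `setup_signal_handlers()`, a `SIGTERM` only sets `shutdown_requested`, while `SIGSEGV` and `SIGABRT` still reach the failure handler. Reversing the two steps would route `SIGTERM` through the stack-dumping path.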

Example log output when doing a manual kill -SEGV of a searchnode process that is also TSan-instrumented:

WARNING searchnode       stderr       *** SIGSEGV received at time=1712753635 on cpu 2 ***
WARNING searchnode       stderr       PC: @     0xfffff235afb4  (unknown)  nanosleep
WARNING searchnode       stderr           @     0xfffff6e3a5e8        464  absl::lts_20240116::AbslFailureSignalHandler()
WARNING searchnode       stderr           @     0xfffff6ead38c       5008  (unknown)
WARNING searchnode       stderr           @     0xfffff235afa0         16  nanosleep
WARNING searchnode       stderr           @     0xfffff6ed3ef8        128  __interceptor_nanosleep
WARNING searchnode       stderr           @           0x527010        704  App::main()
WARNING searchnode       stderr           @           0x5274e0        272  main
WARNING searchnode       stderr           @     0xffffe8a6a384         48  __libc_start_main
WARNING searchnode       stderr       ThreadSanitizer:DEADLYSIGNAL
WARNING searchnode       stderr       ThreadSanitizer: nested bug in the same thread, aborting.

Prior to this, all we'd get (if core dumps are disabled) is the sound of silence and an exit code log message from the config sentinel.

This mostly resolves issue #29928, but does not currently apply to all C++ processes.

@vekterli vekterli merged commit 8234c73 into master Apr 10, 2024
2 checks passed
@vekterli vekterli deleted the vekterli/install-abseil-failure-handler branch April 10, 2024 14:09