fix(profiler): close SIGVTALRM race in pthread_create wrapper teardown (PROF-14603) #529

Closed
jbachorik wants to merge 1 commit into muse/sigsegv-in-recording from muse/teardown-race-sigvtalrm

Conversation

@jbachorik
Collaborator

What does this PR do?:

Wraps Profiler::unregisterThread() + ProfiledThread::release() under a single SignalBlocker via a new unregister_and_release() helper, closing the race window where a SIGVTALRM delivered between unregisterThread() returning and release() acquiring its internal guard would dereference a dangling ProfiledThread pointer.

Motivation:

JavaThread::~JavaThread / OSThread::~OSThread crashed (SIGSEGV) on JDK 25 because the signal handler called currentSignalSafe() in the race window above and dereferenced freed memory. PROF-14603.

Additional Notes:

How to test the change?:

  • thread_teardown_safety_ut.cpp adds ThreadTeardownSafetyTest (T-01..T-10) covering the full teardown lifecycle under signal load.
  • CI: test-linux-musl-aarch64 and test-linux-glibc-amd64 matrices must remain green.

For Datadog employees:

  • This PR doesn't touch any of that.

  • JIRA: PROF-14603

JavaThread::~JavaThread / OSThread::~OSThread crashed on JDK 25 when
SIGVTALRM was delivered inside the ddprof pthread_create wrapper between
Profiler::unregisterThread() returning and ProfiledThread::release()
acquiring its internal guard. The signal handler called
currentSignalSafe() and dereferenced the now-freed ProfiledThread.

Fix: extract unregister_and_release(tid) — a noinline helper that holds
a SignalBlocker for the entire unregister+release sequence. Both
start_routine_wrapper and start_routine_wrapper_spec invoke it; the
race window is eliminated without duplicating signal-masking logic.
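
A minimal sketch of the pattern described above, not the code from this PR. ScopedSignalBlocker, ThreadRecord, current_record, vtalrm_blocked() and unregister_and_release_sketch() are hypothetical stand-ins for SignalBlocker, ProfiledThread and the real helper, and the sketch assumes that blocking SIGVTALRM on the calling thread via pthread_sigmask is enough to keep the handler out of the sequence:

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Hypothetical stand-in for SignalBlocker: an RAII guard that blocks SIGVTALRM
// on the calling thread and restores the previous mask on scope exit.
class ScopedSignalBlocker {
public:
    ScopedSignalBlocker() {
        sigset_t block;
        sigemptyset(&block);
        sigaddset(&block, SIGVTALRM);
        _active = pthread_sigmask(SIG_BLOCK, &block, &_old) == 0;
    }
    ~ScopedSignalBlocker() {
        if (_active) pthread_sigmask(SIG_SETMASK, &_old, nullptr);
    }
    ScopedSignalBlocker(const ScopedSignalBlocker&) = delete;
    ScopedSignalBlocker& operator=(const ScopedSignalBlocker&) = delete;
private:
    sigset_t _old;
    bool _active;
};

// Hypothetical per-thread record that a signal handler would look up.
struct ThreadRecord { int tid; };
static thread_local ThreadRecord* current_record = nullptr;

// Reports whether SIGVTALRM is currently blocked on the calling thread.
static bool vtalrm_blocked() {
    sigset_t cur;
    pthread_sigmask(SIG_BLOCK, nullptr, &cur);  // null set: query only
    return sigismember(&cur, SIGVTALRM) == 1;
}

// Unregister and release under one guard, so no SIGVTALRM handler can run on
// this thread between the two steps. Returns whether the signal was masked.
__attribute__((noinline)) static bool unregister_and_release_sketch() {
    ScopedSignalBlocker blocker;
    const bool masked = vtalrm_blocked();
    assert(masked);               // invariant for the whole sequence
    ThreadRecord* rec = current_record;
    current_record = nullptr;     // "unregister": handler can no longer find it
    delete rec;                   // "release": free the record
    return masked;
}                                 // previous signal mask restored here
```

Keeping both steps in one function means every caller gets the same masking scope, which is the property the commit message attributes to the shared helper.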

Same SignalBlocker pattern is applied to perfEvents_linux.cpp's
pthread_setspecific_hook teardown path.

thread.h guards clearCurrentThreadTLS() with #ifdef UNIT_TEST so it
is absent from production builds; GtestTaskBuilder.kt adds -DUNIT_TEST
to the gtest compiler flags so the guarded method compiles in tests.
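
The guard described above could look roughly like the following sketch. ThreadSlot and its members are hypothetical and do not mirror the real thread.h; only the names UNIT_TEST and clearCurrentThreadTLS() come from the commit message. The macro is defined inline here solely so the sketch compiles standalone; in the real build it is assumed to arrive via -DUNIT_TEST:

```cpp
#include <cassert>

// Assumption: normally supplied by the build (-DUNIT_TEST on the gtest
// compiler flags). Defined here only to keep the sketch self-contained.
#ifndef UNIT_TEST
#define UNIT_TEST 1
#endif

// Hypothetical miniature of a per-thread registry; not the real thread.h.
class ThreadSlot {
public:
    static void set(int* p) { _current = p; }
    static int* current() { return _current; }

#ifdef UNIT_TEST
    // Test-only hook: resets the thread-local pointer so a test can simulate
    // a thread whose record has already been torn down. Without UNIT_TEST the
    // method does not exist, so production code cannot call it by accident.
    static void clearCurrentThreadTLS() { _current = nullptr; }
#endif

private:
    static thread_local int* _current;
};

thread_local int* ThreadSlot::_current = nullptr;
```

Because the method is compiled out rather than merely unused, a production call site would fail at compile time instead of silently clearing thread state.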

thread_teardown_safety_ut.cpp adds an acceptance-test suite
(ThreadTeardownSafetyTest T-01..T-10) covering the full teardown
lifecycle under signal load.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Collaborator Author

Folded into #510. The PROF-14603 race fix is now the third commit on muse/sigsegv-in-recording (unregister_and_release helper in libraryPatcher_linux.cpp + perfEvents_linux.cpp + thread_teardown_safety_ut.cpp + GtestTaskBuilder.kt + thread.h UNIT_TEST guard). Since both fixes touch libraryPatcher_linux.cpp and address the same wrapper, a single PR is cleaner than stacking.

@dd-octo-sts
Contributor

dd-octo-sts Bot commented May 13, 2026

CI Test Results

Run: #25810280499 | Commit: 4ddb70a | Duration: 3h 1m 5s (longest job)

22 of 32 test jobs failed

Status Overview

JDK glibc-aarch64/debug glibc-amd64/debug musl-aarch64/debug musl-amd64/debug
8 - - -
8-ibm - - -
8-j9 - -
8-librca - -
8-orcl - - -
11 - - -
11-j9 - -
11-librca - -
17 - -
17-graal - -
17-j9 - -
17-librca - -
21 - -
21-graal 🚫 - -
21-librca - -
25 - -
25-graal - -
25-librca - -

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Failed Tests

  • musl-amd64/debug / 17-librca
  • musl-amd64/debug / 25-librca
  • musl-aarch64/debug / 25-librca
  • musl-amd64/debug / 11-librca
  • musl-aarch64/debug / 8-librca
  • musl-aarch64/debug / 11-librca
  • musl-amd64/debug / 21-librca
  • glibc-amd64/debug / 8
  • musl-aarch64/debug / 17-librca
  • glibc-amd64/debug / 8-j9
  • glibc-amd64/debug / 11-j9
  • musl-amd64/debug / 8-librca
  • glibc-amd64/debug / 17-j9
  • glibc-amd64/debug / 17-graal
  • glibc-amd64/debug / 11
  • glibc-amd64/debug / 8-ibm
  • glibc-amd64/debug / 8-orcl
  • glibc-amd64/debug / 25
  • glibc-amd64/debug / 17
  • musl-aarch64/debug / 21-librca
  • glibc-amd64/debug / 21
  • glibc-amd64/debug / 25-graal

No detailed failure information is available for any of these jobs. Check the job logs.

Summary: Total: 32 | Passed: 9 | Failed: 22 | Cancelled: 1


Updated: 2026-05-13 18:59:51 UTC
