
fix(profiler): pre-register vtable receiver classes via JVMTI (PROF-14618) #527

Draft
jbachorik wants to merge 1 commit into muse/crash-sigsegv-in-std-rb-tree-increment-clean from muse/vtable-target-jvmti-prereg

@jbachorik
Collaborator

@jbachorik jbachorik commented May 13, 2026

What does this PR do?:

Restores vtable_target frame capture in CPU-only recordings. After PR #512 (signal-safe walkVM), _class_map was only populated by allocation/liveness paths via lookupClass. In a CPU-only recording these paths never fire, so bounded_lookup(.., size_limit=0) in hotspotSupport.cpp always missed and vtable-target frames were silently dropped.

This change pre-populates _class_map from safe JVM-thread contexts:

  • Profiler::preregisterLoadedClasses(jvmtiEnv*) calls GetLoadedClasses + GetClassSignature + ObjectSampler::normalizeClassSignature, inserting every reference class into _class_map. Invoked inside the existing exclusive _class_map_lock at Profiler::start() and Profiler::dump() so signal handlers see a coherent, fully-populated map.
  • VM::ClassPrepare is moved out-of-line to vmEntry.cpp (vmEntry.h is included by profiler.h, hence the prior inline definition could not reference Profiler::instance()). The handler still calls loadMethodIDs first, then conditionally inserts the new class when _features.vtable_target is set.

The hot signal-handler path in hotspotSupport.cpp is unchanged — pre-registration runs only on JVM threads, never under a signal.
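
The pre-registration loop described above can be sketched roughly as follows. This is a minimal, JVMTI-free illustration, not the PR's code: `normalizeSignature`, `ToyClassMap`, and `preregister` are hypothetical stand-ins, the input vector plays the role of the `GetLoadedClasses` + `GetClassSignature` results, and the assumption that normalisation strips the `L...;` wrapper from reference-class signatures is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for ObjectSampler::normalizeClassSignature. Assumes
// reference-class signatures ("Lpkg/Name;") are stripped to "pkg/Name" and
// everything else passes through unchanged; the real function may differ.
static bool normalizeSignature(const char* sig, const char** slice, size_t* slice_len) {
  if (sig == nullptr || *sig == '\0') return false;
  size_t len = std::strlen(sig);
  if (len >= 3 && sig[0] == 'L' && sig[len - 1] == ';') {
    *slice = sig + 1;
    *slice_len = len - 2;
  } else {
    *slice = sig;
    *slice_len = len;
  }
  return true;
}

// Toy class map: assigns a stable id (starting at 1) to each distinct name.
class ToyClassMap {
 public:
  int lookup(const char* key, size_t len) {
    auto result = _ids.emplace(std::string(key, len), static_cast<int>(_ids.size()) + 1);
    return result.first->second;  // existing id on a hit, new id on an insert
  }
  bool contains(const std::string& key) const { return _ids.count(key) != 0; }
  size_t size() const { return _ids.size(); }

 private:
  std::unordered_map<std::string, int> _ids;
};

// Stand-in for the bulk loop: returns how many signatures were accepted.
static size_t preregister(ToyClassMap& map, const std::vector<const char*>& signatures) {
  size_t accepted = 0;
  for (const char* sig : signatures) {
    const char* slice = nullptr;
    size_t slice_len = 0;
    if (normalizeSignature(sig, &slice, &slice_len)) {
      assert(slice != nullptr);
      (void)map.lookup(slice, slice_len);
      accepted++;
    }
  }
  return accepted;
}
```

Calling `preregister` with `"Ljava/lang/String;"` would leave the key `java/lang/String` in the toy map, which is the shape a later read-only lookup is assumed to use.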

Motivation:

Follow-up to PR #512 / PROF-14582. lookupClass from a signal handler is unsafe (malloc inside Dictionary::lookup), so the signal-safe path was made read-only via bounded_lookup(size_limit=0). That left CPU-only recordings without any path that would populate the map, silently breaking the vtable-target feature.
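
To make the failure mode concrete, here is a small sketch of the lookup contract as this description states it: an inserting `lookup` versus a `bounded_lookup` that never inserts when `size_limit` is 0 and reports a miss as `INT_MAX`. `ToyDictionary` is a hypothetical model built on `std::unordered_map`; it only mimics the insert-or-miss behaviour and is not itself signal-safe.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical model of the dictionary contract; not the profiler's Dictionary.
class ToyDictionary {
 public:
  static constexpr unsigned int kMiss = INT_MAX;

  // Inserting lookup: allocates on a miss, so it must stay off the signal path.
  unsigned int lookup(const std::string& key) {
    auto it = _ids.find(key);
    if (it != _ids.end()) return it->second;
    unsigned int id = static_cast<unsigned int>(_ids.size()) + 1;
    _ids.emplace(key, id);
    return id;
  }

  // Bounded lookup: inserts only while the map holds fewer than size_limit
  // entries. With size_limit == 0 it is read-only and returns kMiss on a miss.
  unsigned int bounded_lookup(const std::string& key, size_t size_limit) {
    auto it = _ids.find(key);
    if (it != _ids.end()) return it->second;
    if (_ids.size() >= size_limit) return kMiss;
    return lookup(key);
  }

  size_t size() const { return _ids.size(); }

 private:
  std::unordered_map<std::string, unsigned int> _ids;
};
```

In a CPU-only recording nothing ever calls the inserting `lookup`, so every `bounded_lookup(key, 0)` in this model returns `kMiss`; once some non-signal path has inserted the key, the same call returns its id.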

Additional Notes:

  • Profiler::start() is called before _features is assigned, so the gate inside the exclusive-lock block reads args._features.vtable_target && VMStructs::hasClassNames() (mirroring the sanitisation at profiler.cpp:1129). Profiler::dump() uses the same condition on _features directly.
  • preregisterLoadedClasses calls _class_map.lookup directly rather than Profiler::lookupClass, because the caller already holds the exclusive lock — lookupClass would re-attempt tryLockShared and either spin or return -1, defeating the purpose.
  • JVMTI errors on GetLoadedClasses / GetClassSignature are now logged via Log::warn so silent skips are observable.
  • Primitive and array signatures from GetClassSignature pass through normalizeClassSignature unstripped; the resulting keys never match walkVM's lookup format, so they are harmless dictionary entries.
  • Locking-overhead concerns raised in code review are addressed by the locking-improvement work in #524 (fix(profiler): lock-free class/endpoint/context maps via TripleBufferedDictionary). Once that lands, _class_map_lock no longer blocks signal readers across the full bulk-iteration window.
  • The Java integration test was intentionally dropped from this PR. vtable_target has no CLI toggle yet and the synthetic BCI_ALLOC frame is recorded as a class id (not as text in STACK_TRACE_STRING), so a useful integration test needs a separate follow-up.
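
The note about bypassing lookupClass can be illustrated with a toy lock. The sketch below assumes a reader/writer try-lock where an exclusive holder makes `tryLockShared` fail, and a `lookupClass` that returns -1 in that case; `ToyLock` and `ToyProfiler` are hypothetical and the toy map is not concurrent, so this only shows the single-threaded re-entrancy problem.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical try-lock: -1 means held exclusively, n > 0 means n shared holders.
class ToyLock {
 public:
  void lock() {
    int expected = 0;
    while (!_state.compare_exchange_weak(expected, -1)) {
      expected = 0;  // spin until no holder remains
    }
  }
  void unlock() { _state.store(0); }
  bool tryLockShared() {
    int value = _state.load();
    while (value >= 0) {
      if (_state.compare_exchange_weak(value, value + 1)) return true;
    }
    return false;  // an exclusive holder exists
  }
  void unlockShared() { _state.fetch_sub(1); }

 private:
  std::atomic<int> _state{0};
};

class ToyProfiler {
 public:
  // Modelled on the described lookupClass: gives up with -1 when the shared
  // lock cannot be taken.
  int lookupClass(const std::string& name) {
    if (!_lock.tryLockShared()) return -1;
    int id = insert(name);
    _lock.unlockShared();
    return id;
  }

  // Bulk registration under the exclusive lock. via_lookup_class selects the
  // self-defeating variant; otherwise the map is updated directly.
  size_t preregister(const std::vector<std::string>& names, bool via_lookup_class) {
    size_t registered = 0;
    _lock.lock();
    for (const std::string& name : names) {
      int id = via_lookup_class ? lookupClass(name) : insert(name);
      if (id >= 0) registered++;
    }
    _lock.unlock();
    return registered;
  }

  size_t size() const { return _class_map.size(); }

 private:
  int insert(const std::string& name) {
    auto result = _class_map.emplace(name, static_cast<int>(_class_map.size()));
    return result.first->second;
  }

  ToyLock _lock;
  std::unordered_map<std::string, int> _class_map;
};
```

With `via_lookup_class` set, every call fails because the caller already owns the lock exclusively, so nothing is registered; the direct path registers every name.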

How to test the change?:

  • New C++ unit test: dictionary_concurrent_ut.cpp::BulkInsertThenBoundedLookupHitsMirrorInsertedIds validates that bulk pre-registration under an exclusive lock makes every key visible to bounded_lookup(.., 0) from a shared context, and that absent keys return INT_MAX.
  • All 6 DictionaryConcurrent tests pass locally via ./gradlew :ddprof-lib:gtestDebug_dictionary_concurrent_ut.

For Datadog employees:

  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.
  • JIRA: PROF-14618

Contributor

Copilot AI left a comment

Pull request overview

This PR restores vtable_target frame capture for CPU-only recordings by ensuring _class_map is populated from JVM-thread contexts (rather than relying on allocation/liveness paths that don’t run in CPU-only mode), keeping the signal-handler lookup path read-only and signal-safe.

Changes:

  • Bulk pre-registers loaded classes into _class_map during Profiler::start() and Profiler::dump() under the existing exclusive _class_map_lock.
  • Moves VM::ClassPrepare out-of-line and extends it to insert newly prepared classes into _class_map when vtable_target is enabled.
  • Adds a C++ unit test covering the “bulk insert then bounded_lookup(…, 0) hits” contract and stages a (currently disabled) Java integration test.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| ddprof-test/src/test/java/com/datadoghq/profiler/cpu/VtableTargetCpuTest.java | Adds a staged (disabled) integration test outlining expected vtable-target behavior in CPU samples. |
| ddprof-lib/src/test/cpp/dictionary_concurrent_ut.cpp | Adds a unit test validating bulk insertion visibility to bounded_lookup(..., 0) and sentinel behavior on miss. |
| ddprof-lib/src/main/cpp/vmEntry.h | Moves VM::ClassPrepare definition out of the header into vmEntry.cpp. |
| ddprof-lib/src/main/cpp/vmEntry.cpp | Implements VM::ClassPrepare to insert normalized class signatures into the class map when vtable_target is enabled. |
| ddprof-lib/src/main/cpp/profiler.h | Adds Profiler::preregisterLoadedClasses(jvmtiEnv*) declaration and documents intended locking/threading constraints. |
| ddprof-lib/src/main/cpp/profiler.cpp | Calls preregisterLoadedClasses() during start/dump class-map resets and implements the bulk JVMTI enumeration/insertion logic. |

const char* slice = nullptr;
size_t slice_len = 0;
if (ObjectSampler::normalizeClassSignature(sig, &slice, &slice_len)) {
(void)profiler->lookupClass(slice, slice_len);
Comment thread ddprof-lib/src/main/cpp/profiler.h Outdated
Comment on lines +216 to +219
// Pre-populate _class_map with all currently-loaded reference classes so that
// signal-safe lookups in walkVM (vtable_target) can resolve them without ever
// needing to malloc. Caller MUST hold _class_map_lock EXCLUSIVELY. Runs on a
// JVM thread (never in a signal handler).
Comment on lines +253 to +257
// (4a) Bulk exclusive-lock insert (mimicking preregisterLoadedClasses) followed
// by read-only bounded_lookup(0) on each inserted key. Verifies that the
// pre-registration contract holds: every key inserted while the exclusive lock
// is held is subsequently visible to bounded_lookup(0) from a shared context.
TEST(DictionaryConcurrent, BulkInsertThenBoundedLookupHitsMirrorInsertedIds) {
@dd-octo-sts
Contributor

dd-octo-sts Bot commented May 13, 2026

CI Test Results

Run: #25813605948 | Commit: 3196908 | Duration: 25m 22s (longest job)

23 of 32 test jobs failed

Status Overview

[Status matrix: rows are JDK builds 8, 8-ibm, 8-j9, 8-librca, 8-orcl, 11, 11-j9, 11-librca, 17, 17-graal, 17-j9, 17-librca, 21, 21-graal, 21-librca, 25, 25-graal, 25-librca; columns are glibc-aarch64/debug, glibc-amd64/debug, musl-aarch64/debug, musl-amd64/debug. Per-cell status icons are not preserved here.]

Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled

Failed Tests

glibc-amd64/debug: 8, 8-ibm, 8-j9, 8-orcl, 11, 11-j9, 17, 17-graal, 17-j9, 21, 21-graal, 25, 25-graal

musl-aarch64/debug: 8-librca, 11-librca, 17-librca, 21-librca, 25-librca

musl-amd64/debug: 8-librca, 11-librca, 17-librca, 21-librca, 25-librca

No detailed failure information is available for any of these jobs. Check the job logs.

Summary: Total: 32 | Passed: 9 | Failed: 23


Updated: 2026-05-13 17:26:16 UTC

@jbachorik
Collaborator Author

Reorganization plan

This PR is PR D in the reorganized sequence:

#510 (PR B) → #524 (PR A) → #527 (this, PR D)

This PR conflicts with #524: it uses _class_map_lock.lock()/unlock() and _class_map.lookup() directly, but #524 removes _class_map_lock entirely and changes _class_map from Dictionary to TripleBufferedDictionary.

After #524 merges, this PR should be rebased onto main with these adjustments:

@jbachorik force-pushed the muse/vtable-target-jvmti-prereg branch from 3a34b65 to 3115f92 (May 13, 2026 16:45)
@jbachorik changed the base branch from main to muse/crash-sigsegv-in-std-rb-tree-increment-clean (May 13, 2026 16:45)
@jbachorik
Collaborator Author

Rebased onto #524 (muse/crash-sigsegv-in-std-rb-tree-increment-clean). The two locking changes the previous comment flagged are now applied:

  • _class_map_lock.lock()/unlock() removed; the surrounding clearAll() / clearStandby() calls introduced in #524 already serialise dictionary lifecycle.
  • _class_map.lookup(slice, slice_len) now resolves to TripleBufferedDictionary::lookup, which acquires a per-thread RefCountGuard on the active buffer, so it is thread-safe with concurrent signal-handler bounded_lookup(.., 0) callers.

The test addition in dictionary_concurrent_ut.cpp was dropped because #524 removes that file. If a regression test for bulk pre-registration is desired, it should go into dictionary_ut.cpp (or a new file) as a follow-up.
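
For readers unfamiliar with the guard pattern mentioned above, here is a deliberately simplified sketch of an RAII reference-count guard. It is an assumption-laden illustration, not the #524 implementation: `ToyBuffer`, `ToyRefCountGuard`, and `safeToRecycle` are hypothetical names, and a single per-buffer counter stands in for whatever per-thread bookkeeping the real RefCountGuard uses.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical buffer with a count of callers currently pinning it.
struct ToyBuffer {
  std::atomic<int> readers{0};
};

// RAII guard: pins the buffer for the guard's lifetime.
class ToyRefCountGuard {
 public:
  explicit ToyRefCountGuard(ToyBuffer& buffer) : _buffer(buffer) {
    _buffer.readers.fetch_add(1);
  }
  ~ToyRefCountGuard() { _buffer.readers.fetch_sub(1); }

  ToyRefCountGuard(const ToyRefCountGuard&) = delete;
  ToyRefCountGuard& operator=(const ToyRefCountGuard&) = delete;

 private:
  ToyBuffer& _buffer;
};

// A writer in this model may clear or recycle a buffer only when no guard
// pins it.
static bool safeToRecycle(const ToyBuffer& buffer) {
  return buffer.readers.load() == 0;
}
```

A production design also has to close the window between a writer's check and a reader taking a new guard, typically by re-validating which buffer is active after pinning; this sketch leaves that out.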

@jbachorik force-pushed the muse/crash-sigsegv-in-std-rb-tree-increment-clean branch from b90761e to 76d919d (May 13, 2026 16:52)
…4618)

Restores vtable_target frame capture in CPU-only recordings. After PR #512
made walkVM signal-safe (bounded_lookup with size_limit=0, no malloc),
_class_map was only populated by allocation/liveness paths via lookupClass.
In a CPU-only recording these paths never fire, so signal-safe lookups
always missed and vtable-target frames were silently dropped.

This change pre-populates _class_map from safe JVM-thread contexts:

- Profiler::preregisterLoadedClasses(jvmtiEnv*) calls GetLoadedClasses +
  GetClassSignature + ObjectSampler::normalizeClassSignature, inserting
  every reference class into _class_map.  Invoked at Profiler::start()
  (after clearAll()) and Profiler::dump() (after clearStandby()).
  TripleBufferedDictionary::lookup is thread-safe via RefCountGuard so no
  external locking is needed.

- VM::ClassPrepare is moved out-of-line to vmEntry.cpp (vmEntry.h is
  included by profiler.h, hence the prior inline definition could not
  reference Profiler::instance()).  The handler still calls loadMethodIDs
  first, then conditionally inserts the new class when
  _features.vtable_target is set.

The hot signal-handler path in hotspotSupport.cpp is unchanged —
pre-registration runs only on JVM threads, never under a signal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik force-pushed the muse/vtable-target-jvmti-prereg branch from 3115f92 to 6e825c7 (May 13, 2026 16:53)
2 participants