fix(profiler): pre-register vtable receiver classes via JVMTI (PROF-14618) #527
Pull request overview
This PR restores vtable_target frame capture for CPU-only recordings by ensuring _class_map is populated from JVM-thread contexts (rather than relying on allocation/liveness paths that don’t run in CPU-only mode), keeping the signal-handler lookup path read-only and signal-safe.
Changes:
- Bulk pre-registers loaded classes into `_class_map` during `Profiler::start()` and `Profiler::dump()` under the existing exclusive `_class_map_lock`.
- Moves `VM::ClassPrepare` out-of-line and extends it to insert newly prepared classes into `_class_map` when `vtable_target` is enabled.
- Adds a C++ unit test covering the “bulk insert then `bounded_lookup(…, 0)` hits” contract and stages a (currently disabled) Java integration test.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| ddprof-test/src/test/java/com/datadoghq/profiler/cpu/VtableTargetCpuTest.java | Adds a staged (disabled) integration test outlining expected vtable-target behavior in CPU samples. |
| ddprof-lib/src/test/cpp/dictionary_concurrent_ut.cpp | Adds a unit test validating bulk insertion visibility to bounded_lookup(..., 0) and sentinel behavior on miss. |
| ddprof-lib/src/main/cpp/vmEntry.h | Moves VM::ClassPrepare definition out of the header into vmEntry.cpp. |
| ddprof-lib/src/main/cpp/vmEntry.cpp | Implements VM::ClassPrepare to insert normalized class signatures into the class map when vtable_target is enabled. |
| ddprof-lib/src/main/cpp/profiler.h | Adds Profiler::preregisterLoadedClasses(jvmtiEnv*) declaration and documents intended locking/threading constraints. |
| ddprof-lib/src/main/cpp/profiler.cpp | Calls preregisterLoadedClasses() during start/dump class-map resets and implements the bulk JVMTI enumeration/insertion logic. |
const char* slice = nullptr;
size_t slice_len = 0;
if (ObjectSampler::normalizeClassSignature(sig, &slice, &slice_len)) {
  (void)profiler->lookupClass(slice, slice_len);
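The excerpt above hands `lookupClass` a slice of the JVMTI signature rather than the signature itself. As a rough illustration of that kind of normalization, here is a minimal sketch. `normalizeSignatureSketch` is a hypothetical stand-in, not the real `ObjectSampler::normalizeClassSignature`, and its rules (strip the `L…;` wrapper from reference classes, pass anything else through unchanged) are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for ObjectSampler::normalizeClassSignature; the real
// function's exact rules are not shown in this PR.
// For a reference-class signature such as "Ljava/lang/String;" it points
// *slice at "java/lang/String" inside the caller's buffer (no copy, no
// allocation) and stores the slice length in *slice_len.
// Any other non-empty signature is passed through unchanged (an assumption).
bool normalizeSignatureSketch(const char* sig, const char** slice, size_t* slice_len) {
    if (sig == nullptr || slice == nullptr || slice_len == nullptr) {
        return false;
    }
    size_t len = std::strlen(sig);
    if (len == 0) {
        return false;
    }
    if (len >= 3 && sig[0] == 'L' && sig[len - 1] == ';') {
        // Drop the leading 'L' and the trailing ';'.
        *slice = sig + 1;
        *slice_len = len - 2;
    } else {
        *slice = sig;
        *slice_len = len;
    }
    assert(*slice_len <= len);  // the slice never extends past the input
    return true;
}
```

Because the slice aliases the input buffer, the caller has to finish using it before releasing the signature string.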
// Pre-populate _class_map with all currently-loaded reference classes so that
// signal-safe lookups in walkVM (vtable_target) can resolve them without ever
// needing to malloc. Caller MUST hold _class_map_lock EXCLUSIVELY. Runs on a
// JVM thread (never in a signal handler).
// (4a) Bulk exclusive-lock insert (mimicking preregisterLoadedClasses) followed
// by read-only bounded_lookup(0) on each inserted key. Verifies that the
// pre-registration contract holds: every key inserted while the exclusive lock
// is held is subsequently visible to bounded_lookup(0) from a shared context.
TEST(DictionaryConcurrent, BulkInsertThenBoundedLookupHitsMirrorInsertedIds) {
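The contract this test describes can be sketched without the real dictionary. The example below uses a hypothetical `ToyDictionary` backed by a plain `std::unordered_map`; it only models the return values (hit returns the inserted id, miss with `size_limit == 0` returns `INT_MAX` and inserts nothing). Unlike the real read path, it allocates a temporary key and is not signal-safe, and the exclusive lock is omitted.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Sentinel returned by a bounded lookup that misses and may not insert.
constexpr unsigned int kMiss = static_cast<unsigned int>(INT_MAX);

// Toy stand-in for the profiler's dictionary (hypothetical). It is a plain
// map for illustration and allocates on every call, so it is NOT signal-safe.
class ToyDictionary {
public:
    // Inserting lookup: returns the existing id or assigns the next one.
    unsigned int lookup(const char* key, size_t len) {
        assert(key != nullptr);
        auto result = _ids.emplace(std::string(key, len), _next_id);
        if (result.second) {
            _next_id++;
        }
        return result.first->second;
    }

    // Bounded lookup: inserts only while size() < size_limit.
    // With size_limit == 0 it never inserts and returns kMiss on a miss.
    unsigned int bounded_lookup(const char* key, size_t len, size_t size_limit) {
        assert(key != nullptr);
        auto it = _ids.find(std::string(key, len));
        if (it != _ids.end()) {
            return it->second;
        }
        if (_ids.size() >= size_limit) {
            return kMiss;
        }
        return lookup(key, len);
    }

    size_t size() const { return _ids.size(); }

private:
    std::unordered_map<std::string, unsigned int> _ids;
    unsigned int _next_id = 1;
};

// Mimics the bulk pre-registration step: insert every key, remember its id.
std::vector<unsigned int> bulkInsert(ToyDictionary& dict, const std::vector<std::string>& keys) {
    std::vector<unsigned int> ids;
    ids.reserve(keys.size());
    for (const std::string& key : keys) {
        ids.push_back(dict.lookup(key.data(), key.size()));
    }
    return ids;
}
```

A caller would run `bulkInsert` once from a normal thread and afterwards use only `bounded_lookup(key, len, 0)`, treating `kMiss` as "class not registered".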
CI Test Results
Run: #25813605948 | Commit:
Status Overview
Legend: ✅ passed | ❌ failed | ⚪ skipped | 🚫 cancelled
Failed Tests
No detailed failure information is available for the jobs below; check the job logs.
- musl-aarch64/debug / 8-librca
- musl-amd64/debug / 25-librca
- musl-aarch64/debug / 17-librca
- musl-amd64/debug / 21-librca
- musl-amd64/debug / 8-librca
- musl-amd64/debug / 11-librca
- musl-amd64/debug / 17-librca
- musl-aarch64/debug / 21-librca
- musl-aarch64/debug / 11-librca
- musl-aarch64/debug / 25-librca
- glibc-amd64/debug / 21
- glibc-amd64/debug / 25
- glibc-amd64/debug / 8-j9
- glibc-amd64/debug / 17
- glibc-amd64/debug / 21-graal
- glibc-amd64/debug / 17-graal
- glibc-amd64/debug / 25-graal
- glibc-amd64/debug / 11
- glibc-amd64/debug / 8-ibm
- glibc-amd64/debug / 8-orcl
- glibc-amd64/debug / 17-j9
- glibc-amd64/debug / 11-j9
Summary: Total: 32 | Passed: 9 | Failed: 23
Updated: 2026-05-13 17:26:16 UTC
Reorganization plan
This PR is PR D in the reorganized sequence:
This PR conflicts with #524: it uses
After #524 merges, this PR should be rebased onto
3a34b65 to 3115f92
Rebased onto #524 (
The test addition in
b90761e to 76d919d
…4618)

Restores vtable_target frame capture in CPU-only recordings. After PR #512 made walkVM signal-safe (bounded_lookup with size_limit=0, no malloc), _class_map was only populated by allocation/liveness paths via lookupClass. In a CPU-only recording these paths never fire, so signal-safe lookups always missed and vtable-target frames were silently dropped.

This change pre-populates _class_map from safe JVM-thread contexts:

- Profiler::preregisterLoadedClasses(jvmtiEnv*) calls GetLoadedClasses + GetClassSignature + ObjectSampler::normalizeClassSignature, inserting every reference class into _class_map. Invoked at Profiler::start() (after clearAll()) and Profiler::dump() (after clearStandby()). TripleBufferedDictionary::lookup is thread-safe via RefCountGuard so no external locking is needed.
- VM::ClassPrepare is moved out-of-line to vmEntry.cpp (vmEntry.h is included by profiler.h, hence the prior inline definition could not reference Profiler::instance()). The handler still calls loadMethodIDs first, then conditionally inserts the new class when _features.vtable_target is set.

The hot signal-handler path in hotspotSupport.cpp is unchanged: pre-registration runs only on JVM threads, never under a signal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3115f92 to 6e825c7
What does this PR do?:

Restores `vtable_target` frame capture in CPU-only recordings. After PR #512 (signal-safe `walkVM`), `_class_map` was only populated by allocation/liveness paths via `lookupClass`. In a CPU-only recording these paths never fire, so `bounded_lookup(.., size_limit=0)` in `hotspotSupport.cpp` always missed and vtable-target frames were silently dropped.

This change pre-populates `_class_map` from safe JVM-thread contexts:

- `Profiler::preregisterLoadedClasses(jvmtiEnv*)` calls `GetLoadedClasses` + `GetClassSignature` + `ObjectSampler::normalizeClassSignature`, inserting every reference class into `_class_map`. Invoked inside the existing exclusive `_class_map_lock` at `Profiler::start()` and `Profiler::dump()` so signal handlers see a coherent, fully-populated map.
- `VM::ClassPrepare` is moved out-of-line to `vmEntry.cpp` (vmEntry.h is included by profiler.h, hence the prior inline definition could not reference `Profiler::instance()`). The handler still calls `loadMethodIDs` first, then conditionally inserts the new class when `_features.vtable_target` is set.

The hot signal-handler path in `hotspotSupport.cpp` is unchanged: pre-registration runs only on JVM threads, never under a signal.

Motivation:

Follow-up to PR #512 / PROF-14582. `lookupClass` from a signal handler is unsafe (malloc inside `Dictionary::lookup`), so the signal-safe path was made read-only via `bounded_lookup(size_limit=0)`. That left CPU-only recordings without any path that would populate the map, silently breaking the vtable-target feature.

Additional Notes:

- `Profiler::start()` is called before `_features` is assigned, so the gate inside the exclusive-lock block reads `args._features.vtable_target && VMStructs::hasClassNames()` (mirroring the sanitisation at `profiler.cpp:1129`). `Profiler::dump()` uses the same condition on `_features` directly.
- `preregisterLoadedClasses` calls `_class_map.lookup` directly rather than `Profiler::lookupClass`, because the caller already holds the exclusive lock; `lookupClass` would re-attempt `tryLockShared` and either spin or return `-1`, defeating the purpose.
- … `GetLoadedClasses`/`GetClassSignature` are now logged via `Log::warn` so silent skips are observable.
- … `GetClassSignature` pass through `normalizeClassSignature` unstripped; the resulting keys never match `walkVM`'s lookup format, so they are harmless dictionary entries.
- `_class_map_lock` no longer blocks signal readers across the full bulk-iteration window.
- `vtable_target` has no CLI toggle yet and the synthetic `BCI_ALLOC` frame is recorded as a class id (not as text in `STACK_TRACE_STRING`), so a useful integration test needs a separate follow-up.

How to test the change?:

- `dictionary_concurrent_ut.cpp::BulkInsertThenBoundedLookupHitsMirrorInsertedIds` validates that bulk pre-registration under an exclusive lock makes every key visible to `bounded_lookup(.., 0)` from a shared context, and that absent keys return `INT_MAX`.
- `DictionaryConcurrent` tests pass locally via `./gradlew :ddprof-lib:gtestDebug_dictionary_concurrent_ut`.

For Datadog employees:

@DataDog/security-design-and-guidance.
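To illustrate the Additional Notes point about bypassing `Profiler::lookupClass` while the exclusive lock is held, here is a minimal sketch of a reader/writer try-lock. `ToySpinLock` and `guardedLookup` are hypothetical names, not the profiler's lock or lookup function; the sketch only shows why a shared try-lock fails, and a guarded lookup returns `-1`, for as long as an exclusive holder exists.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reader/writer try-lock (not the profiler's implementation).
// State encoding: 0 = free, -1 = held exclusively, n > 0 = n shared holders.
class ToySpinLock {
public:
    bool tryLock() {
        int expected = 0;
        return _state.compare_exchange_strong(expected, -1, std::memory_order_acquire);
    }

    void unlock() { _state.store(0, std::memory_order_release); }

    bool tryLockShared() {
        int value = _state.load(std::memory_order_relaxed);
        // A negative value means an exclusive holder exists: give up at once.
        while (value >= 0) {
            // On failure, compare_exchange_weak reloads `value` for the retry.
            if (_state.compare_exchange_weak(value, value + 1, std::memory_order_acquire)) {
                return true;
            }
        }
        return false;
    }

    void unlockShared() { _state.fetch_sub(1, std::memory_order_release); }

private:
    std::atomic<int> _state{0};
};

// Shape of a lookup that guards itself with the shared lock: it returns -1
// when the shared lock cannot be taken, otherwise the id it was asked for.
int guardedLookup(ToySpinLock& lock, int id) {
    assert(id >= 0);
    if (!lock.tryLockShared()) {
        return -1;
    }
    lock.unlockShared();
    return id;
}
```

Under this model, code that already holds the exclusive lock has to call the underlying map directly, because any path that first tries the shared lock will be refused.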