hs_err_pid1.redacted.log
Tracer Version(s)
1.62.0 (crashes). Reverting to 1.60.1 resolves it, so this is a regression introduced between 1.60.1 and 1.62.0.
Java Version(s)
OpenJDK Runtime Environment Temurin-25.0.1+8 (build 25.0.1+8-LTS), linux-amd64
JVM Vendor
Eclipse Adoptium / Temurin
OS / Environment
Alpine Linux v3.22 (musl libc), QEMU 4 cores / 11G, container -Xmx6g, G1 GC. Spring Boot 4.0.6 app, virtual threads in use. Continuous profiler enabled (native ddprof, allocation profiling on).
Bug Report
After upgrading the agent to 1.62.0, the JVM hard-crashes (SIGSEGV, is_crash) intermittently, about 10 minutes after startup under normal traffic. The crash is in the Datadog allocation profiler: on a sampled allocation it calls JVMTI GetStackTrace, and walking a virtual thread's frames segfaults inside vframe::java_sender().
This appears to be a sibling of #9830 but on a different stackwalk path: #9830 was the async/ASGCT path (vframeStreamForte::forte_next), mitigated by -Ddd.profiling.ddprof.cstack=vm (default since 1.55.0). This one is the JVMTI GetStackTrace path used by the allocation sampler on virtual threads, which cstack=vm does not appear to cover (we are on 1.62.0, well past that default, and still crashing).
Signal details: si_code: 128 (SI_KERNEL), si_addr: 0x0. The faulting thread is a virtual-thread carrier (ForkJoinPool-1-worker) in state _thread_in_vm. The ArrayList.grow / QueryExecutorImpl.processResults Java frames are just the innocent allocation being sampled (we verified that no large allocation is involved; the underlying collection is small).
Workaround: downgrade to 1.60.1 (clean). We expect DD_PROFILING_ALLOCATION_ENABLED=false (or DD_PROFILING_DDPROF_ENABLED=false, given musl) would also avoid it.
Crash stack (top frames from hs_err_pid1.log):
# SIGSEGV (0xb) at pc=..., pid=1, tid=197
# Java VM: OpenJDK 64-Bit Server VM Temurin-25.0.1+8 (25.0.1+8-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x1190128] vframe::java_sender() const+0x38
Current thread: JavaThread "ForkJoinPool-1-worker-18" daemon [_thread_in_vm, id=197]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so] vframe::java_sender() const+0x38
V [libjvm.so] JvmtiEnvBase::get_stack_trace(javaVFrame*, int, int, jvmtiFrameInfo*, int*)
V [libjvm.so] GetStackTraceClosure::do_vthread(Handle)
V [libjvm.so] JvmtiHandshake::execute(...)
V [libjvm.so] JvmtiEnv::GetStackTrace(...)
V [libjvm.so] jvmti_GetStackTrace
C [libjavaProfiler-dd-*.so] Profiler::recordJVMTISample(...)
C [libjavaProfiler-dd-*.so] ObjectSampler::recordAllocation(...)
C [libjavaProfiler-dd-*.so] ObjectSampler::SampledObjectAlloc(...)
V [libjvm.so] JvmtiExport::post_sampled_object_alloc(JavaThread*, oopDesc*)
V [libjvm.so] MemAllocator::allocate() / InstanceKlass::allocate_objArray(...)
V [libjvm.so] OptoRuntime::new_array_C(...)
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J java.util.ArrayList.grow() java.base@25.0.1
J org.postgresql.core.v3.QueryExecutorImpl.processResults(...)
J jdk.internal.vm.Continuation.run() java.base@25.0.1
J java.lang.VirtualThread.runContinuation() java.base@25.0.1
J java.util.concurrent.ForkJoinPool.runWorker(...) java.base@25.0.1
j java.util.concurrent.ForkJoinWorkerThread.run() java.base@25.0.1
siginfo: si_signo: 11 (SIGSEGV), si_code: 128 (SI_KERNEL), si_addr: 0x0
I can attach the full hs_err_pid1.log (and the .jfr) if useful.
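In case it helps with reproduction, the workload shape in the crash stack (many virtual threads growing small ArrayLists while the allocation sampler is active) could be approximated by a sketch like the one below. It is not taken from the application and is not confirmed to trigger the crash. The class name VirtualThreadAllocRepro, the method runOnce, and the task and element counts are hypothetical, and plain ArrayList growth stands in for the PostgreSQL driver's result processing. It assumes JDK 21 or later and would need to run with the 1.62.0 agent attached and allocation profiling enabled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stress sketch: small-list growth on virtual threads.
public class VirtualThreadAllocRepro {

    // Runs `tasks` virtual threads; each appends `itemsPerTask` boxed ints to
    // its own ArrayList (forcing ArrayList.grow) and yields periodically so the
    // virtual thread can be unmounted and remounted on carrier threads.
    // Returns the total number of elements appended across all tasks.
    static long runOnce(int tasks, int itemsPerTask) {
        AtomicLong total = new AtomicLong();
        // close() at the end of try-with-resources waits for submitted tasks.
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int t = 0; t < tasks; t++) {
                pool.submit(() -> {
                    List<Integer> rows = new ArrayList<>();
                    for (int i = 0; i < itemsPerTask; i++) {
                        rows.add(i);
                        if ((i & 63) == 0) {
                            Thread.yield();
                        }
                    }
                    total.addAndGet(rows.size());
                });
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        // Number of rounds is an arbitrary default; pass a larger value to run longer.
        int rounds = args.length > 0 ? Integer.parseInt(args[0]) : 1_000;
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            sum += runOnce(200, 2_000);
        }
        System.out.println("elements appended: " + sum);
    }
}
```

Since the crash is intermittent and showed up only after about 10 minutes of real traffic, a short run that completes cleanly would not rule the bug out.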