Description
In the current master branch (e350a01), if we run the h2o benchmark from DaCapo Chopin repeatedly using ConcurrentImmix, it is very likely to crash with a segmentation fault.
The command line I am using is:
while MMTK_PLAN=ConcurrentImmix /home/wks/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -XX:MetaspaceSize=500M \
-XX:+DisableExplicitGC \
-server \
-XX:+CrashOnOutOfMemoryError \
-XX:+UseThirdPartyHeap \
-Xms340M -Xmx340M \
-XX:+UnlockDiagnosticVMOptions -XX:CompilerDirectivesFile=compiler-directives/noc2.json \
-jar dacapo-23.11-MR2-chopin.jar \
-n 1 h2o; do true; done
Here noc2.json is a compiler directives file that excludes every method from C2 compilation:
[
  {
    "match": ["*.*"],
    "c2": {
      "Exclude": true
    }
  }
]
The command runs the h2o benchmark repeatedly using the fastdebug build, with the C2 JIT compiler disabled. It usually crashes within one or two minutes with the following error message:
[2025-10-20T08:02:55Z INFO mmtk::plan::concurrent::immix::global] FinalMark start
[2025-10-20T08:02:55Z INFO mmtk::plan::concurrent::immix::global] FinalMark end
[2025-10-20T08:02:55Z INFO mmtk::scheduler::scheduler] End of GC (37342/87040 pages, took 49 ms)
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007fac23771365, pid=159679, tid=159756
#
# JRE version: OpenJDK Runtime Environment (11.0.19) (fastdebug build 11.0.19-internal+0-adhoc.wks.openjdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 11.0.19-internal+0-adhoc.wks.openjdk, mixed mode, tiered, compressed oops, third-party gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x1371365] Klass::method_at_vtable(int)+0x25
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %d %F" (or dumping to /home/wks/opt/dacapo/core.159679)
#
# An error report file with more information is saved as:
# /home/wks/opt/dacapo/hs_err_pid159679.log
#
# If you would like to submit a bug report, please visit:
# https://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 159756
Dumping core ...
Aborted (core dumped) MMTK_PLAN=ConcurrentImmix /home/wks/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -XX:MetaspaceSize=500M -XX:+DisableExplicitGC -server -XX:+CrashOnOutOfMemoryError -XX:+UseThirdPartyHeap -Xms340M -Xmx340M -XX:+UnlockDiagnosticVMOptions -XX:CompilerDirectivesFile=compiler-directives/noc2.json -jar dacapo-23.11-MR2-chopin.jar -n 1 h2o
And here is hs_err_pid159679.log just in case it is useful.
Sometimes it instead crashes with a dump of a C1-compiled method and its bytecode:
[2025-10-20T08:10:09Z INFO mmtk::plan::concurrent::immix::global] FinalMark start
[2025-10-20T08:10:09Z INFO mmtk::plan::concurrent::immix::global] FinalMark end
[2025-10-20T08:10:09Z INFO mmtk::scheduler::scheduler] End of GC (43722/87040 pages, took 117 ms)
implicit exception happened at 0x00007f4b24eb8ac2
Compiled method (c1) 47092 5036 1 water.Value::get (46 bytes)
total in heap [0x00007f4b24eb8810,0x00007f4b24eb8ee8] = 1752
relocation [0x00007f4b24eb8998,0x00007f4b24eb8a30] = 152
main code [0x00007f4b24eb8a40,0x00007f4b24eb8c60] = 544
stub code [0x00007f4b24eb8c60,0x00007f4b24eb8ce8] = 136
oops [0x00007f4b24eb8ce8,0x00007f4b24eb8cf0] = 8
metadata [0x00007f4b24eb8cf0,0x00007f4b24eb8d08] = 24
scopes data [0x00007f4b24eb8d08,0x00007f4b24eb8da0] = 152
scopes pcs [0x00007f4b24eb8da0,0x00007f4b24eb8ee0] = 320
dependencies [0x00007f4b24eb8ee0,0x00007f4b24eb8ee8] = 8
0 fast_aload_0
1 invokespecial 141 <water/Value.touch()V>
0 bci: 1 CounterData count(8095)
4 fast_aaccess_0
5 fast_agetfield 75 <water/Value._pojo/Lwater/Freezable;>
8 checkcast 4 <water/Iced>
16 bci: 8 ReceiverTypeData flags(1) count(0) nonprofiled_count(0) entries(2)
'water/Job'(344 0.90)
'water/fvec/NFSFileVec'(38 0.10)
11 astore_1
12 aload_1
13 ifnull 18
72 bci: 13 BranchData taken(38) displacement(32)
not taken(8122)
...
I am still not sure whether the bug is in mmtk-core or in the binding, because we currently don't have another VM binding that supports ConcurrentImmix.
We can still observe this crash after refactoring the barrier implementations in mmtk-openjdk (#332), so the cause is probably in something that PR didn't change.
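For anyone trying to reproduce this, here is a minimal sketch of a helper that reruns a command until it fails and reports how many runs that took. It may help when comparing how quickly different commits hit the crash. The function name `run_until_failure` and the run limit are made up for illustration and are not part of any MMTk or DaCapo tooling.

```shell
# Hypothetical helper (not part of any existing tool):
# rerun a command until it fails or a run limit is reached.
# Usage: run_until_failure MAX_RUNS COMMAND [ARGS...]
run_until_failure() {
    ruf_max=$1
    shift
    ruf_i=1
    while [ "$ruf_i" -le "$ruf_max" ]; do
        # Run the command exactly as given, preserving argument quoting.
        "$@"
        ruf_status=$?
        if [ "$ruf_status" -ne 0 ]; then
            echo "run $ruf_i failed with exit status $ruf_status"
            return "$ruf_status"
        fi
        ruf_i=$((ruf_i + 1))
    done
    echo "no failure in $ruf_max runs"
    return 0
}

# Example (assumed invocation; use the same java path and options as in
# the command line quoted in this issue):
#   run_until_failure 100 env MMTK_PLAN=ConcurrentImmix java ... -n 1 h2o
```

Passing the environment variable through `env` keeps it attached to the benchmark command instead of relying on how the shell treats assignments placed in front of a function call.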