Segmentation fault with ConcurrentImmix and h2o #334

@wks

On the current master branch (e350a01), running the h2o benchmark from DaCapo Chopin repeatedly with ConcurrentImmix is very likely to crash with a segmentation fault.

The command line I am using is:

while MMTK_PLAN=ConcurrentImmix /home/wks/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -XX:MetaspaceSize=500M \
  -XX:+DisableExplicitGC \
  -server \
  -XX:+CrashOnOutOfMemoryError \
  -XX:+UseThirdPartyHeap \
  -Xms340M -Xmx340M \
  -XX:+UnlockDiagnosticVMOptions -XX:CompilerDirectivesFile=compiler-directives/noc2.json \
  -jar dacapo-23.11-MR2-chopin.jar \
  -n 1 h2o; do true; done

noc2.json is a compiler directives file that excludes all methods from C2 compilation:

[
    {
        "match": ["*.*"],
        "c2": {
            "Exclude": true
        }
    }
]
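
For anyone reproducing this, the directives file can be created in place before starting the loop. This is a minimal sketch, assuming the command is run from a working directory where `compiler-directives/` may not exist yet; the relative path is the one passed to `-XX:CompilerDirectivesFile` above.

```shell
# Create the directory referenced by -XX:CompilerDirectivesFile.
mkdir -p compiler-directives

# Write the directives file; the quoted 'EOF' keeps the JSON verbatim.
cat > compiler-directives/noc2.json <<'EOF'
[
    {
        "match": ["*.*"],
        "c2": {
            "Exclude": true
        }
    }
]
EOF
```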

The command runs the h2o benchmark repeatedly on the fastdebug build with the C2 JIT compiler disabled. It usually crashes within one or two minutes with the following error message:

[2025-10-20T08:02:55Z INFO  mmtk::plan::concurrent::immix::global] FinalMark start
[2025-10-20T08:02:55Z INFO  mmtk::plan::concurrent::immix::global] FinalMark end
[2025-10-20T08:02:55Z INFO  mmtk::scheduler::scheduler] End of GC (37342/87040 pages, took 49 ms)
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fac23771365, pid=159679, tid=159756
#
# JRE version: OpenJDK Runtime Environment (11.0.19) (fastdebug build 11.0.19-internal+0-adhoc.wks.openjdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 11.0.19-internal+0-adhoc.wks.openjdk, mixed mode, tiered, compressed oops, third-party gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0x1371365]  Klass::method_at_vtable(int)+0x25
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %d %F" (or dumping to /home/wks/opt/dacapo/core.159679)
#
# An error report file with more information is saved as:
# /home/wks/opt/dacapo/hs_err_pid159679.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 159756
Dumping core ...
Aborted                    (core dumped) MMTK_PLAN=ConcurrentImmix /home/wks/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-fastdebug/jdk/bin/java -XX:MetaspaceSize=500M -XX:+DisableExplicitGC -server -XX:+CrashOnOutOfMemoryError -XX:+UseThirdPartyHeap -Xms340M -Xmx340M -XX:+UnlockDiagnosticVMOptions -XX:CompilerDirectivesFile=compiler-directives/noc2.json -jar dacapo-23.11-MR2-chopin.jar -n 1 h2o

And here is hs_err_pid159679.log just in case it is useful.
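
To record how many iterations pass before the crash, the `while ...; do true; done` loop above could be replaced with a small wrapper. This is only a sketch: `run_until_fail` is a hypothetical helper name, not something from mmtk-openjdk or DaCapo.

```shell
# Hypothetical helper: rerun the given command until it exits with a
# non-zero status, then print the run number and the exit status.
run_until_fail() {
    runs=0
    while :; do
        runs=$((runs + 1))
        status=0
        # "|| status=$?" captures the failure without tripping "set -e".
        "$@" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "run $runs failed with exit status $status"
            return 0
        fi
    done
}
```

It would be invoked as `run_until_fail env MMTK_PLAN=ConcurrentImmix` followed by the java command line above. Since the JVM aborts after writing the error report, the shell will typically report status 134 (128 + SIGABRT).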

Sometimes it instead crashes with a dump of the compiled method and its bytecode:

[2025-10-20T08:10:09Z INFO  mmtk::plan::concurrent::immix::global] FinalMark start
[2025-10-20T08:10:09Z INFO  mmtk::plan::concurrent::immix::global] FinalMark end
[2025-10-20T08:10:09Z INFO  mmtk::scheduler::scheduler] End of GC (43722/87040 pages, took 117 ms)
implicit exception happened at 0x00007f4b24eb8ac2
Compiled method (c1)   47092 5036       1       water.Value::get (46 bytes)
 total in heap  [0x00007f4b24eb8810,0x00007f4b24eb8ee8] = 1752
 relocation     [0x00007f4b24eb8998,0x00007f4b24eb8a30] = 152
 main code      [0x00007f4b24eb8a40,0x00007f4b24eb8c60] = 544
 stub code      [0x00007f4b24eb8c60,0x00007f4b24eb8ce8] = 136
 oops           [0x00007f4b24eb8ce8,0x00007f4b24eb8cf0] = 8
 metadata       [0x00007f4b24eb8cf0,0x00007f4b24eb8d08] = 24
 scopes data    [0x00007f4b24eb8d08,0x00007f4b24eb8da0] = 152
 scopes pcs     [0x00007f4b24eb8da0,0x00007f4b24eb8ee0] = 320
 dependencies   [0x00007f4b24eb8ee0,0x00007f4b24eb8ee8] = 8
0 fast_aload_0
1 invokespecial 141 <water/Value.touch()V> 
  0   bci: 1    CounterData         count(8095)
4 fast_aaccess_0
5 fast_agetfield 75 <water/Value._pojo/Lwater/Freezable;> 
8 checkcast 4 <water/Iced>
  16  bci: 8    ReceiverTypeData    flags(1) count(0) nonprofiled_count(0) entries(2)
                                    'water/Job'(344 0.90)
                                    'water/fvec/NFSFileVec'(38 0.10)
11 astore_1
12 aload_1
13 ifnull 18
  72  bci: 13   BranchData          taken(38) displacement(32)
                                    not taken(8122)
...

I am still not sure whether the bug is in mmtk-core or the binding, because we currently don't have another VM binding that supports ConcurrentImmix.

We can also observe this crash after refactoring the barrier implementations in mmtk-openjdk (#332), so the cause should lie in something that PR didn't change.
