Crash with value type array element flattening enabled with -Xint #13848

Closed
a7ehuo opened this issue Nov 1, 2021 · 17 comments · Fixed by #13935
Labels
bug comp:vm project:valhalla Used to track Project Valhalla related work segfault Issues that describe segfaults / JVM crashes

Comments

@a7ehuo
Contributor

a7ehuo commented Nov 1, 2021

I ran into a crash in GC_ArrayletObjectModel::getDataSizeInBytes [1] with the following code [2] when value type array element flattening is enabled (JDK18). It looks like clazzPtr is not a valid J9Class. The test code is attached [3].

  • Using new SingleFieldPrimitive(j) to create value type arrays is fine but using new SingleFieldPrimitive() causes the crash
  • Only happens when array element flattening is enabled

Options used

-Xshareclasses:none -Xverify:none -XX:+EnableValhalla -XX:+EnableArrayFlattening -XX:ValueTypeFlatteningThreshold=999999 -Xint TestSingleFieldPrimitive

Running with the following options, without array element flattening, works fine

-Xshareclasses:none -Xverify:none -XX:+EnableValhalla -Xint TestSingleFieldPrimitive

[1]

#12 <signal handler called>
#13 0x00007fc52a81edce in GC_ArrayletObjectModel::getDataSizeInBytes (numberOfElements=4294615088, clazzPtr=0xffe17400, 
    this=0x7fc52c0297b8) at /root/hostdir/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:369
#14 GC_ArrayletObjectModel::getDataSizeInBytes (arrayPtr=0xfffaa020, this=0x7fc52c0297b8)
    at /root/hostdir/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:353
#15 GC_ArrayletObjectModel::getArrayLayout (objPtr=0xfffaa020, this=0x7fc52c0297b8)
    at /root/hostdir/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:407
#16 MM_ObjectAccessBarrier::indexableEffectiveAddress (elementSize=4, index=4105, array=0xfffaa020, vmThread=0x1aa00, 
    this=0x7fc52c036270) at /root/hostdir/openj9/runtime/gc_base/ObjectAccessBarrier.hpp:110
#17 MM_ObjectAccessBarrier::copyObjectFieldsFromFlattenedArrayElement (this=0x7fc52c036270, vmThread=0x1aa00, 
    arrayClazz=0x159b00, destObject=0xffe7f270, arrayRef=0xfffaa020, index=4105)
    at /root/hostdir/openj9/runtime/gc_base/ObjectAccessBarrier.cpp:1159
#18 0x00007fc530a5211c in MM_ObjectAccessBarrierAPICompressed::copyObjectFieldsFromFlattenedArrayElement (
    index=<optimized out>, arrayRef=<optimized out>, destObject=<optimized out>, arrayClazz=<optimized out>, 
    vmThread=<optimized out>, this=<optimized out>)
    at /root/hostdir/openj9/runtime/gc_include/ObjectAccessBarrierAPI.hpp:397
#19 VM_ValueTypeHelpersCompressed::loadFlattenableArrayElement (fast=<optimized out>, index=<optimized out>, 
    receiverObject=<optimized out>, _objectAllocate=..., _objectAccessBarrier=..., currentThread=<optimized out>)
    at /root/hostdir/openj9/runtime/vm/ValueTypeHelpers.hpp:495
#20 VM_BytecodeInterpreterCompressed::aaload (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /root/hostdir/openj9/runtime/vm/BytecodeInterpreter.hpp:6175
#21 VM_BytecodeInterpreterCompressed::run (this=0x7fc531a0a800, vmThread=0x0)
    at /root/hostdir/openj9/runtime/vm/BytecodeInterpreter.hpp:10667
#22 0x00007fc530a3cbe5 in bytecodeLoopCompressed (currentThread=<optimized out>)
    at /root/hostdir/openj9/runtime/vm/BytecodeInterpreter.inc:112
#23 0x00007fc530af1e92 in c_cInterpreter () at /root/hostdir/build/linux-x86_64-server-release/vm/runtime/vm/xcinterp.s:158
#24 0x00007fc5309c6c8f in runCallInMethod (env=0x7fc531a0a910, receiver=0x0, clazz=0xe33e0, methodID=0x7fc52c456608, 
    args=0x7fc531a0ad78) at /root/hostdir/openj9/runtime/vm/callin.cpp:1123
#25 0x00007fc5309eaca9 in gpProtectedRunCallInMethod (entryArg=0x7fc531a0ad30)
    at /root/hostdir/openj9/runtime/vm/jnicsup.cpp:301
#26 0x00007fc52bdb4f03 in omrsig_protect (portLibrary=0x7fc531207340 <j9portLibrary>, 
    fn=0x7fc530afd290 <signalProtectAndRunGlue>, fn_arg=0x7fc531a0acd0, handler=0x7fc5309e7ce0 <structuredSignalHandler>, 
    handler_arg=0x1aa00, flags=506, result=0x7fc531a0acc8) at /root/hostdir/omr/port/unix/omrsignal.c:425
#27 0x00007fc530afd32c in gpProtectAndRun (function=0x7fc5309eac70 <gpProtectedRunCallInMethod(void*)>, env=0x1aa00, 
    args=0x7fc531a0ad30) at /root/hostdir/openj9/runtime/util/jniprotect.c:78
#28 0x00007fc5309ec644 in gpCheckCallin (env=0x1aa00, receiver=receiver@entry=0x0, cls=0xe33e0, methodID=0x7fc52c456608, 
    args=args@entry=0x7fc531a0ad78) at /root/hostdir/openj9/runtime/vm/jnicsup.cpp:489
#29 0x00007fc5309ea67a in callStaticVoidMethod (env=<optimized out>, cls=<optimized out>, methodID=<optimized out>)
    at /root/hostdir/openj9/runtime/vm/jnicgen.c:384
#30 0x00007fc532658396 in JavaMain (_args=<optimized out>) at src/java.base/share/native/libjli/java.c:545
#31 0x00007fc53265b339 in ThreadJavaMain (args=<optimized out>) at src/java.base/unix/native/libjli/java_md.c:677
#32 0x00007fc531c296db in start_thread (arg=0x7fc531a0b700) at pthread_create.c:463
#33 0x00007fc532383a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:9

(gdb) fr 13
#13 0x00007fc52a81edce in GC_ArrayletObjectModel::getDataSizeInBytes (numberOfElements=4294615088, clazzPtr=0xffe17400, 
    this=0x7fc52c0297b8) at /root/hostdir/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:369
369			if ((size / stride) == numberOfElements) {
(gdb) print size
$1 = 0
(gdb) print stride
$2 = 0
(gdb) print numberOfElements
$3 = 4294615088
(gdb) print clazzPtr
$16 = (J9Class *) 0xffe17400
(gdb) print *clazzPtr
$5 = {eyecatcher = 35184373504784, romClass = 0xffe17410, superclasses = 0x0, classDepthAndFlags = 0, 
  classDepthWithFlags = 0, classFlags = 0, classLoader = 0x0, classObject = 0x0, initializeStatus = 0, ramMethods = 0x0, 
  ramStatics = 0x0, arrayClass = 0x0, totalInstanceSize = 0, lastITable = 0x0, instanceDescription = 0x0, 
  instanceLeafDescription = 0x0, instanceHotFieldDescription = 0, selfReferencingField1 = 0, selfReferencingField2 = 0, 
  initializerCache = 0x0, romableAotITable = 0, packageID = 0, module = 0x0, subclassTraversalLink = 0x0, 
  subclassTraversalReverseLink = 0x0, iTable = 0x0, castClassCache = 0, jniIDs = 0x0, lockOffset = 0, 
  paddingForGLRCounters = 0, reservedCounter = 0, cancelCounter = 0, newInstanceCount = 0, backfillOffset = 0, 
  replacedClass = 0x0, finalizeLinkOffset = 0, nextClassInSegment = 0x0, ramConstantPool = 0x0, callSites = 0x0, 
  invokeCache = 0x0, varHandleMethodTypes = 0x0, customSpinOption = 0x0, staticSplitMethodTable = 0x0, 
  specialSplitMethodTable = 0x0, jitMetaDataList = 0x0, gcLink = 0x0, hostClass = 0x0, nestHost = 0x0, 
  flattenedClassCache = 0x0, hotFieldsInfo = 0x0}

(gdb) fr 20
#20 VM_BytecodeInterpreterCompressed::aaload (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /root/hostdir/openj9/runtime/vm/BytecodeInterpreter.hpp:6175
6175						value = VM_ValueTypeHelpers::loadFlattenableArrayElement(_currentThread, _objectAccessBarrier, _objectAllocate, arrayref, index, false);
(gdb) print arrayref
$17 = (j9object_t) 0xfffaa020
(gdb) print *arrayref
$18 = {clazz = 4292965380}
(gdb) print /x *arrayref
$20 = {clazz = 0xffe17404}
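
A note on the gdb session above: frame 13 stops on (size / stride) == numberOfElements with stride equal to 0, and on x86-64 Linux an integer division by zero is normally delivered as SIGFPE. The sketch below is not the OpenJ9 implementation. checkedDataSize is a hypothetical helper that shows the same divide-back overflow check with an explicit zero-stride guard added.

```cpp
#include <cstdint>

// Hypothetical helper, not OpenJ9 code: multiply the element count by the
// stride, then verify the product by dividing back. A zero stride is rejected
// before the division so the check itself cannot trap.
bool checkedDataSize(std::uint64_t numberOfElements, std::uint64_t stride, std::uint64_t *sizeOut) {
    if (stride == 0) {
        return false; // dividing by this stride would raise a hardware trap
    }
    std::uint64_t size = numberOfElements * stride; // wraps modulo 2^64 on overflow
    if ((size / stride) != numberOfElements) {
        return false; // the multiplication overflowed
    }
    *sizeOut = size;
    return true;
}
```

With the values from the core (numberOfElements=4294615088, stride=0) the helper returns false instead of dividing. This only illustrates the arithmetic; the zero stride here is a symptom of reading a bogus class pointer, not the root cause.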

[2]

public class TestSingleFieldPrimitive {
    public static int _total;
 
    public static final int ARRAY_LENGTH = 8192;
 
    public static SingleFieldPrimitive[][] arrays = new SingleFieldPrimitive[2][];
 
    public static void main(String[] args) {
        long numIterations = 40;
 
        for (int i = 0; i < arrays.length; i++) {
            SingleFieldPrimitive[] arr = new SingleFieldPrimitive[ARRAY_LENGTH];
            arrays[i] = arr;
 
            for (int j = 0; j < arr.length; j++) {
                arr[j] = new SingleFieldPrimitive();   //<==== crash
                //arr[j] = new SingleFieldPrimitive(j); //<=== good 
            }
        }
 
        for (long loop = 0; loop < numIterations; loop++) {
            SingleFieldPrimitive[] arr = arrays[(int) (loop % 2)];
 
            for (int i = 0; i < arr.length; i++) {
                _total = _total + arr[i].i;
            }
        }
    }
}

[3]
Archive.zip

@a7ehuo a7ehuo added the project:valhalla Used to track Project Valhalla related work label Nov 1, 2021
@a7ehuo
Contributor Author

a7ehuo commented Nov 1, 2021

@tajila @hangshao0 fyi

@hzongaro hzongaro added the bug label Nov 2, 2021
@hangshao0 hangshao0 added comp:vm segfault Issues that describe segfaults / JVM crashes labels Nov 2, 2021
@hangshao0
Contributor

We may want to add the test here into the build once the issue is fixed.

@hangshao0
Contributor

@a7ehuo
It seems that I cannot reproduce the crash using the latest Valhalla build here: https://openj9-jenkins.osuosl.org/job/Pipeline_Build_Test_JDKnext_x86-64_linux_valhalla/21/

java -Xshareclasses:none -Xverify:none -XX:+EnableValhalla -XX:+EnableArrayFlattening -XX:ValueTypeFlatteningThreshold=999999 -Xint TestSingleFieldPrimitive
JVMJ9VM193W Since Java 13 -Xverify:none and -noverify were deprecated for removal and may not be accepted options in the future.

java -version
openjdk version "18-internal" 2022-03-15
OpenJDK Runtime Environment (build 18-internal+0-adhoc.jenkins.BuildJDKnextx86-64linuxvalhallaPersonal)
Eclipse OpenJ9 VM (build HEAD-7e067331a33, JRE 18 Linux amd64-64-Bit Compressed References 20211108_21 (JIT enabled, AOT enabled)
OpenJ9   - 7e067331a33
OMR      - 2a36842992a
JCL      - a3abf3d6e8a based on jdk-18+20)

@a7ehuo
Contributor Author

a7ehuo commented Nov 11, 2021

I could still reproduce the crash with the build mentioned above and also the latest JDKNext build. Attached the class file Archive2.tar.gz.

~tmp/Archive2# ../jdk/bin/java  -version
openjdk version "18-internal" 2022-03-15
OpenJDK Runtime Environment (build 18-internal+0-adhoc.jenkins.BuildJDKnextx86-64linuxvalhallaPersonal)
Eclipse OpenJ9 VM (build HEAD-7e067331a33, JRE 18 Linux amd64-64-Bit Compressed References 20211108_21 (JIT enabled, AOT enabled)
OpenJ9   - 7e067331a33
OMR      - 2a36842992a
JCL      - a3abf3d6e8a based on jdk-18+20)

~/tmp/Archive2# ../jdk/bin/java -Xint -Xverify:none  -XX:+EnableValhalla -XX:+EnableArrayFlattening -XX:ValueTypeFlatteningThreshold=999999  TestSingleFieldPrimitive
JVMJ9VM193W Since Java 13 -Xverify:none and -noverify were deprecated for removal and may not be accepted options in the future.
Unhandled exception
Type=Floating point error vmState=0x00000000
J9Generic_Signal_Number=00000888 Signal_Number=00000008 Error_Value=00000000 Signal_Code=00000001
Handler1=00007FB9832F7100 Handler2=00007FB983051BA0
RDI=00007FB98403E520 RSI=00000000FFE11700 RAX=0000000000000000 RBX=00000000FFFA6DA0
RCX=0000000000000000 RDX=0000000000000000 R8=0000000000000000 R9=0000000000001125
R10=00000000FFFA6DB0 R11=0000000000000008 R12=00000000FFE6E750 R13=000000000001AA00
R14=0000000000159B00 R15=0000000000000004
RIP=00007FB9813B00DE GS=0000 FS=0000 RSP=00007FB988DB33F0
EFlags=0000000000010206 CS=0033 RBP=00007FB98404B110 ERR=0000000000000000
TRAPNO=0000000000000000 OLDMASK=0000000000000000 CR2=0000000000000000
xmm0 0000003000000020 (f: 32.000000, d: 1.018558e-312)
xmm1 0000000048447000 (f: 1212444672.000000, d: 5.990273e-315)
xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm3 bfdfa04932597e07 (f: 844725760.000000, d: -4.941581e-01)
xmm4 3fbc5e53aa362eb4 (f: 2855677696.000000, d: 1.108143e-01)
xmm5 00000000403ea660 (f: 1077847680.000000, d: 5.325275e-315)
xmm6 bff0000000000000 (f: 0.000000, d: -1.000000e+00)
xmm7 4140000000000000 (f: 0.000000, d: 2.097152e+06)
xmm8 00000000fff4a098 (f: 4294222080.000000, d: 2.121628e-314)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/root/home/ahuo/src/valhallabench-2/tmp/jdk/lib/default/libj9gc29.so
Module_base_address=00007FB98137F000
Target=2_90_20211108_21 (Linux 4.15.0-159-generic)
CPU=amd64 (8 logical CPUs) (0x17477f000 RAM)
----------- Stack Backtrace -----------
(0x00007FB9813B00DE [libj9gc29.so+0x310de])
(0x00007FB9833614EC [libj9vm29.so+0xac4ec])
(0x00007FB98334BFB5 [libj9vm29.so+0x96fb5])
(0x00007FB983401262 [libj9vm29.so+0x14c262])
---------------------------------------
...


~/tmp# /root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -version
openjdk version "18-internal" 2022-03-15
OpenJDK Runtime Environment (build 18-internal+0-adhoc.root.openj9-openjdk-jdk)
Eclipse OpenJ9 VM (build master-c4bbd378d, JRE 18 Linux amd64-64-Bit Compressed References 20211111_000000 (JIT enabled, AOT enabled)
OpenJ9   - c4bbd378d
OMR      - dd087489c
JCL      - 6b0732d47e2 based on jdk-18+22)

~tmp# /root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -Xint -Xverify:none  -XX:+EnableValhalla -XX:+EnableArrayFlattening -XX:ValueTypeFlatteningThreshold=999999  TestSingleFieldPrimitive
JVMJ9VM193W Since Java 13 -Xverify:none and -noverify were deprecated for removal and may not be accepted options in the future.
Unhandled exception
Type=Floating point error vmState=0x00000000
J9Generic_Signal_Number=00000888 Signal_Number=00000008 Error_Value=00000000 Signal_Code=00000001
Handler1=00007FC1B004DD00 Handler2=00007FC1AB5F51D0
RDI=00007FC1AC02A160 RSI=00000000FFE17400 RAX=0000000000000000 RBX=00000000FFFA6030
RCX=0000000000000000 RDX=0000000000000000 R8=0000000000000000 R9=0000000000001220
R10=00000000FFFA6040 R11=0000000000000008 R12=00000000FFE78460 R13=000000000001AA00
R14=0000000000151B00 R15=0000000000000004
RIP=00007FC1A9D7E5DE GS=0000 FS=0000 RSP=00007FC1B10723F0
EFlags=0000000000010206 CS=0033 RBP=00007FC1AC036D10 ERR=0000000000000000
TRAPNO=0000000000000000 OLDMASK=0000000000000000 CR2=0000000000000000
xmm0 0000003000000020 (f: 32.000000, d: 1.018558e-312)
xmm1 00000000483c5700 (f: 1211913984.000000, d: 5.987651e-315)
xmm2 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm3 bfdfdc55ff129ce0 (f: 4279409920.000000, d: -4.978232e-01)
xmm4 3faaa5aa5df25984 (f: 1576163712.000000, d: 5.204518e-02)
xmm5 0000000040a4852c (f: 1084523776.000000, d: 5.358260e-315)
xmm6 bff0000000000000 (f: 0.000000, d: -1.000000e+00)
xmm7 4140000000000000 (f: 0.000000, d: 2.097152e+06)
xmm8 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/lib/default/libj9gc29.so
Module_base_address=00007FC1A9D4D000
Target=2_90_20211111_000000 (Linux 4.15.0-159-generic)
CPU=amd64 (8 logical CPUs) (0x17477f000 RAM)
----------- Stack Backtrace -----------
(0x00007FC1A9D7E5DE [libj9gc29.so+0x315de])
(0x00007FC1B00B891E [libj9vm29.so+0xac91e])
(0x00007FC1B00A2C55 [libj9vm29.so+0x96c55])
(0x00007FC1B0159492 [libj9vm29.so+0x14d492])
---------------------------------------
...

@hangshao0
Contributor

Looking at the info from the description:

#12 <signal handler called>
#13 0x00007fc52a81edce in GC_ArrayletObjectModel::getDataSizeInBytes (numberOfElements=4294615088, clazzPtr=0xffe17400, 
    this=0x7fc52c0297b8) at /root/hostdir/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:369
#14 GC_ArrayletObjectModel::getDataSizeInBytes (arrayPtr=0xfffaa020, this=0x7fc52c0297b8)
    at /root/hostdir/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:353
(gdb) print arrayref
$17 = (j9object_t) 0xfffaa020
(gdb) print /x *arrayref
$20 = {clazz = 0xffe17404}

The arrayref is 0xfffaa020, and it seems the object has been forwarded. I guess GC_ArrayletObjectModel should not treat 0xffe17400 as a pointer to a J9Class; it is the object pointer. @dmitripivkine
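
To make the forwarding observation concrete, here is a minimal sketch of how the header slot from the core could be decoded. It assumes, based only on the 0xffe17404 / 0xffe17400 pair seen above, that the 0x4 bit marks a forwarded header and that clearing it yields the new object address. The constant and function names are hypothetical, not the OMR/OpenJ9 API.

```cpp
#include <cstdint>

// Assumption for illustration: a set 0x4 bit in the first header slot means
// "this object has been forwarded" and the remaining bits hold the new address.
constexpr std::uint32_t kForwardedTag = 0x4;

struct HeaderInfo {
    bool forwarded;        // true if the slot holds a forwarding pointer
    std::uint32_t pointer; // new object address if forwarded, otherwise the raw slot value
};

HeaderInfo decodeHeader(std::uint32_t slot) {
    HeaderInfo info;
    info.forwarded = (slot & kForwardedTag) != 0;
    info.pointer = info.forwarded ? (slot & ~kForwardedTag) : slot;
    return info;
}
```

Under that assumption, decodeHeader(0xffe17404) reports a forwarded object at 0xffe17400, which matches the clazzPtr value that getDataSizeInBytes tried to use as a J9Class.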

@dmitripivkine
Contributor

Most likely there is a stale pointer to 0xfffaa020 that has not been fixed up to 0xffe17400. And (I assume this is not the Concurrent Scavenger case) this is a pointer to the wrong (reserved) side of the Nursery.
GC fixes up all known roots, so we need to investigate where this stale pointer comes from. One example scenario: an unexpected GC occurs due to missed (or dropped and re-acquired) VM Access.

@hangshao0
Copy link
Contributor

hangshao0 commented Nov 12, 2021

Annabelle didn't save the core for the crash in #13848 (comment), so she reproduced the crash on her machine and gave me a new core. It is still crashing at the same place, just with a different stale object pointer value, 0xfffb5030.


#12 <signal handler called>
#13 0x00007f99cb1115de in GC_ArrayletObjectModel::getDataSizeInBytes (numberOfElements=4294660160, clazzPtr=0xffe17400, 
    this=0x7f99cc02a258) at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:369
#14 GC_ArrayletObjectModel::getDataSizeInBytes (arrayPtr=0xfffb5030, this=0x7f99cc02a258)
    at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:353
#15 GC_ArrayletObjectModel::getArrayLayout (objPtr=0xfffb5030, this=0x7f99cc02a258)
    at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/gc_glue_java/ArrayletObjectModel.hpp:407
#16 MM_ObjectAccessBarrier::indexableEffectiveAddress (elementSize=4, index=2041, array=0xfffb5030, vmThread=0x1aa00, 
    this=0x7f99cc036d00) at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/gc_base/ObjectAccessBarrier.hpp:110
#17 MM_ObjectAccessBarrier::copyObjectFieldsFromFlattenedArrayElement (this=0x7f99cc036d00, vmThread=0x1aa00, 
    arrayClazz=0x159b00, destObject=0xffe7fa98, arrayRef=0xfffb5030, index=2041)
    at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/gc_base/ObjectAccessBarrier.cpp:1159
#18 0x00007f99d145f91e in MM_ObjectAccessBarrierAPICompressed::copyObjectFieldsFromFlattenedArrayElement (
    index=<optimized out>, arrayRef=<optimized out>, destObject=<optimized out>, arrayClazz=<optimized out>, 
    vmThread=<optimized out>, this=<optimized out>)
    at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/gc_include/ObjectAccessBarrierAPI.hpp:397
#19 VM_ValueTypeHelpersCompressed::loadFlattenableArrayElement (fast=<optimized out>, index=<optimized out>, 
    receiverObject=<optimized out>, _objectAllocate=..., _objectAccessBarrier=..., currentThread=<optimized out>)
    at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/vm/ValueTypeHelpers.hpp:495
#20 VM_BytecodeInterpreterCompressed::aaload (_pc=<optimized out>, _sp=<optimized out>, this=<optimized out>)
    at /root/home/ahuo/src/openj9-openjdk-jdk/openj9/runtime/vm/BytecodeInterpreter.hpp:6175
#21 VM_BytecodeInterpreterCompressed::run (this=0x7f99d2419800, vmThread=0x0)

The stale object pointer (arrayRef) is 0xfffb5030; the forwarding address 0xffe17400 is the valid object pointer.

!j9x 0xfffb5030
0xFFFB5030 :  00000000ffe17404 00000000fffb5040 [ .t......@P...... ]

From the source code

j9object_t arrayref = *(j9object_t*)(_sp + 1);
if (NULL == arrayref) {
	rc = THROW_NPE;
} else {
	U_32 arrayLength = J9INDEXABLEOBJECT_SIZE(_currentThread, arrayref);
	U_32 index = *(U_32*)_sp;
	/* By using U_32 for index, we also catch the negative case, as all negative values are
	 * greater than the maximum array size (31 bits unsigned).
	 */
	if (index >= arrayLength) {
		_currentThread->tempSlot = (UDATA)index;
		rc = THROW_AIOB;
	} else {
		j9object_t value = VM_ValueTypeHelpers::loadFlattenableArrayElement(_currentThread, _objectAccessBarrier, _objectAllocate, arrayref, index, true);
#if defined(J9VM_OPT_VALHALLA_VALUE_TYPES)
		J9Class *arrayrefClass = J9OBJECT_CLAZZ(_currentThread, arrayref);
		if ((NULL == value) && J9_IS_J9CLASS_FLATTENED(arrayrefClass)) {
			/* We only get here due to an allocation failure */
			buildGenericSpecialStackFrame(REGISTER_ARGS, 0);
			pushObjectInSpecialFrame(REGISTER_ARGS, arrayref);
			updateVMStruct(REGISTER_ARGS);
			value = VM_ValueTypeHelpers::loadFlattenableArrayElement(_currentThread, _objectAccessBarrier, _objectAllocate, arrayref, index, false);
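
As a side note on the bounds check in the excerpt above: the source comment says that reading the index as U_32 also catches negative indices. A small standalone sketch of that trick, using standard fixed-width types and a hypothetical function name rather than the interpreter's own macros:

```cpp
#include <cstdint>

// Hypothetical illustration of the single-comparison bounds check: a negative
// Java int reinterpreted as unsigned becomes a value of at least 2^31, which
// is larger than any legal array length (at most 2^31 - 1).
bool indexInBounds(std::int32_t javaIndex, std::uint32_t arrayLength) {
    std::uint32_t index = static_cast<std::uint32_t>(javaIndex);
    return index < arrayLength;
}
```

This only works because array lengths never exceed 31 unsigned bits, as the comment in the excerpt states.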

arrayRef (=0xfffb5030) is read from the stack at line 6154. However, at the time of the crash, the value on the stack has already been updated to 0xffe17400:

<1aa00> Initial values: walkSP = 0x00000000000E3310, PC = 0x0000000000000001, literals = 0x0000000000000008, A0 = 0x00000000000E3330, j2iFrame = 0x0000000000000000, ELS = 0x00007F99D2419A90, decomp = 0x0000000000000000
<1aa00> Generic special frame: bp = 0x00000000000E3330, sp = 0x00000000000E3310, pc = 0x0000000000000001, cp = 0x0000000000000000, arg0EA = 0x00000000000E3330, flags = 0x0000000000000000
<1aa00>         Object pushes starting at 0x00000000000E3310 for 1 slots
<1aa00>                 Push[0x00000000000E3310] = 0x00000000FFE17400
<1aa00> Bytecode frame: bp = 0x00000000000E3360, sp = 0x00000000000E3338, pc = 0x00007F99AB85523E, cp = 0x00000000001596C0, arg0EA = 0x00000000000E3398, flags = 0x0000000000000000
<1aa00>         Method: TestSingleFieldPrimitive.main([Ljava/lang/String;)V !j9method 0x0000000000159798
<1aa00>         Bytecode index = 98
<1aa00>         Using local mapper
<1aa00>         Locals starting at 0x00000000000E3398 for 0x0000000000000007 slots
<1aa00>                 I-Slot: a0[0x00000000000E3398] = 0x00000000FFF6F0C8
<1aa00>                 I-Slot: t1[0x00000000000E3390] = 0x00000000000E33B8
<1aa00>                 I-Slot: t2[0x00000000000E3388] = 0x0000000000000028
<1aa00>                 I-Slot: t3[0x00000000000E3380] = 0x0000000000000002
<1aa00>                 I-Slot: t4[0x00000000000E3378] = 0x0000000000000001
<1aa00>                 O-Slot: t5[0x00000000000E3370] = 0x00000000FFE17400
<1aa00>                 I-Slot: t6[0x00000000000E3368] = 0x00000000000007F9
<1aa00>         Pending stack starting at 0x00000000000E3348 for UDATA(0x0000000000000003) slots
<1aa00>                 I-Slot: p0[0x00000000000E3348] = 0x0000000000000000
<1aa00>                 O-Slot: p1[0x00000000000E3340] = 0x00000000FFE17400         <-------------------   new value
<1aa00>                 I-Slot: p2[0x00000000000E3338] = 0x00000000000007F9

!findstackvalue 0xfffb5030 returns nothing, so 0xfffb5030 is not on the stack at the time of crash. This thread has vm access:

!threads flags | grep 0x1aa00
    !j9vmthread 0x1aa00 publicFlags=20 privateFlags=1008 inNative=0 // main

0xfffb5030 and 0xffe17400 are in different regions.

!dumpallregions
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
|    region      |     start      |      end       |    subspace    | flags  |      size      |      region type
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
 00007f99cc0812f0 00000000a2e30000 00000000a3430000 00007f99cc06e9b0 00000009           600000 ADDRESS_ORDERED
 00007f99cc080d20 00000000ffe00000 00000000fff00000 00007f99cc07a9f0 0000000a           100000 ADDRESS_ORDERED
 00007f99cc080750 00000000fff00000 0000000100000000 00007f99cc074f90 0000000a           100000 ADDRESS_ORDERED
+----------------+----------------+----------------+----------------+--------+----------------+----------------------
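
The region check can be reproduced by hand from the table. The sketch below copies the start/end columns and looks up both addresses. It assumes each region is the half-open range [start, end); the type and function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Region {
    std::uint64_t start; // inclusive
    std::uint64_t end;   // exclusive (assumed)
};

// Start/end values copied from the !dumpallregions output.
std::vector<Region> dumpedRegions() {
    return {
        {0x00000000a2e30000ULL, 0x00000000a3430000ULL},
        {0x00000000ffe00000ULL, 0x00000000fff00000ULL},
        {0x00000000fff00000ULL, 0x0000000100000000ULL},
    };
}

// Returns the index of the region containing addr, or -1 if none does.
int findRegion(const std::vector<Region>& regions, std::uint64_t addr) {
    for (std::size_t i = 0; i < regions.size(); ++i) {
        if (addr >= regions[i].start && addr < regions[i].end) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The stale pointer 0xfffb5030 lands in the third row (0xfff00000 to 0x100000000) and the forwarding address 0xffe17400 in the second row (0xffe00000 to 0xfff00000), so the object was moved between two different regions.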

@hangshao0
Contributor

So at BytecodeInterpreter.hpp:6154, the arrayref read from the stack is 0xfffb5030 (the stale pointer). By the time of the crash inside BytecodeInterpreter.hpp:6175, the value on the stack has become 0xffe17400, which is the forwarding address.

Concurrent Scavenge seems to be off:
0x5c41: bool concurrentScavenger = false

It is worth mentioning that we entered loadFlattenableArrayElement() twice. The first time, it called _objectAllocate.inlineAllocateObject(), which failed. Then we allocated through memoryManagerFunctions->J9AllocateObject(). I guess it won't drop/reacquire VM access inside these 2 calls? @dmitripivkine

@dmitripivkine
Contributor

Allocate APIs should return an object address that is valid at that moment. The problem should occur after the object pointer is received and stored somewhere the next GC cannot fix it up (not a root), so that GC invalidates it.

@dmitripivkine
Copy link
Contributor

So if I understand this code correctly, another GC occurs between the point where arrayref is read and the point where it is used (the second call of loadFlattenableArrayElement()), doesn't it? That means VM Access is dropped or an object allocation is requested between these two points.

				j9object_t value = VM_ValueTypeHelpers::loadFlattenableArrayElement(_currentThread, _objectAccessBarrier, _objectAllocate, arrayref, index, true);  <---- might this trigger GC and invalidate arrayref?
#if defined(J9VM_OPT_VALHALLA_VALUE_TYPES) 
 				J9Class *arrayrefClass = J9OBJECT_CLAZZ(_currentThread, arrayref); 
 				if ((NULL == value) && J9_IS_J9CLASS_FLATTENED(arrayrefClass)) { 
 					/* We only get here due to an allocation failure */ 
 					buildGenericSpecialStackFrame(REGISTER_ARGS, 0); 
 					pushObjectInSpecialFrame(REGISTER_ARGS, arrayref); 
 					updateVMStruct(REGISTER_ARGS); 
 					value = VM_ValueTypeHelpers::loadFlattenableArrayElement(_currentThread, _objectAccessBarrier, _objectAllocate, arrayref, index, false); 

If this is correct, refreshing arrayref before the second call of loadFlattenableArrayElement() should help.

@dmitripivkine
Contributor

And yes, this is the bug: loadFlattenableArrayElement() might call J9AllocateObject(), which might trigger GC. That makes arrayref stale: the object has been relocated and the O-Slot on the stack has been updated, but not the copy of arrayref held in the native code.

@dmitripivkine
Contributor

Please note that the j9object_t receiverObject passed to loadFlattenableArrayElement() might be invalidated after an object allocation call (for example, if GC occurred and the object was moved by Scavenger or Compact). It might be safer to pass the stack slot location (fixed up by GC) directly as a j9object_t *. Otherwise, another way of refreshing this object should be added.
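
The difference between holding the object pointer and holding the slot location can be shown with a toy model. Everything below is hypothetical (ToyHeap, staleRead, freshRead) and only mimics the behaviour discussed here: an allocation may move rooted objects and fix up the registered slots, but it cannot fix a copy held in a C++ local.

```cpp
#include <deque>
#include <vector>

struct Object {
    int payload;
};

// Toy moving heap, not OpenJ9 code. Each allocation first "collects": every
// object referenced from a registered root slot is copied to a new address,
// the slot is rewritten, and the old copy is poisoned with -1.
struct ToyHeap {
    std::deque<Object> storage;  // deque keeps element addresses stable on push_back
    std::vector<Object**> roots; // slots the collector knows about and fixes up

    Object* allocate(int payload) {
        for (Object** slot : roots) {
            if (*slot != nullptr) {
                int saved = (*slot)->payload;
                (*slot)->payload = -1;            // poison the old location
                storage.push_back(Object{saved}); // "move" the object
                *slot = &storage.back();          // fix up the known root
            }
        }
        storage.push_back(Object{payload});
        return &storage.back();
    }
};

// Takes a copy of the reference before allocating: the copy goes stale.
int staleRead(ToyHeap& heap, Object** slot) {
    Object* array = *slot; // local copy the collector cannot see
    heap.allocate(0);      // may move the object behind *slot
    return array->payload; // reads the poisoned old copy
}

// Re-reads the reference through the slot after allocating.
int freshRead(ToyHeap& heap, Object** slot) {
    heap.allocate(0);
    return (*slot)->payload;
}
```

In this model staleRead returns the poison value while freshRead returns the real payload, which is the same shape as the crash: the stack slot was updated to 0xffe17400 but the local arrayref still held the old address.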

@hangshao0
Contributor

@a7ehuo, as it is only reproducible on your machine, can you try with #13935 to see if the crash goes away?

@a7ehuo
Contributor Author

a7ehuo commented Nov 16, 2021

I ran the change from #13935. I no longer see the crash with TestSingleFieldPrimitive. Thanks for the fix Hang!

==== #13935 ====

# ./run.sh
++ /root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -version
openjdk version "18-internal" 2022-03-15
OpenJDK Runtime Environment (build 18-internal+0-adhoc.root.openj9-openjdk-jdk)
Eclipse OpenJ9 VM (build pr-13935-hang-42a967899, JRE 18 Linux amd64-64-Bit Compressed References 20211116_000000 (JIT enabled, AOT enabled)
OpenJ9   - 71c7f386b
OMR      - b95fd53e2
JCL      - fb264efdb4f based on jdk-18+23)
++ /root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -Xint -Xshareclasses:none -Xverify:none -XX:+EnableValhalla -XX:+EnableArrayFlattening -XX:ValueTypeFlatteningThreshold=999999 -Xint TestSingleFieldPrimitive
JVMJ9VM193W Since Java 13 -Xverify:none and -noverify were deprecated for removal and may not be accepted options in the future.

==== master branch ====

# ./run.sh
++ /root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -version
openjdk version "18-internal" 2022-03-15
OpenJDK Runtime Environment (build 18-internal+0-adhoc.root.openj9-openjdk-jdk)
Eclipse OpenJ9 VM (build master-71c7f386b, JRE 18 Linux amd64-64-Bit Compressed References 20211116_000000 (JIT enabled, AOT enabled)
OpenJ9   - 71c7f386b
OMR      - b95fd53e2
JCL      - fb264efdb4f based on jdk-18+23)
++ /root/home/ahuo/src/openj9-openjdk-jdk/build/linux-x86_64-server-release/images/jdk/bin/java -Xint -Xshareclasses:none -Xverify:none -XX:+EnableValhalla -XX:+EnableArrayFlattening -XX:ValueTypeFlatteningThreshold=999999 -Xint TestSingleFieldPrimitive
JVMJ9VM193W Since Java 13 -Xverify:none and -noverify were deprecated for removal and may not be accepted options in the future.
Unhandled exception
Type=Floating point error vmState=0x00000000
...

@hangshao0
Contributor

I am wondering if the issue here will cause a problem for the JIT; that is, old_slow_jitLoadFlattenableArrayElement() could trigger GC, so the array reference may become stale after the JIT calls old_slow_jitLoadFlattenableArrayElement(). @a7ehuo @hzongaro

hangshao0 added a commit to hangshao0/openj9 that referenced this issue Nov 17, 2021
In the slow path, loadFlattenableArrayElement() could call into
J9AllocateObject() which might trigger GC. Push/pop the array object
into/from special frame so that we always have the correct array object.

Closes eclipse-openj9#13848

Signed-off-by: Hang Shao <hangshao@ca.ibm.com>
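
The push/pop idea in the commit message can be sketched with a toy model. This is not the actual change in #13935; ToyVM, its frame vector (standing in for the GC-visible special frame), and loadWithPushPop are hypothetical names used only to show the pattern.

```cpp
#include <deque>
#include <vector>

struct Object {
    int payload;
};

// Toy VM, not OpenJ9 code. References stored in 'frame' are visible to the
// collector; every allocation moves them and poisons the old copy with -1.
struct ToyVM {
    std::deque<Object> storage; // stable addresses; old copies stay behind
    std::vector<Object*> frame; // stands in for the GC-visible special frame

    Object* allocate(int payload) {
        for (Object*& ref : frame) {
            int saved = ref->payload;
            ref->payload = -1;                // poison the old location
            storage.push_back(Object{saved}); // "move" the object
            ref = &storage.back();            // collector fixes the frame entry
        }
        storage.push_back(Object{payload});
        return &storage.back();
    }
};

// Push the array before the call that may collect, pop it afterwards, and
// only use the popped (possibly relocated) reference from then on.
int loadWithPushPop(ToyVM& vm, Object* array) {
    vm.frame.push_back(array);
    vm.allocate(0);          // slow-path allocation that may move 'array'
    array = vm.frame.back(); // refreshed reference
    vm.frame.pop_back();
    return array->payload;
}
```

The caller's original pointer still refers to the poisoned copy afterwards, so the popped value is the only one that is safe to use after the allocating call.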
@a7ehuo
Contributor Author

a7ehuo commented Nov 17, 2021

if the issue here will cause a problem for the JIT; that is, old_slow_jitLoadFlattenableArrayElement() could trigger GC, so the array reference may become stale after the JIT calls old_slow_jitLoadFlattenableArrayElement().

When we create LoadFlattenableArrayElementSymbolRef, canGCandReturn and canGCandExcept are set to true. I guess JIT should be okay. @hzongaro could you confirm?

J9::SymbolReferenceTable::findOrCreateLoadFlattenableArrayElementSymbolRef(TR::ResolvedMethodSymbol * owningMethodSymbol)
{
	return findOrCreateRuntimeHelper(TR_ldFlattenableArrayElement, true, true, true);

@hzongaro
Member

hzongaro commented Nov 18, 2021

When we create LoadFlattenableArrayElementSymbolRef, canGCandReturn and canGCandExcept are set to true. I guess JIT should be okay.

To the best of my knowledge, that is correct.

4 participants