Implement compressed refs override and split BytecodeInterpreter #8936

sharon-wang · 2020-03-20T20:17:52Z

Related: #8878

Implement compressed refs override and split BytecodeInterpreter

Add new override macro J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES.
Split BytecodeInterpreter.cpp into BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp.
Rename BytecodeInterpreter.cpp to BytecodeInterpreter.inc; only compile BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp
Move bytecode LOOP_NAME definitions from BytecodeInterpreter.inc and INTERPRETER_CLASS definitions into individual wrapper files (Debug, Full and Compressed).
In BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp, define the override macro as:
TRUE if OMR_GC_COMPRESSED_POINTERS is defined
FALSE if OMR_GC_FULL_POINTERS is defined
If the override macro is defined, set J9VMTHREAD_COMPRESS_OBJECT_REFERENCES
and J9JAVAVM_COMPRESS_OBJECT_REFERENCES to be the same value as the override macro.

Split BytecodeInterpreter included class names into Full/Compressed

Of the classes that BytecodeInterpreter.hpp includes, #include "objectreferencesmacros_undefine.inc" and #include "objectreferencesmacros_define.inc" were added to determine whether or not to split the class names into Full/Compressed versions. If compile/runtime errors occurred with the undefine+define, then the appropriate class name would be split.

Split into Full/Compressed (failed undefine+define objectreferencesmacros)

ArrayCopyHelpers.hpp
ObjectAccessBarrierAPI.hpp
ObjectAllocationAPI.hpp
ObjectHash.hpp
ObjectMonitor.hpp
UnsafeAPI.hpp
VMHelpers.hpp

Not Split (did not fail undefine+define objectreferencesmacros; the undefine+define remain in case future changes require these to be split)

OutOfLineINL.hpp
VMAccess.hpp
ValueTypeHelpers.hpp

Untouched

AtomicSupport.hpp (in OMR)
BytecodeAction.hpp (no inline functions)
JITInterface.hpp (JIT-related)
MHInterpreter.hpp (further investigation needed)

Move bytecodeLoop prototypes from vm_api.h to jvminit.c

bytecodeLoop prototypes were relocated to jvminit.c since they do not need to be exposed to anything else.

Split DebugBytecodeInterpreter into Full/Compressed

Split DebugBytecodeInterpreter.cpp into DebugBytecodeInterpreterFull.cpp and DebugBytecodeInterpreterCompressed.cpp.
Remove DebugBytecodeInterpreter.cpp

Add dummy define in split files to avoid OSX compile issue

OSX complains about empty object files. As a temporary solution, #define LOOP_NAME 0 in split interpreter files so that the resulting object files are not empty if their respective code is not enabled.

To actually address this problem, these files should not be compiled if their code is disabled. Work will need to be done in makefiles, i.e. runtime/makelib/targets.mk.ftl, to skip compiling the appropriate files.

Flags in omrcfg.h, i.e. OMR_GC_COMPRESSED_POINTERS and OMR_GC_FULL_POINTERS would be helpful in identifying which files should be compiled. Need to look into if these can be accessed in OpenJ9 makefiles or if there is an OpenJ9 equivalent to the flags.

runtime/oti/j9nonbuilder.h

gacholio · 2020-03-22T20:09:17Z

I think it would be better to remove BytecodeInterpreter.cpp (rename it to .inc perhaps), and compile both the full and compressed versions.

Each of the two specialized versions should ifdef on the correct pointer flag, and the code which selects the loop will also need to be ifdeffed.

This will work in the single-mode and mixed builds.

runtime/vm/BytecodeInterpreterFull.cpp

runtime/makelib/targets.mk.zos.inc.ftl

runtime/vm/jvminit.c

gacholio · 2020-03-23T17:25:22Z

jenkins test sanity xlinux,xlinuxlargeheap jdk8

gacholio · 2020-03-23T20:16:30Z

I'm going to try this in mixed mode and make sure the .inc file doesn't affect debuggability before I merge.

runtime/vm/jvminit.c

gacholio · 2020-03-24T02:32:44Z

This all seems to work in normal and mixed mode. The .inc file does not mess up gdb.

I did notice one problem - if I compile with -fno-inline, the non-compressed mode fails (it looks like some of the shared .hpp files are being compiled in compressed mode, and being used in the generic and non-compressed paths).

@youngar @rwy0717 Any ideas how to avoid this issue (preferrably without templates!)?

gacholio · 2020-03-24T02:33:09Z

@DanHeidinga Does this approach seem reasonable?

DanHeidinga · 2020-03-24T02:57:26Z

Does this approach seem reasonable?

Seems reasonable to me. Debug mode has me a little concerned though given we've put a lot of effort into debug perf these last few years as it matters for developer experience and the fast "inner loop" development.

Has anyone done experiments to measure the debug overhead here? If not, I'd be more comfortable seeing both a compressed and full debug interpreter being created as well.

rwy7 · 2020-03-24T03:30:28Z

Shooting from the hip, coalescing of functions across translation units (TUs, aka object files) is probably what's killing you. You already have multiple TUs (which would be the first thing I'd try). Next step is multiple libraries :-)

I think you're violating one definition rule (ODR), by providing differing implementations of a single inline function. Its usually safe to provide multiple versions of an inline function, but they have to be equivalent. Maybe you can mangle the names of inline functions (or classes would be easier) using token pasting, depending on the context.

You probably don't need a guide on token pasting, but:

#define J9_TOKEN_PASTE_NOEXPAND(x, y) x##y
#define J9_TOKEN_PASTE(x, y) J9_TOKEN_PASTE_NOEXPAND(x, y)
#define J9_MANGLE(x) J9_TOKEN_PASTE(J9_PTR_MODE, J9_TOKEN_PASTE(_, x))

#define J9_PTR_MODE j9_compressed
int J9_MANGLE(test)() {
    return 0;
}
#undef J9_PTR_MODE

#define J9_PTR_MODE j9_full
int J9_MANGLE(test)() {
    return 1;
}
#undef J9_PTR_MODE

Gives you the function names j9_compressed_test and j9_full_test.

Ultimately, I think it's better to address the ODR violations directly in the code, rather than finding some combination of link order or build flag that makes it go away. An ODR violation is undefined behaviour, and really tricky to track down.

cc @dnakamura, check my math

gacholio · 2020-03-24T14:41:15Z

@rwy0717 Could this be done by renaming the class, rather than the functions?

This would allow the source code to remain unchanged (both the user and declaration would get the updated name).

We could do this in the header files themselves, gated by the new override flag that Sharon has added.

Something like:

#if defined(J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES)
#if J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES
#define VM_ObjectMonitor VM_ObjectMonitorCompressed
#else
#define VM_ObjectMonitor VM_ObjectMonitorFull
#endif
#endif

@DanHeidinga Let me know if this is starting to offend your sensibilities.

rwy7 · 2020-03-24T15:03:11Z

Definitely, you can rename the class. Don't forget, if the name appears in any other file, those other files will also have to be fixed. As well, any out-of-line member function will have to be defined for each version of the class.

For example, here's an out-of-line reference to the VM_ObjectMonitor class: link. Luckily, this particular case is easy, you can just move VM_ObjectMonitor::incrementCancelCounter out of VM_ObjectMonitor, maybe to a new base class.

DanHeidinga · 2020-03-24T15:05:02Z

@rwy0717 Could this be done by renaming the class, rather than the functions?
....
@DanHeidinga Let me know if this is starting to offend your sensibilities.

I prefer this to mangling all the method calls. This is something that needs to only be done in the header files and not worried about by every user

gacholio · 2020-03-24T15:07:34Z

@rwy0717 I'm not sure that even needs to be changed. If it's in the .cpp file, the .hpp file will have been included without the override. Almost every .cpp file in the VM isn't really C++ - the functions are all extern "C".

gacholio · 2020-03-24T15:10:21Z

@sharon-wang Can you please start work on the solution above? At the very least, we will need VM_ObjectMonitor and MM_ObjectAllocationAPI fixed.

It may also be worth adding something to catch errors, i.e. define something in the .hpp declaring that it does not care about compressed refs, and if we hit one of the decision macros when this is defined, error out (indicating that the class should have been split with the above solution). I haven't thought this through.

rwy7 · 2020-03-24T15:12:11Z

OK, so there would actually be three implementations, not just two. That should be fine then.

gacholio · 2020-03-24T15:28:53Z

#if defined(J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES)
#if defined(J9_DONT_SPLIT_REFERENCE_MODE)
#define J9VMTHREAD_COMPRESS_OBJECT_REFERENCES(vmThread) ERROR
#else /* J9_DONT_SPLIT_REFERENCE_MODE */
#define J9VMTHREAD_COMPRESS_OBJECT_REFERENCES(vmThread) J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES
#endif /* J9_DONT_SPLIT_REFERENCE_MODE */
#else /* J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES */
#define J9VMTHREAD_COMPRESS_OBJECT_REFERENCES(vmThread) (0 != (vmThread)->compressObjectReferences)
#endif /* J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES */

This might work for the error catching case - for example, I'd define J9_DONT_SPLIT_REFERENCE_MODE in VMAccess.hpp. I do not want to put a #error in j9nonbuilder.h, as it's perfectly OK to include the file if you don't use the compressed queries, so the error has to occur at the point where the macro is invoked.

gacholio · 2020-03-24T15:30:22Z

It's also worth noting that when inlining is allowed (i.e. production builds), this will not generate any unnecessarily duplicated code, since the functions will be inlined into the caller in the appropriate compression mode.

gacholio · 2020-04-09T19:03:41Z

ObjectAccessBarrier.hpp as well:

0x00007ffff5a65cfb in VM_BytecodeInterpreterFull::arraylength (
    this=0x7ffff6d61f50, _sp=@0x7ffff6d61ec0, _pc=@0x7ffff6d61ec8)
    at BytecodeInterpreter.hpp:6101
6101				*(U_32*)_sp = J9INDEXABLEOBJECT_SIZE(_currentThread, arrayref);

runtime/oti/objectreferencesmacros_define.h

gacholio · 2020-04-09T19:29:00Z

With all of the changes I suggested, I can get through -version in both compressed and non (using the same VM). I'll do some more extensive testing now.

DanHeidinga · 2020-04-09T19:50:01Z

I'm OK with this approach. It seems like the best option and something we can continue to support in the future.

I would like the DebugInterpreter to get the same compile-twice treatment as well. Debug time performance matters to developers and we don't want to unnecessarily slow down the debug experience.

As I was discussing with Gac on slack, one of the reasons J9 succeeded inside IBM initially was as the "debug vm" since it could boot certain app servers for debug in a reasonable amount of time.

gacholio · 2020-04-09T20:48:29Z

Ignoring invalid debug-mode JIT assertions, a no-inline debug build passes internal sanity with a few possible problems that I'll look into, so I'm confident in this approach.

gacholio · 2020-04-14T00:42:37Z

I would like the DebugInterpreter to get the same compile-twice treatment

Please do the same for this as for the normal interperer - hoist the loop name, etc. out into the new .cpp files.

gacholio · 2020-04-15T05:01:28Z

Due to some technical issues, we're not splitting MHInterpreter.cpp in this PR. I'd like to get this committed sooner rather than later, so the MH experiments can take place later.

sharon-wang · 2020-04-15T15:40:24Z

An update:

ObjectAccessBarrierAPI.hpp and ObjectAllocationAPI.hpp have been split
requested fixes have been added

In progress:

split Debug interpreter into Full/Compressed

TODO:

address osx build issue - Implement compressed refs override and split BytecodeInterpreter #8936 (comment)

gacholio · 2020-04-15T18:32:48Z

Confirmed working with my mix/debug scripts.

gacholio · 2020-04-16T04:24:19Z

The debug interpreter changes work fine for me (again, with my rebuild scripts). I think any perf issue that was noticed (but not measured?) is a result of internal network traffic.

sharon-wang · 2020-04-16T15:26:52Z

Debug interpreter split and temporary OSX compile fix are now pushed.

MHInterpreter.hpp class name split and a better OSX compile fix will be addressed in a separate PR.

gacholio · 2020-04-16T15:49:21Z

Please move the override defines as close to the top of each file as possible (only j9cfg.h need be included beforehand). As things stand, some files that need splitting may be included before the split define.

sharon-wang · 2020-04-16T17:22:51Z

The override defines have been moved up for the split files

gacholio · 2020-04-16T19:00:55Z

Sorry, ignore that previous comment - it's all fine.

gacholio

Looks good to me - please squash and remove the WIP and I'll fire off some testing.

Add new override macro J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES. Split BytecodeInterpreter.cpp into BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp. Split DebugBytecodeInterpreter.cpp into DebugBytecodeInterpreterFull.cpp and DebugBytecodeInterpreterCompressed.cpp. If the override macro is defined, set J9VMTHREAD_COMPRESS_OBJECT_REFERENCES and J9JAVAVM_COMPRESS_OBJECT_REFERENCES to be the same value as the override macro. Split BytecodeInterpreter.hpp included class names: ArrayCopyHelpers, ObjectHash, ObjectMonitor, ObjectAccessBarrierAPI, ObjectAllocationAPI, UnsafeAPI, VMHelpers into Full/Compressed Signed-off-by: Sharon Wang <sharon-wang-cpsc@outlook.com>

sharon-wang · 2020-04-16T19:29:34Z

@gacholio squashed and WIP removed!

gacholio · 2020-04-16T19:32:18Z

jenkins test sanity all jdk8

DanHeidinga · 2020-04-17T13:40:40Z

Congrats on getting this landed! It's a major step on the way to the single package with both compressed & large-heap!

gacholio self-assigned this Mar 20, 2020

gacholio self-requested a review March 20, 2020 20:23

gacholio reviewed Mar 20, 2020

View reviewed changes

runtime/oti/j9nonbuilder.h Outdated Show resolved Hide resolved

sharon-wang force-pushed the split-interpreter branch from de9ab8f to aa7b565 Compare March 20, 2020 21:17

sharon-wang force-pushed the split-interpreter branch from 4602244 to 86defc7 Compare March 23, 2020 16:18

gacholio reviewed Mar 23, 2020

View reviewed changes

runtime/vm/BytecodeInterpreterFull.cpp Outdated Show resolved Hide resolved

gacholio reviewed Mar 23, 2020

View reviewed changes

runtime/makelib/targets.mk.zos.inc.ftl Outdated Show resolved Hide resolved

fjeremic added comp:build comp:vm labels Mar 23, 2020

sharon-wang force-pushed the split-interpreter branch 2 times, most recently from 7f50fda to 843405d Compare March 23, 2020 17:01

gacholio reviewed Mar 23, 2020

View reviewed changes

runtime/vm/jvminit.c Outdated Show resolved Hide resolved

sharon-wang force-pushed the split-interpreter branch from 843405d to 7dbd86a Compare March 23, 2020 17:21

gacholio reviewed Mar 23, 2020

View reviewed changes

runtime/vm/jvminit.c Outdated Show resolved Hide resolved

sharon-wang force-pushed the split-interpreter branch from 7dbd86a to b07f31c Compare March 23, 2020 21:17

gacholio reviewed Apr 9, 2020

View reviewed changes

runtime/oti/objectreferencesmacros_define.h Outdated Show resolved Hide resolved

sharon-wang changed the title ~~Implement compressed refs override and split BytecodeInterpreter~~ WIP: Implement compressed refs override and split BytecodeInterpreter Apr 14, 2020

sharon-wang force-pushed the split-interpreter branch from a30cd51 to c293083 Compare April 15, 2020 15:33

sharon-wang force-pushed the split-interpreter branch from c293083 to 9d61aa7 Compare April 15, 2020 15:43

sharon-wang force-pushed the split-interpreter branch from 9d61aa7 to 9b365df Compare April 16, 2020 15:23

sharon-wang force-pushed the split-interpreter branch from 9b365df to f5e0f15 Compare April 16, 2020 15:33

sharon-wang force-pushed the split-interpreter branch from f5e0f15 to 78c23d8 Compare April 16, 2020 17:21

gacholio approved these changes Apr 16, 2020

View reviewed changes

sharon-wang force-pushed the split-interpreter branch from 78c23d8 to afa8db4 Compare April 16, 2020 19:29

sharon-wang changed the title ~~WIP: Implement compressed refs override and split BytecodeInterpreter~~ Implement compressed refs override and split BytecodeInterpreter Apr 16, 2020

gacholio merged commit 6db7a24 into eclipse-openj9:master Apr 16, 2020

tajila mentioned this pull request Apr 20, 2020

JDK15 valhalla compilation failure : ValueTypeHelpers.hpp:86:41: error: '_currentThread' was not declared in this scope #9287

Closed

sharon-wang deleted the split-interpreter branch January 12, 2021 16:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement compressed refs override and split BytecodeInterpreter #8936

Implement compressed refs override and split BytecodeInterpreter #8936

sharon-wang commented Mar 20, 2020 •

edited

Loading

gacholio commented Mar 22, 2020

gacholio commented Mar 23, 2020

gacholio commented Mar 23, 2020

gacholio commented Mar 24, 2020

gacholio commented Mar 24, 2020

DanHeidinga commented Mar 24, 2020

rwy7 commented Mar 24, 2020 •

edited

Loading

gacholio commented Mar 24, 2020 •

edited

Loading

rwy7 commented Mar 24, 2020

DanHeidinga commented Mar 24, 2020

gacholio commented Mar 24, 2020

gacholio commented Mar 24, 2020

rwy7 commented Mar 24, 2020

gacholio commented Mar 24, 2020

gacholio commented Mar 24, 2020

gacholio commented Apr 9, 2020

gacholio commented Apr 9, 2020

DanHeidinga commented Apr 9, 2020

gacholio commented Apr 9, 2020

gacholio commented Apr 14, 2020

gacholio commented Apr 15, 2020

sharon-wang commented Apr 15, 2020

gacholio commented Apr 15, 2020

gacholio commented Apr 16, 2020 •

edited

Loading

sharon-wang commented Apr 16, 2020

gacholio commented Apr 16, 2020

sharon-wang commented Apr 16, 2020

gacholio commented Apr 16, 2020

gacholio left a comment

sharon-wang commented Apr 16, 2020

gacholio commented Apr 16, 2020

DanHeidinga commented Apr 17, 2020

Implement compressed refs override and split BytecodeInterpreter #8936

Implement compressed refs override and split BytecodeInterpreter #8936

Conversation

sharon-wang commented Mar 20, 2020 • edited Loading