Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement compressed refs override and split BytecodeInterpreter #8936

Merged
merged 1 commit into from
Apr 16, 2020

Conversation

sharon-wang
Copy link
Contributor

@sharon-wang sharon-wang commented Mar 20, 2020

Related: #8878

Implement compressed refs override and split BytecodeInterpreter

  • Add new override macro J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES.
  • Split BytecodeInterpreter.cpp into BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp.
  • Rename BytecodeInterpreter.cpp to BytecodeInterpreter.inc; only compile BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp
  • Move bytecode LOOP_NAME definitions from BytecodeInterpreter.inc and INTERPRETER_CLASS definitions into individual wrapper files (Debug, Full and Compressed).
  • In BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp, define the override macro as:
    TRUE if OMR_GC_COMPRESSED_POINTERS is defined
    FALSE if OMR_GC_FULL_POINTERS is defined
  • If the override macro is defined, set J9VMTHREAD_COMPRESS_OBJECT_REFERENCES
    and J9JAVAVM_COMPRESS_OBJECT_REFERENCES to be the same value as the override macro.

Split BytecodeInterpreter included class names into Full/Compressed

Of the classes that BytecodeInterpreter.hpp includes, #include "objectreferencesmacros_undefine.inc" and #include "objectreferencesmacros_define.inc" were added to determine whether or not to split the class names into Full/Compressed versions. If compile/runtime errors occurred with the undefine+define, then the appropriate class name would be split.

Split into Full/Compressed (failed undefine+define objectreferencesmacros)

  • ArrayCopyHelpers.hpp
  • ObjectAccessBarrierAPI.hpp
  • ObjectAllocationAPI.hpp
  • ObjectHash.hpp
  • ObjectMonitor.hpp
  • UnsafeAPI.hpp
  • VMHelpers.hpp

Not Split (did not fail undefine+define objectreferencesmacros; the undefine+define remain in case future changes require these to be split)

  • OutOfLineINL.hpp
  • VMAccess.hpp
  • ValueTypeHelpers.hpp

Untouched

  • AtomicSupport.hpp (in OMR)
  • BytecodeAction.hpp (no inline functions)
  • JITInterface.hpp (JIT-related)
  • MHInterpreter.hpp (further investigation needed)

Move bytecodeLoop prototypes from vm_api.h to jvminit.c

bytecodeLoop prototypes were relocated to jvminit.c since they do not need to be exposed to anything else.

Split DebugBytecodeInterpreter into Full/Compressed

  • Split DebugBytecodeInterpreter.cpp into DebugBytecodeInterpreterFull.cpp and DebugBytecodeInterpreterCompressed.cpp.
  • Remove DebugBytecodeInterpreter.cpp

Add dummy define in split files to avoid OSX compile issue

OSX complains about empty object files. As a temporary solution, #define LOOP_NAME 0 in split interpreter files so that the resulting object files are not empty if their respective code is not enabled.

To actually address this problem, these files should not be compiled if their code is disabled. Work will need to be done in makefiles, i.e. runtime/makelib/targets.mk.ftl, to skip compiling the appropriate files.

Flags in omrcfg.h, i.e. OMR_GC_COMPRESSED_POINTERS and OMR_GC_FULL_POINTERS would be helpful in identifying which files should be compiled. Need to look into if these can be accessed in OpenJ9 makefiles or if there is an OpenJ9 equivalent to the flags.

@gacholio gacholio self-assigned this Mar 20, 2020
@gacholio gacholio self-requested a review March 20, 2020 20:23
@gacholio
Copy link
Contributor

I think it would be better to remove BytecodeInterpreter.cpp (rename it to .inc perhaps), and compile both the full and compressed versions.

Each of the two specialized versions should ifdef on the correct pointer flag, and the code which selects the loop will also need to be ifdeffed.

This will work in the single-mode and mixed builds.

runtime/vm/jvminit.c Outdated Show resolved Hide resolved
@gacholio
Copy link
Contributor

jenkins test sanity xlinux,xlinuxlargeheap jdk8

@gacholio
Copy link
Contributor

I'm going to try this in mixed mode and make sure the .inc file doesn't affect debuggability before I merge.

runtime/vm/jvminit.c Outdated Show resolved Hide resolved
@gacholio
Copy link
Contributor

This all seems to work in normal and mixed mode. The .inc file does not mess up gdb.

I did notice one problem - if I compile with -fno-inline, the non-compressed mode fails (it looks like some of the shared .hpp files are being compiled in compressed mode, and being used in the generic and non-compressed paths).

@youngar @rwy0717 Any ideas how to avoid this issue (preferrably without templates!)?

@gacholio
Copy link
Contributor

@DanHeidinga Does this approach seem reasonable?

@DanHeidinga
Copy link
Member

Does this approach seem reasonable?

Seems reasonable to me. Debug mode has me a little concerned though given we've put a lot of effort into debug perf these last few years as it matters for developer experience and the fast "inner loop" development.

Has anyone done experiments to measure the debug overhead here? If not, I'd be more comfortable seeing both a compressed and full debug interpreter being created as well.

@rwy7
Copy link
Contributor

rwy7 commented Mar 24, 2020

Shooting from the hip, coalescing of functions across translation units (TUs, aka object files) is probably what's killing you. You already have multiple TUs (which would be the first thing I'd try). Next step is multiple libraries :-)

I think you're violating one definition rule (ODR), by providing differing implementations of a single inline function. Its usually safe to provide multiple versions of an inline function, but they have to be equivalent. Maybe you can mangle the names of inline functions (or classes would be easier) using token pasting, depending on the context.

You probably don't need a guide on token pasting, but:

#define J9_TOKEN_PASTE_NOEXPAND(x, y) x##y
#define J9_TOKEN_PASTE(x, y) J9_TOKEN_PASTE_NOEXPAND(x, y)
#define J9_MANGLE(x) J9_TOKEN_PASTE(J9_PTR_MODE, J9_TOKEN_PASTE(_, x))

#define J9_PTR_MODE j9_compressed
int J9_MANGLE(test)() {
    return 0;
}
#undef J9_PTR_MODE

#define J9_PTR_MODE j9_full
int J9_MANGLE(test)() {
    return 1;
}
#undef J9_PTR_MODE

Gives you the function names j9_compressed_test and j9_full_test.

Ultimately, I think it's better to address the ODR violations directly in the code, rather than finding some combination of link order or build flag that makes it go away. An ODR violation is undefined behaviour, and really tricky to track down.

cc @dnakamura, check my math

@gacholio
Copy link
Contributor

gacholio commented Mar 24, 2020

@rwy0717 Could this be done by renaming the class, rather than the functions?

This would allow the source code to remain unchanged (both the user and declaration would get the updated name).

We could do this in the header files themselves, gated by the new override flag that Sharon has added.

Something like:

#if defined(J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES)
#if J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES
#define VM_ObjectMonitor VM_ObjectMonitorCompressed
#else
#define VM_ObjectMonitor VM_ObjectMonitorFull
#endif
#endif

@DanHeidinga Let me know if this is starting to offend your sensibilities.

@rwy7
Copy link
Contributor

rwy7 commented Mar 24, 2020

Definitely, you can rename the class. Don't forget, if the name appears in any other file, those other files will also have to be fixed. As well, any out-of-line member function will have to be defined for each version of the class.

For example, here's an out-of-line reference to the VM_ObjectMonitor class: link. Luckily, this particular case is easy, you can just move VM_ObjectMonitor::incrementCancelCounter out of VM_ObjectMonitor, maybe to a new base class.

@DanHeidinga
Copy link
Member

@rwy0717 Could this be done by renaming the class, rather than the functions?
....
@DanHeidinga Let me know if this is starting to offend your sensibilities.

I prefer this to mangling all the method calls. This is something that needs to only be done in the header files and not worried about by every user

@gacholio
Copy link
Contributor

@rwy0717 I'm not sure that even needs to be changed. If it's in the .cpp file, the .hpp file will have been included without the override. Almost every .cpp file in the VM isn't really C++ - the functions are all extern "C".

@gacholio
Copy link
Contributor

@sharon-wang Can you please start work on the solution above? At the very least, we will need VM_ObjectMonitor and MM_ObjectAllocationAPI fixed.

It may also be worth adding something to catch errors, i.e. define something in the .hpp declaring that it does not care about compressed refs, and if we hit one of the decision macros when this is defined, error out (indicating that the class should have been split with the above solution). I haven't thought this through.

@rwy7
Copy link
Contributor

rwy7 commented Mar 24, 2020

OK, so there would actually be three implementations, not just two. That should be fine then.

@gacholio
Copy link
Contributor

#if defined(J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES)
#if defined(J9_DONT_SPLIT_REFERENCE_MODE)
#define J9VMTHREAD_COMPRESS_OBJECT_REFERENCES(vmThread) ERROR
#else /* J9_DONT_SPLIT_REFERENCE_MODE */
#define J9VMTHREAD_COMPRESS_OBJECT_REFERENCES(vmThread) J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES
#endif /* J9_DONT_SPLIT_REFERENCE_MODE */
#else /* J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES */
#define J9VMTHREAD_COMPRESS_OBJECT_REFERENCES(vmThread) (0 != (vmThread)->compressObjectReferences)
#endif /* J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES */

This might work for the error catching case - for example, I'd define J9_DONT_SPLIT_REFERENCE_MODE in VMAccess.hpp. I do not want to put a #error in j9nonbuilder.h, as it's perfectly OK to include the file if you don't use the compressed queries, so the error has to occur at the point where the macro is invoked.

@gacholio
Copy link
Contributor

It's also worth noting that when inlining is allowed (i.e. production builds), this will not generate any unnecessarily duplicated code, since the functions will be inlined into the caller in the appropriate compression mode.

@gacholio
Copy link
Contributor

gacholio commented Apr 9, 2020

ObjectAccessBarrier.hpp as well:

0x00007ffff5a65cfb in VM_BytecodeInterpreterFull::arraylength (
    this=0x7ffff6d61f50, _sp=@0x7ffff6d61ec0, _pc=@0x7ffff6d61ec8)
    at BytecodeInterpreter.hpp:6101
6101				*(U_32*)_sp = J9INDEXABLEOBJECT_SIZE(_currentThread, arrayref);

@gacholio
Copy link
Contributor

gacholio commented Apr 9, 2020

With all of the changes I suggested, I can get through -version in both compressed and non (using the same VM). I'll do some more extensive testing now.

@DanHeidinga
Copy link
Member

I'm OK with this approach. It seems like the best option and something we can continue to support in the future.

I would like the DebugInterpreter to get the same compile-twice treatment as well. Debug time performance matters to developers and we don't want to unnecessarily slow down the debug experience.

As I was discussing with Gac on slack, one of the reasons J9 succeeded inside IBM initially was as the "debug vm" since it could boot certain app servers for debug in a reasonable amount of time.

@gacholio
Copy link
Contributor

gacholio commented Apr 9, 2020

Ignoring invalid debug-mode JIT assertions, a no-inline debug build passes internal sanity with a few possible problems that I'll look into, so I'm confident in this approach.

@gacholio
Copy link
Contributor

I would like the DebugInterpreter to get the same compile-twice treatment

Please do the same for this as for the normal interperer - hoist the loop name, etc. out into the new .cpp files.

@sharon-wang sharon-wang changed the title Implement compressed refs override and split BytecodeInterpreter WIP: Implement compressed refs override and split BytecodeInterpreter Apr 14, 2020
@gacholio
Copy link
Contributor

Due to some technical issues, we're not splitting MHInterpreter.cpp in this PR. I'd like to get this committed sooner rather than later, so the MH experiments can take place later.

@sharon-wang
Copy link
Contributor Author

An update:

  • ObjectAccessBarrierAPI.hpp and ObjectAllocationAPI.hpp have been split
  • requested fixes have been added

In progress:

  • split Debug interpreter into Full/Compressed

TODO:

@gacholio
Copy link
Contributor

Confirmed working with my mix/debug scripts.

@gacholio
Copy link
Contributor

gacholio commented Apr 16, 2020

The debug interpreter changes work fine for me (again, with my rebuild scripts). I think any perf issue that was noticed (but not measured?) is a result of internal network traffic.

@sharon-wang
Copy link
Contributor Author

Debug interpreter split and temporary OSX compile fix are now pushed.

MHInterpreter.hpp class name split and a better OSX compile fix will be addressed in a separate PR.

@gacholio
Copy link
Contributor

Please move the override defines as close to the top of each file as possible (only j9cfg.h need be included beforehand). As things stand, some files that need splitting may be included before the split define.

@sharon-wang
Copy link
Contributor Author

The override defines have been moved up for the split files

@gacholio
Copy link
Contributor

Sorry, ignore that previous comment - it's all fine.

Copy link
Contributor

@gacholio gacholio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - please squash and remove the WIP and I'll fire off some testing.

Add new override macro J9_OVERRIDE_COMPRESS_OBJECT_REFERENCES.

Split BytecodeInterpreter.cpp into BytecodeInterpreterFull.cpp and BytecodeInterpreterCompressed.cpp.

Split DebugBytecodeInterpreter.cpp into DebugBytecodeInterpreterFull.cpp and DebugBytecodeInterpreterCompressed.cpp.

If the override macro is defined, set J9VMTHREAD_COMPRESS_OBJECT_REFERENCES and J9JAVAVM_COMPRESS_OBJECT_REFERENCES to be the same value as the override macro.

Split BytecodeInterpreter.hpp included class names: ArrayCopyHelpers, ObjectHash, ObjectMonitor, ObjectAccessBarrierAPI, ObjectAllocationAPI, UnsafeAPI, VMHelpers into Full/Compressed

Signed-off-by: Sharon Wang <sharon-wang-cpsc@outlook.com>
@sharon-wang sharon-wang changed the title WIP: Implement compressed refs override and split BytecodeInterpreter Implement compressed refs override and split BytecodeInterpreter Apr 16, 2020
@sharon-wang
Copy link
Contributor Author

@gacholio squashed and WIP removed!

@gacholio
Copy link
Contributor

jenkins test sanity all jdk8

@gacholio gacholio merged commit 6db7a24 into eclipse-openj9:master Apr 16, 2020
@DanHeidinga
Copy link
Member

Congrats on getting this landed! It's a major step on the way to the single package with both compressed & large-heap!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants