-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8288477: nmethod header size reduction #9165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back bulasevich! A progress list of the required criteria for merging this PR into |
@bulasevich The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
4e9cce5
to
aa74775
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, lgtm.
@@ -647,7 +648,6 @@ nmethod::nmethod( | |||
_nmethod_end_offset = _nul_chk_table_offset; | |||
#endif | |||
_compile_id = compile_id; | |||
_comp_level = CompLevel_none; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to move it to line 626?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put it back. Thanks.
@@ -54,7 +54,7 @@ enum MethodCompilation { | |||
}; | |||
|
|||
// Enumeration to distinguish tiers of compilation | |||
enum CompLevel { | |||
enum CompLevel : s1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hope it won't cause gcc to generate inefficient code manupulating bytes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AARCH and AMD have load byte instructions (ldr:ldrb, mov:movzx), I believe method::comp_level() code takes the same number of instructions before/after the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enum change might be hard to back-port to jdk11, which still uses an older toolchain, at least for Oracle builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
313ca2e
to
c009f44
Compare
It would be worth to add a comment to |
dbb334c
to
1c1e359
Compare
…gnment - use single byte to hold CompilerType and CompLevel values Сleanup work: use strict CompLevel type in the code
77ef013
to
644be41
Compare
Webrevs
|
It's also a good idea to have 64 bytes in between volatiles so they are not in the same cache-line |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about CompiledMethod?
Would be interesting to profile fields access. I assume hot fields should be in first cache line. Or it is not important?
Most of the files changed are because of CompLevel. It feels a little disruptive. I'd rather do the minimal changes. There is also a lot of unnecessary space used by these addresses: Now that AOT has been removed, we could go back to 3 int fields like in jdk8. |
@@ -281,6 +264,25 @@ class nmethod : public CompiledMethod { | |||
ByteSize _native_receiver_sp_offset; | |||
ByteSize _native_basic_lock_sp_offset; | |||
|
|||
CompLevel _comp_level; // compilation level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To minimize changes in other files, how about just making this:
int8_t _comp_level;
I didn't find a way to reduce the size of CompiledMethod:
I don't have a heatmap of the nmethod structure. If the nmethod data is being actively accessed on a critical code path, it probably makes sense to move the hot fields together (it doesn't have to be the first cache line, I guess). Yes, my concern was that reordering the fields as a side effect might impact performance due to a cache miss or something like that. I ran Renaissance Suite to make sure that performance is not affected. |
Actually, there is not much space in nmethod. Can you suggest a better layout?:
|
Do you mind using the CompilerType? Since we have this type defined, I think it should be used. |
I was going to suggest doing it as a separate change after this one. |
There is Leyden project for which we may need it. |
OK, but the X_end pointers could probably be 32-bit size fields relative to X_start. |
I agree with Dean. Lets change int->CompLevel in separate changes. CompilerType is not the same as CompLevel. |
I agree with that. I was also thinking about it. |
@bulasevich Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information. |
485b725
to
2d2a07a
Compare
@bulasevich Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information. |
Ok. I removed the unnecessary changes. I will bring them as a separate change. |
I created JDK-8290818 for that. I do not know profile tools to grab the access statistics automatically. I think I need to patch all the nmethod getters (check if it is done from asm), and run some apps to collect the data with the patched VM. Please let me know if you know a better way. |
@ericcaspole can you suggest how we can do that without patching VM code? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This version looks good to me. I will test it.
@dougxc these changes affect JVMCI. Please, review it. They need to be uploaded to Graal's JVMCI after pushed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doug is out but I took a look. The changes look fine and since Graal doesn't access nmethod::_comp_level directly we don't need to adjust to these changes.
@@ -255,7 +255,7 @@ | |||
nonstatic_field(MethodData, _jvmci_ir_size, int) \ | |||
\ | |||
nonstatic_field(nmethod, _verified_entry_point, address) \ | |||
nonstatic_field(nmethod, _comp_level, int) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should declare CompLevel in this file as well. I think it might be missing the sanity checking that detect missing type declarations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Actually most of the types used in vmStructs_jvmci.cpp are not declared in VM_TYPES:
- int, intptr_t, jbyte, jint, jlong, juint, u1, u2, u4, uint, uint64_t, uintptr_t, unsigned int, void*
- AccessFlags, Annotations*, ClassLoaderData*, CollectedHeap*, CompiledMethod*, ConstMethod*,
- JavaFrameAnchor, JavaThread*, MethodCounters*, MethodData*, ObjectWaiter*, OopHandle, OSThread*, Thread*
is it an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it's fine to leave it out. Vladimir had added some sanity checks to the JVMCI vmStructs in https://bugs.openjdk.org/browse/JDK-8237497 and I'd assumed it was all of the checks that the regular vmStructs does but it's not. The regular VMStructs requires that all reachable types are fully described but JVMCI actually doesn't use the VMTypeEntry/declare_toplevel_type stuff at all. It's not exposed API so it should be dropped I think. A debug build will catch the field type mismatches at compile time which is the primary thing we care about, though I think that should enabled in all builds instead of being debug only. The required checks should fold away or produce a compile time error. You can see that jvmci_vmStructs_init compiles into an empty method in the fastdebug build. Anyway, I filed https://bugs.openjdk.org/browse/JDK-8291513 to remove those declarations completely.
@bulasevich This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 288 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
Why not be more aggressive about using offsets instead of explicit addresses? Why store both _begin and _end fields? The end of one section is usually the beginning of the another section. These explicit fields aren't very nice either. Why not use an enum with arrays for the various values? Then this stuff can all be written in a consistent pattern using offsets and getting the end of a section using the beginning of the next section. Section alignments could also be declared in a consistent way using the enum. You could probably smush _is_compiled, _caller_must_gc_arguments, _type and _header_size into a bitfield. Certainly _header_size doesn't need to be a full int since sizeof(nmethod) is only around 400 bytes. |
Using offset will make impossible to separate non-code data from nmethod. We are working on prototypes to separate non-code data to improve code density in CodeCache. |
That's good to hear. I've always thought that would be a good idea. It seems like you'd still be able to separate it into two chunks and use offsets within those. |
Mailing list message from John Rose on hotspot-dev: I wish we had a more systematic way to handle packed structs with many variable-sized segments. It?s a problem in metadata as well. If there were something (with enums? offset arrays? iterators? bitmaps for conditional elements? all of the above?) that could handle such structs, that would be the place to put all of our heroics, rather than one place at at time, as with nmethods in this case. On 21 Jul 2022, at 16:40, Tom Rodriguez wrote: |
Thank you all! /integrate |
Going to push as commit e052d7f.
Your commit was automatically rebased without conflicts. |
@bulasevich Pushed as commit e052d7f. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Each compiled method contains an nmethod header. In trivial case, the header takes up half the method payload: ~350 bytes. Over time, the header gets bigger. With this change, I suggest sorting the header data fields from largest to smallest to minimize header paddings, and using one byte for the CompilerType and CompLevel values.
Cleanup work: apply CompLevel type where applicable.
The change tested with jtreg tier1-3, :hotspot_compiler :hotspot_gc :hotspot_serviceability :hotspot_runtime
Renaissance benchmarks shows no performance regressions on x86 and aarch.
BEFORE:
AFTER:
BEFORE:
AFTER:
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/9165/head:pull/9165
$ git checkout pull/9165
Update a local copy of the PR:
$ git checkout pull/9165
$ git pull https://git.openjdk.org/jdk pull/9165/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 9165
View PR using the GUI difftool:
$ git pr show -t 9165
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/9165.diff