8305895: Implementation: JEP 450: Compact Object Headers (Experimental) #13844

Merged
merged 87 commits into from
May 12, 2023

Conversation

rkennke
Contributor

@rkennke rkennke commented May 5, 2023

This is the main body of the JEP 450: Compact Object Headers (Experimental).

Main changes:

  • Introduction of the (experimental) flag UseCompactObjectHeaders. All changes in this PR are protected by this flag.
  • The compressed Klass* can now be stored in the mark-word of objects. In order to be able to do this, we are building on 8291555: Implement alternative fast-locking scheme #10907, 8305896: Alternative full GC forwarding #13582 and 8305898: Alternative self-forwarding mechanism #13779 to protect the relevant (upper 32) bits of the mark-word. Significant parts of this PR deal with loading the compressed Klass* from the mark-word, and dealing with (monitor-)locked objects. When the object is monitor-locked, we load the displaced mark-word from the monitor, and load the compressed Klass* from there. This PR also changes some code paths (mostly in GCs) to be more careful when accessing Klass* (or mark-word or size) to be able to fetch it from the forwardee in case the object is forwarded, and/or reach through to the monitor when the object is locked by a monitor.
  • The identity hash-code is narrowed to 25 bits.
  • Instances can now have their base-offset (the offset where the field layouter starts to place fields) at offset 8 (instead of 12 or 16).
  • Arrays can now store their length at offset 8. Due to alignment restrictions, array elements will still start at offset 16. 8139457: Relax alignment of array elements #11044 will resolve that restriction and allow array elements to start at offset 12 (except for long, double and uncompressed oops, which are still required to start at an element-aligned offset).
  • CDS can now write and read archives with the compressed header. However, it is not possible to read an archive that has been written with an opposite setting of UseCompactObjectHeaders.
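
To make the layout points above concrete, here is a minimal standalone sketch, not HotSpot code. It relies on only three facts from the description: the compressed Klass* occupies the upper 32 bits of the mark-word, the identity hash is 25 bits wide, and the array length sits at offset 8. The position of the hash inside the lower half (`kHashShift`) and all function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using std::size_t;
using std::uint32_t;
using std::uint64_t;

// Hypothetical model of a 64-bit compact mark-word.
constexpr int      kKlassShift = 32;   // upper 32 bits hold the narrow Klass*
constexpr int      kHashBits   = 25;   // identity hash width from the PR text
constexpr int      kHashShift  = 7;    // ASSUMED position, for illustration only
constexpr uint64_t kHashMask   = (uint64_t{1} << kHashBits) - 1;

constexpr uint32_t narrow_klass(uint64_t mark) {
  return static_cast<uint32_t>(mark >> kKlassShift);
}

constexpr uint32_t identity_hash(uint64_t mark) {
  return static_cast<uint32_t>((mark >> kHashShift) & kHashMask);
}

// Builds a mark-word; hash bits above the 25-bit width are dropped.
constexpr uint64_t make_mark(uint32_t nklass, uint32_t hash) {
  return (uint64_t{nklass} << kKlassShift) |
         ((uint64_t{hash} & kHashMask) << kHashShift);
}

// Rounds `offset` up to a multiple of `alignment` (must be a power of two).
constexpr size_t align_up(size_t offset, size_t alignment) {
  return (offset + alignment - 1) & ~(alignment - 1);
}

// With an 8-byte header the 4-byte array length lives at offset 8,
// so the first free byte after it is offset 12.
constexpr size_t kArrayLengthOffset = 8;
constexpr size_t kAfterLength = kArrayLengthOffset + sizeof(uint32_t);

// Element base while all elements must be 8-byte aligned.
constexpr size_t array_base_strict() { return align_up(kAfterLength, 8); }

// Element base when only element-size alignment is required.
constexpr size_t array_base_relaxed(size_t elem_size) {
  return align_up(kAfterLength, elem_size);
}
```

Under these assumptions, `array_base_strict()` gives 16, `array_base_relaxed(4)` gives 12, and `array_base_relaxed(8)` stays at 16, which matches the exception stated for long, double and uncompressed oops.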

Testing:
(+UseCompactObjectHeaders tests are run with the flag hard-patched into the build, to also catch @flagless tests, and to avoid mismatches with CDS - see above.)

  • tier1 (x86_64)
  • tier2 (x86_64)
  • tier3 (x86_64)
  • tier4 (x86_64)
  • tier1 (aarch64)
  • tier2 (aarch64)
  • tier3 (aarch64)
  • tier4 (aarch64)
  • tier1 (x86_64) +UseCompactObjectHeaders
  • tier2 (x86_64) +UseCompactObjectHeaders
  • tier3 (x86_64) +UseCompactObjectHeaders
  • tier4 (x86_64) +UseCompactObjectHeaders
  • tier1 (aarch64) +UseCompactObjectHeaders
  • tier2 (aarch64) +UseCompactObjectHeaders
  • tier3 (aarch64) +UseCompactObjectHeaders
  • tier4 (aarch64) +UseCompactObjectHeaders

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires CSR request JDK-8306000 to be approved

Integration blocker

 ⚠️ Dependency #13779 must be integrated first

Issues

  • JDK-8305895: Implementation: JEP 450: Compact Object Headers (Experimental)
  • JDK-8306000: Add experimental -XX:+UseCompactObjectHeaders flag (CSR)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/13844/head:pull/13844
$ git checkout pull/13844

Update a local copy of the PR:
$ git checkout pull/13844
$ git pull https://git.openjdk.org/jdk.git pull/13844/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 13844

View PR using the GUI difftool:
$ git pr show -t 13844

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/13844.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented May 5, 2023

👋 Welcome back rkennke! A progress list of the required criteria for merging this PR into pr/13779 will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label May 5, 2023
@openjdk

openjdk bot commented May 5, 2023

@rkennke The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels May 5, 2023
@rkennke rkennke changed the title 8305895: Implementation: JEP: Compact Object Headers (Experimental) 8305895: Implementation: JEP 450: Compact Object Headers (Experimental) May 7, 2023
@openjdk

openjdk bot commented May 8, 2023

⚠️ @rkennke This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@rkennke rkennke marked this pull request as ready for review May 8, 2023 15:09
@openjdk openjdk bot added the rfr Pull request is ready for review label May 8, 2023
@mlbridge

mlbridge bot commented May 8, 2023

Tom Rodriguez and others added 2 commits May 8, 2023 18:40
@merykitty
Member

I'm not sure if this is trivial or significant, but if you limit the class pointer to 30 bits and use the upper 2 bits for locking, you can obtain the class pointer in fewer instructions:

movl dst, [obj + 4]
andl dst, 0xBFFFFFFF
jl slow_path

This exploits the fact that the most significant bit is the sign bit: the mask clears the unrelated bit and the check for a valid header happens at the same time. After macro fusion the sequence is only 2 instructions long, compared to 3 currently.

This also allows quick class comparisons against constants. Assuming that most instances are in the unlocked state, a comparison where equality is likely can be done as:

cmpl [obj + 4], con | 0x40000000
jne slow_path

This can be matched on an If so that the slow_path can branch to the IfTrue label directly, and the fast path has only 1 comparison and 1 conditional jump.

Thanks.
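
The two instruction sequences suggested above can be modelled in portable C++ roughly as follows. This is a sketch under an assumed encoding inferred from the constants in the suggestion, not taken from the PR: bits 0..29 of the upper header half hold the class pointer, bit 30 is set while the object is unlocked, and bit 31 marks a state where the class pointer cannot be read directly. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMED bit assignment for this sketch only.
constexpr std::uint32_t kUnlockedBit = 0x40000000u;  // bit 30
constexpr std::uint32_t kSlowPathBit = 0x80000000u;  // bit 31, the sign bit
constexpr std::uint32_t kClearBit30  = 0xBFFFFFFFu;  // mask used by andl

// Models `andl dst, 0xBFFFFFFF; jl slow_path`: the mask drops bit 30 and
// keeps bit 31. Testing bit 31 is the same as testing for a negative
// 32-bit result, which is what `jl` does after `andl`.
inline bool load_class_bits(std::uint32_t upper_half, std::uint32_t& out) {
  const std::uint32_t masked = upper_half & kClearBit30;
  if ((masked & kSlowPathBit) != 0) {
    return false;  // caller must take the slow path
  }
  out = masked;    // plain 30-bit class pointer
  return true;
}

// Models `cmpl [obj + 4], con | 0x40000000; jne slow_path`: one compare
// covers both "unlocked" and "class equals con".
inline bool is_unlocked_instance_of(std::uint32_t upper_half,
                                    std::uint32_t con) {
  return upper_half == (con | kUnlockedBit);
}
```

Note that a locked object whose class does equal `con` fails the single compare and falls into the slow path, so the slow path still has to do the full check.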

@rkennke
Contributor Author

rkennke commented May 9, 2023

These are great suggestions! I would shy away from doing it in this PR, though, because this also affects the locking subsystem and would cause quite intrusive changes and invalidate all the testing that we've done. Let's consider this in the Lilliput project and upstream the optimization separately, ok?

Thanks!
Roman

Reviewed-by: kvn, chagedorn, thartmann
@forax
Member

forax commented May 9, 2023

This seems great as an intermediate step toward a 32-bit header.

if (UseCompactObjectHeaders) {
  // The copy above is not atomic. Make sure we have seen the proper mark
  // and re-install it into the copy, so that Klass* is guaranteed to be correct.
  markWord mark = o->mark_acquire();
Contributor

I don't think we need the acquire here, do we?

Contributor Author

Right. An atomic load would be sufficient, which is what oopDesc::mark() already does. I'll change those code paths to use just mark().
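
The distinction discussed here can be illustrated with `std::atomic`. This is a standalone sketch, not HotSpot code: `Header`, `load_mark`, `load_mark_acquire` and `fix_up_copied_mark` are hypothetical stand-ins, and modelling a plain atomic load as `memory_order_relaxed` is an assumption of the sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Stand-in for an object header; not the HotSpot oopDesc.
struct Header {
  std::atomic<std::uint64_t> mark{0};
};

// Plain atomic load: the 64-bit value is read in one piece (no tearing),
// but no ordering is imposed on surrounding memory accesses.
inline std::uint64_t load_mark(const Header& h) {
  return h.mark.load(std::memory_order_relaxed);
}

// Acquire load: additionally keeps later reads from moving before it.
inline std::uint64_t load_mark_acquire(const Header& h) {
  return h.mark.load(std::memory_order_acquire);
}

// After a non-atomic bulk copy, re-install a mark that was read atomically
// from the source, so the copy's header is a value the source really held.
inline void fix_up_copied_mark(const Header& from, Header& to) {
  to.mark.store(load_mark(from), std::memory_order_relaxed);
}
```

The fix-up only needs an untorn header value, which is why the weaker load is enough in this model.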

Contributor

@coleenp coleenp left a comment

I don't have any comments on the compiler code or gc code, but some other comments and questions. Some of the LP64 preprocessor conditionals are inconsistent in the assembly code.

  assert(!MacroAssembler::needs_explicit_null_check(oopDesc::mark_offset_in_bytes()), "must add explicit null check");
} else {
  assert(!MacroAssembler::needs_explicit_null_check(oopDesc::klass_offset_in_bytes()), "must add explicit null check");
}
Contributor

We put this check in Universe::genesis() so it's not needed here for one less conditional. Maybe that check should be this one instead of what we have there.

void MacroAssembler::load_nklass_compact(Register dst, Register src) {
  assert(UseCompactObjectHeaders, "expects UseCompactObjectHeaders");

  if (!UseCompactObjectHeaders) {
Contributor

I'm confused: why is this conditional here if you asserted it before? I can't imagine this being an untested code path where you need this for safety. If so, this doesn't take CompressedKlassPointers into account. If I'm reading this right, I think it would be better to remove it. Maybe change this assert to a guarantee for testing if you think this is likely.

I see why this is. This is inconsistent with x86. You should fix this to match x86 and make it load_narrow_klass().

assert_different_registers(obj, klass, len);
movptr(Address(obj, oopDesc::mark_offset_in_bytes()), checked_cast<int32_t>(markWord::prototype().value()));
assert_different_registers(obj, klass, len, t1, t2);
if (UseCompactObjectHeaders) {
Contributor

Shouldn't this be in _LP64 too like the code just above?

}

if (len->is_valid()) {
movl(Address(obj, arrayOopDesc::length_offset_in_bytes()), len);
if (UseCompactObjectHeaders) {
Contributor

This should also be in _LP64 and not have && !UseCompactObjectHeaders. You should restrict this to LP64 in this change.

Contributor Author

Ok I will put it in _LP64 (even though it is not strictly needed - UseCompactObjectHeaders is hard-wired constant false, so compiler will not include the code, I would expect), but why not check UseCompactObjectHeaders here? The new code is only sensible with compact headers.
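
The expectation that the compiler drops the guarded code can be sketched as follows. This is an illustrative model only: `SKETCH_LP64`, `UseCompactHeadersSketch` and `layout_in_use` are hypothetical names, not HotSpot macros or flags.

```cpp
#include <cassert>

// Hypothetical stand-in for a flag that is a runtime switch only on 64-bit
// builds. SKETCH_LP64 is an illustrative macro, not a HotSpot one.
#if defined(SKETCH_LP64)
static bool UseCompactHeadersSketch = false;     // could be set at startup
#else
constexpr bool UseCompactHeadersSketch = false;  // hard-wired constant
#endif

enum class HeaderLayout { Legacy, Compact };

// When the flag is a compile-time constant false, the first branch is dead
// code and an optimizing compiler is free to omit it entirely.
inline HeaderLayout layout_in_use() {
  if (UseCompactHeadersSketch) {
    return HeaderLayout::Compact;
  }
  return HeaderLayout::Legacy;
}
```

An explicit preprocessor guard makes the same exclusion visible to readers of the source instead of leaving it to the optimizer, which seems to be the reviewer's point.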

src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp (outdated, resolved)
src/hotspot/share/memory/universe.cpp (outdated, resolved)
#endif
return prototype;
}

Contributor

This seems like a useful change without UseCompactObjectHeaders as an enhancement and to remove some conditional code. Since we have storage in Klass for it anyway.

Contributor Author

Why? This code used to be there with BiasedLocking, and has been removed. I've re-instated it for compact object headers, because the prototype mark for an object now depends on its Klass, but other than that, why would it be useful? The prototype would be just markWord::prototype().

Contributor

My thought was not to waste a 64-bit field in Klass, and maybe to eliminate some "if (CompactObjectHeaders) use the one in Klass, else use MarkWord::prototype()" branches by always using the one in Klass. Minimizing if (CompactObjectHeaders) would be a good thing. In any case, this isn't for this change; it's just an idea to try using this field unconditionally. I recognize it from BiasedLocking.

@@ -156,7 +156,8 @@ ObjArrayKlass::ObjArrayKlass(int n, Klass* element_klass, Symbol* name) : ArrayK
 }

 size_t ObjArrayKlass::oop_size(oop obj) const {
-  assert(obj->is_objArray(), "must be object array");
+  // In this assert, we cannot safely access the Klass* with compact headers.
+  assert(UseCompactObjectHeaders || obj->is_objArray(), "must be object array");
Contributor

Isn't there code that checks oop->is_objArray() before calling this? Would it return true when it's not an objArray?

Contributor Author

Yes, there is. We're a bit excessive with asserting the klass here. I tried to remain as close as possible with that, so I disabled it only for compact object headers.

return _metadata._klass;
}
}

Klass* oopDesc::klass_or_null() const {
if (UseCompressedClassPointers) {
#ifdef _LP64
Contributor

I don't like all these #ifdef _LP64 here. Maybe markWord.inline.hpp can be refactored to not require callers to have this conditional inclusion.

Contributor Author

The problem is with 32-bit builds: there, markWord has only 32 bits in the header, so there are no upper 32 bits in which to place the Klass*. That's why I put all those #ifdefs there.
If you take a step back, you'll notice that compact object headers mostly align the header layouts of 64-bit and 32-bit JVMs. There would be a great opportunity here to consolidate all this code and make the whole header a union/struct/bitfield that looks the same on both 32-bit and 64-bit builds. But this conflicts with the current implementation, where we want to be able to switch between the compact and legacy header layouts.

Also, going forward, we want to shrink the header even more to just 32bits, and still have it switchable with the old layout. Eventually all this stuff will be the same in 32bit and 64bit JVMs, but for the time being I think we need to keep it slightly messy to support the legacy layout.

}
if (UseCompactObjectHeaders && !UseCompressedClassPointers) {
FLAG_SET_DEFAULT(UseCompressedClassPointers, true);
}
Contributor

Make this a function like set_compact_object_headers_flags() that checks FLAG_IS_CMDLINE for the related options and gives a warning for them too, and add a test.
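
One possible shape for such a helper, sketched as a standalone model. It does not use HotSpot's real flag machinery; `Flag`, `Flags`, the `set_on_cmdline` field and the warning text are all hypothetical, and only the function name comes from the suggestion above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Standalone model of a VM flag: its value plus whether the user set it
// explicitly on the command line.
struct Flag {
  bool value;
  bool set_on_cmdline;
};

struct Flags {
  Flag UseCompactObjectHeaders{false, false};
  Flag UseCompressedClassPointers{true, false};
};

// Forces the dependent flag on when compact headers are enabled, and
// collects a warning if the user explicitly asked for the opposite.
inline std::vector<std::string> set_compact_object_headers_flags(Flags& f) {
  std::vector<std::string> warnings;
  if (!f.UseCompactObjectHeaders.value) {
    return warnings;  // nothing to adjust
  }
  if (!f.UseCompressedClassPointers.value) {
    if (f.UseCompressedClassPointers.set_on_cmdline) {
      warnings.push_back(
          "UseCompactObjectHeaders requires UseCompressedClassPointers; "
          "enabling it");
    }
    f.UseCompressedClassPointers.value = true;
  }
  return warnings;
}
```

Returning the warnings rather than printing them keeps the helper easy to cover with the test the reviewer asks for.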

@coleenp
Contributor

coleenp commented May 12, 2023

@rkennke Can you merge up with the GenerationalZGC changes? Some of our test definitions need them.

@openjdk

openjdk bot commented May 12, 2023

@rkennke Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

@openjdk-notifier openjdk-notifier bot merged commit f1ad342 into openjdk:pr/13779 May 12, 2023
@rkennke
Contributor Author

rkennke commented May 12, 2023

I'm sorry, I think I butchered this PR while trying to merge the latest upstream through all the dependent PRs. Let's continue the discussion in the new PR #13961. I hope I haven't caused any breakage. (For some reason, this PR now shows as "Merged", which worries me. I believe the Skara bot did that. I wonder where it has been merged to.)
