-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8299125: UnifiedOopRef in JFR leakprofiler should treat thread local oops correctly #11741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: Stefan Karlsson <stefank@openjdk.org>
👋 Welcome back aboldtch! A progress list of the required criteria for merging this PR into |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. So we have IN_NATIVE accesses tagged as encode_in_native, and essentially AS_RAW accesses encoded as encode_non_barriered. I usually don't pick on naming, but the term "non-barriered" seems to stray away from already established terminology. I would prefer if the mentions of "non_barriered" changed to "as_raw" for consistency.
I agree, changed the terms used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
@xmas92 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 171 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
This is introducing some new, very complicated tagging scheme. I don't understand it. What is 0b011 ? |
@@ -48,7 +48,7 @@ template <typename Delegate> | |||
void RootSetClosure<Delegate>::do_oop(oop* ref) { | |||
assert(ref != NULL, "invariant"); | |||
assert(is_aligned(ref, HeapWordSize), "invariant"); | |||
if (*ref != NULL) { | |||
if (NativeAccess<>::oop_load(ref) != nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GC requires that load barriers to access oops in native storage. Currently the raw load works on all GCs as it only checks for null and all GCs encode null as NULL. But it is semantically wrong, and will not work with generational ZGC where there exist colored null which is not encoded as NULL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Should not the same construct be used in RawRootClosure? Or is it not needed there because the raw references to oops are not loaded through the Access Api?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raw storage has no GC barriers so a null oop there must be NULL. The RawAccess API is really only used when dealing with narrowOop or templated generic code which can be either. RawAccess<>::oop_load(oop* ref)
will just become a *ref
. I think I used RawAccess everywhere initially ,but @fisk thought that it should not be used for raw oop*, not sure if there were some other reason behind it than it being unnecessary / obfuscating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what I might have said is that RawAccess helps with compressed oop conversion and memory ordering sprinklings. If you don't need to do anything special with either of those and you just want a plain dereference of an oop*, then just doing a plain dereference achieves the same thing with a lot less templates. Using RawAccess can make it look like there is more to it than there really is. It's perfectly correct, but maybe just not needed and makes you scratch your head a bit when really there is nothing special to see at all.
@@ -29,21 +29,33 @@ | |||
#include "utilities/globalDefinitions.hpp" | |||
|
|||
struct UnifiedOopRef { | |||
static const uintptr_t tag_mask = LP64_ONLY(0b111) NOT_LP64(0b011); | |||
static const uintptr_t native_tag = 0b001; | |||
static const uintptr_t raw_tag = 0b010; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is "raw", compared to "native"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Native is native storage (not in java heap) which requires GC barriers. Raw is raw storage which does not require GC barriers. This is thread local storage, such as the handle area, stack, etc.
@@ -42,47 +42,60 @@ inline T UnifiedOopRef::addr() const { | |||
// this specialization provides a workaround. | |||
template<> | |||
inline uintptr_t UnifiedOopRef::addr<uintptr_t>() const { | |||
return _value & ~uintptr_t(3); | |||
return (_value & ~tag_mask) LP64_ONLY(>> 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why a shift now as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not remember exactly. But I think there was some ref which on 64 bit which had the third bit set. That is this assert failed:
assert((reinterpret_cast<uintptr_t>(ref) & UnifiedOopRef::tag_mask) == 0, "Unexpected low-order bits");
I have to remove the shift and test again. I think it is strange given that ObjectAlignmentInBytes >= 8
but we added the shift for some reason. Or maybe I am just missing something obvious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ObjectAlignmentInBytes is 4 by default on 32 bit VMs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fisk Why are the shifts needed on 64-bits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the shift is needed in order to handle unaligned oops, as for code blob oops? The existing root processing logic has up to now skipped code blob oops, mainly because they were hard to encode. The rationale was that they are not roots in and of themselves. Now, we have seen several instances where chains are not found on iterations, so maybe the assumption on skipping code blob oops is wrong. With this encoding scheme in place, it would be possible to pass a closure for processing the code blob oops as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the tardy response.
I think I forgot what the code was actually doing at the time of writing my previous answer. We are encoding narrowOop*
and oop*
here not narrowOop
/oop
. And narrowOop*
are 4 byte aligned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation, makes sense now.
assert(((reinterpret_cast<uintptr_t>(ref) LP64_ONLY(<< 1)) & UnifiedOopRef::tag_mask) == 0, "Unexpected low-order bits"); | ||
LP64_ONLY(assert(((reinterpret_cast<uintptr_t>(ref)) & (1ull << 63)) == 0, "Unexpected high-order bit")); | ||
UnifiedOopRef result = { (reinterpret_cast<uintptr_t>(ref) LP64_ONLY(<< 1)) | tag }; | ||
assert(result.addr<T>() == ref, "sanity"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is crazy hard to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. Refactoring out reinterpret_cast<uintptr_t>(ref)
into a variable might help a bit at least. The assert just want to check that the lowest two bits are not set (the tagging does not destroy information) and that on 64 bit platforms the highest bit is not set (the shift does not destroy information).
Not sure if there is a better way to express those two invariants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be easier to read the code if it was rewritten as (uncompiled):
uintptr_t value = reinterpret_cast<uintptr_t>(ref);
#ifdef _LP64
// tag_mask is 3 bits. When ref is a narrowOop* we only have 2 alignment
// bits, because of the 4 byte alignment of compressed oops addresses.
// Shift up to make way for one more bit.
assert((value & (1ull << 63)) == 0, "Unexpected high-order bit");
value <<= 1;
#endif
UnifiedOopRef result = { value | tag };
assert(result.addr<T>() == ref, "sanity");
@@ -1256,6 +1256,14 @@ namespace AccessInternal { | |||
inline bool operator !=(const T& other) const { | |||
return load<decorators | INTERNAL_VALUE_IS_OOP, P, T>(_addr) != other; | |||
} | |||
|
|||
inline bool operator ==(std::nullptr_t) const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why std::nullptr_t?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is only here to facilitate being able to compare the OopLoadProxy object directly to nullptr instead of having to create and use an oop()
/ oop(NULL)
.
We have three storage/access types in the vm, Native, Raw, Heap. The two lower bits encode this. The only valid low bits are Heap: 0b00, Native: 0b01 and Raw: 0b10. Then each storage area can be use compressed oops or not. Which is the third bit.
|
I see now it is the mask on 32-bit. |
The following is not blocking this review, but could maybe be a starting point for a future discussion: FWIW, I think that as_raw doesn't explain "why" we are not using a normal IN_NATIVE access. Maybe that's also why Markus said:
It's also an "AS" property and not an "IN" property, so it talks about "how" to perform the access and not what kind of root it is. I wanted to hint about what kind of root it was when I chose the name non-barriered. Taking a step back, ZGC deals with two categories of roots:
We use different terminology for these, like: (a) is ZGC centric, and (b) tries to be more generic. Neither of these are good names, but I think it would be beneficial to have good names for these. It would help explain why we can use "raw" access for the latter, and hopefully help the readability of the code. Maybe this is taking it too far, but if we have good names we could split IN_NATIVE into two, and tag each call site appropriately depending on root type. |
/integrate |
Going to push as commit 89d3f3d.
Your commit was automatically rebased without conflicts. |
In a similar vain to JDK-8299089 there are different semantics for OopStorage and thread local oops. UnifiedOopRef in the jfr leakprofiler must be able to distinguish these and treat them appropriately.
Testing: Running test at the moment. Has been running on the generational branch of ZGC for over three months. Not many tests exercise the leakprofiler. x86_32 has mostly been tested with GHA.
Note: The accessBackend changes are only there to facilitate comparing OopLoadProxy objects directly with nullptr instead of creating and comparing with an
oop()
/oop(NULL)
. Can be backed out of this PR and put in a different RFE instead.Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11741/head:pull/11741
$ git checkout pull/11741
Update a local copy of the PR:
$ git checkout pull/11741
$ git pull https://git.openjdk.org/jdk pull/11741/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 11741
View PR using the GUI difftool:
$ git pr show -t 11741
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11741.diff