8269897: Shenandoah: Treat UNKNOWN refs access as strong #219
Conversation
|
Webrevs
|
It looks okay to me, and it fixes the jcstress failures. But I see one of the tier1
tests failing:
$ make run-test TEST=gc/TestReferenceClearDuringReferenceProcessing.java TEST_VM_OPTS="-XX:+UseShenandoahGC"
...
TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.RuntimeException: ref not enqueued
I wonder if that failure indicates that fix is still incomplete or too conservative.
Changing behaviour for oop_load_in_heap() seems to break it (although it is not quite obvious to me how - the test does not seem to access Reference.referent itself, especially not via reflection/unsafe/etc. Maybe something internal in Reference queue does, and expects it to return NULL on unreachable.) |
Need synopsis change after limiting a fix?
Need to handle xchg
case too.
This patch passes tier1
with Shenandoah for me, but fails jcstress bundle when run with C1. A relevant ShenandoahBarrierSetC1
change is quite probably missing.
Also, probably Unsafe
get
and set
require the same handling: under concurrent weak roots, it is safer to return overly strong value, not NULL for strong refs. Let it be users' problem accessing weak values with Unsafe?
@@ -640,7 +640,7 @@ Node* ShenandoahBarrierSetC2::atomic_cmpxchg_val_at_resolved(C2AtomicParseAccess | |||
load_store = kit->gvn().transform(new DecodeNNode(load_store, load_store->get_ptr_type())); | |||
} | |||
#endif | |||
load_store = kit->gvn().transform(new ShenandoahLoadReferenceBarrierNode(NULL, load_store, access.decorators())); | |||
load_store = kit->gvn().transform(new ShenandoahLoadReferenceBarrierNode(NULL, load_store, access.decorators() & ~ON_UNKNOWN_OOP_REF)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment:
// Convert ON_UNKNOWN_OOP_REF to strong, so that we don't return NULL for weak values.
// See ShenandoahBarrierSet::See oop_atomic_cmpxchg_not_in_heap for explanation.
@@ -267,7 +267,7 @@ inline oop ShenandoahBarrierSet::AccessBarrier<decorators, BarrierSetT>::oop_ato | |||
|
|||
// Note: We don't need a keep-alive-barrier here. We already enqueue any loaded reference for SATB anyway, | |||
// because it must be the previous value. | |||
res = ShenandoahBarrierSet::barrier_set()->load_reference_barrier<decorators, T>(res, NULL); | |||
res = ShenandoahBarrierSet::barrier_set()->load_reference_barrier<decorators & ~ON_UNKNOWN_OOP_REF, T>(res, NULL); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment:
// Note: We don't need a keep-alive-barrier here. We already enqueue any loaded reference for SATB anyway,
// because it must be the previous value. Convert ON_UNKNOWN_OOP_REF to strong, so that we don't return
// NULL for weak values, breaking CAS contract. The downside is this method effectively resurrecting
// weak values without GC knowing about it. If Unsafe/JNI users are using this method on weak values
// (i.e. doing unsafe accesses to Reference.referent), they are on their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also believe "xchg" needs the same fix and a comment:
// Note: We don't need a keep-alive-barrier here. We already enqueue any loaded reference for SATB anyway,
// because it must be the previous value. Convert ON_UNKNOWN_OOP_REF to strong, so that we don't return
// NULL for weak values. ShenandoahBarrierSet::See oop_atomic_cmpxchg_not_in_heap for explanation.
@@ -640,7 +640,7 @@ Node* ShenandoahBarrierSetC2::atomic_cmpxchg_val_at_resolved(C2AtomicParseAccess | |||
load_store = kit->gvn().transform(new DecodeNNode(load_store, load_store->get_ptr_type())); | |||
} | |||
#endif | |||
load_store = kit->gvn().transform(new ShenandoahLoadReferenceBarrierNode(NULL, load_store, access.decorators())); | |||
load_store = kit->gvn().transform(new ShenandoahLoadReferenceBarrierNode(NULL, load_store, access.decorators() & ~ON_UNKNOWN_OOP_REF)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This load should go through slow path when GC is in progress, right? Should let slow path to resolve the strength? sounds there is possibility to resurrect dead oop.
Fixing this in jdk18 first, should backport to jdk17 later, after some testing in 18. See: openjdk/jdk#4697 |
We've observed test failures in jcstress, see:
https://bugs.openjdk.java.net/browse/JDK-8269897
We used to treat UNKNOWN reference accesses like weak accesses. UNKNOWN is used for Unsafe, reflection and JNI accesses, where it cannot be determined at compilation-time if we are accessing a regular field or a Reflection.referent field. The rationale for treating UNKNOWN as weak was that if the reference is a regular reference, then the value would be strongly reachable anyway, and only if it is a referent field would reachability matter. However, it turns out that this assumption is wrong: the test shows that a reference that is only weakly reachable can be legitimately written into a field, thus resurrecting the reference, and when that weakly reachable reference is loaded, it would be (wrongly) filtered as NULL.
A fix is to treat UNKNOWN accesses as strong. Accessing Reference.referent via reflection, JNI or Unsafe is Bad Idea anyway.
This test shows the problem with CAS, but I believe it affects all accesses via reflection, JNI, etc.
Testing:
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk17 pull/219/head:pull/219
$ git checkout pull/219
Update a local copy of the PR:
$ git checkout pull/219
$ git pull https://git.openjdk.java.net/jdk17 pull/219/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 219
View PR using the GUI difftool:
$ git pr show -t 219
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk17/pull/219.diff