Conversation

@bchristi-git
Member

@bchristi-git bchristi-git commented Sep 6, 2024

From the bug description:
ForceGC would be improved by moving the Reference.reachabilityFence() calls for 'obj' and 'ref'.

Reference.reachabilityFence(obj) is currently placed after 'obj' has been set to null, so effectively does nothing. It should occur before obj = null;

For Reference.reachabilityFence(ref): 'ref' is a PhantomReference to 'obj', and is registered with 'queue'. ForceGC.waitFor() later remove()s the reference from the queue, as an indication that some GC and reference processing has taken place (hopefully causing the BooleanSupplier to return true).

The code expects the PhantomReference to be cleared and be put on the queue. But recall that a Reference refers to its queue, and not the other way around. If a Reference becomes unreachable and is garbage collected, it will never be enqueued.

I argue that the VM/GC could determine that 'ref' is not used by waitFor() and collect it before the call to queue.remove(). Moving Reference.reachabilityFence(ref) after the for() loop would prevent this scenario.

While this is only a very minor deficiency in ForceGC, I believe it would be good to ensure that the code behaves as expected.
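To make the proposed ordering concrete, here is a minimal sketch of how waitFor() might look with the fences rearranged as described above. The class name ForceGCSketch, the parameter names, and the exact retry computation are assumptions for illustration; this is not the actual ForceGC source.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for jdk.test.lib.util.ForceGC (illustration only).
final class ForceGCSketch {

    static boolean waitFor(BooleanSupplier condition, long timeoutMillis) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(obj, queue);

        // Fence the referent while 'obj' still points at it; a fence placed
        // after 'obj = null' would only be fencing null.
        Reference.reachabilityFence(obj);
        obj = null;

        // Assumed retry scheme: one attempt per 200 ms of the requested timeout.
        for (int retries = (int) (timeoutMillis / 200); retries >= 0; retries--) {
            if (condition.getAsBoolean()) {
                return true;
            }
            System.gc();
            try {
                // Wait up to 200 ms for 'ref' to be enqueued.
                queue.remove(200L);
            } catch (InterruptedException ie) {
                // ignore, the loop will try again
            }
        }

        // 'ref' must stay reachable until here; a Reference that becomes
        // unreachable can be collected without ever being enqueued.
        Reference.reachabilityFence(ref);
        return condition.getAsBoolean();
    }
}
```

Note that the early return inside the loop skips the final fence; that is harmless in this sketch because the queue no longer matters once the condition holds.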


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8339687: Rearrange reachabilityFence()s in jdk.test.lib.util.ForceGC (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20898/head:pull/20898
$ git checkout pull/20898

Update a local copy of the PR:
$ git checkout pull/20898
$ git pull https://git.openjdk.org/jdk.git pull/20898/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20898

View PR using the GUI difftool:
$ git pr show -t 20898

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20898.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Sep 6, 2024

👋 Welcome back bchristi! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Sep 6, 2024

@bchristi-git This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8339687: Rearrange reachabilityFence()s in jdk.test.lib.util.ForceGC

Reviewed-by: dholmes, smarks, kbarrett

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 39 new commits pushed to the master branch:

  • a6faf82: 8339714: Delete tedious bool type define
  • 0764323: 8225049: Bad -Xlog example in -Xlog:help, online documentation, JEP
  • 9785e19: 8339638: Update vmTestbase/nsk/jvmti/FieldWatch tests to use virtual thread factory
  • 6fd043f: 8339789: Use index and definition tags in AnnotatedElement
  • 30645f3: 8338395: Add test coverage for instantiating NativePRNG with SecureRandomParameters
  • c8e64cb: 8283779: Clarify API documentation of NetworkInterface with respect to configuration changes
  • 9243104: 8335444: Generalize implementation of AndNode mul_ring
  • 3352522: 8338894: Deprecate jhsdb debugd for removal
  • be0dca0: 8339698: x86 unused andw/orw/xorw/addw encoding could be removed
  • 64a79d8: 8335625: Update Javadoc for GetCpuLoad
  • ... and 29 more: https://git.openjdk.org/jdk/compare/fbe2629303bcee5855673b7e37d8c49f19dc9849...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 6, 2024
@openjdk

openjdk bot commented Sep 6, 2024

@bchristi-git The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Sep 6, 2024
@mlbridge

mlbridge bot commented Sep 6, 2024

Webrevs

-obj = null;
 Reference.reachabilityFence(obj);
-Reference.reachabilityFence(ref);
+obj = null;
Member

You're right to question the utility of calling reachabilityFence(obj) after obj has been nulled out. But I'm still questioning the utility of calling RF(obj) at all. We don't care when obj is determined to be unreachable; what we care about is that the GC has done some reference processing. Seems to me we can simplify the above lines to

PhantomReference<Object> ref = new PhantomReference<>(new Object(), queue);

and get rid of the local variable obj entirely.
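For illustration, the suggested simplification might be sketched as follows. InlineReferentSketch and awaitReferenceProcessing are hypothetical names, and the single System.gc() call followed by a timed remove() is an assumption standing in for the real retry loop.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical helper (illustration only): no local variable for the referent.
final class InlineReferentSketch {

    // Returns true if a reference was dequeued within waitMillis.
    // waitMillis must be positive; remove(0) would block indefinitely.
    static boolean awaitReferenceProcessing(long waitMillis) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        // The referent is created inline, so nothing else ever refers to it.
        PhantomReference<Object> ref = new PhantomReference<>(new Object(), queue);

        System.gc();
        Reference<?> dequeued = null;
        try {
            dequeued = queue.remove(waitMillis);
        } catch (InterruptedException ie) {
            // treat an interrupt like a timeout in this sketch
        }

        // Keep 'ref' reachable until after the queue has been checked.
        Reference.reachabilityFence(ref);
        return dequeued != null;
    }
}
```

Whether the reference is actually enqueued within the wait depends on the collector and its configuration, so callers should treat a false result as "not observed yet" rather than as a failure.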

Member

The reason for the explicit reference and RF, as I recall, is to guard against the allocation of the new object being elided entirely, with the PhantomReference constructor being passed null (or itself being elided) and no reference processing ever actually happening.

Member

What David says ;-)

Member

@dholmes-ora Is this really possible? The obj ref is passed to the PhantomReference constructor, which stores it in a field, the constructed PhantomReference is returned, and it's then used in a reachabilityFence call below. So obj should remain reachable the entire time, right?

Member

@stuart-marks stuart-marks Sep 9, 2024

(As an aside, I wasn't able to determine what any of the Reference classes do if they're created with a null reference. Possibly a spec bug?)

Member

@stuart-marks My recollection, which I can't confirm, is that this pattern was discussed internally and there was a lot of uncertainty about what was actually needed. Interestingly, there was zero discussion of this in the actual PR that added it: #8979

Thinking it through now, I tend to agree with you that the RF for ref suffices to prevent obj from being elided.

I also agree with @stuart-marks that the RF for ref, moved to near the end, is sufficient.

// ignore, the loop will try again
}
}
Reference.reachabilityFence(ref);
Member

I think everything from the creation of ref to the line above needs to be enclosed in a try-statement, with the finally-clause including RF(ref).

Member

@dfuch dfuch Sep 9, 2024

Arguably the same might also apply to the other call to reachability fence: that is - we might need two try-finally to keep things by-the-book?

I don't see a requirement for any try-finally's here, since I don't care about the queue/ref/referent/enqueuing
for any exits other than running to the end of the function. Adding some might make the code less fragile
against future changes, but adds clutter that might never provide any benefit.

@stuart-marks
Member

I added a couple specific comments on the code that I thought ought to be addressed in this PR.

There is a broader issue with the timeout logic that we should be aware of, though we might or might not choose to address it in this PR. The main issue is that the caller has requested a particular amount of time as the timeout, and the timeout loop divides by 200ms to determine the maximum number of retries. This assumes that each loop iteration will take 200ms. However, this might not be true, because we don't know how long the booleanSupplier takes, we don't know how long System.gc() takes, and we don't know how long queue.remove() takes. This isn't an idle concern. Somebody might pass in a booleanSupplier that itself has a timeout (say of 1 second), which will cause this method to take about six times longer than expected to time out.

The usual approach for timeout logic is to take the initial System.nanoTime() value and compare subsequent return values of nanoTime() to the timeout duration, and exit the loop if the timeout duration has been exceeded. See the nanoTime() javadoc for an example.
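As a rough sketch of the deadline-based approach described above: the class and method names below (DeadlineLoopSketch, pollUntil) and the pauseMillis parameter are hypothetical, and the sleep stands in for whatever work or waiting a real loop would do, such as System.gc() followed by queue.remove().

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper (illustration only): a polling loop bounded by elapsed
// time rather than by a precomputed retry count.
final class DeadlineLoopSketch {

    static boolean pollUntil(BooleanSupplier condition, long timeoutMillis, long pauseMillis) {
        final long start = System.nanoTime();
        final long timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        final long pauseNanos = TimeUnit.MILLISECONDS.toNanos(pauseMillis);

        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Compare elapsed time, not absolute nanoTime() values, so the
            // check stays correct even if nanoTime() wraps around.
            long remaining = timeoutNanos - (System.nanoTime() - start);
            if (remaining <= 0) {
                return false;
            }
            try {
                // Never pause past the deadline.
                TimeUnit.NANOSECONDS.sleep(Math.min(remaining, pauseNanos));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
    }
}
```

With this shape, a slow condition reduces the number of iterations instead of stretching the total wait, although a single in-flight call to the condition can still overrun the deadline.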

@kimbarrett kimbarrett left a comment

The reachability changes look good to me.

The sketchy timeout handling mentioned by @stuart-marks seems to me to be out
of scope for this issue.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 10, 2024
@dholmes-ora
Member

My understanding is that try/finally is needed to ensure the RF is guaranteed to be seen to be executed at the expected location. Otherwise the RF can in theory be moved around.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Sep 10, 2024
@bchristi-git
Member Author

I've added a try/finally block for ref. I've also kept the obj = null assignment.

This test library is used in many places, to confirm behavior reliant on GC action. IMO it's best to code it in a way that's sure to behave as expected, and to keep it clear what it's meant to do.
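A minimal sketch of the try/finally shape discussed in this thread, under stated assumptions: FencedForceGCSketch is a hypothetical name, the retry computation is assumed from the review comments, and the sketch omits a fence on obj, following the reviewers' view that the fence on ref suffices. It is not the integrated source.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the updated method (illustration only).
final class FencedForceGCSketch {

    static boolean waitFor(BooleanSupplier condition, long timeoutMillis) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object obj = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(obj, queue);
        try {
            // Drop the only strong reference so the referent can be collected.
            obj = null;

            // Assumed retry scheme: one attempt per 200 ms of the timeout.
            for (int retries = (int) (timeoutMillis / 200); retries >= 0; retries--) {
                if (condition.getAsBoolean()) {
                    return true;
                }
                System.gc();
                try {
                    queue.remove(200L);
                } catch (InterruptedException ie) {
                    // ignore, the loop will try again
                }
            }
            return condition.getAsBoolean();
        } finally {
            // Runs on every exit path, so 'ref' stays reachable for the
            // whole time the queue is in use.
            Reference.reachabilityFence(ref);
        }
    }
}
```

Placing the fence in the finally clause covers the early return and any exception thrown by the condition, at the cost of one extra level of indentation.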

@bchristi-git
Member Author

> The sketchy timeout handling mentioned by @stuart-marks seems to me to be out of scope for this issue.

I also would prefer that updates to timeout handling be done as a separate issue.

Member

@dholmes-ora dholmes-ora left a comment

If you hide whitespace this change becomes trivial to review :)

LGTM.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 11, 2024
Member

@stuart-marks stuart-marks left a comment

Yeah, timeout stuff can be handled separately.

@bchristi-git
Member Author

/integrate

@openjdk

openjdk bot commented Sep 11, 2024

Going to push as commit 51b85a1.
Since your change was applied there have been 46 commits pushed to the master branch:

  • d9fdf69: 8333446: Add tests for hierarchical container support
  • bfe7f92: 8339741: RISC-V: C ABI breakage for integer on stack
  • 55a7cf1: 8322420: [Linux] cgroup v2: Limits in parent nested control groups are not detected
  • 5977888: 8339686: java/foreign/TestMappedHandshake.java fails with assert(depth < max_critical_stack_depth) failed: can't have more than 10 critical frames
  • 0b3f2e6: 8339242: Fix overflow issues in AdlArena
  • ceef161: 8339661: ZGC: Move some page resets and verification to callsites
  • 8fce527: 8339810: Clean up the code in sun.tools.jar.Main to properly close resources and use ZipFile during extract
  • a6faf82: 8339714: Delete tedious bool type define
  • 0764323: 8225049: Bad -Xlog example in -Xlog:help, online documentation, JEP
  • 9785e19: 8339638: Update vmTestbase/nsk/jvmti/FieldWatch tests to use virtual thread factory
  • ... and 36 more: https://git.openjdk.org/jdk/compare/fbe2629303bcee5855673b7e37d8c49f19dc9849...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Sep 11, 2024
@openjdk openjdk bot closed this Sep 11, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Sep 11, 2024
@openjdk

openjdk bot commented Sep 11, 2024

@bchristi-git Pushed as commit 51b85a1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels

core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated

5 participants