Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8287596: Reorg jdk.test.lib.util.ForceGC #8979

Closed
wants to merge 15 commits into from

Conversation

XueleiFan
Copy link
Member

@XueleiFan XueleiFan commented Jun 1, 2022

This is a follow up update per comments in JDK-8287384 PR. The tier1 and tier2 test in open part looks good to me. Please help to run Mach5 just case the closed test cases are impacted.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/8979/head:pull/8979
$ git checkout pull/8979

Update a local copy of the PR:
$ git checkout pull/8979
$ git pull https://git.openjdk.org/jdk pull/8979/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8979

View PR using the GUI difftool:
$ git pr show -t 8979

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/8979.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 1, 2022

👋 Welcome back xuelei! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@XueleiFan XueleiFan marked this pull request as ready for review June 1, 2022 19:09
@openjdk
Copy link

openjdk bot commented Jun 1, 2022

@XueleiFan The following labels will be automatically applied to this pull request:

  • core-libs
  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added security security-dev@openjdk.org core-libs core-libs-dev@openjdk.org rfr Pull request is ready for review labels Jun 1, 2022
@mlbridge
Copy link

mlbridge bot commented Jun 1, 2022

@mlchung
Copy link
Member

mlchung commented Jun 1, 2022

JDK-8287384 causes test/jdk/java/lang/invoke/defineHiddenClass/UnloadingTest.java to timeout when running with fastdebug VM. I think this might be caused by more frequent GCs.

I tried your patch and the test fails.

@XueleiFan
Copy link
Member Author

JDK-8287384 causes test/jdk/java/lang/invoke/defineHiddenClass/UnloadingTest.java to timeout when running with fastdebug VM. I think this might be caused by more frequent GCs.

I tried your patch and the test fails.

I updated the waiting time back to 1 second. Would you please check again?

1 similar comment
@XueleiFan
Copy link
Member Author

JDK-8287384 causes test/jdk/java/lang/invoke/defineHiddenClass/UnloadingTest.java to timeout when running with fastdebug VM. I think this might be caused by more frequent GCs.

I tried your patch and the test fails.

I updated the waiting time back to 1 second. Would you please check again?

@mlchung
Copy link
Member

mlchung commented Jun 1, 2022

No, it doesn't work. You can build a fastdebug build with configure --enable-debug. I reproduce it on macOS. If I restore to the previous version without 8287384, the test passes.

@XueleiFan
Copy link
Member Author

XueleiFan commented Jun 2, 2022

My macOS (M1) and Linux (CentOS) may be too powerful to reproduce the failure. I tried to set the JTREG time out to 60 seconds, and the timeout issue could be reproduced. In test/jdk/java/lang/invoke/defineHiddenClass/UnloadingTest.java, there are 6 calls to the method assertFalse(unloader.tryUnload()). Each call to tryUnload() takes at least 10 seconds. So the 6 calls takes 60 seconds at least. If I set the regression timeout value to 70 seconds on Linux, the test still can pass. It implies the rest part other than tryUnload() is pretty fast. On MacOS, the elapsed time of the test is 93.425 seconds.

It looks like a regression introduced with the update for JDK-8287384. I did not fine the cause yet. I will have more checking tomorrow.

@XueleiFan XueleiFan closed this Jun 2, 2022
@XueleiFan XueleiFan reopened this Jun 3, 2022
@XueleiFan XueleiFan closed this Jun 3, 2022
@XueleiFan
Copy link
Member Author

XueleiFan commented Jun 6, 2022

Inspired by @mlchung (See #9021), use less waiting time for UnloadingTest and re-open this PR.

@XueleiFan XueleiFan reopened this Jun 6, 2022
@openjdk
Copy link

openjdk bot commented Jun 6, 2022

@XueleiFan this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8287596
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jun 6, 2022
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Jun 6, 2022
* if did not complete after 10 Seconds
*/
public static boolean wait(BooleanSupplier booleanSupplier) {
return wait(booleanSupplier, 10000L);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the max waiting time, instead of a fixed value I strongly recommend choosing some base waiting time, say 1 second, and scaling it by the jtreg timeout factor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also have similar feedback. The timeout factor can be found in the test.timeout.factor system property, if present. When running tests with -Xcomp in a fastdebug build, the timeout factor is currently configured to 10. Setting the base waiting time to 1 second seems reasonable.

Copy link
Member Author

@XueleiFan XueleiFan Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I changed to use 1 second as the base waiting time. I was not very sure if 1 second is big enough for the GC cleanup, but the testing result on my laptop and VMs, which are super fast, looks good to me. Please help to check mach5 result.

throw new RuntimeException("loader " + " not unloaded!");
}
}

boolean tryUnload() {
ForceGC gc = new ForceGC();
return gc.await(() -> weakRef.get() == null);
return ForceGC.wait(() -> weakRef.get() == null, 1000L);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This verifies that the reference will never become weakly unreachable, i.e. it remains strongly reachable.

I suggest to add a method default to 200ms (?) timeout scaled with the timeout factor and expect the specified condition will never met. Maybe call it ForceGC.neverTrue(BooleanSupplier)? Need to tune the base waiting time; for example if timeout factor is 4, then total wait would be 800 ms.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your points.

When I changed the waiting time to 1 second scaled with timeout factor, it looks like safe to use the ForceGC.wait() method for either cleanup and non-cleanup checking. The default JTREG timeout for a test case is 120 seconds. 1 seconds may be small enough so that we could use it for both cases.

@openjdk openjdk bot removed the rfr Pull request is ready for review label Jun 15, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 16, 2022
@@ -61,8 +61,7 @@ public void testClassLoaderLeak() throws Exception {
con = null;
assertNotNull(myOwnClassLoaderWeakReference.get());

ForceGC gc = new ForceGC();
assertTrue(gc.await(() -> myOwnClassLoaderWeakReference.get() == null));
assertTrue(ForceGC.wait(() -> myOwnClassLoaderWeakReference.get() == null));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to ref.get() can create a strong reference to the object; depending on the timing it might interfere with the GC in process.
ref.refersTo(null) is preferred over ref.get() == null

Here and all subsequent changes.

Copy link
Member Author

@XueleiFan XueleiFan Jun 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I learned a lot these days from PRs comments! Thank you!

The patch is changed to use refersTo().

@XueleiFan
Copy link
Member Author

ping ...

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay.

}
// The jtreg testing timeout factor.
private static final double TIMEOUT_FACTOR = Double.valueOf(
System.getProperty("test.timeout.factor", "1.0"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't mind the dependency, you can use jdk.test.lib.Utils.TIMEOUT_FACTOR
or Utils.adjustTimeout(xx) in the usage below.

* unless the thread is interrupted or a predefined waiting time elapses.
* Causes the current thread to wait until the {@code booleanSupplier}
* returns true, or a specific waiting time elapses. The waiting time
* is 1 second scalled with the jtreg testing timeout factor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "scalled" -> "scaled"

Reference.reachabilityFence(obj);
Reference.reachabilityFence(ref);

for (int retries = (int)(timeout / 200); retries >= 0; retries--) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic around the timeout might be clearer if it was only based on the number of retries,
and can be scaled by the TIMEOUT_FACTOR too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just curious if the factor could be set to some unusual values like "1.25". Scaling on timeout, and then calculating the retires could be more accuracy for such circumstances, although it may be not necessary. I moved the retries calculation close to the for-loop. Hope it is better for readers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, Utils.adjustTimeout(long x) uses floating point and then converts back.

// if the reference has already been removed from the queue.
// But it is fine. For most cases, the 1st GC is sufficient
// to trigger and complete the cleanup.
queue.remove(200L);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If remove() returns a non-null value, then it is safe to break out of the loop.
The GC has cleared the ref. And the final getAsBoolean() below will return the condition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if all unused object will be collected in one GC call always. The gc() specification says "When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all unused objects. ... There is also no guarantee that this effort will determine the change of reachability in any particular number of objects, or that any particular number of {@link java.lang.ref.Reference Reference} objects will be cleared and enqueued." But from the spec, I did not get a clear answer for the question.

If the booleanSupplier object could be cleared in a collection other than the ref collection, the current code may be safer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, knowing when GC is 'done' is not deterministic except for a specify Reference to a specific object.
System.gc is just a request, the checking for an object can more quickly exit the loop.
The code is as is, and already commented, might wait an extra cycle, but only in the case of a race between the requested predicate and the object being reclaimed. Ok as it.

Copy link
Member

@dfuch dfuch Jul 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not for this PR - but it could be useful to have a version of ForceGC that takes as parameter a ReferenceQueue<T> and a Reference<T>. That would be more efficient than creating a new object and waiting for a different cleaner thread to cleanup that object.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dfuch Taking a reference as parameter could simplify the use of ForceGC. I though about this idea as well, when I had to check lambada expressions in each call. I would like to do it in the future so that this PR could focus on less things.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @XueleiFan, that's fine by me!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When using Reference/ReferenceQueue there is no Cleaner, only normal reference processing (via GC).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have tests that use Reference/ReferenceQueue to check that an object that uses a Cleaner has been garbage collected. I don't use ForceGC in those tests because it's much more efficient to wait on my ReferenceQueue rather than having ForceGC wait on something else.

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openjdk
Copy link

openjdk bot commented Jun 30, 2022

@XueleiFan This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8287596: Reorg jdk.test.lib.util.ForceGC

Reviewed-by: rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 30, 2022
@XueleiFan
Copy link
Member Author

Could someone in Oracle help me run Mach 5 testing?

@AlanBateman
Copy link
Contributor

Does this address JDK-8288286 and allow ReflectionCallerCacheTest.java to be removed from ProblemList-Xcomp.txt?

@RogerRiggs
Copy link
Contributor

Could someone in Oracle help me run Mach 5 testing?
The CI Passed for Tiers 1-3.

@XueleiFan
Copy link
Member Author

XueleiFan commented Jul 1, 2022

Could someone in Oracle help me run Mach 5 testing?

The CI Passed for Tiers 1-3.

Thanks a lot!

@XueleiFan
Copy link
Member Author

Does this address JDK-8288286 and allow ReflectionCallerCacheTest.java to be removed from ProblemList-Xcomp.txt?

I think JDK-8288286 should be addressed, but I would like to have it further evaluated via more Mach5 testing before remove the case from problem list.

@XueleiFan
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented Jul 6, 2022

Going to push as commit 82a8bd7.
Since your change was applied there have been 48 commits pushed to the master branch:

  • cc2b792: 8288992: AArch64: CMN should be handled the same way as CMP
  • 75c0a5b: 8288824: [arm32] Display isetstate in register output
  • 83a5d59: 8278479: RunThese test failure with +UseHeavyMonitors and +VerifyHeavyMonitors
  • cbaf6e8: 8288022: c2: Transform (CastLL (AddL into (AddL (CastLL when possible
  • 8341895: 8289739: Add G1 specific GC breakpoints for testing
  • ac6be16: 8289695: [TESTBUG] TestMemoryAwareness.java fails on cgroups v2 and crun
  • 4ad18cf: 8289730: Deprecated code sample in java.lang.ClassCastException
  • d8f4e97: 8289146: containers/docker/TestMemoryWithCgroupV1.java fails on linux ppc64le machine with missing Memory and Swap Limit output
  • f783244: 8289706: (cs) Avoid redundant TreeMap.containsKey call in AbstractCharsetProvider
  • fafe8b3: 8289604: compiler/vectorapi/VectorLogicalOpIdentityTest.java failed on x86 AVX1 system
  • ... and 38 more: https://git.openjdk.org/jdk/compare/d260a4e794681c6f4be4767350702754cfc2035c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 6, 2022
@openjdk openjdk bot closed this Jul 6, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 6, 2022
@openjdk
Copy link

openjdk bot commented Jul 6, 2022

@XueleiFan Pushed as commit 82a8bd7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@XueleiFan XueleiFan deleted the JDK-8287596 branch July 6, 2022 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated security security-dev@openjdk.org
6 participants