Skip to content

Conversation

@jsikstro
Copy link
Member

@jsikstro jsikstro commented Dec 3, 2025

Hello,

If the initial heap size is set too low in serviceability/sa/ClhsdbScanOops.java, a GC migh run, which will interfere with the test and might cause it to fail.

The test is scanning the oops in a region of the heap, and after a GC that region appears to be empty, so the output that the test expects is not present. Running the test with a larger explicit InitialHeapSize gives enough headroom to not run a GC.

Testing:

  • serviceability/sa/ClhsdbScanOops.java originally failed when run with -XX:InitialRAMPercentage=0 (which is the new default). We now explicitly set -XX:InitialHeapSize=100M. I've rerun the test 10 times with Serial and Parallel for each test and they all pass.

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8373022: serviceability/sa/ClhsdbScanOops.java assumes no GC should occur (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28637/head:pull/28637
$ git checkout pull/28637

Update a local copy of the PR:
$ git checkout pull/28637
$ git pull https://git.openjdk.org/jdk.git pull/28637/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28637

View PR using the GUI difftool:
$ git pr show -t 28637

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28637.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 3, 2025

👋 Welcome back jsikstro! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Dec 3, 2025

@jsikstro This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8373022: serviceability/sa/ClhsdbScanOops.java assumes no GC should occur

Reviewed-by: cjplummer, stefank, ayang, tschatzl

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Dec 3, 2025
@openjdk
Copy link

openjdk bot commented Dec 3, 2025

@jsikstro The following label will be automatically applied to this pull request:

  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 3, 2025
@mlbridge
Copy link

mlbridge bot commented Dec 3, 2025

Webrevs

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 3, 2025
@plummercj
Copy link
Contributor

Most of the SA tests probably have this requirement. What is it that made these two tests be noticed? Why was a small initial heap size being used?

@jsikstro
Copy link
Member Author

jsikstro commented Dec 3, 2025

Most of the SA tests probably have this requirement. What is it that made these two tests be noticed? Why was a small initial heap size being used?

We've seen that these tests fail when changing the value of InitialRAMPercentage to 0, which results in a very small initial heap size. I'm about to add a few more tests which also needs to be run with a "non-small" initial heap size.

@plummercj
Copy link
Contributor

It's probably just the timing of the GC that determines whether the initial small heap is a problem or not. If you want the SA tests to be reliable with something like InitialRAMPercentage=0, probably all of the tests should be updated. However, personally I don't think this type of fix should be necessary unless you feel testing in the manner is something we want to support. There are plenty of tests that start failing when non-standard command line options are used.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Dec 4, 2025
* @run build TestScaffold VMConnection TargetListener TargetAdapter
* @run compile -g MethodInvokeWithTraceOnTest.java
* @run driver MethodInvokeWithTraceOnTest
* @run driver MethodInvokeWithTraceOnTest -XX:InitialHeapSize=100M
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you seeing failures due to an ObjectCollectionException? If so, avoiding GC is not the proper way to fix it. Even with a larger initial heap, there can still be an object collected with ZGC. We shouldn't have any debugger tests that rely on a GC not happening in the debugee. The proper fix is usually calling ObjectReference.disableCollection(), although sometimes even that is not enough (the call can happen too late if the debugee is not suspended).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we're seeing a failure due to ObjectCollectionException. This is because the heap size is now much smaller, causing more frequent GCs. "Reverting" to a larger initial heap size is a straightforward fix so that we can continue running this test.

I think a more "robust" approach is better, which we could do in a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tell me how to run this test with a small GC that reproduces the ObjectCollectionException?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've only seen this on windows-x64-debug so far, it should fail on mainline right now, given it is removed from the ProblemList it's in. I haven't updated this PR to remove the ProblemList just yet, but will do.

If you want to be explicit you can run it with -XX:InitialRAMPercentage=0, but this is the new default in mainline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a JDI bug, not a test bug. I'll fix this with #28666.

@jsikstro
Copy link
Member Author

jsikstro commented Dec 4, 2025

It's probably just the timing of the GC that determines whether the initial small heap is a problem or not. If you want the SA tests to be reliable with something like InitialRAMPercentage=0, probably all of the tests should be updated. However, personally I don't think this type of fix should be necessary unless you feel testing in the manner is something we want to support. There are plenty of tests that start failing when non-standard command line options are used.

FYI we just integrated a change that sets InitialRAMPercentage=0 for JDK 26 that we've been working on (see #28641). We've run up to Oracle's tier8 twice now, and apart from the tests that are included in this PR, we've not seen any other SA failures.

Of course there might be other intermittent failures in the future, in which case I see two approaches moving forward: problem listing or bumping the initial heap size for the affected tests, or going over all SA tests and making sure that they all run with a "large" initial heap size (like 100MB). Unless we start seing many (for some definition of many) test failures from now, a pragmatic compromise is to selectively bump the initial heap size of such tests, like I do in this PR.

Of course, the optimal approach would be to make any affected SA tests more robst to GC timings. But, since I'm not sure how much time we want to invest in improving SA tests, bumping the heap size is likely a good compromise here.

@plummercj
Copy link
Contributor

It's probably just the timing of the GC that determines whether the initial small heap is a problem or not. If you want the SA tests to be reliable with something like InitialRAMPercentage=0, probably all of the tests should be updated. However, personally I don't think this type of fix should be necessary unless you feel testing in the manner is something we want to support. There are plenty of tests that start failing when non-standard command line options are used.

FYI we just integrated a change that sets InitialRAMPercentage=0 for JDK 26 that we've been working on (see #28641). We've run up to Oracle's tier8 twice now, and apart from the tests that are included in this PR, we've not seen any other SA failures.

Of course there might be other intermittent failures in the future, in which case I see two approaches moving forward: problem listing or bumping the initial heap size for the affected tests, or going over all SA tests and making sure that they all run with a "large" initial heap size (like 100MB). Unless we start seing many (for some definition of many) test failures from now, a pragmatic compromise is to selectively bump the initial heap size of such tests, like I do in this PR.

Of course, the optimal approach would be to make any affected SA tests more robst to GC timings. But, since I'm not sure how much time we want to invest in improving SA tests, bumping the heap size is likely a good compromise here.

For the most part the SA tests are fine if there is a GC. The way they usually work is to launch the debuggee, wait for it to reach a stable point (all threads idle), and then start to query the debuggee. If a GC happens before reaching stability, that should be fine, and after stability we wouldn't expect any GCs no matter what the heap size is. There are some SA tests that run on active processes where GCs can happen, but they are written to allow for errors.

// This test assumes that no GC should happen during its execution.
// Setting the initial heap size to a reasonably high number avoids
// running a GC.
theApp = LingeredApp.startApp(gc, "-XX:InitialHeapSize=100M");
Copy link
Contributor

@plummercj plummercj Dec 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test tries to scan a section of the java heap. When a GC happens, this section becomes empty, so the expected output is not present. Your fix seems reasonable.

@jsikstro jsikstro changed the title 8373022: Increase initial heap size for test which assumes no GC 8373022: serviceability/sa/ClhsdbScanOops.java assumes no GC should occur Dec 5, 2025
@jsikstro
Copy link
Member Author

jsikstro commented Dec 5, 2025

Thank you @plummercj for analysing the tests. I've removed the explicit InitialHeapSize from test/jdk/com/sun/jdi/MethodInvokeWithTraceOnTest.java in favor of #28666. I've also removed test/hotspot/jtreg/serviceability/sa/ClhsdbScanOops.java from the ProblemList.

The two sets of tests that are associated with this issue but not addressed in this PR are about to be solved in #28666 and #28655. I'm holding off on this PR until those changes are integrated.

I've updated both the issue and this PR to reflect the new changes.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 5, 2025
@jsikstro
Copy link
Member Author

jsikstro commented Dec 5, 2025

After merging with master and considering the change in this PR, no more tests referencing this issue are problem listed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready Pull request is ready to be integrated rfr Pull request is ready for review serviceability serviceability-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

5 participants