8346834: Tests failing with -XX:+UseNUMA due to "NUMA support disabled" warning #22948

Closed
wants to merge 3 commits into from

Conversation

swati-sha
Contributor

@swati-sha swati-sha commented Jan 7, 2025

Hi All,

A number of tests launch VMs and read the output of the sub-process. Since the changes in JDK-8205051, the warning message "NUMA support disabled: Only a single NUMA node is available" is printed when the tests run with -XX:+UseNUMA on systems that have only one node, and this breaks several tests. After updates to some tests, the remaining failures so far are with:

java/util/logging/LoggingDeadlock2.java
tools/jar/modularJar/Basic.java

As a fix, I have changed the logging level from "log_warning" to "log_info" for the message printed when UseNUMA is disabled.

Thanks,
Swati Sharma
Intel


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8346834: Tests failing with -XX:+UseNUMA due to "NUMA support disabled" warning (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22948/head:pull/22948
$ git checkout pull/22948

Update a local copy of the PR:
$ git checkout pull/22948
$ git pull https://git.openjdk.org/jdk.git pull/22948/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22948

View PR using the GUI difftool:
$ git pr show -t 22948

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22948.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Jan 7, 2025

👋 Welcome back swati-sha! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Jan 7, 2025

@swati-sha This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8346834: Tests failing with -XX:+UseNUMA due to "NUMA support disabled" warning

Reviewed-by: dholmes, sjohanss

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 154 new commits pushed to the master branch:

  • be1cdd9: 8344140: Refactor the discovery of AOT cache artifacts
  • 973c630: 8342466: Improve API documentation for java.lang.classfile.attribute
  • 9782bfd: 8347620: Shenandoah: Use 'free' tag for free set related logging
  • 35be4a4: 8347173: java/net/DatagramSocket/InterruptibleDatagramSocket.java fails with virtual thread factory
  • 36b7abd: 8225763: Inflater and Deflater should implement AutoCloseable
  • d6d45c6: 8303884: jlink --add-options plugin does not allow GNU style options to be provided
  • 0ee6ba9: 8347596: Update HSS/LMS public key encoding
  • ec2aaaa: 8326236: assert(ce != nullptr) failed in Continuation::continuation_bottom_sender
  • 02d2493: 8347613: Remove leftover doPrivileged call in Currency test: CheckDataVersion.java
  • 10d08db: 8346142: [perf] scalability issue for the specjvm2008::xml.validation workload
  • ... and 144 more: https://git.openjdk.org/jdk/compare/de0250368edbf4e9bebf326778f8f8773b69b84c...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dholmes-ora, @kstefanj) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 7, 2025
@openjdk

openjdk bot commented Jan 7, 2025

@swati-sha The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jan 7, 2025
@mlbridge

mlbridge bot commented Jan 7, 2025

Webrevs

@kstefanj
Contributor

kstefanj commented Jan 7, 2025

I think this is a bit unfortunate. I saw the comment in the bug around the wording for the flag UseNUMA:

Use NUMA if available

I don't fully agree that it's wrong to issue a warning (just because it says "if available"); it would be wrong to issue an error and terminate the process. I see the warning as a way to inform the user that the performance feature they configured the process to use couldn't be used. If this is instead communicated as a log statement on info-level, almost nobody will see it.

Especially for the newly added case, where we disable NUMA when the cpu and memory nodes mismatch, I think the warning could be helpful to users.

Looking at tools/jar/modularJar/Basic.java I see that it already has code to handle "VM warnings", so adding support for "UL warnings" would likely be fine there. The other test listed above can't handle any type of unexpected logging and also fails with -Xlog:gc, so adding something like -Xlog:all=off to the child process in that test would avoid a lot of failures caused by unexpected logs.
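
To make the two suggestions above concrete, here is a minimal Java sketch. It is not code from either test: the class name `ChildOutputFilter`, both helper methods, and both regular expressions are hypothetical. The patterns assume a unified-logging warning carries the default uptime, level and tag decorations, and that a classic warning contains the text "VM warning:". The only VM option used is -Xlog:all=off, as proposed in the comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the JDK test library.
final class ChildOutputFilter {
    // Assumed shape of a unified-logging (UL) warning line, e.g.
    // "[0.004s][warning][os] NUMA support disabled: ..."
    private static final Pattern UL_WARNING =
        Pattern.compile("^\\[[^\\]]*\\]\\[warning\\s*\\]\\[[^\\]]*\\].*");

    // Assumed shape of a classic warning line, e.g.
    // "OpenJDK 64-Bit Server VM warning: ..."
    private static final Pattern VM_WARNING =
        Pattern.compile("^.* VM warning: .*");

    // Returns a copy of the child command with -Xlog:all=off inserted
    // right after the launcher, so it precedes the main class.
    // Assumes the command has at least one element (the launcher).
    static List<String> withLoggingDisabled(List<String> command) {
        List<String> result = new ArrayList<>(command);
        result.add(1, "-Xlog:all=off");
        return result;
    }

    // Drops lines that look like UL or classic VM warnings and keeps
    // every other line of the child's output, in order.
    static String stripWarnings(String output) {
        return output.lines()
            .filter(line -> !UL_WARNING.matcher(line).matches())
            .filter(line -> !VM_WARNING.matcher(line).matches())
            .collect(Collectors.joining("\n"));
    }
}
```

A test that compares child output exactly could pass its command through `withLoggingDisabled` before starting the process, while a test that already tolerates VM warnings could run the captured output through `stripWarnings` before asserting on it. Filtering keeps the warning visible to end users, whereas switching logging off in the child hides all log output from that process.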

I don't oppose this change but wanted to share my view on it.

@dwhite-intel

Those are good points @kstefanj. This is touching a lot of areas including testing, logging (it would be handy to run tests with GC logs I'd think), as well as the "principle of least surprise" for the user.

We'll leave this topic to you and @AlanBateman, @dholmes-ora and others with more experience in the wider issues. We've implemented the message both as a warning and as "info" and are happy to have either one.

@dholmes-ora
Member

Just to reinforce what I've already stated, if I ask for something "if available" I don't expect to get a warning when it is not available - I knew it might not be available. If using NUMA is a potential performance boost then naturally people will want to use it, if it is available. My concern is about the potential impact on end-users who set this in their deployment settings, not the tests that shone the light on the potential problem (for which I'm grateful the tests did in fact fail!).

Warning about a NUMA misconfiguration is a different matter - that is a problem that needs someone's attention to fix.

- log_warning(os)("NUMA support disabled: %s", reason);
+ log_info(os)("NUMA support disabled: %s", reason);
Member

So this is too coarse. Some reasons for disabling NUMA may require a warning because something is actually wrong. The only message I want to stop being a warning is the new "Only a single NUMA node is available" one.

@AlanBateman
Contributor

Are you going to remove com/sun/jdi/ProcessAttachTest.java and com/sun/jdi/ReattachStressTest.java from the ProblemList.txt as part of this? They were excluded due to this issue but there may be changes coming to make these tests more robust.

@dholmes-ora
Member

> Are you going to remove com/sun/jdi/ProcessAttachTest.java and com/sun/jdi/ReattachStressTest.java from the ProblemList.txt as part of this? They were excluded due to this issue but there may be changes coming to make these tests more robust.

If they are listed against this bug id then they must be removed, or else updated to a new bug id, as part of this change.

Member

@dholmes-ora dholmes-ora left a comment

Nothing further from me. Thanks

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 10, 2025
@swati-sha
Contributor Author

> > Are you going to remove com/sun/jdi/ProcessAttachTest.java and com/sun/jdi/ReattachStressTest.java from the ProblemList.txt as part of this? They were excluded due to this issue but there may be changes coming to make these tests more robust.
>
> If they are listed against this bug id then they must be removed, or else updated to a new bug id, as part of this change.

I think they are already listed with a new bug id, 8346827.
@AlanBateman: Let me know if any other update is needed from my side in this PR.

@AlanBateman
Contributor

> > > Are you going to remove com/sun/jdi/ProcessAttachTest.java and com/sun/jdi/ReattachStressTest.java from the ProblemList.txt as part of this? They were excluded due to this issue but there may be changes coming to make these tests more robust.
> >
> > If they are listed against this bug id then they must be removed, or else updated to a new bug id, as part of this change.
>
> I think they are already listed with a new bug-id 8346827
> @AlanBateman : Let me know if any other update needed from my side in this PR ?

The changes in this PR mean that com/sun/jdi/ProcessAttachTest.java and com/sun/jdi/ReattachStressTest.java should pass again. JDK-8346827 was created for these two because they were causing a lot of noise in our CI; later, many more tests started to fail and JDK-8346834 was created. My comment was just to say that the two JDI tests can be removed from the exclude list too. I think Chris wants to make the tests more robust, but that shouldn't prevent the tests from running in the meantime.

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jan 10, 2025
Member

@dholmes-ora dholmes-ora left a comment

Still good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 13, 2025
@dholmes-ora
Member

@swati-sha can we please get this integrated? Thanks

@swati-sha
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jan 15, 2025
@openjdk

openjdk bot commented Jan 15, 2025

@swati-sha
Your change (at version c7bb3a6) is now ready to be sponsored by a Committer.

@dholmes-ora
Member

/sponsor

@openjdk

openjdk bot commented Jan 15, 2025

Going to push as commit afc4529.
Since your change was applied there have been 156 commits pushed to the master branch:

  • a3be97e: 8347761: Test tools/jimage/JImageExtractTest.java fails after JDK-8303884
  • 28e01e6: 8347762: ClassFile attribute specification refers to non-SE modules
  • be1cdd9: 8344140: Refactor the discovery of AOT cache artifacts
  • 973c630: 8342466: Improve API documentation for java.lang.classfile.attribute
  • 9782bfd: 8347620: Shenandoah: Use 'free' tag for free set related logging
  • 35be4a4: 8347173: java/net/DatagramSocket/InterruptibleDatagramSocket.java fails with virtual thread factory
  • 36b7abd: 8225763: Inflater and Deflater should implement AutoCloseable
  • d6d45c6: 8303884: jlink --add-options plugin does not allow GNU style options to be provided
  • 0ee6ba9: 8347596: Update HSS/LMS public key encoding
  • ec2aaaa: 8326236: assert(ce != nullptr) failed in Continuation::continuation_bottom_sender
  • ... and 146 more: https://git.openjdk.org/jdk/compare/de0250368edbf4e9bebf326778f8f8773b69b84c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 15, 2025
@openjdk openjdk bot closed this Jan 15, 2025
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jan 15, 2025
@openjdk

openjdk bot commented Jan 15, 2025

@dholmes-ora @swati-sha Pushed as commit afc4529.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
5 participants