Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8334890: Missing unconditional cross modifying fence in nmethod entry barriers #19990

Closed

Conversation

fisk
Copy link
Contributor

@fisk fisk commented Jul 2, 2024

On x86_64, our nmethod entry barriers use a mix of asynchronous and synchronous code modification. There is a cmp instruction with an immediate. When the immediate value is "incorrect", the nmethod is armed, and when it's "correct", it's disarmed. When we load the immediate with the instruction fetcher, we use asynchronous cross modifying code, and when we load the immediate as data, we use synchronous cross modifying code.

We use asynchronous code modification in the fast path of nmethod entry barriers. If the nmethod is concurrently being disarmed while the nmethod entry barrier is executed, then we are guaranteed that if the updated "correct" immediate is observed by the instruction fetcher, then any code modification to the nmethod prior to disarming it on another thread, is guaranteed to also be observed by the instruction fetcher.

However, in the slow path, when the immediate was observed to have the "incorrect" value by the instruction fetcher, we call a C++ function, BarrierSetNMethod::nmethod_stub_entry_barrier. In this function we check if the nmethod is disarmed or armed, by loading the guard value (from the immediate), as data. If we observe the updated value, indicating that the nmethod has become disarmed, we want to enter the nmethod. However, since we used data to signal that the instruction cross modification has happened, it is not safe to execute the concurrently modified instructions, without enforcing a cross modifying code fence. This is synchronous code modification.

There is some questionable optimization that in the stub slow path entry (which we just got to because the nmethod was observed to be armed by the instruction fetcher). It checks "just one more time" if the nmethod concurrently got disarmed, and then exits without cross modification fence. This is an opportunistic optimization that is very unlikely to be useful, since we got into the slow path because it a couple of instructions ago was armed. This opportunistic optimization breaks the synchronous code modification contract, which is that you have to issue an instruction cross modification fence after reading the data that signalled that cross modification has completed successfully.

This patch removes these kinds of opportunistic optimizations from the nmethod entry barrier code, in order to make it more robust and follow the synchronous cross modification dance correctly.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8334890: Missing unconditional cross modifying fence in nmethod entry barriers (Bug - P3)(⚠️ The fixVersion in this issue is [23] but the fixVersion in .jcheck/conf is 24, a new backport will be created when this pr is integrated.)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19990/head:pull/19990
$ git checkout pull/19990

Update a local copy of the PR:
$ git checkout pull/19990
$ git pull https://git.openjdk.org/jdk.git pull/19990/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19990

View PR using the GUI difftool:
$ git pr show -t 19990

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19990.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 2, 2024

👋 Welcome back eosterlund! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jul 2, 2024

@fisk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8334890: Missing unconditional cross modifying fence in nmethod entry barriers

Reviewed-by: aboldtch, kbarrett

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 39 new commits pushed to the master branch:

  • 38a578d: 8334738: os::print_hex_dump should optionally print ASCII
  • 7b894bc: 8332786: When dumping static CDS archives, explicitly assert that we don't use a CDS archive
  • e01626c: 8335655: ProblemList serviceability/dcmd/vm tests failing after JDK-8322475
  • 3efa93b: 8335588: Fix -Wzero-as-null-pointer-constant warnings in calls to Node ctor
  • 587535c: 8334545: runtime/ClassInitErrors/TestStackOverflowDuringInit.java fails after JDK-8294960
  • 68ffec9: 8335479: JFR: Missing documentation for -XX:StartFlightRecording
  • 9a91865: 8335395: G1: Verification does not detect references into Free regions
  • 13b782c: 8334554: RISC-V: verify & fix perf of string comparison
  • 8aaec37: 8322475: Extend printing for System.map
  • 19a8a2b: 8335618: Serial: Remove unused definitions in SerialHeap
  • ... and 29 more: https://git.openjdk.org/jdk/compare/318d9acadf305f9d7d0cd8bb54b41506dd9914a8...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jul 2, 2024
@openjdk
Copy link

openjdk bot commented Jul 2, 2024

@fisk The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Jul 2, 2024
@mlbridge
Copy link

mlbridge bot commented Jul 2, 2024

Webrevs

@fisk
Copy link
Contributor Author

fisk commented Jul 4, 2024

Performance results are neutral, as expected. Tier1-5 passed.

Copy link
Member

@xmas92 xmas92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The always fence changes looks good.

The effects w.r.t. DeoptimizeNMethodBarriersALot and our testing is less clear to me.

Comment on lines -200 to +197
// Diagnostic option to force deoptimization 1 in 3 times. It is otherwise
// Diagnostic option to force deoptimization 1 in 10 times. It is otherwise
// a very rare event.
if (DeoptimizeNMethodBarriersALot) {
if (DeoptimizeNMethodBarriersALot && !nm->is_osr_method()) {
static volatile uint32_t counter=0;
if (Atomic::add(&counter, 1u) % 3 == 0) {
if (Atomic::add(&counter, 1u) % 10 == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not good intuition about the frequency of this, and how this affects things. So have hard time commenting on this change.

An alternative would be to just fence when the nmethod is already disarmed and not call bs_nm->nmethod_entry_barrier(nm);. This is what effectively already happens in all implementations of nmethod_entry_barrier after JDK-8331911 / #19285. (Maintaining the current behaviour with respect to DeoptimizeNMethodBarriersALot).

But as long as this new magic constant seems to have similar testing coverage when running our tests that use DeoptimizeNMethodBarriersALot this seems like a sensible solution as well.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jul 4, 2024
Copy link

@kimbarrett kimbarrett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

// a very rare event.
if (DeoptimizeNMethodBarriersALot) {
if (DeoptimizeNMethodBarriersALot && !nm->is_osr_method()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also include may_enter in the conditions, since the effect of this bit of code is to set it false.
But maybe that's a rare thing and not worth checking for here.

@fisk
Copy link
Contributor Author

fisk commented Jul 4, 2024

Thanks for the reviews @kimbarrett and @xmas92!
/integrate

@openjdk
Copy link

openjdk bot commented Jul 4, 2024

Going to push as commit c0604fb.
Since your change was applied there have been 43 commits pushed to the master branch:

  • cf1be87: 8335663: Fix simple -Wzero-as-null-pointer-constant warnings in C2 code
  • 0bb9c76: 8324089: Fix typo in the manual page for "jcmd" (man jcmd)
  • 3e3f83f: 8335385: javac crash on unattributed piece of AST
  • b20e8c8: 8335397: Improve reliability of TestRecursiveMonitorChurn.java
  • 38a578d: 8334738: os::print_hex_dump should optionally print ASCII
  • 7b894bc: 8332786: When dumping static CDS archives, explicitly assert that we don't use a CDS archive
  • e01626c: 8335655: ProblemList serviceability/dcmd/vm tests failing after JDK-8322475
  • 3efa93b: 8335588: Fix -Wzero-as-null-pointer-constant warnings in calls to Node ctor
  • 587535c: 8334545: runtime/ClassInitErrors/TestStackOverflowDuringInit.java fails after JDK-8294960
  • 68ffec9: 8335479: JFR: Missing documentation for -XX:StartFlightRecording
  • ... and 33 more: https://git.openjdk.org/jdk/compare/318d9acadf305f9d7d0cd8bb54b41506dd9914a8...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jul 4, 2024
@openjdk openjdk bot closed this Jul 4, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jul 4, 2024
@openjdk
Copy link

openjdk bot commented Jul 4, 2024

@fisk Pushed as commit c0604fb.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@fisk
Copy link
Contributor Author

fisk commented Jul 4, 2024

/backport :jdk23

@openjdk
Copy link

openjdk bot commented Jul 4, 2024

@fisk Could not automatically backport c0604fb8 to openjdk/jdk due to conflicts in the following files:

  • src/hotspot/share/gc/shared/barrierSetNMethod.cpp

Please fetch the appropriate branch/commit and manually resolve these conflicts by using the following commands in your personal fork of openjdk/jdk. Note: these commands are just some suggestions and you can use other equivalent commands you know.

# Fetch the up-to-date version of the target branch
$ git fetch --no-tags https://git.openjdk.org/jdk.git jdk23:jdk23

# Check out the target branch and create your own branch to backport
$ git checkout jdk23
$ git checkout -b backport-fisk-c0604fb8-jdk23

# Fetch the commit you want to backport
$ git fetch --no-tags https://git.openjdk.org/jdk.git c0604fb823d9f3b2e347a9857b11606b223ad8ec

# Backport the commit
$ git cherry-pick --no-commit c0604fb823d9f3b2e347a9857b11606b223ad8ec
# Resolve conflicts now

# Commit the files you have modified
$ git add files/with/resolved/conflicts
$ git commit -m 'Backport c0604fb823d9f3b2e347a9857b11606b223ad8ec'

Once you have resolved the conflicts as explained above continue with creating a pull request towards the openjdk/jdk with the title Backport c0604fb823d9f3b2e347a9857b11606b223ad8ec.

Below you can find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit c0604fb8 from the openjdk/jdk repository.

The commit being backported was authored by Erik Österlund on 4 Jul 2024 and was reviewed by Axel Boldt-Christmas and Kim Barrett.

Thanks!

@rose00
Copy link
Contributor

rose00 commented Jul 4, 2024

I see you integrated; good job, and thank you to the Reviewers. The narrative at the top of this PR is excellent for motivating and explaining the removal of the extra check. Some of the other diffs are more mysterious, as Axel noted.

There are no API docs in barrierSetNMethod.[ch]pp so I would need to trace through all the code paths to properly educate myself about the effect of this change.

For example, BarrierSetNMethod::nmethod_entry_barrier is a very interesting function, along with its OSR brother, but there are no comments directly visible here that give a clue as to when it is called, or why it must be called. I think that level of non-documentation is often a maintenance problem. I see this file relates to the larger API in barrierSet.hpp but that file has sparse comments also.

Ideally, I’d hope to read The Narrative of the Barrier Set at the top of barrierSet.hpp, and maybe have a brief pointer to The Narrative from less-commented related files like barrierSetNMethod.cpp.

Also, if I did this change, and was feeling chatty and cautious, I’d leave behind an informative comment to the effect that “you might want to double-check the barrier state here, but don’t, because races”. It’s nice not to leave a seam from past history, but sometimes the absence of a warning leads people to repeat history.

None of the above critiques would have stopped me from approving the change as another Reviewer, but the lack of documentation would have made me hesitate to review quickly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants