Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8315373: Change VirtualThread to unmount after freezing, re-mount before thawing #245

Closed
wants to merge 9 commits into from

Conversation

shipilev
Copy link
Member

@shipilev shipilev commented Feb 8, 2024

This is a series of Loom backports that brings in the fixes from JDK 22 and mainline into JDK 21, where Loom is production-ready. This batch fixes a few thread state and JVMTI problems.

Unfortunately, these backports are tied together, as they change the same code a few times. It is very hard to make them into separate backports without breaking stuff along the way, so instead I clobbered them together in this PR. They follow the application order from mainline. All patches together touch only the Virtual Thread parts, so the risk for the rest of the code should be minimal.

Brief explanation of fixes:

  • JDK-8315373: In mainline since Sep 2023, there is no bug tail. Changes how VTs interact with JVMTI. We want this fix to provide the ground for the rest of backports. Notably, it makes JDK-8312777 backport clean.
  • JDK-8312498: In mainline since Sep 2023. There is a bug tail, JDK-8322818. Fixes what thread state VT reports when timed-parked. This fixes the actual Loom bug.
  • JDK-8312777: In mainline since Oct 2023, there is no bug tail. Fixes the JVMTI race with mount/unmount events. Makes JDK-8322818 backport clean. This fixes the actual Loom bug.
  • JDK-8321270: In mainline since Dec 2023, there is no bug tail. Makes JDK-8322818 a clean backport. This fixes the actual Loom performance problem.
  • JDK-8322818: In mainline since Jan 2023. Fixes the bug introduced by JDK-8312498. Has two additional follow-ups: JDK-8323002 and JDK-8323296.
  • JDK-8323002: In mainline since Jan 2023. Follow-up test fix for JDK-8322818. Test-only change.
  • JDK-8323296: In mainline since Jan 2023. Follow-up test fix for JDK-8322818. Test-only change.
  • JDK-8316924: In mainline since Sep 2023. Follow-up test fix for JDK-8322818. Test-only change.

Before I undraft this, which would trigger JBS updates:

@AlanBateman, does this make sense to you?

@jerboaa, how do you feel about this?

Additional testing:

  • MacOS AArch64 server fastdebug, jdk_loom hotspot_loom
  • Linux AArch64 server fastdebug, jdk_loom hotspot_loom
  • Linux x86_64 server fastdebug, jdk_loom hotspot_loom
  • Linux x86_64 server fastdebug, tier{1,2,3}

Progress

Issues

  • JDK-8315373: Change VirtualThread to unmount after freezing, re-mount before thawing (Enhancement - P4 - Approved)
  • JDK-8312498: Thread::getState and JVM TI GetThreadState should return TIMED_WAITING virtual thread is timed parked (Bug - P3 - Approved)
  • JDK-8312777: notifyJvmtiMount before notifyJvmtiUnmount (Bug - P3 - Approved)
  • JDK-8321270: Virtual Thread.yield consumes parking permit (Bug - P3 - Approved)
  • JDK-8322818: Thread::getStackTrace can fail with InternalError if virtual thread is timed-parked when pinned (Bug - P3 - Approved)
  • JDK-8323002: test/jdk/java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java times out on macosx-x64 (Bug - P4 - Approved)
  • JDK-8323296: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id1 timed out (Bug - P4 - Approved)
  • JDK-8316924: java/lang/Thread/virtual/stress/ParkALot.java times out (Bug - P4 - Approved)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk21u-dev.git pull/245/head:pull/245
$ git checkout pull/245

Update a local copy of the PR:
$ git checkout pull/245
$ git pull https://git.openjdk.org/jdk21u-dev.git pull/245/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 245

View PR using the GUI difftool:
$ git pr show -t 245

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk21u-dev/pull/245.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Feb 8, 2024

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title Backport 9a83d55887e5e3a0a2e1e020c6ccb91604672358 8315373: Change VirtualThread to unmount after freezing, re-mount before thawing Feb 8, 2024
@openjdk
Copy link

openjdk bot commented Feb 8, 2024

This backport pull request has now been updated with issue from the original commit.

@openjdk openjdk bot added the backport label Feb 8, 2024
@shipilev
Copy link
Member Author

shipilev commented Feb 8, 2024

/issue add JDK-8312498

@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev
Adding additional issue to issue list: 8312498: Thread::getState and JVM TI GetThreadState should return TIMED_WAITING virtual thread is timed parked.

@shipilev
Copy link
Member Author

shipilev commented Feb 8, 2024

/issue add JDK-8312777
/issue add JDK-8321270
/issue add JDK-8322818
/issue add JDK-8323002
/issue add JDK-8323296

@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev
Adding additional issue to issue list: 8312777: notifyJvmtiMount before notifyJvmtiUnmount.

@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev
Adding additional issue to issue list: 8321270: Virtual Thread.yield consumes parking permit.

@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev
Adding additional issue to issue list: 8322818: Thread::getStackTrace can fail with InternalError if virtual thread is timed-parked when pinned.

@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev
Adding additional issue to issue list: 8323002: test/jdk/java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java times out on macosx-x64.

@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev
Adding additional issue to issue list: 8323296: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id1 timed out.

@shipilev
Copy link
Member Author

shipilev commented Feb 8, 2024

All these backports are clean.

/clean

@openjdk openjdk bot added the clean label Feb 8, 2024
@openjdk
Copy link

openjdk bot commented Feb 8, 2024

@shipilev This backport pull request is now marked as clean

@jerboaa
Copy link
Contributor

jerboaa commented Feb 8, 2024

@jerboaa, how do you feel about this?

This is porting work, so it would be advisable to get reviews from loom devs on this before considering for approval (even if it's clean).

@shipilev
Copy link
Member Author

shipilev commented Feb 8, 2024

@jerboaa, how do you feel about this?

This is porting work, so it would be advisable to get reviews from loom devs on this before considering for approval (even if it's clean).

Yes, @AlanBateman and me talked briefly about this during FOSDEM. Assuming Alan is fine with it, do you see other problems with the PR like this?

@AlanBateman
Copy link

AlanBateman commented Feb 9, 2024

Assuming Alan is fine with it, do you see other problems with the PR like this?

Normally I think some of these changes would only be backport if really needed but given the churn in this newish code then it is safer to backport these as a group rather than cherry pick. I looked through the diffs and everything looks okay. You may have to add JDK-8316924 to the list as the test ParkALot had to be dialed down for larger test systems.

@shipilev
Copy link
Member Author

shipilev commented Feb 9, 2024

Assuming Alan is fine with it, do you see other problems with the PR like this?

Normally I think some of these changes would only be backport if really need but given the churn in this newish code then it is safer to backport these as a group. I looked through the diffs and everything looks okay.

Thank you!

@jerboaa, anything else you want to be covered before I start doing the formal fix-requests for 21u-dev?

You may have to add JDK-8316924 to the list as the test ParkALot had to be dialed down for larger test systems.

Right. Added.

@shipilev
Copy link
Member Author

shipilev commented Feb 9, 2024

/issue add JDK-8316924

@openjdk
Copy link

openjdk bot commented Feb 9, 2024

@shipilev
Adding additional issue to issue list: 8316924: java/lang/Thread/virtual/stress/ParkALot.java times out.

@jerboaa
Copy link
Contributor

jerboaa commented Feb 9, 2024

@jerboaa, anything else you want to be covered before I start doing the formal fix-requests for 21u-dev?

I'll probably reach out to fellow maintainers and gather their opinion. Lets wait on that before the formal approval. From what I've seen so far it should be OK in principle as it's fairly contained code to the loom area.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
8315373: The approval request has been updated successfully.
8312498: The approval request has been updated successfully.
8312777: The approval request has been updated successfully.
8321270: The approval request has been updated successfully.
8322818: The approval request has been updated successfully.
8323002: The approval request has been updated successfully.
8323296: The approval request has been updated successfully.
8316924: The approval request has been updated successfully.

@openjdk openjdk bot added the approval label Feb 15, 2024
@shipilev
Copy link
Member Author

OK, I think I misread how approval command works. Let me try again.

/approval JDK-8315373 request Changes how VTs interact with JVMTI. In mainline since Sep 2023, there is no bug tail. We want this fix to provide the ground for the rest of backports. Notably, it makes JDK-8312777 backport clean. The fix is part of large atomic change, see 21u PR.

/approval JDK-8312498 request Fixes what thread state VT reports when timed-parked. This fixes the actual Loom bug.
In mainline since Sep 2023. There is a bug tail, JDK-8322818. The fix is part of large atomic change, see 21u PR.

/approval JDK-8312777 request Fixes the JVMTI race with mount/unmount events. This fixes the actual Loom bug. In mainline since Oct 2023, there is no bug tail. Makes JDK-8322818 backport clean. The fix is part of large atomic change, see 21u PR.

/approval JDK-8321270 request This fixes the actual Loom performance problem. In mainline since Dec 2023, there is no bug tail. Makes JDK-8322818 a clean backport. The fix is part of large atomic change, see 21u PR.

/approval JDK-8322818 request Fixes the bug introduced by JDK-8312498. In mainline since Jan 2023. Has two additional follow-ups: JDK-8323002 and JDK-8323296. The fix is part of large atomic change, see 21u PR.

/approval JDK-8323002 request Follow-up test fix for JDK-8322818. In mainline since Jan 2023. Test-only change. The fix is part of large atomic change, see 21u PR.

/approval JDK-8323296 request Follow-up test fix for JDK-8322818. In mainline since Jan 2023. Test-only change. The fix is part of large atomic change, see 21u PR.

/approval JDK-8316924 request Follow-up test fix for JDK-8322818. In mainline since Sep 2023. Test-only change. The fix is part of large atomic change, see 21u PR.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8315373: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8312498: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8312777: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8321270: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8322818: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8323002: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8323296: The approval request has been updated successfully.

@openjdk
Copy link

openjdk bot commented Feb 15, 2024

@shipilev
JDK-8316924: The approval request has been updated successfully.

@jerboaa
Copy link
Contributor

jerboaa commented Feb 16, 2024

/approve yes

@openjdk
Copy link

openjdk bot commented Feb 16, 2024

@jerboaa
8315373: The approval request has been approved.
8312498: The approval request has been approved.
8312777: The approval request has been approved.
8321270: The approval request has been approved.
8322818: The approval request has been approved.
8323002: The approval request has been approved.
8323296: The approval request has been approved.
8316924: The approval request has been approved.

@openjdk
Copy link

openjdk bot commented Feb 16, 2024

@shipilev This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8315373: Change VirtualThread to unmount after freezing, re-mount before thawing
8312498: Thread::getState and JVM TI GetThreadState should return TIMED_WAITING virtual thread is timed parked
8312777: notifyJvmtiMount before notifyJvmtiUnmount
8321270: Virtual Thread.yield consumes parking permit
8322818: Thread::getStackTrace can fail with InternalError if virtual thread is timed-parked when pinned
8323002: test/jdk/java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java times out on macosx-x64
8323296: java/lang/Thread/virtual/stress/GetStackTraceALotWhenPinned.java#id1 timed out
8316924: java/lang/Thread/virtual/stress/ParkALot.java times out

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added ready Pull request is ready to be integrated and removed approval labels Feb 16, 2024
@shipilev
Copy link
Member Author

I see the approval is granted. Unless there are other comments, I would integrate this PR soon, and it would then land in 21.0.3.

@jerboaa
Copy link
Contributor

jerboaa commented Feb 19, 2024

Yes, integrating it sooner rather than later would be preferred. Ramp down starts next week for JDK 21.0.3. It's not a low risk change. Having said that, I concur it's fairly well isolated to loom code and there were no objections in principle from other maintainers. We'll have to see and if worse comes to worst we'll back it out before April.

So in a nutshell, the question was whether to allow for 21.0.3 or push to 21.0.4. It'll have about 2 months before the release in April (21.0.3). If it was pushed to 21.0.4 we'd have some more testing in the jdk21u-dev tree, but to me the first merge to jdk21u matters. Going by that, we'd have similarly about 2 months of testing with public EA builds. So the characteristics don't change much. Therefore, approved for 21.0.3.

@gnu-andrew
Copy link
Member

So in a nutshell, the question was whether to allow for 21.0.3 or push to 21.0.4. It'll have about 2 months before the release in April (21.0.3).

It's five weeks until code freeze for 21.0.3 and so five weeks to both find any issues and get any follow-on fixes in. The release happens on 2024-04-16 so in two months time it will be out.

If it was pushed to 21.0.4 we'd have some more testing in the jdk21u-dev tree, but to me the first merge to jdk21u
matters. Going by that, we'd have similarly about 2 months of testing with public EA builds. So the characteristics don't change much. Therefore, approved for 21.0.3.

That would be a full two months. Even going from the initial build promotion, you've lost three weeks of development with 21.0.3 compared with 21.0.4 (the fourth build promotion is due to happen tomorrow).

If you want this in 21.0.3, I think it's imperative it gets merged as soon as possible to maximise the time available, ideally today. Personally, I would still prefer 21.0.4.

@shipilev
Copy link
Member Author

After discussing this with other engineers, and seeing more of the soft push back here, I am now leaning to keep this PR out of 21.0.3 and push it early into 21.0.4, some time next week. This would be less stressful for everyone involved.

The reason to have these patches is to make 21u conveniently usable for Loom adopters. However, there are other fixes (e.g. JDK-8320275) that would not make it to 21.0.3. Plus, there are dependent fixes (e.g. JDK-8322846) that we also want to get in after this lands. So backporting the performance fixes when there are not yet backported functional fixes does not gain us much ground for Loom stabilization for 21.0.3.

@shipilev
Copy link
Member Author

Heads-up: I remerged from 21.0.4 master, re-testing with all tests now, will push once tests are confirmed green.

@shipilev
Copy link
Member Author

/integrate

@openjdk
Copy link

openjdk bot commented Feb 28, 2024

Going to push as commit 3eb5517.
Since your change was applied there has been 1 commit pushed to the master branch:

  • cea29fe: 8325213: Flags introduced by configure script are not passed to ADLC build

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Feb 28, 2024
@openjdk openjdk bot closed this Feb 28, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Feb 28, 2024
@openjdk
Copy link

openjdk bot commented Feb 28, 2024

@shipilev Pushed as commit 3eb5517.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport clean integrated Pull request has been integrated
4 participants