Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8322846: Running with -Djdk.tracePinnedThreads set can hang #17221

Closed
wants to merge 3 commits into from

Conversation

AlanBateman
Copy link
Contributor

@AlanBateman AlanBateman commented Jan 2, 2024

-Djdk.tracePinnedThreads is a debugging option that dates from early development in the loom repo to identify pinned threads. It has several issues and this tracing option will eventually be removed (use the JFR events instead). Several hangs have been reported when running with the system property set. The "hangs" stem from the onPinned callback executing while the virtual thread is in a transition state (typically parking). If the virtual parks while printing the stack trace then it works like a nested park where the thread state is never restored. Contention on the System.out can also lead to deadlock when there are platform and pinned virtual threads printing to System.out around the same time.

This PR brings over the changes from the loom repo to avoid these hangs. The changes mean the stack trace is only printed to System.out when the PrintStream lock can be acquired without blocking. It also restores the thread state after printing. An alternative to not printing traces would of course be to queue the traces so they are printed by another thread but this is just adding complexity for a debugging option that we want to go away.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8322846: Running with -Djdk.tracePinnedThreads set can hang (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/17221/head:pull/17221
$ git checkout pull/17221

Update a local copy of the PR:
$ git checkout pull/17221
$ git pull https://git.openjdk.org/jdk.git pull/17221/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 17221

View PR using the GUI difftool:
$ git pr show -t 17221

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/17221.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 2, 2024

👋 Welcome back alanb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 2, 2024

@AlanBateman The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Jan 2, 2024
@AlanBateman AlanBateman marked this pull request as ready for review January 2, 2024 15:19
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 2, 2024
@mlbridge
Copy link

mlbridge bot commented Jan 2, 2024

Webrevs

@jaikiran
Copy link
Member

jaikiran commented Jan 2, 2024

Hello Alan, I haven't yet looked in detail the changes, so this merely a comment based on first glance:

The changes mean the stack trace is only printed to System.out when the PrintStream lock can be acquired without blocking.

I guess that then means that the some pinned threads might not even appear in this generated thread dumps. I can't think of a way where we could at least print something that would at least hint that some pinned threads weren't dumped in the output. This reduces the utility of this system property which is unfortunate, but I do understand the reason.

@AlanBateman
Copy link
Contributor Author

I guess that then means that the some pinned threads might not even appear in this generated thread dumps.

Just to be clear. The changes here have no impact on thread dumps, the thread dump generated by jcmd Thread.dump_to_file has all virtual threads, including pinned threads.

As regards jdk.tracePinnedThreads. We want this system property to go away, it's just unfortunate that it seems to be widely used.

@jaikiran
Copy link
Member

jaikiran commented Jan 3, 2024

Hello Alan,

I guess that then means that the some pinned threads might not even appear in this generated thread dumps.

Just to be clear. The changes here have no impact on thread dumps, the thread dump generated by jcmd Thread.dump_to_file has all virtual threads, including pinned threads.

I didn't word my previous comment correctly. As you say, the jdk.tracePinnedThreads system property doesn't play a role in the thread dumps that generated through the jcmd command. This system property just enables auto generation of a thread dump of a virtual thread when that virtual thread gets pinned. That thread dump logging happens from within that virtual thread, which is what I understand to be the reason behind the potential hang.

My understanding is that the jcmd Thread.dump_to_file command, when invoked, does the thread dump generation (for all threads) from within the Attach Listener thread which is a platform thread and thus isn't susceptible to this issue of hanging.

As regards jdk.tracePinnedThreads. We want this system property to go away, it's just unfortunate that it seems to be widely used.

I think it was the ease of use (unlike having to start JFR recording) and the "event driven" automated thread dump (it gets logged only when the virtual thread is pinned unlike having to manually generate thread dumps) which made it attractive, especially during the early testing of virtual threads with other parts of the JDK.

Copy link
Member

@jaikiran jaikiran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this change looks OK to me. I'm not too familiar with the (virtual) thread states so I can't say if there's any other implication of changing the virtual thread state to RUNNING when we are printing this thread dump.

I've a trivial review comment about the new test method introduced in this PR, which I have added inline.

@openjdk
Copy link

openjdk bot commented Jan 3, 2024

@AlanBateman This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8322846: Running with -Djdk.tracePinnedThreads set can hang

Reviewed-by: jpai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 3 new commits pushed to the master branch:

  • 27d5f5c: 8322781: C1: Debug build crash in GraphBuilder::vmap() when print stats
  • df22fb3: 8322512: StringBuffer.repeat does not work correctly after toString() was called
  • c3cd1f1: 8323000: Parallel: Remove unused class declarations in psScavenge

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 3, 2024
* Test contention writing to System.out when pinned.
*/
@Test
void testContention() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test method expects to have all the launched threads to print to System.out without hanging. If it hangs then we expect the test run to timeout with a jtreg timeout. Given the context of this PR in which this test method is being introduced, I think I can understand what it does. But maybe we should update the test method's comment to say that this method verifies that contention when writing to System.out when pinned doesn't cause any hangs?

@AlanBateman
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Jan 6, 2024

Going to push as commit faa9c69.
Since your change was applied there have been 26 commits pushed to the master branch:

  • ace010b: 8319757: java/nio/channels/DatagramChannel/InterruptibleOrNot.java failed: wrong exception thrown
  • be4614e: 8323016: Improve reporting for bad options
  • 35a1b77: 8322636: [JVMCI] HotSpotSpeculationLog can be inconsistent across a single compile
  • 46965a0: 8322981: Fix 2 locations in JDI that throw IOException without using the "Caused by" exception
  • 700c25f: 8322954: Shenandoah: Convert evac-update closures asserts to rich asserts
  • 631a9f6: 8323073: ProblemList gc/g1/TestSkipRebuildRemsetPhase.java on linux-aarch64
  • ed9f324: 8322985: [BACKOUT] 8318562: Computational test more than 2x slower when AVX instructions are used
  • ade21a9: 8310844: [AArch64] C1 compilation fails because monitor offset in OSR buffer is too large for immediate
  • f0cfd36: 8322532: JShell : Unnamed variable issue
  • 78623c9: 8323012: C2 fails with fatal error: no reachable node should have no use
  • ... and 16 more: https://git.openjdk.org/jdk/compare/dd517c64047705d706b095d15d9fd4e0703ab39b...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jan 6, 2024
@openjdk openjdk bot closed this Jan 6, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 6, 2024
@openjdk
Copy link

openjdk bot commented Jan 6, 2024

@AlanBateman Pushed as commit faa9c69.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@AlanBateman
Copy link
Contributor Author

/backport jdk22

@openjdk
Copy link

openjdk bot commented Jan 13, 2024

@AlanBateman the backport was successfully created on the branch backport-AlanBateman-faa9c690 in my personal fork of openjdk/jdk22. To create a pull request with this backport targeting openjdk/jdk22:master, just click the following link:

➡️ Create pull request

The title of the pull request is automatically filled in correctly and below you find a suggestion for the pull request body:

Hi all,

This pull request contains a backport of commit faa9c690 from the openjdk/jdk repository.

The commit being backported was authored by Alan Bateman on 6 Jan 2024 and was reviewed by Jaikiran Pai.

Thanks!

If you need to update the source branch of the pull then run the following commands in a local clone of your personal fork of openjdk/jdk22:

$ git fetch https://github.com/openjdk-bots/jdk22.git backport-AlanBateman-faa9c690:backport-AlanBateman-faa9c690
$ git checkout backport-AlanBateman-faa9c690
# make changes
$ git add paths/to/changed/files
$ git commit --message 'Describe additional changes made'
$ git push https://github.com/openjdk-bots/jdk22.git backport-AlanBateman-faa9c690

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

2 participants