-
Notifications
You must be signed in to change notification settings - Fork 6.2k
8285416: [LOOM] Some nsk/jdi tests fail due to needing too many virtual threads #11735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ugh carrier threads if the test pins a lot of threads.
|
/issue JDK-8282383 |
|
👋 Welcome back cjplummer! A progress list of the required criteria for merging this PR into |
|
@plummercj |
|
@plummercj The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
|
This sounds like a bug with the underlying executor. If the VT's pin their carrier threads then the executor should increase its parallelism level automatically to compensate for that. |
It's probably best if @AlanBateman explains this. In the meantime, you might find the following code in VirtualThread.createDefaultScheduler() useful: Also, Alan mentioned the following to me: "There are two configuration knobs. One is parallelism, the other is maxPoolSize. maxPoolSize is the maximum number of carrier threads. parallelism is really the target parallelism. It's value is the number of hardware threads but it might be increased temporarily during operations that pin threads. So if you are monitoring the number of carriers on an 8 core system then you might see 9 or 10 threads periodically, only to compensate for threads that are pinned." What's not clear of me is if the "pinning" that happens during synchronization is taken into account with this strategy. I think it might not actually be considered pinning (from an implementation point of view), but does have the same affect of occupying the carrier thread. |
|
The JEP defines pinning as expected:
But then also says:
which contradicts what you quoted from Alan above - though I prefer that behaviour as the JEP's behaviour seems a design flaw to me. |
dholmes-ora
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of the actual changes what you have seems fine. I wonder whether the tests themselves should be checking the parallelism value is large enough? At the moment there is a disconnect between the number of vthreads a test may create versus the setting of the parallelism level in the launcher/binder code.
|
@dholmes-ora I initially took the approach of making sure each test that was ever running short of carrier threads explicitly requested the minimum it needed. It resulted in quite a few changes, and I still had one test that was occasionally failing. Possibly I just came up one short in the carrier thread calculation. I abandoned it for this PR because it's much simpler. Here's a PR I just created for those changes in case you want to have a look at what was involved: #11762. |
Yeah that isn't nice. I was thinking a more simple check of the property value against the required number of threads. Though based on your comment "nthreads+1" is not always enough so the check would not be sufficient. |
I think I can get past the nthreads + 1 issue. I believe in one test there is another vthread that I was not accounting for, so probably nthreads + 2 would work for that test. However, I don't quite understand your suggestion. Are you suggesting that after checking the property, if too small then I don't run the test and just have it pass? |
|
I'm suggesting the test reports failure if the parallelism level is too low - with a message to go and change the launcher/binder code. |
Are saying that in addition to the changes in this PR I should also change each of the tests to add a check to make sure parallelism is set high enough? |
sspitsyn
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good besides what David is requesting.
Thanks,
Serguei
|
@plummercj This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 102 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the |
dholmes-ora
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are saying that in addition to the changes in this PR I should also change each of the tests to add a check to make sure parallelism is set high enough?
I was, but now I see the tests involved and the fact this problem is just an artifact of running those tests in virtual threads, then I really don't want to see those tests polluted with VT specific code. I know more now about the issue with pinning on monitor entry and not being able to increase parallelism for that case - but ideally that would indeed by the fix and I'll look into that some more.
But this fix is approved. Thanks.
|
The proposed change looks okay and a lot more maintainable than adjusting specific tests. |
|
/integrate |
|
Going to push as commit d6e9f01.
Your commit was automatically rebased without conflicts. |
|
@plummercj Pushed as commit d6e9f01. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
There are a few nsk debugger tests that pin multiple virtual threads to carrier threads when synchronizing. Sometime the default number of carrier threads (which equals the number of CPUs) is not enough, and the test deadlocks because virtual threads start to wait forever for an available carrier thread. This PR fixes this problem by using the
jdk.virtualThreadScheduler.parallelismproperty to change the default number of carrier threads. I believe the largest number of carrier threads any test needs is 11, so I chose 15 just to be safe.I had initially tried to fix each individual test by using the test support in
VThreadRunner.setParallism(). The advantage of this was limiting the scope of the change to just a few tests, and also being able to specify the exact number of needed carrier threads. The disadvantage was having to make quite a few changes to quite a few tests, plus I had one troublesome test that was still failing, I believe because I didn't fully understand how many carrier threads it needed. Just giving every test 15 carrier threads in the end was a lot easier.Progress
Issues
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11735/head:pull/11735$ git checkout pull/11735Update a local copy of the PR:
$ git checkout pull/11735$ git pull https://git.openjdk.org/jdk pull/11735/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 11735View PR using the GUI difftool:
$ git pr show -t 11735Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11735.diff