8282379: [LOOM] vmTestbase/nsk/jdi/ClassType/invokeMethod/invokemethod011 sometimes fails#12420
plummercj wants to merge 5 commits into openjdk:master
Relying on |
I could just remove the sleep(0) call altogether, or I could make the original sleep(400) call conditional on not being a virtual thread, although I don't like that solution so much. In either case it sounds like you want a comment. A more general one would simply state not to call anything that might block since other threads are suspended. Now that I think of it, it seems like the two display() calls could block since they do I/O, although the way this test is run it seems they never do. I bet if you had some other thread(s) doing display() calls at the same time, and one of them got suspended in the middle of the display, then you might see issues.
I updated one of the tests to get rid of the sleep and added a comment. Let me know what you think. Once it's ok'd I'll apply the same change to the rest of the tests: |
      while(!doExit) {
          l--; l++;
    -     Thread.currentThread().sleep(400);
    +     Thread.currentThread().sleep(0);
BTW this is an anti-pattern - use Thread.sleep(n) - it always applies to the current thread.
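To illustrate the point about the static call, here is a minimal sketch. The class and method names are hypothetical and not part of the tests under review; the only claim it relies on is that Thread.sleep is a static method that always sleeps the calling thread.

```java
final class SleepIdiom {
    /**
     * Sleeps the calling thread for roughly the given time and
     * returns the elapsed wall-clock time in milliseconds.
     */
    static long pauseCurrentThread(long millis) throws InterruptedException {
        long start = System.nanoTime();
        // Thread.sleep is static, so it always applies to the caller.
        // Thread.currentThread().sleep(millis) compiles to the same call,
        // but reads as if sleep were an operation on a specific thread.
        Thread.sleep(millis);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("slept for about " + pauseCurrentThread(50) + " ms");
    }
}
```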
What about replacing sleep(0) with yield()?
Do we expect it to also cause deadlocks? Was it considered?
yield() on a virtual thread ends up doing a tryYield(), which is the same as what is done when calling sleep(0), so it should work, although I'm not so sure it is any better.
Okay so these tests are incredibly fragile - as you note display() may be a problem in theory, as could any potential class-loading or initialization. But without the sleep you have a busy-wait loop that may cause problems of its own (e.g. it might trigger on-stack-replacement but the compiler thread may require a resource held by one of the suspended threads!). It is impossible to know that something is 100% safe to do in this kind of situation.
Yes, the more that is done in the called method the more fragile it becomes. The javadoc is pretty clear about the risks of using INVOKE_SINGLE_THREADED, and I think most users (such as IntelliJ) will quickly time out the invoke, do a vm.resume() to unstick it, and then try the invoke again. The common use of JDI invoke by IDEs is to call toString() on local variables that are references to objects, which means even during common usage the invoke has the potential to do just about anything.
That's not possible since the thread will be in interpOnly mode (because invokes can only be done when suspended at an event).
Good to know interpOnly mode at least places some limits on what may happen. The busy-loop is not likely to trigger any deadlock in that case. I will leave it to you whether to select sleep(0) or no sleep at all.
All the tests now have the sleep() call removed and I added the warning comment.
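As a rough sketch of what the invoked method's loop looks like once the sleep is gone: the doExit flag and the l--; l++; body come from the diff quoted earlier, while the class name, the volatile modifier, the iteration counter and the exact comment wording are assumptions made for illustration.

```java
final class InvokedMethodSketch {
    // Assumed to be volatile here so the spinning thread sees the update.
    static volatile boolean doExit = false;

    /** Spins until doExit is set; returns the number of loop iterations. */
    static long spinUntilExit() {
        long l = 0;
        long iterations = 0;
        // WARNING: do not call anything in this loop that might block
        // (sleep, I/O, lock acquisition). With INVOKE_SINGLE_THREADED all
        // other threads stay suspended, so a blocking call may never return.
        while (!doExit) {
            l--; l++;
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread stopper = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException ignored) {
                // fall through and release the spinner anyway
            }
            doExit = true;
        });
        stopper.start();
        System.out.println("iterations: " + spinUntilExit());
        stopper.join();
    }
}
```

The loop burns CPU on purpose; in this test setting that is an acceptable trade for not depending on any other thread.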
Looks good.
Thanks for the reviews Serguei and David! /integrate
Going to push as commit f4b72df.
Your commit was automatically rebased without conflicts.
@plummercj Pushed as commit f4b72df.
A number of tests that use JDI invokeMethod() support occasionally fail when run using virtual threads. The tests fail with:
As explained below, this message is misleading and does not accurately reflect why the test is failing.
The root cause of the failure is that these tests use JDI invokeMethod support with the INVOKE_SINGLE_THREADED flag. This is not always going to work with virtual threads because the invoked method does a Thread.sleep(400), which is at risk of blocking indefinitely if all other threads are blocked from making progress. Note that technically platform threads could fail in the same manner. However, the failure only happens now with virtual threads because the implementation of Thread.sleep() differs for virtual threads, and may require ownership of a monitor that is sometimes held by another thread.
Another issue is that the tests do not do a very good job of error handling when this happens, and give the misleading failure reason of the invoked thread not being suspended after the invoke completed. The reason it is not suspended is because the invoke has actually not completed. There was a timeout that the test did not properly note as the cause of the failure. The test (debugger side) spawns a thread to do the JDI invokeMethod with, and then waits for it with:
This join() times out, but the test assumes that once it returns the invoke is complete, even though the invoked thread is actually still in the middle of the invoke. That is why the debuggee invokemethod thread is not currently suspended. I've fixed this by having the test check whether invThr is still alive after the join. If it is, the test is made to fail at that point, rather than continuing on and checking the debuggee thread's status. The failure then becomes:
nsk.share.TestFailure: TEST FAILED: invoke never completed
At that point a vm.resume() is done to allow the invoke to complete, and the test will exit with this failure.
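The check described above can be sketched in plain Java, without JDI. This is not the test's actual code: the class and method names are hypothetical, and a CountDownLatch stands in for the suspended debuggee, with countDown() playing the role of vm.resume().

```java
import java.util.concurrent.CountDownLatch;

final class JoinTimeoutCheck {
    /**
     * Waits up to timeoutMillis (must be > 0, since join(0) waits forever)
     * and reports whether the worker really finished.
     */
    static boolean completedWithin(Thread worker, long timeoutMillis)
            throws InterruptedException {
        worker.join(timeoutMillis);
        // join(timeout) returns normally on completion AND on timeout,
        // so the caller has to ask the thread whether it is still running.
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch resume = new CountDownLatch(1);
        Thread invThr = new Thread(() -> {
            try {
                resume.await(); // stands in for an invoke that cannot progress
            } catch (InterruptedException ignored) {
                // exit quietly
            }
        });
        invThr.start();

        if (!completedWithin(invThr, 200)) {
            System.out.println("TEST FAILED: invoke never completed");
            resume.countDown(); // analogous to vm.resume() unsticking the invoke
            invThr.join();
        }
    }
}
```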
As for avoiding the failure in the first place (the deadlock in the debuggee during the invoke), this is really a test bug for relying on INVOKE_SINGLE_THREADED and assuming that the invoked thread won't become deadlocked. Since there is a Thread.sleep() call in the invoked method, it can't make this assumption. From the ObjectReference.invokeMethod() spec:
"By default, all threads in the target VM are resumed while the method is being invoked if they were previously suspended by an event or by VirtualMachine.suspend() or ThreadReference.suspend(). This is done to prevent the deadlocks that will occur if any of the threads own monitors that will be needed by the invoked method."
"The resumption of other threads during the invocation can be prevented by specifying the INVOKE_SINGLE_THREADED bit flag in the options argument; however, there is no protection against or recovery from the deadlocks described above, so this option should be used with great caution."
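The hazard the spec describes can be simulated without a debugger. In the sketch below, which is an illustration only and uses no JDI calls, a holder thread takes a lock and then stalls, standing in for a thread that was suspended while owning a resource. The "invoked" code then needs the same lock and cannot get it. All names are hypothetical, and a ReentrantLock with a timed tryLock is used instead of a monitor so the example can give up rather than hang.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

final class SingleThreadedInvokeHazard {
    /**
     * Returns true if the "invoked" code managed to take the lock within
     * timeoutMillis while the holder thread was kept from making progress.
     */
    static boolean invokeWhileHolderIsStalled(long timeoutMillis)
            throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch resume = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                lockHeld.countDown();
                resume.await(); // stands in for "suspended while owning the lock"
            } catch (InterruptedException ignored) {
                // exit quietly
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        lockHeld.await();

        // The invoked method needs the lock, but its owner cannot run.
        boolean acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            lock.unlock();
        }

        resume.countDown(); // analogous to resuming the other threads
        holder.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("acquired: " + invokeWhileHolderIsStalled(200));
    }
}
```

With a real monitor there is no timeout, which is why the spec warns that there is no recovery once the invoked method blocks this way.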
For platform threads, sleep() doesn't require any monitors, so these tests never ran into problems before. For virtual threads, however, there is some synchronization done, and potential reliance on other threads not being suspended. A way around this is to always use sleep(0), which will at least attempt to yield the thread. For platform threads an actual yield is likely. For a virtual thread it will not yield in this particular case because the virtual thread is pinned to the carrier thread due to the JVMTI breakpoint callback that is currently in the call chain of the invoked thread. So for virtual threads this is effectively the same as sitting in a spin loop with no yielding. This is ok. CPU wasting is not a concern.
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/12420/head:pull/12420
$ git checkout pull/12420
Update a local copy of the PR:
$ git checkout pull/12420
$ git pull https://git.openjdk.org/jdk pull/12420/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 12420
View PR using the GUI difftool:
$ git pr show -t 12420
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/12420.diff