Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor VirtualThread synchronization design #16855

Merged
merged 5 commits into from
Mar 31, 2023

Conversation

fengxue-IS
Copy link
Contributor

@fengxue-IS fengxue-IS commented Mar 8, 2023

Removes use of liveVirtualThreadList/liveVirtualThreadListMutex in the VM and rely on GC's Continuation list.
This design uses an atomic based RW lock with busy waiting for vthread access during mounted/unmount transition.
Global access in JVMTI is a slow path and done through Exclusive VM Access which doesn't require additional locking.

mutual dependency:
jdk19: ibmruntimes/openj9-openjdk-jdk19#83
jdk20: ibmruntimes/openj9-openjdk-jdk20#32
jdknext: ibmruntimes/openj9-openjdk-jdk#574

Fixes: #16728

Signed-off-by: Jack Lu Jack.S.Lu@ibm.com

@fengxue-IS
Copy link
Contributor Author

FYI @babsingh @tajila @gacholio

@fengxue-IS
Copy link
Contributor Author

Update

  1. Per discussion with @babsingh, will merge design concept from JVMTI suspend and resume rework to remove the virtual thread list #16846 to use EVMA instead of a RW-lock for controlling the global vthread access. This should completely remove the mount/unmount contention seen in stress testing.

  2. Refactor getStackTraceImpl to redo the check after calling haltThreadForInspection so it will not require using VirtualThread.inspectorCount for sync

  3. Dynamically set VirtualThread.notifyJvmtiEvents to remove all dependency on the notifyJvmtiMount/Unmount[Begin/End] API if JVMTI is not used.

@gacholio
Copy link
Contributor

  1. use EVMA instead of a RW-lock for controlling the global vthread access

That's fine for the JVMTI code but will probably be unuseably slow for any hot code path. EVMA is an extremely expensive operation.

@fengxue-IS
Copy link
Contributor Author

  1. use EVMA instead of a RW-lock for controlling the global vthread access

That's fine for the JVMTI code but will probably be unuseably slow for any hot code path. EVMA is an extremely expensive operation.

Currently JVMTI suspend/resume all is the only case where the RW lock is using writer access, so I assume by switching to EMVA will just remove the need for any locking for non-jvmti path

@gacholio
Copy link
Contributor

Have you measured CPU usage before and after this change? Spinning is OK for short-lived operations, but I suspect the CPU usage will spike heavily with this solution.

runtime/oti/VMHelpers.hpp Outdated Show resolved Hide resolved
@fengxue-IS
Copy link
Contributor Author

updated to the EVMA design, perf result 7x better on nocompressedrefs, 20% perf better on compressedrefs

…ed thread

Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
@fengxue-IS
Copy link
Contributor Author

addressed all review comments, locally compile/tested against JTReg stress test on x86 linux.

runtime/oti/VMHelpers.hpp Show resolved Hide resolved
@tajila
Copy link
Contributor

tajila commented Mar 29, 2023

jenkins test sanity.functional,extended.functional,sanity.openjdk,sanity.system,extended.system xlinux jdk19

@fengxue-IS
Copy link
Contributor Author

fengxue-IS commented Mar 30, 2023

test failure due to assertion added in #16713

j9vm.224    *   ** ASSERTION FAILED ** at /home/jenkins/workspace/Build_JDK19_x86-64_linux_Personal/openj9/runtime/vm/ContinuationHelpers.cpp:264: ((VM_VMHelpers::isFinished(continuation->state)))

This code is incorrect since when a yielded continuation gets freed by GC due to being unreachable, this assertion will fail (though state actually correctly reflected). This assertion made the assumption that unfinished vthread object will be kept alive by the VM's liveVirtualThreadList even if it is no longer reachable from user code which is no longer the case. FYI @LinHu2016

PS: I think the only thing we can assert during freeContinuation is that the continuation cannot be mounted (otherwise we are deleting a carrierThread's data)

@fengxue-IS
Copy link
Contributor Author

jenkins test sanity.functional,extended.functional,sanity.openjdk,sanity.system,extended.system xlinux jdk19

PR testing would need to depends on extension repo changes (something like)

jenkins test sanity.functional,extended.functional,sanity.openjdk,sanity.system,extended.system xlinux jdk19 depends ibmruntimes/openj9-openjdk-jdk19#83

@gacholio
Copy link
Contributor

@fengxue-IS Are you going to fix the assertion here?

Is there any reason ibmruntimes/openj9-openjdk-jdk19#83 can't be merged?

@fengxue-IS
Copy link
Contributor Author

@fengxue-IS Are you going to fix the assertion here?

I can fix the assertion in this pr, just want to confirm with lin that i correctly understand the reason for having the assertion originally.

Is there any reason ibmruntimes/openj9-openjdk-jdk19#83 can't be merged?

the extension change is co-dependent on this pr (continuation.vthread field is added in this change)

Continuation must be unmounted/not being scanned when free.

Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
@gacholio
Copy link
Contributor

jenkins test sanity.functional,extended.functional,sanity.openjdk,sanity.system,extended.system xlinux jdk19 depends ibmruntimes/openj9-openjdk-jdk19#83

@gacholio
Copy link
Contributor

I won't merge this if the testing passes due to the dependency. Someone who is a committer in both repos will have to do it.

@tajila
Copy link
Contributor

tajila commented Mar 30, 2023

Sure Ill do it

@fengxue-IS
Copy link
Contributor Author

build failed due to javadoc compile error, pending fix #17077

@gacholio
Copy link
Contributor

jenkins test sanity.functional,extended.functional,sanity.openjdk,sanity.system,extended.system xlinux jdk19 depends ibmruntimes/openj9-openjdk-jdk19#83

@tajila tajila merged commit 55b9267 into eclipse-openj9:master Mar 31, 2023
j9object_t *liveVirtualThreadList;
omrthread_monitor_t liveVirtualThreadListMutex;
volatile BOOLEAN inspectingLiveVirtualThreadList;
UDATA virtualThreadLinkNextOffset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see an update to DDR code to deal with the removal of virtualThreadLinkNextOffset; as a result, the !vthreads command is now broken: #17084 tracks that problem.

fengxue-IS added a commit to fengxue-IS/aqa-tests that referenced this pull request Apr 3, 2023
VThreadList lock contention fixed by
eclipse-openj9/openj9#16855

Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
renfeiw pushed a commit to adoptium/aqa-tests that referenced this pull request Apr 3, 2023
VThreadList lock contention fixed by
eclipse-openj9/openj9#16855

Signed-off-by: Jack Lu <Jack.S.Lu@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:vm jdk19 perf project:loom Used to track Project Loom related work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

jdk19 OpenJDK java/lang/Thread/virtual/stress/Skynet.java hang / timeout
8 participants