Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle GC thread shutdown during checkpoint #16607

Merged
merged 1 commit into from
Jan 28, 2023

Conversation

tajila
Copy link
Contributor

@tajila tajila commented Jan 25, 2023

Handle GC thread shutdown during checkpoint

Add synchronzation to the delayedLockingOperations path to handle the
case where the GC is shutting down its threads simulatneously.

Signed-off-by: Tobi Ajila atobia@ca.ibm.com

@tajila tajila force-pushed the criu_2 branch 2 times, most recently from fc37783 to 74ca318 Compare January 25, 2023 22:02
@tajila tajila requested a review from gacholio January 25, 2023 22:02
@tajila
Copy link
Contributor Author

tajila commented Jan 25, 2023

@gacholio please review

@tajila
Copy link
Contributor Author

tajila commented Jan 25, 2023

jenkins test sanity alinux64 jdk17

Copy link
Contributor

@gacholio gacholio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm amazed this has been wrong for so long.

@gacholio
Copy link
Contributor

18:25:57  FAILED: test_enumerate
18:25:57  java.lang.IllegalThreadStateException: Has threads
18:25:57  	at java.base/java.lang.ThreadGroup.destroyImpl(ThreadGroup.java:273) from jrt:/java.base
18:25:57  	at java.base/java.lang.ThreadGroup.destroy(ThreadGroup.java:255) from jrt:/java.base
18:25:57  	at org.openj9.test.java.lang.Test_Thread.test_enumerate(Test_Thread.java:381) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/jvmtest/functional/Java8andUp/GeneralTest.jar)
18:25:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) from jrt:/java.base
18:25:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) from jrt:/java.base
18:25:57  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) from jrt:/java.base
18:25:57  	at java.base/java.lang.reflect.Method.invoke(Method.java:568) from jrt:/java.base
18:25:57  	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.internal.Invoker.invokeMethod(Invoker.java:580) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:716) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:988) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:125) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.TestRunner.privateRun(TestRunner.java:648) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.TestRunner.run(TestRunner.java:505) from jdk.internal.loader.ClassLoaders$AppClassLoader@3690b7b2(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_1/aqa-tests/TKG/lib/testng.jar)
18:25:57  	at org.testng.SuiteRunner.runTest(SuiteRunner.java:455) from jdk.internal.loader.Class

@gacholio
Copy link
Contributor

18:16:57  FAILED: test_destroy3
18:16:57  java.lang.IllegalThreadStateException: Has threads
18:16:57  	at java.base/java.lang.ThreadGroup.destroyImpl(ThreadGroup.java:273) from jrt:/java.base
18:16:57  	at java.base/java.lang.ThreadGroup.destroy(ThreadGroup.java:255) from jrt:/java.base
18:16:57  	at org.openj9.test.java.lang.Test_ThreadGroup.test_destroy3(Test_ThreadGroup.java:403) from jdk.internal.loader.ClassLoaders$AppClassLoader@d21714a9(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/jvmtest/functional/Java8andUp/GeneralTest.jar)
18:16:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) from jrt:/java.base
18:16:57  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) from jrt:/java.base
18:16:57  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) from jrt:/java.base
18:16:57  	at java.base/java.lang.reflect.Method.invoke(Method.java:568) from jrt:/java.base
18:16:57  	at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:124) from jdk.internal.loader.ClassLoaders$AppClassLoader@d21714a9(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/lib/testng.jar)
18:16:57  	at org.testng.internal.Invoker.invokeMethod(Invoker.java:580) from jdk.internal.loader.ClassLoaders$AppClassLoader@d21714a9(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/lib/testng.jar)
18:16:57  	at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:716) from jdk.internal.loader.ClassLoaders$AppClassLoader@d21714a9(file:/home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/TKG/lib/testng.jar)

@gacholio
Copy link
Contributor

Why can this not be accomplished in the java code (as seems to be the case in the failing tests)? Another random thought - does the native code require synchronization and notification?

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

I see the problem with this code. I am trying to address the fact that GC thread objects cannot run the J9VMInternals::threadCleanup code which sets threadRef to 0. Normally, this wouldn't be a problem, but with CRIU we are shutting them down mid run so we need a way to clean them up. Perhaps the solution is to not call initializeAttachedThread on the GC threads.

@gacholio
Copy link
Contributor

Can you move the hack out of the common code into the GC? If you remove the attach call, the GC threads will not appear in the javacore (among other issues no doubt).

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

Can you move the hack out of the common code into the GC? If you remove the attach call, the GC threads will not appear in the javacore (among other issues no doubt).

The test failures we are seeing are due to the thread not being removed from the ThreadGroup lists.

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

jenkins test sanity alinux64 jdk17

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

Trying a new approach to isolate the changes to non Java threads. It passed all my local testing

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

Although, do we even need to call the ThreadObject constructor on GC Threads? If you enumerate the system thread group on J9 it returns GC and JIT threads, but hotspot does not.

@gacholio
Copy link
Contributor

Not constructing the objects would change (omit) the names in the javacore, I expect.

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

jenkins test sanity alinux64 jdk17

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

jenkins test sanity alinux64 jdk17

@tajila
Copy link
Contributor Author

tajila commented Jan 26, 2023

I dont think this approach is quite right either, ill try something else

@tajila tajila marked this pull request as draft January 26, 2023 22:12
@tajila tajila force-pushed the criu_2 branch 5 times, most recently from a5465d8 to 684cd28 Compare January 27, 2023 16:20
@tajila tajila force-pushed the criu_2 branch 12 times, most recently from 03134ee to 8fdc02b Compare January 27, 2023 18:23
@tajila tajila changed the title Clear threadref when vmthread detaches Handle GC thread shutdown during checkpoint Jan 27, 2023
@tajila tajila marked this pull request as ready for review January 27, 2023 18:25
@tajila
Copy link
Contributor Author

tajila commented Jan 27, 2023

jenkins test sanity alinux64 jdk11

@tajila
Copy link
Contributor Author

tajila commented Jan 27, 2023

jenkins test sanity xlinux jdk17

@gacholio
Copy link
Contributor

13:46:54  /home/jenkins/workspace/Build_JDK11_aarch64_linux_Personal/openj9/runtime/vm/vmthinit.c: In function 'initializeVMThreading':
13:46:54  /home/jenkins/workspace/Build_JDK11_aarch64_linux_Personal/openj9/runtime/vm/vmthinit.c:99:39: error: 'J9JavaVM {aka struct J9JavaVM}' has no member named 'delayedLockingOperationsMutex'
13:46:54     omrthread_monitor_init_with_name(&vm->delayedLockingOperationsMutex, 0, "Delayed locking operations mutex")

Add synchronzation to the delayedLockingOperations path to handle the
case where the GC is shutting down its threads simulatneously.

Signed-off-by: Tobi Ajila <atobia@ca.ibm.com>
@tajila
Copy link
Contributor Author

tajila commented Jan 27, 2023

jenkins test sanity alinux64 jdk11

@tajila
Copy link
Contributor Author

tajila commented Jan 27, 2023

@gacholio gacholio merged commit 6cadc1d into eclipse-openj9:master Jan 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants