Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read-acquire Class Unload Mutex for compilation before checkpoint #17136

Merged
merged 1 commit into from
Apr 10, 2023

Conversation

dsouzai
Copy link
Contributor

@dsouzai dsouzai commented Apr 10, 2023

The following deadlock is possible:

  1. Comp thread read-enters class unload mutex
  2. Checkpointing thread acquires comp monitor
  3. Comp thread blocks trying to acquire comp monitor
  4. Checkpointing thread blocks trying to write-enter class unload mutex

The reason the Checkpointing thread acquires the class unload mutex is to prevent class unloading while it collects methods. However, a read-acquire is sufficient for this task.

Therefore, having the Checkpointing thread do a read-acquire will prevent the deadlock since it will not block waiting to enter.

The following deadlock is possible:

1. Comp thread read-enters class unload mutex
2. Checkpointing thread acquires comp monitor
3. Comp thread blocks trying to acquire comp monitor
4. Checkpointing thread blocks trying to write-enter class unload mutex

The reason the Checkpointing thread acquires the class unload mutex is
to prevent class unloading while it collects methods. However, a
read-acquire is sufficient for this task.

Therefore, having the Checkpointing thread do a read-acquire will
prevent the deadlock since it will not block waiting to enter.

Signed-off-by: Irwin D'Souza <dsouzai.gh@gmail.com>
@dsouzai dsouzai added comp:jit criu Used to track CRIU snapshot related work labels Apr 10, 2023
@dsouzai
Copy link
Contributor Author

dsouzai commented Apr 10, 2023

@mpirvu could you please review?

@dsouzai
Copy link
Contributor Author

dsouzai commented Apr 10, 2023

This will need to be backported to 0.38 since this issue was seen in Liberty testing.

Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
In the future we may need to add some comments for these methods regarding the monitors they already hold when entered and the monitors they acquire internally.
Can the same effect of blocking GC be achieved by acquiring VM access?

@mpirvu
Copy link
Contributor

mpirvu commented Apr 10, 2023

jenkins test sanity xlinux jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Apr 10, 2023

If the hook static void jitHookPrepareCheckpoint(J9HookInterface * * hookInterface, UDATA eventNum, void * eventData, void * userData) already holds VM access (to be verified) we don't need to acquire the classUnloadMutex later on.

@dsouzai
Copy link
Contributor Author

dsouzai commented Apr 10, 2023

If the hook static void jitHookPrepareCheckpoint(J9HookInterface * * hookInterface, UDATA eventNum, void * eventData, void * userData) already holds VM access (to be verified) we don't need to acquire the classUnloadMutex later on.

The checkpointing thread releases VM access because of #15202. However, at this point we're not actually waiting for comp threads so it may be that we can actually acquire and release VM Access at this point. I will have to look into the code paths to double check, but for 0.38 I think the change in this PR is the safest solution.

@mpirvu mpirvu merged commit 27f0c87 into eclipse-openj9:master Apr 10, 2023
@pshipton
Copy link
Member

pshipton commented Apr 10, 2023

This change doesn't compile everywhere. I spotted the following failure so far, there may be more later.

This is gcc 11.2
https://openj9-jenkins.osuosl.org/job/Build_JDK20_s390x_linux_Nightly/34/console

19:27:30  /home/jenkins/workspace/Build_JDK20_s390x_linux_Nightly/openj9/runtime/compiler/control/CompileBeforeCheckpoint.cpp: In member function 'void TR::CompileBeforeCheckpoint::collectAndCompileMethodsBeforeCheckpoint()':
19:27:30  /home/jenkins/workspace/Build_JDK20_s390x_linux_Nightly/openj9/runtime/compiler/control/CompileBeforeCheckpoint.cpp:116:71: error: empty parentheses were disambiguated as a function declaration [-Werror=vexing-parse]
19:27:30    116 |    TR::ClassUnloadMonitorCriticalSection collectMethodsCriticalSection();
19:27:30        |                                                                       ^~
19:27:30  /home/jenkins/workspace/Build_JDK20_s390x_linux_Nightly/openj9/runtime/compiler/control/CompileBeforeCheckpoint.cpp:116:71: note: remove parentheses to default-initialize a variable
19:27:30    116 |    TR::ClassUnloadMonitorCriticalSection collectMethodsCriticalSection();
19:27:30        |                                                                       ^~
19:27:30        |                                                                       --
19:27:30  /home/jenkins/workspace/Build_JDK20_s390x_linux_Nightly/openj9/runtime/compiler/control/CompileBeforeCheckpoint.cpp:116:71: note: or replace parentheses with braces to value-initialize a variable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jit criu Used to track CRIU snapshot related work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants