Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comp Threads and JIT Server options post restore #16769

Merged

Conversation

dsouzai
Copy link
Contributor

@dsouzai dsouzai commented Feb 23, 2023

  • Add the concept of Number of Allocated Threads
    • In CRIU mode, the number of threads allocated is TR::CompilationInfo::MAX_CLIENT_USABLE_COMP_THREADS (unless it exceeds the number of code caches).
    • In non-CRIU mode, the number of threads allocated is the same as the number of usable threads.
  • Refactor JITServer options processing code so it can be reused post-restore
  • Process JITServer options post restore
  • Enable the post restore options processing code

@dsouzai dsouzai added comp:jit criu Used to track CRIU snapshot related work labels Feb 23, 2023
@dsouzai dsouzai force-pushed the compThreadsJitServerPostRestore branch from 50c709c to 722695c Compare February 23, 2023 22:49
@@ -331,79 +253,45 @@ J9::OptionsPostRestore::iterateOverExternalOptions()
void
J9::OptionsPostRestore::processJitServerOptions()
{
#if defined(J9VM_OPT_JITSERVER)
if (_argIndexUseJITServer >= _argIndexDisableUseJITServer)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth noting that this may need to be changed in the future to account for the fact that in CRIU mode, JITServer is only enabled by default in non-portable mode (#15674)

@dsouzai
Copy link
Contributor Author

dsouzai commented Feb 23, 2023

@mpirvu could you please review?

@mpirvu mpirvu self-requested a review February 23, 2023 23:07
@mpirvu mpirvu self-assigned this Feb 23, 2023
@dsouzai
Copy link
Contributor Author

dsouzai commented Feb 27, 2023

@vijaysun-omr fyi

Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some inline comments

runtime/compiler/control/OptionsPostRestore.cpp Outdated Show resolved Hide resolved
@dsouzai dsouzai force-pushed the compThreadsJitServerPostRestore branch from 722695c to 13310a9 Compare February 28, 2023 17:52
@dsouzai
Copy link
Contributor Author

dsouzai commented Feb 28, 2023

@mpirvu good for review again (see force push). I updated setNumUsableCompilationThreadsPostRestore to take in the parameter by reference so that it updates TR::Options::_numUsableCompilationThreads; this ensures that we do the check you mentioned in #16769 (comment).

Also, I added the code to actually run this code post restore so that we start exercising this stuff.

Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few more suggestions. Please see inline.

runtime/compiler/control/CompilationThread.cpp Outdated Show resolved Hide resolved
runtime/compiler/control/J9Options.cpp Show resolved Hide resolved
runtime/compiler/runtime/JITServerStatisticsThread.cpp Outdated Show resolved Hide resolved
Signed-off-by: Irwin D'Souza <dsouzai.gh@gmail.com>
Signed-off-by: Irwin D'Souza <dsouzai.gh@gmail.com>
Signed-off-by: Irwin D'Souza <dsouzai.gh@gmail.com>
@dsouzai dsouzai force-pushed the compThreadsJitServerPostRestore branch from 13310a9 to 60bfe7b Compare March 1, 2023 22:53
@dsouzai
Copy link
Contributor Author

dsouzai commented Mar 1, 2023

@mpirvu addressed review comments (see force push).

Also I ran some tests:

Specifying nothing (defaults to localhost) or the machine name:

#CHECKPOINT RESTORE: Done compiling methods for checkpoint
#CHECKPOINT RESTORE: Preparing to suspend threads for checkpoint
#CHECKPOINT RESTORE: Finished suspending threads for checkpoint
#CHECKPOINT RESTORE: Ready for checkpoint
#CHECKPOINT RESTORE: Preparing for restore
#CHECKPOINT RESTORE: Ready for restore
...
 (warm) Compiling HelloInstantOn.jitMe()V  OrdinaryMethod j9m=0000000000191700 remote t=15206 compThreadID=3 memLimit=1572864 KB freePhysicalMemory=1534 MB
#JITServer: Client sending compReq seqNo=1 to server for method HelloInstantOn.jitMe()V @ warm.
#JITServer: t= 15206 Connected to a server (serverUID=17563505271461179976)
#JITServer: Client successfully loaded method HelloInstantOn.jitMe()V @ warm following compilation request. [metaData=00007F3848A86038, startPC=00007F3D812B4058]
+ (warm) HelloInstantOn.jitMe()V @ 00007F3D812B4058-00007F3D812B48E8 OrdinaryMethod - Q_SZ=3 Q_SZI=3 QW=26 j9m=0000000000191700 bcsz=23 OSR remote time=86227us mem=[region=448 system=2048]KB compThreadID=3 CpuLoad=0%(0%avg) JvmCpu=55% queueTime=86978us

Specifying -XX:JITServerAddress=badaddr:

#CHECKPOINT RESTORE: Done compiling methods for checkpoint
#CHECKPOINT RESTORE: Preparing to suspend threads for checkpoint
#CHECKPOINT RESTORE: Finished suspending threads for checkpoint
#CHECKPOINT RESTORE: Ready for checkpoint
#CHECKPOINT RESTORE: Preparing for restore
#CHECKPOINT RESTORE: Ready for restore
...
 (warm) Compiling HelloInstantOn.jitMe()V  OrdinaryMethod j9m=0000000000191700 remote t=9436 compThreadID=4 memLimit=1572864 KB freePhysicalMemory=1523 MB
#FAILURE:  JITServer::StreamFailure: Cannot resolve server name: Name or service not known for HelloInstantOn.jitMe()V @ warm
#JITServer: t=  9436 Could not connect to a server. Next attempt in 2000 ms.
#JITServer: t=  9436 Resetting activation policy to AGGRESSIVE because client has lost connection to server
#JITServer: compThreadID=4 JITServer StreamFailure: Generic stream failure
! (warm) HelloInstantOn.jitMe()V Q_SZ=3 Q_SZI=3 QW=26 j9m=0000000000191700 time=29751us compilationStreamFailure memLimit=1572864 KB freePhysicalMemory=1523 MB mem=[region=64 system=2048]KB compThreadID=4
 (warm) Compiling HelloInstantOn.jitMe()V  OrdinaryMethod j9m=0000000000191700  t=9436 compThreadID=4 memLimit=1572864 KB freePhysicalMemory=1523 MB
+ (warm) HelloInstantOn.jitMe()V @ 00007FEF322B4058-00007FEF322B4801 OrdinaryMethod - Q_SZ=3 Q_SZI=3 QW=26 j9m=0000000000191700 bcsz=23 OSR time=20324us mem=[region=3968 system=4096]KB compThreadID=4 CpuLoad=0%(0%avg) JvmCpu=76% queueTime=50911us

@mpirvu
Copy link
Contributor

mpirvu commented Mar 2, 2023

jenkins test sanity all jdk17

@mpirvu mpirvu merged commit 968e248 into eclipse-openj9:master Mar 3, 2023
@dsouzai dsouzai deleted the compThreadsJitServerPostRestore branch April 3, 2024 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jit criu Used to track CRIU snapshot related work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants