Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize continuation walkframe processing for JIT codecache reclaim #17527

Merged
merged 1 commit into from
Aug 18, 2023

Conversation

LinHu2016
Copy link
Contributor

@LinHu2016 LinHu2016 commented Jun 5, 2023

JOptimize continuation walkframe processing for JIT codecache reclaim

JIT code cache reclaim is generally triggered via end of GC event,
it requires walk the java stack of all live continuation Objects to
keep the related resource in code cache for the virtual threads,
currently JIT uses single thread to iterate the global continuation list
for the processing, it might generate up to 5-30% regression for
throughput performance, use all of GC threads parallel iterating the
list during GC clearable phase could improve CPUs usage and reduce the
processing time(STW time).

  • parallel iterating continuation list at the end of GC clearable phase
    (except realtime)
  • register new J9HOOK_MM_WALKCONTINUATION event to callback jit java
    stack walk for code cache reclaim.
  • update that JIT code cache reclaim is only triggered by every end of
    GC Cycle event, it was triggered by both end of Local GC and end of of
    Global GC events, but some of end of GC events are unnecessary for
    code cache reclaim, such as concurrent scavenge has two STW GC phases
    (initializing at the begin and cleaning up at the end), at the end of
    concurrent scavenge initializing phase, there is not much GC work done
    yet, it would waster time for code cache reclamation at the point,
    also currently there are no different processing specific for local GC
    and global GC(if we need to handle the difference in future, we still
    can use "env->_cycleState->_type" to figure out local and global GC).

Signed-off-by: Lin Hu linhu@ca.ibm.com

@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 3 times, most recently from 3393df0 to 6d6b751 Compare June 6, 2023 19:24
@LinHu2016 LinHu2016 marked this pull request as ready for review June 28, 2023 12:37
@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 2 times, most recently from 87c35eb to 27bfedb Compare June 29, 2023 16:17
}
}
_extensions->didIterateContinuationListForJIT = true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this any different than CopyForward iteration?
would be nice to share.... perhaps put it into IGGC (which can be accesses with ext->globalCollector)
or as a static method in MM_ContinuationObjectBufferVLHGC ?

the latter approach, if works, could be applied for sharing Scavenge and MarkingScheme variants (using BufferStandard), that are also similar

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, a cleaner approach is to put it into HeapRegionManager (it's higher level abstract than Buffer), since iteration assumes some knowledge about how regions are organized, and indeed we do have separate HRMs for Standard and VLHGC configurations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both MM_HeapRegionManager and MM_HeapRegionManagerStandard are in omr, if we put it into HeapRegionManager, we need to update omr code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just realized it. nah, scratch it. we should not be introducing Java spacific code to OMR

{
J9JavaVM *vm = (J9JavaVM*)env->getOmrVM()->_language_vm;
J9JITConfig *jitConfig = vm->jitConfig;
if (NULL == jitConfig->methodsToDelete) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we generalize this with a check like 'anything registered to J9HOOK_MM_WALKCONTINUATION' ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move the check for jitConfig->methodsToDelete to RootScanner

@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 6 times, most recently from 076931e to 0e45f59 Compare July 12, 2023 15:29
void
MM_ScavengerRootClearer::iterateContinuationObjects(MM_EnvironmentBase *env)
{
env->getGCEnvironment()->_continuationObjectBuffer->iterateContinuationObjects(env);
Copy link
Contributor

@amicic amicic Jul 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still thinking if Buffer is the right class to add iteration to, but if we do it, it should be a static method, otherwise we may end up calling it with _continuationObjectBuffer NULL. Probably it would still work, since the iteration would never access 'this' pointer, but still it would be cleaner as a static.

Being a static, it could not be a virtual, but that would still be ok, (Scavenger/Marker would explicitly call Standard variant of iteration, and CopyForwad/GlobalMarkingScheme would call VLHGC variant.)

RootScanner's iterate method state in base class as empty/pure virtual, so we can share the jit's condition.

@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 3 times, most recently from 8f66b5b to 593b506 Compare July 12, 2023 20:10
@@ -70,7 +70,7 @@ const static char *attributeNames[] = {
"phantomrefscomplete", /* RootScannerEntity_PhantomReferenceObjectsComplete */
"unfinalizedobjectscomplete", /* RootScannerEntity_UnfinalizedObjectsComplete */
"ownablesynchronizerobjectscomplete", /* RootScannerEntity_OwnableSynchronizerObjectsComplete */
"continuationobjectscomplete", /* RootScannerEntity_ContinuationObjectsComplete */
"iteratingcontinuationobjects", /* RootScannerEntity_ContinuationObjectsComplete */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd keep these name consistent (revert it back to 'complete')

@@ -48,7 +48,8 @@
#include "FinalizableReferenceBuffer.hpp"
#include "FinalizeListManager.hpp"
#endif /* J9VM_GC_FINALIZATION*/
#include "ContinuationObjectBuffer.hpp"
//#include "ContinuationObjectBuffer.hpp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this, if really not needed

@@ -1044,6 +1050,12 @@ MM_RootScanner::scanClearable(MM_EnvironmentBase *env)
scanOwnableSynchronizerObjects(env);
scanContinuationObjects(env);

J9JavaVM *vm = (J9JavaVM*)env->getOmrVM()->_language_vm;
J9JITConfig *jitConfig = vm->jitConfig;
if (NULL != jitConfig->methodsToDelete) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could jitConfig be null (if running -Xint)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update it to
if ((NULL != jitConfig) && (NULL != jitConfig->methodsToDelete)) {

@@ -435,6 +435,7 @@ class MM_RootScanner : public MM_BaseVirtual
*/
virtual void scanOwnableSynchronizerObjects(MM_EnvironmentBase *env);
virtual void scanContinuationObjects(MM_EnvironmentBase *env);
virtual void iterateAllContinuationObjects(MM_EnvironmentBase *env);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put some comments in this base RootScanner class for these 2 methods: that one is about pruning C.O. lists within a scope of current GC (for example just for Nursery in Scavenge) and the other one is about iterating all live COs in the heap and doing any processing that other parties may register through a hook.

@@ -434,7 +434,16 @@ class MM_RootScanner : public MM_BaseVirtual
* which modifies elements within the list.
*/
virtual void scanOwnableSynchronizerObjects(MM_EnvironmentBase *env);
/**
* scanContinuationObjects is for pruning continuation Objects, which are dead or last unmounted,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

terminology: I think we prune a list or remove object from list, but not prune objects
there are some other minor mistakes: missing space, missing letter, inconsistent casing...

virtual void scanContinuationObjects(MM_EnvironmentBase *env);
/**
* iterateAllContinuationObjects is for iterating all live Contunuation Objects in the heap and doing any processing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incorrect spelling of Continuation

virtual void scanContinuationObjects(MM_EnvironmentBase *env);
/**
* iterateAllContinuationObjects is for iterating all live Contunuation Objects in the heap and doing any processing
* that other parties may register through a hook (iterateAllContinuationObjects}.
Copy link
Contributor

@amicic amicic Jul 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you mean to put the name of the hook within the brackets?
the brackets types are inconsistent

if ((OMR_GC_CYCLE_TYPE_SCAVENGE == cycle_type) || (OMR_GC_CYCLE_TYPE_VLHGC_PARTIAL_GARBAGE_COLLECT == cycle_type) ||
(OMR_GC_CYCLE_TYPE_GLOBAL == cycle_type) || (OMR_GC_CYCLE_TYPE_VLHGC_GLOBAL_GARBAGE_COLLECT == cycle_type)) {
if (extensions->didIterateContinuationListForJIT) {
extensions->didIterateContinuationListForJIT = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope we can get by without introducing this flag by registering JIT reclamation pass to CYCLE_END for all type of collectors (not just realtime). By doing that, we guarantee that stack walking pass for JIT (that we do now in clearable phase) is always done before reclamation pass. Clearable pass is done in the last increment of a cycle (if a cycle has more than one increment like concurrent scavenge, GMP, or realtime) when cycle end hook is also triggered that could do the reclamation.

@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 2 times, most recently from 8f2c1b0 to c5a1d62 Compare July 17, 2023 20:57
@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 5 times, most recently from d5a7d08 to 4435f0a Compare August 4, 2023 12:47
@dmitripivkine
Copy link
Contributor

jenkins test sanity aix,win jdk21

@LinHu2016 LinHu2016 force-pushed the Loom_GC_update_21 branch 6 times, most recently from 3e23a77 to 76e6f6b Compare August 16, 2023 16:57
@babsingh
Copy link
Contributor

jenkins test sanity aix,win jdk21

JIT code cache reclaim is generally triggered via end of GC event,
it requires walk the java stack of all live continuation Objects to
keep the related resource in code cache for the virtual threads,
currently JIT uses single thread to iterate the global continuation list
for the processing, it might generate up to 5-30% regression for
throughput performance, use all of GC threads parallel iterating the
list during GC clearable phase could improve CPUs usage and reduce the
processing time(STW time).

 - parallel iterating continuation list at the end of GC clearable phase
 (except realtime)
 - register new J9HOOK_MM_WALKCONTINUATION event to callback jit java
  stack walk for code cache reclaim.
 - update that JIT code cache reclaim is only triggered by every end of
 GC Cycle event, it was triggered by both end of Local GC and end of of
 Global GC events, but some of end of GC events are unnecessary for
 code cache reclaim, such as concurrent scavenge has two STW GC phases
 (initializing at the begin and cleaning up at the end), at the end of
 concurrent scavenge initializing phase, there is not much GC work done
 yet, it would waster time for code cache reclamation at the point,
 also currently there are no different processing specific for local GC
 and global GC(if we need to handle the difference in future, we still
 can use "env->_cycleState->_type" to figure out local and global GC).

Signed-off-by: Lin Hu <linhu@ca.ibm.com>
@babsingh
Copy link
Contributor

jenkins test sanity aix,win jdk21

@babsingh
Copy link
Contributor

babsingh commented Aug 18, 2023

The PR builds have passed. Merging in lieu of @amicic who has approved the PR and is currently away.

@babsingh babsingh merged commit d1eab2b into eclipse-openj9:master Aug 18, 2023
7 checks passed
Copy link
Contributor

@keithc-ca keithc-ca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should either be fixed or reverted.

@@ -4800,6 +4800,7 @@ C|UTILIZATION_WINDOW_SIZE
S|MmhookConstants|MmhookConstantsPointer|
C|J9HOOK_MM_CLASS_UNLOADING_END
C|J9HOOK_MM_INTERRUPT_COMPILATION
C|J9HOOK_MM_WALKCONTINUATION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dmitri is correct: new constants cannot be added to this file.

@@ -4800,6 +4800,7 @@ C|UTILIZATION_WINDOW_SIZE
S|MmhookConstants|MmhookConstantsPointer|
C|J9HOOK_MM_CLASS_UNLOADING_END
C|J9HOOK_MM_INTERRUPT_COMPILATION
C|J9HOOK_MM_WALKCONTINUATION
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't think it was necessary to add it.

@keithc-ca
Copy link
Contributor

#17986 addresses my concerns.

LinHu2016 added a commit to LinHu2016/openj9 that referenced this pull request May 30, 2024
 jitCodeCache reclamation would be triggered by GC cycle end event,
 For scavenge backout case, at the end of cycle, heap would be rolled
 back to before scavenge cycle start, and another global GC would start
 right after the end of backout scavenge, so there is no need to do
 jitCodeCache reclamation at the end of backout scavenge.
 also from Java21 the partial reclamation(related with virtualThread)
processing has been split to move to clearable phase of GC in order to
improve the processing performance
(PR:eclipse-openj9#17527), but for
scavenge backout case, the backout only could happen before the first
part of reclamation processing(the processing would never get a chance
to run for the case), without the first part  reclamation processing,
the second part of reclamation processing, which triggered by GC cycle
end, would be possible to reclaim the "live" stack jit frames, then
cause the crash. so we need to skip jitCodeCache reclamation for
scavenge backout case.

 - pass unsuccessful state via the cycleType of Cycle End event to the
 jitcodecache relamation hook for backout case.
 - check if unsuccessful state is set in the hook, if it is true, skip
  jitcodecache relamation processing(second part).

Signed-off-by: hulin <linhu@ca.ibm.com>
LinHu2016 added a commit to LinHu2016/openj9 that referenced this pull request May 30, 2024
 jitCodeCache reclamation would be triggered by GC cycle end event,
 For scavenge backout case, at the end of cycle, heap would be rolled
 back to before scavenge cycle start, and another global GC would start
 right after the end of backout scavenge, so there is no need to do
 jitCodeCache reclamation at the end of backout scavenge.
 also from Java21 the partial reclamation(related with virtualThread)
processing has been split to move to clearable phase of GC in order to
improve the processing performance
(PR:eclipse-openj9#17527), but for
scavenge backout case, the backout only could happen before the first
part of reclamation processing(the processing would never get a chance
to run for the case), without the first part  reclamation processing,
the second part of reclamation processing, which triggered by GC cycle
end, would be possible to reclaim the "live" stack jit frames, then
cause the crash. so we need to skip jitCodeCache reclamation for
scavenge backout case.

 - pass unsuccessful state via the cycleType of Cycle End event to the
 jitcodecache relamation hook for backout case.
 - check if unsuccessful state is set in the hook, if it is true, skip
  jitcodecache relamation processing(second part).

Signed-off-by: hulin <linhu@ca.ibm.com>
LinHu2016 added a commit to LinHu2016/openj9 that referenced this pull request May 31, 2024
 jitCodeCache reclamation would be triggered by GC cycle end event,
 For scavenge backout case, at the end of cycle, heap would be rolled
 back to before scavenge cycle start, and another global GC would start
 right after the end of backout scavenge, so there is no need to do
 jitCodeCache reclamation at the end of backout scavenge.
 also from Java21 the partial reclamation(related with virtualThread)
processing has been split to move to clearable phase of GC in order to
improve the processing performance
(PR:eclipse-openj9#17527), but for
scavenge backout case, the backout only could happen before the first
part of reclamation processing(the processing would never get a chance
to run for the case), without the first part  reclamation processing,
the second part of reclamation processing, which triggered by GC cycle
end, would be possible to reclaim the "live" stack jit frames, then
cause the crash. so we need to skip jitCodeCache reclamation for
scavenge backout case.

 - pass unsuccessful state via the cycleType of Cycle End event to the
 jitcodecache relamation hook for backout case.
 - check if unsuccessful state is set in the hook, if it is true, skip
  jitcodecache relamation processing(second part).

Signed-off-by: hulin <linhu@ca.ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants