Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New eden and heap sizing logic in Balanced GC #12043

Merged
merged 1 commit into from
Sep 22, 2021

Conversation

cedrichansen
Copy link
Contributor

@cedrichansen cedrichansen commented Feb 24, 2021

This change introduces 2 different major components to balanced gc, which are working together.

---------Change 1: New heap sizing logic in balanced gc
The old heap sizing logic, relied primarily on free memory %, in relation to -Xminf and -Xmaxf to decide whether or not the heap should shrink/expand.
If free memory was within the acceptable boundary, the heap sizing logic would then look at gc cpu% to decide whether to expand/contract.

The new logic (which is included in this commit), aims to take a more hybrid approach, where TENURE free memory % and GMP cpu % are working together, to determine whether or not the heap needs to change.
The function calculateCurrentHybridHeapOverhead determines what the current "hybrid overhead" of the heap/gc is (blend of gc cpu % and free memory %).
If this "hybrid overhead" is above -Xmaxt (default to 5%), the heap will expand. Conversely, if the "hybrid overhead" is below -Xmint (2% default), the heap will shrink.

--------Change 2: New eden sizing logic in balanced

The current logic for eden sizing is fairly simple - By default, eden is taken to be 25% of the heap - unless specified otherwise by -Xmn/-Xmns/-Xmnx

The new logic, aims to make use of a few heuristics to improve overall PGC overheaad (% of time being active relative to total time), while trying to respect the newly introduced, tarokTargetMaxPauseTime.
Eden sizing logic now adheres to the following high level concepts:

  1. If the heap is not fully expanded ( that is current heap size <= -Xsoftmx/-Xmx), then eden will increase by 5% if PGC overhead and pgc avg time blend, is above
    dnssExpectedTimeRatioMinimum (5% default). Conversely, eden will shrink by 5% if the hybrid blend of pgc overhead and pgc time, falls below dnssExpectedTimeRatioMaximum (2% default). When the heap is not fully expanded, any changes in eden size, will result in the total heap to be resized by the same amount, in order to keep non-eden size the same.

  2. If the heap is fully expanded, a prediction will be made as to which eden size will produce the lowest total GC cpu overhead, while attempting to respect tarokTargetMaxPauseTime, given the constraints imposed by the heap (free memory, GMP cost, etc.
    This prediction accounts for PGC average time/intervals, GMP time/intervals, free Tenure space, and survival ratio of objects being copied out of eden, in order to determine the "best" eden size.
    When the heap is fully expanded, and eden grows, it will "steal" free regions from "tenure"/non-eden.

Note: The following command line options are now supported by balanced gc policy as part of this change.
- -Xgc:dnssExpectedTimeRatioMinimum -> sets min expected % of time spent in pgc pauses (default 2% for balanced)
- -Xgc:dnssExpectedTimeRatioMaximum -> sets max expected % of time spent in pgc pauses (default 5% for balanced)
- -Xgc:targetPausetime -> sets target (tarokTargetMaxPauseTime defaults to 200ms in balanced)

Depends on: eclipse-omr/omr#5825

Signed-off-by: Cedric Hansen cedric.hansen@ibm.com

@cedrichansen
Copy link
Contributor Author

@amicic FYI

@cedrichansen cedrichansen force-pushed the balancedHeapHeuristics branch 3 times, most recently from 151df40 to 32ea4e0 Compare February 24, 2021 17:33
@amicic amicic added the comp:gc label Feb 24, 2021
@amicic amicic self-requested a review February 24, 2021 20:30
sizeChange = calculateContractionSize(env, allocDescription, _systemGC, true);
}

if (0 == sizeChange && hybridHeapScore >= (double)_extensions->heapContractionGCTimeThreshold) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per coding standards, we put extra oval brackets and don't rely on operator priorities:

((0 == sizeChange) && (hybridHeapScore >= (double)_extensions->heapContractionGCTimeThreshold))

Please go through all the changes and add them when necessary. For this pass I'll ignore the rest of missing ones.

@amicic
Copy link
Contributor

amicic commented Feb 24, 2021

One general comment: if you think you touched a significant part of a file (50% would suffice), then convert J9 specific integer types like UDATA, I_32 etc... into standard types uintptr_t, int32_t etc in the whole file.

@amicic
Copy link
Contributor

amicic commented Feb 24, 2021

Please create an Issue, and move most of PR/commit comments there.

@cedrichansen
Copy link
Contributor Author

This PR addresses #12054

@cedrichansen
Copy link
Contributor Author

The latest 2 commits include the following changes:

  • Fixed some "if" statements conditions to adhere to coding standards
  • Changed j9 specific integer types, into standard integer types for files which saw significant modifications

@@ -1,5 +1,5 @@
/*******************************************************************************
* Copyright (c) 1991, 2020 IBM Corp. and others
* Copyright (c) 1991, 2021 IBM Corp. and others
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file after all has not been changed. Would you please just exclude it from change set without modification of copyrights data?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!


goto _exit;
}
#if defined(J9VM_GC_REALTIME)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all preprocessor lines should start column 0

@@ -60,6 +60,8 @@ class MM_MemorySubSpaceTarok : public MM_MemorySubSpace

MM_HeapRegionManager *_heapRegionManager; /**< Stored so that we can resolve the table descriptor for given addresses when asked for a pool corresponding to a specific address */
MM_LightweightNonReentrantLock _expandLock; /**< Most of the common expand code is not multi-threaded safe (since it used in standard collectors on alloc path fail path which is single threaded) */
double _lastObservedGcPercentage; /**< The most recently observed GC percentage (time gc is active / time gc is not active) */
U_64 _previouslyObservedPGCCount; /**< The count of observed PGC's since the last GMP cycle - Used to determine if heap resize occured after a system/global gc */
Copy link
Contributor

@dmitripivkine dmitripivkine Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you change variable types to standard in correspondent .cpp file, so it make sense to change types in .hpp as a pair

@cedrichansen
Copy link
Contributor Author

@dmitripivkine PR comments thus far have been addressed. These includes:

  • Fixing multiline comments to adhere to style standards
  • move preprocessor instruction to start of the line
  • making MemorySubspaceTarok.hpp integer types consistent with .cpp integer types
  • removed copyright update for untouched file -> GCextensionsBase.cpp

}

double
MM_MemorySubSpaceTarok::calculateHybridHeapOverhead(MM_EnvironmentBase *env, IDATA heapChange)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few of IDATAs that should be converted to intptr_t

Trc_MM_MemorySubSpaceTarok_calculateHybridHeapOverhead(env->getLanguageVMThread(), gcPercentage, freeMemComponant);
}

return (gcPercentage * gcOverheadWeight) + (freeMemComponant * (1 - gcOverheadWeight));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can still use weightedAverage here

}

double recentOverhead = (double)(_historicalPartialGCTime * 1000)/_averagePgcInterval;
_partialGcOverhead = (double)MM_Math::weightedAverage((float)_partialGcOverhead, (float)recentOverhead, 0.5);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

introduce weightedAverage for double

Copy link
Contributor Author

@cedrichansen cedrichansen Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eclipse-omr/omr@0aa8abb adds MM_Math::weightedAverage for doubles (this is a second commit on linked OMR PR)

double
MM_MemorySubSpaceTarok::calculateGcPctForHeapChange(MM_EnvironmentBase *env, IDATA heapSizeChange)
{
MM_EnvironmentVLHGC *envVLHGC = (MM_EnvironmentVLHGC *)env;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we typically name the one from the param envBase, and then this one just env

Copy link
Contributor Author

@cedrichansen cedrichansen Mar 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this file, I have seen it called envVLHGC, but am happy to refactor.

See below for example https://github.com/eclipse/openj9/blob/8394cffecaf5ea67e60c837a3b8f27e8c7d5c290/runtime/gc_vlhgc/MemorySubSpaceTarok.cpp#L447

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right the existing pattern is predominant. I like what I proposed because the sites that use them will have a bit less characters, but it's not a strong argument. We can leave it as is - up to you.

MM_MemorySubSpaceTarok::getHeapSizeWithinBounds(MM_EnvironmentBase *env)
{
double currentHybridHeapScore = calculateCurrentHybridHeapOverhead(env);
uintptr_t reccomendedHeapSize = getActiveMemorySize();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spelling: recommended

@cedrichansen
Copy link
Contributor Author

@amicic PR comments have been addressed. These include:

  • Converting IDATA -> intptr_t in MemorySubSpaceTarok
  • Fix spelling mistake 'reccomended' -> 'recommended'
  • using weightedAverage which takes double instead of float

I left out the change related to envVLHGC since some parts of the file in question use the same nomenclature I used for that same cast, but I can gladly change it if need be

if ((x1 < x2) && (y1 < y2)) {
double timeDiff = y1 - y2;
double edenSizeRatio = ((x1 + 1.0) / (x2 + 1.0));
_pgcTimeIncreasePerEdenRegionFactor = pow(edenSizeRatio, (1.0/timeDiff));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note for reviewers - this _pgcTimeIncreasePerEdenRegionFactor should not be a weighted average.

Rationale - _pgcTimeIncreasePerEdenRegionFactor is not "linear" in the sense that the difference between 1.1 and 1.2, is not "the same" as 1.2 to 1.3, since this equation is finding the logarithmic fit between the 2 points.
That being said, using a weighted average, is not really a good idea. Additionally, _historicalPartialGCTime is being used for this calculation, which is already a moving average, so no need to make _pgcTimeIncreasePerEdenRegionFactor a weighted average as well.

Copy link
Contributor

@amicic amicic Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably ok for now, but there are still some assumed consts here, which would fit for some workloads/current eden sizes and not for others. Specifically, A (assumed 1.0) here would could vary:

pow(edenSizeRatio, ( A /timeDiff));

How I would have done it (in theory anyway) is to remember last few (5-10) of (size, time) pairs and do a linear regression. That would give you first derivative (slope of the whatever the real type of the curve is) in the neighborhood of recent eden sizes. Indeed this is what we need, since we are typically doing incremental resizing (<10%), not that far from the recent past samples. Basically, a very simple machine learning.

That approach would get rid of assumptions like minimumPgcTime, factor A etc,
but it would be a lot more calculations and early estimates (during startup/rampup) would be fairly inaccurate.

Copy link
Contributor Author

@cedrichansen cedrichansen Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the 1.0, in pow(edenSizeRatio, ( 1.0 /timeDiff)); is a workaround to represent nthroot in c++. The actual formula is the following

image

Where z in the image, is _pgcTimeIncreasePerEdenRegionFactor in the code. That being said, I don't really see the 1.0 changing, unless we change the formula (which might be an option).

As for the suggested improvement, I agree that this would be a good refinement, and I think we had briefly spoken about doing something along those lines, but as you mention, it adds more complexity/calculations. Up to you if we tackle that refinement in this PR, or at a later point.

LinHu2016 added a commit to LinHu2016/openj9 that referenced this pull request Oct 13, 2021
With New eden and heap sizing logic in Balanced GC(introduced in
eclipse-openj9#12043), maximum Eden
memorypool size could reach to 75% maximum heap size or the size
specified in -Xmn/-Xmnx

minimum Eden size is 2 * regions size or the size
specified in -Xmn/-Xmns
qasimy123 pushed a commit to qasimy123/openj9-jit-debug-agent that referenced this pull request Nov 26, 2021
eclipse-openj9/openj9#12043 introduced a warning when compiled to32 bit platforms.
This commit adds the appropriate cast to fix the warning.

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
qasimy123 pushed a commit to qasimy123/openj9-jit-debug-agent that referenced this pull request Nov 26, 2021
With New eden and heap sizing logic in Balanced GC(introduced in
eclipse-openj9/openj9#12043), maximum Eden
memorypool size could reach to 75% maximum heap size or the size
specified in -Xmn/-Xmnx

minimum Eden size is 2 * regions size or the size
specified in -Xmn/-Xmns
amicic added a commit to cedrichansen/openj9-docs that referenced this pull request Feb 4, 2022
eclipse-openj9/openj9#12043 will add support for
the options mentioned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this
added support.

closes eclipse-openj9#836

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
amicic added a commit to cedrichansen/openj9-docs that referenced this pull request Feb 4, 2022
eclipse-openj9/openj9#12043 will add support for
the options mentioned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this new
behaviour.

closes eclipse-openj9#834

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
pshipton pushed a commit to pshipton/openj9-docs that referenced this pull request Feb 7, 2022
…ioMinimum` support for Balanced GC

eclipse-openj9/openj9#12043 will add support for the options mentionned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this added support.

closes eclipse-openj9#836

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
pshipton pushed a commit to pshipton/openj9-docs that referenced this pull request Feb 7, 2022
eclipse-openj9/openj9#12043 will add support for
the options mentioned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this
added support.

closes eclipse-openj9#836

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
pshipton pushed a commit to pshipton/openj9-docs that referenced this pull request Feb 7, 2022
eclipse-openj9/openj9#12043 will add support for the option mentioned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this added support.

closes eclipse-openj9#833

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
pshipton pushed a commit to pshipton/openj9-docs that referenced this pull request Feb 7, 2022
eclipse-openj9/openj9#12043 will add support for the options mentioned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this new behaviour.

closes eclipse-openj9#834

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
pshipton pushed a commit to pshipton/openj9-docs that referenced this pull request Feb 7, 2022
eclipse-openj9/openj9#12043 will add support for
the options mentioned above, for the Balanced GC policy.
This PR contains the documentation changes required to reflect this new
behaviour.

closes eclipse-openj9#834

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants