Rework Balanced GC heap and eden sizing logic #12054
At time of writing/update (July 9th, 2021), a variety of SPECjbb2015 tests have been re-run to validate the behaviour of the new logic. The following tests have all been run with both `-Xgc:targetPauseTime=200` and `-Xgc:targetPauseTime=500`.

Results

Baseline/existing logic: similar to the test configuration that was run in the original issue comment, the heap size changes by very large amounts (~2G) after a global sweep occurs, and then increases by ~1/2G shortly afterwards, due to GC % being fairly high.

Results/Stats
Observations/Notes
The models/heuristics being used for both heap and eden sizing logic can be found in the following desmos graph.
Implementation-wise, there are a few important high-level concepts to discuss. The approaches and heuristics described below are consistent with the code implementation found in #12043 at time of writing. The entry point for everything to do with eden resizing is located at:

Eden sizing logic

Heap is not fully expanded

When the heap is not fully expanded (that is, the current heap size is less than -Xsoftmx/-Xmx), the eden sizing logic will attempt to strike a balance between keeping pause times below the specified target (`tarokTargetMaxPauseTime`, which defaults to 200ms) and keeping PGC overhead within its expected bounds. A few examples of eden sizing decisions are below (assuming the default 200ms pause time target is used):
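As a hedged sketch of the decision rule just described (grow eden when the blended PGC cost is too high, shrink it when PGC is cheap), the following could apply; the function name, the region units, and the 5% step are assumptions for illustration, not the PR's actual code:

```python
# Illustrative sketch (not the OpenJ9 implementation) of eden resizing while
# the heap is not fully expanded. Defaults mirror the values quoted in this issue.

EXPECTED_RATIO_MIN = 0.02  # dnssExpectedTimeRatioMinimum (balanced default)
EXPECTED_RATIO_MAX = 0.05  # dnssExpectedTimeRatioMaximum (balanced default)
EDEN_STEP = 0.05           # resize eden by 5% per decision (assumed step)

def resize_eden_not_fully_expanded(eden_regions, hybrid_pgc_overhead):
    """Return a new eden size (in regions) from the blended PGC overhead (0.0-1.0)."""
    if hybrid_pgc_overhead > EXPECTED_RATIO_MAX:
        return round(eden_regions * (1 + EDEN_STEP))  # PGC too costly: grow eden
    if hybrid_pgc_overhead < EXPECTED_RATIO_MIN:
        return round(eden_regions * (1 - EDEN_STEP))  # PGC cheap: give memory back
    return eden_regions                               # within bounds: no change
```

A blend above the maximum grows eden (longer intervals between PGCs), while a blend below the minimum hands memory back to the rest of the heap.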
The decision as to whether or not eden should expand is made by mapping PGC average pause times to a PGC overhead "equivalent" (see the mapping discussion below).

Heap is fully expanded

When the heap is fully expanded, the eden sizing logic takes a different approach. Since there are now free memory constraints which should be respected, the eden sizing logic needs to take GMP costs into account as well. The main premise behind the eden sizing logic is that 3 key values can be predicted, which allows the logic to determine which eden size results in the best PGC overhead while remaining below the target pause time. The 3 predictions are:
Interval between PGCs

This is directly proportional to eden size. Doubling eden size means twice as long between consecutive PGCs.

Average time for PGC

The average time of PGCs. As mentioned in this issue's original comment, some applications will see an increase in PGC time as eden increases (SPECjbb being one such application). On any given PGC, the internal workings of the PR mentioned in this issue will create a logarithmic relationship which aims to predict how PGC times will change, given a particular eden size.

Number of PGCs per GMP cycle

The number of PGCs per GMP. This is roughly proportional to the amount of "free tenure" in the heap. Since changing eden size has implications for tenuring rate, this is not exactly linear, but it is quite close (assuming the max heap size being used is adequate). If tenure space shrinks by half, we can expect there to be half the number of PGCs per GMP cycle.

Total GC overhead per change in eden size

With these 3 prediction tools, we can model the total GC cpu overhead (GMP + PGC) through a function where 'x' is a change in eden size. Armed with this formula/model, a good estimate can be made of which eden size will minimize total GC overhead.

Important Notes

How does changing eden size affect the rest of the heap size?

Assume we have total heap =
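The three predictions and the resulting cost function can be sketched roughly as follows. Every constant below is invented for illustration, and `total_gc_overhead` is a hypothetical name, not the PR's implementation:

```python
import math

# Assumed current state of a fully expanded heap (all values illustrative).
EDEN = 512.0          # current eden size, MB
FREE_TENURE = 1024.0  # current free tenure, MB
PGC_TIME = 0.1        # current average PGC pause, seconds
PGC_INTERVAL = 2.0    # current interval between PGCs, seconds
GMP_COST = 5.0        # total cost of one GMP cycle, seconds
PGCS_PER_GMP = 20.0   # current number of PGCs per GMP cycle

def total_gc_overhead(x):
    """Predicted GC overhead (PGC + GMP) after changing eden by x MB."""
    scale = (EDEN + x) / EDEN
    interval = PGC_INTERVAL * scale                    # interval proportional to eden
    pgc_time = PGC_TIME * (1 + 0.5 * math.log(scale))  # PGC time grows ~logarithmically
    # Growing eden in a fully expanded heap takes memory away from tenure,
    # and PGCs per GMP are roughly proportional to free tenure.
    pgcs_per_gmp = PGCS_PER_GMP * max((FREE_TENURE - x) / FREE_TENURE, 0.05)
    return pgc_time / interval + GMP_COST / (pgcs_per_gmp * interval)

# Pick the candidate eden change with the lowest predicted overhead.
best_delta = min(range(-256, 512, 16), key=total_gc_overhead)
```

This mirrors the shape of the model described above: the PGC term falls as intervals lengthen, while the GMP term rises as free tenure shrinks, so the minimum sits at an interior eden size.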
How is PGC pause time mapped to GC overhead?

Mapping PGC pause time to PGC overhead is a relatively complex operation. The mapping needs to be modified depending on whether the heap is fully expanded or not.

Heap is not fully expanded

There is lots to unpack in this graph. A full explanation can be found by visiting the desmos link, which has significantly more detail. Here are some high-level ideas. NOTE: this graph is using a target pause time of 100ms; the x axis represents pause time, while the y axis represents the "equivalent" PGC overhead.
Heap is fully expanded

When the heap is fully expanded, the underlying eden sizing logic is trying to find the eden size which minimizes GC overhead. This means that the mapping from PGC pause time to PGC overhead needs to change. In this situation, if the PGC pause time is below the pause time target, it maps to 0% PGC overhead (ie, it is not incurring a "cost"). As the PGC pause time gets further and further above the target pause time, it adds increasingly more cost, in an exponential fashion. The mapping looks like the following (the target pause time here is 100ms; the x axis corresponds to PGC pause time, while the y axis is the corresponding GC overhead). Since logic exists to estimate PGC pause time as a function of eden size, the eden sizing logic knows which eden size will produce the lowest blend of GC overhead while satisfying the PGC pause time target (which has been mapped to a PGC overhead).

Total heap sizing logic

The heap sizing logic aims to blend GC cpu overhead with free memory in tenure. The current heap sizing logic relies on -Xminf, -Xmaxf, -Xmaxt, and -Xmint for its resizing thresholds, and thankfully, these have been reused in the new heap sizing logic implementation. With the new changes proposed by the attached PR, the heap will be resized if there is not a satisfactory blend of GMP overhead and free memory in tenure. Since "tenure" (by tenure here, we mean non-eden and non-survivor space) really only affects how frequent/costly GMP is, we completely ignore the duration/overhead of PGCs (again, the duration/performance of PGCs is really only affected by the size of eden). The free memory mapping follows a non-linear mapping, displayed in the image below. The magnitude of the heap expansion/contraction is driven by a set of formulas which approximate the change in heap size required so that the hybrid heap score (a blend of GC cpu overhead and free memory) will once again be between 1-5%.
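The exponential pause-to-overhead mapping for the fully expanded case, described above, might be sketched like this; the 100ms target matches the graph, while the scaling constant is an assumption:

```python
import math

TARGET_PAUSE_MS = 100.0  # matches the target used in the graph above

def pause_to_overhead(pgc_pause_ms):
    """Map a PGC pause time to an "equivalent" GC overhead fraction (sketch)."""
    if pgc_pause_ms <= TARGET_PAUSE_MS:
        return 0.0  # at or below target: the pause incurs no "cost"
    # Above target: cost grows exponentially with the relative excess.
    excess = (pgc_pause_ms - TARGET_PAUSE_MS) / TARGET_PAUSE_MS
    return 0.01 * (math.exp(excess) - 1.0)
```

Because this "pause cost" is expressed in the same units as GC overhead, the eden sizing logic can minimize the sum of the two with a single comparison per candidate eden size.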
The basic premise here is that free tenure can be estimated by the following function, which can then be mapped to its equivalent GMP overhead.

Summary

This set of models and heuristics does a much better job at optimizing heap size and eden size, providing users with the best overhead while attempting to respect a soft pause time target. The cohesiveness between the 2 sets of logic (heap and eden) provides performance benefits which are illustrated in the specjbb comment above, in the heapothesys test results below, and throughout the rest of this github issue.
A different benchmark which saw significant improvement was the heapothesys benchmark. Below are GCMV graphs for both the baseline and the new logic, as well as a data table for the results. In these tests:
Increasing eden to ~260Mb (roughly double the size from the baseline) significantly decreased the total time spent in GC pauses, from 74s total to only 42s, with only a minimal effect on average pause time. The entire suite of heapothesys tests that was run saw similar performance improvements. In certain tests, the time spent in GC pauses was reduced by 3x, with minimal effect on average PGC pause time. Please see #10721 for the original perf issue WRT heapothesys.
I have a suggestion: add options to set maximum/minimum eden percent. (Some software might use a percent option to deal with different heap sizes, such as Minecraft. Minecraft uses hotspot G1GC with percent options.)
@1a2s3d4f1 thank you for the suggestion. We have briefly discussed the possibility of this option (similar to the one you mention in hotspot); however, since no other OpenJ9 GC policy uses any percentage-based eden/nursery sizing options, we are leaning towards only using -Xmn/-Xmns/-Xmnx for setting eden/nursery size. This allows us to maintain more consistency across GC policies. @amicic may be able to provide additional motivation or extra input.
The graph below shows a baseline run of SPECjbb2015, with IR=1500 and a relatively large max heap (-Xmx6G). This should be a rather straightforward workload for the Balanced policy, but since eden is improperly sized, GMP is active for ~75% of the run. In this test, eden is too large relative to the rest of the heap: sizing eden to 500MB (25% of heap), with a total heap size of approximately 2G, produces an excessive number of GMPs. SPECjbb2015 contains about 1G of live objects, which leaves only 500Mb for other regions - not enough given the dynamics of this particular test. The current implementation of heap sizing notices that about 50% of the heap is occupied, and the GC ratio (which heavily favours STW pauses) is about 10% - within both -Xmint/-Xmaxt and -Xminf/-Xmaxf - so the heap does not resize, which in practice causes application performance to suffer. The new logic, which sizes eden and the non-eden heap independently (by different sets of heuristics), is able to achieve much better performance. Eden settled at 750Mb, while the heap settled at 2.58GB. This configuration left ample room for the long-lived live objects, and enough room that GMP was not constantly being triggered. Below is a comparison of certain key metrics in both runs.
The biggest standout improvement from the baseline to the new logic is that the new logic performed significantly less GMP work. Doing less GMP work means threads can be used by the application instead of performing GC work. Given that only STW pauses are measured in the table above, it would be relatively safe to say that the overall "cpu overhead" (including PGC and GMP overheads) of the baseline would be closer to 11%. 7.5% of the recorded 7.95% can be attributed to PGC pauses, while the remaining 0.45% can likely be attributed to GMP STW increments. The additional 3% comes from the fact that there are GC threads active for a large portion of the application lifetime (this % is a conservative estimate). On the other hand, the new logic, which saw a massive drop in the number of GMP cycles, would likely have had a total GC cpu overhead closer to 6% (5.5% existing STW pauses, and another 0.5% for GMP-related concurrent work). This means that the overall improvement in terms of reducing GC overhead is more along the lines of 11%->6%, rather than 7.95%->5.49%.
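The overhead arithmetic in the paragraph above works out as follows (the 3% concurrent figure is the conservative estimate quoted there):

```python
# Baseline overhead attribution from the paragraph above.
pgc_stw = 7.5          # % attributed to PGC pauses
gmp_stw = 0.45         # % attributed to GMP STW increments
concurrent_gmp = 3.0   # conservative estimate for concurrent GMP threads

baseline_total = pgc_stw + gmp_stw + concurrent_gmp  # ~11% total for the baseline
new_total = 5.5 + 0.5  # new logic: STW pauses plus concurrent GMP work, ~6%
```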
In the vast majority of cases when the heap is fully expanded, eden will grow in order to improve total GC overhead, since PGC is typically the dominating cost of all GC work (GMP + PGC). However, in certain situations where free memory is tight, eden may shrink instead.

Cases where eden shrinks simply because of high GMP overhead should typically be avoided. If eden is shrinking below ~25% of the total heap size (what was previously the default size for eden), then the heap is likely too small for the application in use. There are certain situations where eden will shrink below 25% of the heap due to the target pause time criteria - that is an entirely different consideration. Whether this is desirable or not is left to whoever is deploying/monitoring the java application.

Below is a GCMV graph of SPECjbb 2015, where eden shrank due to GMP being too expensive. The following options were used:

Baseline/old logic

Notice that eden stays at 425M in size for the duration of the run, leading to GMP being active almost 100% of the time.

New logic

Notice that eden shrank to ~260M in size shortly after noticing GMP was far too expensive/frequent.

Results
A few extra notes about these results:
The entire set of heapothesys tests was re-run with the new logic discussed in this issue, and the performance results are in this comment -> #10721 (comment)
A key question to consider when discussing heap sizing behaviour: what happens if my java application experiences significant changes in workload throughout its lifetime? Will my performance suffer as my workload increases? Will I still use extra memory even when I don't need it? These questions were simulated by running SPECjbb 2015 (as previously mentioned in this issue) and Synthetic GC workload (SGCW) as a java agent. By running SGCW as a java agent, it was possible to stack different types of allocations/live objects on top of the very stable SPECjbb 2015 workload, in the same JVM heap. By doing this, a test configuration was created which simulated a realistic change in application dynamics. This test configuration has changes in allocation rate at t=10, 15, and 20, with the allocations from t=20 until t=30 consisting of objects with a 10m lifetime. At t=30, 45, 50 and 55, the allocation rate dropped once more, to return to the original level. The SGCW configuration file which was used is attached. With the new eden/heap sizing logic and a target pause time of 50ms, the following heap sizing behaviour is obtained. Some key observations:
The experiment shows how the new eden and heap sizing logic is able to dynamically adjust to changes in allocation rate, live set, and GC times, to give a good balance of target pause times and total % of time spent in GC pauses. NOTE: to run SPECjbb 2015 + SGCW as a javaagent, the following command line can be used:
This change introduces 2 different major components, which work together. An in-depth overview of the inner workings of these changes can be found at eclipse-openj9#12054

Change 1: New heap sizing logic in balanced gc

The old heap sizing logic relied primarily on free memory %, in relation to -Xminf and -Xmaxf, to decide whether or not the heap should shrink/expand. If free memory was within the acceptable boundary, the heap sizing logic would then look at gc cpu % to decide whether to expand/contract. The new logic (which is included in this commit) aims to take a more hybrid approach, where TENURE free memory % and gc cpu % work together, weighed equally, to determine whether or not the heap needs to change. The function `calculateCurrentHybridHeapOverhead()` determines what the current "hybrid overhead" of the heap/gc is (a blend of gc cpu % and free memory %). If this "hybrid overhead" is above -Xmaxt (13% default), the heap will expand. Conversely, if the "hybrid overhead" is below -Xmint (5% default), the heap will shrink.

Change 2: New eden sizing logic in balanced

The current logic for eden sizing is fairly simple - by default, eden is taken to be 25% of the heap, unless specified otherwise by -Xmn/-Xmns/-Xmnx. The new logic aims to make use of a few heuristics to improve overall PGC overhead (% of time being active relative to total time), while trying to respect the newly introduced `tarokTargetMaxPauseTime`. Eden sizing logic now adheres to the following high level concepts:

1. If the heap is not fully expanded (that is, the current heap size is less than -Xsoftmx/-Xmx), then eden will increase by 5% if the blend of PGC overhead and PGC average time is above dnssExpectedTimeRatioMaximum (5% default). Conversely, eden will shrink by 5% if the hybrid blend of PGC overhead and PGC time falls below dnssExpectedTimeRatioMinimum (2% default).
2. If the heap is fully expanded, a prediction will be made as to which eden size will produce the lowest total GC cpu overhead, while attempting to respect `tarokTargetMaxPauseTime`, given the constraints imposed by the heap. This prediction accounts for PGC average time/intervals, GMP time/intervals, free tenure space, and the survival ratio of objects being copied out of eden, in order to determine the "best" eden size.
3. If `tarokTargetMaxPauseTime` is not being fully respected, then the eden sizing logic will balance the % of time spent in GC goals against the target pause time goals. The actual mechanism which causes eden size to respect this value is baked into the logic for 1. and 2. above. This value is a SOFT pause time target, which defaults to 200ms.

Note: The following command line options are now supported by the balanced gc policy as part of this change.
- `-Xgc:dnssExpectedTimeRatioMinimum` -> sets min expected % of time spent in pgc pauses (default 2% for balanced)
- `-Xgc:dnssExpectedTimeRatioMaximum` -> sets max expected % of time spent in pgc pauses (default 5% for balanced)
- `-Xgc:targetPauseTime` -> sets the target pause time (tarokTargetMaxPauseTime defaults to 200ms in balanced)

Depends on: eclipse-omr/omr#5825

Signed-off-by: Cedric Hansen <cedric.hansen@ibm.com>
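A rough sketch of the Change 1 decision rule follows. The real `calculateCurrentHybridHeapOverhead()` computes the blend from GC statistics and a non-linear free-memory mapping; the linear placeholder mapping below is an assumption for illustration only:

```python
XMINT = 0.05  # -Xmint default: contract below this hybrid overhead
XMAXT = 0.13  # -Xmaxt default: expand above this hybrid overhead

def hybrid_heap_overhead(gmp_cpu_overhead, free_tenure_ratio):
    """Blend GMP cpu overhead with a free-memory "cost" (both in 0.0-1.0).
    The real free-memory mapping is non-linear; this penalty is a stand-in."""
    free_memory_cost = max(0.0, 0.25 - free_tenure_ratio)  # penalize low free tenure
    return gmp_cpu_overhead + free_memory_cost

def heap_resize_decision(gmp_cpu_overhead, free_tenure_ratio):
    h = hybrid_heap_overhead(gmp_cpu_overhead, free_tenure_ratio)
    if h > XMAXT:
        return "expand"    # hybrid overhead above -Xmaxt
    if h < XMINT:
        return "contract"  # hybrid overhead below -Xmint
    return "no change"
```

Note that either high GMP cost or scarce free tenure alone can push the blend above -Xmaxt, which is the point of weighing the two together.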
Overview
In the balanced GC policy, the heap sizing logic is driven by free memory constraints (-Xminf/-Xmaxf), while factoring in GC overhead (overhead, for the purpose of this discussion, is the % of time GC is active relative to the application). While both are sensible criteria, the way they are combined leads to unexpected behaviour, which has significant performance implications.
Additionally, the size of “Eden”, where new objects are allocated, is very tightly coupled to the total heap size. As of right now, eden simply defaults to 25% of the total heap size. While this may seem reasonable (and in certain cases is fine), eden size has a direct impact on PGC time and PGC overhead, neither of which is accounted for by the total heap sizing logic.
Background
There are several key pieces of background information that are instrumental to understanding this issue.
First, there are a handful of default command line options/heuristics that are being used
Free memory
The heap sizing logic will attempt to keep percentage of free memory (P), between -Xminf < P < -Xmaxf.
If -Xmaxf < P, the heap will try to contract, and if P < -Xminf, the heap will try to expand.
GC overhead
Similar to free memory constraints, the heap attempts to keep GC overhead (G), between -Xmint < G < -Xmaxt
If -Xmaxt < G, expand the heap, and if G < -Xmint, contract the heap
-Xmaxt: 13%
-Xmint: 5%
The problem
The problem arises when we combine the free memory goals with the GC overhead goals. The current logic heavily prioritizes free memory goals. First, the logic checks whether the free memory goal is met, expanding/contracting if necessary. Only once this goal is met will the heap sizing logic start to look at the GC overhead, expanding or contracting as necessary. This causes problems when these criteria pull in opposite directions (free memory suggests expand/contract, while GC overhead suggests the opposite).
Additionally, eden is simply “along for the ride” throughout this entire process (assuming no -Xmn values have been set). This is not very practical, since the size of eden has very little to do with the amount of free space in the heap or the cost of the GMP component of Balanced GC. Rather, the size of eden directly influences PGC time and PGC overhead, and PGC is usually the most expensive operation of Balanced GC.
Example of the problem
The following graph shows SPECjbb 2015 with 4000 IR (relatively high) and -Xmx5G.
At the end of each GMP, the heap quickly frees lots of space, and the heap shrinks by ~2G to satisfy free memory goals. Once this is done, the heap sizing logic realizes that the GC overhead is too high (about 14% in this run), and quickly expands by ~2G to try to improve GC overhead.
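The oscillation above can be reproduced with a toy version of the prioritized decision logic; the -Xminf/-Xmaxf values below are illustrative, while -Xmint/-Xmaxt use the defaults quoted earlier:

```python
XMINF, XMAXF = 0.30, 0.60  # illustrative free-memory bounds
XMINT, XMAXT = 0.05, 0.13  # GC overhead bounds (defaults)

def old_resize_decision(free_ratio, gc_overhead):
    """Toy model of the current prioritized logic: free memory goals win first."""
    if free_ratio > XMAXF:
        return "contract"  # too much free memory
    if free_ratio < XMINF:
        return "expand"    # too little free memory
    # Only once free memory is satisfied does GC overhead get a say.
    if gc_overhead > XMAXT:
        return "expand"
    if gc_overhead < XMINT:
        return "contract"
    return "no change"

# After a GMP frees ~2G, free memory overshoots -Xmaxf and the heap contracts,
# even though the 14% GC overhead is asking for an expansion:
assert old_resize_decision(0.70, 0.14) == "contract"
# Next interval, free memory is back in range, so the 14% overhead wins:
assert old_resize_decision(0.45, 0.14) == "expand"
```

The two assertions fire in alternation as a GMP cycle completes and the heap refills, which is exactly the ~2G shrink/expand see-saw in the graph.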
Additional motivations
Eden plays a major role in gc performance
The size of eden plays a significant role in the performance of balanced GC, both from a pause time perspective, and an overhead perspective. The following chart, shows eden size, vs pgc pause time, and total time spent in gc pauses.
Note that as eden size increased, PGC pause time also increased, but total time spent in GC pauses decreased. With this observation, it becomes clear that eden must be given more attention than simply being set to 25% of the total heap size, since there are many overhead implications as well as pause time implications. It is important to note that some applications do not see PGC times increase as rapidly when eden size increases.
Large live set
Consider the following: there is an application with -Xmx10G, a live set of 8G, and a relatively low allocation rate. Given the current logic, eden will try to be 25% of the heap, which will not be conducive to good performance (possibly longer average pause times and, more than likely, excessive GMPs due to the small amount of free space).
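Worked arithmetic for this scenario (purely illustrative):

```python
xmx_gb = 10.0      # -Xmx
live_set_gb = 8.0  # long-lived objects

eden_gb = 0.25 * xmx_gb                       # the 25% rule gives a 2.5 GB eden
leftover_gb = xmx_gb - live_set_gb - eden_gb  # space left for everything else
# leftover_gb is -0.5: a 2.5 GB eden plus the 8 GB live set does not even fit
# in the 10 GB heap, so tenure is starved and GMP runs excessively.
```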
Proposed change
As hinted in the Background section above, it becomes clear that eden sizing logic and “non-eden” sizing logic need to be reworked in order to provide steadier performance and to better satisfy pause time goals, overhead goals, and free memory goals.
Eden sizing
Eden should be able to resize itself to meet its own set of overhead goals as well as a pause time goal. While satisfying both goals may not always be possible, the best possible blend between these two targets should be met, in a way that provides steady eden sizing. Gencon GC sizes the nursery independently (with some caveats) of the rest of the heap.
In a way, Eden/PGC behaviour is mostly unrelated to the size of the rest of the heap, and eden can resize freely until the heap is close to reaching -Xmx. At this point, the size of eden must be carefully calculated so that it does not exhaust the rest of the free space in the heap, while maximizing application throughput and respecting the aforementioned pause time goal.
Additionally, a pause time target should be introduced so that users can fine tune the specific pause time target that they can tolerate for their workloads.
Total heap sizing (non-eden)
The total heap sizing must find a better blend of free memory and GC overhead targets. Additionally, the GC overhead used for the total heap sizing logic must not consider the cost of PGC. Since Eden/PGC will be driven by its own set of heuristics, there is no need for non-eden resizing to consider the cost of PGC.
Finally, the total heap sizing logic, must attempt to maintain the same number of regions after eden is resized. If there are 100 non-eden regions before eden increases by 50 regions, then there should be 100 non-eden regions after resizing eden (in addition to some extra regions for additional expected survivor objects). This is so that GMP kickoff logic, and incremental defragmentation calculations can remain consistent.
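The region-count rule above can be sketched as follows; the 50-region starting eden and the helper name are assumptions for illustration:

```python
def total_regions_after_eden_resize(non_eden_regions, eden_regions, eden_delta,
                                    survivor_reserve=0):
    """Total heap size in regions after resizing eden, keeping the non-eden
    region count constant (plus an optional reserve for extra survivors)."""
    return non_eden_regions + survivor_reserve + eden_regions + eden_delta

# 100 non-eden regions before eden grows by 50 -> still 100 non-eden after,
# because the total heap grows by the same 50 regions eden gained.
total = total_regions_after_eden_resize(100, 50, 50)
non_eden_after = total - (50 + 50)  # subtract the new eden size
```

Keeping the non-eden count fixed is what lets GMP kickoff and incremental defragmentation calculations stay consistent across eden resizes.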
Related Issues
The proposed changes, will help address performance for the following issue:
#10721
As well as help resolve the following user raised issue:
#11866