
8317755: G1: Periodic GC interval should test for the last whole heap GC #16107

Closed

Conversation

shipilev
Member

@shipilev shipilev commented Oct 9, 2023

See the description in the bug. Fortunately, we already track the last whole-heap GC. The new regression test verifies the behavior.
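For context, here is a minimal sketch of the intended check in Java-like pseudocode, with hypothetical helper names; the actual change lives in G1's C++ periodic GC logic and may be structured differently:

// Illustrative only: base the periodic GC trigger on the time since the last
// whole-heap GC (concurrent cycle or full GC) rather than the time since any GC.
boolean shouldTriggerPeriodicGC(long nowMs, long lastWholeHeapGCMs) {
    if (G1PeriodicGCInterval == 0) {
        return false;                      // periodic GC disabled
    }
    long sinceWholeHeapGC = nowMs - lastWholeHeapGCMs;
    return sinceWholeHeapGC >= G1PeriodicGCInterval
        && systemLoadIsBelowThreshold();   // existing G1PeriodicGCSystemLoadThreshold check
}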

Additional testing:

  • Linux x86_64 fastdebug tier1 tier2 tier3

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change requires a CSR request matching fixVersion 23 to be approved (needs to be created)

Issue

  • JDK-8317755: G1: Periodic GC interval should test for the last whole heap GC (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16107/head:pull/16107
$ git checkout pull/16107

Update a local copy of the PR:
$ git checkout pull/16107
$ git pull https://git.openjdk.org/jdk.git pull/16107/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16107

View PR using the GUI difftool:
$ git pr show -t 16107

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16107.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Oct 9, 2023

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 9, 2023

@shipilev The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Oct 9, 2023
@shipilev shipilev changed the title 8317755: G1: Periodic GC should check only for last whole heap GC 8317755: G1: Periodic GC interval should test for the last whole heap GC Oct 9, 2023
@shipilev shipilev marked this pull request as ready for review October 9, 2023 20:56
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 9, 2023
@mlbridge

mlbridge bot commented Oct 9, 2023

Webrevs

@tschatzl
Contributor

tschatzl commented Oct 10, 2023

Initial comments when looking at it:

From what I understand from the CR description, the VM is not completely idle but has GCs now and then which do not (or almost do not) increase old gen occupancy.

So the problem seems to be that the existing policy for the periodic gcs does not detect idleness (and the request to give back memory) in this case. The JEP mentions the G1PeriodicGCSystemLoadThreshold option to improve upon that. Maybe use of it is an option in this case?

This is probably a significant behavioral change which I think is unintended: if the user enabled execution of a full gc instead of a concurrent marking, then instead of chugging along as long as there is at least some activity, there would now be regular full gcs regardless of "being idle" (depending on how you see idle; very long periods of only young gcs can be induced by e.g. an overly large heap). I do not think this is expected, even if the user asked for full gcs for periodic gcs.

This topic about periodic gcs has come up a few times, but the RFE does not describe the actual situation you are in, so it would be interesting to have more information. For example, would it be acceptable to have some kind of interaction with the VM to set e.g. SoftMaxHeapSize at idle? It is kind of hard to cover all "idle" situations - this has been part of the discussion in the original JEP - and this change seems to try to fix that with the imo fairly crude hammer of regularly "guaranteeing a whole heap analysis" instead.
That is not necessarily incompatible with the current mechanism, though (i.e. it does not have to be an either-or situation).

There may be other aspects of this suggestion.

Formal things: this is a significant behavioral change which needs a CSR imo. The referenced JEP defines the expected behavior exactly, which is to listen to young gcs as well.

/csr needed

@openjdk openjdk bot added the csr Pull request needs approved CSR before integration label Oct 10, 2023
@openjdk

openjdk bot commented Oct 10, 2023

@tschatzl has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@shipilev please create a CSR request for issue JDK-8317755 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

@tschatzl
Contributor

Formal things: this is a significant behavioral change which needs a CSR imo. The referenced JEP defines the expected behavior exactly, which is to listen to young gcs as well.

Note that this requirement can be removed again if there are significant changes, so probably don't start writing yet... however the current change as is would need one imo.

@tschatzl
Contributor

We discussed this issue at Oracle internally a bit, with the following thoughts:

We are still not sure about the use case you want to cover with that change, and probably the option G1PeriodicGCInterval is not named very well - the functionality you are modifying has always been intended to compact the heap for when the VM is almost completely idle.
Given our current understanding of the issue, this does not seem to be the case here.

Instead the intention of this change is probably something more like general cleanup (potentially including heap compaction) that is currently only performed after a whole heap analysis. Similar issues were reported in the past (like https://mail.openjdk.org/pipermail/hotspot-gc-dev/2020-March/028857.html), but nobody has acted on them yet.

So the problems we see here are:

  • the G1PeriodicGCInterval option naming is a bit misleading. It should be more specific, containing "Idle" or something in it.
  • there is no real periodic whole heap and cleanup marking functionality.

What we think would currently be most appropriate for solving your problem (if the suggestion in an earlier post does not apply) is to

  • rename G1PeriodicGCInterval to something else and start the deprecation/removal process (temporarily aliasing the old flag to the new one)
  • add the new functionality (periodic concurrent whole heap analysis) using a new flag

Also note that just for heap compaction there is this CR https://bugs.openjdk.org/browse/JDK-8238687 that I've been working on, on and off, for a long time, and which imo only needs testing and cleanup (hah!). If you want to look into this, feel free to contact me.

Hth,
Thomas

@shipilev
Member Author

Thanks for looking at it!

Re-reading JEP 346: “Promptly Return Unused Committed Memory from G1”...

In the scenario we are seeing, we do have lots of unused committed memory that would not be reclaimed promptly until a concurrent cycle executes. The need for that cleanup is, in the worst case, driven by heap connectivity changes that are not readily reflected in other observable heap metrics. A poster example would be a cache sitting perfectly in "oldgen", eventually dropping its entries and producing a huge garbage patch. We would only discover this after whole heap marking. In some cases, tuning based on occupancy (e.g. soft max heap size, heap reserve, etc.) would help, if we promote enough stuff to actually trigger the concurrent cycle. But if we keep churning very efficient young collections, we would get there very slowly.

Therefore, I’d argue the current behavior is against the spirit of JEP 346, even though the letter says that we track “any” GC for the periodic GC. There is no explicit mention of why young GCs should actually be treated as recent GCs, even though, with the benefit of hindsight, they throw away the promptness guarantees. Aside: Shenandoah's periodic GC does not make any claim that it would run only in idle phases, albeit the story is simpler without young GCs. But Generational Shenandoah would follow the same route: the whole heap GC would start periodically regardless of whether young collections run in between.

The current use of “idle” is also awkward in JEP 346. If the user enables G1PeriodicGCInterval without doing anything else, they would get a GC even when the application is churning at 100% CPU, but without any recent GC in sight. I guess we can think about “idle” as “GC is idle”, but then arguably not figuring out the whole heap situation for hours can also be described as “GC is idle”. I think the much larger point of the JEP is to reclaim memory promptly, which in turn requires a whole heap GC. It looks like the JEP somewhat painted itself into a corner by considering all GCs, including young ones.

I doubt that users would mind if we change the behavior of G1PeriodicGCInterval like this: the option is explicitly opt-in, the configurations I see in prod are running with huge intervals, etc. So we are asking for a relatively rare concurrent GC even when the application is doing young GCs. But I agree that departing from the current behavior might still have undesired consequences, for which we need to plan an escape route. There is also a need to have this in JDK 17u and 21u.

So as a compromise, I am leaning towards introducing a flag that defines whether the periodic GC checks for the last whole heap collection or not. The default can still be the current behavior, but the behavior we want here is also easily achievable. This is implemented in a new commit. (I realize this still requires a CSR, being a product flag and all.)

@albertnetymk
Member

I believe that having periodic Garbage Collections (whether they be young, concurrent, full or mixed) may be something that user applications need in certain scenarios. However, the current JVM lacks a way for Java to invoke a specific kind of GC cycle. If the WhiteBox API (or something equivalent) were available in Java, a user application could easily implement periodic GCs, along with any custom tweaks, e.g. the behavior of master -- short-circuiting periodic concurrent GC. That way, one can avoid the confusion around "The current use of “idle” is also awkward in JEP 346...". Just my 2 cents.

@tschatzl
Contributor

I believe that having periodic Garbage Collections (whether they be young, concurrent, full or mixed) may be something that user applications need in certain scenarios. However, the current JVM lacks a way for Java to invoke a specific kind of GC cycle. If the WhiteBox API (or something equivalent) were available in Java, a user application could easily implement periodic GCs,

The user can already do this to some degree with System.gc() and -XX:+ExplicitGCInvokesConcurrent.
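As an illustration of that workaround (not part of this PR), an application could schedule the explicit GC requests itself. A minimal sketch, assuming the JVM is started with -XX:+ExplicitGCInvokesConcurrent so that System.gc() initiates a concurrent cycle rather than a full GC, and using an arbitrary one-hour interval:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicExplicitGC {
    public static void main(String[] args) throws InterruptedException {
        // Daemon scheduler thread so it does not keep the VM alive on its own.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-explicit-gc");
            t.setDaemon(true);
            return t;
        });
        // Request a GC once per hour; with -XX:+ExplicitGCInvokesConcurrent this
        // starts a concurrent cycle instead of a stop-the-world full GC.
        scheduler.scheduleAtFixedRate(System::gc, 1, 1, TimeUnit.HOURS);

        // ... application work would go here ...
        Thread.sleep(Long.MAX_VALUE);
    }
}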

I think the point @shipilev (and @gctony) make is that there are resources (not only Java heap memory) that ought to be cleaned up "in time" to keep resource usage and performance "optimal". The end user has even less insight into this than the VM - being "idle" is only a convenient time (for the user) to do that, so making the end user do random stuff does not make anyone happy.

This is corroborated by my experience working with end users actually doing that (i.e. forcing a whole heap analysis externally). The solution was to tell them not to do that, because they were spamming these to an extent that was very detrimental to performance - because they did not know better (it did not help that, at that time, a forced concurrent cycle interrupted an ongoing one, so they effectively went into a full gc sometimes).

Both ZGC and Shenandoah have this regular cleanup (and CMS also provided the option) based on time. Not the best solution, but workable.

@tschatzl
Contributor

@shipilev : I have not made up my mind about the other parts of your proposal, but:

The current use of “idle” is also awkward in JEP 346. If the user enables G1PeriodicGCInterval without doing anything else, they would get a GC even when the application is churning at 100% CPU, but without any recent GC in sight.

This is the reason for the G1PeriodicGCSystemLoadThreshold option and is handled by the feature/JEP.

@shipilev
Member Author

shipilev commented Oct 19, 2023

@shipilev : I have not made up my mind about the other parts of your proposal, but:

The current use of “idle” is also awkward in JEP 346. If the user enables G1PeriodicGCInterval without doing anything else, they would get a GC even when the application is churning at 100% CPU, but without any recent GC in sight.

This is the reason for the G1PeriodicGCSystemLoadThreshold option and is handled by the feature/JEP.

Yes, that is why I said "without doing anything else". With that example, I wanted to point out that the definition of "idle" is already quite murky even with the current JEP, where we have an additional option to tell whether "idle" includes the actual system load. In this view, having another option that tells whether "idle" includes young GCs fits well, I think.

@shipilev
Member Author

shipilev commented Oct 19, 2023

This is corroborated by my experience working with end users actually doing that (i.e. forcing a whole heap analysis externally).

Yes, and that is the beauty of periodic GCs that are driven by the GC itself: it knows it should not start the periodic GC if there was recent activity. That part is captured in the JEP, I think. What this PR does is extend the usefulness of periodic GCs to the cases where you need a whole heap analysis to make the best call about what to do next. In G1's case, that might mean discovering that we need to start running mixed collections to unclutter the heap.

We can indeed make an external agent that calls System.gc() with +ExplicitGCInvokesConcurrent, but that IMO ignores the purpose, usefulness and reliability of periodic GCs... if we make sure they cover the use cases we care about. Otherwise, off to explicit GCs we go! That would feel like a major step backwards.

@mlbridge

mlbridge bot commented Oct 20, 2023

Mailing list message from Kirk Pepperdine on hotspot-gc-dev:

Don't we already have a GCInterval with a default value of Long.MAX_VALUE?

When I hear the word idle I immediately start thinking CPU idle. In this case, however, I quickly shifted to memory idle, which I think translates nicely into how idle the allocators are. Thus basing heap sizing ergonomics on allocation rates seems like a reasonable metric until you consider the edge cases. The most significant "edge case" IME is when GC overheads start exceeding 20%. In those cases GC will throttle allocations, and that in turn causes ergonomics to reduce sizes instead of increasing them (to reduce GC overhead). My conclusion from this is that ergonomics should consider both allocation rates and GC overhead when deciding how to resize the heap at the end of a collection.

Fortunately there is a steady stream of GC events that creates a convenient point in time to make an ergonomic adjustment. Not having allocations, and as a result not having the collector run, implies one has to manufacture a convenient point in time to make an ergonomic sizing decision. Unfortunately time-based triggers are speculative, and the history of speculatively triggered GC cycles has been less than wonderful (think DGC as but one case). My last consulting engagement prior to joining MS involved me tuning an application where the OEM's recommended configuration (set out of the box) was to run a full collection every two minutes. As one can imagine, the results were devastating. It took 3 days of meetings with various stakeholders and managers to get permission to turn that setting off.

If the application is truly idle then it's a no harm, no foul situation. However, there are entire classes of applications that are light allocators, and the question would be: what would be the impact of speculative collections on those applications?

As for returning memory, two issues: first, there appears to be no definition of "unused memory". Secondly, what I can say after looking at 1000s of GC logs is that the amount of floating garbage that G1 leaves behind, even after several concurrent cycles, is not insignificant. I also wrote a G1 heap fragmentation viewer, and what it revealed is that the heap remains highly fragmented and scattered after each GC cycle. All this suggests that the heap will need to be compacted with a full collection in order to return a large enough block of memory to make the entire effort worthwhile. Again, if the application is idle, then no harm, no foul. However, for those applications that are memory-idle but not CPU-idle this might not be a great course of action.

In my mind, any trigger for a speculative collection would need to take into consideration allocation rates, GC overhead, and mutator busyness (for cases when GC and allocator activity is low to zero).

Kind regards,
Kirk

@tschatzl
Contributor

tschatzl commented Nov 8, 2023

Sorry for the late reply, I have been busy with getting the 423 JEP out the door :)

Mailing list message from Kirk Pepperdine on hotspot-gc-dev:

Don't we already have a GCInterval with a default value of Long.MAX_VALUE?

There is no such flag.

When I hear the word idle I immediately start thinking CPU idle. In this case, however, I quickly shifted to memory idle, which I think translates nicely into how idle the allocators are. Thus basing heap sizing ergonomics on allocation rates seems like a reasonable metric until you consider the edge cases. The most significant "edge case" IME is when GC overheads start exceeding 20%. In those cases GC will throttle allocations, and that in turn causes ergonomics to reduce sizes instead of increasing them (to reduce GC overhead). My conclusion from this is that ergonomics should consider both allocation rates and GC overhead when deciding how to resize the heap at the end of a collection.

Allocation rate directly impacts GC overhead by causing more GCs.

Fortunately there is a steady stream of GC events that creates a convenient point in time to make an ergonomic adjustment. Not having allocations, and as a result not having the collector run, implies one has to manufacture a convenient point in time to make an ergonomic sizing decision.

See JDK-8238687.

That CR and this proposed feature complement each other: GC should do something when there is no steady stream of gc events.

[...]
As for returning memory, two issues: first, there appears to be no definition of "unused memory". Secondly, what I can say after looking at 1000s of GC logs is that the amount of floating garbage that G1 leaves behind, even after several concurrent cycles, is not insignificant.

G1 currently respects Min/MaxHeapFreeRatio, which are tunable. They may be fairly lenient.

I also wrote a G1 heap fragmentation viewer, and what it revealed is that the heap remains highly fragmented and scattered after each GC cycle. All this suggests that the heap will need to be compacted with a full collection in order to return a large enough block of memory to make the entire effort worthwhile.

There is no harm with fragmentation on a region level; G1 will give back memory on a region level anyway if able to do so (see Min/MaxHeapFreeRatio default values). There is now a humongous-object compacting full gc stage if necessary, so keeping region-level fragmentation in check is less of an issue now.

Again, if the application is idle, then no harm, no foul. However, for those applications that are memory-idle but not CPU-idle this might not be a great course of action.

Hence the flag described in the JEP. The reason why the defaults disable the CPU-idle check is that there is no good default.

In my mind, any trigger for a speculative collection would need to take into consideration allocation rates, GC overhead, and mutator busyness (for cases when GC and allocator activity is low to zero).

It does (if told to do so appropriately).

From @shipilev :

This is corroborated by my experience working with end users actually doing that (i.e. forcing a whole heap analysis externally).

Yes, and that is the beauty of periodic GCs that are driven by the GC itself: it knows it should not start the periodic GC if there was recent activity. That part is captured in the JEP, I think. What this PR does is extend the usefulness of periodic GCs to the cases where you need a whole heap analysis to make the best call about what to do next. In G1's case, that might mean discovering that we need to start running mixed collections to unclutter the heap.

I think there is still a misconception with the current "periodic gcs" here: it is first and foremost thought to be applicable for catching the idle case with all its options (also consider cpu idle, do full or concurrent gc).

This is the reason for the suggestion to phase out the current option G1PeriodicGCInterval to use a different name, and add this functionality (regular whole heap analysis to clean out cruft) as a new option (something like WholeHeapAnalysisInterval, or something more aligned with other collectors)

I could easily imagine the use case where one would want to act fairly quickly (and allow a disruptive full gc) on the detected idle case ("shortly" after detecting idle), and be more relaxed about general cleanup (long period, being least disruptive by using a concurrent mark).

Thomas

@shipilev
Member Author

shipilev commented Nov 9, 2023

I think there is still a misconception with the current "periodic gcs" here: it is first and foremost thought to be applicable for catching the idle case with all its options (also consider cpu idle, do full or concurrent gc).

Yes, "idle case". This PR adds another option that lets users further fine-tune what the "idle case" is.

This is the reason for the suggestion to phase out the current option G1PeriodicGCInterval to use a different name,

IMO, the cost/benefit of renaming flags is too high to be worthwhile here. It would add further confusion for users, as we change more and more flags across JDK releases. I would argue that the proposed scheme fits the existing amendments to the periodic GC configurations (see the example invocation after the list):

  1. -XX:G1PeriodicGCInterval=X: "do periodic GCs with this frequency, unless we had other GCs recently"
  2. -XX:G1PeriodicGCSystemLoadThreshold=X: "...except when the CPU load is this high"
  3. -XX:+G1PeriodicGCCheckWholeHeap: "...except when other GCs did not inspect the whole heap"
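For illustration only, the three options would then combine on one command line roughly like this (G1PeriodicGCCheckWholeHeap being the flag proposed in this PR; the interval, load threshold, and application are placeholder values):

$ java -XX:G1PeriodicGCInterval=3600000 -XX:G1PeriodicGCSystemLoadThreshold=0.3 -XX:+G1PeriodicGCCheckWholeHeap -jar app.jar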

and add this functionality (regular whole heap analysis to clean out cruft) as a new option (something like WholeHeapAnalysisInterval, or something more aligned with other collectors)

All right, so Shenandoah has ShenandoahGuaranteedGCInterval to control periodic GCs. Generational Shenandoah has ShenandoahGuaranteedOldGCInterval and ShenandoahGuaranteedYoungGCInterval. Would you like to have G1GuaranteedOldGCInterval and G1GuaranteedYoungGCInterval (which is nearly what G1PeriodicGCInterval currently is)?

It looks to me like creating a parallel hierarchy of periodic GC control options. If we phase out G1Periodic* after this, it would effectively be a rename of G1PeriodicGCInterval -> G1GuaranteedYoungGCInterval and G1PeriodicGCSystemLoadThreshold -> G1GuaranteedGCSystemLoadThreshold, right? Would it be better to just introduce G1PeriodicOldGCInterval to fit the existing option set?

@tschatzl
Contributor

I think there is still a misconception with the current "periodic gcs" here: it is first and foremost thought to be applicable for catching the idle case with all its options (also consider cpu idle, do full or concurrent gc).

Yes, "idle case". This PR adds another option that lets users further fine-tune what the "idle case" is.

This is the reason for the suggestion to phase out the current option G1PeriodicGCInterval to use a different name,

IMO, the cost/benefit of renaming flags is too high to be worthwhile here. It would add further confusion for users, as we change more and more flags across JDK releases. I would argue that the proposed scheme fits the existing amendments to the periodic GC configurations:

1. `-XX:G1PeriodicGCInterval=X`: "do periodic GCs with this frequency, unless we had other GCs recently"

2. `-XX:G1PeriodicGCSystemLoadThreshold=X`: "...except when the CPU load is this high"

3. `-XX:+G1PeriodicGCCheckWholeHeap`: "...except when other GCs did not inspect the whole heap"

Doing periodic whole heap inspections to free resources should not be tied to some notion of "idle" (ideally gc would schedule them at that time, but that is another matter).

E.g. the problem mentioned in https://mail.openjdk.org/pipermail/hotspot-gc-dev/2020-March/028857.html is independent of idle cleanup. Same issue applies to finalizers/symbol table/etc (also written up in https://bugs.openjdk.org/browse/JDK-8213198).

As far as I understand your proposal, in combination with -XX:G1PeriodicGCSystemLoadThreshold G1 may then never trigger whole heap collections (and the expected cleanups). An application may be non-idle for days without needing a concurrent mark, accumulating cruft in these data structures that can only be cleaned out with a concurrent mark.

@tschatzl
Contributor

tschatzl commented Nov 13, 2023

Now one can argue that it is up to the user of G1PeriodicGCSystemLoadThreshold to make sure there are enough idle periods... let me discuss this internally too; but my impression about the original JEP was that the idle case was the significant one where one would always use G1PeriodicGCSystemLoadThreshold.

@tschatzl
Contributor

There is also the option to trigger full gcs on "idle"; it would be fairly bad if they were triggered regularly with the G1PeriodicGCCheckWholeHeap option, and completely overkill for the https://bugs.openjdk.org/browse/JDK-8213198 case.

Member

@ysramakrishna ysramakrishna left a comment


LGTM. Very useful when there is no direct feedback loop from resource pressure that we might otherwise have used for signaling the need for a whole heap GC, so it is a good escape hatch to put in the hands of service owners. Of course, like any escape hatch, it could be misused, but I don't think that is a concern, any more than it is for ejection seats in fighter jets, for example. We trust those who fly the jets not to eject for no reason.

I would suggest formally dissociating this from the JEP itself. The fix-ups to the JEP's and its implementation's shortcomings can be addressed in the fullness of time. My USD 0.02.

@tschatzl
Contributor

tschatzl commented Dec 1, 2023

LGTM. Very useful when there is no direct feedback loop from resource pressure that we might otherwise have used for signaling the need for a whole heap GC, so it is a good escape hatch to put in the hands of service owners. Of course, like any escape hatch, it could be misused, but I don't think that is a concern, any more than it is for ejection seats in fighter jets, for example. We trust those who fly the jets not to eject for no reason.

I would suggest formally dissociating this from the JEP itself. The fix-ups to the JEP's and its implementation's shortcomings can be addressed in the fullness of time. My USD 0.02.

I do not think that breaking existing functionality to add new functionality - given that the former is widely publicized via the JEP for the default collector - and "fixing it up later" is the proper way forward.
The only advantage I can see with this approach is that this is a simple change, but implementing the new functionality properly is likely not that much more work anyway.

Contributor

@tschatzl tschatzl left a comment


See comment stream.

@bridgekeeper

bridgekeeper bot commented Dec 29, 2023

@shipilev This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented Jan 26, 2024

@shipilev This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jan 26, 2024
@shipilev
Member Author

/open

@openjdk openjdk bot reopened this Jan 26, 2024
@openjdk

openjdk bot commented Jan 26, 2024

@shipilev This pull request is now open

@bridgekeeper

bridgekeeper bot commented Feb 23, 2024

@shipilev This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@ysramakrishna
Member

Hi Thomas @tschatzl :

LGTM. Very useful when there is no direct feedback loop from resource pressure that we might otherwise have used for signaling the need for a whole heap GC, so it is a good escape hatch to put in the hands of service owners. Of course, like any escape hatch, it could be misused, but I don't think that is a concern, any more than it is for ejection seats in fighter jets, for example. We trust those who fly the jets not to eject for no reason.
I would suggest formally dissociating this from the JEP itself. The fix-ups to the JEP's and its implementation's shortcomings can be addressed in the fullness of time. My USD 0.02.

I do not think that breaking existing functionality to add new functionality - given that the former is widely publicized via the JEP for the default collector - and "fixing it up later" is the proper way forward. The only advantage I can see with this approach is that this is a simple change, but implementing the new functionality properly is likely not that much more work anyway.

Thanks for the response.

I read the JEP (authored by you, Thomas). I got the impression from the verbiage there (albeit slightly ambiguous) that the intention was to do periodic gcs which would induce a concurrent major cycle and collect the entire heap, because the objective was to run a cycle that reclaims garbage in the absence of other triggers and returns memory to the OS. The JEP also states that a major cycle is essential because a minor cycle doesn't suffice. As such, the only rational solution that meets the proposed objectives of the JEP is to consider major cycles as the criterion for the timer whose tripping triggers the major cycle.

I submit that, from this point of view, this PR actually corrects a small defect in the original implementation, and does so in a backward compatible fashion, so that any users dependent on the old, arguably defective, behavior are not surprised.

I think your stance, if I understood correctly, is that there are other things that we have learned since that need to be corrected too, but it looks like we are then letting the perfect be the enemy of the better by not allowing this correction to be made.

I may have misunderstood the long comment stream, so will go back and reread in case I missed something crucial about the objection to this PR in its current form.

@ysramakrishna
Member

Clearly ...

I may have misunderstood the long comment stream, so will go back and reread in case I missed something crucial about the objection to this PR in its current form.

I now see that you noted it might be better to separate this out into its own independent controller option:

This is the reason for the suggestion to phase out the current option G1PeriodicGCInterval to use a different name, and add this functionality (regular whole heap analysis to clean out cruft) as a new option (something like WholeHeapAnalysisInterval, or something more aligned with other collectors)

I could easily imagine the use case where one would want to act fairly quickly (and allow a disruptive full gc) on the detected idle case ("shortly" after detecting idle), and be more relaxed about general cleanup (long period, being least disruptive by using a concurrent mark).

I believe I now understand your objection. Thanks, and sorry for the noise :-)

@openjdk

openjdk bot commented Mar 13, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@bridgekeeper

bridgekeeper bot commented Apr 11, 2024

@shipilev This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper

bridgekeeper bot commented May 9, 2024

@shipilev This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this May 9, 2024