8342444: Shenandoah: Uncommit regions from a separate, STS aware thread #22019
Conversation
👋 Welcome back wkemper! A progress list of the required criteria for merging this PR into
@earthling-amzn This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 37 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. ➡️ To integrate this PR with the above commit message to the
@earthling-amzn The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.
if (heap->is_bitmap_slice_committed(region)) {
ctx->clear_bitmap(region);
{
ShenandoahHeapLocker locker(heap->lock());
Was it a bug that the previous version of this code did not acquire the heap lock?
Is the lock required for the entirety of time that we are clearing the bitmap? Or is it just required to get a trustworthy check on is_bitmap_slice_committed()?
After reading more of this PR, I believe we need the heap lock to get a reliable signal of bitmap_slice_committed(). But I believe we do not need the heap lock for ctx->clear_bitmap(region) so would prefer to move that outside the lock, unless I am misunderstanding.
Hmm, I'm not sure we can do that. Prior to this change, the control thread performed both the clearing and the uncommitting of a region's bitmap, so the two could never happen concurrently. With this change, a separate thread could perform the uncommit. Consider:
- Control thread takes heap lock, observes that bitmap slice for region A is committed
- Control thread releases heap lock, begins clearing bitmap (writing zeros to bitmap slice)
- Uncommit thread takes heap lock, believes it must uncommit region A
- Uncommit thread uncommits bitmap slice for region A
- Segfault in Control Thread
I do believe a per-region lock would be useful here. Holding a lock over the entire heap for this feels like overkill. Or we could schedule the uncommit so that it does not occur during a GC cycle.
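To make the scenario concrete, here is a minimal sketch of keeping both the check and the clear under the heap lock, using only the names visible in the hunk above; whether the lock really has to cover the clear is exactly the open question here:

{
  // Sketch only: holding the heap lock across both the check and the clear
  // leaves no window in which the uncommit thread can release the bitmap slice.
  ShenandoahHeapLocker locker(heap->lock());
  if (heap->is_bitmap_slice_committed(region)) {
    ctx->clear_bitmap(region);
  }
}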
So, wait a sec. This code is in ShenandoahResetBitmapTask, so it can run in parallel. Putting a lock here inhibits parallelism. I understand the failure mode, but I think we should really be optimizing for the case when ShenandoahUncommit is not enabled (e.g. -Xmx == -Xms).
Sounds like there is a hassle in allowing concurrent uncommit to overlap with the GC cycle. In addition to this particular problem, we might be stealing cycles from the GC threads and take additional TTSP lag to park the uncommitter for the in-cycle GC pauses. I have no clear solution for this yet, but I think we need to explore if we can suspend the uncommit before going into GC cycle...
We could have the control and uncommit threads coordinate their efforts. In the worst case, it could mean delaying concurrent reset while the control thread waits for the uncommit thread to yield.
We could also try a more targeted lock only for the region's bitmap slice, but it doesn't seem right that one thread would be trying to clear a bitmap, while the other is trying to uncommit it. A lock could preserve technical correctness, but contention here would just mean that one thread would have wasted its time (either clearing a bitmap that is then uncommitted, or attempting to clear a bitmap that was first uncommitted (in this case, we would need the control thread to detect this and skip the region)).
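Purely as an illustration of the kind of coordination being discussed (plain C++ primitives rather than HotSpot's Monitor API; every name here is hypothetical), the control thread could block new uncommit work and wait for in-flight uncommits to drain before starting a cycle:

#include <condition_variable>
#include <mutex>

class UncommitCoordinator {
  std::mutex              _lock;
  std::condition_variable _cv;
  bool                    _gc_active = false;
  int                     _uncommits_in_flight = 0;

public:
  // Control thread: stop new uncommit work and wait for in-flight work to finish.
  void begin_gc_cycle() {
    std::unique_lock<std::mutex> l(_lock);
    _gc_active = true;
    _cv.wait(l, [this] { return _uncommits_in_flight == 0; });
  }

  void end_gc_cycle() {
    std::lock_guard<std::mutex> l(_lock);
    _gc_active = false;
    _cv.notify_all();
  }

  // Uncommit thread: only start a batch when no GC cycle is running.
  bool try_begin_uncommit() {
    std::lock_guard<std::mutex> l(_lock);
    if (_gc_active) return false;
    ++_uncommits_in_flight;
    return true;
  }

  void end_uncommit() {
    std::lock_guard<std::mutex> l(_lock);
    --_uncommits_in_flight;
    _cv.notify_all();
  }
};

With small uncommit batches, the wait in begin_gc_cycle() should stay short, which would also help with the TTSP concern above.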
Can we open a ticket to consider future improved concurrency by moving clear_bitmap(region) outside the global heap lock?
I modified the testing pipelines to set
Webrevs
Cursory review:
if (count > 0) {
_heap->notify_heap_changed();
double elapsed = os::elapsedTime() - start;
log_info(gc)("Uncommitted " SIZE_FORMAT " regions, in %.3fs", count, elapsed);
If we can, can we match the current log format? E.g. print Concurrent uncommit, with appropriately formatted timestamp? I think we also want log_info(gc, start) at the beginning of the method. I think the ShenandoahConcurrentPhase helper did all that, can we still use it?
We can restore the log messages, but I don't think ShenandoahConcurrentPhase and friends will like being used outside of a cycle. I'll look into it.
Yeah, at least restore the log format and add a gc+start log as well.
This one is still not addressed, unfortunately ^
Yes, I spent some time trying to resurrect ShenandoahConcurrentPhase for uncommit here, but it really doesn't want to be used outside of a GC cycle. Also, previously it was logging heap usage, which isn't quite what we want here (usage may actually increase during this phase, which makes it seem as though nothing is being uncommitted).
I've restored the original logging format, but instead of logging heap usage it now logs heap committed before and after. Here is an excerpt from specjbb2015 with -Xms5g -Xmx10g:
[2024-11-20T20:02:25.056+0000][97.396s][22293][info][gc,start ] Concurrent uncommit
[2024-11-20T20:02:25.072+0000][97.412s][22293][info][gc ] Concurrent uncommit 5424M->5120M(5120M) 15.988ms
[2024-11-20T20:05:17.916+0000][270.255s][22293][info][gc,start ] Concurrent uncommit
[2024-11-20T20:05:18.169+0000][270.508s][22293][info][gc ] Concurrent uncommit 10240M->5120M(5120M) 253.048ms
[2024-11-20T20:06:45.329+0000][357.668s][22293][info][gc,start ] Concurrent uncommit
[2024-11-20T20:06:45.596+0000][357.935s][22293][info][gc ] Concurrent uncommit 10240M->5120M(5120M) 267.144ms
[2024-11-20T20:06:57.147+0000][369.486s][22293][info][gc,start ] Concurrent uncommit
[2024-11-20T20:06:57.148+0000][369.487s][22293][info][gc ] Concurrent uncommit 5456M->5440M(5440M) 1.189ms
If we are emitting a log line that looks like a properly formatted GC log line, but the numbers there mean something else for Concurrent uncommit, we are bound to confuse users and automatic tools. Uncommit should affect capacity; this is how we know how deep we have uncommitted. So, I suggest we emit:
Concurrent uncommit XXXXM->XXXXM (YYYYM) z.zzzms
...where XXXX is the heap used at the end of uncommit (note before and after are the same) and YYYY is capacity. This will not expose users to thinking uncommit grows the heap usage, and it would give us an instantaneous view of heap usage and capacity.
Hmm, the numbers are preceded by Concurrent uncommit; with that context it's not much of a stretch to think these numbers represent the change in committed memory. The original log message (in which heap usage may increase during uncommit) was not helpful. A message with the same format in which heap usage also appears not to change at all during an uncommit is also perplexing. Are we trying too hard to preserve the original, not-very-useful message? Maybe we just want a new message that plainly says:
Concurrently uncommitted XXXXM in z.zzzms
or
Concurrent uncommit: time z.zzzms, committed before XXXXM, committed after YYYYM, capacity ZZZZM
Yes, I don't want to emit something that looks like a heap usage GC log line if it is not. Unfortunately, X->Y (Z) T.TTTTms is a common format in which X and Y are heap usage. I agree posting X == Y would be only marginally better. So, maybe this works as middle ground:
Concurrent uncommit XXXXM (YYYYM) z.zzzms
...where XXXX is the amount uncommitted and YYYY is the final heap capacity?
Okay, this looks good.
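For illustration only, a rough sketch of emitting that agreed format, assuming the heap exposes committed() and capacity() accessors and using the byte_size_in_proper_unit helpers; the final change may use different accessors and phase hooks:

// Sketch only: report the amount uncommitted and the final capacity, plus elapsed time.
double start = os::elapsedTime();
size_t committed_before = _heap->committed();
log_info(gc, start)("Concurrent uncommit");

// ... uncommit empty regions here ...

size_t uncommitted = committed_before - _heap->committed();
log_info(gc)("Concurrent uncommit " SIZE_FORMAT "%s (" SIZE_FORMAT "%s) %.3fms",
             byte_size_in_proper_unit(uncommitted), proper_unit_for_byte_size(uncommitted),
             byte_size_in_proper_unit(_heap->capacity()), proper_unit_for_byte_size(_heap->capacity()),
             (os::elapsedTime() - start) * 1000.0);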
I'm ok with this, but best to wait for @shipilev approval before integrating.
Have read through the latest version of the code. Thanks.
Looks good to me. A few documentation comment requests.
Also please share performance data in this PR or in the ticket, especially from the perf/benchmark that may have precipitated this change.
I like it, thanks!
I'll approve again, with the following nits:
Almost there, modulo restoring the logging.
I think the log message is still a bit confusing...
Looks good!
@ysramakrishna - I ran several iterations of specjbb2015 with different variations of the polling interval. Results show that 1/10th of
/integrate |
Going to push as commit bedb68a.
Your commit was automatically rebased without conflicts.
@earthling-amzn Pushed as commit bedb68a. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
Currently, Shenandoah uncommits regions from its control thread. The control thread is responsible for starting GC cycles in a timely fashion. Uncommitting memory from this thread may introduce unwanted delays in the control thread's response to GC pressure.
Progress
Issue
Reviewers
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22019/head:pull/22019
$ git checkout pull/22019
Update a local copy of the PR:
$ git checkout pull/22019
$ git pull https://git.openjdk.org/jdk.git pull/22019/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 22019
View PR using the GUI difftool:
$ git pr show -t 22019
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22019.diff
Using Webrev
Link to Webrev Comment