Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8291418: adjust monitor deflation logging and deflate_idle_monitors use #11293

Closed
wants to merge 3 commits into from

Conversation

dcubed-ojdk
Copy link
Member

@dcubed-ojdk dcubed-ojdk commented Nov 22, 2022

Trivial fixes to monitor deflation logging and deflate_idle_monitors use.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8291418: adjust monitor deflation logging and deflate_idle_monitors use

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11293/head:pull/11293
$ git checkout pull/11293

Update a local copy of the PR:
$ git checkout pull/11293
$ git pull https://git.openjdk.org/jdk pull/11293/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11293

View PR using the GUI difftool:
$ git pr show -t 11293

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11293.diff

@dcubed-ojdk
Copy link
Member Author

/label add hotspot-runtime

@dcubed-ojdk dcubed-ojdk marked this pull request as ready for review November 22, 2022 17:11
@bridgekeeper
Copy link

bridgekeeper bot commented Nov 22, 2022

👋 Welcome back dcubed! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added rfr Pull request is ready for review hotspot-runtime hotspot-runtime-dev@openjdk.org labels Nov 22, 2022
@openjdk
Copy link

openjdk bot commented Nov 22, 2022

@dcubed-ojdk
The hotspot-runtime label was successfully added.

@mlbridge
Copy link

mlbridge bot commented Nov 22, 2022

Webrevs

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable.

Just an observation on the pre-existing logic ... it is interesting that logging has a side-effect of deflating idle monitors. If someone was encountering a problem due to too many inflated monitors and turned on logging to try and show them why, the problem might disappear and the logging would be ineffective in aiding in tracking down the problem.

@openjdk
Copy link

openjdk bot commented Nov 23, 2022

@dcubed-ojdk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8291418: adjust monitor deflation logging and deflate_idle_monitors use

Reviewed-by: dholmes, stuefe, pchilanomate

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 53 new commits pushed to the master branch:

  • 6065696: 8297982: Exclude vmTestbase/nsk/monitoring/stress/lowmem/ with ZGC until 8297979 is fixed
  • 415cfd2: 8297285: Shenandoah pacing causes assertion failure during VM initialization
  • df07255: 8297984: Turn on warnings as errors for javadoc
  • 227364d: 8297953: Fix several C2 IR matching tests for RISC-V
  • 1370228: 8297941: Add override modifier in space.hpp
  • 319faa5: 8296084: javax/swing/JSpinner/4788637/bug4788637.java fails intermittently on a VM
  • b73363f: 8297686: JFR: Improve documentation of EventStream::onMetadata(Consumer)
  • 1376f33: 8297911: Memory leak in JfrUpcalls::on_retransform
  • 5c0ff26: 8291444: GHA builds/tests won't run manually if disabled from automatic running
  • 6d0fbb2: 8297645: Drop the test/jdk/java/net/httpclient/reactivestreams-tck-tests/TckDriver.java test
  • ... and 43 more: https://git.openjdk.org/jdk/compare/c68556916ae7cc0301f3c89c05148547dcb44278...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 23, 2022
Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

But this felt not trivial to me :) I tried to understand the preexisting code:

while (ObjectSynchronizer::deflate_idle_monitors(tablep) >= (size_t)MonitorDeflationMax) {

So, was the intent behind this "as long as deflate_idle_monitors() uses its full allotment of deflations, continue, but if you notice it slacking off, stop?" And the assumption is we always deflate n*MonitorDeflationMax + a smaller rest? And this assumption was wrong?

Copy link
Contributor

@pchilano pchilano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

if (deflated_count != 0 || log_is_enabled(Debug, monitorinflation)) {
ls->print_cr("deflated " SIZE_FORMAT " monitors in %3.7f secs",
deflated_count, timer.seconds());
if (deflated_count != 0 || unlinked_count != 0 || deleted_count != 0 || log_is_enabled(Debug, monitorinflation)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why check both unlinked_count and deleted_count? Shouldn't they be equal if we got here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If memory serves, it's possible for the final safepoint to catch
the MonitorDeflation thread while it's working in here at different
phases of the deflate_idle_monitors() work. When the VM Thread
comes in here for the "final audit" it can find different amounts of
work to do in the different phases because the MonitorDeflation
thread might have deflated N monitors, but it wasn't able to unlink
them all. When the VM Thread does its pass, it may deflate fewer
monitors (M), but may find N+M monitors need to be unlinked and
deleted.

Similarly, I think it's possible for the MonitorDeflation thread to
deflate N monitors and unlink N monitors, but then block for the
final safepoint. When the VMThread comes along, it may delete
fewer monitors(M) and unlink fewer monitors(M), but may find
N+M monitors need to be deleted. If the VM Thread's M value is
zero, then it will have deflated_count == 0, unlinked_count == 0
and deleted_count != 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't the delete_list local to the MonitorDeflation thread? How can the VMThread access those unlinked but not deleted monitors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right about delete_list being local so once the
MonitorDeflation thread has successfully unlinked a deflated
ObjectMonitor and put it on the local delete_list, if the
MonitorDeflation thread goes to a final safepoint, then any
ObjectMonitors on the delete_list won't be freed.

It also means that unlinked_count == deleted_count in any
thread that reaches L1551. If I change that if-statement, I think
I'd want to add an assert() there also.

@dcubed-ojdk
Copy link
Member Author

@dholmes-ora, @tstuefe and @pchilano - Thanks for the reviews!

Sorry for the delay in getting back to this review. I had an over abundance of
CI/GK work to do before the holiday break and I just finished getting caught
up after the holiday break.

@dcubed-ojdk
Copy link
Member Author

dcubed-ojdk commented Nov 29, 2022

@dholmes-ora:

This seems reasonable.

Thanks!

Just an observation on the pre-existing logic ... it is interesting that logging has a side-effect of deflating idle monitors. If someone was encountering a problem due to too many inflated monitors and turned on logging to try and show them why, the problem might disappear and the logging would be ineffective in aiding in tracking down the problem.

I suppose that is a possibility if, for some reason, the MonitorDeflation thread was not
running in a timely fashion (too long of a delay, missed notification, stuck thread).
However, most cases of too many inflated monitors end up being because the monitors
didn't meet the requirements for being deflated (non-zero waiter count being the most
likely).

@dcubed-ojdk
Copy link
Member Author

dcubed-ojdk commented Nov 29, 2022

@tstuefe:

Looks good.

Thanks!

But this felt not trivial to me :)

Sorry about that. I suspect that I have lived with this code for too long...
I tried to figure out how to update the PR description to put a line thru
the word "Trivial", but I don't see that option.

I tried to understand the preexisting code:

while (ObjectSynchronizer::deflate_idle_monitors(tablep) >= (size_t)MonitorDeflationMax) {

So, was the intent behind this "as long as deflate_idle_monitors() uses its full allotment of deflations, continue, but if you notice it slacking off, stop?" And the assumption is we always deflate n*MonitorDeflationMax + a smaller rest? And this assumption was wrong?

My original intent was "loop here until we have nothing left to deflate".
I implemented it wrong and we stopped if there was a partial
deflation, i.e., less than MonitorDeflationMax deflated.
However, in some cases, we would see multiple partial deflations
in a row before there is nothing left to deflate so I changed the loop
to stop when we did a complete pass and deflated nothing.

@dcubed-ojdk
Copy link
Member Author

@pchilano:

Looks good to me.

Thanks.

@dcubed-ojdk
Copy link
Member Author

@dholmes-ora, @tstuefe and @pchilano -

I think I've replied to all the review comments. Please let me know if
you are okay with the replies.

@tstuefe
Copy link
Member

tstuefe commented Nov 30, 2022

All good. Thanks for answering my question.

@dcubed-ojdk
Copy link
Member Author

Logging output with the latest commit looks like this:

[0.040s][info][monitorinflation] Starting the final audit.
[0.040s][info][monitorinflation] begin deflating: in_use_list stats: ceiling=10240, count=2, max=2
[0.040s][info][monitorinflation] deflated_count=1, {unlinked,deleted}_count=1 monitors in 0.0000017 secs
[0.040s][info][monitorinflation] end deflating: in_use_list stats: ceiling=10240, count=1, max=2

Copy link
Contributor

@pchilano pchilano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Dan, latest changes look good.

@dcubed-ojdk
Copy link
Member Author

@pchilano - Thanks for the re-review.

@dcubed-ojdk
Copy link
Member Author

Local testing of this patch after rebasing to the jdk-20+26 snapshot looks good.

/integrate

@openjdk
Copy link

openjdk bot commented Dec 2, 2022

Going to push as commit 6065516.
Since your change was applied there have been 58 commits pushed to the master branch:

  • 6e54705: 8295424: adjust timeout for another JLI GetObjectSizeIntrinsicsTest.java subtest
  • 562bc17: 8291359: Specification of method j.l.foreign.VaList::skip still deserves clarification
  • 2821fa9: 8280798: com.sun.jdi.ObjectReference::setValue spec should prohibit any final field modification
  • fb6fd03: 8291830: jvmti/RedefineClasses/StressRedefine failed: assert(!is_null(v)) failed: narrow klass value can never be zero
  • 1b92465: 8297608: JFR: Incorrect duration after chunk rotation
  • 6065696: 8297982: Exclude vmTestbase/nsk/monitoring/stress/lowmem/ with ZGC until 8297979 is fixed
  • 415cfd2: 8297285: Shenandoah pacing causes assertion failure during VM initialization
  • df07255: 8297984: Turn on warnings as errors for javadoc
  • 227364d: 8297953: Fix several C2 IR matching tests for RISC-V
  • 1370228: 8297941: Add override modifier in space.hpp
  • ... and 48 more: https://git.openjdk.org/jdk/compare/c68556916ae7cc0301f3c89c05148547dcb44278...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 2, 2022
@openjdk openjdk bot closed this Dec 2, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 2, 2022
@openjdk
Copy link

openjdk bot commented Dec 2, 2022

@dcubed-ojdk Pushed as commit 6065516.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dcubed-ojdk dcubed-ojdk deleted the JDK-8291418 branch December 8, 2022 17:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants