Skip to content

8226416: MonitorUsedDeflationThreshold can cause repeated async deflation requests #1993

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 21 commits into from

Conversation

dcubed-ojdk
Copy link
Member

@dcubed-ojdk dcubed-ojdk commented Jan 7, 2021

Adding support for a diagnostic option called NoAsyncDeflationProgressMax
with a default value of 3. If we have three async monitor deflation cycles in a
row with zero monitors deflated, then we adjust the in_use_list_ceiling up.

I've locally built and tested this fix on my MBP13 using the
MonitorUsedDeflationThresholdTest.java test that I wrote when this issue
first came up in June of 2019. I will be including this fix in my next Mach5
Tier[1-3] testing batch.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8226416: MonitorUsedDeflationThreshold can cause repeated async deflation requests

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/1993/head:pull/1993
$ git checkout pull/1993

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 7, 2021

👋 Welcome back dcubed! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jan 7, 2021

@dcubed-ojdk The following label will be automatically applied to this pull request:

  • hotspot-runtime

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-runtime hotspot-runtime-dev@openjdk.org label Jan 7, 2021
@dcubed-ojdk
Copy link
Member Author

Note: I'm testing out stacking patches in Git so there are commits for
https://bugs.openjdk.java.net/browse/JDK-8259397 and
https://bugs.openjdk.java.net/browse/JDK-8259349 that show up in this PR.

This is the correct commit for this PR review: af5c3fe

@dcubed-ojdk dcubed-ojdk marked this pull request as ready for review January 7, 2021 23:16
@openjdk openjdk bot added the rfr Pull request is ready for review label Jan 7, 2021
@mlbridge
Copy link

mlbridge bot commented Jan 7, 2021

Webrevs

@openjdk
Copy link

openjdk bot commented Jan 8, 2021

⚠️ @dcubed-ojdk This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request in integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@mlbridge
Copy link

mlbridge bot commented Jan 8, 2021

Mailing list message from David Holmes on hotspot-runtime-dev:

Hi Dan,

On 8/01/2021 9:20 am, Daniel D.Daugherty wrote:

On Thu, 7 Jan 2021 23:14:17 GMT, Daniel D. Daugherty <dcubed at openjdk.org> wrote:

Adding support for a diagnostic option called NoAsyncDeflationProgressMax
with a default value of 3. If we have three async monitor deflation cycles in a
row with zero monitors deflated, then we adjust the in_use_list_ceiling up.

I've locally built and tested this fix on my MBP13 using the
MonitorUsedDeflationThresholdTest.java test that I wrote when this issue
first came up in June of 2019. I will be including this fix in my next Mach5
Tier[1-3] testing batch.

Note: I'm testing out stacking patches in Git so there are commits for
https://bugs.openjdk.java.net/browse/JDK-8259397 and
https://bugs.openjdk.java.net/browse/JDK-8259349 that show up in this PR.

This is the correct commit for this PR review: https://github.com/openjdk/jdk/commit/af5c3fec96d1d45af5881947d366e22cb8962dc3

I'm not quite sure what you are trying to test out here. All three
commits are part of this PR and will be integrated when you integrate
this PR. If you wanted this to be handled separately then they should be
in distinct branches and PRs. If you want all three reviewed together
that is fine too.

Cheers,
David

@dcubed-ojdk
Copy link
Member Author

I was hoping that each subsequent PR would be baselined on the previous
branch so that each PR would only have a single commit in it. It didn't work
out that way so that's why I included this blurb in a comment:

This is the correct commit for this PR review: af5c3fe

@dcubed-ojdk
Copy link
Member Author

dcubed-ojdk commented Jan 12, 2021

I've updated this fix with similar changes that were requested in the
JDK-8259349 code review cycle. I've also taken the standalone test
created back in June 2019 and converted it into a JTREG style test
with automated checks for the proper passing conditions.

Any takers for a review?

Note: At this point, the other two fixes that preceded this one have
been integrated so only this fix shows up in the review links.

This fix passed Mach5 Tier[1-3] testing.

… info about no-progress async monitor deflation cycles and the NoAsyncDeflationProgressMax option.
Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Dan,
Thanks for the additional explanatory comments. The fix looks good to me.

A couple of comments on the test.

Thanks,
David

_no_progress_cnt >= NoAsyncDeflationProgressMax) {
float remainder = (100.0 - MonitorUsedDeflationThreshold) / 100.0;
size_t new_ceiling = ceiling + (ceiling * remainder) + 1;
size_t old_ceiling = ObjectSynchronizer::in_use_list_ceiling();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could capture:

size_t old_ceiling = ceiling;

back after line 1157, rather than re-reading it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return;
}

Object obj = new Object();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a thread-local object, which means the synchronization has no affect, so I think C2 could elide the following sync-block. Using a static field would make it less likely that this can happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it has to be a static array of Objects since we want a new
ObjectMonitor per recursion.

Comment on lines 116 to 118
if (too_many == null) {
throw new RuntimeException("Did not find too_many string in output.\n");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should call output_detail.reportDiagnosticSummary() before throwing the exception. That emulates what OutputAnalyzer.shouldContain would do. (or you could just use shouldContain).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a call to output_detail.reportDiagnosticSummary().
I didn't use OutputAnalyzer.shouldContain() intentionally
because of this output line:

System.out.println("too_many='" + too_many + "'");

which is there for diagnostic purposes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manually tested the use of output_detail.reportDiagnosticSummary()
by temporarily setting too_many = null; before the check. It caused
nice diagnostic output to be printed. I like this technique!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, when the test passes, the output looks like this:

[2021-01-13T16:12:47.673259Z] Gathering output for process 15104
Found beginning of a deflation cycle.
too_many='Too many deflations without progress; bumping in_use_list_ceiling from 36 to 40'
PASSED.
----------System.err:(1/15)----------
STATUS:Passed.
----------rerun:(34/5039)*----------

so there's just a bit of diagnostic output from the test in the passing case.

System.out.println("PASSED.");
return;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment e.g.
`// else we are the exec'd java subprocess, so run the actual test

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added.

@dcubed-ojdk
Copy link
Member Author

@dholmes-ora - Please re-review when you get the chance.

I could use a second reviewer...

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of minor comments.

Thanks,
David

@@ -1167,7 +1167,7 @@ static bool monitors_used_above_threshold(MonitorList* list) {
_no_progress_cnt >= NoAsyncDeflationProgressMax) {
float remainder = (100.0 - MonitorUsedDeflationThreshold) / 100.0;
size_t new_ceiling = ceiling + (ceiling * remainder) + 1;
size_t old_ceiling = ObjectSynchronizer::in_use_list_ceiling();
size_t old_ceiling = ceiling;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really what you want? ceiling may have been assigned again at line 1160.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misread your comment... Fixing it now...

@@ -45,6 +45,7 @@
public class MonitorUsedDeflationThresholdTest {
public static final int DELAY_SECS = 10;
public static int inflate_count = 0;
public static Object monitors[];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style nit: Object[] monitors;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhh.... fixing it.

@openjdk
Copy link

openjdk bot commented Jan 13, 2021

@dcubed-ojdk This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8226416: MonitorUsedDeflationThreshold can cause repeated async deflation requests

Reviewed-by: dholmes, coleenp

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 40 new commits pushed to the master branch:

  • 51e14f2: Merge
  • 5926d75: 8259719: ProblemList runtime/cds/appcds/jigsaw/modulepath/ModulePathAndCP_JFR.java on Windows
  • 8abefde: 8259720: ProblemList java/awt/Focus/AppletInitialFocusTest/AppletInitialFocusTest1.java on Windows
  • fb8ac24: 8259722: ProblemList two jdk/jfr/startupargs tests on Windows
  • ac4cd2e: 8231461: static/instance overload leads to 'unexpected static method found in unbound lookup' when resolving method reference
  • 42d2d6d: 8259063: Possible deadlock with vtable/itable creation vs concurrent class unloading
  • 6bb6093: 8259657: typo in generated HELP page prevents localization
  • 5567530: 8258272: LoadVectorMaskedNode can't be replaced by zero con
  • a99df45: 8259560: Zero m68k: "static assertion failed: align" after JDK-8252049
  • efc36be: 8258985: Parallel WeakProcessor may use too few threads
  • ... and 30 more: https://git.openjdk.java.net/jdk/compare/c338f1167f2ab0c1a35876419db914dce844d5e8...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jan 13, 2021
@dcubed-ojdk
Copy link
Member Author

@dholmes-ora - let's try this again.

Still need one more reviewer...

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! :)

Thanks,
David

Copy link
Contributor

@coleenp coleenp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.regex.Pattern;
*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these imports commented out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I should have deleted those.

@dcubed-ojdk
Copy link
Member Author

@dholmes-ora and @coleenp - Thanks for the reviews! I'll integrate
tomorrow when I can watch the CI...

@dcubed-ojdk
Copy link
Member Author

/integrate

@openjdk openjdk bot closed this Jan 14, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jan 14, 2021
@openjdk
Copy link

openjdk bot commented Jan 14, 2021

@dcubed-ojdk Since your change was applied there have been 48 commits pushed to the master branch:

  • c822eda: 8259699: Reduce char[] copying in URLEncoder.encode(String, Charset)
  • ff3e6e4: 8259773: Incorrect encoding of AVX-512 kmovq instruction
  • b8ef2ba: 8259563: The CPU model name is printed multiple times when using -Xlog:os+cpu
  • b040a3d: 8259631: Reapply pattern match instanceof use in HttpClientImpl
  • 3462f7a: 8256955: Move includes of events.hpp out of header files
  • 8b8b1f9: 8259706: C2 compilation fails with assert(vtable_index == Method::invalid_vtable_index) failed: correct sentinel value
  • ae9187d: 8256109: Create implementation for NSAccessibilityButton protocol
  • 5513f98: 8258010: Debug build failure with clang-10 due to -Wdeprecated-copy
  • 51e14f2: Merge
  • 5926d75: 8259719: ProblemList runtime/cds/appcds/jigsaw/modulepath/ModulePathAndCP_JFR.java on Windows
  • ... and 38 more: https://git.openjdk.java.net/jdk/compare/c338f1167f2ab0c1a35876419db914dce844d5e8...master

Your commit was automatically rebased without conflicts.

Pushed as commit be57cf1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dcubed-ojdk dcubed-ojdk deleted the JDK-8226416 branch January 14, 2021 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

3 participants