8263579: ZGC: Concurrent mark hangs with debug loglevel #3011

Closed
wants to merge 1 commit into from

casparcwang
Contributor

@casparcwang casparcwang commented Mar 15, 2021

Do not leave terminate stage 1 when there is nothing to do. This helps eliminate the concurrent mark hang seen at debug log level, and it reduces total mark time without affecting performance.

The following is SPECjbb2015 performance data with ZGC under stress:
Current patch:
RUN RESULT: hbIR (max attempted) = 122581, hbIR (settled) = 102168, max-jOPS = 104194, critical-jOPS = 90630
[2021-03-12T17:20:37.072+0800][info ][gc,stats ] Phase: Concurrent Mark 383.026 / 383.026 1222.317 / 1811.813 766.530 / 1811.813 766.530 / 1811.813 ms

RUN RESULT: hbIR (max attempted) = 118234, hbIR (settled) = 108389, max-jOPS = 102864, critical-jOPS = 93572
[2021-03-12T19:28:01.032+0800][info ][gc,stats ] Phase: Concurrent Mark 407.483 / 407.483 1243.775 / 1773.463 756.956 / 1773.463 756.956 / 1773.463 ms

RUN RESULT: hbIR (max attempted) = 111999, hbIR (settled) = 110683, max-jOPS = 104159, critical-jOPS = 92600
[2021-03-12T21:48:09.729+0800][info ][gc,stats ] Phase: Concurrent Mark 412.954 / 412.954 1216.900 / 1927.552 762.315 / 1927.552 762.315 / 1927.552 ms

Original:
RUN RESULT: hbIR (max attempted) = 122581, hbIR (settled) = 109863, max-jOPS = 102968, critical-jOPS = 90836
[2021-03-13T01:59:35.160+0800][info ][gc,stats ] Phase: Concurrent Mark 414.845 / 414.845 1168.357 / 1817.015 795.806 / 1837.452 795.806 / 1837.452 ms

RUN RESULT: hbIR (max attempted) = 122581, hbIR (settled) = 102168, max-jOPS = 102968, critical-jOPS = 89227
[2021-03-12T23:49:17.322+0800][info ][gc,stats ] Phase: Concurrent Mark 405.709 / 405.709 1250.672 / 1724.725 783.993 / 1739.099 783.993 / 1739.099 ms

RUN RESULT: hbIR (max attempted) = 102168, hbIR (settled) = 85156, max-jOPS = 104211, critical-jOPS = 91693
[2021-03-13T04:22:14.338+0800][info ][gc,stats ] Phase: Concurrent Mark 415.444 / 415.444 1254.256 / 1896.037 768.536 / 1896.037 768.536 / 1896.037 ms
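
As a rough cross-check of the figures above, the sketch below averages the three runs for each configuration. It assumes the last `avg / max` pair on each `Concurrent Mark` line is the whole-run total, and it takes the first number of that pair as the average mark time in milliseconds. The variable and function names are illustrative only. Three runs per configuration is a small sample, so the result is indicative at best.

```python
from statistics import mean

# Values copied from the runs above: (max-jOPS, critical-jOPS, concurrent mark ms).
# Assumption: the mark figure is the first value of the last "avg / max" pair.
patched = [
    (104194, 90630, 766.530),
    (102864, 93572, 756.956),
    (104159, 92600, 762.315),
]
original = [
    (102968, 90836, 795.806),
    (102968, 89227, 783.993),
    (104211, 91693, 768.536),
]


def summarize(runs):
    """Return the per-metric mean over a list of runs."""
    max_jops, critical_jops, mark_ms = zip(*runs)
    return {
        "max_jops": mean(max_jops),
        "critical_jops": mean(critical_jops),
        "mark_ms": mean(mark_ms),
    }


patched_summary = summarize(patched)
original_summary = summarize(original)

# Print each mean and the relative change of the patched build vs. the original.
for key in patched_summary:
    before = original_summary[key]
    after = patched_summary[key]
    change = (after - before) / before * 100
    print(f"{key}: original={before:.3f} patched={after:.3f} change={change:+.2f}%")
```

Under that reading of the columns, the patched runs show a slightly lower mean concurrent mark time and no drop in mean max-jOPS or critical-jOPS.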


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8263579: ZGC: Concurrent mark hangs with debug loglevel

Reviewers

Download

To check out this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3011/head:pull/3011
$ git checkout pull/3011

To update a local copy of the PR:
$ git checkout pull/3011
$ git pull https://git.openjdk.java.net/jdk pull/3011/head

@bridgekeeper

bridgekeeper bot commented Mar 15, 2021

👋 Welcome back casparcwang! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label (Pull request is ready for review) Mar 15, 2021
@openjdk

openjdk bot commented Mar 15, 2021

@casparcwang The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc label (hotspot-gc-dev@openjdk.org) Mar 15, 2021
@mlbridge

mlbridge bot commented Mar 15, 2021

Webrevs

@pliden
Contributor

pliden commented Mar 16, 2021

Thanks for reporting and proposing a fix. I would suggest that we simply remove the ZSubPhaseConcurrentMarkIdle stat timer completely, since that timer isn't that useful anyway.

@casparcwang
Contributor Author

> Thanks for reporting and proposing a fix. I would suggest that we simply remove the ZSubPhaseConcurrentMarkIdle stat timer completely, since that timer isn't that useful anyway.

Thanks for your review, @pliden. I have updated the patch as you suggested, and removing ZSubPhaseConcurrentMarkIdle solves the problem in the test environment.

Destroying the ZSubPhaseConcurrentMarkIdle and ZSubPhaseConcurrentMarkTryTerminate timers produces heavy lock contention, which results in the hang. Removing only ZSubPhaseConcurrentMarkIdle reduces contention on the GC log lock, but ZSubPhaseConcurrentMarkTryTerminate still contends for that lock after _terminate.try_exit_stage0(), which may cause a JVM hang or a long concurrent mark time in some unexplored scenario.
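
To illustrate the contention described above, here is a minimal Python model, not HotSpot code. It assumes each stat timer writes one debug line when its scope ends and that all writers share a single log lock. `LogSink`, `ScopedStatTimer` and `count_log_calls` are hypothetical names. The model only counts lock acquisitions and does not reproduce the hang itself.

```python
import threading
import time


class LogSink:
    """Stand-in for a logging backend that serializes all writers with one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.lines = 0

    def debug(self, message):
        # Every debug line takes the shared lock; this is the contended resource.
        with self._lock:
            self.lines += 1


class ScopedStatTimer:
    """Hypothetical scoped timer that reports its elapsed time when the scope ends."""

    def __init__(self, name, sink):
        self._name = name
        self._sink = sink

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        elapsed = time.perf_counter() - self._start
        self._sink.debug(f"{self._name}: {elapsed:.6f}s")
        return False


def idle_worker(sink, attempts, time_idle):
    """Simulate a worker with no marking work that keeps retrying termination."""
    for _ in range(attempts):
        with ScopedStatTimer("TryTerminate", sink):
            if time_idle:
                with ScopedStatTimer("Idle", sink):
                    pass  # nothing to do, yet leaving the scope still logs


def count_log_calls(workers, attempts, time_idle):
    """Run the workers concurrently and return how often the log lock was taken."""
    sink = LogSink()
    threads = [
        threading.Thread(target=idle_worker, args=(sink, attempts, time_idle))
        for _ in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sink.lines


print("with idle timer:   ", count_log_calls(4, 1000, time_idle=True))
print("without idle timer:", count_log_calls(4, 1000, time_idle=False))
```

With 4 workers and 1000 termination attempts each, dropping the inner timer halves the lock acquisitions from 8000 to 4000. Every attempt still takes the lock once, which corresponds to the residual contention noted above.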

Contributor

@pliden pliden left a comment

Looks good!

Contributor

@fisk fisk left a comment

Looks good.

@casparcwang
Contributor Author

/integrate

@openjdk

openjdk bot commented Mar 22, 2021

@casparcwang This PR has not yet been marked as ready for integration.

@casparcwang casparcwang changed the title from "JDK-8263579: ZGC concurrent mark hangs with debug loglevel" to "ZGC: Concurrent mark hangs with debug loglevel" Mar 22, 2021
@openjdk openjdk bot removed the rfr label (Pull request is ready for review) Mar 22, 2021
@casparcwang
Contributor Author

/issue add JDK-8263579

@openjdk openjdk bot changed the title from "ZGC: Concurrent mark hangs with debug loglevel" to "8263579: ZGC: Concurrent mark hangs with debug loglevel" Mar 22, 2021
@openjdk

openjdk bot commented Mar 22, 2021

@casparcwang The primary solved issue for a PR is set through the PR title. Since the current title does not contain an issue reference, it will now be updated.

@openjdk

openjdk bot commented Mar 22, 2021

⚠️ @casparcwang the full name on your profile does not match the author name in this pull request's HEAD commit. If this pull request gets integrated then the author name from this pull request's HEAD commit will be used for the resulting commit. If you wish to push a new commit with a different author name, then please run the following commands in a local repository of your personal fork:

$ git checkout JDK-8263579
$ git -c user.name='Preferred Full Name' commit --allow-empty -m 'Update full name'
$ git push

@openjdk

openjdk bot commented Mar 22, 2021

@casparcwang This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8263579: ZGC: Concurrent mark hangs with debug loglevel

Reviewed-by: pliden, ayang, eosterlund

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 350 new commits pushed to the master branch:

  • 35cd945: 8263908: Build fails due to initialize_static_field_for_dump defined but not used after JDK-8263771
  • cd45538: 8263771: Refactor javaClasses initialization code to isolate dumping code
  • 118a49f: 8263846: Bad JNI lookup getFocusOwner in accessibility code on Mac OS X
  • cb742f9: 8255255: Update Apache Santuario (XML Signature) to version 2.2.1
  • d2c137d: 8263558: Possible NULL dereference in fast path arena free if ZapResourceArea is true
  • ab66d69: 8263138: Initialization of sun.font.SunFontManager.platformFontMap is not thread safe
  • 5b8233b: 8263871: On sem_destroy() failing we should assert
  • 96e5c3f: 8263890: Broken links to Unicode.org
  • 4d9517d: 8263834: Work around gdb for HashtableEntry
  • 6fa6557: 8263825: Remove unused and commented out member from NTLMException
  • ... and 340 more: https://git.openjdk.java.net/jdk/compare/3e13b66e3fd878889cb22e9bf4c79b41a11e36a5...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@pliden, @albertnetymk, @fisk) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Mar 22, 2021
@casparcwang
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor label (Pull request is ready to be sponsored) Mar 22, 2021
@openjdk

openjdk bot commented Mar 22, 2021

@casparcwang
Your change (at version e2280fd) is now ready to be sponsored by a Committer.

@DamonFool
Member

/sponsor

@openjdk openjdk bot closed this Mar 22, 2021
@openjdk openjdk bot added the integrated label (Pull request has been integrated) and removed the sponsor label (Pull request is ready to be sponsored) Mar 22, 2021
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Mar 22, 2021
@openjdk

openjdk bot commented Mar 22, 2021

@DamonFool @casparcwang Since your change was applied there have been 350 commits pushed to the master branch. They are the same commits listed in the pre-integration comment above.

Your commit was automatically rebased without conflicts.

Pushed as commit 5a7f22a.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
hotspot-gc (hotspot-gc-dev@openjdk.org), integrated (Pull request has been integrated)
5 participants