8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing #903

Closed
wants to merge 6 commits into from

stefank
Member

@stefank stefank commented Oct 28, 2020

This is an alternative version of the fix proposed in 900:
#900

Erik's description:

Today, when you crash, the GCLogPrecious::_lock is taken. This effectively limits you to only get clean crash reports if you crash or assert without holding a lock of rank tty or lower. It is arguably difficult to know what locks you are going to have when crashing. Therefore, I don't think the precious GC log should constrain possible crashing contexts in that fashion.

As Erik mentioned in that PR, I'd like to retain the ability to easily dump the precious log when debugging. The proposed fix changes the Mutex to a Semaphore and uses trywait to safely access the buffer. In the unlikely event that another thread is holding the lock, the hs_err printer skips printing the log.
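
The trywait idea described above can be illustrated with a minimal, self-contained sketch. It is not the HotSpot code: `BinarySemaphore`, `PreciousLog` and `print_on_error` are hypothetical stand-ins, and the semaphore is modelled with a plain `std::atomic<int>` rather than HotSpot's `Semaphore` class. The point is only that the crash-time printer never blocks: it either gets the buffer immediately or skips it.

```cpp
#include <atomic>
#include <sstream>
#include <string>

// Hypothetical stand-in for a binary semaphore; not HotSpot's Semaphore class.
class BinarySemaphore {
  std::atomic<int> _count;

public:
  explicit BinarySemaphore(int initial) : _count(initial) {}

  // Take the semaphore without blocking; returns false if it is unavailable.
  bool trywait() {
    int expected = 1;
    return _count.compare_exchange_strong(expected, 0);
  }

  // Release the semaphore.
  void signal() { _count.store(1); }
};

// Hypothetical log buffer guarded by the semaphore.
struct PreciousLog {
  BinarySemaphore lock{1};
  std::string lines;
};

// Simulated crash-time printer: prints the buffer, or skips it if the
// semaphore is currently held by someone else. It never waits.
void print_on_error(PreciousLog& log, std::ostream& out) {
  if (!log.lock.trywait()) {
    out << "<Skipped>\n";
    return;
  }
  out << log.lines;
  log.lock.signal();
}
```

Because the printer only ever tries to acquire, it cannot deadlock against a thread that crashed while writing to the log; the cost is that the log is occasionally missing from the report.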

This also makes it possible to call precious logging from within the stack watermark processing code. I think we might hit the following error logging if we fail to commit memory for a ZPage when relocating during stack watermark processing:
log_error_p(gc)("Failed to commit memory (%s)", err.to_string());


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Testing

                Linux additional   Linux x64         Linux x86         Windows x64       macOS x64
Build           ✔️ (8/8 passed)    ✔️ (2/2 passed)   ✔️ (2/2 passed)   ✔️ (2/2 passed)   ✔️ (2/2 passed)
Test (tier1)                       ✔️ (9/9 passed)   ❌ (2/9 failed)   ✔️ (9/9 passed)   ✔️ (9/9 passed)

Failed test tasks

Issue

  • JDK-8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/903/head:pull/903
$ git checkout pull/903

@bridgekeeper

bridgekeeper bot commented Oct 28, 2020

👋 Welcome back stefank! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 28, 2020
@openjdk

openjdk bot commented Oct 28, 2020

@stefank The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Oct 28, 2020
@mlbridge

mlbridge bot commented Oct 28, 2020

Webrevs

st->print_cr("%s", _lines->base());
}

_lock->signal();
Contributor

As we discussed off-line, perhaps we might want to print something even if the log isn't initialized and/or is empty. Something like:

  st->print_cr("GC Precious Log:");

  if (_lines == NULL) {
    st->print_cr("<Not initialized>");
    return;
  }

  if (!_lock->trywait()) {
    st->print_cr("<Skipped>");
    return;
  }

  if (_lines->size() == 0) {
    st->print_cr("<Empty>");
  } else {
    st->print_cr("%s", _lines->base());
  }

  _lock->signal();

You decide. Looks good otherwise.

Member Author

Updated with suggestion. I also added extra newlines to make the output look pretty. That was needed because _lines->base() is always terminated with a newline.

@openjdk

openjdk bot commented Oct 28, 2020

@stefank This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing

Reviewed-by: eosterlund

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 6 new commits pushed to the master branch:

  • bf66d73: 8257073: ZGC: Try forward object before retaining page
  • 1b3aa3a: 8256831: MIPS Zero builds fail with undefined __atomic_compare_exchange_8
  • 734d3c3: 8256862: Several java/foreign tests fail on x86_32 platforms
  • 7946c94: 8257082: ZGC: Clean up ZRuntimeWorkers and ZWorkers
  • f6d6a07: 8256938: Improve remembered set sampling task scheduling
  • b823ad9: 8257072: ZGC: Rename roots iterators

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 28, 2020
Contributor

@fisk fisk left a comment

Looks good.


stringStream* GCLogPrecious::_lines = NULL;
stringStream* GCLogPrecious::_temp = NULL;
-Mutex* GCLogPrecious::_lock = NULL;
+Semaphore* GCLogPrecious::_lock = NULL;
Member

Maybe rename _lock to _semaphore? Additionally, since it's a binary semaphore, new Semaphore(1), a comment explaining why Mutex is not suitable could avoid future confusion.

PS: not a review, just a comment in passing.

Member Author

It's used as a lock, so I think the name _lock is appropriate. Instead I introduced a new class: SemaphoreLock, to make the code more readable (IMHO). Also added a comment. Hopefully, this addressed your comments.

Member

Thank you; it does look more readable. BTW, mutex has try_lock as well.

Member Author

I see. Then I'll remove that part of the new comment.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org and removed hotspot-gc hotspot-gc-dev@openjdk.org labels Oct 29, 2020
@stefank
Member Author

stefank commented Oct 29, 2020

In the latest update I added two new helper classes: SemaphoreLock and SemaphoreLocker. I think this makes the code nicer. Since those classes are more broadly useful, I'll go ahead and split them out into a separate PR.
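
For readers unfamiliar with the pattern, here is a rough sketch of what a lock built on semaphore semantics plus a scoped locker might look like. Only the two class names come from the comment above; the method names (`lock`, `try_lock`, `unlock`) and the implementation on top of `std::mutex` and `std::condition_variable` are assumptions made for illustration, not the code from the PR.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical lock with counting-semaphore semantics (count starts at 1).
class SemaphoreLock {
  std::mutex _m;
  std::condition_variable _cv;
  int _count = 1;

public:
  // Block until the count is positive, then take it.
  void lock() {
    std::unique_lock<std::mutex> guard(_m);
    _cv.wait(guard, [this] { return _count > 0; });
    --_count;
  }

  // Take the count only if it is available right now.
  bool try_lock() {
    std::lock_guard<std::mutex> guard(_m);
    if (_count == 0) {
      return false;
    }
    --_count;
    return true;
  }

  // Give the count back and wake one waiter.
  void unlock() {
    {
      std::lock_guard<std::mutex> guard(_m);
      ++_count;
    }
    _cv.notify_one();
  }
};

// Scoped holder: acquires in the constructor, releases in the destructor.
class SemaphoreLocker {
  SemaphoreLock& _lock;

public:
  explicit SemaphoreLocker(SemaphoreLock& lock) : _lock(lock) { _lock.lock(); }
  ~SemaphoreLocker() { _lock.unlock(); }

  SemaphoreLocker(const SemaphoreLocker&) = delete;
  SemaphoreLocker& operator=(const SemaphoreLocker&) = delete;
};
```

The scoped class keeps ordinary logging paths from forgetting the release, while a crash-time path can call `try_lock` directly and bail out if it fails.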

@stefank
Member Author

stefank commented Oct 29, 2020

Forked off the SemaphoreLock part into #927

@pliden
Contributor

pliden commented Oct 29, 2020

Updates look good!

@openjdk

openjdk bot commented Nov 24, 2020

⚠️ @stefank This pull request contains merges that bring in commits not present in the target repository. Since this is not a "merge style" pull request, these changes will be squashed when this pull request is integrated. If this is your intention, then please ignore this message. If you want to preserve the commit structure, you must change the title of this pull request to Merge <project>:<branch> where <project> is the name of another project in the OpenJDK organization (for example Merge jdk:master).

@openjdk openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org and removed hotspot hotspot-dev@openjdk.org labels Nov 24, 2020
@stefank
Member Author

stefank commented Nov 24, 2020

I've thrown away the Semaphore implementation and replaced it with Patricio's new try_lock_without_rank_check function. I've also changed the lock rank to be as low as possible. We'll still get a lock rank ordering problem if we crash while holding this lock, because EventLog takes its lock and it is of the same rank. I intend to address that with JDK-8256382.
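
To make the rank discussion concrete, here is a toy model of rank-ordered locking. It is a sketch under stated assumptions, not HotSpot's `Mutex`: the class `RankedMutex`, the method `try_lock_skip_rank_check`, and the rule "a blocking acquire must have a strictly lower rank than every lock the thread already holds" are simplifications chosen for illustration, and the toy throws where a real VM would assert.

```cpp
#include <algorithm>
#include <iterator>
#include <mutex>
#include <stdexcept>
#include <vector>

// Toy rank-checked mutex; all names here are hypothetical.
class RankedMutex {
  std::mutex _m;
  const int _rank;

  // Ranks of the locks currently held by the calling thread.
  static std::vector<int>& held() {
    thread_local std::vector<int> ranks;
    return ranks;
  }

public:
  explicit RankedMutex(int rank) : _rank(rank) {}

  // Blocking acquire: enforce the ordering rule before waiting.
  void lock() {
    for (int r : held()) {
      if (_rank >= r) {
        throw std::logic_error("lock rank violation");
      }
    }
    _m.lock();
    held().push_back(_rank);
  }

  // Non-blocking acquire: skips the rank check and fails instead of waiting.
  bool try_lock_skip_rank_check() {
    if (!_m.try_lock()) {
      return false;
    }
    held().push_back(_rank);
    return true;
  }

  // Release: forget the most recent entry with this rank, then unlock.
  void unlock() {
    auto& ranks = held();
    auto it = std::find(ranks.rbegin(), ranks.rend(), _rank);
    if (it != ranks.rend()) {
      ranks.erase(std::next(it).base());
    }
    _m.unlock();
  }
};
```

In this model a try-style acquire can safely bypass the ordering rule because it never waits, so it cannot take part in a deadlock cycle. A blocking acquire of a second lock with the same rank still trips the check, which mirrors the same-rank EventLog problem mentioned above.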

Contributor

@pliden pliden left a comment

Update looks good.

Contributor

@fisk fisk left a comment

Looks good. Quite entertaining that the "leaf" rank is so far away from being a leaf level rank nowadays.

@stefank
Member Author

stefank commented Nov 24, 2020

> Looks good. Quite entertaining that the "leaf" rank is so far away from being a leaf level rank nowadays.

I had the exact same thought. :(

Thanks for reviewing @pliden and @fisk

@stefank
Member Author

stefank commented Dec 2, 2020

/integrate

@openjdk openjdk bot closed this Dec 2, 2020
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 2, 2020
@openjdk

openjdk bot commented Dec 2, 2020

@stefank Since your change was applied there have been 97 commits pushed to the master branch:

  • 1fd0ea7: 8256382: Use try_lock for hs_err EventLog printing
  • bff68f1: 8257533: legacy-jre-image includes jpackage and jlink tools
  • 9a60413: 8248736: [aarch64] runtime/signal/TestSigpoll.java failed "fatal error: not an ldr (literal) instruction."
  • e7ca0c4: 8257224: JDK-8251549 didn't update building.html
  • 7e37c7c: 8257471: fatal error: Fatal exception in JVMCI: Exception during JVMCI compiler initialization
  • 3e3745c: 8256008: UL does not report anything if disk writing fails
  • fb139cf: 8257467: [TESTBUG] -Wdeprecated-declarations is reported at sigset() in exesigtest.c
  • 9de283b: 8257505: nsk/share/test/StressOptions stressTime is scaled in getter but not when printed
  • 282cb32: 8005970: Mouse cursor is default cursor over TextArea's scrollbar
  • f2a0988: 8257228: G1: SIGFPE in G1ConcurrentRefine::create(int*) due to buffers_to_cards overflow
  • ... and 87 more: https://git.openjdk.java.net/jdk/compare/973255c469d794afe8ee74b24ddb5048bfcaadf7...master

Your commit was automatically rebased without conflicts.

Pushed as commit 287b829.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@stefank stefank deleted the 8254877_gclogprecious_locks branch January 13, 2021 09:10
Labels
hotspot-gc hotspot-gc-dev@openjdk.org integrated Pull request has been integrated
4 participants