Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8259383: AsyncGetCallTrace() crashes sporadically #2032

Closed
wants to merge 1 commit into from

Conversation

@RealLucy
Copy link
Contributor

@RealLucy RealLucy commented Jan 11, 2021

Hi,
may I please ask the community to review this small fix? It closes another hole in AsyncGetCallTrace().
Thanks a lot!
Lutz


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8259383: AsyncGetCallTrace() crashes sporadically

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/2032/head:pull/2032
$ git checkout pull/2032

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Jan 11, 2021

👋 Welcome back lucy! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Loading

@openjdk openjdk bot added the rfr label Jan 11, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Jan 11, 2021

@RealLucy The following labels will be automatically applied to this pull request:

  • hotspot
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Jan 11, 2021

Webrevs

Loading

Copy link
Contributor

@plummercj plummercj left a comment

The changes look ok to me. I think it would be good to get someone from the compiler team to verify your reasoning here.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jan 19, 2021

@RealLucy This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8259383: AsyncGetCallTrace() crashes sporadically

Reviewed-by: cjplummer

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 198 new commits pushed to the master branch:

  • a97f3c1: 8258853: Support separate function declaration and definition with ENABLE_IF-based SFINAE
  • 154e1d6: 8259009: G1 heap summary should be shown in "Heap Parameters" window on HSDB
  • acbcde8: 8256111: Create implementation for NSAccessibilityStaticText protocol
  • f928265: 8260009: InstanceKlass::has_as_permitted_subclass() fails if subclass was redefined
  • 7ed8ba1: 8256814: WeakProcessorPhases may be redundant
  • bfac3fb: 8260212: Shenandoah: resolve-only UpdateRefsMode is not used
  • 58ceb25: 8259842: Remove Result cache from StringCoding
  • d066f2b: 8260030: Improve stringStream buffer handling
  • 1452280: 8164484: Unity, JTable cell editor, javax/swing/JComboBox/6559152/bug6559152.java
  • a70acf2: 8259928: compiler/jvmci tests fail with -Xint
  • ... and 188 more: https://git.openjdk.java.net/jdk/compare/33fbc10cb8c869984159230b55ab9f3c9ca039ec...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Loading

@openjdk openjdk bot added the ready label Jan 19, 2021
@RealLucy
Copy link
Contributor Author

@RealLucy RealLucy commented Jan 19, 2021

/hotspot-compiler-dev

Loading

@RealCLanger
Copy link
Contributor

@RealCLanger RealCLanger commented Jan 19, 2021

/cc hotspot-compiler-dev

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Jan 19, 2021

@RealCLanger
The hotspot-compiler label was successfully added.

Loading

@apangin
Copy link

@apangin apangin commented Jan 20, 2021

Hi Lutz,

Thank you for working on this issue.
I'm not a formal OpenJDK Reviewer, but as an author of async-profiler project (which heavily relies on AsyncGetCallTrace), I'm more or less familiar with this code. Let me leave a few comments.

  1. The statement if (candidate.pc() != NULL) return false; looks a bit odd to me. When a profiler calls AsyncGetCallTrace, pc of the initial frame is obtained from a signal context, and it is never NULL. Returning false when pc != NULL basically invalidates the entire if (fr->cb() == NULL) branch. If this was intended, the branch could be simplified.
  2. At the same time, returning false whenever cb == NULL, discards a fair amount of valid stack traces, e.g. when a runtime function is called from Java without switching to in_vm state. I guess, the original intention of that loop was to handle such cases, but, as you noticed, it does not currently work well because of the assertions. I'd rather prefer revising the logic of that loop then instead of dropping all those valid samples.
  3. What I don't really understand from the bug report is why if (fr->cb() == NULL) branch is executed at all in this particular example. According to the stack trace from the crash dump, the top frame (before a signal handler) is ThreadCritical::~ThreadCritical(). This means, a thread is in_vm state, and its last_Java_frame should have been set. In this case AsyncGetCallTrace prefers to take last_Java_frame, and thus cb should be non-NULL. Another suspicious thing is that a frame below ThreadCritical::~ThreadCritical() is not a C frame. This cannot normally happen (the execution is in the middle of ThreadCritical::~ThreadCritical(), where the stack frame is consistent). These facts make me think something bad has already happened with the stack earlier, and the failed guarantee is probably just one manifestation of a bigger problem.

Loading

@RealLucy RealLucy marked this pull request as draft Jan 22, 2021
@openjdk openjdk bot removed the rfr label Jan 22, 2021
@RealLucy
Copy link
Contributor Author

@RealLucy RealLucy commented Jan 22, 2021

Spent a lot fo time and many thoughts. Finally, I found out it was all for nothing. The failure doesn't occur at the location I was sure it would. Will invest more time and more thoughts... I'll be back!

Loading

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Feb 19, 2021

@RealLucy This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

Loading

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Apr 21, 2021

@RealLucy This pull request has been inactive for more than 16 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

Loading

@bridgekeeper bridgekeeper bot closed this Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
4 participants