@shipilev
Member

@shipilev shipilev commented Sep 22, 2022

After the POSIX signal refactorings, Zero error handling "regressed" a bit: Zero always gets NULL as the pc in error handling code, and thus reports crashes as, for example, SEGV at pc=0x0. We can do better by decoding the signal context where possible.

Unfortunately, this introduces some arch-specific code into Zero. The arch-specific code is copy-pasted (with inline definitions where needed) from the relevant os_linux_*.cpp files. Arches without an implementation still report the same confusing hs_errs. We can emulate (and thus test) that generic behavior on an implemented arch using the new diagnostic VM option, DecodeErrorContext.

This reverts parts of JDK-8259392.
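
To make the idea of context decoding concrete, here is a minimal C++ sketch. It is not the code from this patch. It assumes a glibc-style Linux `ucontext_t`; the helper name `decode_pc` and its `decode` parameter (standing in for the DecodeErrorContext switch) are hypothetical. On arches the sketch does not know about, it returns NULL, which mirrors the generic behavior described above.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // needed for REG_RIP / REG_EIP on some libcs
#endif
#include <ucontext.h>
#include <cstdint>

// Hypothetical helper (an assumption for illustration, not HotSpot code):
// extracts the program counter from a Linux signal context when the
// architecture is known, and returns nullptr otherwise.
// `decode` stands in for the diagnostic switch; passing false forces the
// generic "pc unknown" path even on a supported architecture.
static void* decode_pc(const void* uc_void, bool decode) {
  if (uc_void == nullptr || !decode) {
    return nullptr; // generic path: the pc stays unknown
  }
  const ucontext_t* uc = static_cast<const ucontext_t*>(uc_void);
#if defined(__linux__) && defined(__x86_64__)
  return reinterpret_cast<void*>(
      static_cast<uintptr_t>(uc->uc_mcontext.gregs[REG_RIP]));
#elif defined(__linux__) && defined(__i386__)
  return reinterpret_cast<void*>(
      static_cast<uintptr_t>(uc->uc_mcontext.gregs[REG_EIP]));
#elif defined(__linux__) && defined(__aarch64__)
  return reinterpret_cast<void*>(
      static_cast<uintptr_t>(uc->uc_mcontext.pc));
#else
  (void)uc;
  return nullptr; // unimplemented arch: same as the generic path
#endif
}
```

A handler installed via `sigaction` with `SA_SIGINFO` receives the context as its third argument, which is what would be handed to a helper like this.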

Sample test:

import java.lang.reflect.*;
import sun.misc.Unsafe;

public class Crash {
  public static void main(String... args) throws Exception {
    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe u = (Unsafe) f.get(null);
    u.getInt(42); // accessing via broken ptr
  }
}

Linux x86_64 Zero fastdebug crash currently:

# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=538793, tid=538794
#
...
# (no native frame info)
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000002a

Linux x86_64 Zero fastdebug crash with this patch:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fbbbf08b584, pid=520119, tid=520120
#
...
# Problematic frame:
# V  [libjvm.so+0xcbe584]  Unsafe_GetInt+0xe4
....
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000002a

Linux x86_64 Zero fastdebug crash with this patch and -XX:-DecodeErrorContext:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000000000000, pid=520268, tid=520269
#
...
# Problematic frame:
# C  0x0000000000000000
...
siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x000000000000002a

Additional testing:

  • Linux x86_64 Zero fastdebug eyeballing crash logs
  • Linux x86_64 Zero fastdebug, tier1
  • Linux {x86_64, x86_32, aarch64, arm, riscv64, s390x, ppc64le, ppc64be} Zero fastdebug builds
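
The eyeballing step could be partly automated. The sketch below uses hypothetical helper names (`parse_hs_err_pc`, `pc_was_decoded`) and assumes only the `pc=0x...` field format visible in the sample logs above. It extracts the pc from an hs_err header line and reports whether it is non-zero.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: pulls the pc out of an hs_err header line such as
// "#  SIGSEGV (0xb) at pc=0x00007fbbbf08b584, pid=520119, tid=520120".
// Returns false if the line carries no usable "pc=0x..." field.
static bool parse_hs_err_pc(const std::string& line, uint64_t* pc_out) {
  const std::string key = "pc=0x";
  const std::size_t pos = line.find(key);
  if (pos == std::string::npos) {
    return false;
  }
  const std::size_t start = pos + key.size();
  const std::size_t end =
      line.find_first_not_of("0123456789abcdefABCDEF", start);
  const std::string hex = line.substr(
      start, end == std::string::npos ? std::string::npos : end - start);
  if (hex.empty() || hex.size() > 16) {
    return false; // no digits, or too wide for a 64-bit pc
  }
  *pc_out = std::stoull(hex, nullptr, 16);
  return true;
}

// True when the header line reports a non-zero pc, i.e. the signal
// context was decoded rather than left at the generic NULL.
static bool pc_was_decoded(const std::string& line) {
  uint64_t pc = 0;
  return parse_hs_err_pc(line, &pc) && pc != 0;
}
```

Fed the header lines from the logs above, such a check would separate the patched output from the pc=0x0 output without reading each file by hand.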

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8294211: Zero: Decode arch-specific error context if possible

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10397/head:pull/10397
$ git checkout pull/10397

Update a local copy of the PR:
$ git checkout pull/10397
$ git pull https://git.openjdk.org/jdk pull/10397/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 10397

View PR using the GUI difftool:
$ git pr show -t 10397

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10397.diff

@bridgekeeper

bridgekeeper bot commented Sep 22, 2022

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 22, 2022
@openjdk

openjdk bot commented Sep 22, 2022

@shipilev The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Sep 22, 2022
@mlbridge

mlbridge bot commented Sep 22, 2022

Webrevs

Member

@tstuefe tstuefe left a comment

Good! But why make this conditional with a switch? Who would not want to have better error information?

@shipilev
Member Author

Good! But why make this conditional with a switch? Who would not want to have better error information?

Because I want to be able to test the generic error handling paths that would run on "generic" arch, without leaving the comfort of my x86_64 machine :)

@tstuefe
Member

tstuefe commented Sep 22, 2022

Good! But why make this conditional with a switch? Who would not want to have better error information?

Because I want to be able to test the generic error handling paths that would run on "generic" arch, without leaving the comfort of my x86_64 machine :)

:-) Okay. Like zero-in-zero.

Member

@tstuefe tstuefe left a comment

LGTM.

BTW, if you want to test error handling, you don't have to go through Unsafe; you can just use -XX:ErrorHandlerTest=number, e.g. 15 should give you a segfault, I believe. It only works in debug VMs, though.

@openjdk

openjdk bot commented Sep 22, 2022

@shipilev This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8294211: Zero: Decode arch-specific error context if possible

Reviewed-by: stuefe, luhenry

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Sep 22, 2022
@shipilev
Member Author

I think this still works. Any other reviews, please?

@shipilev
Member Author

I think this still works. Any other reviews, please?

Ping. :)

@shipilev
Member Author

/integrate

@openjdk

openjdk bot commented Oct 19, 2022

Going to push as commit 3f3d63d.
Since your change was applied there have been 48 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 19, 2022
@openjdk openjdk bot closed this Oct 19, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Oct 19, 2022
@openjdk

openjdk bot commented Oct 19, 2022

@shipilev Pushed as commit 3f3d63d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@shipilev shipilev deleted the JDK-8294211-zero-error-context branch October 21, 2022 09:51

Labels

hotspot hotspot-dev@openjdk.org integrated Pull request has been integrated

3 participants