8310228: Improve error reporting for uncaught native exceptions on Windows #510

Closed
wants to merge 4 commits into from

Conversation

GoeLin
Member

@GoeLin GoeLin commented Apr 19, 2024

A bugfix useful for Windows.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • JDK-8310228 needs maintainer approval

Issue

  • JDK-8310228: Improve error reporting for uncaught native exceptions on Windows (Enhancement - P4 - Approved)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk21u-dev.git pull/510/head:pull/510
$ git checkout pull/510

Update a local copy of the PR:
$ git checkout pull/510
$ git pull https://git.openjdk.org/jdk21u-dev.git pull/510/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 510

View PR using the GUI difftool:
$ git pr show -t 510

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk21u-dev/pull/510.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Apr 19, 2024

👋 Welcome back goetz! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Apr 19, 2024

@GoeLin This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8310228: Improve error reporting for uncaught native exceptions on Windows

Reviewed-by: stuefe

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 13 new commits pushed to the master branch:

  • 7a400f2: 8309890: TestStringDeduplicationInterned.java waits for the wrong condition
  • 16ba673: 8331639: [21u]: Bump GHA bootstrap JDK to 21.0.3
  • 2b858f5: 8328938: C2 SuperWord: disable vectorization for large stride and scale
  • 9159882: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking
  • 3ff5359: 8330011: [s390x] update block-comments to make code consistent
  • abbad92: 8326201: [S390] Need to bailout cleanly if creation of stubs fails when code cache is out of space
  • 3770c28: 8331331: :tier1 target explanation in doc/testing.md is incorrect
  • 021372c: 8328703: Illegal accesses in Java_jdk_internal_org_jline_terminal_impl_jna_linux_CLibraryImpl_ioctl0
  • d459ae9: 8329850: [AIX] Allow loading of different members of same shared library archive
  • 835d016: 8330094: RISC-V: Save and restore FRM in the call stub
  • ... and 3 more: https://git.openjdk.org/jdk21u-dev/compare/3892078094735be9d8074d23ce3d70201cd60445...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot changed the title Backport 38bf1192b637cf3513cb25ac21f513bfb51cb55b 8310228: Improve error reporting for uncaught native exceptions on Windows Apr 19, 2024
@openjdk

openjdk bot commented Apr 19, 2024

This backport pull request has now been updated with issue from the original commit.

@openjdk

openjdk bot commented Apr 19, 2024

⚠️ @GoeLin This change is now ready for you to apply for maintainer approval. This can be done directly in each associated issue or by using the /approval command.

@openjdk openjdk bot added the rfr Pull request is ready for review label Apr 19, 2024
@mlbridge

mlbridge bot commented Apr 19, 2024

Webrevs

@Rigner

Rigner commented Apr 20, 2024

Hey, sorry if it's not the right place for this, but I felt like I could add some more information to this bug.

I've been having this issue locally when trying to migrate some native code to Java 21+. I can even reproduce it on Java 22 / 23 when native code is throwing an Access Violation (C0000005).

Do we have confirmation that the issue was properly fixed for all cases with that commit?

Here's my WinDbg output on Java 23 (same for 21/22):

ModLoad: 00007ff8`5fcf0000 00007ff8`5fd5a000   C:\Windows\system32\mswsock.dll
ModLoad: 00007ff8`5f9b0000 00007ff8`5f9eb000   C:\Windows\SYSTEM32\iphlpapi.dll
ModLoad: 00007ff8`5ff50000 00007ff8`5ff68000   C:\Windows\SYSTEM32\CRYPTSP.dll
ModLoad: 00007ff8`5f5d0000 00007ff8`5f604000   C:\Windows\system32\rsaenh.dll
ModLoad: 00007ff8`309c0000 00007ff8`309de000   C:\Program Files\Java\jdk-23\bin\java.dll
ModLoad: 00007ff8`61980000 00007ff8`61aab000   C:\Windows\System32\ole32.dll
ModLoad: 00007ff8`61d70000 00007ff8`620c3000   C:\Windows\System32\combase.dll
(f61c.10030): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
000001f4`0d33094e 8b06            mov     eax,dword ptr [rsi] ds:00000000`00000000=????????
0:004> g
ModLoad: 00007ff8`5e630000 00007ff8`5edce000   C:\Windows\SYSTEM32\windows.storage.dll
ModLoad: 00007ff8`5ff10000 00007ff8`5ff3e000   C:\Windows\SYSTEM32\Wldp.dll
ModLoad: 00007ff8`610c0000 00007ff8`6118d000   C:\Windows\System32\OLEAUT32.dll
ModLoad: 00007ff8`61510000 00007ff8`615bd000   C:\Windows\System32\SHCORE.dll
ModLoad: 00007ff8`625e0000 00007ff8`62635000   C:\Windows\System32\shlwapi.dll
ModLoad: 00007ff8`604f0000 00007ff8`60515000   C:\Windows\SYSTEM32\profapi.dll
ModLoad: 00007fff`f4350000 00007fff`f4427000   C:\Program Files\Java\jdk-23\bin\jsvml.dll
(f61c.10030): Stack overflow - code c00000fd (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
jvm!verify+0x89e57:
00007fff`845598b7 e8b4b30d00      call    jvm!AsyncGetCallTrace+0xcf5e0 (00007fff`84634c70)

That native code is working fine for Java 8 / 17 btw.
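
For readers skimming the log above, here is a minimal sketch of how the two exception lines could be decoded. The class and method names are hypothetical, the line pattern is an assumption based only on the two lines shown in this comment, and the table covers just the two NTSTATUS codes that appear there.

```java
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK or of WinDbg.
public class WinDbgExceptionLine {
    // The two NTSTATUS codes that occur in the log above.
    private static final Map<Long, String> KNOWN = Map.of(
            0xC0000005L, "STATUS_ACCESS_VIOLATION",
            0xC00000FDL, "STATUS_STACK_OVERFLOW");

    // Assumed line shape: "(pid.tid): <description> - code <8 hex digits> (..."
    private static final Pattern LINE = Pattern.compile(
            "^\\([0-9a-fA-F]+\\.[0-9a-fA-F]+\\): .+ - code ([0-9a-fA-F]{8}) \\(");

    // Returns the exception code if the line looks like an exception report.
    public static Optional<Long> code(String line) {
        Matcher m = LINE.matcher(line.trim());
        return m.find() ? Optional.of(Long.parseLong(m.group(1), 16)) : Optional.empty();
    }

    // Maps a code to its symbolic name, or a hex fallback for anything else.
    public static String name(long code) {
        return KNOWN.getOrDefault(code, String.format("unknown (0x%08x)", code));
    }
}
```

Lines such as the `ModLoad:` entries do not match the pattern and yield an empty result, so the helper can be run over the whole log.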

@openjdk openjdk bot removed the clean label Apr 29, 2024
@GoeLin
Member Author

GoeLin commented Apr 29, 2024

GHA failure: the Windows build failed with "Connection attempt failed: Connection refused: no further information", which is unrelated.

Member

@tstuefe tstuefe left a comment

OK. I did not look at the test; is that where the diff from the original change is? The rest of the patch appears clean, but the PR is not labeled clean.

@tstuefe
Member

tstuefe commented Apr 29, 2024

Obviously, the patch should be built and tested on Windows :)

@GoeLin
Member Author

GoeLin commented May 2, 2024

Hi @Stuefe, good point, Windows is essential here :) I restarted the tests.
After the report from @Rigner I'm not so sure I should proceed with this.
What do you think, is this useful for 21?

@JornVernee
Member

If I modify the test code to throw an EXCEPTION_ACCESS_VIOLATION I still see an hs_err log being produced, just not containing the text "Internal Error". This matches what I remember from the debugging I did around the original fix as well.

The access violation in your log is expected, FWIW. It is always thrown at VM startup.
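
A rough sketch of how that difference could be checked when inspecting an hs_err file by hand. It assumes the usual header layout, in which a banner line is followed by a summary line starting with `#` and two spaces. The class name, method names and sample lines are illustrative and are not taken from the actual test.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper for reading the header of an hs_err log.
public class HsErrHeader {
    // Returns the first "#  ..." line after the banner, without the leading '#'.
    public static Optional<String> summary(List<String> lines) {
        boolean afterBanner = false;
        for (String line : lines) {
            if (line.contains("A fatal error has been detected")) {
                afterBanner = true;
                continue;
            }
            // Assumption: the summary line is indented by two spaces after '#'.
            if (afterBanner && line.startsWith("#  ")) {
                return Optional.of(line.substring(1).trim());
            }
        }
        return Optional.empty();
    }

    // True if the summary reports a generic "Internal Error".
    public static boolean isInternalError(List<String> lines) {
        return summary(lines).map(s -> s.startsWith("Internal Error")).orElse(false);
    }
}
```

With a summary line such as `#  EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=...`, `isInternalError` returns false even though an hs_err file was written, which matches the behavior described above.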

@Rigner

Rigner commented May 2, 2024

That makes sense since I was not able to debug that specific access violation.

I was able to pinpoint the cause of my crash: some JNI FindClass() calls made early in startup, while some internal JDK classes were still being loaded. We call FindClass() from an agent's ClassFileLoadHook callback to load a class as early as possible. This works fine on JDK <= 17 but crashes on 18+ (don't ask me why we're doing this, I know it's awful).

I'm not sure what signal / exception is actually being thrown, but it's definitely triggering a stack overflow in the error reporting, so maybe one specific error isn't handled properly.

@tstuefe
Member

tstuefe commented May 2, 2024

@GoeLin
The fix seems correct, seems low risk and is somewhat useful.

AFAIU, the bug is: an uncaught signal that is not among those handled specially by the JVM (such as implicit null pointer segfaults) should lead to a proper hs-err file. Instead, it can lead to incorrect or torn hs-err files, the latter if we run out of stack. AFAIU this happens more or less randomly, depending on the content of the SSE control register at the time of the crash.

Seeing that the risk is low, and that we were historically plagued by bad hs-err files on Windows, I would fix it in 21. Possibly even in 17.

@Rigner @JornVernee maybe discuss follow-up issues in the JBS section of the original bug, or its original PR? These discussions are valuable and should not get lost in this backport PR.

@tstuefe
Member

tstuefe commented May 3, 2024

P.S. with "somewhat useful" - not a native speaker - I meant it is useful but not an urgent fix.

@GoeLin
Member Author

GoeLin commented May 3, 2024

Hi @tstuefe, thanks for your opinion. I will request approval.

@openjdk openjdk bot added the approval label May 3, 2024
@openjdk openjdk bot added ready Pull request is ready to be integrated and removed approval labels May 7, 2024
@GoeLin
Member Author

GoeLin commented May 7, 2024

/integrate

@openjdk

openjdk bot commented May 7, 2024

Going to push as commit ed2f5a8.
Since your change was applied there have been 13 commits pushed to the master branch:

  • 7a400f2: 8309890: TestStringDeduplicationInterned.java waits for the wrong condition
  • 16ba673: 8331639: [21u]: Bump GHA bootstrap JDK to 21.0.3
  • 2b858f5: 8328938: C2 SuperWord: disable vectorization for large stride and scale
  • 9159882: 8310513: [s390x] Intrinsify recursive ObjectMonitor locking
  • 3ff5359: 8330011: [s390x] update block-comments to make code consistent
  • abbad92: 8326201: [S390] Need to bailout cleanly if creation of stubs fails when code cache is out of space
  • 3770c28: 8331331: :tier1 target explanation in doc/testing.md is incorrect
  • 021372c: 8328703: Illegal accesses in Java_jdk_internal_org_jline_terminal_impl_jna_linux_CLibraryImpl_ioctl0
  • d459ae9: 8329850: [AIX] Allow loading of different members of same shared library archive
  • 835d016: 8330094: RISC-V: Save and restore FRM in the call stub
  • ... and 3 more: https://git.openjdk.org/jdk21u-dev/compare/3892078094735be9d8074d23ce3d70201cd60445...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 7, 2024
@openjdk openjdk bot closed this May 7, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 7, 2024
@openjdk

openjdk bot commented May 7, 2024

@GoeLin Pushed as commit ed2f5a8.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
backport integrated Pull request has been integrated
4 participants