Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8309613: [Windows] hs_err files sometimes miss information about the code containing the error #14358

Closed

Conversation

TheRealMDoerr
Copy link
Contributor

@TheRealMDoerr TheRealMDoerr commented Jun 7, 2023

We have seen hs_err files for errors triggered by C2 compiled methods which miss the most relevant information: the C2 method (see JBS issue for more details). I have found a possibility to add it. Please take a look and provide feedback.

Testing:

diff --git a/src/hotspot/share/opto/parse1.cpp b/src/hotspot/share/opto/parse1.cpp
index f179d3ba88d..c35a1ac595e 100644
--- a/src/hotspot/share/opto/parse1.cpp
+++ b/src/hotspot/share/opto/parse1.cpp
@@ -1210,6 +1210,12 @@ void Parse::do_method_entry() {
     make_dtrace_method_entry(method());
   }
 
+  if (UseNewCode) {
+    Node* halt = _gvn.transform(new HaltNode(control(), frameptr(), "Requested Halt!"));
+    C->root()->add_req(halt);
+    set_control(halt);
+  }
+
 #ifdef ASSERT
   // Narrow receiver type when it is too broad for the method being parsed.
   if (!method()->is_static()) {

"java -XX:+UseNewCode -version" shows the following output (when no hsdis lib is provided):

---------------  T H R E A D  ---------------

Current thread (0x0000024daebb2b30):  JavaThread "main"             [_thread_in_Java, id=30876, stack(0x000000cdacc00000,0x000000cdacd00000) (1024K)]

Stack: [0x000000cdacc00000,0x000000cdacd00000]
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [jvm.dll+0x6ca5b9]  os::win32::platform_print_native_stack+0xd9  (os_windows_x86.cpp:236)
V  [jvm.dll+0x8a3af1]  VMError::report+0xd61  (vmError.cpp:991)
V  [jvm.dll+0x8a5d6e]  VMError::report_and_die+0x5fe  (vmError.cpp:1797)
V  [jvm.dll+0x283061]  report_fatal+0x71  (debug.cpp:212)
V  [jvm.dll+0x621c3e]  MacroAssembler::debug64+0x8e  (macroAssembler_x86.cpp:829)
C  0x0000024dc1553cf4

The last pc belongs to nmethod (printed below).

Compiled method (c2)      92   16       4       java.lang.Object::<init> (1 bytes)
 total in heap  [0x0000024dc1553b10,0x0000024dc1553d50] = 576
 relocation     [0x0000024dc1553c70,0x0000024dc1553c88] = 24
 main code      [0x0000024dc1553ca0,0x0000024dc1553d00] = 96
 stub code      [0x0000024dc1553d00,0x0000024dc1553d18] = 24
 metadata       [0x0000024dc1553d18,0x0000024dc1553d20] = 8
 scopes data    [0x0000024dc1553d20,0x0000024dc1553d28] = 8
 scopes pcs     [0x0000024dc1553d28,0x0000024dc1553d48] = 32
 dependencies   [0x0000024dc1553d48,0x0000024dc1553d50] = 8

[Constant Pool (empty)]

[MachCode]
[Entry Point]
  # {method} {0x0000000800478d78} '<init>' '()V' in 'java/lang/Object'
  #           [sp+0x20]  (sp of caller)
  0x0000024dc1553ca0: 448b 5208 | 49bb 0000 | 0000 0800 | 0000 4d03 | d349 3bc2 

  0x0000024dc1553cb4: ;   {runtime_call ic_miss_stub}
  0x0000024dc1553cb4: 0f85 c6a9 | 8fff 6690 | 0f1f 4000 
[Verified Entry Point]
  0x0000024dc1553cc0: 4881 ec18 | 0000 0048 | 896c 2410 | 4181 7f20 | 0100 0000 | 0f85 1b00 

  0x0000024dc1553cd8: ;   {external_word}
  0x0000024dc1553cd8: 0000 48b9 | 98e7 0702 | fd7f 0000 | 4883 e4f0 

  0x0000024dc1553ce8: ;   {runtime_call MacroAssembler::debug64}
  0x0000024dc1553ce8: 48b8 b01b | c201 fd7f | 0000 ffd0 

  0x0000024dc1553cf4: ;   {runtime_call StubRoutines (final stubs)}
  0x0000024dc1553cf4: f4e8 0622 | 8eff e9db | ffff fff4 
[Exception Handler]
  0x0000024dc1553d00: ;   {no_reloc}
  0x0000024dc1553d00: e97b c99d | ffe8 0000 | 0000 4883 

  0x0000024dc1553d0c: ;   {runtime_call DeoptimizationBlob}
  0x0000024dc1553d0c: 2c24 05e9 | 0c40 90ff | f4f4 f4f4 
[/MachCode]


---------------  P R O C E S S  ---------------

Without the patch, we don't see which compiled method called this at all. An example from a real C2 bug is here: https://bugs.openjdk.org/secure/attachment/103766/hs_err_pid7652.log


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8309613: [Windows] hs_err files sometimes miss information about the code containing the error (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/14358/head:pull/14358
$ git checkout pull/14358

Update a local copy of the PR:
$ git checkout pull/14358
$ git pull https://git.openjdk.org/jdk.git pull/14358/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 14358

View PR using the GUI difftool:
$ git pr show -t 14358

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/14358.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 7, 2023

👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 7, 2023
@openjdk
Copy link

openjdk bot commented Jun 7, 2023

@TheRealMDoerr The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Jun 7, 2023
@mlbridge
Copy link

mlbridge bot commented Jun 7, 2023

Webrevs

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC the basic fix here is track the last pc that was found before things "got stuck" so we print it. Though I'm unclear on the details - didn't we already print this pc in the stack trace as it was the last good pc? If so what does the new print_code show in addition?

An example of the before/after output in the hs_err file would be helpful.

Thanks.

Comment on lines 679 to 680
const int printed_capacity = max_error_log_print_code;
address printed[printed_capacity];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this buffer get reused/overwritten by the "printing code blobs" logic?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of the buffer is to keep track of what has already been printed. We don't want to print the same code again if we encounter the address again. This can be used across several error reporting steps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I just wanted to make sure the different usages don't interfere with each other.

// We have printed the native stack in platform-specific code
// Windows/x64 needs special handling.
// Stack walking may got stuck. Try to print the calling code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/got/get/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@TheRealMDoerr
Copy link
Contributor Author

I noticed that we should check the result of print_code and update printed_len. Fixed with 2nd commit. Example output added to the description.

@TheRealMDoerr
Copy link
Contributor Author

/label add hotspot-compiler

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Jun 9, 2023
@openjdk
Copy link

openjdk bot commented Jun 9, 2023

@TheRealMDoerr
The hotspot-compiler label was successfully added.

@dholmes-ora
Copy link
Member

C  0x000001635fe021f4

called by the following code:

So does everything after called by the following code relate to the frame C 0x000001635fe021f4? If so I'd like to see this delineated more clearly as presently it would appear very disruptive when reading the initial sections of the hs_err file.

Thanks

@TheRealMDoerr
Copy link
Contributor Author

Hi David, I've moved the verbose parts to the step "printing code blobs if possible" and updated the example output. In this particular case, the steps between the native stack and my new code don't print anything, so, the output looks very similar.

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I never come across hs_err logs with this kind of detail as I was expecting to see something between the "T H R E A D" and "P R O C E S S" sections.

A couple of minor nits but otherwise I guess this is okay.

Thanks.

print_code(st, _thread, lastpc, true, printed, printed_capacity);
const char* name = find_code_name(lastpc);
if (name != nullptr) {
st->print_cr("the last pc belongs to %s (will be printed below)", name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/the/The
s/will be//

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not print the nmethod right here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my earlier comments. I was looking for some kind of delineation of all this additional stuff so that it would be easier to see where it fits into things. Not sure we've really achieved that regardless.

@openjdk
Copy link

openjdk bot commented Jun 13, 2023

@TheRealMDoerr This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8309613: [Windows] hs_err files sometimes miss information about the code containing the error

Reviewed-by: dholmes, stuefe

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 77 new commits pushed to the master branch:

  • 4f23fc1: 8309671: Avoid using jvmci.Compiler property to determine if Graal is enabled
  • 1a9edb8: 8309838: Classfile API Util.toBinaryName and other cleanup
  • f7de726: 8295555: Primitive wrapper caches could be @Stable
  • 5d71612: 8309852: G1: Remove unnecessary assert_empty in G1ParScanThreadStateSet destructor
  • 23a54f3: 8309538: G1: Move total collection increment from Cleanup to Remark
  • 57fc9a3: 8309763: Move tests in test/jdk/sun/misc/URLClassPath directory to test/jdk/jdk/internal/loader
  • 2dca5ae: 8299052: ViewportOverlapping test fails intermittently on Win10 & Win11
  • 3028295: 8309065: Move the logic to determine archive heap location from CDS to G1 GC
  • 80a8144: 8309867: redundant class field RSAPadding.md
  • 3c15ad8: 8307144: namedParams in XECParameters and EdDSAParameters can be private final
  • ... and 67 more: https://git.openjdk.org/jdk/compare/5b147eb5e46ac7fa637ed997c6da8f238f685ea4...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 13, 2023
Copy link
Member

@tstuefe tstuefe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Small nits only, up to you.

print_code(st, _thread, lastpc, true, printed, printed_capacity);
const char* name = find_code_name(lastpc);
if (name != nullptr) {
st->print_cr("the last pc belongs to %s (will be printed below)", name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not print the nmethod right here?

@@ -1008,7 +1008,7 @@ class os: AllStatic {

public:
inline static bool platform_print_native_stack(outputStream* st, const void* context,
char *buf, int buf_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit, can we make this not a reference, but a pointer, and make it optional and possibly default it to nullptr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, we should probably name it lastpc_ptr? Would that really be better? I'm not conviced.

@TheRealMDoerr
Copy link
Contributor Author

Thanks for the reviews!

@TheRealMDoerr
Copy link
Contributor Author

/integrate

@openjdk
Copy link

openjdk bot commented Jun 14, 2023

Going to push as commit bd79db3.
Since your change was applied there have been 96 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 14, 2023
@openjdk openjdk bot closed this Jun 14, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Jun 14, 2023
@openjdk
Copy link

openjdk bot commented Jun 14, 2023

@TheRealMDoerr Pushed as commit bd79db3.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@TheRealMDoerr TheRealMDoerr deleted the 8309613_Windows_hs_err branch June 14, 2023 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
3 participants