-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8368787: Error reporting: hs_err files should show instructions when referencing code in nmethods #27530
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8368787: Error reporting: hs_err files should show instructions when referencing code in nmethods #27530
Conversation
… referencing code in nemthods
|
👋 Welcome back mdoerr! A progress list of the required criteria for merging this PR into |
|
@TheRealMDoerr This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be: You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been no new commits pushed to the ➡️ To integrate this PR with the above commit message to the |
|
@TheRealMDoerr The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
|
Looks like a useful diagnostic tool. At very least we should dump the raw instruction stream around that pc, like we do for Relocation trick is cute, but it hinges on assumption that relocations are always pointing at instruction boundary. I looked around and I think while most relocs are that way, there are some relocs that do not follow this rule. For example: I am guessing it is still fine to attempt to disassemble in this case. |
|
Thanks for looking at this PR! Should we print both, hex dump and disassembly? Interesting. I haven't tried with ZGC. Did you find more relocations which don't point to an instruction start? diff --git a/src/hotspot/share/code/codeBlob.cpp b/src/hotspot/share/code/codeBlob.cpp
index 6511b4689ed..2e4d49a81c1 100644
--- a/src/hotspot/share/code/codeBlob.cpp
+++ b/src/hotspot/share/code/codeBlob.cpp
@@ -52,6 +52,9 @@
#ifdef COMPILER1
#include "c1/c1_Runtime1.hpp"
#endif
+#if defined(AMD64) && INCLUDE_ZGC
+#include "gc/z/zBarrierSetAssembler.hpp"
+#endif
#include <type_traits>
@@ -919,6 +922,10 @@ void CodeBlob::dump_for_addr(address addr, outputStream* st, bool verbose) const
// disassemble correctly at instruction start addresses.)
RelocIterator iter(nm, start);
while (iter.next() && iter.addr() < addr) { // find relocation before addr
+#if defined(AMD64) && INCLUDE_ZGC
+ // There's a relocation which doesn't point to an instruction start:
+ if ((iter.type() != relocInfo::barrier_type) || (iter.format() != ZBarrierRelocationFormatStoreGoodAfterMov))
+#endif
start = iter.addr();
}
if (iter.has_current()) { |
Yes, I think if we know the location is within nmethod, it makes sense to dump around the location. I think hex dump is most bullet-proof, as we can always disassemble offline it at different offsets. I don't think we want to specialize for reloc types, it does not gain us much? Also, relocs solve the variable-sized encoding only if you are lucky to hit the reloc right at the location you are decoding, right? Anything in between relocs is still pretty foggy. I suspect current patch would work in 99% of the cases, as it is hard to imagine e.g. the value in the register that points into nmethod and does not have some sort of reloc. Then I also suspect that disassemblers actually able to figure the instruction boundaries pretty well? Because I don't quite see how our usual printout of |
Right. The disassembler produces garbage if it starts disassembling somewhere besides the correct instruction start (on x86).
Yeah, nmethods don't contain a lot of data which is something else than valid instructions. So, most of the time, disassembly works as proposed here, but we may still produce garbage in rare cases. That's not a big problem if we have the hex dump.
|
|
@shipilev: I have made some improvements after your feedback. Please take another look! Thanks! |
|
Updated example output above. We could potentially dump more code if |
|
Could the new 'nmethod::print_code_snippet' coding crash under some bad circumstances ? If so, maybe we might need a way to disable it easily. |
Unlikely, and if it does, we still have the raw values and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. This looks useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine. I see no clear point in referencing ZBarrierRelocationFormatStoreGoodAfterMov in the comments, but I have no strong opinion either.
| if (iter.addr() == addr) iter.next(); // find relocation after addr | ||
| if (iter.has_current()) end = iter.addr(); | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, the size of the printout is somewhat random. In the extreme cases, this may be either (close to) start-of-method to end-of-method, so almost the whole method. Or, it may be from an address very close to the address, so a very small snippet.
Tying the end address to a relocation is not strictly necessary, no? We could just print to `MIN2(code end, addr + 64)? Disassembler should be fine if the printout stops in the middle of an instruction, as long as instruction addresses are correct?
And could we start printing at the relocation preceding-or-at addr - 64 instead, to ensure we have at least 64 bytes of printout before the crash address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the size is somewhat random. Relocations seem to be the most fine-grained information we currently have. In addition, they typically point to some meaningful points in the code. This PR disassembles the smallest possible snippet around the given address using relocations as start and end.
Right, having a relocation as end address is technically not strictly required. However, I've seen that the disassembler on x86 produced garbage as well when the end is not an instruction boundary.
I agree with you that we usually want at least 64 Bytes ahead. On the other hand, some people don't want too much, either. See JDK-8274986.
So, I changed only the hex dump for which we can afford printing more without bloating the hs_err file too much. Please take a look at my new commit.
Btw. addr is typically not a crash address, but one which is referenced by a register and somehow related to a crash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks for taking my comment into account.
|
Thanks for the reviews! |
|
/integrate |
|
Going to push as commit b31bbfc.
Your commit was automatically rebased without conflicts. |
|
@TheRealMDoerr Pushed as commit b31bbfc. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
We'd like to have a little more information in hs_err files in the following scenario: The VM crashes in code which does something with an nmethod. We often have a register pointing into code of the nmethod, but the nmethod is not disassembled in the hs_err file because the crash happens outside of it.
We can disassemble some instructions around the address inside the nmethod code. This is tricky on platforms which have variable length instructions (like x86). We need to find correct instruction start addresses. I'm proposing to use relocations for this purpose. There are usually enough of them distributed over the nmethod and they point to instruction start addresses.
I've tested this proposal by the following code on x86_64:
The output is (requires hsdis library, otherwise we only get the hex dump):
Feedback and further improvement suggestions are welcome.
Progress
Issue
Reviewers
Reviewing
Using
gitCheckout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/27530/head:pull/27530$ git checkout pull/27530Update a local copy of the PR:
$ git checkout pull/27530$ git pull https://git.openjdk.org/jdk.git pull/27530/headUsing Skara CLI tools
Checkout this PR locally:
$ git pr checkout 27530View PR using the GUI difftool:
$ git pr show -t 27530Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/27530.diff
Using Webrev
Link to Webrev Comment