-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8242181: [Linux] Show source information when printing native stack traces in hs_err files #7126
Conversation
👋 Welcome back chagedorn! A progress list of the required criteria for merging this PR into |
@chhagedorn The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
/label remove build |
@chhagedorn |
…tion of algorithm
…atrix row, using existing source information for internal errors in first frame.
…more flexible and pass if no libjvm.so debug symbols are found which might not always be there
I've filed JDK-8287021 to later add support for DWARF 5. To bring up again some general points that are left to decide upon/fix:
@tstuefe: Is this still an issue with the latest commit which added I'm currently running some testing with a merge of latest JDK which is looking good so far. |
Issues seem to have disappeared. I'll schedule some more tests next week to be sure. Cheers, Thomas |
Thanks Thomas for doing the testing! |
Hi Christian, I can see no problems on ppcle attributable to your test. However, I may overlook something since tests are still failing all over the place because of loom. I see test errors in TestDwarf on ppcle and x64, however. On x64:
|
After getting some more logs from @tstuefe, it turns out that they are using GCC 8 in the testing which emits the The In this process, I've also changed I've also decided to get rid of UL as discussed before and replaced it by |
@chhagedorn Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information. |
@chhagedorn This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me too.
We are a bit strapped for review- and test-cycles atm, therefore please go ahead and push it. This is a really useful change. If problems show up on some of our weird architectures or old Linux distros, we will deal with it then.
Cheers, Thomas
Thanks Thomas for your review and assessment! That sounds good. I will give it another spin in our testing. Now that UL was replaced by Thanks, |
Testing in our CI looked good. I've also tested it again locally on x86_32. I will integrate this tomorrow morning. |
Thanks everybody for your feedback, comments, and reviews! /integrate |
Going to push as commit 13c0369.
Your commit was automatically rebased without conflicts. |
@chhagedorn Pushed as commit 13c0369. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
When printing the native stack trace on Linux (mostly done for hs_err files), it only prints the method with its parameters and a relative offset in the method:
This makes it sometimes difficult to see where exactly the methods were called from and sometimes almost impossible when there are multiple invocations of the same method within one method.
This patch improves this by providing source information (filename + line number) to the native stack traces on Linux similar to what's already done on Windows (see JDK-8185712):
For Linux, we need to parse the debug symbols which are generated by GCC in DWARF - a standardized debugging format. This patch adds support for DWARF 4, the default of GCC 10.x, for 32 and 64 bit architectures (tested with x86_32, x86_64 and AArch64). DWARF 5 is not supported as it was still experimental and not generated for HotSpot. However, newer GCC version may soon generate DWARF 5 by default in which case this parser either needs to be extended or the build of HotSpot configured to only emit DWARF 4.
The code follows the parsing steps described in the official DWARF 4 spec: https://dwarfstd.org/doc/DWARF4.pdf
I added references to the corresponding sections throughout the code. However, I tried to explain the steps from the DWARF spec directly in the code (method names, comments etc.). This allows to follow the code without the need to actually deep dive into the spec.
The comments at the
Dwarf
class in theelf.hpp
file explain in more detail how a DWARF file is structured and how the parsing algorithm works to get to the filename and line number information. There are more class comments throughout theelf.hpp
file about how different DWARF sections are structured and how the parsing algorithm needs to fetch the required information. Therefore, I will not repeat the exact workings of the algorithm here but refer to the code comments. I've tried to add as much information as possible to improve the readability.Generally, I've tried to stay away from adding any assertions as this code is almost always executed when already processing a VM error. Instead, the DWARF parser aims to just exit gracefully and possibly omit source information for a stack frame instead of risking to stop writing the hs_err file when an assertion would have failed. To debug failures,
-Xlog:dwarf
can be used withinfo
,debug
ortrace
which provides logging messages throughout parsing.Testing:
Apart from manual testing, I've added two kinds of tests:
get_source()
method which initiates DWARF parsing. Tested some special cases, for example, having a buffer that is not big enough to store the filename.On top of that, there are also existing JTreg tests that call
-XX:NativeMemoryTracking=detail
which will print a native stack trace with the new source information. These tests were also run as part of the standard tier testing and can be considered as sanity tests for this implementation.To make tests work in our infrastructure or if some other setups want to have debug symbols at different locations, I've added support for an additional
_JVM_DWARF_PATH
environment variable. This variable can specify a path from which the DWARF symbol file should be read by the parser if the default locations do not contain debug symbols (required somemake
changes). This is similar to what's done on Windows with_NT_SYMBOL_PATH
. The JTreg test, however, also works if there are no symbols available. In that case, the test just skips all the assertion checks for the filename and line number.I haven't run any specific performance testing as this new code is mainly executed when an error will exit the VM and only if symbol files are available (which is normally not the case when using Java release builds as a user).
Special thanks to @tschatzl for giving me some pointers to start based on his knowledge from a DWARF 2 parser he once wrote in Pascal and for discussing approaches on how to retrieve the source information and to @erikj79 for providing help for the changes required for
make
!Thanks,
Christian
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/7126/head:pull/7126
$ git checkout pull/7126
Update a local copy of the PR:
$ git checkout pull/7126
$ git pull https://git.openjdk.org/jdk pull/7126/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 7126
View PR using the GUI difftool:
$ git pr show -t 7126
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/7126.diff