8274986: max code printed in hs-err logs should be configurable #5875
@@ -1336,6 +1336,10 @@ const intx ObjectAlignmentInBytes = 8;
develop(intx, StackPrintLimit, 100, \
"number of stack frames to print in VM-level stack dump") \
\
develop(int, ErrorLogPrintCodeLimit, 3, \
Shouldn't this be at least diagnostic? develop flags are fairly useless.
I chose develop to be consistent with StackPrintLimit.
What's the process for adding a diagnostic flag @dholmes-ora? Should we change both StackPrintLimit and ErrorLogPrintCodeLimit to be diagnostic? I agree with Tom that having these as develop flags is of limited use.
I have no idea what the history of StackPrintLimit is but there doesn't seem to be any call to have it be a non-develop flag. I'm assuming 100 is plenty and no one ever needs to change it.
For the new flag, develop is pretty useless if the issue is that the default value may not show enough information. Making it diagnostic seems reasonable. If people hit an issue and the default value doesn't show enough information again then they can change the value and hopefully diagnose the problem further.
David
Ok, let's just make ErrorLogPrintCodeLimit be diagnostic. Is there more process for this apart from simply making the change in this PR?
Only the change in the PR is needed - no CSR request for diagnostic flags.
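To illustrate why a develop flag is of limited use here, the following is a toy model of the flag kinds mentioned above. It assumes the usual HotSpot behavior: develop flags can only be set in debug builds, while diagnostic flags can be set in any build once diagnostic options are unlocked. `FlagKind` and `can_set_on_command_line` are hypothetical names for this sketch, not part of the real flag machinery.

```cpp
// Simplified model of HotSpot flag kinds (hypothetical, for illustration only).
enum class FlagKind { Develop, Diagnostic, Product };

// Returns true if a flag of the given kind can be changed on the command line.
//   product_build:       true for release binaries, false for debug builds.
//   diagnostic_unlocked: models the effect of -XX:+UnlockDiagnosticVMOptions.
inline bool can_set_on_command_line(FlagKind kind,
                                    bool product_build,
                                    bool diagnostic_unlocked) {
  switch (kind) {
    case FlagKind::Develop:
      // Develop flags are compile-time constants in product builds.
      return !product_build;
    case FlagKind::Diagnostic:
      // Diagnostic flags work in any build, but only once unlocked.
      return diagnostic_unlocked;
    case FlagKind::Product:
      return true;
  }
  return false;
}
```

Under this model, a user who hits a crash with a release JVM cannot raise a develop limit at all, but can raise a diagnostic one after unlocking diagnostic options.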
- nm->print_nmethod(verbose);
+ if (st == tty) {
+   nm->print_nmethod(verbose);
+ }
I think this should be something like this:
ResourceMark rm;
st->print(INTPTR_FORMAT " is at entry_point+%d in (nmethod*)" INTPTR_FORMAT,
p2i(addr), (int)(addr - nm->entry_point()), p2i(nm));
- if (verbose) {
- st->print(" for ");
- nm->method()->print_value_on(st);
- }
st->cr();
- nm->print_nmethod(verbose);
+ if (verbose && st == tty) {
+ nm->print_nmethod(verbose);
+ } else {
+ nm->print_on(st);
+ }
return;
}
st->print_cr(INTPTR_FORMAT " is at code_begin+%d in ", p2i(addr), (int)(addr - code_begin()));
I don't think so. As far as I can see, the intent is to have a single line of output for an nmethod followed by an optional " for ..." suffix when verbose == true.
The nm->print_nmethod(verbose) call then prints extra multi-line detail for the nmethod, with the number of lines printed governed by verbose.
This code seems like it went from being hard coded to always go to tty and was then parameterized to go to an arbitrary stream, but the evolution accidentally overlooked some code that still goes to tty.
I don't want to make extensive changes here as there really should be a single effort to rationalize all dumping to ensure it's parameterized.
I agree you might not want to bite this off in this PR, but this piece of code is the reason you commonly see random nmethods appearing on the tty just before a crash. verbose is only ever true when called from findpc in debug.cpp. All the other non-verbose work print_nmethod does is useless, like writing to the xtty, and otherwise boils down to nm->print_on(tty). But all these printing paths could use a rework so I'm fine if you skip it.
As you convinced me in a side conversation, your suggestion has the advantage that we now at least get this output in an hs_err file instead of losing it altogether.
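The behavior agreed on in this thread can be sketched in isolation. This is a toy model, assuming standard C++ streams in place of HotSpot's outputStream; `ToyNMethod`, `describe` and the explicit `tty` parameter are hypothetical stand-ins, not the real code. The point it shows is that only the verbose console case takes the console-only path, and every other case writes to the stream the caller supplied.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for an nmethod; only what the sketch needs.
struct ToyNMethod {
  std::string name;

  // One-line summary written to whatever stream the caller supplies.
  void print_on(std::ostream& st) const {
    st << "nmethod " << name << '\n';
  }

  // Multi-line dump that, like the legacy path, only targets the console.
  void print_verbose(std::ostream& tty) const {
    tty << "nmethod " << name << '\n'
        << "  (disassembly, relocations, ...)\n";
  }
};

// Describe `nm` on `st`. `tty` models the global console stream.
void describe(const ToyNMethod& nm, std::ostream& st,
              std::ostream& tty, bool verbose) {
  if (verbose && &st == &tty) {
    nm.print_verbose(tty);   // interactive debugging on the console
  } else {
    nm.print_on(st);         // e.g. an error log: nothing leaks to the console
  }
}
```

With a separate log stream and `verbose == false`, the console stream stays empty and the summary lands in the log, which is the outcome described for the hs_err file.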
@dougxc This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 1 new commit pushed to the
Please see this link for an up-to-date comparison between the source branch of this pull request and the
…tack for code to print
…ead of tty in non-verbose mode
Hi Doug,
A couple of minor comments/queries but otherwise this seems okay.
Thanks,
David
@@ -34,6 +34,7 @@
#include "gc/shared/referenceProcessor.hpp"
#include "oops/markWord.hpp"
#include "runtime/task.hpp"
#include "utilities/vmError.hpp"
Why is this needed?
For the definition of VMError::max_error_log_print_code to be available, just like task.hpp is included to make PeriodicTask::min_interval (used in defining the range of PerfDataSamplingInterval) available.
I'm guessing there is some macro weirdness involved when defining the flag in globals.hpp.
// Even though ErrorLogPrintCodeLimit is ranged checked
// during argument parsing, there's no way to prevent it
// being set to a value outside the range.
I don't understand what you mean here. Do we not abort the VM if the flag value is out-of-range?
ErrorLogPrintCodeLimit is a writable global variable so this is just being extra defensive should anything update ErrorLogPrintCodeLimit (e.g. ErrorLogPrintCodeLimit = 100;) after argument parsing.
I clarified the comment: 29060fc
if (limit > 0) {
// Scan the native stack
if (!_print_native_stack_used) {
// Only try print code of the crashing frame since
Existing typo: s/try/try to/
Fixed.
/integrate
Going to push as commit 33050f8.
Your commit was automatically rebased without conflicts.
// Even though ErrorLogPrintCodeLimit is ranged checked
// during argument parsing, there's no way to prevent it
// subsequently (i.e., after parsing) being set to a
// value outside the range.
This is overly defensive IMO. Flags should never be touched after initialization is complete and we assume we can trust ourselves.
In general I agree but in error reporting code I get extra defensive plus the defensive code is small and not on a hot path.
That said, I won't object to it being undone in a subsequent PR.
I thought the point of MIN2 here is to handle ErrorLogPrintCodeLimit < max. IMHO ErrorLogPrintCodeLimit > max would be just an assert-worthy error, since as David wrote flag values should not be changed after initialization.
The point is to ensure that we don't run off the end of the stack allocated printed array in the (granted, unlikely) case that ErrorLogPrintCodeLimit is (accidentally) updated after arg parsing.
I'm not sure an assert is the best thing as it would cause error reporting to recurse.
Maybe I was being too defensive but I figured the overhead is negligible so why not be ultra-safe.
Recursive asserts would be caught by secondary error handling and show up as "Error occurred during error reporting" printout. Not ideal, but at least won't endanger the rest of the printing.
Yes but in a production scenario (which is where robust error reporting is critical), the assert is ignored and we end up with potential buffer overflow.
Could be a guarantee then.
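The defensive clamp debated in this thread can be sketched as follows. This is a minimal illustration, assuming a fixed-capacity array and a writable global limit; `kMaxPrintCode`, `g_print_code_limit` and `effective_limit` are hypothetical stand-ins rather than HotSpot identifiers, and `std::min` plays the role of MIN2.

```cpp
#include <algorithm>

// Hypothetical stand-in for the hard upper bound (the capacity of the
// stack-allocated array used during error reporting).
constexpr int kMaxPrintCode = 10;

// Hypothetical stand-in for the flag: a writable global that is range
// checked at startup but could, in principle, be modified afterwards.
int g_print_code_limit = 3;

// Returns a limit that is always safe to use as an index bound for an
// array of kMaxPrintCode entries, whatever the global currently holds.
int effective_limit() {
  return std::max(0, std::min(g_print_code_limit, kMaxPrintCode));
}
```

Unlike an assert, which is compiled out of product builds, this check costs one comparison and still holds in production, which is the trade-off argued above.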
This PR adds an ErrorLogPrintCodeLimit (develop) option for configuring the amount of code printed in an hs-err log file. There's a hard limit of 10 so that the buffer used to avoid duplicates in VMError::print_code is stack allocated.
In addition, the Java stack is also scanned when considering code to print as the native stack may not have any Java compiled frames. For example, a transition into the VM through a RuntimeStub can prevent the native stack walk from seeing the frames above the stub.
The MachCodeFramesInErrorFile test has been made more robust in terms of validating its expectations of how C2 intrinsifies methods.
There's one other minor change to address this comment.
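The selection scheme in the description (a fixed-size buffer to avoid duplicates, filled first from the native stack and then from the Java stack) can be sketched as below. This is a toy model under stated assumptions: frames are represented as opaque pointers to the code they execute, `nullptr` models a frame with no compiled code, and `kHardLimit`, `record_if_new` and `select_code_to_print` are hypothetical names. The `std::vector` return value is for demonstration only; real error reporting code would avoid heap allocation.

```cpp
#include <vector>

// Hypothetical hard limit; lets the de-duplication buffer live on the stack.
constexpr int kHardLimit = 10;

// Records `blob` in seen[0..count) if it is absent and room remains under
// `limit`. Returns true if the caller should print this blob.
bool record_if_new(const void* blob, const void* seen[], int& count, int limit) {
  for (int i = 0; i < count; i++) {
    if (seen[i] == blob) return false;  // already selected
  }
  if (count >= limit) return false;     // limit reached
  seen[count++] = blob;
  return true;
}

// Walks the native frames, then the Java frames, and returns the distinct
// code blobs to print, in discovery order, capped at `limit`.
std::vector<const void*> select_code_to_print(
    const std::vector<const void*>& native_frames,
    const std::vector<const void*>& java_frames,
    int limit) {
  const void* seen[kHardLimit];         // stack allocated, fixed capacity
  int count = 0;
  if (limit > kHardLimit) limit = kHardLimit;

  std::vector<const void*> out;
  for (const void* blob : native_frames) {
    if (blob != nullptr && record_if_new(blob, seen, count, limit)) {
      out.push_back(blob);
    }
  }
  for (const void* blob : java_frames) {
    if (blob != nullptr && record_if_new(blob, seen, count, limit)) {
      out.push_back(blob);
    }
  }
  return out;
}
```

Scanning the second list is what picks up compiled frames that the first walk could not see, while the shared buffer keeps a blob that appears in both walks from being printed twice.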
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5875/head:pull/5875
$ git checkout pull/5875
Update a local copy of the PR:
$ git checkout pull/5875
$ git pull https://git.openjdk.java.net/jdk pull/5875/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 5875
View PR using the GUI difftool:
$ git pr show -t 5875
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5875.diff