New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8267532: C2: Profile and prune untaken exception handlers #16416
Conversation
👋 Welcome back jvernee! A progress list of the required criteria for merging this PR into |
@JornVernee The following labels will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command. |
/label remove core-libs |
@JornVernee |
2744204
to
78837c4
Compare
Thanks for the clarifications. Spotted another inconsistency: C1 doesn't set Please, file a separate bug for it. The code in question is used in verification code and available only in debug VM. If the failures block this patch, you can comment out relevant asserts. They'll be re-enabled as part of the proper fix. |
Is it possible for C1 to see a monitorexit but not the earlier monitorenter? |
Hm, good question, Dean. I was under impression that there are no guarantees C1 visits all reachable bytecodes of a method during parsing, so |
I had a similar concern and also convinced myself it's not a problem for C1. If it was, we should have seen an assert in the loom code that depends on has_monitor(). |
|
I've removed the fix for the I've also added checks using P.S. it looks like the |
This reverts commit bee0553.
I also realize that what I said here is not quite true, as we mark the handler as entered in the deopt code. So, if we deopt once, we won't get an uncommon trap again. (there's a test case for that as well). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, looks very good.
I have been thinking about the following choices made in this PR:
- amount of profiling data: binary (seen vs not seen) vs integral (branch count)
- deoptimization action:
reinterpret
vsmade_not_entrant
- place where uncommon trap is inserted (
Parse
vsciTypeFlow
)
I haven't come with strong arguments to change any of these choices, so I'm the patch as it is now. We can adjust them later as follow-up enhancements if we decide to do so.
On naming: ex_handler
is used only once - GraphKit::has_ex_handler()
. Everywhere else in the code base exception_handler
is used. Please, align the naming. Feel free to adjust GraphKit::has_ex_handler()
.
The tests are very nice! Can you, please, point me to the test case which covers profiling in interpreter?
Done.
I've added 2 more test cases that target interpreter profiling specifically. See: dfd5da1 |
Another round of tier 1 - 8 testing came back clean. I'm planning to integrate the patch tomorrow. |
/integrate |
Going to push as commit a5ccd3b.
Your commit was automatically rebased without conflicts. |
@JornVernee Pushed as commit a5ccd3b. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Mailing list message from Vitaly Provodin on hotspot-dev: Hi all, With the latest changes I got the following error =======================8<---------------------- Here is my build environment Configuration summary: Tools summary: Could you please clarify how to overcome this issue? Thanks, -------------- next part -------------- |
Mailing list message from Vladimir Kozlov on hotspot-dev: I hit is too on my Mac, I filed following bug and assigned to Jorn. https://bugs.openjdk.org/browse/JDK-8321141 Note, I checked and all testing passed when 8267532 was reviewed. Thanks, On 11/30/23 4:47 AM, Vitaly Provodin wrote:
|
Mailing list message from Jorn Vernee on hotspot-dev: Hello, It seems to be an issue with XCode 12.2 not supporting the [[noreturn]] Jorn [1]: https://github.com/openjdk/jdk/blob/master/doc/building.md#macos On 30/11/2023 19:48, Vladimir Kozlov wrote:
|
The issue is essentially that for the Java try-with-resource construct, javac generates multiple calls to
close(
) the resource. One of those calls is inside the hidden exception handler of the try block. The issue for us is that typically the exception handler is never entered (since no exception is thrown), however we don't profile exception handlers at the moment, so the block is not pruned. C2 doesn't inline theclose()
call in the handler due to low call site frequency. As a result, the receiver of that call escapes and can not be scalar replaced, which then leads to a loss in performance.There has been some discussion on the JBS issue that this could be fixed by profiling catch blocks. And another suggestion that partial escape analysis could help here to prevent the object from escaping. But, I think there are other benefits to being able to prune dead catch blocks, such as general reduction in code size, and other optimizations being possible by dead code being eliminated. So, I've implemented catch block profiling + pruning in this patch.
The implementation is essentially very straightforward: we allocate an extra bit of profiling data for each
exception handler of a method in the
MethodData
for that method (which holds all the profilingdata). Then when looking up the exception handler after an exception is thrown, we mark the
exception handler as entered. When C2 parses the exception handler block, and it sees that it has
never been entered, we emit an uncommon trap instead.
I've also cleaned up the handling of profiling data sections a bit. After adding the extra section of data to MethodData, I was seeing several crashes when ciMethodData was used. The underlying issue seemed to be that the offset of the parameter data was computed based on the total data size - parameter data size (which doesn't work if we add an additional section for exception handler data). I've re-written the code around this a bit to try and prevent issues in the future. Both MethodData and ciMethodData now track offsets of parameter data and exception handler data, and the size of the each data section is derived from the offsets.
Finally, there was an assert firing in
freeze_internal
incontinuationFreezeThaw.cpp
:This assert relies on
has_monitors
being set for a method, which in itself relies onmonitorenter
andmonitorexit
being parsed. However, if we prune untaken exception handlers, we might not see anymonitorexit
, which is a problem for OSR compilations since then we might also not see anymonitorenter
. After some investigation, it turns out thatciMethod
already tracks whether monitor bytecodes are being used, so we can just piggyback on that instead of relying onmonitorenter
ormonitorexit
being parsed. We can follow the existing pattern for howhas_reserved_stack_access
is being tracked (which I've done). See a484206 a33a905 and d727df7Benchmark with
-XX:-PruneDeadExceptionHandlers
:with
-XX:+PruneDeadExceptionHandlers
:Note that with the optimization turned on, timing and
gc.alloc.rate.norm
is ~equal.I also noticed through other experiments that C2's ability to inline improves, due to
inline_instructions_size
being reduced for methods with untaken exception handlers, which might bring the size underInlineSmallCode
, and allow the method to be inlined again.Finally, I've changed all the foreign benchmarks to use try-with-resources where they were working around this issue, and verified that allocations go down when turning on the optimization.
Testing : tier 1-6. Local run of
hotspot_compiler
suite with-XX:+DeoptimizeALot
and with-XX:+StressPrunedExceptionHandlers
.Special thanks to Tobias Hartmann, and Vladimir Ivanov for the discussion during the design process of this patch.
Progress
Issues
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16416/head:pull/16416
$ git checkout pull/16416
Update a local copy of the PR:
$ git checkout pull/16416
$ git pull https://git.openjdk.org/jdk.git pull/16416/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 16416
View PR using the GUI difftool:
$ git pr show -t 16416
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16416.diff
Webrev
Link to Webrev Comment