8321509: False positive in get_trampoline fast path causes crash #19796
AArch64 binds some trampoline call-sites early, thanks to its is_always_within_branch_range() check. This allows a false positive match with a trampoline stub during code buffer expansion in rare situations. To fix this, this PR makes the following changes:
|
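The early-binding condition mentioned above can be pictured with a small model. This is only a sketch under stated assumptions: `ToyCodeCache` and `toy_always_within_branch_range` are hypothetical names, not HotSpot code. The one hard fact it relies on is that an AArch64 `BL` reaches roughly +/-128 MiB from the call-site.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; none of these names are HotSpot's.
constexpr int64_t kBranchRange = 128LL * 1024 * 1024;  // B/BL reach: +/-128 MiB

struct ToyCodeCache {
  int64_t low;   // lowest address at which code can live
  int64_t high;  // one past the highest such address
};

// True if a direct BL placed anywhere in the cache could reach `target`.
// Only the two extreme call-site positions need to be checked.
inline bool toy_always_within_branch_range(const ToyCodeCache& cc, int64_t target) {
  int64_t from_first = target - cc.low;         // displacement from the first slot
  int64_t from_last  = target - (cc.high - 4);  // displacement from the last slot
  return from_first < kBranchRange && from_last >= -kBranchRange;
}
```

When a check like this succeeds, the call can be emitted as a direct branch with no trampoline stub, which is the situation in which the false positive described above could occur.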
Looks good.
Thanks Vladimir. |
I am hoping an AArch64 expert can take a look at this. @theRealAph maybe? |
I'm not an AArch64 expert but this fix looks good to me.
This solution looks OK to me as far as jdk mainline is concerned. However, I think there is a problem as far as Leyden is concerned.
The code changes Relocation::pd_call_destination to always expect its associated call to be embedded within an nmethod when orig_addr is null (i.e. when it is called with no args as reloc.pd_call_destination()). This is where the problem arises. Currently, Leyden calls Relocation::pd_call_destination() from CallRelocation::destination() (and also from trampoline_stub_Relocation::destination()) when storing an nmethod to the CDS code cache. It needs to do this in order to track relocs of type virtual_call_type, opt_virtual_call_type, static_call_type and runtime_call_type (also trampoline_stub_type), because all of these relocs need their call destination adjusted when the nmethod is restored from the CDS code cache.
However, we already have prototype code in Leyden to store generated blobs to the CDS code cache. These blobs may legitimately include runtime_call_type relocs, which also need tracking and adjusting at restore. For example, shared runtime or compiler stubs may call out to the JVM. Likewise, stubs in a stub generator blob may need to call out to the JVM or to a stub in some earlier generated blob. So Leyden will need to call CallRelocation::destination() in cases where the associated call is embedded in a non-nmethod. Note that these calls will never employ trampolines.
The obvious fix is to modify Relocation::pd_call_destination so that it drops through to call MacroAssembler::pd_call_destination if the incoming blob is not an nmethod. |
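The fall-through proposed here can be sketched as follows. This is a toy model, not the actual patch: `ToyBlob`, `ToyBlobKind` and `toy_call_destination` are hypothetical names, and the trampoline table stands in for whatever lookup the real code performs. It only illustrates the decision: consult trampoline stubs for nmethods, and return the raw branch target for any other blob.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy model only; none of these names are HotSpot's.
enum class ToyBlobKind { Nmethod, Other };

struct ToyBlob {
  ToyBlobKind kind;
  // Trampoline stub address -> destination stored in that stub.
  std::map<uint64_t, uint64_t> trampolines;
};

// `raw_target` is the address decoded from the branch instruction itself.
inline uint64_t toy_call_destination(const ToyBlob& blob, uint64_t raw_target) {
  if (blob.kind != ToyBlobKind::Nmethod) {
    return raw_target;  // non-nmethod blobs never use trampolines
  }
  auto it = blob.trampolines.find(raw_target);
  return it != blob.trampolines.end() ? it->second : raw_target;
}
```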
@adinn is right. I thought it mostly affected code during CodeBlob expansion, but that is not the case.
|
To fix Leyden premain, I'd suggest changing the "nmethod expected" assert in NativeCall::destination() into conditional code that returns the "raw" destination if it is not an nmethod, and optionally restoring the following performance optimization (with a comment as suggested by Vladimir):
But I don't have a strong opinion on whether it should be fixed here or only in Leyden. |
Yes, I agree that will work as a solution. I would recommend making this change in main. I think it is reasonable to expect NativeCall::destination() to be able to access the target for any instruction that can be viewed as a NativeCall, irrespective of whether it is embedded in an nmethod or some other blob. Clearly, the assert confirms that the current mainline code does not use it for anything other than an nmethod but there is nothing to say that it has to remain that way. Leyden is just one potential case where we might want to use it for some other blob. |
@vnkozlov I was more right than I even realized! I was only concerned about generated stubs but, as you point out, we will also call NativeCall::destination() when processing a native call from JITted method code while it is still residing in a CodeBuffer. |
Restoring performance check in
@dean-long, I think it should be added in your changes. @adinn, I suggest you test these changes with the Leyden changes for stubs. |
@vnkozlov I applied @dean-long's patch to my Leyden premain repo that saves and restores generated stubs. Without the above extra patch it crashes. With it everything works fine. So, @dean-long assuming the above tweak is applied I believe it is good to go. |
Unfortunately, adding the shortcut for self-calls is not enough for Leyden. Trampoline calls to always-reachable targets are bound early to their destination, so there can be NativeCalls that are not self-calls. To see this in a debug build, this line needs to be adjusted:
|
Do we generate trampolines for "always-reachable targets"? |
No, there's no trampoline stub. But we still call destination().
|
Any destination == addr call needs a trampoline stub to store the final destination. The benefit of early binding for always reachable calls is we can avoid creating a trampoline stub. An alternative would be to always store the destination in the CallRelocation. |
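The trade-off described in this comment can be modelled in a few lines. This is a hedged sketch: `ToyCallSite`, `toy_emit_call` and `toy_destination` are hypothetical names, and the "branch to self" placeholder is an assumption read off the "destination == addr" wording above rather than a quote of the real emitter.

```cpp
#include <cassert>
#include <cstdint>

// Toy model only; none of these names are HotSpot's.
struct ToyCallSite {
  uint64_t addr;       // address of the BL instruction
  uint64_t bl_target;  // where the BL currently points
  bool     has_stub;   // whether a trampoline stub was created
  uint64_t stub_dest;  // final destination kept in the stub (if any)
};

inline ToyCallSite toy_emit_call(uint64_t addr, uint64_t dest, bool always_reachable) {
  if (always_reachable) {
    // Early binding: branch straight to the destination, no stub needed.
    return ToyCallSite{addr, dest, false, 0};
  }
  // Late binding: the BL points at itself for now; the stub remembers
  // the real destination.
  return ToyCallSite{addr, addr, true, dest};
}

inline uint64_t toy_destination(const ToyCallSite& c) {
  return c.has_stub ? c.stub_dest : c.bl_target;
}
```

In this model an early-bound call is a call whose branch target differs from its own address and which has no stub, which matches the case raised earlier in the thread: calls that are neither self-calls nor trampoline calls.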
We should be pessimistic in Leyden. When we load AOT code there is no guarantee that the destination is reachable. |
So for Leyden it sounds like you need to change |
Or perhaps just adapt MacroAssembler::far_branches(). It returns false if the code cache max range exceeds branch_range. In Leyden we can make it return false when we are generating AOT code. |
Oops, sorry, I got that the wrong way round. We need to change is_always_within_branch_range() as @dean-long suggested. |
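One way to picture the pessimistic variant discussed here is a guard in front of the reachability answer. The sketch below is illustrative only: `ToyCodegenConfig`, its `generating_aot_code` flag and `toy_may_bind_early` are hypothetical names, not Leyden or mainline code.

```cpp
#include <cassert>

// Toy model only; none of these names are HotSpot's or Leyden's.
struct ToyCodegenConfig {
  bool generating_aot_code;  // hypothetical flag: code will be saved and reloaded later
};

// `reachable_in_this_run` is the answer a normal range check would give
// for the current code cache layout.
inline bool toy_may_bind_early(const ToyCodegenConfig& cfg, bool reachable_in_this_run) {
  if (cfg.generating_aot_code) {
    // The code may be loaded at a different address in a later run,
    // so reachability observed now proves nothing.
    return false;
  }
  return reachable_in_this_run;
}
```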
Hi @dean-long, C2 generates code into CodeBuffer. Some calls have targets always within a branch range. Direct BL instructions are generated for them. Such calls don't have My current knowledge of the area:
IMO we should fix I don't think |
We also should somehow guard |
For runtime call inside CodeCache |
With my limited knowledge of AOT code, we should always generate trampoline based code for AArch64. |
We might need to adapt |
We already return During this PR testing with Leyden I also found that we need to do the same in codestub_branch_needs_far_jump() And now in |
If there is other code calling Assembler::reachable_from_branch_at() directly then you might need to change that function too. |
Yes, I will do that. But this should not prevent you from pushing your changes. I only request to add "optimization" check |
@eastig, your understanding is correct.
That's roughly what this patch does. I detect expand by checking dest->blob() and orig_addr. However, I don't see an easy way to detect trampoline vs non trampoline calls in the shared code iterator. Instead, I removed the fast-path trampoline lookup during expand and find the trampoline call-sites by iterating their stubs to find owners.
It is used by expand(). But maybe you meant copy_code_to(). I would like to keep additional changes to a minimum, to make back-ports easier. I suggest a separate RFE for further improvements. |
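The stub-driven lookup described in this comment (finding trampoline call-sites by iterating the stubs and asking each for its owner) can be sketched like this. It is a toy model under assumptions: `ToyTrampolineStub`, `toy_index_stubs_by_owner` and `toy_is_trampoline_call` are hypothetical names, and offsets stand in for real addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy model only; none of these names are HotSpot's.
struct ToyTrampolineStub {
  std::size_t stub_offset;   // where the stub lives in the buffer
  std::size_t owner_offset;  // the call-site that owns this stub
};

using ToyOwnerMap = std::map<std::size_t, std::size_t>;  // call-site -> stub

// Walk the stubs once and record which call-site owns each of them.
inline ToyOwnerMap toy_index_stubs_by_owner(const std::vector<ToyTrampolineStub>& stubs) {
  ToyOwnerMap owners;
  for (const ToyTrampolineStub& s : stubs) {
    owners[s.owner_offset] = s.stub_offset;
  }
  return owners;
}

// A call-site is a trampoline call only if some stub names it as owner.
inline bool toy_is_trampoline_call(const ToyOwnerMap& owners, std::size_t call_offset) {
  return owners.count(call_offset) != 0;
}
```

Because membership is decided by the stubs rather than by inspecting the branch target, an early-bound call never appears in the map, so in this model it cannot be mistaken for a trampoline call.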
Good.
@dean-long I'm OK with Vladimir's suggestion just to include the "optimization" check. This fixes the problem with processing relocations when saving/restoring AOT code, including in generated stub routines. N.B. unlike nmethods, generated stub code can contain direct pc-relative branches within the buffer which do not target a trampoline. The arraycopy stub is one example. However, I don't believe this invalidates your assumptions about how to handle buffer resize events, because buffers used for stubs are pre-allocated large enough to avoid the need for resizing. |
That sounds fine. In fact, they probably don't need to use a Relocation at all (except maybe in Leyden). If a forward reference needs a fixup, it can use a Label. What would invalidate current assumptions is trying to support trampoline stubs in non-nmethods. We can cross that bridge when we get to it. |
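The Label approach mentioned here is the usual back-patching technique for forward references. A minimal sketch, assuming a toy assembler in which each code slot holds only a branch displacement: `ToyLabel` and `ToyAssembler` are hypothetical and far simpler than HotSpot's `Label` and assembler classes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only; not HotSpot's Label or assembler.
struct ToyLabel {
  int bound_at = -1;         // slot index once bound, -1 before
  std::vector<int> pending;  // branch slots waiting for the label
};

struct ToyAssembler {
  std::vector<int32_t> code;  // one slot per instruction; branches store a displacement

  void nop() { code.push_back(0); }

  void branch(ToyLabel& target) {
    int here = static_cast<int>(code.size());
    if (target.bound_at >= 0) {
      code.push_back(target.bound_at - here);  // backward reference: known now
    } else {
      target.pending.push_back(here);          // forward reference: patch later
      code.push_back(0);
    }
  }

  void bind(ToyLabel& target) {
    target.bound_at = static_cast<int>(code.size());
    for (int site : target.pending) {
      code[site] = target.bound_at - site;     // fix up each waiting branch
    }
    target.pending.clear();
  }
};
```

No relocation record is involved: the fixup is resolved entirely while the buffer is being assembled.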
/integrate |
Going to push as commit 73e3e0e.
Your commit was automatically rebased without conflicts. |
@dean-long Pushed as commit 73e3e0e. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19796/head:pull/19796
$ git checkout pull/19796
Update a local copy of the PR:
$ git checkout pull/19796
$ git pull https://git.openjdk.org/jdk.git pull/19796/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 19796
View PR using the GUI difftool:
$ git pr show -t 19796
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19796.diff