8293351: Add second tmp register to aarch64 BarrierSetAssembler::load_at #10161
This is a welcome improvement. Thanks.
@xmas92 This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 98 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@theRealAph, @tschatzl, @RealFYang) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
@@ -2463,7 +2463,7 @@ void MacroAssembler::verify_heapbase(const char* msg) {
}
#endif

void MacroAssembler::resolve_jobject(Register value, Register thread, Register tmp) {
void MacroAssembler::resolve_jobject(Register value, Register tmp1, Register tmp2) {
You might also want to rename the second parameter in its declaration in macroAssembler_aarch64.hpp.
Good catch 👍 . Fixed
Would you mind a few more tweaks? Thanks.
@@ -69,7 +70,7 @@ class G1BarrierSetAssembler: public ModRefBarrierSetAssembler {
#endif

void load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
Register dst, Address src, Register tmp1, Register tmp_thread);
Register dst, Address src, Register tmp1, Register tmp2);
A similar modification is needed for the function definition in barrierSetAssembler_aarch64.cpp.
👍 . Fixed
@@ -288,8 +288,7 @@ void BarrierSetAssembler::c2i_entry_barrier(MacroAssembler* masm) {
__ stp(r10, r11, Address(__ pre(sp, -2 * wordSize)));
No need to save r11 then?
As resolve_weak_handle may call into the VM, r11 may still be clobbered. The reason that rscratch[1/2] was not used previously is that some GCs in the resolve_weak_handle call graph used them implicitly. This is all explicit now, so rscratch is used instead, as that seems to be the way the GC API has gone, with all registers being explicit.
There is still some room for improvement here with regard to being more consistent about which registers are used.
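To illustrate why making every clobbered register an explicit parameter helps, here is a minimal standalone sketch. It is not HotSpot code: `Reg`, `all_different` and `load_at_is_safe` are hypothetical names invented for this example, and the check only mimics the general idea of verifying that registers passed for different roles do not alias.

```cpp
#include <cassert>
#include <initializer_list>
#include <set>

// Hypothetical stand-in for a machine register; not HotSpot's Register type.
enum class Reg { r0, r10, r11, rscratch1, rscratch2, rthread };

// Returns true when no register appears twice in the list.
inline bool all_different(std::initializer_list<Reg> regs) {
    std::set<Reg> seen;
    for (Reg r : regs) {
        if (!seen.insert(r).second) {
            return false;  // r was already present: two roles share a register
        }
    }
    return true;
}

// Toy barrier helper: every register it may clobber is a parameter, so the
// caller can see, and check, exactly which registers are at risk.
inline bool load_at_is_safe(Reg dst, Reg tmp1, Reg tmp2) {
    return all_different({dst, tmp1, tmp2});
}
```

With an implicit scratch register hidden inside the helper, a caller holding a live value in that register would have no way to run such a check.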
"As resolve_weak_handle may call into the VM r11 may still be clobbered. "
-- I don't quite understand this. If this is the case, shouldn't r11 be saved & restored immediately before and after the VM call, together with the other live caller-save registers?
Maybe? Actually looking at it a second time it seems like r0-r7 and r9-r17 are pushed to the stack at the vm call I was looking at.
I am probably missing something. I do not fully understand the reason for pushing the registers at this point, not earlier or later. Would have to investigate further. I do however think that such a change belongs to a separate issue/PR. This PR should not change the instruction emission, only which registers are used.
I think the reason is that register r11 was previously used as a temp register (and thus clobbered) by resolve_weak_handle. That's why this register was saved/restored here. But this has changed, since it is not used here anymore with your change. I am OK if you want to handle that in another PR. Just a reminder. Thanks.
I looked into it some further. You are correct. The reason I thought it was important was that this change hard crashed:
+ __ str(r10, Address(__ pre(sp, -wordSize)));
- __ stp(r10, r11, Address(__ pre(sp, -2 * wordSize)));
But I guess that is because I am misaligning the stack, or there is some other reason sp cannot be bumped by 8, only 16, as this change is fine:
+ __ str(r10, Address(__ pre(sp, -2 * wordSize)));
- __ stp(r10, r11, Address(__ pre(sp, -2 * wordSize)));
Maybe using the RegSets and push would do this correctly, or it is some special invariant for these adapters.
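The difference between the two variants can be checked with plain arithmetic. The sketch below is a standalone illustration, not HotSpot code. It assumes the usual AArch64 rule that sp must be 16-byte aligned when it is used to access memory, and `kWordSize`, `kStackAlignment`, `is_sp_aligned` and `pre_decrement` are names made up for this example.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kWordSize = 8;         // bytes per 64-bit register
constexpr std::uint64_t kStackAlignment = 16;  // assumed sp alignment in bytes

// True when the given stack pointer value sits on a 16-byte boundary.
constexpr bool is_sp_aligned(std::uint64_t sp) {
    return sp % kStackAlignment == 0;
}

// Models pre-indexed addressing: sp is decremented before the store happens.
constexpr std::uint64_t pre_decrement(std::uint64_t sp, std::uint64_t bytes) {
    return sp - bytes;
}
```

Starting from an aligned sp, a bump of `kWordSize` leaves it misaligned, while a bump of `2 * kWordSize` keeps it aligned, which matches the crash and the working variant described above.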
The stack pointer needs 128-bit (16-byte) alignment. Use RegSets and MacroAssembler::push(RegSet, sp), which will only push pairs of registers. In this case: stp(r10, zr, Address(__ pre(sp, -2 * wordSize)));
Fixed by using MacroAssembler::push(RegSet, sp).
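As a rough model of the "push only pairs" behaviour described by the reviewer, the following standalone sketch groups register names into pairs and pads an odd count with zr. It is a toy illustration rather than the MacroAssembler implementation; `pair_up_for_push` and `bytes_pushed` are hypothetical helpers, and registers are represented as plain strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using RegPair = std::pair<std::string, std::string>;

// Groups registers into pairs for stp-style stores. An odd register count is
// padded with "zr" so that every store covers two 8-byte slots.
inline std::vector<RegPair> pair_up_for_push(std::vector<std::string> regs) {
    if (regs.size() % 2 != 0) {
        regs.push_back("zr");
    }
    std::vector<RegPair> pairs;
    for (std::size_t i = 0; i < regs.size(); i += 2) {
        pairs.emplace_back(regs[i], regs[i + 1]);
    }
    return pairs;
}

// Each pair occupies 16 bytes, so the total is always a multiple of 16.
inline std::size_t bytes_pushed(const std::vector<RegPair>& pairs) {
    return pairs.size() * 16;
}
```

For a set containing only r10, this model yields the single pair (r10, zr), the same shape as the stp suggested in the review, and the stack moves by 16 bytes.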
@@ -46,6 +46,7 @@ class G1BarrierSetAssembler: public ModRefBarrierSetAssembler {
Register pre_val,
Register thread,
Register tmp,
It would be better to use tmp1 as the definition site does.
👍 Fixed. Made both temporary register names consistent across g1BarrierSetAssembler.
@@ -274,6 +275,7 @@ void G1BarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorator
dst /* pre_val */,
rthread /* thread */,
tmp1 /* tmp */,
For tmp1 and tmp2, the comments should also be "/* tmp1 */" and "/* tmp2 */" to match the parameter names.
Fixed 👍
@@ -296,6 +298,7 @@ void G1BarrierSetAssembler::oop_store_at(MacroAssembler* masm, DecoratorSet deco
tmp2 /* pre_val */,
rthread /* thread */,
tmp1 /* tmp */,
rscratch2 /* tmp */,
"tmp1 /* tmp1 */," and "rscratch2 /* tmp2 */," to match the signature argument names.
Fixed 👍
Mailing list message from Andrew Haley on hotspot-dev: On 9/6/22 08:28, Axel Boldt-Christmas wrote:
It would. Using a RegSet for push and pop also reduces the risk of error.
@shqking @stooart-mon Your feedback has been addressed with code changes. Is there anything more you would like to add?
/integrate
/sponsor
Going to push as commit 725f41f.
Your commit was automatically rebased without conflicts.
Add a second tmp register to the BarrierSetAssembler::load_at GC API for aarch64.
Today G1 and Shenandoah use a second temporary register. This will also be the case for generational ZGC.
Testing: Oracle platforms tier 1-3
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10161/head:pull/10161
$ git checkout pull/10161
Update a local copy of the PR:
$ git checkout pull/10161
$ git pull https://git.openjdk.org/jdk pull/10161/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10161
View PR using the GUI difftool:
$ git pr show -t 10161
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10161.diff