-
Notifications
You must be signed in to change notification settings - Fork 5.8k
8294729: [s390] Implement nmethod entry barriers #10558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Welcome back tsteele! A progress list of the required criteria for merging this PR into |
@backwaterred The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
48bd190
to
945f2c2
Compare
- add missing parameter - debug stack frames
945f2c2
to
a2fff19
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken a quick look. Please find my change requests.
0eed052
to
06412f9
Compare
06412f9
to
3310743
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like you missed implementing the c2i entry barrier that comes hand in hand with the nmethod entry barriers. Once you implemented that it looks like you need to spill more registers in g1_write_barrier_pre() in g1BarrierSetAssembler_s390.cpp with this change as well, as all java arguments are live at that point and with G1 the phantom load of the method holder will call the pre_write barrier which occasionally will call the runtime. At that point, even floating point arguments must be saved. The current s390 g1 pre write barrier seems to assume it is only called from the interpreter, I believe. All other platforms went through the same dance when adding nmethod entry barriers.
Good catch! Is the c2i entry barrier really required without concurrent class unloading? I guess that could get handled in a separate JBS issue and is not strictly necessary to make the JVM usable again. |
@backwaterred Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to revoke my review. This does no longer look correct.
|
||
// Load class loader data to determine whether the method's holder is concurrently unloading. | ||
__ load_method_holder(Z_R0_scratch, Z_method); | ||
__ z_lg(Z_R0_scratch, in_bytes(InstanceKlass::class_loader_data_offset()), Z_R0_scratch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think R0 can be used for storage addressing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those cases where 'if you use R0 it really means 0 and not the contents of the register' are a real gotcha.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's very simple, basically. R0 is good for everything, except for address calculations which are performed as part of an instruction execution, in which case R0 indicates "not specified". Take LA Z_R0,0(Z_R0,Z_R0) as an example. The leftmost use of Z_R0 is perfectly fine. It just serves result register. The middle (index) and right (base address) occurrences, however, are inputs for the triadic add to form the address. As said above, index and base register are "not specified" in the example and a "0" is fed into the adder.
You will get used to this pretty quick.
|
||
// Class loader is weak. Determine whether the holder is still alive. | ||
__ z_lg(Z_R1_scratch, in_bytes(ClassLoaderData::holder_offset()), Z_R0_scratch); | ||
__ resolve_weak_handle(Address(Z_R1_scratch), Z_R1_scratch, Z_R0_scratch, Z_R2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're killing R2 which contains the 1st argument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought this might draw some attention. Is there a better register to use, or will I simply have to save & restore the value of whichever register I choose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using R0 and R1 implicitly is ok, because they are scratch regs. Other regs should get passed by argument. There's already a tmp reg where the barrier is inserted which you can simply pass.
@@ -236,11 +239,13 @@ void G1BarrierSetAssembler::g1_write_barrier_pre(MacroAssembler* masm, Decorator | |||
|
|||
// Push frame to protect top frame with return pc and spilled register values. | |||
__ save_return_pc(); | |||
__ push_frame_abi160(0); // Will use Z_R0 as tmp. | |||
__ push_frame_abi160(nbytes_save); // Will use Z_R0 as tmp. | |||
__ save_volatile_regs(Z_SP, frame::z_abi_160_size, true, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adds overhead for other usages which don't need it. May be still ok, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. The PPC implementation I looked at has more complicated logic for determining if a frame is needed, but it relies on a PreservationLevel flag which is not present on s390. If there is another way to deduce whether a frame is required here, or if we should add that flag on s390, I am happy to make that change as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PPC uses fine-grained control over what needs to get preserved. I think this would be good for s390, too. Be aware that changing this requires modification of the whole GC interface and all GCs. I'd split the work into 2 JBS issues and PRs. Maybe use one for nmethod entry barriers only and one for c2i entry barriers with modified GC interface?
I still believe that c2i entry barriers are currently not needed for s390 to work correctly and are hence less urgent. They are needed for concurrent class unloading which only comes with ZGC and ShenandoahGC which are unavailable on s390.
Written earlier; tests just completed. I made some changes:
I've run the tests in |
Noted. Since the platform is building again, it may be a good idea to follow your suggestion of pushing the changes up to your review, and then create a separate PR for the new changes. I suppose this depends on whether @fisk believes the previous changes (up to the time of their first comment)[1] to be complete. |
72382f7
to
49e701d
Compare
@backwaterred Please do not rebase or force-push to an active PR as it invalidates existing review comments. All changes will be squashed into a single commit automatically when integrating. See OpenJDK Developers’ Guide for more information. |
Apologies for the force-push and associated chaos. I believe the best course of action is to merge this PR as it was before the suggestions by fisk (which are appreciated nonetheless), as that PR solves the more immediate problem of fixing the build on s390. I have incorporated fisk's suggestions in a separate PR, where I propose we continue development. In the new PR, I have integrated Martin & Lutz' improvements to my s390 assembly given in the reviews above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, this is the version which looks correct. Please consider moving the nmethod entry barrier implementation to the end such that you don't have to move it in your next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, looks good.
Also, in .hpp file, please. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
__ load_const(Z_R1_scratch, (uint64_t)StubRoutines::zarch::nmethod_entry_barrier()); // 2*6 bytes | ||
|
||
// Load value from current java object: | ||
__ z_lg(Z_R0_scratch, in_bytes(bs_nm->thread_disarmed_offset()), Z_thread); // 6 bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this loading from the current java Thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just saw this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think your question is about the comment on line 137. Yes, this could be improved.
Excellent :-) and thanks again for all the support. I believe it is time to: /integrate |
Going to push as commit f4d8c20.
Your commit was automatically rebased without conflicts. |
@backwaterred Pushed as commit f4d8c20. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
This draft PR implements native method barriers on s390. When complete, this will fix the build, and bring the other benefits of JDK-8290025 to that platform.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/10558/head:pull/10558
$ git checkout pull/10558
Update a local copy of the PR:
$ git checkout pull/10558
$ git pull https://git.openjdk.org/jdk pull/10558/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 10558
View PR using the GUI difftool:
$ git pr show -t 10558
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/10558.diff