8221554: aarch64 cross-modifying code #428
Hi @a74nh, welcome to this OpenJDK project and thanks for contributing! We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user a74nh" as summary for the issue. If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing |
/covered |
Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated! |
@a74nh this pull request can not be integrated until its merge conflicts are resolved:
$ git checkout aarch64_cmf
$ git fetch https://git.openjdk.java.net/jdk master
$ git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
$ git commit -m "Merge master"
$ git push
I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked when the safepoint finishes, the thread will, when waking up, still take one slow path to figure out all is done, disarm the poll and run the cross modifying fence. So we trivially need the cross modifying fence only in the poll slow path and nowhere else. If you wait for that one, maybe we can delete a few more ISBs, and IMO also delete the verification code, as you can't really mess it up any more.
I'd need to see what the code looks like - but I'm thinking it just reduces the number of cross_modify_fence calls, instead of reducing explicit isb calls in the backend (of which only 3 remain with this patch). Either way, it's good news. I'm going to try rebasing locally on the top of your patch (I'm assuming it's this one #296 ) and see where that gets me. I'd be cautious about removing the verification code - it should be useful for helping to catch issues if we do start getting 1 run in a million crash dumps. But agreed, doesn't need to be there if the whole isb process has become trivial. |
You no longer need any isb when returning from the native wrappers (interpreted or compiled variant) after my patch, because that will be done in the runtime instead when waking up from native. That should be 2 of the mentioned 3 hooks (without looking closely at the code). Which hook do we have left?
Agreed. With both patches, and with the JNI isbs removed, that leaves just one safepoint isb in the AArch64 code (plus the other isb in emit_static_call_stub). The test patch exists not just to test the AArch64 but the common code too (ideally it would be extended to other targets too). Are we happy that the cross_modify_fence is called at all the required points? A hole in the common code would fail only very rarely. It does require quite a bit of code to add this test though. |
Mailing list message from David Holmes on hotspot-dev: Hi Alan, On 1/10/2020 2:30 am, Alan Hayward wrote:
I don't agree with the change here. The cross_modify_fence() is not http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html "The name "instruction_pipeline" seems a bit implementation specific @robehn , @fisk please chime in here. :) Thanks, |
The only reason for having cmf() on thread is the validation in debug builds, right?
You only need the boolean in debug builds on JavaThreads and you don't need to move the cmf() from OA and create the new one? |
I have no strong feeling on the names of these functions. |
Mailing list message from Andrew Haley on hotspot-dev: On 01/10/2020 03:18, David Holmes wrote:
I agree with you, David. cross_modify_fence() should not be moved. -- |
Mailing list message from Andrew Haley on hotspot-dev: On 02/10/2020 09:08, Alan Hayward wrote:
Right, but you can get the JavaThread efficiently any time you want, -- |
Oh, ok, didn't spot that. |
Patch updated.
|
I would recommend someone that knows aarch64 better than me, as I said mostly looked at shared parts. |
Mailing list message from Andrew Haley on hotspot-dev: On 11/11/2020 14:32, Alan Hayward wrote:
Please look at the issues still marked Pending. -- |
Mailing list message from Alan Hayward on hotspot-dev:
I'm probably missing something obvious, but I don't see anything Spotted the comments in the patch were slightly out of date, so Alan.
Still pending.
}
#endif
}
Unless VerifyCrossModifyFence is turned on in debug builds it will almost never be used. Please turn this on by default in AArch64 debug builds.
Please...
Aha - Looks like your comments hadn't been made public until now.
The problem is it massively slows down a run. A tier1 test run for fastdebug went from 1h 32m 58s to
3h 43m 47s. I didn't think that would be acceptable.
@@ -1410,7 +1409,8 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
__ mov(c_rarg0, rthread);
__ mov(rscratch2, CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans));
__ blr(rscratch2);
__ maybe_isb();
// An instruction sync is required here after the call into the VM. However,
// that will have been caught in the VM by a cross_modify_fence call.
I think this wording is confusing; I had to read it several times.
Would not something like
'''
// When we return from the VM, the instruction stream may have
// been modified. Previously we emitted an ISB at this point, but
// it's now unnecessary because the VM itself calls cross_modify_fence()
'''
be better?
This is still pending.
I wanted to avoid mentioning code that no longer exists. (Maybe it's best to just drop the comment?)
How about:
// When we return from the VM, the instruction stream may have
// been modified. Therefore an isb is required. The VM will
// have already done this by calling cross_modify_fence().
// barrier for the instruction code cache. It guarantees that any memory access
// to the instruction code preceding the fence is not reordered w.r.t. any
// memory accesses to instruction code subsequent to the fence in program order.
// It should be used in conjunction with safepointing to ensure that changes
This is rather misleading: the AArch64 needs the ISB for the instruction pipeline rather than the cache, which is invalidated by the IC IVAU broadcast. I suspect other processors work in the same way.
The language in the AArch64 spec is better, IMO:
"ensures that all instructions that come after the ISB instruction in program order are fetched from
the cache or memory after the ISB instruction has completed"
This better?
// Finally, we define an "instruction_fence" operation, which ensures that all
// instructions that come after the ISB instruction in program order are fetched
// from the cache or memory after the ISB instruction has completed
// It should be used in conjunction with safepointing to ensure that changes
// to the instruction stream are seen on exit from a safepoint. Namely:
.....etc
// Finally, we define an "instruction_fence" operation, which ensures that all
// instructions that come after the fence in program order are fetched
// from the cache or memory after the fence has completed
Mailing list message from Andrew Haley on hotspot-dev: On 11/12/20 5:28 PM, Alan Hayward wrote:
"Hadn't been made public?" There's a way to comment without the owner
But why is it so expensive? All it does is mark the threads at a safepoint -- |
Mailing list message from Andrew Haley on hotspot-dev: On 11/12/20 5:28 PM, Alan Hayward wrote:
The comment only makes sense in the context of the code that was there
This is self contradicting: firstly you say an ISB is required, then -- |
Ok, please ignore my comment about slowdowns. Patch updated to enable VerifyCrossModifyFence always for aarch64 debug.
Agreed. The comment only makes sense when it refers to code that used to exist. And I really dislike comments that do that. So, I've dropped them completely.
Done. |
Could you please incorporate the following orderAccess changes for windows_aarch64.hpp in your PR?: https://gist.github.com/mo-beck/2dc66d068741030b4a422b57607bea8e#file-orderaccess_windows_aarch64_hpp-diff
Happy to do this. The file has been added since I started the patch, so that's fine. But as a disclaimer for myself, I've no way of testing it, and I'm going to have to assume it just works for Windows AArch64. |
/integrate |
/sponsor |
@nick-arm @a74nh Since your change was applied there have been 24 commits pushed to the
Your commit was automatically rebased without conflicts. Pushed as commit d183fc7. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
The AArch64 port uses maybe_isb in places where an ISB might be required
because the code may have safepointed. These maybe_isbs are very conservative
and in many places are used when a safepoint has not happened.
cross_modify_fence was added in common code to place a barrier in all the
places after a safepoint has occurred. All the uses of it are in common code,
yet it remains unimplemented on AArch64.
This set of patches implements cross_modify_fence for AArch64 and reconsiders
every use of maybe_isb, discarding many of them. In addition, it introduces
a new diagnostic option, which when enabled on AArch64 tests the correct
usage of the barriers.
Advantage of this patch is threefold:
Patch 1: Split cross_modify_fence
This is simply refactoring work split out to simplify the other two patches.
instruction_fence() is provided by each target and simply places
a fence for the instruction stream.
cross_modify_fence() is now a member of JavaThread and just calls
instruction_fence. This function will be extended in Patch 3.
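The split described in Patch 1 can be sketched roughly as follows. This is a simplified standalone illustration, not the HotSpot code: `JavaThreadSketch`, the free function `instruction_fence()` and the `fences_issued` counter are hypothetical stand-ins, and a real target implementation would emit its own barrier (an ISB on AArch64) instead of incrementing a counter.

```cpp
#include <cassert>

// Hypothetical counter used only so the sketch is observable on any host.
int fences_issued = 0;

// Stand-in for the per-target instruction_fence(). A real AArch64
// implementation would issue an ISB here; this one only counts calls.
void instruction_fence() {
  ++fences_issued;
}

// Stand-in for JavaThread: cross_modify_fence() is a member function
// that, at this stage, simply forwards to instruction_fence().
class JavaThreadSketch {
 public:
  void cross_modify_fence() {
    instruction_fence();
  }
};
```

Keeping the per-thread entry point separate from the target-specific fence leaves room for per-thread bookkeeping to be added to cross_modify_fence() later without touching each target.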
Patch 2: Use cross_modify_fence instead of maybe_isb
The [n] References refer to the comments for cross_modify_fence in
thread.hpp.
These are all the existing uses of maybe_isb in the AArch64 target:
Instances of Java code calling a VM function
** MacroAssembler::call_VM_leaf_base()
** generate_fast_get_int_field0()
** stubGenerator_aarch64 generate_throw_exception()
** sharedRuntime_aarch64 generate_handler_blob()
** SharedRuntime::generate_resolve_blob()
** C1 LIR_Assembler::rt_call
** C1 StubAssembler::call_RT(): used by generate_exception_throw,
generate_handle_exception, generate_code_for.
** OptoRuntime::generate_exception_blob()
Instances of Java code calling a JNI function
** SharedRuntime::generate_native_wrapper()
** TemplateInterpreterGenerator::generate_native_entry()
but it completed during the call. This happens if the code doesn't
branch on safepoint_in_progress
reguard_yellow_pages and complete_monitor_unlocking_C are after the thread
goes back into its original state, so are covered by [2] and [3], the
same as a normal VM call.
Patching functions
** patch_callers_callsite() (called by gen_c2i_adapter())
C1 MacroAssembler::emit_static_call_stub()
destination is required for proper functioning.
be picked up.
Patch 3: Add cross modify fence verification
The VerifyCrossModifyFence diagnostic flag enables verification of the correct
usage of instruction barriers. It can safely be enabled on any Java run.
Enabling it will cause the following:
Once all threads have been brought to a safepoint, each thread will be
marked.
On a cross_modify_fence and safepoint_fence the mark for that thread
will be cleared.
On entry to a method and in a safepoint poll, the thread is checked.
If it is marked, then the code will error.
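The mark, clear and check steps above can be modelled with a small standalone sketch. It is only an illustration under stated assumptions: `ThreadSketch`, its `requires_cross_modify_fence` field, `verify()` and `mark_all_at_safepoint()` are hypothetical names rather than the patch's actual code, and the check returns a bool where the real flag would raise an error.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-thread verification state; not the HotSpot type.
struct ThreadSketch {
  bool requires_cross_modify_fence = false;  // the per-thread "mark"

  // A cross_modify_fence (or safepoint fence) clears this thread's mark.
  void cross_modify_fence() { requires_cross_modify_fence = false; }

  // Check performed on method entry and in a safepoint poll.
  // Returns true if the thread is unmarked, i.e. the check passes.
  bool verify() const { return !requires_cross_modify_fence; }
};

// Once all threads have been brought to a safepoint, mark each of them.
void mark_all_at_safepoint(std::vector<ThreadSketch>& threads) {
  for (ThreadSketch& t : threads) {
    t.requires_cross_modify_fence = true;
  }
}
```

In this model a thread that resumes after a safepoint without passing through a fence still carries its mark, so its next check fails; that is the kind of missing-barrier path the flag is meant to catch.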
Download
$ git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428
$ git checkout pull/428