Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8221554: aarch64 cross-modifying code #428

Closed
wants to merge 14 commits into from
Closed

Conversation

@a74nh
Copy link
Contributor

@a74nh a74nh commented Sep 30, 2020

The AArch64 port uses maybe_isb in places where an ISB might be required
because the code may have safepointed. These maybe_isbs are very conservative
and are used in many places are used when a safepoint has not happened.

cross_modify_fence was added in common code to place a barrier in all the
places after a safepoint has occurred. All the uses of it are in common code,
yet it remains unimplemented on AArch64.

This set of patches implements cross_modify_fence for AArch64 and reconsiders
every uses of maybe_isb, discarding many of them. In addition, it introduces
a new diagnostic option, which when enabled on AArch64 tests the correct
usage of the barriers.

Advantage of this patch is threefold:

  • Reducing the number of ISBs - giving a theoretical performance improvement.
  • Use of common code instead of backend specific code.
  • Additional test diagnostic options

Patch 1: Split cross_modify_fence

This is simply refactoring work split out to simplify the other two patches.

instruction_fence() is provided by each target and simply places
a fence for the instruction stream.

cross_modify_fence() is now a member of JavaThread and just calls
instruction_fence. This function will be extended in Patch 3.

Patch 2: Use cross_modify_fence instead of maybe_isb

The [n] References refer to the comments for cross_modify_fence in
thread.hpp.

This is all the existing uses of maybe_isb in the AArch64 target:

  1. Instances of Java code calling a VM function

    • This encapsulates the changes to:
      ** MacroAssembler::call_VM_leaf_base()
      ** generate_fast_get_int_field0()
      ** stubGenerator_aarch64 generate_throw_exception()
      ** sharedRuntime_aarch64 generate_handler_blob()
      ** SharedRuntime::generate_resolve_blob()
      ** C1 LIR_Assembler::rt_call
      ** C1 StubAssembler::call_RT(): used by Used by generate_exception_throw,
      generate_handle_exception, generate_code_for.
      ** OptoRuntime::generate_exception_blob()
    • Any changes will be caught due to calls to [2] or [3] by the VM function.
    • Any calls that do not call [2] or [3] do not require an ISB.
    • This patch is more optimal for these cases.
  2. Instances of Java code calling a JNI function

    • This encapsulates the changes to:
      ** SharedRuntime::generate_native_wrapper()
      ** TemplateInterpreterGenerator::generate_native_entry()
    • A safepoint still in progress after the call with be caught by [4].
    • An ISB is still required for the case where there was a safepoint
      but it completed during the call. This happens if the code doesn't
      branch on safepoint_in_progress
    • In the SharedRuntime version, the two possible calls to
      reguard_yellow_pages and complete_monitor_unlocking_C are after the thread
      goes back into it's original state, so are covered by [2] and [3], the
      same as a normal VM call.
    • This patch is only more optimal for the two post-JNI calls.
  3. Patching functions

    • This encapsulates the changes to:
      ** patch_callers_callsite() (called by gen_c2i_adapter())
    • This results in code being patched, but does not safepoint
    • Therefore an ISB is required.
    • This patch introduces no change here.
  4. C1 MacroAssembler::emit_static_call_stub()

    • Calls ISB (not maybe_isb)
    • By design, the patching doesn't require that the up-to-date
      destination is required for proper functioning.
    • However, the ISB makes it most likely that the new destination will
      be picked up.
    • This patch introduces no change here.

Patch 3: Add cross modify fence verification

The VerifyCrossModifyFence diagnostic flag enables confirmation to the correct
usage of instruction barriers. It can safely be enabled on any Java run.

Enabling it will cause the following:

  • Once all threads have been brought to a safepoint, each thread will be
    marked.

  • On a cross_modify_fence and safepoint_fence the mark for that thread
    will be cleared.

  • On entry to a method and in a safepoint poll, then the thread is checked.
    If it is marked, then the code will error.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Testing

Linux aarch64 Linux arm Linux ppc64le Linux s390x Linux x64 Linux x86 Windows x64 macOS x64
Build ✔️ (1/1 passed) ✔️ (1/1 passed) ✔️ (1/1 passed) ✔️ (1/1 passed) ✔️ (6/6 passed) ✔️ (2/2 passed) ✔️ (2/2 passed) ✔️ (2/2 passed)
Test (tier1) ✔️ (9/9 passed) ✔️ (9/9 passed) ✔️ (9/9 passed) ✔️ (9/9 passed)

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/428/head:pull/428
$ git checkout pull/428

@a74nh a74nh changed the title [AArch64: Use cross_modify_fence instead of maybe_isb 8221554: AArch64: Use cross_modify_fence instead of maybe_isb Sep 30, 2020
@bridgekeeper bridgekeeper bot added the oca label Sep 30, 2020
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Sep 30, 2020

Hi @a74nh, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user a74nh" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Sep 30, 2020

@a74nh The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

Loading

@openjdk openjdk bot added the hotspot label Sep 30, 2020
@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Sep 30, 2020

/covered

Loading

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Sep 30, 2020

Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

Loading

@a74nh a74nh changed the title 8221554: AArch64: Use cross_modify_fence instead of maybe_isb 8221554: aarch64 cross-modifying code Sep 30, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Sep 30, 2020

@a74nh this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout aarch64_cmf
git fetch https://git.openjdk.java.net/jdk master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Sep 30, 2020

Webrevs

Loading

@fisk
Copy link
Contributor

@fisk fisk commented Oct 1, 2020

I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in the poll slow path and nowhere else. So maybe if you wait for that one, we can delete a few more ISBs, and IMO also delete the verification code, as you can't really mess it up any more.

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Oct 1, 2020

I am about to integrate concurrent stack scanning for ZGC. This patch changes things a bit so that disarming of the poll word is always done only by the thread itself. The consequence is that if a thread is in native or blocked, and the safepoint finishes, then the thread will when waking up, still take one slow path to figure out all is done, disarm the poll and run the cross modifying fence. The consequence is that we trivially need the cross modifying fence only in the poll slow path and nowhere else. So maybe if you wait for that one, we can delete a few more ISBs, and IMO also delete the verification code, as you can't really mess it up any more.

I'd need to see what the code looks like - but I'm thinking it just reduces the number of cross_modify_fence calls, instead of reducing explicit isb calls in the backend (of which only 3 remain with this patch). Either way, it's good news. I'm going to try rebasing locally on the top of your patch (I'm assuming it's this one #296 ) and see where that gets me.

I'd be cautious about removing the verification code - it should be useful for helping to catch issues if we do start getting 1 run in a million crash dumps. But agreed, doesn't need to be there if the whole isb process has become trivial.

Loading

@fisk
Copy link
Contributor

@fisk fisk commented Oct 1, 2020

You no longer need any isb when returning from the native wrappers (interpreted or compiled variant) after my patch. That should be 2 if the mentioned 3 hooks (without looking closely at the code). Because that will be done in the runtime instead when waking up from native. Which hook do we have left?

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Oct 1, 2020

Agreed. With both patches, and with the JNI isbs removed, that leaves just one safepoint isb in the AArch64 code (plus the other isb in emit_static_call_stub).

The test patch exists not just to test the AArch64 but the common code too (ideally it would be extended to other targets too). Are we happy that the cross_modify_fence is called at all the required points? A hole in the common code would fail only very rarely. It does require quite a bit of code to add this test though.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Oct 2, 2020

Mailing list message from David Holmes on hotspot-dev:

Hi Alan,

On 1/10/2020 2:30 am, Alan Hayward wrote:

The AArch64 port uses maybe_isb in places where an ISB might be required
because the code may have safepointed. These maybe_isbs are very conservative
and are used in many places are used when a safepoint has not happened.

cross_modify_fence was added in common code to place a barrier in all the
places after a safepoint has occurred. All the uses of it are in common code,
yet it remains unimplemented on AArch64.

This set of patches implements cross_modify_fence for AArch64 and reconsiders
every uses of maybe_isb, discarding many of them. In addition, it introduces
a new diagnostic option, which when enabled on AArch64 tests the correct
usage of the barriers.

Advantage of this patch is threefold:
* Reducing the number of ISBs - giving a theoretical performance improvement.
* Use of common code instead of backend specific code.
* Additional test diagnostic options

Patch 1: Split cross_modify_fence
=================================
This is simply refactoring work split out to simplify the other two patches.

instruction_fence() is provided by each target and simply places
a fence for the instruction stream.

cross_modify_fence() is now a member of JavaThread and just calls
instruction_fence. This function will be extended in Patch 3.

I don't agree with the change here. The cross_modify_fence() is not
related to thread API imo, it belongs in OrderAccess. The name was
deliberately selected to abstract away from the specific details of why
a given platform may need this fence:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html

"The name "instruction_pipeline" seems a bit implementation specific
about what HW architectural features need to be taken care of due to
cross-modifying code, which may or may not apply to a given platform.
Perhaps cross_modify_fence(), or something along those lines, would be
better. That makes it more clear what we are protecting against, as
opposed to what HW architectural features that might concern on a given
platform."

@robehn , @fisk please chime in here. :)

Thanks,
David

Loading

@robehn
Copy link
Contributor

@robehn robehn commented Oct 2, 2020

The only reason for having cmf() on thread is the validation in debug builds, right?
If you do it the opposite way:

inline void OrderAccess::cross_modify_fence_non_arch_specific() {
    OrderAccess::cross_modify_fence_arch_impl();
#ifdef ASSERT
    if (VerifyCrossModifyFence) {
      Thread::current()->set_requires_cross_modify_fence(false);
    }
#endif
}

You only need the boolean in debug builds on JavaThreads and you don't need to move the cmf() from OA and create the new one?

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Oct 2, 2020

Mailing list message from David Holmes on hotspot-dev:

Hi Alan,

On 1/10/2020 2:30 am, Alan Hayward wrote:

The AArch64 port uses maybe_isb in places where an ISB might be required
because the code may have safepointed. These maybe_isbs are very conservative
and are used in many places are used when a safepoint has not happened.
cross_modify_fence was added in common code to place a barrier in all the
places after a safepoint has occurred. All the uses of it are in common code,
yet it remains unimplemented on AArch64.
This set of patches implements cross_modify_fence for AArch64 and reconsiders
every uses of maybe_isb, discarding many of them. In addition, it introduces
a new diagnostic option, which when enabled on AArch64 tests the correct
usage of the barriers.
Advantage of this patch is threefold:

  • Reducing the number of ISBs - giving a theoretical performance improvement.
  • Use of common code instead of backend specific code.
  • Additional test diagnostic options
    Patch 1: Split cross_modify_fence
    =================================
    This is simply refactoring work split out to simplify the other two patches.
    instruction_fence() is provided by each target and simply places
    a fence for the instruction stream.
    cross_modify_fence() is now a member of JavaThread and just calls
    instruction_fence. This function will be extended in Patch 3.

I don't agree with the change here. The cross_modify_fence() is not
related to thread API imo, it belongs in OrderAccess. The name was
deliberately selected to abstract away from the specific details of why
a given platform may need this fence:

http://mail.openjdk.java.net/pipermail/hotspot-dev/2019-March/037153.html

"The name "instruction_pipeline" seems a bit implementation specific
about what HW architectural features need to be taken care of due to
cross-modifying code, which may or may not apply to a given platform.
Perhaps cross_modify_fence(), or something along those lines, would be
better. That makes it more clear what we are protecting against, as
opposed to what HW architectural features that might concern on a given
platform."

@robehn , @fisk please chime in here. :)

Thanks,
David

I have no strong feeling on the names of these functions.
The reason for moving it was that in the third part (the verification testing) it needs JavaThread.
In an earlier version I simply had OrderAccess::cross_modify_fence(JavaThread *thread). But then OrderAccess is dependant on JavaThread, which felt wrong. Obvious solution was to add a wrapper in JavaThread that calls down to OrderAccess.
Alternatively, I could switch the name of instruction_fence back to cross_modify_fence, and then think of a name for the added function in JavaThread. Alternatively alternatively, they could both be call cross_modify_fence.
If the verification test patch was removed from the set, then most of the first patch wouldn't be needed either.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Oct 2, 2020

Mailing list message from Andrew Haley on hotspot-dev:

On 01/10/2020 03:18, David Holmes wrote:

cross_modify_fence() is now a member of JavaThread and just calls
instruction_fence. This function will be extended in Patch 3.
I don't agree with the change here. The cross_modify_fence() is not
related to thread API imo, it belongs in OrderAccess.

I agree with you, David. cross_modify_fence() should not be moved.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Oct 2, 2020

Mailing list message from Andrew Haley on hotspot-dev:

On 02/10/2020 09:08, Alan Hayward wrote:

I have no strong feeling on the names of these functions.
The reason for moving it was that in the third part (the verification testing) it needs JavaThread.

Right, but you can get the JavaThread efficiently any time you want,
so you don't need to pass it to cross_modify_fence().

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Oct 2, 2020

Mailing list message from Andrew Haley on hotspot-dev:

On 02/10/2020 09:08, Alan Hayward wrote:

I have no strong feeling on the names of these functions.
The reason for moving it was that in the third part (the verification testing) it needs JavaThread.

Right, but you can get the JavaThread efficiently any time you want,
so you don't need to pass it to cross_modify_fence().

Oh, ok, didn't spot that.
This would result in code in OrderAccess.cpp calling a function in JavaThread.
It feels that OrderAccess should be much lower level than JavaThread. But, that might be ok.

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Oct 5, 2020

Patch updated.

  • cross_modify_fence now calls cross_modify_fence_impl as suggested.

  • ISBs in the JNI calls have been removed. This means that it is currently unsafe to merge until #296 has been merged.

Loading

@robehn
Copy link
Contributor

@robehn robehn commented Nov 11, 2020

Patch merged to master - so that it's on top of pchilano's patch.
Already tested that both the patches work fine together (did a complete run of all our tests with VerifyCrossModifyFence set to true).
robehn has reviewed this patch, but I think I need a second review too (?)

I would recommend someone that knows aarch64 better than me, as I said mostly looked at shared parts.

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 11, 2020

Mailing list message from Andrew Haley on hotspot-dev:

On 11/11/2020 14:32, Alan Hayward wrote:

Quick ping for this.
AIUI, it needs a second reviewer to vote ok.
There aren?t any outstanding issues (that I?m aware of).

Please look at the issues still marked Pending.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Loading

a74nh added 2 commits Nov 12, 2020
Change-Id: I73323c90765bf8524f12f680abde7e7e5b3bb898
CustomizedGitHooks: yes
Change-Id: Ibbe45650d351d8cff6fbf7a7c8baf30afbdac17c
CustomizedGitHooks: yes
@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 12, 2020

Mailing list message from Alan Hayward on hotspot-dev:

On 11 Nov 2020, at 16:44, Andrew Haley <aph at redhat.com> wrote:

On 11/11/2020 14:32, Alan Hayward wrote:

Quick ping for this.
AIUI, it needs a second reviewer to vote ok.
There aren?t any outstanding issues (that I?m aware of).

Please look at the issues still marked Pending.

I?m probably missing something obvious, but I don?t see anything
marked on the gitHub page, or see anything outstanding in the
comments. Unless you just mean missing having a sponsor?

Spotted the comments in the patch were slightly out of date, so
I?ve updated those in the meantime (plus merged to head).

Alan.
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Loading

Copy link
Contributor

@theRealAph theRealAph left a comment

Still pending.

Loading

}
#endif
}

Copy link
Contributor

@theRealAph theRealAph Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless VerifyCrossModifyFence is turned on in debug builds it will almost never be used. Please turn this on by default in AArch64 debug builds.

Loading

Copy link
Contributor

@theRealAph theRealAph Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please...

Loading

Copy link
Contributor Author

@a74nh a74nh Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha - Looks like your comments hadn't been made public until now.

The problem is it massively slows down a run. A tier1 test run for fastdebug went from 1h 32m 58s to
3h 43m 47s. I didn't think that would be acceptable.

Loading

@@ -1410,7 +1409,8 @@ address TemplateInterpreterGenerator::generate_native_entry(bool synchronized) {
__ mov(c_rarg0, rthread);
__ mov(rscratch2, CAST_FROM_FN_PTR(address, JavaThread::check_special_condition_for_native_trans));
__ blr(rscratch2);
__ maybe_isb();
// An instruction sync is required here after the call into the VM. However,
// that will have been caught in the VM by a cross_modify_fence call.
Copy link
Contributor

@theRealAph theRealAph Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this wording is confusing; I had to read it several times.

Would not something like
'''
// When we return from the VM, the instruction stream may have
// been modified. Previously we emitted an ISB at this point, but
// it's now unnecessary because the VM itself calls cross_modify_fence()
'''

be better?

Loading

Copy link
Contributor

@theRealAph theRealAph Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still pending.

Loading

Copy link
Contributor Author

@a74nh a74nh Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to avoid mentioning code that no longer exists. (Maybe it's best to just drop the comment?)

How about:

// When we return from the VM, the instruction stream may have
// been modified. Therefore needs an isb is required. The VM will
// have already done this by calling cross_modify_fence().

Loading

// barrier for the instruction code cache. It guarantees that any memory access
// to the instruction code preceding the fence is not reordered w.r.t. any
// memory accesses to instruction code subsequent to the fence in program order.
// It should be used in conjunction with safepointing to ensure that changes
Copy link
Contributor

@theRealAph theRealAph Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rather misleading: the AArch64 needs the ISB for the instruction pipeline rather than the cache, which is invalidated by the IC IVAU broadcast. I suspect other processors work in the same way.
The language in the AArch64 spec is better, IMO:
"ensures that all instructions that come after the ISB instruction in program order are fetched from
the cache or memory after the ISB instruction has completed"

Loading

Copy link
Contributor Author

@a74nh a74nh Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This better?

// Finally, we define an "instruction_fence" operation, which ensures that all
// instructions that come after the ISB instruction in program order are fetched
// from the cache or memory after the ISB instruction has completed
// It should be used in conjunction with safepointing to ensure that changes
// to the instruction stream are seen on exit from a safepoint. Namely:
.....etc

Loading

Copy link
Contributor

@theRealAph theRealAph Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// Finally, we define an "instruction_fence" operation, which ensures that all
// instructions that come after the fence in program order are fetched
// from the cache or memory after the fence has completed

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 13, 2020

Mailing list message from Andrew Haley on hotspot-dev:

On 11/12/20 5:28 PM, Alan Hayward wrote:

On Fri, 23 Oct 2020 10:17:09 GMT, Andrew Haley <aph at openjdk.org> wrote:

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5317:

5315: #endif
5316: }
5317:

Unless VerifyCrossModifyFence is turned on in debug builds it will almost never be used. Please turn this on by default in AArch64 debug builds.

Please...

Aha - Looks like your comments hadn't been made public until now.

"Hadn't been made public?" There's a way to comment without the owner
of the PR being able to see the comments?

The problem is it massively slows down a run. A tier1 test run for fastdebug went from 1h 32m 58s to
3h 43m 47s. I didn't think that would be acceptable.

But why is it so expensive? All it does is mark the threads at a safepoint
and later check the mark at safepoints. It's not as if it's doing anything
much, but you're telling me it is more expensive than everything else put
together.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Loading

@mlbridge
Copy link

@mlbridge mlbridge bot commented Nov 13, 2020

Mailing list message from Andrew Haley on hotspot-dev:

On 11/12/20 5:28 PM, Alan Hayward wrote:

On Wed, 11 Nov 2020 15:04:32 GMT, Andrew Haley <aph at openjdk.org> wrote:

src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 1413:

1411: __ blr(rscratch2);
1412: // An instruction sync is required here after the call into the VM. However,
1413: // that will have been caught in the VM by a cross_modify_fence call.

I think this wording is confusing; I had to read it several times.

Would not something like
'''
// When we return from the VM, the instruction stream may have
// been modified. Previously we emitted an ISB at this point, but
// it's now unnecessary because the VM itself calls cross_modify_fence()
'''

be better?

This is still pending.

I wanted to avoid mentioning code that no longer exists. (Maybe it's best to just drop the comment?)

The comment only makes sense in the context of the code that was there
before.

How about:

// When we return from the VM, the instruction stream may have
// been modified. Therefore needs an isb is required. The VM will
// have already done this by calling cross_modify_fence().

This is self contradicting: firstly you say and ISB is required, then
you say why it isn't.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Nov 18, 2020

The problem is it massively slows down a run. A tier1 test run for fastdebug went from 1h 32m 58s to
3h 43m 47s. I didn't think that would be acceptable.

But why is it so expensive? All it does is mark the threads at a safepoint
and later check the mark at safepoints. It's not as if it's doing anything
much, but you're telling me it is more expensive than everything else put
together.

Ok, please ignore my comment about slowdowns.
Turns out that our testing infrastructure had been scheduling onto different boxes. Once forced onto a single machine, I see no real difference in a complete tier 1 run with and without VerifyCrossModifyFence. (With the slower boxes taking ~4 hours and a quicker one taking ~1 hours)

Patch updated to enable VerifyCrossModifyFence always for aarch64 debug.

I wanted to avoid mentioning code that no longer exists. (Maybe it's best to just drop the comment?)

The comment only makes sense in the context of the code that was there
before.

How about:
// When we return from the VM, the instruction stream may have
// been modified. Therefore needs an isb is required. The VM will
// have already done this by calling cross_modify_fence().

This is self contradicting: firstly you say and ISB is required, then
you say why it isn't.

Agreed. The comment only makes sense when it refers to code that used to exist. And I really dislike comments that do that. So, I've dropped them completely.

// Finally, we define an "instruction_fence" operation, which ensures that all
// instructions that come after the fence in program order are fetched
// from the cache or memory after the fence has completed

Done.

Loading

Copy link
Contributor

@mo-beck mo-beck left a comment

Could you please incorporate the following orderAccess changes for windows_aarch64.hpp in your PR?: https://gist.github.com/mo-beck/2dc66d068741030b4a422b57607bea8e#file-orderaccess_windows_aarch64_hpp-diff

Loading

Change-Id: I8701ea60d2823d16666cb43cb9d0935d92b81e52
CustomizedGitHooks: yes
@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Nov 18, 2020

Could you please incorporate the following orderAccess changes for windows_aarch64.hpp in your PR?: https://gist.github.com/mo-beck/2dc66d068741030b4a422b57607bea8e#file-orderaccess_windows_aarch64_hpp-diff

Happy to do this. The file has been added since I started the patch, so that's fine. But as a disclaimer for myself, I've no way of testing it, and I'm going to have to assume it just works for Windows AArch64.

Loading

@a74nh
Copy link
Contributor Author

@a74nh a74nh commented Nov 19, 2020

/integrate

Loading

@openjdk openjdk bot added the sponsor label Nov 19, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 19, 2020

@a74nh
Your change (at version 655497a) is now ready to be sponsored by a Committer.

Loading

@nick-arm
Copy link
Member

@nick-arm nick-arm commented Nov 19, 2020

/sponsor

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 19, 2020

@nick-arm @a74nh Since your change was applied there have been 24 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

Pushed as commit d183fc7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Loading

openjdk-notifier bot referenced this issue Nov 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
8 participants