New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8270947: AArch64: C1: use zero_words to initialize all objects #4919
Conversation
|
@theRealAph The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
I hit the following assertion failure when I ran tier1 with
|
Maybe we need to reserve some extra space for the trampoline stubs in C1's |
Mailing list message from Andrew Haley on hotspot-dev: On 7/29/21 9:56 AM, Nick Gasson wrote:
Oh, that really is weird. I'll dig some more. -- |
Mailing list message from Andrew Haley on hotspot-dev: On 7/29/21 10:19 AM, Andrew Haley wrote:
I get Running test 'jtreg:test/hotspot/jtreg/compiler/loopopts/TestDeepGraphVerifyIterativeGVN.java' perhaps because that is a C2-only test. -- |
Mailing list message from Andrew Haley on hotspot-dev: On 7/29/21 10:36 AM, Andrew Haley wrote:
Ah no, it's a debug-only test. -- |
Mailing list message from Nick Gasson on hotspot-dev: On 29/07/21 17:36 pm, Andrew Haley wrote:
I did: make exploded-test TEST="compiler/loopopts/TestDeepGraphVerifyIterativeGVN.java" \ On a fastdebug build. -- |
Mailing list message from Andrew Haley on hotspot-dev: On 7/29/21 10:58 AM, Nick Gasson wrote:
Fascinating. C1 doesn't allocate any space for trampoline stubs, except for explicit call LIR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me and I've tested tier1 with -XX:TieredStopAtLevel=1. Although you probably ought to update the copyright year in c1_MacroAssembler_aarch64.hpp.
@theRealAph This change now passes all automated pre-integration checks. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 59 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me
My initial reaction to that extra last commit was that anything other than a compiler thread calling MacroAssembler::zero_words implies "Something is rotten in the state of Danmark" and the only end game is a stage strewn with corpses. Of course, zero_words gets called from the stub generator. However, I guess the stub complete check avoids any problems there. Were you thinking of some other potential use for it? e.g. lazy intrinsics? Anyway, the check certainly does no harm so it is all still good. |
Mailing list message from Andrew Haley on hotspot-dev: On 7/30/21 4:55 PM, Andrew Dinn wrote:
I *think* it's probably unnecessary. However, we do lazily generate IC It is certainly the case that you can't call ciEnv::current() unless static CompilerThread* current() { And therefore ciEnv::current()->task() is Undefined Behaviour if not on a (This is pretty awful, and I should submit a patch to perhaps add So it's fairly uncontroversial, I would have thought, to guard -- |
/integrate |
Going to push as commit 6c68ce2.
Your commit was automatically rebased without conflicts. |
@theRealAph Pushed as commit 6c68ce2. |
C1 has its own code generators for zeroing words. We should use the same logic for C1 and C2, which should give us better C1 performance and result in less code to maintain.
This is one of those patches that's a great joy to write, because it consists mainly of deletions. The code I've added is mostly adapters to allow the C1 code to use the memory-zeroing logic written originally for C2. This means we have less code, but also that VM configuration options (e.g.
BlockZeroingLowLimit
) work with C1 and C2 in th esame way.Measuring the performance of memory allocation is quite tricky, so I've written a JMH test case that measures the raw allocation rate of the JVM for various object sizes. This is inevitably rather noisy because it combines the effects of both the allocation code and other GC-related pauses. Nonetheless, it's a useful sanity check.
The performance differences between old and one are mostly in the noise, but with large allocations the advantage of
DC ZVA
becomes apparent:old:
new:
Full test results, Graviton 2 (i.e. Neoverse N1). Units are megabytes per second,
objects sizes are in bytes:
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/4919/head:pull/4919
$ git checkout pull/4919
Update a local copy of the PR:
$ git checkout pull/4919
$ git pull https://git.openjdk.java.net/jdk pull/4919/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 4919
View PR using the GUI difftool:
$ git pr show -t 4919
Using diff file
Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/4919.diff