Skip to content

8296545: C2 Blackholes should allow load optimizations #11041

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

shipilev
Copy link
Member

@shipilev shipilev commented Nov 8, 2022

If you look at generated code for the JMH benchmark like:

public class ArrayRead {
    @Param({"1", "100", "10000", "1000000"})
    int size;

    int[] is;

    @Setup
    public void setup() {
        is = new int[size];
        for (int c = 0; c < size; c++) {
            is[c] = c;
        }
    }

    @Benchmark
    public void test(Blackhole bh) {
        for (int i = 0; i < is.length; i++) {
            bh.consume(is[i]);
        }
    }
}

...then you would notice that the loop always re-reads is, is.length, does the range check, etc. -- all the things we would otherwise expect to be hoisted out of the loop.

This is because C2 blackholes are modeled as membars that pinch both control and memory slices (like you would expect from the opaque non-inlined call), therefore every iteration has to re-read the referenced memory contents and recompute everything dependent on those loads. This behavior is not new -- the old, non-compiler blackholes were doing the same thing, accidentally -- but it was drowned in blackhole overheads. Now, these effects are clearly visible.

We can try to do this a bit better: allow load optimizations to work across the blackholes, leaving only "prevent dead code elimination" part, as minimally required by blackhole semantics.

Motivational improvements on the test above:

Benchmark        (size)  Mode  Cnt        Score   Error     Units

# Before, full Java blackholes
ArrayRead.test        1  avgt    9        5.422 ±    0.023  ns/op
ArrayRead.test      100  avgt    9      460.619 ±    0.421  ns/op
ArrayRead.test    10000  avgt    9    44697.909 ± 1964.787  ns/op
ArrayRead.test  1000000  avgt    9  4332723.304 ± 2791.324  ns/op

# Before, compiler blackholes
ArrayRead.test        1  avgt    9        1.791 ±    0.007  ns/op
ArrayRead.test      100  avgt    9      114.103 ±    1.677  ns/op
ArrayRead.test    10000  avgt    9     8528.544 ±   52.010  ns/op
ArrayRead.test  1000000  avgt    9  1005139.070 ± 2883.011  ns/op

# After, compiler blackholes
ArrayRead.test        1  avgt    9        1.686 ±    0.006  ns/op  ; ~1.1x better
ArrayRead.test      100  avgt    9       16.249 ±    0.019  ns/op  ; ~7.0x better
ArrayRead.test    10000  avgt    9     1375.265 ±    2.420  ns/op  ; ~6.2x better
ArrayRead.test  1000000  avgt    9   136862.574 ± 1057.100  ns/op  ; ~7.3x better

-prof perfasm shows the reason for these improvements clearly:

Before:

          ↗  0x00007f0b54498360:   mov    0xc(%r12,%r10,8),%edx    ; range check 1
   7.97%  │  0x00007f0b54498365:   cmp    %edx,%r11d               
   1.27%  │  0x00007f0b54498368:   jae    0x00007f0b5449838f
          │  0x00007f0b5449836a:   shl    $0x3,%r10
   0.03%  │  0x00007f0b5449836e:   mov    0x10(%r10,%r11,4),%r10d  ; get "is[i]"
   7.76%  │  0x00007f0b54498373:   mov    0x10(%r9),%r10d          ; restore "is"
   0.24%  │  0x00007f0b54498377:   mov    0x3c0(%r15),%rdx         ; safepoint poll, part 1
  17.48%  │  0x00007f0b5449837e:   inc    %r11d                    ; i++
   0.17%  │  0x00007f0b54498381:   test   %eax,(%rdx)              ; safepoint poll, part 2  
  53.26%  │  0x00007f0b54498383:   mov    0xc(%r12,%r10,8),%edx    ; loop index check
   4.84%  │  0x00007f0b54498388:   cmp    %edx,%r11d
   0.31%  ╰  0x00007f0b5449838b:   jl     0x00007f0b54498360          

After:

          ↗  0x00007fa06c49a8b0:   mov    0x2c(%rbp,%r10,4),%r9d   ; stride read
  19.66%  │  0x00007fa06c49a8b5:   mov    0x28(%rbp,%r10,4),%edx
   0.14%  │  0x00007fa06c49a8ba:   mov    0x10(%rbp,%r10,4),%ebx
  22.09%  │  0x00007fa06c49a8bf:   mov    0x14(%rbp,%r10,4),%ebx
   0.21%  │  0x00007fa06c49a8c4:   mov    0x18(%rbp,%r10,4),%ebx
  20.19%  │  0x00007fa06c49a8c9:   mov    0x1c(%rbp,%r10,4),%ebx
   0.04%  │  0x00007fa06c49a8ce:   mov    0x20(%rbp,%r10,4),%ebx
  24.02%  │  0x00007fa06c49a8d3:   mov    0x24(%rbp,%r10,4),%ebx
   0.21%  │  0x00007fa06c49a8d8:   add    $0x8,%r10d               ; i += 8
          │  0x00007fa06c49a8dc:   cmp    %esi,%r10d
   0.07%  ╰  0x00007fa06c49a8df:   jl     0x00007fa06c49a8b0       

Additional testing:

  • Eyeballing JMH Samples -prof perfasm
  • Linux x86_64 fastdebug, compiler/blackhole, compiler/c2/irTests/blackhole
  • Linux x86_64 fastdebug, JDK benchmark corpus

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8296545: C2 Blackholes should allow load optimizations

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11041/head:pull/11041
$ git checkout pull/11041

Update a local copy of the PR:
$ git checkout pull/11041
$ git pull https://git.openjdk.org/jdk pull/11041/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11041

View PR using the GUI difftool:
$ git pr show -t 11041

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11041.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 8, 2022

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Nov 8, 2022

@shipilev The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Nov 8, 2022
@shipilev shipilev marked this pull request as ready for review November 8, 2022 22:06
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 8, 2022
@mlbridge
Copy link

mlbridge bot commented Nov 8, 2022

@adinn
Copy link
Contributor

adinn commented Nov 8, 2022

Ooh, nice work. Unfortunately the VerifyGraphEdges test seems to be failing.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
It is just copy of insert_mem_bar() without setting memory to BH.

@openjdk
Copy link

openjdk bot commented Nov 8, 2022

@shipilev This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8296545: C2 Blackholes should allow load optimizations

Reviewed-by: kvn, vlivanov

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been no new commits pushed to the master branch. If another commit should be pushed before you perform the /integrate command, your PR will be automatically rebased. If you prefer to avoid any potential automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Nov 8, 2022
@vnkozlov
Copy link
Contributor

vnkozlov commented Nov 9, 2022

Ooh, nice work. Unfortunately the VerifyGraphEdges test seems to be failing.

https://bugs.openjdk.org/browse/JDK-8295867

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for me with one comment.

Copy link
Contributor

@iwanowww iwanowww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

(hs-tier1 - hs-tier2 testing passed.)

@shipilev
Copy link
Member Author

Thanks for reviews. I'll sit on it a bit, while doing more extensive benchmarking with these new blackholes.

@shipilev
Copy link
Member Author

shipilev commented Dec 5, 2022

Experiments look fine. I am integrating to get this to JDK 20.

/integrate

@openjdk
Copy link

openjdk bot commented Dec 5, 2022

Going to push as commit eab0ada.
Since your change was applied there have been 6 commits pushed to the master branch:

  • dea2161: 8297959: Provide better descriptions for some Operating System JFR events
  • d523d9d: 8297864: Dead code elimination
  • 777fb52: 8297974: ClassCastException in com.sun.tools.javac.comp.AttrRecover.doRecovery
  • 17666fb: 8297794: Deprecate JMX Management Applets for Removal
  • 619b68c: 8294540: Remove Opaque2Node: it is broken and triggers assert
  • 82561de: 8296384: [TESTBUG] sun/security/provider/SecureRandom/AbstractDrbg/SpecTest.java intermittently timeout

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 5, 2022
@openjdk openjdk bot closed this Dec 5, 2022
@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Dec 5, 2022
@openjdk openjdk bot removed the rfr Pull request is ready for review label Dec 5, 2022
@openjdk
Copy link

openjdk bot commented Dec 5, 2022

@shipilev Pushed as commit eab0ada.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@shipilev shipilev deleted the JDK-8296545-blackhole-effects branch January 4, 2023 10:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
Development

Successfully merging this pull request may close these issues.

4 participants