Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8283566: G1: Improve G1BarrierSet::enqueue performance #7921

Closed
wants to merge 7 commits into from

Conversation

shipilev
Copy link
Contributor

@shipilev shipilev commented Mar 23, 2022

While looking at startup/warmup benchmarks for VarHandles, I noticed that G1BarrierSet::enqueue is not inlined and quite hot. Its uses also load the oop first, and then check if queues are active. This could be improved a bit. This only matters for the barrier paths that VM takes, which is the case for java.lang.invoke VM infra.

Not sure about enqueue_oop name, I did it to avoid accidental overload clash with enqueue(T* dst). Open for suggestions.

On a simple test:

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class VHWarmup {

    static final int SIZE = 1_000_000;

    private VarHandle[] vhs;
    private int x;

    public static void main(String... args) throws Exception {
        new VHWarmup().run();
    }

    public void run() throws Exception {
        vhs = new VarHandle[SIZE];
        for (int c = 0; c < SIZE; c++) {
            VarHandle vh = MethodHandles.lookup().findVarHandle(VHWarmup.class, "x", int.class);
            vh.get(this);
            vhs[c] = vh;
        }
    }
}

Baseline:

$ perf stat -r 10 build/linux-x86_64-server-release/images/jdk/bin/java -Xms128m -Xmx128m VHWarmup > /dev/null

 Performance counter stats for 'build/linux-x86_64-server-release/images/jdk/bin/java -Xms128m -Xmx128m VHWarmup' (10 runs):

          1,392.13 msec task-clock                #    1.183 CPUs utilized            ( +-  0.45% )
               887      context-switches          #    0.637 K/sec                    ( +-  1.05% )
                 6      cpu-migrations            #    0.004 K/sec                    ( +- 14.72% )
            31,755      page-faults               #    0.023 M/sec                    ( +-  0.17% )
     4,708,582,477      cycles                    #    3.382 GHz                      ( +-  0.44% )  (50.46%)
       168,230,638      stalled-cycles-frontend   #    3.57% frontend cycles idle     ( +-  2.75% )  (50.35%)
       790,033,580      stalled-cycles-backend    #   16.78% backend cycles idle      ( +-  1.82% )  (50.03%)
     9,094,235,238      instructions              #    1.93  insn per cycle         
                                                  #    0.09  stalled cycles per insn  ( +-  0.22% )  (49.54%)
     1,716,367,002      branches                  # 1232.909 M/sec                    ( +-  0.38% )  (49.65%)
         3,908,772      branch-misses             #    0.23% of all branches          ( +-  2.89% )  (49.97%)

           1.17642 +- 0.00616 seconds time elapsed  ( +-  0.52% )

Patched:

$ perf stat -r 10 build/linux-x86_64-server-release/images/jdk/bin/java -Xms128m -Xmx128m VHWarmup > /dev/null

 Performance counter stats for 'build/linux-x86_64-server-release/images/jdk/bin/java -Xms128m -Xmx128m VHWarmup' (10 runs):

          1,288.69 msec task-clock                #    1.203 CPUs utilized            ( +-  0.73% )
               860      context-switches          #    0.667 K/sec                    ( +-  1.38% )
                 7      cpu-migrations            #    0.006 K/sec                    ( +- 11.96% )
            31,830      page-faults               #    0.025 M/sec                    ( +-  0.13% )
     4,331,508,315      cycles                    #    3.361 GHz                      ( +-  0.72% )  (50.01%)
       151,638,797      stalled-cycles-frontend   #    3.50% frontend cycles idle     ( +-  2.94% )  (49.56%)
       604,797,159      stalled-cycles-backend    #   13.96% backend cycles idle      ( +-  4.17% )  (49.70%)
     8,517,580,297      instructions              #    1.97  insn per cycle         
                                                  #    0.07  stalled cycles per insn  ( +-  0.50% )  (49.99%)
     1,600,179,319      branches                  # 1241.714 M/sec                    ( +-  0.46% )  (50.44%)
         4,129,679      branch-misses             #    0.26% of all branches          ( +-  4.39% )  (50.30%)

           1.07125 +- 0.00822 seconds time elapsed  ( +-  0.77% )

Additional testing:

  • Linux x86_64 fastdebug tier1
  • Linux x86_64 fastdebug tier2
  • SPECjvm2008 shows no regressions

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8283566: G1: Improve G1BarrierSet::enqueue performance

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7921/head:pull/7921
$ git checkout pull/7921

Update a local copy of the PR:
$ git checkout pull/7921
$ git pull https://git.openjdk.java.net/jdk pull/7921/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7921

View PR using the GUI difftool:
$ git pr show -t 7921

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7921.diff

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 23, 2022

👋 Welcome back shade! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Mar 23, 2022
@openjdk
Copy link

openjdk bot commented Mar 23, 2022

@shipilev The following label will be automatically applied to this pull request:

  • hotspot-gc

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-gc label Mar 23, 2022
@mlbridge
Copy link

mlbridge bot commented Mar 23, 2022

Webrevs

Copy link
Contributor

@tschatzl tschatzl left a comment

Maybe rename enqueue_oop to enqueue_not_null? This is a fairly common naming scheme for methods specialized that way.

Lgtm.

@openjdk
Copy link

openjdk bot commented Mar 24, 2022

@shipilev This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8283566: G1: Improve G1BarrierSet::enqueue performance

Reviewed-by: tschatzl, ayang

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 3 new commits pushed to the master branch:

  • 720e751: 8283937: riscv: RVC: Fix c_beqz to c_bnez
  • 51c05e8: 8282764: AArch64: compiler/vectorapi/reshape/TestVectorCastNeon.java failed with incorrect result
  • b82b009: 8283737: riscv: MacroAssembler::stop() should emit fixed-length instruction sequence

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Mar 24, 2022
@shipilev
Copy link
Contributor Author

shipilev commented Mar 24, 2022

Maybe rename enqueue_oop to enqueue_not_null? This is a fairly common naming scheme for methods specialized that way.

Thing is, it is not about nullity. It is about whether we have already loaded the oop or not. Maybe enqueue_loaded?

@tschatzl
Copy link
Contributor

tschatzl commented Mar 25, 2022

Thing is, it is not about nullity. It is about whether we have already loaded the oop or not. Maybe enqueue_loaded?

Okay.

@shipilev
Copy link
Contributor Author

shipilev commented Mar 28, 2022

Thing is, it is not about nullity. It is about whether we have already loaded the oop or not. Maybe enqueue_loaded?

Okay.

After some back and forth, I settled on enqueue_loc for new method. This matches some of the related methods we have in Shenandoah. This also simplifies the patch.

@shipilev
Copy link
Contributor Author

shipilev commented Mar 30, 2022

Thing is, it is not about nullity. It is about whether we have already loaded the oop or not. Maybe enqueue_loaded?

Okay.

After some back and forth, I settled on enqueue_loc for new method. This matches some of the related methods we have in Shenandoah. This also simplifies the patch.

Any opinions about this version?

@shipilev
Copy link
Contributor Author

shipilev commented Mar 31, 2022

Thanks for reviews!

/integrate

@openjdk
Copy link

openjdk bot commented Mar 31, 2022

Going to push as commit 6ebf845.
Since your change was applied there have been 38 commits pushed to the master branch:

  • d276da5: 8281469: aarch64: Improve interpreter stack banging
  • a41550b: 8283842: TestZoneTextPrinterParser.test_roundTripAtOverlap fails: DateTimeParseException
  • 207b099: 8283890: Changes in CFG file format break C1Visualizer
  • 49fcc7a: 8283013: Simplify Arguments::parse_argument()
  • 73cb922: 8284026: Use unmodifiable collections where practical
  • 77a205a: 8284090: com/sun/security/auth/module/AllPlatforms.java fails to compile
  • 64025b0: 8283901: Introduce "make doctor" to diagnose build environment problems
  • 5740a3b: 8280193: summary javadoc for java.awt.GraphicsEnvironment#preferProportionalFonts broken
  • a11cc97: 8283997: Unused argument in GraphKit::builtin_throw
  • 067b258: 8224977: [macos] On AquaLookAndFeel, Iconified JInternalFrame does not restore when Control + F5 is used.
  • ... and 28 more: https://git.openjdk.java.net/jdk/compare/edb42d7b0aabd07ef9f46cb1f0e862f7bb912378...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated label Mar 31, 2022
@openjdk openjdk bot closed this Mar 31, 2022
@openjdk openjdk bot removed ready rfr labels Mar 31, 2022
@openjdk
Copy link

openjdk bot commented Mar 31, 2022

@shipilev Pushed as commit 6ebf845.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@shipilev shipilev deleted the JDK-8283566-g1-enqueue branch Apr 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-gc integrated
3 participants