@eastig
Member

@eastig eastig commented Nov 14, 2025

Arm Neoverse N1 erratum 1542419: "The core might fetch a stale instruction from memory which violates the ordering of instruction fetches". It is fixed in Neoverse N1 r4p1.

Neoverse-N1 implementations mitigate erratum 1542419 with a workaround:

  • Disable coherent icache.
  • Trap IC IVAU instructions.
  • Execute:
    • tlbi vae3is, xzr
    • dsb sy

tlbi vae3is, xzr invalidates translations for all address spaces (global for address). It waits for all memory accesses using in-scope old translation information to complete before it is considered complete.

As this workaround has significant overhead, Arm Neoverse N1 (MP050) Software Developer Errata Notice version 29.0 suggests:

"Since one TLB inner-shareable invalidation is enough to avoid this erratum, the number of injected TLB invalidations should be minimized in the trap handler to mitigate the performance impact due to this workaround."

This PR introduces a mechanism to defer instruction cache (ICache) invalidation for AArch64 to address the Arm Neoverse N1 erratum 1542419, which causes significant performance overhead if ICache invalidation is performed too frequently. The implementation includes detection of affected Neoverse N1 CPUs and automatic enabling of the workaround for relevant Neoverse N1 revisions.
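The revision check described above can be sketched as follows. This is a minimal illustration, not the code in this PR: it assumes the raw MIDR_EL1 value is available (a JVM may obtain the same fields differently, for example from /proc/cpuinfo), and the names `CpuId`, `decode_midr`, `make_n1_midr` and `needs_erratum_1542419_workaround` are hypothetical. The field layout (implementer in bits [31:24], variant in [23:20], part number in [15:4], revision in [3:0]) and the Neoverse N1 part number 0xD0C follow Arm's documentation; rNpM maps to variant N, revision M.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only.
const uint32_t kImplementerArm = 0x41;   // 'A'
const uint32_t kPartNeoverseN1 = 0xD0C;

struct CpuId {
  uint32_t implementer;
  uint32_t variant;    // the N in rNpM
  uint32_t part;
  uint32_t revision;   // the M in rNpM
};

// Split a MIDR_EL1 value into the fields used for the check.
inline CpuId decode_midr(uint32_t midr) {
  CpuId id;
  id.implementer = (midr >> 24) & 0xFF;
  id.variant     = (midr >> 20) & 0xF;
  id.part        = (midr >> 4) & 0xFFF;
  id.revision    = midr & 0xF;
  return id;
}

// Build a Neoverse N1 MIDR value for a given rNpM (architecture field = 0xF).
inline uint32_t make_n1_midr(uint32_t variant, uint32_t revision) {
  assert(variant <= 0xF && revision <= 0xF);
  return (kImplementerArm << 24) | (variant << 20) | (0xFu << 16) |
         (kPartNeoverseN1 << 4) | revision;
}

// True for Neoverse N1 revisions before r4p1, where the erratum is not fixed.
inline bool needs_erratum_1542419_workaround(uint32_t midr) {
  const CpuId id = decode_midr(midr);
  if (id.implementer != kImplementerArm || id.part != kPartNeoverseN1) {
    return false;
  }
  return id.variant < 4 || (id.variant == 4 && id.revision < 1);
}
```

With these helpers, `make_n1_midr(3, 1)` (r3p1, the Graviton 2 revision benchmarked below) is reported as affected, while `make_n1_midr(4, 1)` is not.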

Changes include:

  • Added a new diagnostic JVM flag NeoverseN1Errata1542419 to enable or disable the workaround for the erratum. The flag is automatically enabled for Neoverse N1 CPUs prior to r4p1, as detected during VM initialization.
  • Introduced the ICacheInvalidationContext class to manage deferred ICache invalidation, with platform-specific logic for AArch64. This context is used to batch ICache invalidations, reducing performance impact. As the address for icache invalidation is not relevant, we use the nmethod's code start address.
  • Provided a default (no-op) implementation for ICacheInvalidationContext on platforms where the workaround is not needed, ensuring portability and minimal impact on other architectures.
  • Modified barrier patching and relocation logic (ZBarrierSetAssembler, ZNMethod, RelocIterator, and related code) to accept a defer_icache_invalidation parameter, allowing ICache invalidation to be deferred and later performed in bulk.
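The batching idea behind these changes can be illustrated with a small scoped (RAII) sketch. It is not the PR's ICacheInvalidationContext: the names `DeferredInvalidationScope`, `invalidate_or_defer`, `patch_nmethod` and the counter are hypothetical, and the real invalidation is replaced by a counter so that the example runs on any platform. While a deferring scope is active, per-site invalidations are only recorded; a single invalidation at the code start address is issued when the scope ends.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the real icache invalidation: just counts the calls.
static std::size_t g_invalidations = 0;
static void icache_invalidate(const void* /*addr*/) { ++g_invalidations; }

// Hypothetical scope object that batches invalidations for one nmethod.
class DeferredInvalidationScope {
 public:
  DeferredInvalidationScope(const void* code_begin, bool defer)
      : _code_begin(code_begin), _defer(defer), _pending(false), _prev(_current) {
    assert(code_begin != nullptr);
    _current = this;
  }
  ~DeferredInvalidationScope() {
    _current = _prev;
    // One invalidation covers every patch made while the scope was active.
    if (_pending) icache_invalidate(_code_begin);
  }
  DeferredInvalidationScope(const DeferredInvalidationScope&) = delete;
  DeferredInvalidationScope& operator=(const DeferredInvalidationScope&) = delete;

  // Called by patching code after each modified instruction.
  static void invalidate_or_defer(const void* addr) {
    if (_current != nullptr && _current->_defer) {
      _current->_pending = true;
    } else {
      icache_invalidate(addr);
    }
  }

 private:
  const void* _code_begin;
  bool _defer;
  bool _pending;
  DeferredInvalidationScope* _prev;
  static thread_local DeferredInvalidationScope* _current;
};

thread_local DeferredInvalidationScope* DeferredInvalidationScope::_current = nullptr;

// Patches `sites` 4-byte instructions and returns how many invalidations ran.
inline std::size_t patch_nmethod(const char* code_begin, std::size_t sites, bool defer) {
  const std::size_t before = g_invalidations;
  {
    DeferredInvalidationScope scope(code_begin, defer);
    for (std::size_t i = 0; i < sites; i++) {
      // ... an instruction at code_begin + 4 * i would be patched here ...
      DeferredInvalidationScope::invalidate_or_defer(code_begin + 4 * i);
    }
  }
  return g_invalidations - before;
}
```

In this sketch, patching 8 sites without deferral performs 8 invalidations, with deferral it performs 1, and a scope in which nothing was patched performs none.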

Benchmarking results: Neoverse-N1 r3p1 (Graviton 2)

  • Baseline
$ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:-NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGCThreads=1 -jar benchmarks.jar org.openjdk.bench.vm.gc.GCPatchingNmethodCost

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt     Score     Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3    73.937 ±  17.764  ms/op
GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   648.331 ±  85.773  ms/op
GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  1221.186 ±  72.401  ms/op
GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  2336.644 ± 446.816  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3    77.495 ±  11.963  ms/op
GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   662.447 ± 231.244  ms/op
GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  1217.174 ± 232.325  ms/op
GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  2339.458 ± 271.820  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3     9.955 ±   1.649  ms/op
GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   163.623 ±  42.342  ms/op
GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   318.399 ±  87.674  ms/op
GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   618.169 ± 191.474  ms/op
  • Fix
$ taskset -c 0-3 java -Xbootclasspath/a:./wb.jar -XX:+UnlockDiagnosticVMOptions -XX:+NeoverseN1Errata1542419 -XX:+UseZGC -XX:ZYoungGCThreads=1 -XX:ZOldGCThreads=1 -jar benchmarks.jar org.openjdk.bench.vm.gc.GCPatchingNmethodCost

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   88.865 ± 19.299  ms/op
GCPatchingNmethodCost.fullGC                       2           5000  avgt    3  146.184 ± 11.531  ms/op
GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  186.429 ± 16.257  ms/op
GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  262.933 ± 13.071  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   90.572 ± 14.750  ms/op
GCPatchingNmethodCost.systemGC                     2           5000  avgt    3  148.335 ± 21.456  ms/op
GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  190.828 ± 12.268  ms/op
GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  265.768 ± 46.669  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   10.219 ±  0.877  ms/op
GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   19.035 ±  2.699  ms/op
GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   26.005 ±  2.179  ms/op
GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   42.322 ± 85.691  ms/op

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GCs and JIT performance (Enhancement - P2)

Reviewers

Contributors

  • Axel Boldt-Christmas <aboldtch@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/28328/head:pull/28328
$ git checkout pull/28328

Update a local copy of the PR:
$ git checkout pull/28328
$ git pull https://git.openjdk.org/jdk.git pull/28328/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 28328

View PR using the GUI difftool:
$ git pr show -t 28328

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/28328.diff

Using Webrev

Link to Webrev Comment

@eastig
Member Author

eastig commented Nov 14, 2025

@bridgekeeper

bridgekeeper bot commented Nov 14, 2025

👋 Welcome back eastigeevich! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Nov 14, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@eastig
Member Author

eastig commented Nov 14, 2025

/contributor add xmas92

@openjdk

openjdk bot commented Nov 14, 2025

@eastig xmas92 was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <duke@openjdk.org>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

@openjdk openjdk bot added the hotspot hotspot-dev@openjdk.org label Nov 14, 2025
@openjdk

openjdk bot commented Nov 14, 2025

@eastig The following label will be automatically applied to this pull request:

  • hotspot

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@eastig
Member Author

eastig commented Nov 14, 2025

/contributor add aboldtch

@openjdk

openjdk bot commented Nov 14, 2025

@eastig
Contributor Axel Boldt-Christmas <aboldtch@openjdk.org> successfully added.

@eastig
Member Author

eastig commented Nov 14, 2025

Hi @fisk @theRealAph @xmas92 @shipilev

I created this draft PR based on @xmas92's work: master...xmas92:jdk:deferred_icache_invalidation

Axel wrote about his implementation in JDK-8370947:

The implementation I linked is very aarch64 centric. I would like to create a bit nicer abstraction for this to allow easier adaption for other platforms.

I see that his changes touch other backends. I tried to minimize the changes and to avoid touching other backends.
I don't think the concept of deferred icache invalidation will be used anywhere except for the Neoverse-N1 erratum.

This PR does not cover all cases in ZGC at the moment. That can be done once we agree on the proper way to fix this.

I'd like to hear your opinion on which approach we should choose:

  • Abstraction of deferred icache invalidation supported in all backends.
  • Concrete implementation focused on Neoverse-N1.

@openjdk

openjdk bot commented Nov 20, 2025

@eastig this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout JDK-8370947
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Nov 20, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Nov 20, 2025
@eastig
Member Author

eastig commented Nov 20, 2025

/issue JDK-8370947

@eastig eastig marked this pull request as ready for review November 20, 2025 14:50
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 20, 2025
@openjdk

openjdk bot commented Nov 20, 2025

@eastig This issue is referenced in the PR title - it will now be updated.

@mlbridge

mlbridge bot commented Nov 20, 2025

@eastig
Member Author

eastig commented Nov 20, 2025

@fisk, @xmas92
I added a JMH microbenchmark and its results from Graviton 2.
I added deferred icache invalidation to ZGC where it's needed.

I found one place where I am not sure:

class ZHeapIteratorNMethodClosure : public NMethodClosure {
private:
  OopClosure* const        _cl;
  BarrierSetNMethod* const _bs_nm;

public:
  ZHeapIteratorNMethodClosure(OopClosure* cl)
    : _cl(cl),
      _bs_nm(BarrierSet::barrier_set()->barrier_set_nmethod()) {}

  virtual void do_nmethod(nmethod* nm) {
    // If ClassUnloading is turned off, all nmethods are considered strong,
    // not only those on the call stacks. The heap iteration might happen
    // before the concurrent processing of the code cache, make sure that
    // all nmethods have been processed before visiting the oops.
    _bs_nm->nmethod_entry_barrier(nm);

    ZNMethod::nmethod_oops_do(nm, _cl);
  }
};

For _bs_nm->nmethod_entry_barrier(nm) we use deferred icache invalidation. I'm not sure it is safe for ZNMethod::nmethod_oops_do(nm, _cl). It looks like it is safe because the code is called at a safepoint: ZHeap::object_iterate, ZHeap::object_and_field_iterate_for_verify and ZHeap::parallel_object_iterator.

Member

@xmas92 xmas92 left a comment

I think the implementation is fine. We can always extend it later if we find that other platforms or hardware needs this sort of treatment.

My knowledge and experience with arm hardware implementation specifics are rather lacking. So I cannot comment on the validity of the assertions made here w.r.t. only invalidating the first instruction in the nmethod etc.

Hopefully some of our resident arm experts can chime in.

/reviewers 2 reviewer

@openjdk

openjdk bot commented Nov 20, 2025

@xmas92
The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

@eastig
Member Author

eastig commented Nov 20, 2025

@fisk, @xmas92
The added microbenchmark shows interesting regressions when an nmethod has no accesses to object's fields:

Benchmark                              Score     Error  Units  Change
GCPatchingNmethodCost.fullGC:base     73.937 ± 17.764  ms/op
GCPatchingNmethodCost.systemGC:base   77.495 ± 11.963  ms/op
GCPatchingNmethodCost.youngGC:base     9.955 ±  1.649  ms/op
GCPatchingNmethodCost.fullGC:fix      88.865 ± 19.299  ms/op  +20.1%
GCPatchingNmethodCost.systemGC:fix    90.572 ± 14.750  ms/op  +16.9%
GCPatchingNmethodCost.youngGC:fix     10.219 ±  0.877  ms/op  +2.7%

// the performance impact due to this workaround."
//
// As the address for icache invalidation is not relevant, we use the nmethod's code start address.
ICache::invalidate_word(_nm->code_begin());
Contributor

Rather than call ICache::invalidate_word(), I believe we should explicitly execute the instructions in the workaround.

Member Author

We cannot execute tlbi vae3is here because it requires EL3. We are at EL0.

Member Author

Or do you mean IC IVAU?

Member Author

I replaced the call of ICache::invalidate_word() with:

    asm volatile("dsb ish       \n"
                 "ic  ivau, xzr \n"
                 "isb           \n"
                 : : : "memory");

The code executed in ICache::invalidate_word() when all checks are done:

dsb ish
ic  ivau
dsb ish
isb

I use xzr in ic ivau because the address does not matter: the instruction is trapped and ignored.
I think we don't need the second dsb because the trap handler executes dsb sy.

Contributor

I don't know if we want to jeopardize the correctness of the JVM code based on the exact instructions that are currently used to mitigate this issue in the kernel. Eliding the trailing dsb ish because we know the kernel mitigation runs it seems unnecessarily fragile to me. If the kernel comes up with some smarter and cheaper way of mitigating this in the future, using some other magic incantation, then I don't want to have a correctness issue because of that implicit assumption.

Is it noticeably expensive to run the trailing dsb again?

Member Author

Yes, according to the Arm ARM we need a dsb if we use ic. Both are redundant when hardware instruction cache coherence is enabled. On the one hand, we know that hardware icache coherence is working and ic is ignored. On the other hand, we check that hardware icache coherence is disabled, so we should follow the Arm ARM.

I don't expect the extra dsb to have a noticeable performance impact. I haven't seen any.
I agree with prioritizing correctness.
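For reference, the full sequence discussed in this thread, with the trailing dsb kept, could look like the sketch below. It is not the code from the PR. Assumptions: GCC/Clang inline assembly on Linux/AArch64; the function name `invalidate_icache_line` is hypothetical; unlike the snippet above, it passes a real address instead of xzr so that it remains valid on cores where ic ivau is not trapped; on other targets it compiles to a no-op so that the example builds anywhere.

```cpp
#include <cassert>

// Hypothetical helper: make a patched instruction at `addr` visible to
// instruction fetch. Returns true if the barrier sequence was executed.
inline bool invalidate_icache_line(const void* addr) {
  assert(addr != nullptr);
#if defined(__aarch64__) && defined(__linux__) && (defined(__GNUC__) || defined(__clang__))
  asm volatile("dsb ish       \n"   // complete the stores that patched the code
               "ic  ivau, %0  \n"   // invalidate the icache line by VA to PoU
               "dsb ish       \n"   // wait for the invalidation to complete
               "isb           \n"   // flush this core's pipeline
               : : "r"(addr) : "memory");
  return true;
#else
  (void)addr;   // no-op on other targets
  return false;
#endif
}
```

Keeping the second dsb ish means the sequence does not rely on what the kernel's trap handler happens to execute.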

@theRealAph
Contributor

I think we'll also want a workaround for CodeBuffer::relocate_code_to().

@eastig
Member Author

eastig commented Nov 20, 2025

I think we'll also want a workaround for CodeBuffer::relocate_code_to().

Also we need to fix G1NMethodClosure::do_evacuation_and_fixup and ShenandoahNMethod::oops_do. They use nmethod::fix_oop_relocations.

Should we do it in this PR or in separate PRs?

@theRealAph
Contributor

I think we'll also want a workaround for CodeBuffer::relocate_code_to().

Also we need to fix G1NMethodClosure::do_evacuation_and_fixup and ShenandoahNMethod::oops_do. They use nmethod::fix_oop_relocations.

Should we do it in this PR or in separate PRs?

Please, let's handle it all here.

Member

@xmas92 xmas92 left a comment

Some style comments.

@eastig
Member Author

eastig commented Nov 26, 2025

@xmas92
I fixed the regressions I saw for Java methods without field accesses:

  • -XX:+NeoverseN1Errata1542419 before the fix
Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   88.865 ± 19.299  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   90.572 ± 14.750  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   10.219 ±  0.877  ms/op
  • -XX:+NeoverseN1Errata1542419 after the fix
Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt   Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3  60.847 ± 23.735  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3  62.338 ±  5.663  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   4.956 ±  1.440  ms/op
  • -XX:-NeoverseN1Errata1542419
Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt   Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3  67.144 ± 15.187  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3  70.181 ± 30.271  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3   7.906 ±  2.118  ms/op

I'll check SpecJVM as well.

@eastig
Member Author

eastig commented Nov 27, 2025

@theRealAph @fisk
As we have explicit synchronization for the patched code, I decided to run an experiment with deferred icache invalidation on Graviton 3 (Neoverse V1).
Graviton 3 does not have the Neoverse N1 bug. It has hardware dcache and icache coherence. With full hardware coherence, every ICache::invalidate operation reduces to:

dsb ish
isb

From my experience implementing spin pauses, isb is what we use for pauses. So multiple ICache::invalidate calls amount to a series of pauses.

Without deferred icache invalidation (baseline):

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   41.290 ±  7.596  ms/op
GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   95.773 ±  6.059  ms/op
GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  137.183 ± 12.896  ms/op
GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  219.030 ± 19.101  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   43.762 ±  3.818  ms/op
GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   97.525 ±  8.434  ms/op
GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  139.555 ± 17.159  ms/op
GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  221.163 ±  8.908  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3    3.052 ±  2.823  ms/op
GCPatchingNmethodCost.youngGC                      2           5000  avgt    3   13.956 ±  1.984  ms/op
GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   22.364 ±  0.626  ms/op
GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   39.821 ±  0.241  ms/op

With deferred icache invalidation:

Benchmark                       (accessedFieldCount)  (methodCount)  Mode  Cnt    Score    Error  Units
GCPatchingNmethodCost.fullGC                       0           5000  avgt    3   41.212 ± 10.914  ms/op
GCPatchingNmethodCost.fullGC                       2           5000  avgt    3   83.059 ± 17.115  ms/op
GCPatchingNmethodCost.fullGC                       4           5000  avgt    3  110.061 ±  2.642  ms/op
GCPatchingNmethodCost.fullGC                       8           5000  avgt    3  161.202 ±  5.750  ms/op
GCPatchingNmethodCost.systemGC                     0           5000  avgt    3   44.061 ±  7.586  ms/op
GCPatchingNmethodCost.systemGC                     2           5000  avgt    3   84.262 ± 11.852  ms/op
GCPatchingNmethodCost.systemGC                     4           5000  avgt    3  112.317 ±  3.907  ms/op
GCPatchingNmethodCost.systemGC                     8           5000  avgt    3  163.684 ±  9.732  ms/op
GCPatchingNmethodCost.youngGC                      0           5000  avgt    3    2.949 ±  0.626  ms/op
GCPatchingNmethodCost.youngGC                      2           5000  avgt    3    9.997 ±  1.334  ms/op
GCPatchingNmethodCost.youngGC                      4           5000  avgt    3   14.953 ±  1.121  ms/op
GCPatchingNmethodCost.youngGC                      8           5000  avgt    3   23.966 ±  1.656  ms/op

Improvements:

  • 2 fields accessed
    • Full GC: 13%
    • System GC: 14%
    • Young GC: 28%
  • 4 fields accessed
    • Full GC: 20%
    • System GC: 20%
    • Young GC: 33%
  • 8 fields accessed
    • Full GC: 26%
    • System GC: 26%
    • Young GC: 40%
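The percentages above are the relative reduction in average time, (baseline - deferred) / baseline, rounded to the nearest percent. A small sketch that reproduces them from the two Graviton 3 tables (the function name `improvement_percent` is illustrative):

```cpp
#include <cassert>
#include <cmath>

// Relative reduction of `deferred_ms` versus `baseline_ms`, rounded to a
// whole percent. Positive values mean the deferred variant is faster.
inline long improvement_percent(double baseline_ms, double deferred_ms) {
  assert(baseline_ms > 0.0);
  return std::lround((baseline_ms - deferred_ms) / baseline_ms * 100.0);
}
```

For example, `improvement_percent(95.773, 83.059)` gives 13 (Full GC, 2 fields) and `improvement_percent(39.821, 23.966)` gives 40 (Young GC, 8 fields).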

Copilot AI review requested due to automatic review settings December 3, 2025 14:35

Copilot AI left a comment

Hi @copilot-pull-request-reviewer[bot], thanks for making a comment in an OpenJDK project!

All comments and discussions in the OpenJDK Community must be made available under the OpenJDK Terms of Use. If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please Use "Add GitHub user copilot-pull-request-reviewer[bot]" for the summary.

If you are not an OpenJDK Author, Committer or Reviewer, simply check the box below to accept the OpenJDK Terms of Use for your comments.

Your comment will be automatically restored once you have accepted the OpenJDK Terms of Use.

@openjdk openjdk bot added shenandoah shenandoah-dev@openjdk.org and removed rfr Pull request is ready for review labels Dec 3, 2025
@openjdk

openjdk bot commented Dec 3, 2025

@eastig shenandoah has been added to this pull request based on files touched in new commit(s).

@eastig eastig changed the title from "8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GenZGC performance" to "8370947: Mitigate Neoverse-N1 erratum 1542419 negative impact on GCs and JIT performance" Dec 3, 2025
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 3, 2025
@eastig
Member Author

eastig commented Dec 3, 2025

@theRealAph @fisk @shipilev
I have updated all places to use optimized icache invalidation. Could you please have a look?
I am running different tests and benchmarks.

@fisk @shipilev

  • I added nmethod::has_non_immediate_oops. I think it's easy to detect them when we generate code. If this is OK, we might need to update ZNMethod::attach_gc_data and ShenandoahNMethod::detect_reloc_oops.
  • Code of G1NMethodClosure::do_evacuation_and_fixup(nmethod* nm) looks strange:
  _oc.set_nm(nm);

  // Evacuate objects pointed to by the nmethod
  nm->oops_do(&_oc);

  if (_strong) {
    // CodeCache unloading support
    nm->mark_as_maybe_on_stack();

    BarrierSetNMethod* bs_nm = BarrierSet::barrier_set()->barrier_set_nmethod();
    bs_nm->disarm(nm);
  }

  ICacheInvalidationContext icic(nm->has_non_immediate_oops());

  nm->fix_oop_relocations();

If _strong is true, we disarm nm and patch it with fix_oop_relocations. I have assertions checking that we can defer icache invalidation. Neither of them is triggered. I think this path always happens at a safepoint.

@eastig
Member Author

eastig commented Dec 3, 2025

/cc hotspot-gc

@openjdk openjdk bot added the hotspot-gc hotspot-gc-dev@openjdk.org label Dec 3, 2025
@openjdk

openjdk bot commented Dec 3, 2025

@eastig
The hotspot-gc label was successfully added.

@eastig eastig requested review from fisk and theRealAph December 3, 2025 15:52
Member

@shipilev shipilev left a comment

Interesting work! I was able to look through it very briefly:

Comment on lines +183 to +184
@Warmup(iterations = 0)
@Measurement(iterations = 1)
Member

Not sure what is the intent here. Maybe you wanted @BenchmarkMode(OneShot) instead?

Member Author

The current algorithm:

  • Create an object used in Java methods.
  • Run the methods in the interpreter.
  • Compile the methods.
  • Make the object garbage collectable.
  • Run GC (we measure this).

There is not much to warm up, and setting everything up for multiple iterations of GC runs might be expensive. Instead we use forks.

IMO, yes, it is @BenchmarkMode(OneShot).

Member

Yeah, but the first GC would likely be slower, because it would have more real work to do. So you probably want OneShot with the default number of iterations. It will warm up by doing a few GCs, and then do a few more GCs for measurement.

Member Author

I have Thread.sleep(1000) in setupCodeCache() to let everything settle down. I use it because I saw high variance in GC times. With it, the variance became acceptable.

Maybe I should use System.gc() instead of Thread.sleep.

So you probably want OneShot with the default number of iterations.

Will I need to recreate the object and rerun the Java methods before each iteration? The first iteration will collect the garbage object fields, so GC runs in the following iterations will have nothing to do. Or will they patch nmethods again?

Labels

hotspot hotspot-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org rfr Pull request is ready for review shenandoah shenandoah-dev@openjdk.org

6 participants