Skip to content

Conversation

@rwestrel
Copy link
Contributor

@rwestrel rwestrel commented Dec 2, 2024

8234003 (Improve IndexSet iteration) broke live in computation:

@@ -273,23 +276,25 @@ void PhaseLive::add_liveout( Block *p, IndexSet *lo, VectorSet &first_pass ) {
 // Add a vector of live-in values to a given blocks live-in set.
 void PhaseLive::add_livein(Block *p, IndexSet *lo) {
   IndexSet *livein = &_livein[p->_pre_order-1];
-  IndexSetIterator elements(lo);
-  uint r;
-  while ((r = elements.next()) != 0) {
-    livein->insert(r);         // Then add to live-in set
+  if (!livein->is_empty()) {
+    IndexSetIterator elements(lo);
+    uint r;
+    while ((r = elements.next()) != 0) {
+      livein->insert(r);         // Then add to live-in set
+    }
   }
 }

livein is initially empy and the patch above only adds element to it if:

if (!livein->is_empty()) {

which is never true.

This doesn't affect correctness as live in sets are only used to drive
scheduling.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8345287: C2: live in computation is broken (Bug - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22473/head:pull/22473
$ git checkout pull/22473

Update a local copy of the PR:
$ git checkout pull/22473
$ git pull https://git.openjdk.org/jdk.git pull/22473/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22473

View PR using the GUI difftool:
$ git pr show -t 22473

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22473.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 2, 2024

👋 Welcome back roland! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Dec 2, 2024

@rwestrel This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8345287: C2: live in computation is broken

Reviewed-by: kvn, dlong, rcastanedalo

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 171 new commits pushed to the master branch:

  • 2286fae: 8345159: RISCV: Fix -Wzero-as-null-pointer-constant warning in emit_static_call_stub
  • 8403285: 8268145: [macos] Rendering artifacts is seen when text inside the JTable with TableCellEditor having JTextfield
  • aa38284: 8345435: Eliminate tier1_compiler_not_xcomp group
  • 9284602: 8345628: [BACKOUT] JDK-8287122 Use gcc12 -ftrivial-auto-var-init=pattern in debug builds
  • 41c8971: 8287122: Use gcc12 -ftrivial-auto-var-init=pattern in debug builds
  • 5da0eee: 8285692: Enable _FORTIFY_SOURCE=2 when building with Clang
  • daa2ba5: 8339622: Regression in make open-hotspot-xcode-project
  • 6f6bce5: 8344559: Log is spammed by missing pandoc warnings when building man pages
  • 5f30a8d: 8345424: Move FindDebuginfoFiles out of FileUtils.gmk
  • bf0debc: 8343890: SEGV crash in RunTimeClassInfo::klass
  • ... and 161 more: https://git.openjdk.org/jdk/compare/3b21a298c29d88720f6bfb2dc1f3305b6a3db307...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 2, 2024
@openjdk
Copy link

openjdk bot commented Dec 2, 2024

@rwestrel The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Dec 2, 2024
@mlbridge
Copy link

mlbridge bot commented Dec 2, 2024

Webrevs

@robcasloz
Copy link
Contributor

Good catch! Do you have an example where the final schedule is affected by this issue?

@dafedafe
Copy link
Contributor

dafedafe commented Dec 2, 2024

Well spotted!
I've just got one doubt: do we actually need that if (!lo->is_empty()) at all?

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 2, 2024

Good catch! Do you have an example where the final schedule is affected by this issue?

I don't. I noticed that live in sets are always empty while looking at memory usage of some IndexSet. It is puzzling that this didn't cause any performance regression. So it may be worth exploring why.

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 2, 2024

Well spotted! I've just got one doubt: do we actually need that if (!lo->is_empty()) at all?

It's a performance optimization from https://bugs.openjdk.org/browse/JDK-8234003
Otherwise the iterator code will look for the next element in the set which is done by iterating over an array that can be large even when the set is empty.
8234003 introduced the bug here.

@robcasloz
Copy link
Contributor

Good catch! Do you have an example where the final schedule is affected by this issue?

I don't. I noticed that live in sets are always empty while looking at memory usage of some IndexSet. It is puzzling that this didn't cause any performance regression. So it may be worth exploring why.

OK, thanks. I will start by running some benchmarks on x64 with and without this fix. Will report results in a couple of days.

@dafedafe
Copy link
Contributor

dafedafe commented Dec 2, 2024

It's a performance optimization from https://bugs.openjdk.org/browse/JDK-8234003
... The biggest improvement comes from avoiding iterating over empty sets
altogether.

🤦‍♂️ of course! Thanks for the explanation (sorry that I missed that)!

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 2, 2024

OK, thanks. I will start by running some benchmarks on x64 with and without this fix. Will report results in a couple of days.

Sounds good. Thanks. If there happen to be a regression, I think it would make more sense to fix the code (with this patch) and disable OptoRegScheduling (until someone figures out what's going on) than keep code that doesn't make any sense.

@robcasloz
Copy link
Contributor

Test results on Oracle CI look good, benchmarks are still running but the results look rather neutral so far.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 3, 2024
@robcasloz
Copy link
Contributor

Test results on Oracle CI look good, benchmarks are still running but the results look rather neutral so far.

Benchmarking is done now, the results of applying this changeset are as follows:

  • neutral for DaCapo (bach and chopin)
  • slight overall regression for Renaissance
  • one slight improvement (crypto.signverify, around +1%) and one noticeably regression (scimark.monte_carlo, around -5%) for SPECjvm2008 on a Coffee Lake-B processor (3.0 GHz Intel Core i5-8500B).

Given these results, I think it might be good to investigate what is the overall effect of OptoRegScheduling and whether it makes sense to keep it enabled for x64 before integrating this fix. The opposite order (integrating and then investigating how to deal with the regressions) seems like a higher-risk approach. @rwestrel @vnkozlov @dean-long what do you think?

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 5, 2024

Given these results, I think it might be good to investigate what is the overall effect of OptoRegScheduling and whether it makes sense to keep it enabled for x64 before integrating this fix. The opposite order (integrating and then investigating how to deal with the regressions) seems like a higher-risk approach. @rwestrel @vnkozlov @dean-long what do you think?

What happens to the regresions with OptoRegScheduling turned off?
It may take a while to investigate the regressions. I wouldn't delay this fix to obviously very broken code until then if possible. So maybe wait for the fork, push this fix and then create a PR to disable OptoRegScheduling?

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 5, 2024

Benchmarking is done now, the results of applying this changeset are as follows:

* neutral for DaCapo (bach and chopin)

* slight overall regression for Renaissance

* one slight improvement (crypto.signverify, around +1%) and one noticeably regression (scimark.monte_carlo, around -5%) for SPECjvm2008 on a Coffee Lake-B processor (3.0 GHz Intel Core i5-8500B).

Thanks for running benchmarks!

@robcasloz
Copy link
Contributor

What happens to the regresions with OptoRegScheduling turned off?

I will do a new benchmark run of jdk-24+26 with default configuration vs. jdk-24+26 with -XX:-OptoRegScheduling on x64 and report the results in a few days (likely early next week).

@vnkozlov
Copy link
Contributor

vnkozlov commented Dec 5, 2024

All benefits described in JDK-8234003 could be due to not iterating at all in add_liveout(). Fixing it will place us at least at the same state as before those changes.

-5% regression in monte-carlo is seen only on old macos-x64 machine which does not have AVX512 (only AVX2). It would be interesting to investigate but I think it is fine if we do that after push. So I agree to integrate this fix into JDK 24. Fixing regression could be done in 24u update release.

OptoRegScheduling should be investigated separately (in JDK 25) regardless this fix.

@robcasloz
Copy link
Contributor

All benefits described in JDK-8234003 could be due to not iterating at all in add_liveout(). Fixing it will place us at least at the same state as before those changes.

-5% regression in monte-carlo is seen only on old macos-x64 machine which does not have AVX512 (only AVX2). It would be interesting to investigate but I think it is fine if we do that after push. So I agree to integrate this fix into JDK 24. Fixing regression could be done in 24u update release.

OptoRegScheduling should be investigated separately (in JDK 25) regardless this fix.

OK.

@robcasloz
Copy link
Contributor

I will do a new benchmark run of jdk-24+26 with default configuration vs. jdk-24+26 with -XX:-OptoRegScheduling on x64 and report the results in a few days (likely early next week).

Turning off -XX:-OptoRegScheduling seems to have a slight overall negative effect on Renaissance (would require a more thorough analysis to confirm) and no measurable effect on DaCapo and SPECjvm2008 except for a slight improvement (around 1%) in crypto.signverify for older Coffee Lake-B machines. I agree with Vladimir that it would be worth further investigating (separately) whether to disable (or improve) OptoRegScheduling.

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 9, 2024

Turning off -XX:-OptoRegScheduling seems to have a slight overall negative effect on Renaissance (would require a more thorough analysis to confirm) and no measurable effect on DaCapo and SPECjvm2008 except for a slight improvement (around 1%) in crypto.signverify for older Coffee Lake-B machines.

Thanks @robcasloz for the test results. I filed https://bugs.openjdk.org/browse/JDK-8345820

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 9, 2024

@robcasloz @vnkozlov @dean-long thanks for the reviews

@rwestrel
Copy link
Contributor Author

rwestrel commented Dec 9, 2024

/integrate

@openjdk
Copy link

openjdk bot commented Dec 9, 2024

Going to push as commit cc628a1.
Since your change was applied there have been 193 commits pushed to the master branch:

  • 480b508: 8345156: C2: Add bailouts next to a few asserts
  • b120404: 8345793: Update copyright year to 2024 for the build system in files where it was missed
  • 7aa0cbc: 8345614: Improve AnnotationFormatError message for duplicate annotation interfaces
  • 35c0053: 8345405: Add JMH showing the regression in 8341649
  • 166c127: 8345726: Update mx in RunTestPrebuiltSpec to reflect change in JDK-8345302
  • e821d59: 8345589: Simplify Windows definition of strtok_r
  • 153dc6d: 8345133: Test sun/security/tools/jarsigner/TsacertOptionTest.java failed: Warning found in stdout
  • d7ef3ac: 8345684: OperatingSystemMXBean.getSystemCpuLoad() throws NPE
  • 830173f: 8344068: Windows x86-64: Out of CodeBuffer space when generating final stubs
  • 69e664d: 8345632: [ASAN] memory leak in get_numbered_property_as_sorted_string function
  • ... and 183 more: https://git.openjdk.org/jdk/compare/3b21a298c29d88720f6bfb2dc1f3305b6a3db307...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 9, 2024
@openjdk openjdk bot closed this Dec 9, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 9, 2024
@openjdk
Copy link

openjdk bot commented Dec 9, 2024

@rwestrel Pushed as commit cc628a1.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated

Development

Successfully merging this pull request may close these issues.

5 participants