Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JDK-8284944: assert(cnt++ < 40) failed: infinite cycle in loop optimization #8767

Closed
wants to merge 2 commits into from

Conversation

tobiasholenstein
Copy link
Contributor

@tobiasholenstein tobiasholenstein commented May 18, 2022

_loop_opts_cnt is set to LoopOptsCount which can have a maximum value of 43. _loop_opts_cnt is decremented in PHASE_PHASEIDEALLOOP1, PHASE_PHASEIDEALLOOP2 and PHASE_PHASEIDEALLOOP3 before it reaches PHASE_PHASEIDEALLOOP_ITERATIONS where it is decremented further in a loop until _loop_opts_cnt is 0. The assert assumes that _loop_opts_cnt has max. value 40 in PHASE_PHASEIDEALLOOP_ITERATIONS. But when PartialPeelLoop is turned off PHASE_PHASEIDEALLOOP2 is skipped and _loop_opts_cnt can have max. value 41 in PHASE_PHASEIDEALLOOP_ITERATIONS. Therefore the assert is wrong.

I propose to remove the assert entirely since the loop already has a condition _loop_opts_cnt > 0 and _loop_opts_cnt is decremented in every iteration.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed (1 review required, with at least 1 reviewer)

Issue

  • JDK-8284944: assert(cnt++ < 40) failed: infinite cycle in loop optimization

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/8767/head:pull/8767
$ git checkout pull/8767

Update a local copy of the PR:
$ git checkout pull/8767
$ git pull https://git.openjdk.java.net/jdk pull/8767/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 8767

View PR using the GUI difftool:
$ git pr show -t 8767

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/8767.diff

@bridgekeeper
Copy link

bridgekeeper bot commented May 18, 2022

👋 Welcome back tobiasholenstein! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label May 18, 2022
@openjdk
Copy link

openjdk bot commented May 18, 2022

@tobiasholenstein The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label May 18, 2022
@tobiasholenstein tobiasholenstein changed the title JDK-8284944: # assert(cnt++ < 40) failed: infinite cycle in loop optimization JDK-8284944: assert(cnt++ < 40) failed: infinite cycle in loop optimization May 18, 2022
@mlbridge
Copy link

mlbridge bot commented May 18, 2022

Webrevs

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having > 40 cycles in loopopts is bug - something wrong gone there. That is what this assert for. We should detect such case in loop opts and stop early.
I don't think removing this assert is correct thing.

@tobiasholenstein
Copy link
Contributor Author

tobiasholenstein commented May 23, 2022

Having > 40 cycles in loopopts is bug - something wrong gone there. That is what this assert for. We should detect such case in loop opts and stop early.
I don't think removing this assert is correct thing.

Thanks for your comment!

I have attached the the output of TestMaxLoopOptsCountReached.java run with -XX:+TraceLoopOpts in JIRA: https://bugs.openjdk.java.net/secure/attachment/99052/Output%20TraceLoopOpts.txt

It does a lot of loop unswitching but it seems that all loop optimizations are valid. The way the assert is implemented it can only be triggered if PartialPeelLoop is turned off. When PartialPeelLoop is turned on the assert can never be triggered and the loop optimizations are stopped after LoopOptsCount iterations (see PR description). I don't think it necessarily means that there is a bug if LoopOptsCount many loop optimizations are performed.

I agree that this could help detect potential bugs, but there can also be false positives (e.g. TestMaxLoopOptsCountReached.java). I am not sure if there is a reliable way to detect such false positives. What do you think?

@vnkozlov
Copy link
Contributor

As I commented in bug report, it took 54 (cnt == 52) iterations to finish compilation TestMaxLoopOptsCountReached::test method.
Should we just rise default LoopOptsCount flag's value and limit in the assert?

@vnkozlov
Copy link
Contributor

The original Test.java did not finish after 1000 iterations and I don't see what loop opts it executed.

@vnkozlov
Copy link
Contributor

I think first thing to do is to make sure we got report about what cause major_progress to be true in those loop opts iteration. Currently I see only repeated output in Test.java.

I also found that without "Strip Mining" (I used -XX:+UseParallelGC because I can't find how to switch it off otherwise) the Test.java finished in 9 iterations (cnt == 7) !!! So it is definitely something wrong when we use "Strip Mining" in this test.

@vnkozlov
Copy link
Contributor

And I don't know how they reproduced failure with JDK 8 and 11.
I run it with java -XX:+TraceLoopOpts -Xcomp -XX:-PartialPeelLoop -XX:+PrintCompilation -XX:+UseParallelGC Test

@chhagedorn
Copy link
Member

chhagedorn commented May 24, 2022

As I commented in bug report, it took 54 (cnt == 52) iterations to finish compilation TestMaxLoopOptsCountReached::test method. Should we just rise default LoopOptsCount flag's value and limit in the assert?

I think the problem is to determine what a good value should be. I'm afraid we could probably come up with some other cases where we can reach the limit again and hit the assert with a larger value. I agree that there might be a problem with major_progess and we should not do so many optimizations. But in general, I'm not sure how we can prove that we will only hit the assert in case of a real bug and not just a false positive. Maybe we would need some additional heuristics to make a decision (like handling the live node limit)? This would need some more careful investigation.

The way the assert is currently implemented does not help us. It will never be hit without explicitly disabling partial peeling. So, I think we should either remove it or change its value to 39 to possibly let it fail on the very last loop opts iteration. However, since we already have some different cases where we would fail with 39, this is currently not a good option. But I agree with Vladimir that it would be good to have some mechanism to warn us about such problems in the future.

So, I think there are 2 problems to solve:

  • Investigating the known cases so far and figure out why they need so many loop opts iterations.
  • Make the assert useful again (blocked by the first problem).

I'd suggest to investigate the cases we've found so far and file separate bugs for them if they turn out to be real bugs. Additionally, I suggest the following 2 options:

  • Remove the assert for now (given that it has never worked as it was supposed to work) and file an RFE to check if we can reintroduce it once the known cases/bugs are fixed and we have some confidence to not hit it again (could include tweaking the LoopOptsCount max value or introducing some other heuristics to avoid false positives).
  • Keep the assert as we have it today and just defer this bug and treat it like the RFE above.

I would opt for option 1 given that the assert has no real value as of today.

What do you think?

Thanks,
Christian

@vnkozlov
Copy link
Contributor

  • Remove the assert for now (given that it has never worked as it was supposed to work) and file an RFE to check if we can reintroduce it once the known cases/bugs are fixed and we have some confidence to not hit it again (could include tweaking the LoopOptsCount max value or introducing some other heuristics to avoid false positives).
  • Keep the assert as we have it today and just defer this bug and treat it like the RFE above.

I would opt for option 1 given that the assert has no real value as of today.

Okay, I agree.

@openjdk
Copy link

openjdk bot commented May 24, 2022

@tobiasholenstein This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8284944: assert(cnt++ < 40) failed: infinite cycle in loop optimization

Reviewed-by: kvn, chagedorn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 103 new commits pushed to the master branch:

  • 6585043: 8287181: Avoid redundant HashMap.containsKey calls in InternalLocaleBuilder.setExtension
  • a0cccb5: 8286490: JvmtiEventControllerPrivate::set_event_callbacks CLEARING_MASK computation is incorrect
  • 742644e: 8287263: java/nio/channels/FileChannel/LargeMapTest.java times out on Windows
  • d792cbc: 8286211: Update PCSC-Lite for Suse Linux to 1.9.5
  • 9b7e42c: 8284209: Replace remaining usages of 'a the' in source code
  • 6cc4bb1: 6725221: Standardize obtaining boolean properties with defaults
  • 194bc08: 8287249: Zero: Missing BarrierSetNMethod::arm() method
  • 25669bb: 8275170: Some jtreg sound tests should be marked with sound keyword
  • 4518063: 8281682: Redundant .ico files in Windows app-image cause unnecessary bloat
  • fdece9a: 8287169: compiler/arguments/TestCompileThresholdScaling.java fails on x86_32 after JDK-8287052
  • ... and 93 more: https://git.openjdk.java.net/jdk/compare/69ff86a32088d9664e5e0dae12edddc0643e3fd3...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@vnkozlov, @chhagedorn) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label May 24, 2022
@chhagedorn
Copy link
Member

  • Remove the assert for now (given that it has never worked as it was supposed to work) and file an RFE to check if we can reintroduce it once the known cases/bugs are fixed and we have some confidence to not hit it again (could include tweaking the LoopOptsCount max value or introducing some other heuristics to avoid false positives).
  • Keep the assert as we have it today and just defer this bug and treat it like the RFE above.

I would opt for option 1 given that the assert has no real value as of today.

Okay, I agree.

Thanks for your feedback, Vladimir. If @tobiasholenstein also agrees, we can file the follow-up RFE and integrate this change.

@tobiasholenstein
Copy link
Contributor Author

Thanks @vnkozlov and @chhagedorn for you feedback. I will file a follow-up RFE to reintroduce a better check for loop optimization progress and a Bug for the infinite loop in loop optimizations.

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label May 25, 2022
@openjdk
Copy link

openjdk bot commented May 25, 2022

@tobiasholenstein
Your change (at version 73871c7) is now ready to be sponsored by a Committer.

@chhagedorn
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented May 25, 2022

Going to push as commit 796494d.
Since your change was applied there have been 113 commits pushed to the master branch:

  • e534c13: 8286398: Address possibly lossy conversions in jdk.internal.le
  • e21b527: 8228990: JFR: TestNetworkUtilizationEvent.java expects 2+ Network interfaces on Linux but finding 1
  • e9bddc1: 8262889: Compiler implementation for Record Patterns
  • ebfa27b: 8284690: [macos] VoiceOver : Getting java.lang.IllegalArgumentException: Invalid location on Editable JComboBox
  • e32377e: 8286304: Removal of diagnostic flag GCParallelVerificationEnabled
  • e1f140d: 8287206: Use WrongThreadException for confinement errors
  • e17118a: 8272791: java -XX:BlockZeroingLowLimit=1 crashes after 8270947
  • f7a37f5: 8287202: GHA: Add macOS aarch64 to the list of default platforms for workflow_dispatch event
  • d889319: 8287254: Clean up Xcode sysroot logic
  • 593d2b7: 8287091: aarch64 : guarantee(val < (1ULL << nbits)) failed: Field too big for insn
  • ... and 103 more: https://git.openjdk.java.net/jdk/compare/69ff86a32088d9664e5e0dae12edddc0643e3fd3...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 25, 2022
@openjdk openjdk bot closed this May 25, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 25, 2022
@openjdk openjdk bot removed the sponsor Pull request is ready to be sponsored label May 25, 2022
@openjdk
Copy link

openjdk bot commented May 25, 2022

@chhagedorn @tobiasholenstein Pushed as commit 796494d.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@tobiasholenstein tobiasholenstein deleted the JDK-8284944 branch August 29, 2022 07:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
3 participants