8276116: C2: optimize long range checks in int counted loops #6576

Closed
wants to merge 7 commits into from

Conversation

rwestrel
Contributor

@rwestrel rwestrel commented Nov 26, 2021

Maurizio noticed that some of his Panama micro benchmarks don't
perform better with 8259609 (C2: optimize long range checks in long
counted loops). The reason is that 8259609 optimizes long range checks
in long counted loops but some of his benchmarks include long range
checks in int counted loops:

for (int i = start; i < stop; i += inc) {
    Objects.checkIndex(scale * ((long)i) + offset, length);
}

This change applies the transformation from 8259609 for long counted
loop/long range checks to int counted loop/long range checks. That
includes creating a loop nest and transforming the long range check to
an int range check that's subject to range elimination in the inner
loop.

The reason it's required to create a loop nest is that the long range
check transformation logic depends on no overflow of scale * i for the
range of values that the transformed range check is applied to.
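
To make the loop nest idea concrete, here is a minimal source-level sketch. It is not the C2 implementation, which works on the compiler's IR; the class name, the method names, the `maxInner` chunk size, and the choice of a `long` `scale` are all assumptions made for illustration, and `inc` is assumed to be positive. The outer loop advances in chunks and the inner loop uses a small int index `j`, so `scale * j` stays within a bounded range for every check in the inner loop. The sketch shows only the nest shape, not the rewrite of the check into an int range check.

```java
import java.util.Objects;

class LoopNestSketch {
    // Original shape: int counted loop with a long range check.
    static long original(int start, int stop, int inc,
                         long scale, long offset, long length) {
        long sum = 0;
        for (int i = start; i < stop; i += inc) {
            sum += Objects.checkIndex(scale * ((long) i) + offset, length);
        }
        return sum;
    }

    // Hand-written nest (illustrative only, assumes inc > 0 and a small maxInner).
    static long nested(int start, int stop, int inc,
                       long scale, long offset, long length, int maxInner) {
        long sum = 0;
        for (long outer = start; outer < stop; ) {
            // Long arithmetic is done once per chunk, outside the inner loop.
            long base = scale * outer + offset;
            int innerLimit = (int) Math.min(stop - outer, (long) maxInner);
            int j = 0;
            for (; j < innerLimit; j += inc) {
                // j is in [0, maxInner), so scale * j covers a bounded range.
                sum += Objects.checkIndex(base + scale * j, length);
            }
            // j is now the first multiple of inc at or past innerLimit.
            outer += j;
        }
        return sum;
    }
}
```

Both methods visit the same indices in the same order, so for in-range inputs they return the same sum regardless of the chunk size chosen.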

As a consequence, this change is mostly refactoring to make the loop
nest creation and range check transformation parameterized by the type
of the transformed loop.

I think this transformation needs to be applied as late as possible
but, in the case of an int counted loop, before pre/main/post loops
are created. I had to move it to IdealLoopTree::iteration_split_impl()
because of that.

There's an alternate shape for a long range check in an int counted
loop that Maurizio insisted needs to be supported:

for (int i = start; i < stop; i += inc) {
    Objects.checkIndex(((long)(scale * i)) + offset, length);
}

scale * i can overflow in that case. This is also supported but as a
corner case of the previous one. The code in
PhaseIdealLoop::transform_long_range_checks() has a comment about
that.
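
The difference between the two shapes can be seen with plain arithmetic. The sketch below assumes `scale` is an int (the class and method names are hypothetical): widening `i` before the multiplication keeps the full 64-bit product, while multiplying first wraps around in 32 bits before the result is widened.

```java
class OverflowShapes {
    // First shape: i is widened to long, so the product is computed in 64 bits.
    static long widenThenMultiply(int scale, int i, long offset) {
        return scale * ((long) i) + offset;
    }

    // Alternate shape: scale * i is an int multiplication that can wrap
    // around before the cast to long.
    static long multiplyThenWiden(int scale, int i, long offset) {
        return ((long) (scale * i)) + offset;
    }
}
```

With `scale = 4` and `i = 1 << 30`, the first method returns 4294967296 plus the offset, while the second returns just the offset because the int product wraps to 0. For small values of `i` the two agree.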

Note also that this transformation works best if loop strip mining is
enabled (that is for G1, ZGC, Shenandoah by default). The reason is
that it needs a safepoint and when loop strip mining is enabled, the
outer loop contains one that's always available. A way to have this
work as well for all GCs would be to always construct the loop strip
mining loop nest (whether loop strip mining is enabled or not) and
then only once loop opts are over remove the outer loop when loop
strip mining is disabled. I'm looking for feedback on this.

BTW, something doesn't seem right in IdealLoopTree::iteration_split_impl():

https://github.com/rwestrel/jdk/blob/master/src/hotspot/share/opto/loopTransform.cpp#L3475

should_peel causes transformations to be skipped but peeling is never
applied AFAICT. Does it make sense to anyone?


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8276116: C2: optimize long range checks in int counted loops

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/6576/head:pull/6576
$ git checkout pull/6576

Update a local copy of the PR:
$ git checkout pull/6576
$ git pull https://git.openjdk.java.net/jdk pull/6576/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 6576

View PR using the GUI difftool:
$ git pr show -t 6576

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/6576.diff

@bridgekeeper

bridgekeeper bot commented Nov 26, 2021

👋 Welcome back roland! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Nov 26, 2021

@rwestrel The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Nov 26, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Nov 26, 2021
@mlbridge

mlbridge bot commented Nov 26, 2021

Webrevs

Member

@TobiHartmann TobiHartmann left a comment

No review yet; I just ran this through testing, and TestLongRangeCheck.java fails with:

java.lang.RuntimeException: should have been deoptimized
	at TestLongRangeCheck.assertIsNotCompiled(TestLongRangeCheck.java:60)
	at TestLongRangeCheck.test(TestLongRangeCheck.java:127)
	at TestLongRangeCheck.main(TestLongRangeCheck.java:215)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:577)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
	at java.base/java.lang.Thread.run(Thread.java:833)

Flags are -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation

@rwestrel
Contributor Author

@TobiHartmann thanks for running testing. That one should be fixed now.

@TobiHartmann
Member

New round of testing all passed.

Contributor

@vnkozlov vnkozlov left a comment

In general looks good to me.

@@ -13129,6 +13129,24 @@ instruct cmovLL_mem_LTGE(cmpOp cmp, flagsReg_long_LTGE flags, eRegL dst, load_lo
ins_pipe( pipe_cmov_reg_long );
%}

instruct cmovLL_reg_LTGE_U(cmpOpU cmp, flagsReg_ulong_LTGE flags, eRegL dst, eRegL src) %{
Contributor

How is this related to these changes? It seems like an addition to the 8277324 changes. It could be pushed separately.

Contributor Author

In general looks good to me.

Thanks for reviewing this change.

Contributor Author

How is this related to these changes? It seems like an addition to the 8277324 changes. It could be pushed separately.

That showed up in GitHub testing, because of the new unsigned_min I think. So not including it would break x86_32.

Contributor

Okay then.

Contributor

@vnkozlov vnkozlov left a comment

Testing and performance results look fine.

@openjdk

openjdk bot commented Dec 8, 2021

@rwestrel This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8276116: C2: optimize long range checks in int counted loops

Reviewed-by: kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1 new commit pushed to the master branch:

  • 37921e3: 8269258: java/net/httpclient/ManyRequestsLegacy.java failed with connection timeout

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 8, 2021
@vnkozlov
Contributor

vnkozlov commented Dec 8, 2021

Tobias's tier6-7 passed. @rwestrel you can integrate.

@rwestrel
Contributor Author

rwestrel commented Dec 8, 2021

Tobias's tier6-7 passed. @rwestrel you can integrate.

Thanks for the review @vnkozlov

@rwestrel
Contributor Author

rwestrel commented Dec 8, 2021

/integrate

@openjdk

openjdk bot commented Dec 8, 2021

Going to push as commit b3faecf.
Since your change was applied there have been 18 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Dec 8, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 8, 2021
@openjdk

openjdk bot commented Dec 8, 2021

@rwestrel Pushed as commit b3faecf.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

exp = exp->in(1);
bt = T_INT;
if (converted != NULL) {
*converted = true;
Contributor

It's best not to assign *converted until the function returns success.

It might also be wise to assign *converted to false as well as true, as the case may be.

I noticed that there are uses of the function on several different inputs but with the same converted pointer. If one use sets converted to true but returns false, and another use returns true, then the original caller can get a bad converted flag out of the deal.
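
The hazard described above can be illustrated with a small analog. The code under review is C++ inside C2; the Java sketch below is a hypothetical stand-in in which a `boolean[]` plays the role of the `converted` out-pointer and a string prefix stands in for a conversion node. None of these names come from the patch.

```java
class ConvertedFlagSketch {
    // Sets the flag as soon as a conversion is seen, even if the match fails later.
    static boolean matchEager(String exp, boolean[] converted) {
        if (exp.startsWith("(long)")) {
            exp = exp.substring("(long)".length());
            converted[0] = true; // assigned before success is known
        }
        return exp.equals("i");
    }

    // Writes the flag only on success, and writes false as well as true.
    static boolean matchSafe(String exp, boolean[] converted) {
        boolean sawConversion = false;
        if (exp.startsWith("(long)")) {
            exp = exp.substring("(long)".length());
            sawConversion = true;
        }
        if (!exp.equals("i")) {
            return false; // flag left untouched on failure
        }
        converted[0] = sawConversion;
        return true;
    }
}
```

If a caller tries `"(long)k"` and then `"i"` with the same flag array, `matchEager` fails on the first input but leaves the flag set, so the successful second call appears to have involved a conversion. `matchSafe` reports `false` for the flag in that sequence.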

Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org integrated Pull request has been integrated
4 participants