Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8252372: Check if cloning is required to move loads out of loops in PhaseIdealLoop::split_if_with_blocks_post() #3689

Closed
wants to merge 6 commits into from

Conversation

@rwestrel
Copy link
Contributor

@rwestrel rwestrel commented Apr 26, 2021

Sinking data nodes out of a loop when all uses are out of a loop has
several issues that this attempts to fix.

1- Only non control uses are considered which makes little sense (why
not sink if the data node is an argument to a call or a returned
value?)

2- Sinking of Loads is broken because of the handling of
anti-dependence: the get_late_ctrl(n, n_ctrl) call returns a control
in the loop because it takes all uses into account.

3- For data nodes for which a control edge can't be set, commoning of
clones back in the loop is prevented with:
_igvn._worklist.yank(x);
which gives no guarantee

This patch tries to address all issues:

1- it looks at all uses, not only non control uses

2- anti-dependences are computed for each use independently

3- Cast nodes are used to pin clones out of loop

2- requires refactoring of the PhaseIdealLoop::get_late_ctrl()
logic. While working on this, I noticed a bug in anti-dependence
analysis: when the use is a cfg node, the code sometimes looks at uses
of the memory state of the cfg. The logic uses the use of the cfg
which is a projection of adr_type identical to the cfg. It should
instead look at the use of the memory projection.

The existing logic for sinking loads calls clear_dom_lca_tags() for
every load which seems like quite a waste. I added a
_dom_lca_tags_round variable that's or'ed with the tag_node's _idx. By
incrementing _dom_lca_tags_round, new tags that don't conflict with
existing ones are produced and there's no need for
clear_dom_lca_tags().

For anti-dependence analysis to return a correct result, early control
of the load is needed. The only way to get it at this stage, AFAICT,
is to compute it by following the load's input until a pinned node is
reached.

The existing logic pins cloned nodes next to their use. The logic I
propose pins them right out of the loop. This could possibly avoid
some redundant clones. It also makes some special handling for corner
cases with loop strip mining useless.

For 3-, I added extra Cast nodes for float types. If a chain of data
nodes are sunk, the new logic tries to keep a single Cast for the
entire chain rather than one Cast per node.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8252372: Check if cloning is required to move loads out of loops in PhaseIdealLoop::split_if_with_blocks_post()

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/3689/head:pull/3689
$ git checkout pull/3689

Update a local copy of the PR:
$ git checkout pull/3689
$ git pull https://git.openjdk.java.net/jdk pull/3689/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 3689

View PR using the GUI difftool:
$ git pr show -t 3689

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/3689.diff

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Apr 26, 2021

👋 Welcome back roland! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr label Apr 26, 2021
@openjdk
Copy link

@openjdk openjdk bot commented Apr 26, 2021

@rwestrel The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mlbridge
Copy link

@mlbridge mlbridge bot commented Apr 26, 2021

Webrevs

@TobiHartmann
Copy link
Member

@TobiHartmann TobiHartmann commented Apr 27, 2021

Hi Roland,

I didn't look at this in detail yet but gave it a quick run through our testing. I'm seeing many of the following failures with the jdk/incubator/vector/ tests:

assert(cast != __null) failed: must have added a cast to pin the node

Current CompileTask:
C2:  48324 7090    b        ShortMaxVectorTests::getShortMaxVectorTests (1916 bytes)

Stack: [0x00007f13688f8000,0x00007f13689f9000],  sp=0x00007f13689f3250,  free space=1004k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x13293b4]  PhaseIdealLoop::try_sink_out_of_loop(Node*)+0x8d4
V  [libjvm.so+0x132969c]  PhaseIdealLoop::split_if_with_blocks_post(Node*)+0x5c
V  [libjvm.so+0x132a311]  PhaseIdealLoop::split_if_with_blocks(VectorSet&, Node_Stack&)+0x201
V  [libjvm.so+0x131d2ad]  PhaseIdealLoop::build_and_optimize(LoopOptsMode)+0x121d
V  [libjvm.so+0xa4193a]  PhaseIdealLoop::optimize(PhaseIterGVN&, LoopOptsMode)+0x2da
V  [libjvm.so+0xa3d1f6]  Compile::Optimize()+0x586
V  [libjvm.so+0xa40946]  Compile::Compile(ciEnv*, ciMethod*, int, bool, bool, bool, bool, DirectiveSet*)+0x2276
V  [libjvm.so+0x861cfa]  C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x1ea
V  [libjvm.so+0xa50a59]  CompileBroker::invoke_compiler_on_method(CompileTask*)+0xfb9
V  [libjvm.so+0xa517c8]  CompileBroker::compiler_thread_loop()+0x5a8
V  [libjvm.so+0x18bc4d1]  JavaThread::thread_main_inner()+0x271
V  [libjvm.so+0x18c3f10]  Thread::call_run()+0x100
V  [libjvm.so+0x159a26e]  thread_native_entry(Thread*)+0x10e

jdk/incubator/vector/ShortMaxVectorTests.java with -ea -esa -XX:CompileThreshold=100 -XX:+UnlockExperimentalVMOptions -Xcomp

@rwestrel
Copy link
Contributor Author

@rwestrel rwestrel commented Apr 27, 2021

I didn't look at this in detail yet but gave it a quick run through our testing. I'm seeing many of the following failures with the jdk/incubator/vector/ tests:

Thanks for running the tests. I will work on the failures.

@rwestrel
Copy link
Contributor Author

@rwestrel rwestrel commented Apr 27, 2021

That issue should be fixed. I added a CastVV node for vectors.

@TobiHartmann
Copy link
Member

@TobiHartmann TobiHartmann commented May 3, 2021

Thanks, testing looks good now. I need some more time to review this though.

Copy link
Member

@TobiHartmann TobiHartmann left a comment

This is hard to review but looks reasonable to me. Performance and correctness testing also looks good.

if (u_loop->_child) {
if (useblock == u_loop->_head && u_loop->_head->is_OuterStripMinedLoop()) {
return u_loop->_head->in(LoopNode::EntryControl);
Node* PhaseIdealLoop::place_near_use(Node* useblock, IdealLoopTree* loop) const {
Copy link
Member

@TobiHartmann TobiHartmann May 11, 2021

Maybe the name and comment should be adjusted since we no longer place it next to the use but right outside of the loop.

Copy link
Contributor Author

@rwestrel rwestrel May 17, 2021

Right. Updated.

@openjdk
Copy link

@openjdk openjdk bot commented May 11, 2021

@rwestrel This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8252372: Check if cloning is required to move loads out of loops in PhaseIdealLoop::split_if_with_blocks_post()

Reviewed-by: thartmann, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 94 new commits pushed to the master branch:

  • fe33343: 8256304: should MonitorUsedDeflationThreshold be experimental or diagnostic
  • 8f10c5a: 8267190: Optimize Vector API test operations
  • 94cfeb9: 8256155: Allow multiple large page sizes to be used on Linux
  • ec8a809: 8267119: switch expressions lack support for deferred type-checking
  • 4ba7613: 8267332: xor value should handle bounded values
  • ee2651b: 8203359: Container level resources events
  • b5d32bb: 8260690: JConsole User Guide Link from the Help menu is not accessible by keyboard
  • e48d7d6: 8264218: Public method javax.swing.JMenu.setComponentOrientation() has no spec
  • 9eaa4af: 8267056: tools/jpackage/share/RuntimePackageTest.java fails with NoSuchFileException
  • e094f3f: 8266856: Make element void
  • ... and 84 more: https://git.openjdk.java.net/jdk/compare/f4227879b0504bc4656bd84153b13b04c007aa66...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label May 11, 2021
@rwestrel
Copy link
Contributor Author

@rwestrel rwestrel commented May 17, 2021

This is hard to review but looks reasonable to me. Performance and correctness testing also looks good.

Thanks for reviewing it.

Copy link
Contributor

@vnkozlov vnkozlov left a comment

This looks reasonable to me.
Did we got performance results for it?

@TobiHartmann
Copy link
Member

@TobiHartmann TobiHartmann commented May 19, 2021

Yes, I've linked it to the bug.

@rwestrel
Copy link
Contributor Author

@rwestrel rwestrel commented May 26, 2021

@vnkozlov thanks for the review

@rwestrel
Copy link
Contributor Author

@rwestrel rwestrel commented May 26, 2021

/integrate

@openjdk openjdk bot closed this May 26, 2021
@openjdk openjdk bot added integrated and removed ready rfr labels May 26, 2021
@openjdk
Copy link

@openjdk openjdk bot commented May 26, 2021

@rwestrel Since your change was applied there have been 141 commits pushed to the master branch:

  • 0394416: 8267468: Rename refill waster counters in ThreadLocalAllocBuffer
  • b33b8bc: 8267750: Incomplete fix for JDK-8267683
  • ac36b7d: 8267452: Delegate forEachRemaining in Spliterators.iterator()
  • d0d2ddc: 8267651: runtime/handshake/HandshakeTimeoutTest.java times out when dumping core
  • a98e476: 8267311: vmTestbase/gc/gctests/StringInternGC/StringInternGC.java eventually OOMEs
  • 5aa45f2: 8267403: tools/jpackage/share/FileAssociationsTest.java#id0 failed with "Error: Bundler "Mac PKG Package" (pkg) failed to produce a package"
  • c20ca42: 8267691: Change table to obsolete CriticalJNINatives in JDK 18, not 17
  • e751b7b: 8267683: rfc7301Grease8F value not displayed correctly in SSLParameters javadoc
  • 0b77359: 8224243: Add implSpec's to AccessibleObject and seal Executable
  • 594d454: 8267574: Dead code in HtmlStyle/HtmlDocletWriter
  • ... and 131 more: https://git.openjdk.java.net/jdk/compare/f4227879b0504bc4656bd84153b13b04c007aa66...master

Your commit was automatically rebased without conflicts.

Pushed as commit 9d305b9.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
3 participants