Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8254317: C2: Resource consumption of ConvI2LNode::Ideal() grows exponentially #727

Closed
wants to merge 7 commits into from

Conversation

@robcasloz
Copy link
Contributor

@robcasloz robcasloz commented Oct 19, 2020

Prevent exponential number of calls to ConvI2LNode::Ideal() when AddIs are used multiple times by other AddIs in the optimization ConvI2L(AddI(x, y)) -> AddL(ConvI2L(x), ConvI2L(y)). This is achieved by (1) reusing existing ConvI2Ls if possible rather than eagerly creating new ones and (2) postponing the optimization of newly created ConvI2Ls. Remove hook node solution introduced in 8217359, since this is subsumed by (2). Use phase->is_IterGVN() rather than can_reshape to check if ConvI2LNode::Ideal() is called within iterative GVN, for clarity. Add regression tests that cover different shapes and sizes of AddI subgraphs, implicitly checking (by not timing out) that there is no combinatorial explosion.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Testing

Linux x64 Linux x86 Windows x64 macOS x64
Build ✔️ (5/5 passed) ✔️ (2/2 passed) ✔️ (2/2 passed) ✔️ (2/2 passed)
Test (tier1) ✔️ (9/9 passed) ✔️ (9/9 passed) ✔️ (9/9 passed) ✔️ (9/9 passed)

Issue

  • JDK-8254317: C2: Resource consumption of ConvI2LNode::Ideal() grows exponentially

Reviewers

Contributors

  • Vladimir Ivanov <vlivanov@openjdk.org>

Download

$ git fetch https://git.openjdk.java.net/jdk pull/727/head:pull/727
$ git checkout pull/727

robcasloz added 2 commits Oct 16, 2020
…entially

In the optimization ConvI2L(AddI(x, y)) -> AddL(ConvI2L(x), ConvI2L(y)) within
ConvI2LNode::Ideal(), handle the special case x = y by feeding both inputs of
AddL from a single ConvI2L node rather than creating two semantically equivalent
ConvI2L nodes. This avoids an exponential number of calls to
ConvI2LNode::Ideal() when dealing with long chains of AddI nodes. Disable the
optimization for the pattern ConvI2L(SubI(x, x)), which is simplified to zero
during parsing anyway. Add a set of regression tests for the transformation that
cover different shapes of AddI subgraphs. Also add a microbenchmark that
exercises the special case, for performance regression testing.
@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Oct 19, 2020

👋 Welcome back rcastanedalo! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Oct 19, 2020

@robcasloz The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 19, 2020

Tested on tier1-3 on windows-x64, linux-x64, linux-aarch64, and macosx-x64 in both release and debug mode.

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 19, 2020

The microbenchmark gives the following results before and after the proposed fix (linux-x86_64-server-release build run on Intel Core i7-9850H CPU @ 2.60GHz with 32 GB main memory):

Version Mode Cnt Score Error Units
before the fix ss 10 15402.110 171.255 ms/op
after the fix ss 10 0.976 0.069 ms/op

"Single-shot" (ss) benchmarking mode is used here because the focus is on measuring C2 execution time.

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 19, 2020

/summary
In the optimization ConvI2L(AddI(x, y)) -> AddL(ConvI2L(x), ConvI2L(y)) within
ConvI2LNode::Ideal(), handle the special case x = y by feeding both inputs of
AddL from a single ConvI2L node rather than creating two semantically equivalent
ConvI2L nodes. This avoids an exponential number of calls to
ConvI2LNode::Ideal() when dealing with long chains of AddI nodes. Disable the
optimization for the pattern ConvI2L(SubI(x, x)), which is simplified to zero
during parsing anyway. Add a set of regression tests for the transformation that
cover different shapes of AddI subgraphs. Also add a microbenchmark that
exercises the special case, for performance regression testing.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Oct 19, 2020

@robcasloz Setting summary to:

In the optimization ConvI2L(AddI(x, y)) -> AddL(ConvI2L(x), ConvI2L(y)) within
ConvI2LNode::Ideal(), handle the special case x = y by feeding both inputs of
AddL from a single ConvI2L node rather than creating two semantically equivalent
ConvI2L nodes. This avoids an exponential number of calls to
ConvI2LNode::Ideal() when dealing with long chains of AddI nodes. Disable the
optimization for the pattern ConvI2L(SubI(x, x)), which is simplified to zero
during parsing anyway. Add a set of regression tests for the transformation that
cover different shapes of AddI subgraphs. Also add a microbenchmark that
exercises the special case, for performance regression testing.

Loading

@robcasloz robcasloz marked this pull request as ready for review Oct 19, 2020
@openjdk openjdk bot added the rfr label Oct 19, 2020
@mlbridge
Copy link

@mlbridge mlbridge bot commented Oct 19, 2020

Webrevs

Loading

Copy link

@iwanowww iwanowww left a comment

2 concerns with the proposed fix on my side:

  • I’m not persuaded that covering “x == y” is enough to completely eliminates the issue;

  • The issue demonstrates there’s still a chance to introduce very deep recursion involving Compile::constrained_convI2L and PhaseIterGVN which can cause a crash.

IMO the root cause is an eager transformation happening top-down on ConvI2L nodes and it defeats memoization GVN naturally provides, so it causes a combinatorial explosion. If subsequent Compile::constrained_convI2L() calls could share the same ConvI2L node for the same input, it would be a more reliable fix for the problem.

Otherwise, the transformation may be extracted from GVN and turned into a separate pass (take a look at Compile::optimize_logic_cones as an example).

Some comments on the tests: (1) please, group the individual test cases into a single test class; and (2) I suggest to turn the benchmark into a test case which fails with timeout when fix is absent.

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 19, 2020

2 concerns with the proposed fix on my side:

* I’m not persuaded that covering “x == y” is enough to completely eliminates the issue;

* The issue demonstrates there’s still a chance to introduce very deep recursion involving `Compile::constrained_convI2L` and `PhaseIterGVN` which can cause a crash.

IMO the root cause is an eager transformation happening top-down on ConvI2L nodes and it defeats memoization GVN naturally provides, so it causes a combinatorial explosion. If subsequent Compile::constrained_convI2L() calls could share the same ConvI2L node for the same input, it would be a more reliable fix for the problem.

Otherwise, the transformation may be extracted from GVN and turned into a separate pass (take a look at Compile::optimize_logic_cones as an example).

Some comments on the tests: (1) please, group the individual test cases into a single test class; and (2) I suggest to turn the benchmark into a test case which fails with timeout when fix is absent.

Thanks Vladimir for the thorough review! I will explore your suggestions to generalize the fix and see what can be done.

Loading

@robcasloz robcasloz marked this pull request as draft Oct 19, 2020
@openjdk openjdk bot removed the rfr label Oct 19, 2020
robcasloz added 2 commits Oct 21, 2020
…times by

other AddIs, which could also lead to an exponential number of calls to
ConvI2LNode::Ideal(). This is achieved by (1) reusing existing ConvI2Ls if
possible rather than eagerly creating new ones and (2) postponing the
optimization of newly created ConvI2Ls. Remove "hook" node solution introduced
in JDK-8217359 since this is subsumed by (2). Test that ConvI2LNode::Ideal() is
called within iterative GVN using phase->is_IterGVN() rather than can_reshape,
for clarity.

Merge all tests into a single class. Reimplement the microbenchmark as a test
case that should time out in case of a combinatorial explosion. Add a second
similar microbenchmark that demonstrates the need for this generalization.
@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 21, 2020

/summary
Prevent exponential number of calls to ConvI2LNode::Ideal() when AddIs are used
multiple times by other AddIs in the optimization ConvI2L(AddI(x, y)) ->
AddL(ConvI2L(x), ConvI2L(y)). This is achieved by (1) reusing existing ConvI2Ls
if possible rather than eagerly creating new ones and (2) postponing the
optimization of newly created ConvI2Ls. Remove hook node solution introduced in
8217359, since this is subsumed by (2). Use phase->is_IterGVN() rather than
can_reshape to check if ConvI2LNode::Ideal() is called within iterative GVN, for
clarity. Add regression tests that cover different shapes and sizes of AddI
subgraphs, implicitly checking (by not timing out) that there is no
combinatorial explosion.

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Oct 21, 2020

@robcasloz Updating existing summary to:

Prevent exponential number of calls to ConvI2LNode::Ideal() when AddIs are used
multiple times by other AddIs in the optimization ConvI2L(AddI(x, y)) ->
AddL(ConvI2L(x), ConvI2L(y)). This is achieved by (1) reusing existing ConvI2Ls
if possible rather than eagerly creating new ones and (2) postponing the
optimization of newly created ConvI2Ls. Remove hook node solution introduced in
8217359, since this is subsumed by (2). Use phase->is_IterGVN() rather than
can_reshape to check if ConvI2LNode::Ideal() is called within iterative GVN, for
clarity. Add regression tests that cover different shapes and sizes of AddI
subgraphs, implicitly checking (by not timing out) that there is no
combinatorial explosion.

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 21, 2020

Re-tested on tier1-3 and by running the new tests 10 times on all platforms (windows-x64, linux-x64, linux-aarch64, and macosx-x64) in both release and debug mode, to check that the timeout is sufficiently long.

Loading

@robcasloz robcasloz marked this pull request as ready for review Oct 21, 2020
@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Oct 21, 2020

/contributor add vlivanov

Loading

@openjdk openjdk bot added the rfr label Oct 21, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Oct 21, 2020

@robcasloz
Contributor Vladimir Ivanov <vlivanov@openjdk.org> successfully added.

Loading

Simplify JVM arguments and run each test case 100000 times to still trigger
C2. Use randomization to avoid constant propagation in C2. Increase the load of
the stress tests and their timeout to 30s to further reduce the risk of false
positives.
Copy link
Contributor

@vnkozlov vnkozlov left a comment

Good. Thanks!

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Oct 22, 2020

@robcasloz This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8254317: C2: Resource consumption of ConvI2LNode::Ideal() grows exponentially

Prevent exponential number of calls to ConvI2LNode::Ideal() when AddIs are used
multiple times by other AddIs in the optimization ConvI2L(AddI(x, y)) ->
AddL(ConvI2L(x), ConvI2L(y)). This is achieved by (1) reusing existing ConvI2Ls
if possible rather than eagerly creating new ones and (2) postponing the
optimization of newly created ConvI2Ls. Remove hook node solution introduced in
8217359, since this is subsumed by (2). Use phase->is_IterGVN() rather than
can_reshape to check if ConvI2LNode::Ideal() is called within iterative GVN, for
clarity. Add regression tests that cover different shapes and sizes of AddI
subgraphs, implicitly checking (by not timing out) that there is no
combinatorial explosion.

Co-authored-by: Vladimir Ivanov <vlivanov@openjdk.org>
Reviewed-by: vlivanov, kvn

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 25 new commits pushed to the master branch:

  • bd3e65b: 8256052: Remove unused allocation type from fieldInfo
  • 6d8acd2: 8256066: Tests use deprecated TestNG API that is no longer available in new versions
  • 643969a: 8255822: Zero: improve build-time JVMTI handling
  • 6ae5e5b: 8221404: C2: Convert RegMask and IndexSet to use uintptr_t
  • 6555996: 8253600: G1: Fully support pinned regions for full gc
  • 97d6e4a: 8256046: Shenandoah: Mix-in NULL_PTR in non-strong ShLRBNode's type
  • a1d4b9f: 8256009: Remove src/hotspot/share/adlc/Test/i486.ad
  • 3455fa9: 8256050: JVM crashes with -XX:+PrintDeoptimizationDetails
  • e6df13e: 8256054: C2: Floating-point min/max operations on vectors intermittently produce wrong results for NaN values
  • 52805f5: 8256048: Incomplete gitignore setting for netbeans project
  • ... and 15 more: https://git.openjdk.java.net/jdk/compare/c7551c37c7e9be7112371c351a4cc0d0d817cb46...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@iwanowww, @vnkozlov) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

Loading

@openjdk openjdk bot added the ready label Oct 22, 2020
@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Nov 10, 2020

I just simplified the PR by using IGVN's hash table mechanism rather than custom logic to find existing ConvI2L nodes (thanks to Vladimir Ivanov for the suggestion). The new version is tested on tier1-3. Please review.

Loading

Copy link

@iwanowww iwanowww left a comment

Looks good.

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Nov 10, 2020

Looks good.

Thanks for reviewing Vladimir!

Loading

Copy link
Contributor

@vnkozlov vnkozlov left a comment

Good

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Nov 10, 2020

Good

Thanks Vladimir!

Loading

@robcasloz
Copy link
Contributor Author

@robcasloz robcasloz commented Nov 10, 2020

/integrate

Loading

@openjdk openjdk bot added the sponsor label Nov 10, 2020
@openjdk
Copy link

@openjdk openjdk bot commented Nov 10, 2020

@robcasloz
Your change (at version 6342b0e) is now ready to be sponsored by a Committer.

Loading

@TobiHartmann
Copy link
Member

@TobiHartmann TobiHartmann commented Nov 11, 2020

/sponsor

Loading

@openjdk
Copy link

@openjdk openjdk bot commented Nov 11, 2020

@TobiHartmann @robcasloz Since your change was applied there have been 38 commits pushed to the master branch:

  • 79ac041: 8256025: AArch64: MachCallRuntimeNode::ret_addr_offset() is incorrect for stub calls
  • 6e8b862: 8255559: Leak File Descriptors Because of ResolverLocalFilesystem#engineResolveURI()
  • 129ff97: 8231599: NPE when loading a preview classfile from a future Java version
  • 5181f9c: 7190978: javax/swing/JComponent/7154030/bug7154030.java fails on mac
  • 35284e4: 8255916: [macos] javax/swing/JInternalFrame/6647340/bug6647340.java timed out
  • 8638cd9: 8255625: AArch64: Implement Base64.encodeBlock accelerator/intrinsic
  • 5de99da: 8237495: Java MIDI fails with a dereferenced memory error when asked to send a raw 0xF7
  • be63525: 8211999: Window positioning bugs due to overlapping GraphicsDevice bounds (Windows/HiDPI)
  • 0a41ca6: 8254354: Add a withInvokeExactBehavior() VarHandle combinator
  • d6f1463: 8233332: Need to create exploded tests covering all forms of modules
  • ... and 28 more: https://git.openjdk.java.net/jdk/compare/c7551c37c7e9be7112371c351a4cc0d0d817cb46...master

Your commit was automatically rebased without conflicts.

Pushed as commit 432c387.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Loading

openjdk-notifier bot referenced this issue Nov 11, 2020
…entially

Prevent exponential number of calls to ConvI2LNode::Ideal() when AddIs are used
multiple times by other AddIs in the optimization ConvI2L(AddI(x, y)) ->
AddL(ConvI2L(x), ConvI2L(y)). This is achieved by (1) reusing existing ConvI2Ls
if possible rather than eagerly creating new ones and (2) postponing the
optimization of newly created ConvI2Ls. Remove hook node solution introduced in
8217359, since this is subsumed by (2). Use phase->is_IterGVN() rather than
can_reshape to check if ConvI2LNode::Ideal() is called within iterative GVN, for
clarity. Add regression tests that cover different shapes and sizes of AddI
subgraphs, implicitly checking (by not timing out) that there is no
combinatorial explosion.

Co-authored-by: Vladimir Ivanov <vlivanov@openjdk.org>
Reviewed-by: vlivanov, kvn
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
5 participants