8319220: Pattern matching switch with a lot of cases is unduly slow #16489

Closed
wants to merge 8 commits into from

Conversation

lahodaj
Contributor

@lahodaj lahodaj commented Nov 3, 2023

Consider code like:

void test(Object o) {
    switch (o) {
        case X1 -> {}
        case X2 -> {}
...(about 100 cases)

javac will compile the switch into a switch whose selector is an indy invocation to SwitchBootstraps.typeSwitch, with static arguments being the types in the cases.
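To make the shape of that call concrete, here is a minimal sketch that invokes the bootstrap method by hand rather than through indy. It assumes JDK 21 or later, where `java.lang.runtime.SwitchBootstraps` is a final API. The class `TypeSwitchByHand` and the helper `caseIndex` are hypothetical names made up for this illustration; javac does not emit code like this.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.runtime.SwitchBootstraps;

// Hypothetical demo class, not part of the patch.
class TypeSwitchByHand {
    // Returns the index of the first label that matches o, starting the search at label 0.
    static int caseIndex(Object o, Object... labels) {
        try {
            // The call site takes (selector, restartIndex) and returns the case index.
            MethodType type = MethodType.methodType(int.class, Object.class, int.class);
            CallSite site = SwitchBootstraps.typeSwitch(
                    MethodHandles.lookup(), "typeSwitch", type, labels);
            return (int) site.getTarget().invoke(o, 0);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(caseIndex("text", String.class, Integer.class));
        System.out.println(caseIndex(42, String.class, Integer.class));
    }
}
```

Per the `typeSwitch` documentation, a selector that matches no label yields `labels.length`, and a `null` selector yields -1.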

SwitchBootstraps.typeSwitch will then create a chain of MethodHandles performing instanceof checks between the switch's selector and the given case types. The problem is that when the number of cases is high enough (more than ~40-50), the chain gets too long and the tests no longer inline. This leads to very bad performance compared to a manually written if-instanceof-else-if-instanceof chain.
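As a rough illustration of what such a chain looks like, the sketch below builds one `guardWithTest` node per case type. This is not the JDK's actual implementation, only an assumed simplification of the idea; `InstanceOfChain`, `build` and `indexOf` are hypothetical names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical sketch of a MethodHandle chain of instanceof tests.
class InstanceOfChain {
    // Builds an (Object)int handle returning the index of the first matching type,
    // or types.length when nothing matches.
    static MethodHandle build(Class<?>... types) throws ReflectiveOperationException {
        MethodHandle isInstance = MethodHandles.lookup().findVirtual(
                Class.class, "isInstance", MethodType.methodType(boolean.class, Object.class));
        MethodHandle chain = constantIndex(types.length); // final fallback
        // Build back to front so that the first type ends up as the outermost test.
        for (int i = types.length - 1; i >= 0; i--) {
            MethodHandle test = isInstance.bindTo(types[i]); // (Object)boolean
            chain = MethodHandles.guardWithTest(test, constantIndex(i), chain);
        }
        return chain;
    }

    // An (Object)int handle that ignores its argument and returns a fixed index.
    private static MethodHandle constantIndex(int index) {
        return MethodHandles.dropArguments(
                MethodHandles.constant(int.class, index), 0, Object.class);
    }

    static int indexOf(Object o, Class<?>... types) {
        try {
            return (int) build(types).invoke(o);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Each additional case nests the previous handle one level deeper, which is why the depth of the chain grows linearly with the number of cases.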

The proposal herein is to use bytecode (written using the ClassFile API/library) instead of the MethodHandles chain. The overall performance of this seems to be similar to that of the manually written if-instanceof-else-if-instanceof chain.

Using the benchmark from the bug, and this patch, I am getting:

MyBenchmark.testIfElse100  thrpt    5  521826.326 ± 7510.042  ops/s
MyBenchmark.testSwitch100  thrpt    5  505440.170 ± 3757.178  ops/s
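For readers without the bug report at hand, the two shapes being compared might look like the following, reduced to three cases. This is an assumed reconstruction; the real benchmark is the one attached to the bug, and `IfElseVsSwitch`, `viaIfElse` and `viaSwitch` are hypothetical names. It requires JDK 21 or later.

```java
// Hypothetical, three-case version of the two shapes under comparison.
class IfElseVsSwitch {
    // Manually written if-instanceof-else-if-instanceof chain.
    static int viaIfElse(Object o) {
        if (o instanceof String) {
            return 0;
        } else if (o instanceof Integer) {
            return 1;
        } else if (o instanceof Long) {
            return 2;
        }
        return -1;
    }

    // Pattern matching switch; javac compiles the selector through SwitchBootstraps.typeSwitch.
    static int viaSwitch(Object o) {
        return switch (o) {
            case String s -> 0;
            case Integer i -> 1;
            case Long l -> 2;
            default -> -1;
        };
    }
}
```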

The trickiest part of this new way of generating the tests is the handling of non-type case labels, in particular cases with enum constant labels. The resolution of enum constants is deferred as much as possible, by using an indirection through the ResolvedEnumLabels.
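A source-level example of a switch that mixes enum constant labels with a type label is sketched below, assuming JDK 21 or later. The types `Card`, `Suit` and `Joker` are hypothetical. As far as the `typeSwitch` documentation describes, the enum constants of such a switch reach the bootstrap method as `EnumDesc` labels rather than as resolved enum values, which is why their resolution can be deferred.

```java
// Hypothetical example of a switch with both enum constant labels and a type label.
class MixedLabels {
    sealed interface Card permits Suit, Joker {}
    enum Suit implements Card { HEARTS, SPADES }
    record Joker() implements Card {}

    static String describe(Card c) {
        return switch (c) {
            case Suit.HEARTS -> "hearts";   // qualified enum constant label
            case Suit.SPADES -> "spades";   // qualified enum constant label
            case Joker j -> "joker";        // type label
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Suit.HEARTS));
        System.out.println(describe(new Joker()));
    }
}
```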

Further improvements may be possible, esp. for some specific cases (like all cases having a type, and the type being a final class).


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8319220: Pattern matching switch with a lot of cases is unduly slow (Bug - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/16489/head:pull/16489
$ git checkout pull/16489

Update a local copy of the PR:
$ git checkout pull/16489
$ git pull https://git.openjdk.org/jdk.git pull/16489/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 16489

View PR using the GUI difftool:
$ git pr show -t 16489

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/16489.diff

Webrev

Link to Webrev Comment

if (labels.length == 0) {
cb.constantInstruction(0)
.ireturn();
}
Member

Isn't this generating dead code for labels.length == 0?

Contributor Author

Right, it does (did). Fixed. Thanks!

cases.add(new Element(target, next, currentLabel));
switchCases.add(SwitchCase.of(idx, target));
}
cases = cases.reversed();
Member

switchCases for tableswitch do not need to be pre-ordered; the code builder does the processing.

Contributor Author

I guess I would prefer to keep the reverse here, to reduce the cognitive load, as cases must be processed in original order (and hence reversed here), and it may not be clear why reverse one, and not the other. (Given the List is an ArrayList, the reverse operation should not have much impact on anything, if I read how it is implemented correctly.)
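The remark about the cost of reversing can be checked with a small sketch, assuming JDK 21 or later, where `List.reversed()` was introduced. For an `ArrayList` the call returns a reverse-ordered view rather than a copy, so writes through the view land in the original list. `ReversedViewDemo` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: reversed() on an ArrayList is a view, not a copy.
class ReversedViewDemo {
    static List<String> writeThrough() {
        List<String> cases = new ArrayList<>(List.of("first", "second", "third"));
        List<String> reversed = cases.reversed(); // no elements are copied here
        reversed.set(0, "LAST");                  // index 0 of the view is the last element of cases
        return cases;
    }

    public static void main(String[] args) {
        System.out.println(writeThrough());
    }
}
```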

Member

@asotona asotona left a comment

Looks good.

bridgekeeper bot commented Nov 3, 2023

👋 Welcome back jlahoda! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Nov 3, 2023

@lahodaj This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8319220: Pattern matching switch with a lot of cases is unduly slow

Reviewed-by: asotona, vromero

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 63 new commits pushed to the master branch:

  • 115b074: 8319944: Remove DynamicDumpSharedSpaces
  • c0507af: 8319818: Address GCC 13.2.0 warnings (stringop-overflow and dangling-pointer)
  • 3684b4b: 8306116: Update CLDR to Version 44.0
  • 88ccd64: 8296250: Update ICU4J to Version 74.1
  • 03db828: 8319650: Improve heap dump performance with class metadata caching
  • b41b00a: 8319820: Use unnamed variables in the FFM implementation
  • 4d650fe: 8319704: LogTagSet::set_output_level() should not accept NULL as LogOutput
  • 6f863b2: 8318636: Add jcmd to print annotated process memory map
  • e035637: 8319375: test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/RedefineLeakThrowable.java runs into OutOfMemoryError: Metaspace on AIX
  • 50f41d6: 8309893: Integrate ReplicateB/S/I/L/F/D nodes to Replicate node
  • ... and 53 more: https://git.openjdk.org/jdk/compare/bfafb27e273819fb51639daa993979408dfb0c54...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 3, 2023

openjdk bot commented Nov 3, 2023

@lahodaj The following label will be automatically applied to this pull request:

  • core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the core-libs core-libs-dev@openjdk.org label Nov 3, 2023

mlbridge bot commented Nov 3, 2023

Webrevs

}
cb.iload(1);
Label dflt = cb.newLabel();
record Element(Label target, Label next, Object label) {}
Member

'label' is not a Label. Is there a better name that distinguishes the switch label from the bytecode label?

Contributor Author

Changed to caseLabel.

Element element = cases.get(idx);
Label next = element.next();
cb.labelBinding(element.target());
if (element.label() instanceof Class<?> classLabel) {
Member

It's too bad we can not use a switch on the label here instead of a bunch of instanceof :)

Contributor Author

Well, we could, if we tweaked javac so that it would produce different/simplified code for java.base for pattern switches (i.e. hardcode an if-else cascade). Which we may or may not do at some point.
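For comparison, this is roughly what dispatching on the label with a pattern switch would look like in ordinary application code, assuming JDK 21 or later. It is only an illustration of the reviewer's remark: inside java.base such a switch would itself be compiled against SwitchBootstraps, which is the circularity discussed above. `LabelKinds` and `kind` are hypothetical names, and the four label kinds are the ones the `typeSwitch` documentation lists.

```java
import java.lang.Enum.EnumDesc;

// Hypothetical illustration: a pattern switch over the kinds of case labels.
class LabelKinds {
    static String kind(Object label) {
        return switch (label) {
            case Class<?> c -> "type " + c.getSimpleName();
            case EnumDesc<?> e -> "enum constant " + e.constantName();
            case Integer i -> "int constant " + i;
            case String s -> "string constant " + s;
            default -> throw new IllegalArgumentException("unsupported label: " + label);
        };
    }
}
```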

Member

forax commented Nov 3, 2023

Here is a test that uses a hidden class that works with the current implementation.
If I'm not mistaken, the proposed implementation fails that test.

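import static java.lang.invoke.MethodType.methodType;

import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodHandles.Lookup.ClassOption;
import java.lang.runtime.SwitchBootstraps;

public class SwitchBootstrapTest {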
  public static void main(String[] args) throws Throwable {
    var className = SwitchBootstrapTest.class.getName();
    byte[] data;
    try(var input = SwitchBootstrapTest.class.getResourceAsStream('/' + className.replace('.', '/') + ".class")) {
      data = input.readAllBytes();
    }

    var lookup = MethodHandles.lookup().defineHiddenClass(data, true, ClassOption.NESTMATE, ClassOption.STRONG);
    var hiddenClass = lookup.lookupClass();
    var constructor = lookup.findConstructor(hiddenClass, methodType(void.class));
    var instance = constructor.invoke();

    var methodType = methodType(int.class, Object.class, int.class);
    var callSite = SwitchBootstraps.typeSwitch(lookup, "", methodType, hiddenClass, Object.class);
    var index = (int) callSite.getTarget().invokeExact(instance, 0);
    System.out.println("index " + index);
  }
}

Contributor Author

lahodaj commented Nov 3, 2023

Thanks for all the comments so far - I think I've either reflected them, or written a comment on each of them. Please let me know if there's something else, or if I've forgotten something.

@@ -375,125 +379,128 @@ private static final class EnumMap {
@SuppressWarnings("removal")
private static MethodHandle generateInnerClass(MethodHandles.Lookup caller, Object[] labels) {
List<EnumDesc<?>> enumDescs = new ArrayList<>();
List<Class<?>> extraClassLabels = new ArrayList<>();

byte[] classBytes = Classfile.of().build(ClassDesc.of(typeSwitchClassName(caller.lookupClass())), clb -> {
clb.withFlags(AccessFlag.FINAL, AccessFlag.SYNTHETIC)
Member

AccessFlag.SUPER is missing; this will make this class a value class in the Valhalla repo.
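To see what the missing flag changes at the class-file level, the sketch below combines the flag masks using `java.lang.reflect.AccessFlag` (available since JDK 20). It only shows the arithmetic of the `access_flags` value; how Valhalla interprets a class without ACC_SUPER is the reviewer's statement and is not verified here. `ClassFlagsDemo` is a hypothetical name.

```java
import java.lang.reflect.AccessFlag;

// Hypothetical demo: OR together the masks of a set of class access flags.
class ClassFlagsDemo {
    static int mask(AccessFlag... flags) {
        int mask = 0;
        for (AccessFlag flag : flags) {
            mask |= flag.mask();
        }
        return mask;
    }

    public static void main(String[] args) {
        // Flags as in the snippet under review, then with SUPER added.
        System.out.printf("0x%04x%n", mask(AccessFlag.FINAL, AccessFlag.SYNTHETIC));
        System.out.printf("0x%04x%n", mask(AccessFlag.FINAL, AccessFlag.SUPER, AccessFlag.SYNTHETIC));
    }
}
```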

Member

forax commented Nov 3, 2023

> Thanks for all the comments so far - I think I've either reflected them, or wrote a comment to each of them. Please let me know if there's something else, or if I've forgotten something.

Your idea to use an extra array is clever. Using an immutable List instead of an array should allow the VM to constant fold the Class.isInstance (see above).
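A minimal sketch of that suggestion, under the assumption that the extra class labels are collected in a mutable list first and then frozen: `List.copyOf` produces an unmodifiable list whose contents cannot change afterwards. Whether the JIT actually constant-folds the `isInstance` call in this shape is the reviewer's expectation and is not demonstrated here. `ExtraLabels` and `matches` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for class labels that cannot be referenced directly from generated code.
class ExtraLabels {
    private final List<Class<?>> labels;

    ExtraLabels(List<Class<?>> collected) {
        this.labels = List.copyOf(collected); // unmodifiable snapshot of the labels
    }

    // Tests the selector against the label stored at the given index.
    boolean matches(int index, Object selector) {
        return labels.get(index).isInstance(selector);
    }

    public static void main(String[] args) {
        List<Class<?>> collected = new ArrayList<>();
        collected.add(String.class);
        collected.add(Integer.class);
        ExtraLabels extra = new ExtraLabels(collected);
        System.out.println(extra.matches(0, "text"));
        System.out.println(extra.matches(1, "text"));
    }
}
```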

@@ -71,7 +72,7 @@ private SwitchBootstraps() {}
private static final MethodHandle MAPPED_ENUM_LOOKUP;

private static final MethodTypeDesc typesSwitchDescriptor =
Member

Given that it's a static final, the name should be in uppercase: TYPES_SWITCH_DESCRIPTOR.

Member

forax commented Nov 3, 2023

Looks good to me :)

Contributor

@vicente-romero-oracle vicente-romero-oracle left a comment

lgtm, very interesting!

Contributor Author

lahodaj commented Nov 24, 2023

/integrate

openjdk bot commented Nov 24, 2023

Going to push as commit 0c9a61c.
Since your change was applied there have been 222 commits pushed to the master branch:

  • 26c3390: 8320383: refresh libraries cache on AIX in VMError::report
  • fc31474: 8318913: The module-infos for --release data do not contain pre-set versions
  • df1b896: 8320679: [JVMCI] invalid code in PushLocalFrame event message
  • c75c388: 8318776: Require supports_cx8 to always be true
  • 14557e7: 8314502: Change the comparator taking version of GrowableArray::find to be a template method
  • 2802643: 8314243: Make VM_Exit::wait_for_threads_in_native_to_block wait for user threads time configurable
  • 6f26311: 8318490: Increase timeout for JDK tests that are close to the limit when run with libgraal
  • cb95e39: 8224261: JProgressBar always with border painted around it
  • 6d79e0a: 8318159: RISC-V: Improve itable_stub
  • 06f040b: 8320645: DocLint should use javax.lang.model to detect default constructors
  • ... and 212 more: https://git.openjdk.org/jdk/compare/bfafb27e273819fb51639daa993979408dfb0c54...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Nov 24, 2023
@openjdk openjdk bot closed this Nov 24, 2023
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Nov 24, 2023

openjdk bot commented Nov 24, 2023

@lahodaj Pushed as commit 0c9a61c.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Labels
core-libs core-libs-dev@openjdk.org integrated Pull request has been integrated
5 participants