String pattern matching slower than Java String switches, doesn't generate a tableswitch #11740
Comments
I also got a smaller speedup (~5%) using the same technique pattern matching on a sealed trait of 20 case classes, using |
@lihaoyi-databricks it depends on the number of cases and on the pattern of input data fed to the switch. |
I tried this once in a different context and found it really difficult to benchmark, because String memoizes its hash code. Checking the same string in a loop is not a good benchmark: you have to re-create the string instances, and for a fair comparison you also need a baseline benchmark that only loops through and re-creates the string instances. EDIT: depending on the string, it's likely still faster. The context at the time was case class matching on class name, but Class also memoizes its name (which in turn memoizes its hash code), so there is a time/space tradeoff that is not obvious in the benchmark. |
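A rough sketch of the benchmarking setup described above, under stated assumptions: `fresh`, `classify`, and `timeNanos` are hypothetical names, and the case strings are purely illustrative. The strings are copied through a char array because `String`'s copy constructor may carry the cached hash over to the new instance. For real measurements a harness such as JMH would be a better fit than hand-rolled timing.

```scala
object FreshStringBench {
  // Hypothetical helper: copy via a char array so the new instance starts
  // without a cached hash code.
  def fresh(s: String): String = new String(s.toCharArray)

  // Stand-in for the match under test; the cases are illustrative only.
  def classify(s: String): Int = s match {
    case "alpha" => 1
    case "beta"  => 2
    case "gamma" => 3
    case _       => 0
  }

  // Runs `iterations` passes over `inputs`, re-creating every string each time.
  // With `doMatch = false` only the re-creation is timed, which gives the
  // baseline to subtract from the `doMatch = true` run.
  def timeNanos(inputs: Array[String], iterations: Int, doMatch: Boolean): Long = {
    var sink = 0
    val start = System.nanoTime()
    var i = 0
    while (i < iterations) {
      var j = 0
      while (j < inputs.length) {
        val s = fresh(inputs(j))
        sink += (if (doMatch) classify(s) else s.length)
        j += 1
      }
      i += 1
    }
    val elapsed = System.nanoTime() - start
    if (sink == Int.MinValue) println(sink) // keeps `sink` observable
    elapsed
  }
}
```

Comparing `timeNanos(inputs, n, doMatch = true)` against `timeNanos(inputs, n, doMatch = false)` separates the cost of the match from the cost of allocating the strings.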
For a potential contributor: the relevant code might be around

    // Constant folding sets the type of a constant tree to `ConstantType(Constant(folded))`
    // The tree itself can be a literal, an ident, a selection, ...
    object SwitchablePattern { def unapply(pat: Tree): Option[Tree] = pat.tpe match {
      case const: ConstantType if const.value.isIntRange =>
        Some(Literal(Constant(const.value.intValue))) // TODO: Java 7 allows strings in switches
      case _ => None
    }}

The TODO traces back to the virtual pattern matcher PR (scala/scala#202). |
I think I said somewhere, back in the day, that this optimization should be performed by the backend, because it is very platform-dependent. I'm pretty sure the strategy would yield worse performance on JS (it's just a guess, though). |
Right, that's a different issue. Emitting a lookup switch by detecting and re-writing branching string equality checks directly to bytecode is not easy. |
You do it in |
@sjrd I have confirmed that hashcode-based switching performs worse in Scala.js (Chrome 77.0.3865.90) than Scala-JVM (Java 1.8.0_131):
Surprisingly, raw string matching in JS performs best of all, even better than the hashcode-based matching on the JVM. I assume the JS optimizer recognizes this as a special form and applies bespoke optimizations to it. |
Thanks for doing the test! |
Switchable matches with string-typed scrutinee survive the pattern matcher in the same way as those on integer types do: as a series of `CaseDef`s with empty guard and literal pattern. Cleanup collates them by hash code and emits a switch on that. No sooner, so scala.js can emit a more JS-friendly implementation. Labels were used to avoid a proliferation of `throw new MatchError`. Works with nulls. Works with Unit. Enclosed "pos" test stands for positions, not positivity. Fixes scala/bug#11740
Switchable matches with string-typed scrutinee survive the pattern matcher in the same way as those on integer types do: as a series of `CaseDef`s with empty guard and literal pattern. Cleanup collates them by hash code and emits a switch on that. No sooner, so scala.js can emit a more JS-friendly implementation. Labels were used to avoid a proliferation of `throw new MatchError`. Works with nulls. Works with Unit. Enclosed "pos" test stands for positions, not positivity. Fixes scala/bug#11740 Co-Authored-By: "Jason Zaugg" <jzaugg@gmail.com>
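To illustrate what "collates them by hash code" could look like, here is a minimal sketch over plain strings rather than compiler trees. It is not the Cleanup phase's actual code: `CollateByHash` and `collate` are hypothetical names, and `null` labels are deliberately left out.

```scala
object CollateByHash {
  // Groups string case labels by `hashCode`, keeping source order inside each
  // bucket. Each bucket would become one arm of the integer switch, followed
  // by equality checks against the labels in that bucket.
  // Assumption: no label is null.
  def collate(literals: List[String]): List[(Int, List[String])] =
    literals.distinct
      .groupBy(_.hashCode)
      .toList
      .sortBy(_._1)
}
```

For the labels `"Aa"`, `"BB"`, and `"C"`, the first two share the hash code 2112 and land in the same bucket, so that arm needs two equality checks.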
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings.
The pattern matcher will now emit `Match` with `String` scrutinee as well as the existing `Int` scrutinee. The JVM backend handles this case by emitting bytecode that switches on the String's `hashCode` (this matches what Java does). The SJS already handles `String` matches. The approach is similar to scala/scala#8451 (see scala/bug#11740 too), except that instead of doing a transformation on the AST, we just emit the right bytecode straight away. This is desirable since it means that Scala.js (and any other backend) can choose their own optimised strategy for compiling a match on strings. Fixes scala#11923
The following Scala code:
Runs about 15-20% slower than the equivalent Java code:
The Scala code takes 10600 ± 500 ms, while the Java code takes 8600 ± 200 ms (on my 2015 MacBook Pro running Java 1.8.0_131 and Scala 2.13.0), a difference of 20% or so. This is because the Scala code generates a naive sequence of cascading if-else checks, while the Java code first generates a `tableswitch` instruction on the pre-computed hash codes of the various case string literals, and only after looking up the hash code does it perform an equality check for confirmation (more than one check in the case of a hash collision).

I don't see any reason why the Scala code should not run just as fast as the Java equivalent, and in any case it's surprising that roughly equivalent source code doesn't generate roughly equivalent bytecode.
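The hash-first strategy described above can be written out by hand as a sketch. This is a source-level approximation, not the bytecode either compiler emits: `HashSwitchSketch`, `direct`, and `hashFirst` are hypothetical names, and the three labels are chosen so that two of them collide (`"Aa"` and `"BB"` both hash to 2112).

```scala
object HashSwitchSketch {
  // The match as a user would write it.
  def direct(s: String): Int = s match {
    case "Aa" => 1
    case "BB" => 2
    case "C"  => 3
    case _    => 0
  }

  // Hand-written equivalent of the hash-first strategy: switch on the hash
  // code, then confirm with an equality check.
  def hashFirst(s: String): Int =
    if (s == null) 0 // a null scrutinee ends up in the default case
    else s.hashCode match {
      case 2112 => if (s == "Aa") 1 else if (s == "BB") 2 else 0 // collision: two checks
      case 67   => if (s == "C") 3 else 0
      case _    => 0
    }
}
```

The equality check after the hash lookup is what keeps the result correct for inputs that share a hash code with a label without being equal to it.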