Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matching strings makes switches #8451

Merged
merged 1 commit into from Nov 19, 2019

Conversation

@hrhino
Copy link
Member

hrhino commented Oct 1, 2019

Switchable matches with string-typed scrutinee survive the pattern matcher in the same way as those on integer types do: as a series of CaseDefs with empty guard and literal pattern.

Cleanup collates them by hash code and emits a switch on that. No sooner, so scala.js can emit a more JS-friendly implementation.

Labels were used to avoid a proliferation of throw new MatchError.

Works with nulls. Works with Unit.

Enclosed "pos" test stands for positions, not positivity.

Fixes scala/bug#11740.

@scala-jenkins scala-jenkins added this to the 2.13.2 milestone Oct 1, 2019
@hrhino hrhino added the WIP label Oct 1, 2019
@hrhino

This comment has been minimized.

Copy link
Member Author

hrhino commented Oct 1, 2019

Darn, I missed the chance to use the shiny "WIP" feature.

@sjrd: the positions test is implicitly a scala.js-compatibility test. The trees coming out of patmat are hopefully not too difficult for the SJS backend to handle. Maybe this is a good time for me to get started over there.

@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Oct 1, 2019

Cool :) Thanks for the separation in cleanup.

Maybe this is a good time for me to get started over there.

Sure, why not? :) If you do, you'll want to work in the 0.6.x branch. We forward-merge 0.6.x into master every now and then.

@hrhino hrhino force-pushed the hrhino:t11740 branch from 2b3c130 to ac0295f Oct 1, 2019
@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 4, 2019

The tests fail when the compiler has been bootstrapped through this change. A pattern match in ScalaVersion$.<init> now throws a MatchError.

Minimized:

import annotation.switch

object Test {
  def test(s: String): Int = {
    (s : @switch) match {
      case null | "" => 0
      case _ => s.toInt
    }
  }

  def main(args: Array[String]): Unit = {
    println(test("2"))
  }
}
% qscalac sandbox/test.scala && qscala Test
scala.MatchError: 2 (of class java.lang.String)
	at Test$.test(test.scala:5)
      case <synthetic> val x1: String = (s: String);
      {
        matchEnd1(if (x1.eq(null))
          0
        else
          x1.hashCode() match {
          case 0 => if ("".equals(x1))
            0
          else
            matchEnd2()
          case 3392903 => if ("null".equals(x1))
            0
          else
            matchEnd2()
          case _ => matchEnd2()
        });
        matchEnd2(){
          throw new MatchError(x1)
        };
        matchEnd1(x$1: Int){
          x$1
        }
      }
    };
@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 4, 2019

There appear to be two separate problems.

First, if we want to make this work for null patterns, we need special care in Cleanup to avoid the fact that Constant(null).stringValue == "null" (rather than == null). We also need to avoid a NPE by using null eq value rather than null.equals(value).

The MatchError above does not rely on null, though, as

import annotation.switch

object Test {
  def test(s: String): Int = {
    (s : @switch) match {
      case "" => 0
      case _ => s.toInt
    }
  }

  def main(args: Array[String]): Unit = {
    println(test("2"))
  }
}

Also fails.

@retronym retronym force-pushed the hrhino:t11740 branch from ac0295f to 0fc5566 Oct 4, 2019
@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 4, 2019

I push-f-ed fixes for these problems.

@retronym retronym force-pushed the hrhino:t11740 branch from 0fc5566 to de96d7d Oct 4, 2019
@hrhino

This comment has been minimized.

Copy link
Member Author

hrhino commented Oct 4, 2019

Thanks! I'm also not correct in my handling of Alternative... Right now it copies the body of the case for each pattern. I'll push a fix for that.

@hrhino hrhino force-pushed the hrhino:t11740 branch from de96d7d to 110e5d2 Oct 4, 2019
*/

val stats = mutable.ListBuffer.empty[Tree]
var failureBody = Throw(New(definitions.MatchErrorClass.tpe_*, REF(sel))) : Tree

This comment has been minimized.

Copy link
@hrhino

hrhino Oct 4, 2019

Author Member

I can't remember why I thought I couldn't use the synth default case from patmat. This should never be used (the Ident(WILDCARD, _, body) case should always be hit but it seems safer.

case cd@CaseDef(StringsPattern(strs), _, body) =>
val jump = currentOwner.newLabel(unit.freshTermName("case"), swPos).setInfo(MethodType(Nil, restpe))
stats += LabelDef(jump, Nil, succeed(body))
strs.map((_, jump, cd.pat.pos))

This comment has been minimized.

Copy link
@hrhino

hrhino Oct 4, 2019

Author Member

I used to copy the body of a case if there were alternatives, so

x match { case "a" | "b" => body }

would introduce tree sharing.

@@ -498,6 +595,9 @@ abstract class CleanUp extends Statics with Transform with ast.TreeDSL {
super.transform(localTyper.typedPos(tree.pos)(consed))
}

case switch: Match =>
super.transform(transformSwitch(switch))

This comment has been minimized.

Copy link
@hrhino

hrhino Oct 4, 2019

Author Member

Thanks... I can't believe I forgot this

@hrhino hrhino requested a review from retronym Oct 4, 2019
@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 7, 2019

Probably worth a test to show that the scrutinee is only evaluated once. Cleanup.transformSwitch relies on the fact that patmat stabilizes the scrutinee into temp val.

I've manually tested with the following.

object Test {

  def test(s: String, expected: Int): Unit = {
    val it = Iterator(s)
    val r = (it.next(): @annotation.switch) match {
        // collision
        case "AaAa" => 1
        case "BBBB" => 2
      
        // collision with "FB"
        case "Ea"   => 3

        case _      => 4
     }
     assert(expected == r, (expected, r))
  }

  def main(args: Array[String]): Unit = {
    test("AaAa", 1)
    test("BBBB", 2)
    test("Ea", 3)
    test("FB", 4)
    test("XXX", 4)
    test(null, 4)
  }
}
@hrhino hrhino force-pushed the hrhino:t11740 branch from 110e5d2 to da0181a Oct 7, 2019
@hrhino hrhino removed the WIP label Oct 7, 2019
/* From this:
* string match { case "AaAa" => 1 case "BBBB" | "c" => 2 case _ => 3}
* Generate this:
* goto matchSuccess (string.## match {

This comment has been minimized.

Copy link
@retronym

retronym Oct 8, 2019

Member

Why "goto matchSuccess" here? All cases of the match branch so it seems like a no-op to me.

This comment has been minimized.

Copy link
@hrhino

hrhino Oct 8, 2019

Author Member

Nice catch. I had it before I realized the problem with Alternatives

@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 8, 2019

Comparing javac and scalac:

retronym/string-switch@c2820bb

class Test {
    int test(String s) {
        return switch(s) {
            case "AaAa" -> 1;
            case "BBBB" -> 2;
            case "Ea"   -> 3;
            case "xx"   -> 4;
            default     -> 5;
        };
    }
}
class Test {

  def test(s: String, expected: Int) = {
     s match {
        case "AaAa" => 1
        case "BBBB" => 2
        case "Ea"   => 3
        case "xx"   => 4
        case _      => 5
     }
  }
}
@hrhino hrhino force-pushed the hrhino:t11740 branch from da0181a to 47691e1 Oct 8, 2019
@hrhino

This comment has been minimized.

Copy link
Member Author

hrhino commented Oct 8, 2019

That's interesting. Seems like Java emits two switch instructions: one to get a number representing what code block to run, and one to run it. I'm not sure I see a benefit there compared to this.

@hrhino

This comment has been minimized.

Copy link
Member Author

hrhino commented Oct 9, 2019

I'm honored as always, @SethTisue

@hrhino hrhino force-pushed the hrhino:t11740 branch from 47691e1 to 6146128 Oct 10, 2019
@martijnhoekstra

This comment has been minimized.

Copy link
Contributor

martijnhoekstra commented Oct 10, 2019

There is a comment in the javac source that they do it to keep fall through simpler, so that doesn't apply to match.

@hrhino

This comment has been minimized.

Copy link
Member Author

hrhino commented Oct 10, 2019

Nice find; thanks!

@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 11, 2019

Another advantage the double-lookup would be the opportunity to use an invokedynamic based implementation like is being added to JDK 14 (https://github.com/openjdk/amber/blob/pattern-runtime/src/java.base/share/classes/java/lang/runtime/PatternHandles.java). That's not something we're targetting right now as we obviously can't depend on JDK14, but something for later on.

@retronym

This comment has been minimized.

Copy link
Member

retronym commented Oct 11, 2019

I guess the cleanup phase is de-facto part of the JVM backend as ScalaJS doesn't run it, but it does seem a little strange (but not a blocker for the change) to desugar match-on-int and match-on-string in different phases.

@retronym retronym requested a review from lrytz Oct 11, 2019
@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Oct 11, 2019

it does seem a little strange (but not a blocker for the change) to desugar match-on-int and match-on-string in different phases.

The way I see it, match-on-string is desugared into match-on-int, but the latter is compiled as is to a JVM instruction. Match-on-int is never really desugared.

@lrytz
lrytz approved these changes Oct 11, 2019
Copy link
Member

lrytz left a comment

LGTM!

test/files/jvm/string-switch/Test.scala Outdated Show resolved Hide resolved
@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Oct 11, 2019

@hrhino Are you interested in trying to make the required changes in the Scala.js compiler backend, or should I do them?

@hrhino

This comment has been minimized.

Copy link
Member Author

hrhino commented Oct 11, 2019

@sjrd I'll give it a go.

Switchable matches with string-typed scrutinee survive the pattern
matcher in the same way as those on integer types do: as a series of
`CaseDef`s with empty guard and literal pattern.

Cleanup collates them by hash code and emits a switch on that.
No sooner, so scala.js can emit a more JS-friendly implementation.

Labels were used to avoid a proliferation of `throw new MatchError`.

Works with nulls. Works with Unit.

Enclosed "pos" test stands for positions, not positivity.

Fixes scala/bug#11740

Co-Authored-By: "Jason Zaugg" <jzaugg@gmail.com>
@hrhino hrhino force-pushed the hrhino:t11740 branch from 6146128 to 9f25ca0 Oct 11, 2019
@lrytz
lrytz approved these changes Oct 14, 2019
Copy link
Member

lrytz left a comment

Nice work! @sjrd should we wait with the merge until Scala.js is ready?

@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Oct 14, 2019

should we wait with the merge until Scala.js is ready?

Yes please, otherwise the community build will break. @hrhino told me he intended to do it, but he needs some approval.

@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Nov 11, 2019

So ... it turns out that in Scala.js 0.6.x, this actually just works out of the box 😆 The thing is that in 0.6.x, our IR supports any kind of literal in our Match nodes (equivalent to JVM switches), and our entire pipeline is ready to accept constant strings and nulls in there, and emit them as is in JS, which is valid JS. Even though none of it was ever exercised nor tested before, it just worked 😎

Now, ironically, Scala.js 1.x crashes and burns with this change, because in 1.x we restricted our Match nodes to support only Ints (like the JVM). So I'll have to work on that.

sjrd added a commit to sjrd/scala-js that referenced this pull request Nov 11, 2019
Since Scala 2.13.2, the pattern matcher will keep `Match` nodes
that match on `String`s and `null`s as is, to be desugared later
in `cleanup`. This was implemented upstream in
scala/scala#8451

We implement a more general translation that will accept any kind
of scalac `Literal`. If the scrutinee is an integer and all cases
are int literals, we emit a `js.Match` as before. Otherwise, we
emit an `if..else` chain with `===` comparisons.
@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Nov 11, 2019

@sjrd

This comment has been minimized.

Copy link
Member

sjrd commented Nov 13, 2019

OK we've got a PR for Scala.js with green CI and review, ready to be merged. So feel free to merge this PR in Scala, and we'll follow suit in Scala.js.

@sjrd
sjrd approved these changes Nov 13, 2019
@lrytz lrytz merged commit 48f73bc into scala:2.13.x Nov 19, 2019
3 checks passed
3 checks passed
cla @hrhino signed the Scala CLA. Thanks!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
validate-main [9944] SUCCESS. Took 36 min.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.