Pattern matcher: extractors become name-based. #2848

Merged
merged 45 commits into from Aug 20, 2013

Conversation

Projects
None yet
4 participants
@paulp
Contributor

paulp commented Aug 17, 2013

An extractor is no longer required to return Option[T], and
can instead return anything which directly contains methods
with these signatures:

  def isEmpty: Boolean
  def get: T

If the type of get contains methods with the names of
product selectors (_1, _2, etc.) then the type and arity
of the extraction is inferred from the type of get. If
it does not contain _1, then it is a single value
extractor analogous like Option[T].

This has significant benefits and opens new territory:

  • an AnyVal based Option-like class can be used which
    leverages null as None, and no allocations are necessary
  • for primitive types the benefit is squared (see below)
  • the performance difference between case classes and
    extractors should now be largely eliminated
  • this in turn allows us to recapture great swaths of
    memory which are currently squandered (e.g. every
    TypeRef has fields for pre and args, even though these
    are more than half the time NoPrefix and Nil)

Here is a primitive example:

  final class OptInt(val x: Int) extends AnyVal {
    def get: Int = x
    def isEmpty = x == Int.MinValue // or whatever is appropriate
  }
  // This boxes TWICE: Int => Integer => Some(Integer)
  def unapply(x: Int): Option[Int]
  // This boxes NONCE
  def unapply(x: Int): OptInt

As a multi-value example, after I contribute some methods to TypeRef:

  def isEmpty = false
  def get     = this
  def _1      = pre
  def _2      = sym
  def _3      = args

Then its extractor becomes

  def unapply(x: TypeRef) = x

Which, it need hardly be said, involves no allocations.

paulp added some commits Aug 15, 2013

Make memberType less crashy.
It's a source of constant peril that sym.tpe on NoSymbol
is fine (it's NoType) but tpe memberType sym on NoSymbol throws
a NSDNHO. The last thing we should be doing is discouraging
people from using memberType in favor of sym.tpe, the latter
being almost always the wrong thing.
Some general purpose methods.
Motivated by pattern matcher work, also useful elsewhere.
Add a helper method drop to ScalaRunTime.
We should do a lot more of this - it's ridiculously difficult
and error prone to generate code of this kind involving implicits,
type inference, etc. where the same goal is trivially accomplished
by generating a method call and letting the typer work out the
details.
Expand the understanding of bytecode tests.
The new method is the same as sameMethodAndFieldSignatures, but ignores
generic signatures. This allows for testing methods which receive the same
descriptor but differing generic signatures. In particular, this happens
with value classes, which get a generic signature where a method written
in terms of the underlying values does not.
Positioned variations of inform/warning/globalError.
Because who doesn't want a little positioning in their life.
Minor improvement in pattern typer inference.
This exploits the infrastructure developed for checking the
checkability of type patterns to improve pattern type inference,
which suffered from a similar deficit. There was a hack for
SI-2486, the best I could manage at the time I wrote it; that
is replaced with the principled approach.
Add checkability condition.
All parents of an intersection type must be checkable for the
type to be checkable.
An Unapplied extractor.
This makes it a lot less error prone and redundant to find the
part you need when unwrapping an UnApply tree.
Remedied glaring omission in error output.
Catching an assert and providing beautifully formatted contextual
information is a questionable service if you forget to provide the
error message from the assert.
Broke up typed1's giant pattern match.
Another level of dispatch based on what trees on can expect
to see during what mode. This should be beneficial for both
performance (smaller methods, fewer type tests) and correctness
(prevent trees from reaching inappropriate typing methods by
construction rather than via ad hoc checks.)

This work also revealed that UnApply trees never reach here,
so I removed typedUnApply as dead code.
Pulled pattern typing methods from Typers.
To the extent possible this commit is purely the extraction
of those methods into the newly created PatternTypers trait.
The slicing and dicing of those methods will follow shortly.
Compressed central TreeMaker pattern match.
It's easier to follow the logic when it can all be seen at once.
I moved the specification excerpts down below for continued reference.
Turned TreeMaker into case class.
Type aliases are better than naked tuples, but only barely.
Move most of Typers#Typer#typedTyped into PatternTypers.
Again moving pattern-typing logic out of Typers.

You can tell I like writing Typers#Typer#typedTyped.
SI-4425 do some validity checking on unapplies.
Filter out unapplies which can't be called (such as those with
a second non-implicit parameter list) and report the error in a
meaningful fashion.
Expanded logic in formalTypes.
Made the super unusual move (for me) of making a method longer
instead of shorter, because I had quite a time modifying the logic
as it was. Ideally we wouldn't use this method because it is much
less efficient than necessary. If one is typing a call like this:

  def f(xs: Int*)
  f(p1, p2, p3, ..., p500)

Then it is not necessary to create a list with 500 IntTpes. Once
you run out of non-varargs types, every argument which remains can
be typed against the vararg type directly.
Add some logging to instantiateTypeVar.
This method is right at the center of ALL KINDS of bad.
"this is quite nasty" begins the comment, and I credit its
deadpan understatement. A little logging is the least we
can offer the next guy.
Cleanups in Typers.
Should be behaviorally neutral. A little more logging and a
little better style.
Simplify management of pattern vars.
Always set their info the same way, and when doing
so always translate repeated types into sequence types.
Stylistic cleanups in patmat.
Should be behaviorally neutral.
Introduced classes to encapsulate extractor info.
To be put to use in a commit shortly to follow.
Pattern matcher: extractors become name-based.
An extractor is no longer required to return Option[T], and
can instead return anything which directly contains methods
with these signatures:

  def isEmpty: Boolean
  def get: T

If the type of get contains methods with the names of
product selectors (_1, _2, etc.) then the type and arity
of the extraction is inferred from the type of get. If
it does not contain _1, then it is a single value
extractor analogous like Option[T].

This has significant benefits and opens new territory:

  - an AnyVal based Option-like class can be used which
    leverages null as None, and no allocations are necessary
  - for primitive types the benefit is squared (see below)
  - the performance difference between case classes and
    extractors should now be largely eliminated
  - this in turn allows us to recapture great swaths of
    memory which are currently squandered (e.g. every
    TypeRef has fields for pre and args, even though these
    are more than half the time NoPrefix and Nil)

Here is a primitive example:

  final class OptInt(val x: Int) extends AnyVal {
    def get: Int = x
    def isEmpty = x == Int.MinValue // or whatever is appropriate
  }
  // This boxes TWICE: Int => Integer => Some(Integer)
  def unapply(x: Int): Option[Int]
  // This boxes NONCE
  def unapply(x: Int): OptInt

As a multi-value example, after I contribute some methods to TypeRef:

  def isEmpty = false
  def get     = this
  def _1      = pre
  def _2      = sym
  def _3      = args

Then it's extractor becomes

  def unapply(x: TypeRef) = x

Which, it need hardly be said, involves no allocations.
@paulp

This comment has been minimized.

Show comment
Hide comment
@paulp

paulp Aug 17, 2013

Contributor

I didn't want to mess with #2835 because it has those green check marks. This commit ends up at almost exactly the same place, but with much more granularity in the key commit.

was: 14 files changed, 1011 insertions(+), 806 deletions(-)
now: 9 files changed, 333 insertions(+), 419 deletions(-)

The only change in the end result is a bugfix which I found trying to compile scalariform involving nested bindings like b1 @ (b2: Foo). I couldn't reproduce the overloading issue seen in jenkins, but maybe somehow it will turn out to be related to that fix.

Contributor

paulp commented Aug 17, 2013

I didn't want to mess with #2835 because it has those green check marks. This commit ends up at almost exactly the same place, but with much more granularity in the key commit.

was: 14 files changed, 1011 insertions(+), 806 deletions(-)
now: 9 files changed, 333 insertions(+), 419 deletions(-)

The only change in the end result is a bugfix which I found trying to compile scalariform involving nested bindings like b1 @ (b2: Foo). I couldn't reproduce the overloading issue seen in jenkins, but maybe somehow it will turn out to be related to that fix.

paulp and others added some commits Aug 17, 2013

Introduced case class BoundTree.
The first step in improving the handling of (Symbol, Tree).
SI-5903 extractor macros
Establishes a pattern that can be used to implement extractor macros
that give the programmer control over signatures of unapplications
at compile-time.

=== The pattern ===

In a nutshell, given an unapply method (for simplicity, in this
example the scrutinee is of a concrete type, but it's also possible
to have the extractor be polymorphic, as demonstrated in the tests):

```
def unapply(x: SomeType) = ???
```

One can write a macro that generates extraction signatures for unapply
on per-call basis, using the target of the calls (c.prefix) and the type
of the scrutinee (that comes with x), and then communicate these signatures
to the typechecker.

For example, here's how one can define a macro that simply passes
the scrutinee back to the pattern match (for information on how to
express signatures that involve multiple extractees, visit
#2848).

```
def unapply(x: SomeType) = macro impl
def impl(c: Context)(x: c.Tree) = {
  q"""
    new {
      class Match(x: SomeType) {
        def isEmpty = false
        def get = x
      }
      def unapply(x: SomeType) = new Match(x)
    }.unapply($x)
  """
}
```

In addition to the matcher, which implements domain-specific
matching logic, there's quite a bit of boilerplate here, but
every part of it looks necessary to arrange a non-frustrating dialogue
with the typer. Maybe something better can be done in this department,
but I can't see how, without introducing modifications to the typechecker.

Even though the pattern uses structural types, somehow no reflective calls
are being generated (as verified by -Xlog-reflective-calls and then
by manual examination of the produced code). That's a mystery to me, but
that's also good news, since that means that extractor macros aren't
going to induce performance penalties.

Almost. Unfortunately, I couldn't turn matchers into value classes
because one can't declare value classes local. Nevertheless,
I'm leaving a canary in place (neg/t5903e) that will let us know
once this restriction is lifted.

=== Use cases ===

In particular, the pattern can be used to implement shapeshifting
pattern matchers for string interpolators without resorting to dirty
tricks. For example, quasiquote unapplications can be unhardcoded now:

```
def doTypedApply(tree: Tree, fun0: Tree, args: List[Tree], ...) = {
  ...
  fun.tpe match {
    case ExtractorType(unapply) if mode.inPatternMode =>
      // this hardcode in Typers.scala is no longer necessary
      if (unapply == QuasiquoteClass_api_unapply) macroExpandUnapply(...)
      else doTypedUnapply(tree, fun0, fun, args, mode, pt)
  }
}
```

Rough implementation strategy here would involve writing an extractor
macro that destructures c.prefix, analyzes parts of StringContext and
then generates an appropriate matcher as outlined above.

=== Implementation details ===

No modifications to core logic of typer or patmat are necessary,
as we're just piggybacking on #2848.

The only minor change I introduced is a guard against misbehaving
extractor macros that don't conform to the pattern (e.g. expand into
blocks or whatever else). Without the guard we'd crash with an NPE,
with the guard we get a sane compilation error.
@paulp

This comment has been minimized.

Show comment
Hide comment
@paulp

paulp Aug 18, 2013

Contributor

For ease of reference:

[INFO] org.scala-ide.sdt.build ........................... SUCCESS [0.523s]
[INFO] org.scala-ide.sdt.aspects ......................... SUCCESS [5.057s]
[INFO] org.scala-ide.sdt.core ............................ SUCCESS [52.015s]
[INFO] org.scala-ide.sdt.core.tests ...................... SUCCESS [2:06.068s]
[INFO] org.scala-ide.sdt.spy ............................. SUCCESS [4.982s]
[INFO] org.scala-ide.sdt.debug ........................... SUCCESS [14.482s]
[INFO] org.scala-ide.sdt.debug.tests ..................... SUCCESS [58.897s]
[INFO] org.scala-ide.sdt.weaving.feature ................. SUCCESS [0.127s]
[INFO] org.scala-ide.sdt.feature ......................... SUCCESS [0.135s]
[INFO] org.scala-ide.sdt.dev.feature ..................... SUCCESS [0.133s]
[INFO] org.scala-ide.sdt.source.feature .................. SUCCESS [0.079s]
[INFO] org.scala-ide.sdt.update-site ..................... SUCCESS [3.463s]
Contributor

paulp commented Aug 18, 2013

For ease of reference:

[INFO] org.scala-ide.sdt.build ........................... SUCCESS [0.523s]
[INFO] org.scala-ide.sdt.aspects ......................... SUCCESS [5.057s]
[INFO] org.scala-ide.sdt.core ............................ SUCCESS [52.015s]
[INFO] org.scala-ide.sdt.core.tests ...................... SUCCESS [2:06.068s]
[INFO] org.scala-ide.sdt.spy ............................. SUCCESS [4.982s]
[INFO] org.scala-ide.sdt.debug ........................... SUCCESS [14.482s]
[INFO] org.scala-ide.sdt.debug.tests ..................... SUCCESS [58.897s]
[INFO] org.scala-ide.sdt.weaving.feature ................. SUCCESS [0.127s]
[INFO] org.scala-ide.sdt.feature ......................... SUCCESS [0.135s]
[INFO] org.scala-ide.sdt.dev.feature ..................... SUCCESS [0.133s]
[INFO] org.scala-ide.sdt.source.feature .................. SUCCESS [0.079s]
[INFO] org.scala-ide.sdt.update-site ..................... SUCCESS [3.463s]
@paulp

This comment has been minimized.

Show comment
Hide comment
@paulp

paulp Aug 18, 2013

Contributor

All those happy successes are after a905d0e. After 6d77da3 it went back to failing, but I can't for the life of me make any sense of the six megabytes of console output. Whether the failure is spurious or not, I'd think it would be better if it would not abandon ship at the first sniff of something wrong. I'm just going to assume it's spurious unless someone tells me otherwise.

 [INFO] org.scala-ide.sdt.build ........................... SUCCESS [0.479s]
 [INFO] org.scala-ide.sdt.aspects ......................... SUCCESS [4.516s]
 [INFO] org.scala-ide.sdt.core ............................ SUCCESS [43.115s]
 [INFO] org.scala-ide.sdt.core.tests ...................... FAILURE [1:57.485s]
 [INFO] org.scala-ide.sdt.spy ............................. SKIPPED
 [INFO] org.scala-ide.sdt.debug ........................... SKIPPED
 [INFO] org.scala-ide.sdt.debug.tests ..................... SKIPPED
 [INFO] org.scala-ide.sdt.weaving.feature ................. SKIPPED
 [INFO] org.scala-ide.sdt.feature ......................... SKIPPED
 [INFO] org.scala-ide.sdt.dev.feature ..................... SKIPPED
 [INFO] org.scala-ide.sdt.source.feature .................. SKIPPED
 [INFO] org.scala-ide.sdt.update-site ..................... SKIPPED
Contributor

paulp commented Aug 18, 2013

All those happy successes are after a905d0e. After 6d77da3 it went back to failing, but I can't for the life of me make any sense of the six megabytes of console output. Whether the failure is spurious or not, I'd think it would be better if it would not abandon ship at the first sniff of something wrong. I'm just going to assume it's spurious unless someone tells me otherwise.

 [INFO] org.scala-ide.sdt.build ........................... SUCCESS [0.479s]
 [INFO] org.scala-ide.sdt.aspects ......................... SUCCESS [4.516s]
 [INFO] org.scala-ide.sdt.core ............................ SUCCESS [43.115s]
 [INFO] org.scala-ide.sdt.core.tests ...................... FAILURE [1:57.485s]
 [INFO] org.scala-ide.sdt.spy ............................. SKIPPED
 [INFO] org.scala-ide.sdt.debug ........................... SKIPPED
 [INFO] org.scala-ide.sdt.debug.tests ..................... SKIPPED
 [INFO] org.scala-ide.sdt.weaving.feature ................. SKIPPED
 [INFO] org.scala-ide.sdt.feature ......................... SKIPPED
 [INFO] org.scala-ide.sdt.dev.feature ..................... SKIPPED
 [INFO] org.scala-ide.sdt.source.feature .................. SKIPPED
 [INFO] org.scala-ide.sdt.update-site ..................... SKIPPED

paulp added some commits Aug 18, 2013

An unapplySeq-via-String test.
There are a lot of details yet to be ironed out when it comes
to sequences, but at least here's a little evidence that the basic
mechanisms work.
Refinement of name-based unapplySeq.
Can't finnesse the drop method. Call it blindly for now, even
though in the long run you won't have to write drop.
@dragos

This comment has been minimized.

Show comment
Hide comment
@dragos

dragos Aug 19, 2013

Contributor

@paulp I'm going to have a look.

Contributor

dragos commented Aug 19, 2013

@paulp I'm going to have a look.

@dragos

This comment has been minimized.

Show comment
Hide comment
@dragos

dragos Aug 19, 2013

Contributor

@paulp That failure is spurious.

Contributor

dragos commented Aug 19, 2013

@paulp That failure is spurious.

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 19, 2013

Member

@paulp, thanks for splitting up the PR further -- LEG(TM)

Member

adriaanm commented Aug 19, 2013

@paulp, thanks for splitting up the PR further -- LEG(TM)

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 19, 2013

Member

(The 'E' reminded me of another adjective. Epic.)

Member

adriaanm commented Aug 19, 2013

(The 'E' reminded me of another adjective. Epic.)

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 19, 2013

Member

LGTM -- let's merge this tomorrow-Tuesday

Member

adriaanm commented Aug 19, 2013

LGTM -- let's merge this tomorrow-Tuesday

Merge branch 'master' into patmat
Conflicts:
	src/compiler/scala/tools/nsc/Global.scala
	src/compiler/scala/tools/nsc/transform/patmat/MatchTranslation.scala
@paulp

This comment has been minimized.

Show comment
Hide comment
@paulp

paulp Aug 20, 2013

Contributor

I'm resolving the merge conflict. I'm not rebasing since I'll lose those 44 green checkmarks. Let me know if I should rebase anyway.

Contributor

paulp commented Aug 20, 2013

I'm resolving the merge conflict. I'm not rebasing since I'll lose those 44 green checkmarks. Let me know if I should rebase anyway.

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 20, 2013

Member

No, thanks for resolving the conflict. I was about to merge this PR, so I'll wait until 01f771e goes green to merge anything else.

Member

adriaanm commented Aug 20, 2013

No, thanks for resolving the conflict. I was about to merge this PR, so I'll wait until 01f771e goes green to merge anything else.

@paulp

This comment has been minimized.

Show comment
Hide comment
@paulp

paulp Aug 20, 2013

Contributor

@adriaanm it has gone to state "happy green", now initiating breath holding sequence.

Contributor

paulp commented Aug 20, 2013

@adriaanm it has gone to state "happy green", now initiating breath holding sequence.

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 20, 2013

Member

Hope you haven't turned blue yet. Stay happy green!

Member

adriaanm commented Aug 20, 2013

Hope you haven't turned blue yet. Stay happy green!

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 20, 2013

Member

I was doing some quick local checking. Will hit merge button in 5 4

Member

adriaanm commented Aug 20, 2013

I was doing some quick local checking. Will hit merge button in 5 4

adriaanm added a commit that referenced this pull request Aug 20, 2013

Merge pull request #2848 from paulp/patmat2
Pattern matcher: extractors become name-based.

@adriaanm adriaanm merged commit 38d5b20 into scala:master Aug 20, 2013

1 check passed

default pr-scala Took 78 min.
Details
@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 20, 2013

Member

🎆

Member

adriaanm commented Aug 20, 2013

🎆

@paulp paulp deleted the paulp:patmat2 branch Aug 20, 2013

@paulp

This comment has been minimized.

Show comment
Hide comment
@paulp

paulp Aug 20, 2013

Contributor

Hope you haven't turned blue yet. Stay happy green!

https://wiki.jenkins-ci.org/display/JENKINS/Green+Balls

Contributor

paulp commented Aug 20, 2013

Hope you haven't turned blue yet. Stay happy green!

https://wiki.jenkins-ci.org/display/JENKINS/Green+Balls

@adriaanm

This comment has been minimized.

Show comment
Hide comment
@adriaanm

adriaanm Aug 20, 2013

Member

The green is a bit... radio-active, but otherwise this could very well turn out to be my favorite jenkins plugin. (Not the hardest contest.)

Member

adriaanm commented Aug 20, 2013

The green is a bit... radio-active, but otherwise this could very well turn out to be my favorite jenkins plugin. (Not the hardest contest.)

retronym added a commit to retronym/scala that referenced this pull request Dec 12, 2013

Test case for recently improved unchecked warning
Prior to scala#2848, the enclosed
test compiled without warning and printed:

    true
    false

Features interacting:
  - implicit class tags to enable type patterns on abstract types
  - type tests on compound types.

I think the unchecked warning is acceptable for now.

retronym added a commit to retronym/scala that referenced this pull request Mar 12, 2014

SI-8395 Regression in pattern matching with nested binds
Regressed in scala#2848. In particular,
see 0cf47bd and 017460e.

This commit does away with the `unbind` in the `TypeBound` extractor,
as this was burrowing through

Because of the regression, this pattern:

    (s @ (_s @ (_: String)))

was translated into `typeTestStep`, rather than a `bindingStep`. This
came down the the use of `unbind` in the `TypeBound` extractor.

My first step was to remove the `unbind`. That led to another
problem: the tree now matches `SymbolAndTypeBound`, which extracted
`symbol = s, tpe = String`, ignoring the `_s`. I changed that
extractor to no longer recursively apply to the sub-pattern tree,
which is what `MaybeTypedBound` used to do.

I also checked for other uses of `unbind` in the match translation.
The only place I found it is in `BoundTree#pt`. I believe that this
usage is correct, or at least, not obviously incorrect.

retronym added a commit to retronym/scala that referenced this pull request Mar 12, 2014

SI-8395 Regression in pattern matching with nested binds
Regressed in scala#2848. In particular,
see 0cf47bd and 017460e.

Because of the regression, this pattern:

    (s @ (_s @ (_: String)))

was translated into `typeTestStep`, rather than a `bindingStep`. This
came down the the use of `unbind` in the `TypeBound` extractor.

My first step was to remove the `unbind`. That led to another
problem: the tree now matches `SymbolAndTypeBound`, which extracted
`symbol = s, tpe = String`, ignoring the `_s`. I changed that
extractor to no longer recursively apply to the sub-pattern tree,
which is what `MaybeTypedBound` used to do.

I also checked for other uses of `unbind` in the match translation.
The only place I found it is in `BoundTree#pt`. I believe that this
usage is correct, or at least, not obviously incorrect.

yanns added a commit to yanns/json4s that referenced this pull request Nov 1, 2016

@yanns yanns referenced this pull request in json4s/json4s Nov 1, 2016

Closed

name based extractor to avoid instantiation #419

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment