Unreachability analysis for pattern matches #650

adriaanm · 2012-06-01T14:08:42Z

Analyze matches for unreachable cases.

A case is unreachable if it implies its preceding cases.
Call C the formula that is satisfiable if the considered case matches.
Call P the formula that is satisfiable if the cases preceding it match.
The case is reachable if there is a model for -P /\ C.
Thus, the case is unreachable if there is no model for -(-P /\ C),
or, equivalently, P \/ -C, or C => P.

Unreachability needs a more precise model for type and value tests than exhaustivity.
Before, {case _: Int => case 1 =>} would be modeled as X = Int \/ X = 1.type,
and thus, the second case would be reachable if we can satisfy X != Int /\ X = 1.type.
Of course, the case isn't reachable, yet the formula is satisfiable, so we must augment
our model to take into account that X = 1.type => X = Int.

This is done by removeVarEq, which models the following axioms about equality.
It does so to retain the meaning of equality after replacing V = C (variable = constant)
by a literal (fresh symbol). For each variable:

a sealed type test must result in exactly one of its partitions being chosen
(the core of exhaustivity)
when a type test is true, tests of super types must also be true,
and unrelated type tests must be false

For example, V : X ::= A | B | C, and A => B (since A extends B).
Coverage (1) is formulated as: A \/ B \/ C, and the implications of (2) are simply
V=A => V=B /\ V=X, V=B => V=X, V=C => V=X.

Exclusion for unrelated types typically results from matches such as {case SomeConst => case OtherConst => }. Here, V=SomeConst.type => !V=OtherConst.type. This is a
conservative approximation. If these constants happen to be the same value dynamically
(but the types don't tell us this), the last case is actually unreachable. Of course we
must err on the safe side.

We simplify the equality axioms as follows (in principle this could be done by the
solver, but it's easy to do before solving). If we've already excluded a pair of
assignments of constants to a certain variable at some point, say (-A \/ -B), then
don't exclude the symmetric one (-B \/ -A). (Nor the positive implications -B \/ A,
or -A \/ B, which would entail the equality axioms falsifying the whole formula.)

TODO: We should also model dependencies between variables: if V1 corresponds to
x: List[_] and V2 is x.hd, V2 cannot be assigned at all when V1 = null or
V1 = Nil. Right now this is implemented hackily by pruning counter-examples in exhaustivity.
Unreachability would also benefit from a more faithful representation.

I had to refactor some of the framework, but most of it is shared with exhaustivity. We
must allow approximating tree makers twice, sharing variables, but using different
approximations for values not statically known. When considering reachability of a case,
we must assume, for example, that its unknown guard succeeds (otherwise it would wrongly
be considered unreachable), whereas unknown guards in the preceding cases must be
considered to fail (otherwise we could never get to those case, and again, it would
falsely be considered unreachable).

Since this analysis is relatively expensive, we give up when the CNF representation of the
proposition that models reachability/exhaustivity would become too large. When compiling
the compiler itself, 11% of the matches are left behind. Furthermore, you may opt-out using
-Xno-patmat-analysis (or by annotating the selector with @unchecked).
We hope to improve the performance in the near future.
(An experimental Sat4J interface is in the works, but it turns out it's really the CNF
explosion that kills performance.) Finally, -Ystatistics has also been extended to
provide some numbers on time spent in the equality-rewrite, CNF generation,
solving and analyzing.

@unchecked

Analyze matches for unreachable cases. A case is unreachable if it implies its preceding cases. Call `C` the formula that is satisfiable if the considered case matches. Call `P` the formula that is satisfiable if the cases preceding it match. The case is reachable if there is a model for `-P /\ C`. Thus, the case is unreachable if there is no model for `-(-P /\ C)`, or, equivalently, `P \/ -C`, or `C => P`. Unreachability needs a more precise model for type and value tests than exhaustivity. Before, `{case _: Int => case 1 =>}` would be modeled as `X = Int \/ X = 1.type`, and thus, the second case would be reachable if we can satisfy `X != Int /\ X = 1.type`. Of course, the case isn't reachable, yet the formula is satisfiable, so we must augment our model to take into account that `X = 1.type => X = Int`. This is done by `removeVarEq`, which models the following axioms about equality. It does so to retain the meaning of equality after replacing `V = C` (variable = constant) by a literal (fresh symbol). For each variable: 1. a sealed type test must result in exactly one of its partitions being chosen (the core of exhaustivity) 2. when a type test is true, tests of super types must also be true, and unrelated type tests must be false For example, `V : X ::= A | B | C`, and `A => B` (since `A extends B`). Coverage (1) is formulated as: `A \/ B \/ C`, and the implications of (2) are simply `V=A => V=B /\ V=X`, `V=B => V=X`, `V=C => V=X`. Exclusion for unrelated types typically results from matches such as `{case SomeConst => case OtherConst => }`. Here, `V=SomeConst.type => !V=OtherConst.type`. This is a conservative approximation. If these constants happen to be the same value dynamically (but the types don't tell us this), the last case is actually unreachable. Of course we must err on the safe side. We simplify the equality axioms as follows (in principle this could be done by the solver, but it's easy to do before solving). If we've already excluded a pair of assignments of constants to a certain variable at some point, say `(-A \/ -B)`, then don't exclude the symmetric one `(-B \/ -A)`. (Nor the positive implications `-B \/ A`, or `-A \/ B`, which would entail the equality axioms falsifying the whole formula.) TODO: We should also model dependencies between variables: if `V1` corresponds to `x: List[_]` and `V2` is `x.hd`, `V2` cannot be assigned at all when `V1 = null` or `V1 = Nil`. Right now this is implemented hackily by pruning counter-examples in exhaustivity. Unreachability would also benefit from a more faithful representation. I had to refactor some of the framework, but most of it is shared with exhaustivity. We must allow approximating tree makers twice, sharing variables, but using different approximations for values not statically known. When considering reachability of a case, we must assume, for example, that its unknown guard succeeds (otherwise it would wrongly be considered unreachable), whereas unknown guards in the preceding cases must be considered to fail (otherwise we could never get to those case, and again, it would falsely be considered unreachable). Since this analysis is relatively expensive, you may opt-out using `-Xno-patmat-analysis` (or annotating the selector with @unchecked). We hope to improve the performance in the near future. -Ystatistics has also been extended to provide some numbers on time spent in the equality-rewrite, solving and analyzing.

scala-jenkins · 2012-06-01T14:42:56Z

Started jenkins job pr-scala-testsuite-linux-opt at https://scala-webapps.epfl.ch/jenkins/job/pr-scala-testsuite-linux-opt/204/

scala-jenkins · 2012-06-01T16:02:07Z

jenkins job pr-scala-testsuite-linux-opt: Success - https://scala-webapps.epfl.ch/jenkins/job/pr-scala-testsuite-linux-opt/204/

adriaanm · 2012-06-03T12:40:43Z

@retronym, could you please don your reviewer hat and take a look at this patch?
everyone else, feel free to join the party

retronym · 2012-06-03T15:10:28Z

Last comments:

very nicely described in the commit comments
test case / code ratio seems a touch low

adriaanm · 2012-06-03T15:48:37Z

I agree about the tests, absolutely. I just did not have much time left.
They are coming, promise!

adriaanm · 2012-06-03T15:52:38Z

Started jenkins job pr-scala-testsuite-linux-opt at https://scala-webapps.epfl.ch/jenkins/job/pr-scala-testsuite-linux-opt/215/

@retronym

Unreachability analysis for pattern matches Thanks for reviewing, @retronym!

Adriaan Moors added 7 commits June 1, 2012 16:04

all treemakers need positions for unreachable error

66220c9

construct typed trees in TypeTestTM

bc99166

patmatdebug

bf533ea

support for repeated approximation of treemakers

01ac32a

fixing bug found by unreachability

9ebd4f9

allow disabling patmat analyses: -Xno-patmat-analysis

8cab2fa

incorporate @retronym's review comments

3cb72fa

adriaanm pushed a commit that referenced this pull request Jun 3, 2012

Merge pull request #650 from adriaanm/topic-virtpatmat

6ed018b

Unreachability analysis for pattern matches Thanks for reviewing, @retronym!

adriaanm merged commit 6ed018b into scala:master Jun 3, 2012

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Unreachability analysis for pattern matches #650

Unreachability analysis for pattern matches #650

Uh oh!

adriaanm commented Jun 1, 2012

Uh oh!

scala-jenkins commented Jun 1, 2012

Uh oh!

scala-jenkins commented Jun 1, 2012

Uh oh!

adriaanm commented Jun 3, 2012

Uh oh!

retronym commented Jun 3, 2012

Uh oh!

adriaanm commented Jun 3, 2012

Uh oh!

adriaanm commented Jun 3, 2012

Uh oh!

Uh oh!

Unreachability analysis for pattern matches #650

Unreachability analysis for pattern matches #650

Uh oh!

Conversation

adriaanm commented Jun 1, 2012

Uh oh!

scala-jenkins commented Jun 1, 2012

Uh oh!

scala-jenkins commented Jun 1, 2012

Uh oh!

adriaanm commented Jun 3, 2012

Uh oh!

retronym commented Jun 3, 2012

Uh oh!

adriaanm commented Jun 3, 2012

Uh oh!

adriaanm commented Jun 3, 2012

Uh oh!

Uh oh!