-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Unreachability analysis for pattern matches #650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Analyze matches for unreachable cases. A case is unreachable if it implies its preceding cases. Call `C` the formula that is satisfiable if the considered case matches. Call `P` the formula that is satisfiable if the cases preceding it match. The case is reachable if there is a model for `-P /\ C`. Thus, the case is unreachable if there is no model for `-(-P /\ C)`, or, equivalently, `P \/ -C`, or `C => P`. Unreachability needs a more precise model for type and value tests than exhaustivity. Before, `{case _: Int => case 1 =>}` would be modeled as `X = Int \/ X = 1.type`, and thus, the second case would be reachable if we can satisfy `X != Int /\ X = 1.type`. Of course, the case isn't reachable, yet the formula is satisfiable, so we must augment our model to take into account that `X = 1.type => X = Int`. This is done by `removeVarEq`, which models the following axioms about equality. It does so to retain the meaning of equality after replacing `V = C` (variable = constant) by a literal (fresh symbol). For each variable: 1. a sealed type test must result in exactly one of its partitions being chosen (the core of exhaustivity) 2. when a type test is true, tests of super types must also be true, and unrelated type tests must be false For example, `V : X ::= A | B | C`, and `A => B` (since `A extends B`). Coverage (1) is formulated as: `A \/ B \/ C`, and the implications of (2) are simply `V=A => V=B /\ V=X`, `V=B => V=X`, `V=C => V=X`. Exclusion for unrelated types typically results from matches such as `{case SomeConst => case OtherConst => }`. Here, `V=SomeConst.type => !V=OtherConst.type`. This is a conservative approximation. If these constants happen to be the same value dynamically (but the types don't tell us this), the last case is actually unreachable. Of course we must err on the safe side. We simplify the equality axioms as follows (in principle this could be done by the solver, but it's easy to do before solving). If we've already excluded a pair of assignments of constants to a certain variable at some point, say `(-A \/ -B)`, then don't exclude the symmetric one `(-B \/ -A)`. (Nor the positive implications `-B \/ A`, or `-A \/ B`, which would entail the equality axioms falsifying the whole formula.) TODO: We should also model dependencies between variables: if `V1` corresponds to `x: List[_]` and `V2` is `x.hd`, `V2` cannot be assigned at all when `V1 = null` or `V1 = Nil`. Right now this is implemented hackily by pruning counter-examples in exhaustivity. Unreachability would also benefit from a more faithful representation. I had to refactor some of the framework, but most of it is shared with exhaustivity. We must allow approximating tree makers twice, sharing variables, but using different approximations for values not statically known. When considering reachability of a case, we must assume, for example, that its unknown guard succeeds (otherwise it would wrongly be considered unreachable), whereas unknown guards in the preceding cases must be considered to fail (otherwise we could never get to those case, and again, it would falsely be considered unreachable). Since this analysis is relatively expensive, you may opt-out using `-Xno-patmat-analysis` (or annotating the selector with @unchecked). We hope to improve the performance in the near future. -Ystatistics has also been extended to provide some numbers on time spent in the equality-rewrite, solving and analyzing.
Started jenkins job pr-scala-testsuite-linux-opt at https://scala-webapps.epfl.ch/jenkins/job/pr-scala-testsuite-linux-opt/204/ |
jenkins job pr-scala-testsuite-linux-opt: Success - https://scala-webapps.epfl.ch/jenkins/job/pr-scala-testsuite-linux-opt/204/ |
@retronym, could you please don your reviewer hat and take a look at this patch? |
Last comments:
|
I agree about the tests, absolutely. I just did not have much time left. |
Started jenkins job pr-scala-testsuite-linux-opt at https://scala-webapps.epfl.ch/jenkins/job/pr-scala-testsuite-linux-opt/215/ |
Unreachability analysis for pattern matches Thanks for reviewing, @retronym!
Analyze matches for unreachable cases.
A case is unreachable if it implies its preceding cases.
Call
C
the formula that is satisfiable if the considered case matches.Call
P
the formula that is satisfiable if the cases preceding it match.The case is reachable if there is a model for
-P /\ C
.Thus, the case is unreachable if there is no model for
-(-P /\ C)
,or, equivalently,
P \/ -C
, orC => P
.Unreachability needs a more precise model for type and value tests than exhaustivity.
Before,
{case _: Int => case 1 =>}
would be modeled asX = Int \/ X = 1.type
,and thus, the second case would be reachable if we can satisfy
X != Int /\ X = 1.type
.Of course, the case isn't reachable, yet the formula is satisfiable, so we must augment
our model to take into account that
X = 1.type => X = Int
.This is done by
removeVarEq
, which models the following axioms about equality.It does so to retain the meaning of equality after replacing
V = C
(variable = constant)by a literal (fresh symbol). For each variable:
(the core of exhaustivity)
and unrelated type tests must be false
For example,
V : X ::= A | B | C
, andA => B
(sinceA extends B
).Coverage (1) is formulated as:
A \/ B \/ C
, and the implications of (2) are simplyV=A => V=B /\ V=X
,V=B => V=X
,V=C => V=X
.Exclusion for unrelated types typically results from matches such as
{case SomeConst => case OtherConst => }
. Here,V=SomeConst.type => !V=OtherConst.type
. This is aconservative approximation. If these constants happen to be the same value dynamically
(but the types don't tell us this), the last case is actually unreachable. Of course we
must err on the safe side.
We simplify the equality axioms as follows (in principle this could be done by the
solver, but it's easy to do before solving). If we've already excluded a pair of
assignments of constants to a certain variable at some point, say
(-A \/ -B)
, thendon't exclude the symmetric one
(-B \/ -A)
. (Nor the positive implications-B \/ A
,or
-A \/ B
, which would entail the equality axioms falsifying the whole formula.)TODO: We should also model dependencies between variables: if
V1
corresponds tox: List[_]
andV2
isx.hd
,V2
cannot be assigned at all whenV1 = null
orV1 = Nil
. Right now this is implemented hackily by pruning counter-examples in exhaustivity.Unreachability would also benefit from a more faithful representation.
I had to refactor some of the framework, but most of it is shared with exhaustivity. We
must allow approximating tree makers twice, sharing variables, but using different
approximations for values not statically known. When considering reachability of a case,
we must assume, for example, that its unknown guard succeeds (otherwise it would wrongly
be considered unreachable), whereas unknown guards in the preceding cases must be
considered to fail (otherwise we could never get to those case, and again, it would
falsely be considered unreachable).
Since this analysis is relatively expensive, we give up when the CNF representation of the
proposition that models reachability/exhaustivity would become too large. When compiling
the compiler itself, 11% of the matches are left behind. Furthermore, you may opt-out using
-Xno-patmat-analysis
(or by annotating the selector with @unchecked).We hope to improve the performance in the near future.
(An experimental Sat4J interface is in the works, but it turns out it's really the CNF
explosion that kills performance.) Finally, -Ystatistics has also been extended to
provide some numbers on time spent in the equality-rewrite, CNF generation,
solving and analyzing.