-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parmatch: make exhaust
(exhaustivity and fragility checking) lazy
#9511
Conversation
6636f0c
to
a7065aa
Compare
The "end-game" comment was maybe a bit tacky. There are situations involving GADTs where an exponential number of counter-example candidates are considered and all discarded (so looking for the first valid counter-example), and the type-checking time remains exponential in this case. For example (I will add it to the testsuite): type _ t =
| A : int t
| B : int t
| C : bool t
type force = Int of int t
let test = function
Int (A|B), Int (A|B), Int (A|B), Int (A|B), Int (A|B),
Int (A|B), Int (A|B), Int (A|B), Int (A|B), Int (A|B),
Int (A|B), Int (A|B)
-> () On this family of programs, exhaustivity checking is still exponential. On this example (N=12), this PR is twice faster than trunk (which is a good sign that it is safe for compile-time in the general case). On larger N values the PR terminates (slowly) while trunk does a Stack Overflow. |
In (* Warn about all relevant cases when possible *)
let f = function
| None, None -> 1
| Some _, Some _ -> 2;;
(*
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
((Some _, None)|(None, Some _))
*) With the present PR, we only have one counter-example shown. Is this a problem? Note that, already in trunk, you only see one example if a GADT is involved: type _ option_gadt =
| None : 'a option_gadt
| Some : 'a -> 'a option_gadt;;
let f = function
| None, None -> 1
| Some _, Some _ -> 2;;
(*
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
(Some _, None)
*) |
a7065aa
to
40d366b
Compare
@gasche thanks for the quick fix. |
exhaustiveness.ml contains a mix of tricky examples (checking correctness of the typer) and "slow" examples (that have exponential checking time due to a search explosion). Everything is run using expect_test, which uses the bytecode type-checker, so the "slow" examples are very slow. This PR separates out the "slow" example in their own file exhaustiveness_slow.ml which is not a reference test but checked using ocamlc.opt. On my machine make one DIR=tests/typing-warnings goes from 14s to 2.3s. This would be important in the future if we want to add more "slow" tests (in particular this would be useful for ocaml#9511). (No change entry needed.)
exhaustiveness.ml contains a mix of tricky examples (checking correctness of the typer) and "slow" examples (that have exponential checking time due to a search explosion). Everything is run using expect_test, which uses the bytecode type-checker, so the "slow" examples are very slow. This PR separates out the "slow" example in their own file exhaustiveness_slow.ml which is not an expect-style test but checked using ocamlc.opt. On my machine make one DIR=tests/typing-warnings goes from 14s to 2.3s. This would be important in the future if we want to add more "slow" tests (in particular this would be useful for ocaml#9511). (No change entry needed.)
@gasche: this might be a naïve question (I haven't dived back into the actual code yet) but I would have thought that since parmatch doesn't generate or-patterns anymore, Is that not the case, or have you just not got round to it yet? |
Parmatch does generate or-patterns, but not at the toplevel anymore. In particular, exploding a wildcard results in an or-pattern there. I think that we could review the
|
(Clarification: I think that currently the only or-patterns generated in depth are those that come from wildcard explosion, they are generated in Parmatch by |
Thanks for the explanation :) |
@garrigue this PR ensures that only a single counter-example is shown to the user. This is good for performance and it follows the recommendation of @maranget. If you object to this change, could you make it clear? Otherwise we should probably assume that it is fine -- and go ahead if the PR is approved. |
I have no strict opinion about the number of counter-examples shown to the user. |
Indeed, we only save checking counter-examples after the first valid one. For exhaustiveness checking, most programs are exhaustive and will not have any counter-example, so there is no gain ... except we avoid stack overflows by (1) being tail-recursive in generation and (2) sending smaller candidates to the type-checker. On the other hand, for fragility checking, most programs are not fragile and thus return many counter-examples; there will be a gain, possibly a large one. |
We could probably decide to show a few more counter-examples, but the question then is how many we show. 3 ? 5 ? All of them until we exhaust a fixed computation time ? The nice thing with just 1 is that it seems to be a canonical answer. It is also what the type-checker does when it is in backtracking mode -- so currently the number of counter-examples depends on the use of GADTs or not. |
Note that redundancy checking also expects that a counter-example will be found, and one should suffice. So I agree that in general there is no need to generate all counter-examples. A question could still be whether the cost of doing so when we are checking specifically exhaustiveness is really problematic, compared to the benefit of having a comprehensive list of cases. I suppose this is completely independent of the generation part. |
For any fixed number N of counter-examples we decide to show, we can build an artificial example that will blowup but would not blowup with less counter-examples. But we have no reason to believe that, except for maybe N=2, there are non-artificial situations like this in the wild. The question is whether we have a good reason to choose any specific N other than 1. If we don't know how many we want, then maybe 1 is enough. If you can make an argument that 3 is better than 1, but 4 is not better than 3, I'm happy with 3. Or 2. Right now the only choice I know how to justify is 1. |
Yes. It's trivial to tweak this PR to show N counter-examples instead of 1. |
(With the current state of my understanding, the only justified choice is N=1 (and Luc supports this specific choice), so I think it would be reasonable to merge this PR if it gets properly reviewed/approved. We can of course change the number of counter-example later if we get new information.) |
My answer may have been unclear, but I never suggested using a fixed N other than 1. |
I see, thanks for the clarification. I am still wondering about whether we could/should change the order of counter-example to use the wildcard/default-group first (whether we want to show the counter-example with the leftmost or the rightmost difference). One argument in favor of showing more counter-examples: in exhaustivity checking, if we assume that the user is going to fix the counter-example given (and maybe generalize a bit to related ones), they will re-run the type-checker again and the following counter-examples will be forced. For example, if we show one counter-example at a time and they iterate (recompile and fix) three times, we have the same explosions that we would have had with N=3 anyway. (If the example can explode because of GADTs, then this explosion is inevitable once the pattern becomes exhaustive.) |
I tried to change the counter-example order to include the wildcard/default-group counter-examples first, and then the constructor group. My intuition was that it would be better because it shows the counter-example with "leftmost" difference first. In fact I don't like the result, at least the impact on the testsuite. Many cases look like this: 151 | ....match r1, r2, a with
152 | | R1, _, A -> ()
153 | | _, R2, X -> ()
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
-(R1, R1, B)
+(R2, R2, Y) When I see the match above, obviously both cases are missing, but if I was to fix the code by adding new clauses for them I would put the Another example of (arguably) regression: 2 | ..match x, y with
3 | | _, A z -> z
4 | | _, B z -> if z then 1 else 2
5 | | _, C z -> truncate z
6 | | TE TC, D [|1.0|] -> 14
7 | | TA, D 0 -> -1
8 | | TA, D z -> z
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
-(TE TC, D [| 0. |])
+(TB, D _) This does not mean that the current strategy is always the right one, of course. Sometimes the new order would result in a better result, or at least a non-worse result: 176 | ....match r1, r2, a with
177 | | _, R1, 0 -> ()
178 | | R2, _, [||] -> ()
179 | | _, R1, 1 -> ()
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
-(R2, R2, [| _ |])
+(R1, R1, 2) Here I would in fact rather add the The ideal ordering of counter-example would be the order that follows the source program: we could order the submatrices by the order of first appearance in the clauses. The output would be very nice, but the implementation could be a bit unpleasant; and it is important to keep this soundness-critical part of the codebase nice. I will think more about it. P.S.: @trefis warned me wisely when I opened the PR (26 days ago) that we don't want to get caught deeply in the details of counter-example ordering, and that is a risk of the path this PR was taking with the cut-to-1 strategy. Spot on! But then this PR has remained open on Github and in my head, and no one else did the work of helping ourselves move on, so here we are. |
40d366b
to
2d34dc2
Compare
Of course I got obsessed with the question of the perfect ordering of counter-examples. (@trefis: I believe this is relevant to your interest in Merlin clause completion.) I just pushed a new commit that ensures that, during the computation of specialized submatrices, we preserve source order: the sub-rows of the specialized submatrices are in source order, and the submatrices themselves appear in the order of their first row in the source. This gives very nice ordering properties for the generated counter-examples. For example, consider
The type-checker would previously first suggest
Note: I first tried something more complex and more delicate, which is to also respect the ordering of the default submatrix relative to the specialized submatrices. This actually gives bad results, in the cases I looked at it is not what we want. (There are more explanations in the commit message if for some strange reason you are curious about this.) I now believe that the default submatrix should always go last. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks correct and like a nice improvement.
I really liked the effect of the last commit.
@@ -2211,7 +2220,11 @@ let check_unused pred casel = | |||
let qs = [q] in | |||
begin try | |||
let pss = | |||
get_mins le_pats (List.filter (compats qs) pref) in | |||
(* prev was accumulated in reverse order; | |||
restore source order to get ordered counter-examples *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the order matters here, we're just trying to decide whether q
is useful or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you comment the List.rev
here, the following happens:
type ('env, 'a) var =
| Zero : ('a * 'env, 'a) var
| Succ : ('env, 'a) var -> ('b * 'env, 'a) var
;;
type ('env, 'a) typ =
| Tint : ('env, int) typ
| Tbool : ('env, bool) typ
| Tvar : ('env, 'a) var -> ('env, 'a) typ
;;
let f : type env a. (env, a) typ -> (env, a) typ -> int = fun ta tb ->
match ta, tb with
| Tint, Tint -> 0
| Tbool, Tbool -> 1
| Tvar var, tb -> 2
| _ -> . (* error *)
;;
[%%expect{|
Line 15, characters 5-6:
15 | | _ -> . (* error *)
^
Error: This match case could not be refuted.
- Here is an example of a value that would reach it: (Tint, Tvar Zero)
+ Here is an example of a value that would reach it: (Tbool, Tvar Zero)
|}];;
List.map (fun constr_mat -> Some constr_mat) constrs @ [None] | ||
|> List.to_seq | ||
|> Seq.flat_map | ||
(function | ||
| Some constr_mat -> try_non_omega constr_mat | ||
| None -> try_omega ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this a bit hard to get my head around, harder than I'd like it to be at least.
The following seems simpler to me
Seq.append
(Seq.flat_map try_non_omega (List.to_seq constrs))
(try_omega ())
Though I understand it implies more computations early on (but does it really matter?)
Note: I'm fine with leaving the code as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I care about delaying computations as much as we (easily) can. Should I use [ `Specialized of foo | `Default ]
instead of options?
This solves exponential-blowup issue with the strict traversal and/or strict witness computation in cases where an exponential number of counter-examples is generated. This fixes Stack Overflow and exponential-time issues on examples using or-patterns heavily, including one that naturally found its way in real-world user code (see the following testsuite commit). We now systematically keep only one counter-example, instead of letting the type-checker decide whether to discard counter-examples (in Backtrack_or mode) or to preserve them (in Refine_or mode). Note: in the exhaustivity warning, there are sub-messages printed to indicate that: - the exhaustivity counter-example is related to an extensible type, or - that a when-guarded clause does match the counter-example In both cases the warning is there to explain the counter-example(s) shown (not a property of all counter-examples); keeping at most one valid counter-example means that they will be printed less often, but it is the correct/intended behavior in that case.
We produce exhaustivity counter-example in the order of the specialized submatrices. Having submatrices in source order gives the nice behavior that the clause that would naturally been inserted first in the source is given first as a counter-example. Consider for example: function | true, true -> true | false, false -> false The two counter-examples are (true, false) and (false, true). Before this patch, (false, true) would be shown first. After this patch, (true, false) is shown first. This corresponds to the following natural completion order: function | true, true -> true | true, false -> ? | false, false -> false | false, true -> ? On the other hand, the ordering of the default submatrix, with respect to the specialized submatrices, is not preserved -- it is always ordered last. One could intuitively expect the default submatrix to appear in the position of the first omega row in the source. We tried this, and it is not a good idea: - the change is much more invasive as the interface of `build_specialized_submatrices` has to change - the behavior of the result is in fact unpleasant; it is not intuitive to order counter-examples in this way. For example, consider: function | _, None -> false | true, Some true -> false The two exhaustivity counter-examples are (true, Some false) and (false, Some _). The second comes from the default submatrix: morally it is (_, Some _), with "some other constructor missing from the column" instead of the first _. There is no reason to suppose that the user would want to place this (_, Some _) or (false, Some _) counter-example first in the completed code; indeed, this intuition would suggest writing an exhaustive covering of patterns of the form (_, foo), inserted after the first clause, but then the other clauses below become unnecessary! When an omega patterns appears high in the column like this, it is usually because there is a very specific matching condition to the rest of its row, that justifies some shortcutting behavior. The program is typically *not* completed by adding more specific matching conditions.
2d34dc2
to
d8acfa9
Compare
I am eager to close this particular branch of work (although we still need to decide the fate of #9514 for this), so I decided to go ahead and merge the branch as approved. (In addition to @maranget's approval of principle in a discussion along with @trefis and @Octachron when the PR was written.) |
This PR fixes #9498 (a Stack Overflow when checking for fragile pattern matching in a hand-written user program) and subsumes #9499 and #9502, which were earlier approaches that made witness/counter-example generation partially lazy.
(cc @maranget, @trefis and @Octachron, with which I discussed the design space; we decided on this approach together, see #9502 (comment) )
Design
With this PR, the
exhaust
function is completely lazy (it returns aSeq.t
of non-matched patterns), and we only keep the first output that passes the type-checker (thepred
function provided by Typecore). Before, the type-checke would decide (through thetype_pat
splitting mode) whether to keep only the first example (Backtrack_or) or to keep more examples (Refine_or); in particular, in some cases we show simpler, shorter reports to users, with only one counter-example to exhaustivity instead of an or-pattern of many examples.Before:
After:
We could decide to show more than one counter-example, but there isn't a clear choice of bound/cutoff (if we show all of them, we are back to an exponential behavior); I don't think this would be a good idea.
On the other hand, we could decide to show the wildcard-group counter-example before the constructor-group counter-examples, which would show
(false, _, _, _)
in this case. This seems nicer to me (I would assume that wildcard-group counter-examples are typically broader: "the difference more to the left") and I would be interested in @maranget's opinion, it sounds like something he would have intuitions about.Performance impact
Besides fixing a real-world bug (a stack overflow on a human-written program, #9498), basically the hope is that this is the end-game in terms of avoiding or-pattern blowups resulting from the
Parmatch.exhaust
function. On the family of exponential examples provided by @maranget in #9499 (comment), all examples now run in linear time (instantaneous).For previous iterations of this PR, I suspected that they could increase compilation time in common cases (it used an lazy sequence but would often force it fully in the end). This new version should be just as fast or faster than trunk in the common case, because we stop the traversal after the first valid counter-example is found. (If there is only one example, then the cost is very small anyway; if there are many, we save time by giving less work to the type-checker.) I expect that the difference will not be noticeable on most programs.
Edit: I realized later that this reasoning does not hold in the case of exhaustiveness checking. In the common case, a function definitive exhaustive, which means that we have to traverse all counter-example candidates to check that none of them are valid (so in particular we fully force the sequence). In this common use-case we do expect a constant-factor slowdown due to the use of a lazy rather than strict sequence; but in practice I was unable to observe a noticeable performance difference, even on hand-crafted worst-case examples.
Code style
I rewrote the PR with non-invasiveness in mind, and also the goal of not using any additional function missing from the
Seq
module. In particular I inlined mypop : 'a Seq.t -> ('a * 'a Seq.t) option
function, and I rewrote one part in a slightly less natural form to use justSeq.flat_map
instead ofSeq.append
. This avoids the question of where to store theSeq
additions (locally in the function, or at the beginning of Parmatch, or in utils/misc.ml?), and should make it easier for our Bucklescript friends to rebase to old OCaml versions if they want to.