The Pattern-Matching Bug: special handling of final exits#13076
The Pattern-Matching Bug: special handling of final exits#13076gasche merged 4 commits intoocaml:trunkfrom
Conversation
|
cc @ncik-roberts who I understand intends to review this, and also possibly @trefis, @maranget, @Octachron, @lthls |
c0dde85 to
d6b9380
Compare
| let pat_ctx = Context.lub pat ctx in | ||
| if Context.is_empty pat_ctx then None | ||
| else Some (pat, pat_ctx) | ||
| ) input_fail_pats in |
There was a problem hiding this comment.
Before these cases with an empty context would traverse exit_failpats, repeatedly filtered out in the partition_map call. I would then have to handle them in the None case, to filter them out before deciding whether the final exit is taken. Filtering them out earlier is better.
| will be [[Some _]]. *) | ||
| let fail_pats = complete_pats_constrs seen in | ||
| if List.length fail_pats >= !Clflags.match_context_rows then ( | ||
| let input_fail_pats = complete_pats_constrs seen in |
There was a problem hiding this comment.
I renamed fail_pats into input_fail_pats here because, while rebasing this patch, I introduced bugs by using fail_pats instead of fail_pats_in_ctx within the body of exit_failpats below.
lambda/matching.ml
Outdated
| let final_exit = Default_environment.final_exit defs in | ||
| let final_pats = List.map fst fail_pats_in_ctx in | ||
| (final_exit, final_pats) :: acc, | ||
| Jumps.empty Partial |
There was a problem hiding this comment.
It is not obvious here that the previous behavior is preserved.
- In
Totalmode, in trunk there is no notion of "final exit", so we would returnaccbefore. - In
Partialmode, we reachNoneearlier than in trunk, as in trunk the final exit would be the last element returned byDefault_environment.pop. The logic here is equivalent to what we would have done in that case.
lambda/matching.ml
Outdated
| match Default_environment.pop def with | ||
| | Some ((i, _), _) -> i, Jumps.singleton i ctx | ||
| | None -> Default_environment.final_exit def, Jumps.empty Partial | ||
| in (Lstaticraise (i, []), jumps) |
There was a problem hiding this comment.
The definition of comp_exit was below line 3371, I moved it up here and adapted the definition. Previous definition:
let comp_exit ctx m =
match Default_environment.pop m.default with
| Some ((i, _), _) -> (Lstaticraise (i, []), Jumps.singleton i ctx)
| None -> fatal_error "Matching.comp_exit"The fatal_error case in the old definition would correspond to the case of hitting a failure in a match that is known to be total. This case does not occur in the new definition, which always use the final exit if there are no other exits in the default environment.
| (* Act as Total, this means | ||
| If no appropriate default matrix exists, | ||
| then this switch cannot fail *) | ||
| (None, Jumps.empty) |
There was a problem hiding this comment.
This odd case of an empty default environment inPartial mode has its behavior change slightly in the new definition below. In the new code, this will generate a jump to the final exit, while the previous code would not. This is one case where the compilation behavior is in fact modified, and the new behavior is slightly worse than before.
(A high-level summary: we are going in the direction of trusting the type-checker totality information less, to avoid generating incorrect code when it is wrong. But this may introduce some cases where we will generate slightly worse code than if we had trusted it blindly. This here is one such case, but it is very rare, as explained below.)
I believe that this case basically never happens. When this current code was first written, there was an assert false in this case because the author (Luc Maranget) believed that it could not happen. But then Gilles Peskine in 2004 found an odd example that happens to trigger precisely this scenario. Here is the minimal repro case:
(*
By Gilles Peskine, compilation raised some assert false i make_failactionneg
*)
type bg = [
| `False
| `True
]
type vg =
| A
| B
| U of int
| V of int
| W of int
type tg = {
v : vg;
x : bg;
}
let predg x = true
let rec gilles o = match o with
| (* clause 1 *) {v = (U _ | V _); x = `False} when predg o -> 1
| (* clause 2 *) {v = (A | B) ; x = `False}
| (* clause 3 *) {v = (U _ | V _ | W _); x = `False}
-> 2
| (* clause 4 *) {v = _; x = `True} -> 3A rough summary of what happens is as follows:
- the whole program is Total, so in trunk there would be no final handler case added in the default environment -- compilation starts with an empty default environment
- the compiler decides to first check clauses 1, 2, 4 together, and then clause 3 in a separate submatrix; the compilation of the submatrix 1,2,4 is in Partial mode (as it is followed by another submatrix), with an exit for clause 3 in the default environment
- the compiler splits 1,2,4 into 4 followed by 1,2
- when compiling 1,2 together, the compiler first generates a switch on column
v, and then jumps to sub-matrices that match on columnx(the details of why this happens are in a documentation comment for theprecompile_orfunction) - when it compiles the second column of clause 2, it does this in a restricted default environment where all patterns incompatible with
v = (A | B)have been filtered out (this is the call toDefault_environment.pop_compatinprecompile_or); this default environment is empty, so the compilation of the second column happens in Partial mode but in an empty default environment
In this case, our PR (or any approach to include a final-exit case in the Total case as well) will generate slightly worse code: the compilation of the x column of clause 2 will include a failure case if the value is different from False jumping to the final exit, and the whole pattern will thus include an (unnecessary) final exit handler.
Note that all three of (a) a guard, (b) several or-patterns in the same column, and (c) polymorphic variants at a closed type are necessary in this example. Without (b), we don't get the call to pop_compat, without (a), we don't get the split in step 2., so the compilation happens in a Total context. Without (c) (with a normal variant instead of a polymorphic variant), we don't get a call to mk_failaction_neg, we get mk_failaction_pos which is aware than there are no other cases that can match and does not generate a final exit.
There was a problem hiding this comment.
I believe that this case basically never happens.
Famous last words! I believe that @ncik-roberts found examples of this pattern in the wild in #13338 (comment) .
In that PR I propose a fix that de-pessimizes this situation. (There was no easy fix in the context of the current PR, but #13338 introduces a refinement of partiality information that makes it easier to fix.)
d6b9380 to
9c62599
Compare
|
Note: I am in the process of rebasing the rest of the fix on top of the present PR, and one thing I wanted to mention is that once this PR is in, fixing the bug itself is fairly easy. A minimal fix (which is not necessarily the most pleasant to argument for correctness and to review), with the copious comments removed, is as follows: diff --git c/lambda/matching.ml w/lambda/matching.ml
index 180908d6d94..7e954c6c576 100644
--- c/lambda/matching.ml
+++ w/lambda/matching.ml
@@ -3539,7 +3539,7 @@ and compile_match_nonempty ~scopes repr partial ctx
| { args = (arg, str) :: argl } ->
let v, newarg = arg_to_var arg m.cases in
bind_match_arg str v arg (
- let args = (newarg, Alias) :: argl in
+ let args = (newarg, str) :: argl in
let cases = List.map (half_simplify_nonempty ~arg:newarg) m.cases in
let m = { m with args; cases } in
let first_match, rem =
@@ -3554,13 +3554,22 @@ and compile_match_simplified ~scopes repr partial ctx
| { cases = []; args = [] } -> comp_exit ctx m.default
| { args = ((Lvar v as arg), str) :: argl } ->
bind_match_arg str v arg (
- let args = (arg, Alias) :: argl in
+ let args = (arg, str) :: argl in
let m = { m with args } in
let first_match, rem = split_and_precompile_simplified m in
combine_handlers ~scopes repr partial ctx first_match rem
)
| _ -> assert false
+and compile_partial partial = function
+ | Mutable -> Partial
+ | Immutable -> partial
+
+and mut_of_str =
+ function
+ | Strict | Alias -> Immutable
+ | StrictOpt -> Mutable
+
and bind_match_arg str v arg (lam, jumps) =
let jumps =
(* If the Lambda expression [arg] to access the first argument is
@@ -3588,12 +3724,7 @@ and bind_match_arg str v arg (lam, jumps) =
incorrect. We "fix" the context information on mutable arguments
by calling [Context.erase_first_col] below.
*)
- let mut =
- match str with
- | Strict | Alias -> Immutable
- | StrictOpt -> Mutable
- in
- match mut with
+ match mut_of_str str with
| Immutable -> jumps
| Mutable ->
Jumps.map Context.erase_first_col jumps in
@@ -3648,9 +3770,9 @@ and do_compile_matching_pr ~scopes repr partial ctx x =
and do_compile_matching ~scopes repr partial ctx pmh =
match pmh with
| Pm pm -> (
- let arg =
+ let arg, arg_partial =
match pm.args with
- | (first_arg, _) :: _ -> first_arg
+ | (first_arg, str) :: _ -> first_arg, compile_partial partial (mut_of_str str)
| _ ->
(* We arrive in do_compile_matching from:
- compile_matching
@@ -3683,20 +3805,20 @@ and do_compile_matching ~scopes repr partial ctx pmh =
compile_test
(compile_match ~scopes repr partial)
partial divide_constant
- (combine_constant ploc arg cst partial)
+ (combine_constant ploc arg cst arg_partial)
ctx pm
| Construct cstr ->
compile_test
(compile_match ~scopes repr partial)
partial (divide_constructor ~scopes)
- (combine_constructor ploc arg ph.pat_env cstr partial)
+ (combine_constructor ploc arg ph.pat_env cstr arg_partial)
ctx pm
| Array _ ->
let kind = Typeopt.array_pattern_kind pomega in
compile_test
(compile_match ~scopes repr partial)
partial (divide_array ~scopes kind)
- (combine_array ploc arg kind partial)
+ (combine_array ploc arg kind arg_partial)
ctx pm
| Lazy ->
compile_no_test ~scopes
@@ -3706,7 +3828,7 @@ and do_compile_matching ~scopes repr partial ctx pmh =
compile_test
(compile_match ~scopes repr partial)
partial (divide_variant ~scopes !row)
- (combine_variant ploc !row arg partial)
+ (combine_variant ploc !row arg arg_partial)
ctx pm
)
| PmVar { inside = pmh } -> |
9c62599 to
468c453
Compare
trefis
left a comment
There was a problem hiding this comment.
I am not confident enough in my review to click on the approve button, but I read through this diff and it makes sense to me.
Also: I like the new "pipeline".
|
Hmmm, apparently I was expecting too much from github: I posted some replies to your comments in my review, but that's only apparent on the diff view. In the "Conversation" view they appear as standalone review comments (which don't make much sense without context). 🤷 |
|
I have taken review comments into account, this is ready for another round of review if someone is interested and available. |
ncik-roberts
left a comment
There was a problem hiding this comment.
The focus of my review was on the correctness of the code. This may not be apparent from the comments I left, which are nitpicks, generally speaking. All this means is that I couldn't find issues that threaten correctness. (That's a good thing.)
| begin match compile_fun safe_partial pm with | ||
| | exception Unused -> assert false | ||
| | (lam, jumps) -> | ||
| match Jumps.partial jumps with |
There was a problem hiding this comment.
I might suggest asserting that jumps.env is empty at this point. My preference is for checking these sort of invariants explicitly, especially when their violation would mean that there is a bug elsewhere in the code (as would be the case here).
There was a problem hiding this comment.
There is a global invariant that functions that take a default environment return a jump summary that is well-scoped with respect to this default environment, it only mentions exit numbers that are listed in the default environment. This is more important than specifically ensuring an empty environment here -- it holds in recursive calls as well.
I thought about how to check this statically upon reading your comment. My intuition is that it holds "by construction" because all extensions of the jump summary are done on exit numbers obtained from the default environment. In fact when reading the code I noticed that the present PR breaks this invariant subtle in the mk_failaction_pos case, it sometimes calls Jumps.add i ... jumps when i is the final exit, and not one the non-final exits explicitly listed in the default environment. I changed this by pushing an extra commit which you can have a look at if you are curious.
(The shape of the code is slightly tricky here because we accumulate exit numbers and fail patterns to build both the fails list and the jump summary, but we don't want to handle the final exit differently for fail patterns.)
To summarize. (1) in fact the idea that jumps.env is empty at this point did not hold with the code that you looked at, but (2) this was an innocuous difference because mentioning the final exit in the jump summaries does not make a difference as long as the "partiality" information of the jump summary is correct, and (3) this should be fixed now.
There was a problem hiding this comment.
Note: adding a dynamic check here is not obvious with the current API as there is no way to iterate on the non-final exits of a jump summary. I used ignore (Jumps.map (fun _ -> assert false) jumps as a way to do a quick&dirty check at the point you suggested (and the check does not fail on the compiler codebase or the testsuite), but I propose to not modify the API to commit a check for now.
| (* Total: a singleton only jumps to exit [i], | ||
| not to the final exit. *) | ||
| add i ctx (empty Total) |
There was a problem hiding this comment.
This condition is easier to observe to hold at the callsite to singleton, and not here. (You could imagine calling singleton final_exit ctx.) I would suggest adding a partial argument to singleton so that the assumption can be validated at the callsite.
There was a problem hiding this comment.
This (and the previous issue with final exit ending up bound in the jump summary through Jumps.add) suggests an awkward corner of our API: one should not call Jumps.singleton or Jumps.add with the final exit, but there is no easy way to check this inside the Jumps module because jump summaries do not know what the final exit number is.
I agree that this is awkward and worth fixing, and I have two fixes in mind:
- I could make the jump-summary-extension functions take the default environment as parameter, to check dynamically that the exit is a non-final exit of the default environment, or
- I could change the API of
Default_environmentto hide the final exit better. The only way it is used during compilation is to produce a fail action, so I could just add aDefault_environment.final_fail_actionfunction and that's it. (This would require remodelingmk_failaction_posa bit, possibly for the better.) Then it would be very obvious that the exit number cannot possibly be the final exit, as there would be no way to extract it from the default environment.
I prefer option (2) and will give it a try after mulling over it.
There was a problem hiding this comment.
(1) seems simple enough too. The dynamic check could just be "raise if they're the same". I don't want to indicate that I prefer a significant API overhaul when the API is just used in a few places, and is used correctly in all of those places.
There was a problem hiding this comment.
I pushed a new commit that performs a refactoring along (2). Let me know what you think. If we decide to go in this direction, I will rebase the PR to squash it into the first commit.
There was a problem hiding this comment.
I think I prefer your original approach. It's easier for me to judge its correctness, and I don't find the API change to make substantially more-useful guarantees.
I may just suggest a comment along the lines of "the label must not be the final exit" on both Jumps.singleton/Jumps.add, but even that I don't feel strongly about.
There was a problem hiding this comment.
I worked on mk_failaction_pos again but in the end I decided to keep the approach with raise_final_exit (so not the original approach), I find it cleaner -- and it is not more complex.
There was a problem hiding this comment.
OK, sounds good. Thanks.
71c0f73 to
8731c10
Compare
|
I have rebased the PR against trunk (notably #13084 which created conflicts with the present PR). I reworked |
|
All looks good to me. Thanks for working through my comments. I consider this "approved", but we'll need a maintainer to make that official. |
trefis
left a comment
There was a problem hiding this comment.
Approving on behalf of @ncik-roberts
This approach should preserve the current compilation behavior. It is more efficient for [Partial] matches, as we avoid tracking any context information on the final exit. We hope that this refined interface will also let us implement a fine-grained fix for totality failures due to mutable value. The idea is that when we detect that some current values may flow to the final exit, we could sometimes weaken [Jumps.empty Total] into [Jumps.empty Partial] if we are under a mutable constructor.
8731c10 to
c36f6c6
Compare
|
Thanks @ncik-roberts and @trefis! I rebased the PR and updated the Changes entry, it should now be ready to merge. |
|
Thanks to the discussion with @ncik-roberts in #13152 (the next step in the Pattern-Matching Bug saga, which is still un-merged), I have realized, unlike what I originally believed, the present PR changes pattern-matching compilation in some cases that are pessimized. Simple repro case: type _ t = Bool : bool t | Int : int t | Char : char t
let test2 : type a . a t * a t -> unit = function
| Int, Int -> ()
| Bool, Bool -> ()
| _, Char -> ()This uses a GADT: not all constructors are handled in each case (for example The compiler first split the pattern-matching input in two submatrices, one for the first two lines and one for the last line. The change happens when compiling the first matrix. We see the following in the compiler matching-compilation debug output: In words:
Before this PR, a "final exit" default environment would be added at the toplevel for toplevel-Partial matches, and not added for toplevel-Total matches. When encountering the After this PR, we ignore missing failure cases only if the current partiality of the match is Total. In the present example the current partiality is Partial, because we are the first matrix of a split. The change in behavior may pessimize pattern-matchings on GADTs, even in absence of mutable state, by adding extra match-failure clauses. (I believe that the impact analysis of @ncik-roberts included this commit, so that we know that it doesn't actually affect negatively the Jane Street codebase much, but I'm not sure, maybe he somehow looked at #13152 in isolation?) If we wanted to avoid this (it's a matter of cost/benefit analysis, so depends on how easy it is to avoid), I think that we could track the totality information in a more fine-grained way. So far in the general approach (including #13152) we start Total, then we move to Partial, and in some cases we switch from Total to Partial on some arguments in mutable positions. Instead we may want to track something more fine-grained, for example (but I haven't fleshed this out completely yet):
The current partiality being Total means that we will not exit the current submatrix; if Partial, we may jump outside it. The global partiality being Total means that if we exit the current submatrix, we only jump to one of the existing default environments (a further split); Partial means that we may jump to the final exit / a match-failure clause. |
|
(We noticed this when we decided in #13152 to add a warning in situations where the new pattern-matching logic pessimizes compilation. We assumed this warning would require a mix of GADTs and mutable states. But then CamlinternalFormat started warning on the |
|
The impact analysis indeed only looked at #13152 in isolation. So that's why I missed the pessimization in this PR. |
|
I'm supportive of trying to make this case better. I found many more cases where this issue presents in Jane Street's codebase. Some extra examples beyond just GADTs where a match_failure case is inserted because of this PR:
Refutation case example: type nothing = |
type t = A | B | C of nothing
let f : bool * t -> int = function
| true, A -> 3
| false, A -> 4
| _, B -> 5
| _, C _ -> .Hand-written comparison example: type t =
| A of int
| B of string
| C of string
| D of string
let compare t1 t2 =
match t1, t2 with
| A i, A j -> Int.compare i j
| B l1, B l2 -> String.compare l1 l2
| C l1, C l2 -> String.compare l1 l2
| D l1, D l2 -> String.compare l1 l2
| A _, (B _ | C _ | D _ ) -> -1
| (B _ | C _ | D _ ), A _ -> 1
| B _, (C _ | D _) -> -1
| (C _ | D _), B _ -> 1
| C _, D _ -> -1
| D _, C _ -> 1I think even a simple version of the approach you propose would help these cases, and I think they're worth the effort. (I see a few hundred such cases in the Jane Street codebase.) |
… default This un-does a small pessimization of ocaml#13076, which was analyzed in details in ocaml#13076 (comment), but at the time was believed to not affect any program in the wild. Since then Nick Roberts found instances of this pattern involving or-patterns and polymorphic variants. Thanks to the new presentation of partiality information that distingues the "global" and "current" information, the pessimization is now easy to undo, as done in this commit.
… default This un-does a small pessimization of ocaml#13076, which was analyzed in details in ocaml#13076 (comment), but at the time was believed to not affect any program in the wild. Since then Nick Roberts found instances of this pattern involving or-patterns and polymorphic variants. Thanks to the new presentation of partiality information that distingues the "global" and "current" information, the pessimization is now easy to undo, as done in this commit.
… default This un-does a small pessimization of #13076, which was analyzed in details in #13076 (comment), but at the time was believed to not affect any program in the wild. Since then Nick Roberts found instances of this pattern involving or-patterns and polymorphic variants. Thanks to the new presentation of partiality information that distingues the "global" and "current" information, the pessimization is now easy to undo, as done in this commit.
…n-fix Matching compilation: fix a pessimization from #13076
This PR is the next episode of The Pattern-Matching Bug saga, as documented in #7241 (comment) . It does not fix any bug itself, and should not change the compiler behavior in any way; it changes the way totality information is handled in the compiler in a way that is necessary for the final fix for "totality issues". The change is relatively subtle and deserves a careful review of its own, independent of any behavior change and bug fixing.
Today in trunk
When we call the pattern-matching compiler, we pass it some totality information, that is a flag
TotalorPartial, which was computed by the type-checker.Partialmeans that the set of clauses may not cover all possible inputs, so the compiler has to generate code to handle match failures.The logic to handle match failures is in the
toplevel_handlerfunction which is the shared entry point for clauses compilation, and in thecheck_totalhelper function.ocaml/lambda/matching.ml
Lines 3790 to 3807 in 1a79864
ocaml/lambda/matching.ml
Lines 3776 to 3781 in 1a79864
When the totality information is
Total(and we don't use the-safer-matchingflag, which makes the compiler always assumePartial), we start the compilation with an empty default environment. When it isPartial(or when-safer-matchingis used), we create a final exit,raise_num, for the match-failure case, and we populate the default environment with the information that this final exit is present and that it can handle all possible input values (this is whatomega_listdoes).(A default environment is a piece of optimization information used by the pattern-matching compiler that lists all possible exit points if the current sub-matrix fails to handle some input, with information on which values can possibly be matched by those exit points to avoid jumping to a part of the code that will fail to match the value anyway and propagate to a later exit.)
check_totalis then in charge of creating the actual handler for this final exit. There is an optimization here: if the pattern-matching compiler did not in fact generate any jump to the final exit (despite the totality information beingPartial), then there is no need to generate the handler. This information is found in the "jump summary" calledtotalgenerated by the compiler at it produces the pattern-matching code. The optimization is useful because sometimes the totality information over-approximates, for example any match on an extensible datatype is considered Partial, sotry ... with _ -> ()will be considered Partial but it does match all possible exceptions.The problem
The problem is that sometimes the totality information computed by the type-checker is wrong, in the cases discussed in #7421 where a value is mutated while it is being matched. The exhaustivity checker in the type-checker is not aware of this issue and will falsely report as
Totalsets of clauses that do fail to match at runtime. This can result in unsound code generated by the pattern-matching compiler.(At this point one may suggest to fix the exhaustivity checker in the type-checker to reason about mutability in a more correct way. The problem is that the exhaustivity checker is a fairly complex beast, in charge of dealing with the subtleties of advanced type-system features (GADTs, extensible types, exception rebinding) and of approximating the resolution of a NP-complete problem without blowing up in practice. I don't want to have the soundness of the code generated by the compiler depend on my attempt at also correctly handling in-flight mutations inside this machinery.)
The general approach I propose to fix this is to sometimes degrade the totality information from
TotaltoPartial, inside the pattern-matching compiler, when we encounter a match on a mutable sub-value that is not locally total (it does not handle all cases) but is claimed Total by the type-checker. This change is not included in the present PR, which only contains buildup work to make the change easier.With the current approach in trunk, the choice to have a final exit is made before compilation starts, based only on the type-checker-provided totality information. If we decide later, in the middle of compilation, that we in fact wanted to compile in
Partialmode, it is too late to add a final exit in the default environment to correctly compile the rest of the code.This PR
One simple way to fix this problem is to always include a final exit, even in
Totalmode, in the default environment. Thanks to thecheck_totaloptimization, this exit will not be actually generated if it is not used by the code, in particular in all theTotalcases that are not unsound. But this approach has a downside, which is that we track fine-grained context information in the jump summaries (to be able to optimize the code of the corresponding handler), but we know that the context information is useless for the final exit, which has a trivial handler which does not depend on the shape of the values it receives. Adding a bunch of context computations to allTotalmatches (which are by far the most common in OCaml programs) is not a great idea, it could result in a noticeable slowdown to compilation speed for existing programs with subtle total pattern-matching.This PR implements a refined approach where this "final exit" has a special status in default environments and jump summaries:
final_exitrecord fieldThen we do create a final exit in all cases (whether we are told that the match is
PartialorTotal), we compile our pattern-matching clauses, and we generate a handler for the final exit if the jump summary detected that some values use it. The correspondingtoplevel_handlercode now looks like this: