Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More robust matches #1518

Closed
wants to merge 8 commits into from
Closed

More robust matches #1518

wants to merge 8 commits into from

Conversation

trefis
Copy link
Contributor

@trefis trefis commented Dec 7, 2017

As noted in #1493 various places in parmatch and matching work under the assumption that patterns in the same column will be of the same type, which is just wrong.

Parmatch

Two places made such an assumption:

  • Compat.compat(s): it expected the patterns it is called on to be of the same kind. There is no need to have such an assumption, considering patterns of different "kinds" as incompatible is corrrect.

  • build_other: the function in charge of building a value not being matched by a matching when we know the matching to be non-exhaustive.
    This function works by listing all the "constructors" of a type present on the first column, and then returning a constructor of type which is not in that list.
    Here we need to account for the fact that all the patterns in the first column are not necessarily of the same type.

  • full_match: cf. More robust matches #1518 (comment)

I checked the other assert false and fatal_error present in the file, I believe they are present for other reasons than "oh, these two things can only be of the same type".

Matching

We sometimes make that assumption when specializing the list of reachable trap-handlers. This was the case when specializing by a record, a tuple, or a lazy pattern.
We now do some kind checking there and when things are of the same kind (two tuples, two records or two lazys) we do, here too, resort to arity checking.
Just as constructor of extensible types, these two cases, as well as array patterns, would benefit from some kind of compatibility checking based on typing information.
This is however not crucial and non-trivial to do well, so let's consider it at a later date.

Does this PR cover all the cases?

I believe so, but of course can't be sure.
My process was to look at all the assert false in parmatch justified by the said assumption, writing a test breaking the assumption, fixing the code, making sure the test passed.

While doing this, I would sometimes get a assertion failure/fatal error in matching.
This happened for records in the expected way (that is: we already knew records weren't handled well), as well as for tuples.
I then looking at all the matcher_* functions, and apart from matcher_lazy, they all seemed sensible to me; that is: they all raised NoMatch instead of failing, which seems right.

@trefis
Copy link
Contributor Author

trefis commented Dec 7, 2017

(Beware: once again github decides to sort commit by their creation time, not topologically)

@trefis
Copy link
Contributor Author

trefis commented Dec 7, 2017

OK actually, parmatch is still broken after this patch

  • exhaust can be made to fatal_error
  • I believe full_match can be tricked into returning true whereas the matching is actually incomplete.
  • ...

I'm... looking.

@xavierleroy xavierleroy changed the title Robuster matches More robust matches Dec 7, 2017
@trefis
Copy link
Contributor Author

trefis commented Dec 7, 2017

Alright, new approach:

The idea

First, notice that build_other is only called if full_match returns false.
So, if full_match returned true on columns containing patterns of different types, then build_other could continue to assume the column it's working on is consistent.

I believe we can change full_match to return true when the column it is looking at contains patterns which are of incompatible types. The reason being that we cannot build a value satisfying both types (alternative formulation: the intersection of two such types is empty, and any match on the empty type is complete).

The implementation

There are a few things that might seem surprising when looking at the "coherence" check being run by full_match.

  • we do not recurse in subpatterns, we only look at the head constructor.
    So we are not preventing (1, 2) and ('a', 2) from being in the same column for example. That is fine though, the recursion in exhaust will handle such cases.
  • the consistency check for records is more complex than it needs to, considering full_match always returns true once it sees a record.
    I will fix that tomorrow.
  • what about polymorphic variants, shouldn't we be doing something more complicated? Perhaps. I'm tired. You're welcome to think about it.
  • Couldn't the check for constructors simply ensure that everything is a constructor, without worrying that all constructors are part of a type of the same shape?
    No, it wouldn't catch cases like this:
type t = A | B | C                                  
type _ repr = R1 : unit repr | R2 : t repr          
                                                    
let f (type a) (r1 : a repr) (r2 : a repr) (a : a) =
  match r1, r2, a with                              
  | R1, _, () -> ()                                 
  | _, R2, C -> ()                                  
  • Is the check for constructors strong enough? What happens if I mix two types of exactly the same shape? Could I trick full_match into believing a match isn't complete and then get build_other to fail?
    Not really:
    • if you have less than the number of constructors present in both types (I don't mean the sum, I mean the number which is the same for both types), then build_other can extend either of the types, it's fine.
    • you apparently can't have more than the number of constructors present in both types. I still haven't exactly figured out why. Perhaps get_mins cleans things up before we even get into exhaust. Although I'm currently not sure that it would. In any case, we could change the the = in full_match for >= and we'd be safe.

@trefis
Copy link
Contributor Author

trefis commented Dec 7, 2017

you apparently can't have more than the number of constructors present in both types. I still haven't exactly figured out why. Perhaps get_mins cleans things up before we even get into exhaust.

get_mins actually has nothing to do in that story. It's just that when building the specialized submatrices, we'll conflate any two constructor which have the same tag (cf. simple_match). Therefore removing any possibility of having more specialized submatrices than there are constructor in either types.

@gasche
Copy link
Member

gasche commented Dec 7, 2017

Should we try to meet in person (with @maranget) to explain/discuss/review the patch?

@trefis
Copy link
Contributor Author

trefis commented Dec 19, 2017

Some of the changes of these GPR are being proposed separately in #1538 , while the rest needs to be slightly reworked and will be proposed in a later PR.

Closing.

@trefis trefis closed this Dec 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants