Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

do not error when instantiating polymorphic fields in patterns #1748

Merged
merged 4 commits into from May 25, 2018

Conversation

@trefis
Copy link
Contributor

commented Apr 30, 2018

Reading this thread on the caml-list, it seems like the initial motivation for 68006b7 and 2acec46 was that Parmatch wasn't able to cope with incoherent columns.

Also, these commits apparently had the side effect that matching the field with a variable introduced polymorphic bindings. But nowadays that would remains the case even if we reverted them: type_cases makes sure that polymorphism is always propagated to the branches.

As Parmatch is now able to deal with incoherent columns, it has become possible to revert the changes introduced by these commits.
Some reasons for doing so would be:

  • it would make the code simpler
  • the compiler would stop rejecting code that typechecks and cannot go wrong

For the record, one can already write similar examples where we're instantiating polymorphic types from patterns, and where the compiler does not error or warn, e.g.

match [] with
| [] -> ()
| [ 4 ] -> ()

This GPR reverts these two commits.


People looking at the test output might be surprised to see warning 8 emitted: shouldn't Parmatch consider a matrix with an incoherent first column as exhaustive?
Indeed it should, however the exhaustiveness check never sees a matrix with an incoherent column in this case: on this particular example get_mins (which is used when building our initial matrix) removes the rows introducing the incoherence.
It should be fairly easy to update get_mins so that it notices the incoherence ("and then what?") but that is completely orthogonal to this GPR, so I'll delay that discussion for a later time.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 1, 2018

One potentially useful side-effect of this change is that it lets you instantiate the variable with an empty type to allow refutation cases:

type (_, _) eq = Refl : ('a, 'a) eq

type empty = (int, string) eq

type foo = { foo : 'a. 'a list }

let foo () =
    match { foo = [] } with
    | { foo = [] } -> true
    | { foo = (_ : empty) :: _ } -> .
@gasche

This comment has been minimized.

Copy link
Member

commented May 1, 2018

I understand @trefis' point that the original error was put in place to workaround a lack of robustness of the pattern-matching engine, which is now more robust and thus removes the need for it. However, I cannot help thinking that the proposed code examples look fishy. Leo demonstrates that there cannot be any other useful case than [] at type 'a . 'a list, so why are we insisting that we should be able to write the corresponding dead code?

(It's not always the case that the pattern-language is expressive enough to write a refutation clause using the right choice of instantiation of the type variable; think 'a. unit -> 'a for example; but it would be possible to enrich it, we should remark that those clauses are refutable semantically.)

Personally I would rather say that the current error makes sense, and that ideally an exhaustivity checker (a type-directed version, which we don't have) could know that rigid/polymorphic variables must not be instantiated to prove more sets of clauses exhaustive.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 2, 2018

think 'a. unit -> 'a for example

I just wanted to point out that this type is not refutable -- no function with that type can return normally, but such functions clearly do exist.

@yallop

This comment has been minimized.

Copy link
Member

commented May 2, 2018

Here's some (previously-rejected) code that's not currently handled correctly:

# type t = { l : 'a. 'a; };;
type t = { l : 'a. 'a; }
# match {l=assert false} with {l=3} -> () | {l= ""} -> ();;
Fatal error: exception File "bytecomp/matching.ml", line 2231, characters 58-64: Assertion failed
``
@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 2, 2018

I don't think the current error does make sense though. It prevents patterns that are both useful and valid. For example:

# type id = { id : 'a. 'a -> 'a };;
type id = { id : 'a. 'a -> 'a; }
# let foo { id = (id : int -> int) } = id;;
Characters 10-12:
  let foo { id = (id : int -> int) } = id;;
            ^^
Error: The record field id is polymorphic.
       You cannot instantiate it in a pattern.

Detecting that a case is dead is a job for parmatch it should not be handled with an ad-hoc and incorrect type error.

There is also some missing motivation in @trefis' description of this issue. The code for implementing this error does not mix at all well with #1675 -- since the error is of very dubious value it seems better to get rid of it than to implement something complex just to support it.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 2, 2018

Here's some (previously-rejected) code that's not currently handled correctly:

I had assumed this would mean that there was an example without using the current patch that could reach this assertion failure -- since it is already possible to get a single column containing incompatible patters -- but I haven't been able to find it. Someone who understands the pattern-matching code better than me should probably have a proper look though just to make sure.

@gasche

This comment has been minimized.

Copy link
Member

commented May 2, 2018

I don't consider your last example {id = (id : int -> int) } particularly "useful and valid".

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 2, 2018

Why not? Coercing a value to a more specific type is often useful to ensure it is being used as you think it is, so it's useful, and the value being matched clearly has int -> int as an instance of it's type, so it's valid.

@gasche

This comment has been minimized.

Copy link
Member

commented May 2, 2018

I apologize for the somewhat confrontational style the discussion has taken. Let's maybe take a step back.

  1. Removing the non-trivial code that currently enables this error is useful and would help other changes to the code.
  2. On the other hand, there seems to be some issues with incoherence-robustness in matching (the pattern-matching compiler, rather than the analyzer in parmatch) that show up.
  3. Allowing to instantiate polymorphic quantifiers when typing patterns is certainly sound, but it is unclear how useful (and thus desirable) it is. Unless we have some appealing examples, maybe theory can help.

Regarding (3), I think it is rather natural to consider adding instantiation to a language of System F patterns. A basic pattern language would be:

p ::=
  | _
  | lit
  | p as x
  | K (p1, .., pn)

And I can see two natural choices of typing rules:

  • separate abstractions and applications,
    p @ A and forall a. p, where
    p @ A matches on a value of type forall a. B when p matches on a value of type B[A/a],
    and forall a . p merely adds a new a type variable in the typing environment and matches on
    a p of the same type
  • combined abstraction and application,
    under a . p, where
    under a . p matches on a value of type forall a . B if p matches on a value of type B with a
    new variable a in the typing environment

(all of these constructs are fully erased in an implicit language where type abstractions and applications are inferred, like OCaml)

The first proposal lets you write the examples given above

match ([] : forall a. List a) with
| [] -> ()
| (1 :: _) @ int -> ()
| (_ :: _) @ empty -> .

and

match id with
| {id = id_int @ int} -> id_int

and the application construct is useful for matching under a polymorphic value without instantiating to concrete types

match (p : forall a. int * a) with
| forall a . (n, _) @ a -> n

The under a . p design only allows the latter form,

match (p : forall a. int * a) with
| under a . (n, _) -> n

My understanding is that the current check in the OCaml compiler roughly corresponds to enforcing that only the under a . ... form is used, which guarantees that type abstractions are only opened with rigid/generalized variables, which removing the check would move us into the world of the separate application and abstraction forms, where instantiations with non-variable types are possible.

Boths forms are obviously sound, but I still don't really see the point of the non-rigid instantiation here.

Note: I believe that the refutation clause mechanism could be enriched by knowing that capturing a value at a rigid/generalizable variable type is just as impossible as capturing a value of an empty type, so that

match (p : forall a . list a) with
| under a . [] @ a -> ()
| under a . (_ : a) :: _ -> .

also works.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 2, 2018

Firstly, I should say that (1) and (2) probably outweigh (3) anyway -- so much of this discussion is probably academic. However, with regards to (3):

Boths forms are obviously sound, but I still don't really see the point of the non-rigid instantiation here.

This seems back-to-front to me. The question should not be about why should we support non-rigid instantiation, the question should be why should we prevent non-rigid instantiation. It is the latter that requires additional code. Support for non-rigid instantiation naturally falls out of the type system, and then we have to do additional checks to prevent it.

Adding additional ad-hoc restrictions inherently creates inconsistencies. OCaml's type system supports normal instantiation everywhere else, so restricting it in patterns creates unnecessary asymmetry. For example,

# type id = { id : 'a. 'a -> 'a };;
type id = { id : 'a. 'a -> 'a; }
# let foo id =
    match id.id with
    | (id : int -> int) -> id;;
    val foo : id -> int -> int = <fun>
# let foo id =
    match id with
    | { id = (id : int -> int) } -> id;;
    Characters 39-41:
      | { id = (id : int -> int) } -> id;;
          ^^
Error: The record field id is polymorphic.
       You cannot instantiate it in a pattern.

There is no fundamental reason that those two functions should be treated any differently. Projecting in an expression and projecting in a pattern should be considered equivalent.

The only reason -- beyond the issues with parmatch.ml and matching.ml -- I've seen given for adding this restriction is roughly that most of the examples using it involve dead code. I have two issues with this reason:

  1. Dead code should produce warnings from the compiler not errors. Errors should be reserved for soundness issues. Sometimes we use errors to avoid having to handle awkward corner cases in the compiler but where possible they should correspond to a genuine concern about the soundness of the program. We even have a specific warning ("this match case is unused") that would be ideal for the examples.
  2. We do not detect the dead code almost identical situations. For example:
# let foo () =
    match [] with
    | [] -> ()
    | x :: _ -> x;;
      val foo : unit -> unit = <fun>

Note: I believe that the refutation clause could be enriched by knowing that capturing a value at a rigid/generalizable variable is just as impossible as capturing a value of an empty type, so that

I agree, but similarly the unused match case detection could be extended with such knowledge. This would then produce a warning exactly on the cases that involved dead code, rather than ruling out all uses of non-rigid instantiation.

@gasche

This comment has been minimized.

Copy link
Member

commented May 2, 2018

I actually find your argumentation rather convincing. In the treatment of OCaml polymorphic record fields, field projection is where instantiation happens, so it makes sense to consider record-patterns (which are the left-hand-side counterpart of field projections) as instantiation sites. I would be reassured if I could get a proper unused-clause warning out of dead-code instantiations, but independently the argument for removing the error is cogent.

@trefis trefis force-pushed the trefis:remove-record-check branch 2 times, most recently from d6a7a40 to 2113413 May 4, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 4, 2018

Like Leo I spent some time trying to reproduce the error on trunk and failed to do so.
Then I tried adding a few similar examples but using other kind of patterns (e.g. constructors), as there are a few other assert falses lying around I thought I'd be able to produce similar errors: I didn't.

As it turns out, these assert false were all legitimate, because split_constr was doing its job properly. Except when the head pattern is a constant, in which case it will happily group other clauses whose head pattern is a constant, regardless of whether it's the same "kind" of constant or not.

I commited these few tests and then added a commit fixing that issue in matching.

The code generated for such matchings can at times be reasonably nonsensical, but since it is dead code, I guess it doesn't matter too much.
It might be nice however to come back to this at some point and be explicit about dead code in the output.

Note: at this point this PR is rebased on top of #1762 (because it was more pratical for debugging), so it there's a bit of noise in the diff. Hopefully that will disappear soon.

@trefis trefis force-pushed the trefis:remove-record-check branch from 2113413 to 37df043 May 7, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 9, 2018

This PR is now rebased on a fairly recent trunk (i.e. there is no noise in the diff anymore), and, I believe, ready to be reviewed.

@gasche: do you think you can review this or should we ask @maranget? I'm confident the proposed change to matching is safe/correct independently of the rest of this PR, what I am not certain of is whether it is enough (though I suspect it is).

@gasche

This comment has been minimized.

Copy link
Member

commented May 14, 2018

I have reviewed the PR and believe that it is correct.

I wonder how difficult it is to actually have Useless Clause or similar warning when writing a type-instantiating pattern in this case. I tried to play a bit with the ~pred parameter of do_check_partial, as this seems to be the way type-directed information is injected within parmatch, but I couldn't get anything working as I hoped.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 14, 2018

I have reviewed the PR and believe that it is correct.

Enough to formally approve this PR? :)

I wonder how difficult it is to actually have Useless Clause or similar warning when writing a type-instantiating pattern in this case. I tried to play a bit with the ~pred parameter of do_check_partial, as this seems to be the way type-directed information is injected within parmatch, but I couldn't get anything working as I hoped.

Perhaps you should give some more details about what you tried?

The way I see it there shouldn't be any technical difference in this case between warning (i.e. checking) from parmatch or checking when we first type the pattern; the problem is about finding a different implementation of the check that would play nicely with the changes proposed in #1675. I currently do not have any proposition to make for such a new implementation, do you?

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 15, 2018

I think that using the old code for the warning is not really accurate enough. Ideally the warning would find patterns that would not type check if the polymorphic type was replaced by an empty type. However care would be needed around type annotations because those should not trigger the warning.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 23, 2018

Ah yes indeed, I forgot about that :)

Anyway, does one of you want to approve this?

@gasche

gasche approved these changes May 24, 2018

Copy link
Member

left a comment

The reason I haven't approved this yet is that it still feels like a regression in my mind: we are removing a mechanism that is, on average, useful, and replacing it with nothing -- the gain is in facility to update the code for other purposes. I am now convinced by the argument that the removal is not a bad idea, but in absence of a mechanism to warn on bad-uses of the newly-allowed feature, I'm still not convinced that it is a good idea.

On the other hand, I think that you are both right to try to simplify the codebase when possible, to make further changes easier. After much hesitation, I'm approving this PR.

@trefis trefis force-pushed the trefis:remove-record-check branch from 37df043 to c2fef18 May 24, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 24, 2018

Yes, I get your point... Anyway, thank you for approving!
(I rebased the commits and will merge them once the CIs come back OK)

@maranget

This comment has been minimized.

Copy link
Contributor

commented May 24, 2018

Oups I forgot to ask : why are the tests so precise ? That is why having lambda-code in reference compilation output. Isn't sufficient to check compilation success and warnings?

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 24, 2018

Fair question, I'm not sure I remember why I did it this way, I guess I was just curious to see what the pattern matching compiler emits for dead branches.

And to answer the question in a more general setting: I think it is good to have at least some output of the pattern matching compiler in the testsuite. Having regression tests of the output allows us to track whether optimizations apply properly or not, whereas checking the behavior only allows us to test correctness.

@trefis trefis merged commit e407ecf into ocaml:trunk May 25, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@trefis trefis deleted the trefis:remove-record-check branch May 25, 2018

@shindere

This comment has been minimized.

Copy link
Contributor

commented May 25, 2018

@trefis it seems this breaks CI at Inria.

What fails is the third ocnfiguration of tools/ci/inria/other-configs,
namely when the compiler is configured with:

./configure -no-flat-float-array -flambda -with-flambda-invariants

With this configuration, the tests/basic/patmatch_incoherence.ml expect test
fails:

 ... testing 'patmatch_incoherence.ml' with 1 (expect) => failed (program output patmatch_incoherence.ml.corrected.corrected differs from reference patmatch_incoherence.ml: 
--- patmatch_incoherence.ml	2018-05-25 11:29:14.635156844 +0200
+++ patmatch_incoherence.ml.corrected.corrected	2018-05-25 11:35:38.164219153 +0200
@@ -137,7 +137,8 @@
          (raise (makeblock 0 (global Assert_failureg) [0: "" 1 12])))
      *match* =a (field 0 *match*))
     (catch
-      (let (len =a (array.length[gen] *match*)) (if (!= len 0) (exit 10) 0a))
+      (let (len =a (array.length[addr] *match*))
+        (if (!= len 0) (exit 10) 0a))
      with (10) (if (!= *match* 3) (exit 9) 0a)))
  with (9) (raise (makeblock 0 (global Match_failureg) [0: "" 1 0])))
 Exception: Assert_failure ("", 1, 12).
)

See e.g. log of build #834 of the other-configs job on Inria's CI.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 25, 2018

Sigh.

I guess @maranget was right to be surprised by these tests, it's hard to get a stable lambda output.
I'm now tempted to not dump the lambda, but just placing a type annotation on the [||] pattern would also fix the issue.

Does anyone have an opinion?

@gasche

This comment has been minimized.

Copy link
Member

commented May 25, 2018

I think that removing the lambda dump is reasonable. What is it that the lambda is testing?

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 25, 2018

I pushed 34504e8 which remove the lambda dump from the test output.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented May 25, 2018

(the CI still appears to be broken but for a test unrelated to this one; I don't think I'm the responsible for that faillure)

@shindere

This comment has been minimized.

Copy link
Contributor

commented May 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.