Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

About the order in which record fields are typed #7816

Open
vicuna opened this Issue Jul 2, 2018 · 4 comments

Comments

Projects
None yet
1 participant
@vicuna
Copy link
Collaborator

commented Jul 2, 2018

Original bug ID: 7816
Reporter: @trefis
Status: new
Resolution: open
Priority: normal
Severity: minor
Category: typing
Monitored by: @nojb @yallop

Bug description

Original starting point of the discussion: #1675

N.B. I will not discuss principality concerns here, they remain the same regardless of which order is actually chosen.

Current state of things (up-to OCaml 4.07)

For patterns:

  • when typing a record pattern, fields are typed in the order following that of the record declaration
  • parmatch currently relies on this sorting having been done, cf.
    let records_args l1 l2 =
  • matching doesn't appear to rely on this having been done, but does use that same (declaration) order, cf.

For expressions fields are typed in the order following that of the record declaration
Alain pointed out this is coherent with the way labeled arguments are typechecked.

Suggested change: follow the source order instead

If I'm not mistaken, the proposition has been made by both Gabriel and Leo.

I'm not sure whether they propose this change to be limited to patterns or to apply to expression as well.
Although it seems like if we change the order in patterns we should change the order in expressions as well.

Potential issues

If we change the order in which typing is done, then we also need to update the evaluation order (to match the new typing order).

As Jeremy said:

The equation propagation behaviour is closely connected with
evaluation order, too. In a match like this:

match e with
| {x = lazy A, y = lazy B}

equality information needs to flow in the direction first-evaluated
to last-evaluated because the equalities are only available when
pattern matching succeeds.

So changing the direction in which equalities flow (which will cause
a noisy type-checking failure in some currently-valid programs) also
involves changing the order in which patterns are evaluated (which
will cause a silent evaluation-order change in some currently-valid
programs).

Although I'm not sure lazyness really matters here?

Following that, Gabriel added:

I thought that the evaluation-order change would be simple, but we
have a problem if a matched column contains two record patterns with
different field orders.

To give a concrete example of a problematic pattern:

type _ t = Pair : ('a * 'b) t | Unit : unit t
type 'a r = { x : 'a t * 'a; y : 'a t * 'a; }
let foo (type a) (r : a r) =
  match r with
  | { x = (Pair, _); y = (_, (a, b)) } -> ignore (a, b)
  | { y = (Pair, _); x = (_, (c, d)) } -> ignore (c, d)
  | _ -> ()
@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 2, 2018

Comment author: @stedolan

To expand on Jeremy's comment about laziness, here's an example using laziness that would segfault if the typing order were changed but the matching order were not.

type 'a is_str = Is_string : string is_str
type 'a box = { x : 'a; tag : 'a is_str Lazy.t }

let foo (type a) : a box -> bool = function
  | { tag = lazy Is_string; x = "foo" } -> true
  | _ -> false

let bang = foo { x = 42; tag = lazy (assert false) }

The field 'x' is only known to be of type 'string' after the match on field 'tag' has succeeded. Since it is a lazy pattern-match, this match could fail with an exception, in which case 'x' might not be a string.

So, if the equations gleaned from 'tag = lazy Is_string' are available when type-checking 'x = "foo"', then 'tag' must be evaluated before 'x' is matched. 4.06.1 types and matches 'x = "foo"' first, so this function gives a type error:

This pattern matches values of type string
but a pattern was expected which matches values of type a

It would also be fine to type and match 'tag = lazy Is_string' first, in which case this code would be type correct and fail with an assertion error, but soundness demands that the typing order and evaluation order do not contradict each other.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 2, 2018

Comment author: @trefis

Thanks for the example.

What I meant was: we can get the same segfault without needing lazyness at all, if you have:

type 'a typ = Is_string : string typ | Is_int : int typ
type 'a box = { x : 'a; tag : 'a typ }

let foo (type a) : a box -> bool = function
  | { tag = Is_string; x = "foo" } -> true
  | _ -> false

you'll get into trouble just the same if you look at x first and you're passed { tag = Is_int, x = 42 }.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 2, 2018

Comment author: @gasche

A comment on my different-field-orders-in-the-same-column point: (as Thomas already knows) we know how to compile this correctly, by splitting the matrix into several submatrices, one for different row orders. (In general "split more" is the magical solution to compilation correctness issue.) But this solution is a bit frustrating as it may degrade the performance of existing programs.

We could have a conservative criterion on when it is not needed to split, that is when it is valid to reorder field. It's clearly sound if the fields are all variables, and I think we can extend to the case where the fields are all exhaustive value patterns (value patterns as opposed to effectful patterns like "lazy"). In particular, it is non-trivial but I think checking a one-constructor GADT can be delayed to after accessing type-dependent patterns -- this is similar the logic used to shortcut the case switch when all type-valid GADT constructors return a field of interest in the same position.

With this improvement (a conservative criterion on when reordering is sound and splitting is thus not necessary), the impact on performance would probably be negligible, so it would be all dandy. But this is yet more complicated logic to implement and maintain, and nobody is positively excited by the idea.

@vicuna

This comment has been minimized.

Copy link
Collaborator Author

commented Jul 3, 2018

Comment author: @trefis

I agree with this on principle, though I would probably use a slightly different criterion.

But indeed, I'm not excited by this either: keeping the current ordering seems "fine" for now; we can reconsider after an hypothetical cleanup/simplification of matching.ml

@vicuna vicuna added the typing label Mar 14, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.