Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not enrich type_decls with incoherent manifests #1468

Merged
merged 4 commits into from Feb 8, 2018

Conversation

Projects
None yet
4 participants
@trefis
Copy link
Contributor

trefis commented Nov 6, 2017

Follow up to: MPR#6394, GPR#1425.

As discussed in the MPR and GPR linked above, recursive module can add inconsistent equations in the environment. The two previous PRs updated parts of the codebase, which were previously relying on the fact that the environment is consistent, to keep working even when that's not the case.
For one of these, the change raised the question of what to do when the environment is inconsistent, and whether it was safe to keep going (were other parts of the codebase really going to catch the issue afterward?).

This GPR proposes to handle the issue differently by preventing the addition of inconsistent equations to the environment in the first place, and reverting to the old attitude of "if the environment is inconsistent, there is a bug in the compiler, let's crash now".

A few remarks:

  • Some (weak) coherence check was already present before, I simply extended it.
  • If the check fails, we do not fail right away, we simply do not add the guilty equation to the environment. This behavior was already present before, and while I'd be inclined to change it, I'll leave that as future work.
  • I reverted the previous patch which made the compiler more resilient to inconsistent environment. This means that the compiler might fail again if something other than recursive module adds inconsistent equations. I think this is a good thing: I'd rather know this is going on and make the conscious decision of supporting or rejecting it, rather than silently accepting it.

@trefis trefis force-pushed the trefis:enrich_typedecl branch from 7405529 to 53a82fd Nov 6, 2017

Changes Outdated
@@ -14,6 +14,9 @@ be mentioned in the 4.06 section below instead of here.)
(Vincent Laviron, with help from Pierre Chambart,
reviews by Gabriel Scherer and Luc Maranget)

This comment has been minimized.

@trefis

trefis Nov 6, 2017

Author Contributor

N.B. that change entry was in a "Code generation and optimizations" before #1370 got merged, but then somehow appeared in "Type system", I assume that's due to a merge/rebase mistake.

@trefis

This comment has been minimized.

Copy link
Contributor Author

trefis commented Dec 4, 2017

@garrigue could you do a formal review of these changes?

@garrigue
Copy link
Contributor

garrigue left a comment

The idea seems good, but the choice of Ctype.mcomp, and why it suffices, needs to be argued.

Btype.newgenty(Tconstr(Pident id, decl.type_params, ref Mnil))
in
let env = Env.add_type ~check:false id decl env in
Ctype.mcomp env orig_ty new_ty;

This comment has been minimized.

@garrigue

garrigue Dec 4, 2017

Contributor

Is really mcomp the function you want to call here?
It is extremely conservative (only finds conflicts between types of different shapes, ignoring all nominal aspects).
The advantage is that it depends only very weakly on the environment, in particular it handles all abstract and missing types as wild-cards.
Note that there is no explicit guarantee that it will discover incompatibilities: it works only on a best effort basis. Soundness requires to be very careful when adding incompatibilities.
So the question is whether it is really sufficient for what you intend.

If you're going to use this function, you should probably add a comment about that, since originally it was designed to be used only during GADT pattern unification.

This comment has been minimized.

@garrigue

garrigue Dec 6, 2017

Contributor

OK, let's get back to the piece of code this is supposed to remplace.
Originally, the problem is that you could end up trying to find the type of a constructor with get_constructor_type_path, just to discover that it is a pair.
Since a pair is structurally incompatible with a sum type, this will be correctly detected.

In general, assuming that id is defined in env when trying to enrich the type declaration (otherwise it doesn't make sense), and that it points to a sum type definition, what mcomp tells you is that its original definition (scraping all privates) can only be a sum type (i.e. expands to a type path), an abstract type (also extends to a type path), or a type variable. So only this last case would be problematic, but I'm not sure how to build a counter-example for that: would adding type 'a t = 'a do the trick? Note also that if env is not the same calling enrich_type_decl and get_constructor_type_path, there might be other discrepancies.

This comment has been minimized.

@lpw25

lpw25 Dec 6, 2017

Contributor

would adding type 'a t = 'a do the trick

This raises a good point. My reasoning that we are making recursive modules as safe as GADTs ignores the fact that recursive modules can add higher-kinded equations -- which GADTs don't support. Possibly we need to extend mcomp to handle them as accurately as it handles ordinary types.

@gasche gasche referenced this pull request Dec 5, 2017

Closed

Pr7661 again #1493

@lpw25

This comment has been minimized.

Copy link
Contributor

lpw25 commented Dec 5, 2017

Another way to look at this patch, is that it ensures that environments are no less consistent than they might be when using GADTs. mcomp has been used for a while now to prevent GADTs from introducing obviously incompatible equations (causing erros like Type int is not compatible with type string) and so it is quite reasonable to use it for the recursive modules case as well.

As for the check being conservative, it has to be fairly conservative. Consider the example:

module rec M : sig
  type s
  type t = Foo of s
  type r = s
end = struct
  type r
  type t = Foo of r
  type s = r
end

when considering if we can add the equation = M.t to the definition of t we must allow for the fact that M.s and r are compatible even though they are not equal in the environment where we are doing the check. It could perhaps consider nominal aspects for types outside of the recursive module, but I don't think that will provide any real additional safety.

@gasche

This comment has been minimized.

Copy link
Member

gasche commented Dec 6, 2017

There are other places in the compiler where I would like to have a function that tells me, given two fully-inferred types (after type-checking is finished), whether, in a given typing environment:

  • they are equal (have the same values)
  • they are different (have distinct values)
  • they may be equal (they could have the same value, and in particular start with the same type constructor)

Below is the code I started writing for this, also relying on mcomp, and below an adaptation of may_equal_constr (whether two extension constructors may be the same, reasoning on their arity and parameter types). It is not complete (doesn't even compile as-is, I think). Consider this message a feature request: could someone who knows better than I do write such a function?

let may_equal_type ~existentials env t1 t2 =
  let snap = Btype.snapshot () in
  (* the `rigidify` calls are inspired by the `matches` function *)
  let tyv1 = rigidify t1 in
  let tyv2 = rigidify t2 in
  try
    mcomp env t1 t2;
    let equal_modulo_existentials v1 v2 =
      (* this is inspired by `all_distinct_vars` *)
      let v1 = expand_head env v1 in
      let v2 = expand_head env v2 in
      (List.mem v1 existentials && List.mem v2 existentials)
      || (v1 == v2)
    in
    let check_vars v1 v2 =
      if not (equal_modulo_existentials v1 v2)
      then raise (Unify []) in
    List.iter2 check_vars tyv1 tyv2;
    undo_compress snap;
    true
  with Unify _ ->
    undo_compress snap;
    false

let may_equal_constr c1 c2 = match c1.cstr_tag, c2.cstr_tag with
  | Cstr_constant i1, Cstr_constant i2 -> i2 = i1
  | Cstr_block i1, Cstr_block i2 -> i2 = i1
  | Cstr_unboxed, Cstr_unboxed -> true
  | Cstr_extension (path1, b1), Cstr_extension (path2, b2) ->
      let equal = Path.same path1 path && b1 = b2 in
      let may_equal () =
        let existentials = c1.existentials in
        c1.cstr_arity = c2.cstr_arity
        && List.for_all2 (may_equal_type ~existentials)
             c1.cstr_args c2.cstr_args
      in equal || may_equal ()
  | (Cstr_constant _|Cstr_block _|Cstr_unboxed|Cstr_extension _), _ -> false

The constraints we would need for such a function is to always be correct in the choice it makes between the three categories :-)

@trefis trefis force-pushed the trefis:enrich_typedecl branch from c08cc7a to 6a6de8e Jan 12, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

trefis commented Jan 12, 2018

would adding type 'a t = 'a do the trick

Yes indeed!

The new implementation ( in commit fc069f3 ) is not much different: we just replace all the Tvars by some univars.
E.g. instead of checking whether 'a t and 'b M.t may be compatible, we now check that 'a. 'a t and 'a. 'a M.t may be compatible.

The function doing the replacement is called reify_univars because it seems semantically closer to reify than rigidify (in the sense that it replaces Tvars by something else), but I'm not particularly fond of the name (and also, the implementation is actually closer to rigidify).
Note that contrarily to both reify and rigidify, this function doesn't do anything smart with row variables: mcomp_row doesn't care whether the rows it compares are fixed or not.

I have also added some tests to this PR.

@trefis trefis requested a review from garrigue Feb 6, 2018

@garrigue
Copy link
Contributor

garrigue left a comment

Except fo a small level problem, this fix seems to do what is announced.

match ty.desc with
| Tvar name ->
save_desc ty ty.desc;
let t = newty (Tunivar name) in

This comment has been minimized.

@garrigue

garrigue Feb 7, 2018

Contributor

Shouldn't it be newty2 ty.level ?
Otherwise you might break some invariant.

This comment has been minimized.

@trefis

trefis Feb 7, 2018

Author Contributor

You’re right, it should be.
I will fix that.

@trefis trefis force-pushed the trefis:enrich_typedecl branch from 6a6de8e to b47295c Feb 7, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

trefis commented Feb 7, 2018

I fixed the issue mentioned above, rebased, and updated the Changes entry.
I think this might be good to go.

@garrigue
Copy link
Contributor

garrigue left a comment

The code seems fine too.
I just suggested adding a comment on reify_univars

unmark_type ty;
let ty = copy ~env ty in
cleanup_types ();
newty2 ty.level (Tpoly(repr ty, !vars))

This comment has been minimized.

@garrigue

garrigue Feb 8, 2018

Contributor

It took me a while to realize that the ty.level here is about the fresh copy...
I would suggest adding a comment at the beginning of the function that the argument to reify_univars is supposed to be at generic_level (i.e. it comes from a generalized type declaration). Otherwise this code would not work. Under this assumption the last newty2 could be just newty, at the level on ty will be current_level anyway (not essential).

@trefis trefis force-pushed the trefis:enrich_typedecl branch from b47295c to 085306b Feb 8, 2018

@trefis

This comment has been minimized.

Copy link
Contributor Author

trefis commented Feb 8, 2018

Comment added.
Thank you for the review!

@trefis trefis merged commit b00ca46 into ocaml:trunk Feb 8, 2018

0 of 2 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details

@trefis trefis deleted the trefis:enrich_typedecl branch Feb 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.