Skip to content

recmod: check consistency before subtyping (#12959)#13055

Merged
gasche merged 5 commits intoocaml:trunkfrom
Octachron:includemod_recmod_consistency
Mar 29, 2024
Merged

recmod: check consistency before subtyping (#12959)#13055
gasche merged 5 commits intoocaml:trunkfrom
Octachron:includemod_recmod_consistency

Conversation

@Octachron
Copy link
Copy Markdown
Member

@Octachron Octachron commented Mar 28, 2024

This PR proposes to fix the internal error reported in #12959 which happens typing recursive modules where the implementation and the declared module type have inconsistent views on type definitions:

module rec A : sig val c : 'a B.t -> unit end = struct
  let c _ = ()
end
and B : sig type 'a t end = struct
  type t
end

Fatal error: exception Invalid_argument("List.iter2")

This internal error is triggered because in the last stage of the typechecking of recursive modules, we check that we can derive a subtyping relationship between the inferred module types and the explicit module types.
And in order to do so, we extend the type of the modules in the recursive module types to their inferred types, before checking for inclusion. However, the core type system assumes that type definitions are consistent, and when it encounters the signature:

sig val c: 'a B.t -> unit end

in an environment where B has type

sig type t end

it fails immediately.

This PR proposes to fix this issue by adding a consistency step just before the last stage of the typechecking of recursive module bindings:

    module rec M_1: Decl_1 = Impl_1
    and M_2 : Decl_2 = Impl_2
    ...

This new test checks that the actual types Actual_i are consistent with the declared types Decl_i before trying to establish than M_1:Actual_1, ... , M_n:Actual_n <: M_1:Decl_1, ..., M_n:Decl_n

This step uses a very coarse-grained consistency relation C defined like <: except that inclusion for type declarations and value declarations are replaced by a consistency test, and all declarations of classes, class types and extensions constructors are seen as consistent.

For instance,

  • sig type -'a t end C sig type 'a t = A end
  • sig external x: float ="try" end C sig val x: int end

( C is an equivalence on types ), but we have sig type 'a t endsig type 'a t end sig type t end

This step ensures that it is safe to add the equations M_i = Impl_i to the environment.

And with this precaution added, the internal error above is replaced by a normal error message (which could be improved):

module rec A : sig val c : 'a B.t -> unit end = struct
  let c _ = ()
end
and B : sig type 'a t end = struct
  type t
end
Error: Modules do not match:
        sig type t end
      is not included in
        sig type 'a t end
      Type declarations do not match: type t is not included in type 'a t
      They have different arities.

This commit adds a consistency step just before the last stage of the
typechecking of recursive module bindings:

    module rec M_1: Decl_1 = Impl_1
    and M_2 : Decl_2 = Impl_2
    ...

This new test check that the actual type Actual_i are consistent with
the declared types Decl_i before trying to establish than

   Actual_1, ... , Actual_n <: Decl_1, ..., Decl_n

This step uses a very coarse-grained consistency relation `C`
defined like `<:` except that inclusion for type declarations
and value declarations are replaced by a consistency test, and
all declarations of classes, class types and extensions constructors
are seen as consistent.

For instance,

   sig type -'a t end C sig type 'a t = A end
   sig val x: int end C sig external x: float ="try" end

( `C` is an equivalence on types ), but

  sig type 'a t end ₡ sig type 'a t end

This step ensures that it is safe to add the equation `M_1` = `Impl_1` to
the environment. Without this step, the core system typechecker may fail
with an internal error whenever type declarations are inconsistent
between an implementation and its declared module type.
Copy link
Copy Markdown
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a typo in your description, in the example of two signatures that are not in the consistency relation.

The approach looks reasonable to me and I believe that the code is correct. See various comments.

It is not obvious to me that passing a constant ~core parameter all over the place is preferable to using a functor on these four functions, but I am okay with that choice.

end

type 'a core_incl0 = 'a -> 'a -> (module_coercion, Error.sigitem_symptom) result
type 'a core_incl =
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect a comment that explains what this is about.

(My understanding: we can instantiate the module-level subtyping check with different notion of what it means for "core constructs" (term-level constructs, not module-level constructs) to be compatible, and in particular one instance checks full subtyping and another instance just check a form of compatibility of arities to guarantee well-formedness.)


exception Dont_match of value_mismatch

let value_descriptions_consistency env vd1 vd2 =
Copy link
Copy Markdown
Member

@gasche gasche Mar 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you explain somewhere what "consistency" means, so that we can understand the intended difference between the foo and the foo_consistency functions?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a tentative definition of consistency

(** The functions [value_descriptions_consistency] and
[type_declarations_consistency] check if two declaration are consistent.
Declarations are consistent when there exists an environment such that the
first declaration is a subtype of the second one.

Notably, if a type declaration [td1] is consistent with [td2] then a type
expression [te] which is well-formed with the [td2] declaration in scope
is still well-formed with the [td1] declaration: [E, td2 |- te] => [E, td1
|- te]. *)

with the idea that the property which is needed is the second one.

class_declarations: Env.t -> Subst.t -> Types.class_declaration core_incl0;
class_type_declarations:
loc:Location.t -> Env.t -> Subst.t ->
Types.class_type_declaration core_incl0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is slightly weird that these methods have different signatures -- some of them need a location, but class_declarations does not, and the class ones do not use the mark parameter. Another choice would be to use the same core_incl type everywhere and then ignore the extra parameters when instantiating the record with the actual functions.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that having an uniform type is nice; I have decided to adapt the definition of the functions rather than have a closure adapter.

| Sig_value(id1, valdecl1, _) ,Sig_value(_id2, valdecl2, _) ->
let item =
value_descriptions ~loc env ~mark subst id1 valdecl1 valdecl2
core.value_descriptions ~loc env ~mark subst id1
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fond of the fact that, at this point, there is a value_descriptions function in scope, so that the code would work correctly if you forget to get the core version in scope. I think that it would be nicer if the value_descriptions function was renamed into value_descriptions_inclusion or value_descriptions_subtyping to avoid this.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed this issue by grouping the inclusion function wrappers in their own Core_inclusion module.

@gasche
Copy link
Copy Markdown
Member

gasche commented Mar 28, 2024

I would be reassured if someone knowledgeable, for example @garrigue or @lpw25, would be willing to validate the high-level design, that checked consistency as a first layer of defense is the right approach. Assuming that it is, I believe that a review for correctness is fairly easy (I have mostly done it already) because the code is fairly simple, in particular it nicely avoids adding redundancy by factoring the code common to the two checks.

@garrigue
Copy link
Copy Markdown
Contributor

Actually the expert here is @xavierleroy rather than me, since I don't remember seriously modifying the recursive module code.
This said, @Octachron 's analysis sounds correct, and adding a consistency check, that is just checking arity and primitive declarations, cannot break anything.
The introduction of object-oriented code is always a bit disturbing, but here it seems the best way to keep the changes minimal. Hopefully this is local enough that there is no ambiguity.
It would still be nice to give a definition of what consistency means somewhere.

@gasche
Copy link
Copy Markdown
Member

gasche commented Mar 29, 2024

(To be clear, there are no actual objects in this PR, just records of functions passed around.)

I'll take this as a design approval, thanks @garrigue! I'm happy to finish reviewing and approving the PR once the feedback has been taken into account -- it also needs a Changes entry.

@gasche gasche merged commit ade8c88 into ocaml:trunk Mar 29, 2024
@gasche
Copy link
Copy Markdown
Member

gasche commented Mar 29, 2024

Merged, thanks!

@xavierleroy
Copy link
Copy Markdown
Contributor

For the record: we (@Octachron and I) discussed briefly an alternative approach where the type-checker code would be hardened against arity mismatches in type constructors like this one, reporting proper errors instead of raising Invalid_argument. But this looks like a lot more work than the proposed fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants