Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct mutually recursive types are rejected by the type-checker #8511

Closed
kit-ty-kate opened this issue Mar 17, 2019 · 20 comments
Closed

Correct mutually recursive types are rejected by the type-checker #8511

kit-ty-kate opened this issue Mar 17, 2019 · 20 comments

Comments

@kit-ty-kate
Copy link
Member

Consider the two following mutually recursive types:

type p = [`A | q] and q = [`B of p]
type p = < x : int; q> and q = <y : p>

Both type definitions should be correct, however, they both are rejected by the type-checker with the following error message:

Error: The type constructor q is not yet completely defined

Both look valid in a sense that if you manually inline the definition of q in p, it's accepted by the compiler. You can also define q first with a type parameter and this works too. e.g.:

type 'a q = [`B of 'a]
type p = [`A | p q]

But both solutions are not practical with more complex types and there shouldn't be any reasons to reject the first definition.
At the MirageOS retreat where we bumped into this issue, @gasche pointed out that fixing this issue requires a static analysis to be done to eliminate infinite recursions cases like:

type p = [`A | p]
type p = [`A | q] and q = [`B | p]
@gasche
Copy link
Member

gasche commented Mar 17, 2019

Indeed, I think this should be reasonably straightforward, but yet another static information about types to keep around in signatures. A closed polymorphic variant type (or, dually, an object type) is "grounded" when we know the full list of possible variants constructors (or dually, its methods) -- an arbitrary closed type is grounded when we know its head type connective. A parametrized type has a "grounded status" that depends on its parameter grounding modes: type 'a t = [ `Foo of a ] is any -> grounded, but type 'a t = 'a (or type 'r t = [ `Foo | 'r ], if this existed) is grounded -> grounded. We can perform a fixpoint iteration to determine, in a set of mutually-recursive type declarations, which types are grounded, and we should reject declarations which cannot be determined to be grounded in this way.

@lpw25
Copy link
Contributor

lpw25 commented Mar 17, 2019

You wouldn't need to keep the information in signatures. Once the type has finished being defined the information is no longer needed -- you can just expand the type away when analysing later type definitions.

It also doesn't really make sense to keep the information in signatures because it wouldn't make sense to abstract the type but keep the information (unlike variance or immediateness) -- it is not sufficient to know that a type is "grounded" you need to know what closed polymorphic variant type it expands to.

@gasche
Copy link
Member

gasche commented Mar 17, 2019

Keeping the information in signatures is nicer / more modular / more efficient than expanding at use-site.

@lpw25
Copy link
Contributor

lpw25 commented Mar 17, 2019

It's not more modular because, as I said, the information cannot be usefully abstracted, and it's not more efficient because you need to do the expansion anyway to find out the underlying polymorphic variant. It will also be less accurate because the information is not stable under substitution (also true of variance which should really be fixed at some point).

It would be better to keep this analysis self-contained as part of handling type declarations rather than pushing it into the rest of the code and requiring it to be maintained everywhere. This is solely an issue with recursive definitions and it should be dealt with by the code that handles recursive definitions.

@trefis trefis added the typing label Mar 18, 2019
@garrigue
Copy link
Contributor

There is a good reason why both definitions are rejected: in both cases p cannot be defined without first knowing the definition of q.
In the past, there has been suggestions that we should allow that in the reverse order (if q depends on p but not conversely), however this was rejected for the sake of symmetry.

I agree that one could do some extra work to find a dependency order allowing the definition, but in general this is not the policy of the OCaml type system, and changing this policy would mean doing more work in many places.
Could you describe your use case, and why the workaround you describe doesn't scale?
(For instance, if your problem is with a large number of mutually recursive definitions, using a constrained type parameter lets you close the recursion with a single parameter.)

@kit-ty-kate
Copy link
Member Author

My use case was this one: https://github.com/kit-ty-kate/ocaml-activitypub/blob/f64965dfcb0d480e619341b865fef17a75888bdf/lib/activitystreams.mli

It's a bunch of mutually recursive types here to encode the Activitystreams spec. In this example, inlining is not an option (the types are too big) and using a type parameter is not an option either because there are too many mutually recursive types to encode.

@gasche
Copy link
Member

gasche commented Mar 18, 2019

I agree that one could do some extra work to find a dependency order allowing the definition, but in general this is not the policy of the OCaml type system, and changing this policy would mean doing more work in many places.

Well variance checking works in this way. Personally I would find it natural that we do the right thing here and accept definitions when they do make sense. (We could look at the "other places" as the need arises.)

Jacques wrote:

(For instance, if your problem is with a large number of mutually recursive definitions, using a constrained type parameter lets you close the recursion with a single parameter.)

Kate wrote:

using a type parameter is not an option either because there are too many mutually recursive types to encode.

@kit-ty-kate, the encoding that Jacques mentions is what I called "type-level records" during our discussion:

type 'dict foo = [
  `Foo of 'foo
  `Bar of 'bar
] constraint 'dict = < foo : 'foo ; bar : 'bar >

The constraint lets you transport many type parameters (here 2) in a single type. (You can do it with tuples, or functions, whatever, but I like using object types to give robust names to the parameters, which show up properly even in inferred types.)

@garrigue
Copy link
Contributor

Here is a workaround in your case:

type ('a, 'b) either = Left of 'a | Right of 'b
type ('a, 'b, 'c, 'd) either4 = In1 of 'a | In2 of 'b | In3 of 'c | In4 of 'd
type anyURI
type dateTime and duration and nonNegativeInteger and units
;;

module type S = sig
module T : sig
  type t and ty_object and ty_collection and ty_image
  type ap_activity and ap_intransitiveActivity and ap_collectionPage
  and ap_orderedCollectionPage and ap_collection and ap_question
  and ap_relationship and ap_place and ap_profile and ap_tombstone
  and ap_orderedCollection
end
open T
type  ty_link = [
  (* Core Types *)
  | `Link of ap_link
  (* Link Types *)
  | `Mention of ap_link
]

and ty_collectionPage = [
  | `CollectionPage of ap_collectionPage
  | `OrderedCollectionPage of ap_orderedCollectionPage
]

and ty_image

and ap_object = <
  id : prop_id; (* Not defined ??! *)
  type_ : prop_type; (* Not defined ??! *)
  attachment : prop_attachment;
  attributedTo : prop_attributedTo;
  audience : prop_audience;
  content : prop_content;
  context : prop_context;
  name : prop_name;
  endTime : prop_endTime;
  generator : prop_generator;
  icon : prop_icon;
  image : prop_image;
  inReplyTo : prop_inReplyTo;
  location : prop_location;
  preview : prop_preview;
  published : prop_published;
  replies : prop_replies;
  startTime : prop_startTime;
  summary : prop_summary;
  tag : prop_tag;
  updated : prop_updated;
  url : prop_url;
  to_ : prop_to;
  bto : prop_bto;
  cc : prop_cc;
  bcc : prop_bcc;
  mediaType : prop_mediaType;
  duration : prop_duration;
>

and ap_link = <
  id : prop_id; (* Not defined ??! *)
  type_ : prop_type; (* Not defined ??! *)
  href : prop_href;
  rel : prop_rel;
  mediaType : prop_mediaType;
  name : prop_name;
  hreflang : prop_hreflang;
  height : prop_height;
  width : prop_width;
  preview : prop_preview;
>

and prop_id = anyURI option
and prop_type = anyURI list
and prop_actor = (ty_object, ty_link) either list list
and prop_attachment = (ty_object, ty_link) either list list
and prop_attributedTo = (ty_object, ty_link) either list list
and prop_audience = (ty_object, ty_link) either list list
and prop_bcc = (ty_object, ty_link) either list list
and prop_bto = (ty_object, ty_link) either list list
and prop_cc = (ty_object, ty_link) either list list
and prop_context = (ty_object, ty_link) either list list
and prop_current = (ty_collectionPage, ty_link) either option
and prop_first = (ty_collectionPage, ty_link) either option
and prop_generator = (ty_object, ty_link) either list list
and prop_icon = (ty_image, ty_link) either list list
and prop_image = (ty_image, ty_link) either list list
and prop_inReplyTo = (ty_object, ty_link) either list list
and prop_instrument = (ty_object, ty_link) either list list
and prop_last = (ty_collectionPage, ty_link) either option
and prop_location = (ty_object, ty_link) either list list
and prop_items = (ty_object, ty_link) either list list
and prop_oneOf = (ty_object, ty_link) either list list
and prop_anyOf = (ty_object, ty_link) either list list
and prop_closed = (ty_object, ty_link, dateTime, bool) either4 list list
and prop_origin = (ty_object, ty_link) either list list
and prop_next = (ty_collectionPage, ty_link) either option
and prop_object = (ty_object, ty_link) either list list
and prop_prev = (ty_collectionPage, ty_link) either option
and prop_preview = (ty_object, ty_link) either list list
and prop_result = (ty_object, ty_link) either list list
and prop_replies = ty_collection option
and prop_tag = (ty_object, ty_link) either list list
and prop_target = (ty_object, ty_link) either list list
and prop_to = (ty_object, ty_link) either list list
and prop_url = (anyURI, ty_link) either list list
and prop_accuracy = float option
and prop_altitude = float option
and prop_content = string list list
and prop_name = string list list
and prop_duration = duration option
and prop_height = nonNegativeInteger option
and prop_href = anyURI option
and prop_hreflang = string option (* TODO: not equivalent to [BCP47] Language Tag *)
and prop_partOf = (ty_collection, ty_link) either option
and prop_latitude = float option
and prop_longitude = float option
and prop_mediaType = string option (* TODO: not equivalent to MIME Media Type *)
and prop_endTime = dateTime option
and prop_published = dateTime option
and prop_startTime = dateTime option
and prop_radius = float option
and prop_rel = string list list (* TODO: not equivalent to [RFC5988] or [HTML5] Link Relation *)
and prop_startIndex = nonNegativeInteger option
and prop_summary = string list list
and prop_totalItems = nonNegativeInteger option
and prop_units = (units, anyURI) either option
and prop_updated = dateTime option
and prop_width = nonNegativeInteger option
and prop_subject = (ty_object, ty_link) either option
and prop_relationship = ty_object list list
and prop_describes = ty_object option
and prop_formerType = ty_object list list
and prop_deleted = dateTime option

type ty_collection = [
  | `Collection of ap_collection
  | `OrderedCollection of ap_collection
  | ty_collectionPage
]

type ty_object = [
  (* Core Types *)
  | `Object of ap_object
  | `Activity of ap_activity
  | `IntransitiveActivity of ap_intransitiveActivity
  | ty_collection
  (* Activity Types *)
  | `Accept of ap_activity
  | `TentativeAccept of ap_activity
  | `Add of ap_activity
  | `Arrive of ap_intransitiveActivity
  | `Create of ap_activity
  | `Delete of ap_activity
  | `Follow of ap_activity
  | `Ignore of ap_activity
  | `Join of ap_activity
  | `Leave of ap_activity
  | `Like of ap_activity
  | `Offer of ap_activity
  | `Invite of ap_activity
  | `Reject of ap_activity
  | `TentativeReject of ap_activity
  | `Remove of ap_activity
  | `Undo of ap_activity
  | `Update of ap_activity
  | `View of ap_activity
  | `Listen of ap_activity
  | `Read of ap_activity
  | `Move of ap_activity
  | `Travel of ap_intransitiveActivity
  | `Announce of ap_activity
  | `Block of ap_activity
  | `Flag of ap_activity
  | `Dislike of ap_activity
  | `Question of ap_question
  (* Actor Types *)
  | `Application of ap_object
  | `Group of ap_object
  | `Organization of ap_object
  | `Person of ap_object
  | `Service of ap_object
  (* Object Types *)
  | `Relationship of ap_relationship
  | `Article of ap_object
  | `Document of ap_object
  | `Audio of ap_object
  | `Image of ap_object
  | `Video of ap_object
  | `Note of ap_object
  | `Page of ap_object
  | `Event of ap_object
  | `Place of ap_place
  | `Profile of ap_profile
  | `Tombstone of ap_tombstone
]

type t = [
  | ty_object
  | ty_link
]

type ap_activity = <
  ap_object;
  actor : prop_actor;
  object_ : prop_object;
  target : prop_target;
  result : prop_result;
  origin : prop_origin;
  instrument : prop_instrument;
>

type ap_intransitiveActivity = <
  ap_object;
  actor : prop_actor;
  target : prop_target;
  result : prop_result;
  origin : prop_origin;
  instrument : prop_instrument;
>

type ap_collection = <
  ap_object;
  totalItems : prop_totalItems;
  current : prop_current;
  first : prop_first;
  last : prop_last;
  items : prop_items;
>

type ap_orderedCollection = ap_collection

type ap_collectionPage = <
  ap_collection;
  partOf : prop_partOf;
  next : prop_next;
  prev : prop_prev;
>

type ap_orderedCollectionPage = <
  ap_orderedCollection;
  ap_collectionPage;
  startIndex : prop_startIndex;
>

type ap_question = <
  ap_intransitiveActivity;
  oneOf : prop_oneOf;
  anyOf : prop_anyOf;
  closed : prop_closed;
>

type ap_relationship = <
  ap_object;
  subject : prop_subject;
  object_ : prop_object;
  relationship : prop_relationship;
>

type ap_place = <
  ap_object;
  accuracy : prop_accuracy;
  altitude : prop_altitude;
  latitude : prop_latitude;
  longitude : prop_longitude;
  radius : prop_radius;
  units : prop_units;
>

type ap_profile = <
  ap_object;
  describes : prop_describes;
>

type ap_tombstone = <
  ap_object;
  formerType : prop_formerType;
  deleted : prop_deleted;
>
end;;

module rec Types : S with module T := Types = Types;;

@garrigue
Copy link
Contributor

@gasche Variance works that way because it has to: even for a single type definition, you have to compute a fixpoint. The code for mutually recursive definitions is just a generalization of that.

For variant and object inheritance, the problem is very different: for them to be well-defined, there must be a correct definition order, and this reflects the way value definitions work in OCaml (contrary to Haskell). So you just have to respect this order.
By the way, the natural way to solve that would not be a fixpoint, but to compute this order.

Another option could be to revert past decisions, and decide to allow inheritance inside a mutual recursion, like for classes. One would still have to reorder the definitions, but no workaround would be need.

@github-actions
Copy link

github-actions bot commented May 6, 2020

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label May 6, 2020
@kit-ty-kate
Copy link
Member Author

This feature-wish is still relevant

@garrigue
Copy link
Contributor

As I said in my last message in the discussion, there is a potential solution of doing the same as for class inheritance, i.e. allow to access previous type definitions inside mutual recursive blocks. This is simpler that computing an order: you still need to write your definitions in the right order, but they can be mutually recursive.
An argument for this solution is that we do not need the full type definition, but only to be able to copy the cases we are inheriting, in a more or less textual way.
I remember that Xavier was not really happy with that approach, so I kept to just rejecting this kind of definitions, but it would be coherent with classes.

@github-actions github-actions bot removed the Stale label Jun 10, 2020
@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@kit-ty-kate
Copy link
Member Author

could this be reopened?

@gasche gasche removed the Stale label Jul 12, 2021
@gasche gasche reopened this Jul 12, 2021
@github-actions
Copy link

This issue has been open one year with no activity. Consequently, it is being marked with the "stale" label. What this means is that the issue will be automatically closed in 30 days unless more comments are added or the "stale" label is removed. Comments that provide new information on the issue are especially welcome: is it still reproducible? did it appear in other contexts? how critical is it? etc.

@github-actions github-actions bot added the Stale label Jul 15, 2022
@gasche gasche removed the Stale label Jul 15, 2022
@JoanThibault
Copy link
Contributor

Sorry, to reopen this issue, but I am also personally interested in its resolution.

I know it is not ideal, but could we have a [@ i know this type is circular, please ignore] that we could put next to the problematic type definition ?

@JoanThibault
Copy link
Contributor

JoanThibault commented Jan 4, 2023

My problem occur on this example :

  type ('forward, 'backward, 'repeat, 'raise, 'stop) walker_command =
    | Forward  of 'forward (* used to propagate computation forward *)
    | Backward of 'backward (* used to propagate result backward *)
    | Repeat   of 'repeat  (* ask the same query again (likely assume some internal change in the controler) *)
    | Stop     of 'stop    (* similar to [Raise] intterupts the walk and returns the corresponding value *)
    | Raise    of 'raise   (* similar to [Stop] but also return a "thread" to resume the computation later *)

  type ('param, 'result, 'raise, 'stop) walker_internal =
    (
      (* forward  *) ('param, 'result, 'raise, 'stop) walker_callback * (int * 'param) array,
      (* backward *) 'result,
      (* repeat   *) unit,
      (* stop     *) 'stop,
      (* raise    *) 'raise
    ) walker_command
  and  ('param, 'result, 'raise, 'stop) walker_callback =
    (int * 'result) array -> ('param, 'result, 'raise, 'stop) walker_internal

I know I could inline (and it is probably what I will do in the meantime) but it requires to write more boilerplate code, and it breaks genericity.

In this case, I can also introduce unique constructor :

  type ('param, 'result, 'raise, 'stop) walker_internal =
    WalkerInternal of (
      (* forward  *) ('param, 'result, 'raise, 'stop) walker_callback * (int * 'param) array,
      (* backward *) 'result,
      (* repeat   *) unit,
      (* stop     *) 'stop,
      (* raise    *) 'raise
    ) walker_command
  and  ('param, 'result, 'raise, 'stop) walker_callback =
    (int * 'result) array -> ('param, 'result, 'raise, 'stop) walker_internal

@gasche
Copy link
Member

gasche commented Jan 5, 2023

@JoanThibault I don't think that your example concerns the same issue, you are talking about something different. @kit-ty-kate wants to build equi-recursive types that are allowed by OCaml, because they go through object types or polymorphic variants, but some ways to define them are rejected and some are accepted. Your example is trying to build a recursive type that is not allowed in OCaml unless -rectypes is passed, so it makes perfect sense that it is rejected. Yes, you have to "break" the recursive cycle by using an extra datatype constructor -- you might want to use the [@@unboxed] representation to make this a performance-free move.

(Unrelated note: there are various typos in your documentation, "controler", "intterupts", and "but also return" should have "returns".)

@gasche
Copy link
Member

gasche commented Jan 5, 2023

I know it is not ideal, but could we have a [@ i know this type is circular, please ignore] that we could put next to the problematic type definition ?

We cannot do this, because later the type-checker would risk blowing up (through a hard error or by non-termination) on such types that do not respect its well-formedness expectations. @garrigue has actually worked on type theories that tolerate ill-founded recursive definitions, and one might possibly harden the OCaml type-checker in this style, but this is a major undertaking that I don't think anyone is planning to work on, there are more pressing issues with the type checker.

@JoanThibault
Copy link
Contributor

@gasche thank you for the explanations (and for spotting the typos) 😄

@ocaml ocaml locked and limited conversation to collaborators Jan 26, 2023
@nojb nojb converted this issue into discussion #11944 Jan 26, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Projects
None yet
Development

No branches or pull requests

6 participants