Skip to content

Rework package constraint checking to improve interaction with immediacy #12930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Feb 22, 2024

Conversation

ccasin
Copy link
Contributor

@ccasin ccasin commented Jan 23, 2024

The goal of this PR is to fix two bugs related to the interaction of package with type constraints and immediacy. One of the bugs is #12924. The other is that if you use a package with type constraint to add an immediate manifest to an abstract type declaration, you can not use the fact that the resulting declaration is immediate (an example is added as a test).

This PR fixes these issues by reworking the typechecking of package with type constraints. Package constraints are different than normal signature constraints. They are much simpler in most ways (e.g., they do not work on type declarations with parameters or manifests and do not support destructive substitution). But they are slightly more complicated in one way, in that they they may make use of the current type variable environment while normal signature constraints may not. For example:

module type S = sig
  type t
end

type 'a m = (module S with type t = 'a)

The existing implementation tries to paper over these differences by constructing a fake, malformed bit of parse tree and relying on the existing checking for normal signature constraints (explained in more detail below). I think the new implementation here, which directly checks the package constraint rather than re-using the normal signature constraint checking, is simpler and clearer. But I am happy to hear other opinions. Also, I am not an expert on first class modules, so it is possible I have missed something.

I've broken this into several commits, which can be read independently.

  • The first couple make minor related fixes I noticed while working on this.
  • The third adds tests documenting the current behavior of package constraints, including its flaws, some of which are fixed in subsequent commits.
  • The fourth does the bulk of the rework, fixing Package with type constraints do not work on @@immediate types #12924.
  • The fifth does a small bit of cleanup in Typedecl, now that transl_with_constraint no longer needs to check for the weird fake package constriants.
  • The sixth fixes the related second bug in unpacking.

Reviewers can stop reading here and go look at the code, if they like. The rest of this is just text describing my understanding of how things used to work and how they work after the PR, which may or may not be helpful.

How package constraints work before this PR

Package type expressions are handled by the Ptyp_package case of Typetexp.transl_type_aux. This function relies on a helper function create_package_mty to construct a fake (non-first-class) parsetree module type that vaguely resembles the constrained package. This fake module type has with constraints, but they have no manifests - a situation which can never arise from the actual parser. So a package type like (module S with type t = int) results in a fake module type that looks something like S with type t.

We then translate this module type. Normal module with type constraints are handled by Typemod.merge_constraint, which translates the constraint to a type decl using Typedecl.transl_with_constraint and then checking that this type decl is included in the constrained type decl from the original signature. transl_with_constraint has special case logic to handle these declarations with no manifests, since normal with type constraints must have them. Finally these new type decls are merged into the original signature, which has no effect in the case of package constraints (because they have no manifests) except to update some locations.

Finally, we separately translate the types from the package constraints, and build a Tpackage type including them. Here we are crucially relying on the fact that the fake package constraint checked successfully with no manifests. This implies that the constrained type in the original signature is fully abstract, and so any constraint is fine - no need to compare the type in the constraint against the original signature.

Can't we just fix this by adding manifests to the fake module type built by create_package_mty?

No. There are two problems.

First, the existing implementation of transl_with_constraint in Typedecl uses the absence of a manifest to detect and provide a bit of special case logic for the "fake" with type constriants used in package checking. See #6293.

Second, as shown in the example near the top of this PR, package constraints may use the current type variable environment in a way that normal module with type constraints may not. Package constraint bodies that make use of the environment don't typecheck properly as normal signature constraints.

New approach

The new approach proposed by this PR is to write a much simplified version of with type constraint checking that is specific to package constraints, rather than constructing a fake module type and trying to check package constraints as module type constraints. create_package_mty is eliminated, and there are new functions Typedecl.transl_package_constraint and Typemod.merge_package_constraint that mirror Typedecl.transl_with_constraint and Typemod.merge_constraint respectively, but are much simpler because package constraints are always simple non-destructive with type constraints.

It would be possible to go even farther in this direction. E.g., Includecore.type_declarations does a lot of work that is not needed for decls arising from package constriants. This felt like the best middle ground to me - clarifying how package constraints work without duplicating too much code.

Copy link
Contributor

@goldfirere goldfirere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reviewed this patch.

  • I don't have deep perspective on the last (very short!) commit; others may be better off taking a quick look. It looks correct to me.
  • The rest of this looks correct. I don't have a great sense as to whether the approach taken here is the best possible approach, but the code looks correct. I have a few questions about design.

If no one else comes along ready to review this soon, I think it's fine to merge this. There may indeed be improvements to be found, but those can be made in a later PR.

let sg =
List.fold_left
(fun sg (lid, cty) ->
snd (merge_package_constraint env loc sg lid cty))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just call merge_constraint here? That's definitely overkill, but it seems better than duplicating code. I don't see what merge_package_constraint does that's better than merge_constraint.

Copy link
Contributor Author

@ccasin ccasin Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this works - trying to use merge_constraint for both things is the source of the problems fixed in this PR.

The key difference is that merge_package_constraint takes the constraint as an already-translated bit of typedtree, while merge_constraint takes it as a not-yet-translated parsetree type decl. That type decl is subsequently typechecked in the wrong type variable environment to support package constriants, which is why the old version used the fake malformed bit of parsetree described in the PR description above. I tried to explain there why obvious small tweaks to that approach don't work - maybe that makes more sense after this answer, but if not let me know and I'll try again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had missed the difference between the parsetree and the typedtree here -- an important distinction.

But I'm still not convinced we have to repeat all this code. You encountered two problems, but I think both can be addressed without duplicating the code:

  1. We must make sure that the type in the original signature is abstract. (Why? I'm not sure. But let's not try to fix that today.)
  2. We must allow variables to be used from the environment.

It seems relatively easy to accommodate both (1) and (2) with an extra parameter passed to merge_constraint, saying whether we're in a package type. For (1), we just do an extra check. For (2), we skip the TyVarEnv.reset() in transl_with_constraint. If that runs into trouble, then maybe merge_constraint decides between calling transl_with_constraint and the new transl_package_constraint (which I'm not bothered by, because it's so short).

It's possible there will be more obstacles on this path. I'm just concerned by duplicating so much code here. Also happy to chat about this live this afternoon, say.

Copy link
Contributor Author

@ccasin ccasin Feb 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fairly strongly disagree. Maybe that would work, maybe it would not. (In practice I did try deleting the TyVarEnv.reset call you mention and that's not enough to get the correct tyvarenv. I did not deeply investigate why, but this contributes to my view that the old approach is very confusing and is best to ditch.)

But even if we could easily additionally parameterize merge_constraint, I think the approach I've proposed here is much better! I believe this PR considerably clarifies how package with constraints work. The old approach, of constructing a fake bit of parse tree (Bad!) and having a bunch of subtle special cases, was pretty confusing. It took me quite a while to work out what is allowed and what is not allowed for package constriants. In the new version, that's it's own function and you can just read it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the aim is to merge the merge_constraint function, an option that could probably work would be to add a With_type_package variant to the with_info type which takes a typed type declaration rather than a parsetree fragment. Then merge_package_constraint could be:

let merge_package_constraint initial_env loc sg lid cty =
  let _, s = merge_constraint initial_env loc sg lid (With_package_type cty) in
  s

(A dirty implentation is visible at 7c0ebd3)

However, I have mixed feeling about this idea: yes, the complex code of merge_constraint is shared, but this code is complex because it is mixing multiple functions with a small common spine. Thus I am not completely sure if this version is simpler than the version which is extracting the simple package type case from the chimeric aggregated function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the aim is to merge the merge_constraint function, an option that could probably work would be to add a With_type_package variant to the with_info type which takes a typed type declaration rather than a parsetree fragment.

I like this idea! It nicely isolates the behavior of package with type constraints so it's less spread out in special cases of merge_constraint and transl_with_constraint, and eliminates the need to construct fake parse tree. I will give it a try.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have pushed a commit implementing @Octachron's idea.

Compared with his draft, I made a few changes:

  1. I made a separate case in merge_constraint for With_type_package, rather than combining it into the With_type/With_typesubst case. I think this is clearer, even if it results in a couple duplicated lines.
  2. I changed merge_constraint to only conditionally return the typed tree with_constraint (by returning an option). This constraint was unused and slightly nonsense in the With_type_package constraint case, so this seems clearer.
  3. I kept the requirement that package with constraints only work on type declarations without manifests. A nice thing about his version is that we could potentially allow updating manifests with compatible definitions, but I believe this will require more thought around row variables (or at least more work to get good error messages in those cases), so I propose to leave it as future work. I have added a few more tests of this type. (Note this PR preserves today's behavior, it just gives it a clearer error message.)

I agree with @Octachron that there are trade offs here - the old version was maybe a little clearer by having two separate functions rather than combining them into one just to avoid duplicating the "common spine". But I think this approach is probably the best compromise, eliminating the need to build fake parsetree without duplicating too much code.

@ccasin ccasin force-pushed the package-with-type branch from 4840cb7 to 99da2c8 Compare February 8, 2024 00:51
@ccasin
Copy link
Contributor Author

ccasin commented Feb 8, 2024

I've rebased this and added a changes entry. There is a failure on the macos build that I believe is spurious, but I don't think I have permissions to restart it. Can someone do that?

@gasche
Copy link
Member

gasche commented Feb 8, 2024

(I tried to "re-run" the OSX job.)

@gasche
Copy link
Member

gasche commented Feb 8, 2024

@garrigue it would be nice if you could have a look at this PR. If you don't have the time to look at it, it will probably be Florian or myself making a less-informed decision.

@gasche
Copy link
Member

gasche commented Feb 8, 2024

(My own decision-making process would be to broadly trust @ccasin and @goldfirere's review and question-asking, and the testsuite, have a quick look at the code and merge without reading it in full details.)

item
| _ ->
let newmd = {md with md_type = Mty_signature newsg} in
Sig_module(id, Mp_present, newmd, rs, priv)
Copy link
Member

@Octachron Octachron Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we want to share this logic for module aliases with merge_constraint by having a common update_module function. (I meant the special on Mty_alias _).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nice idea, but I found it a little tricky to do cleanly because the merge_constraint code also checks which sort of constraint we have, here. (Though I confess I do not entirely understand why it does so.)

Error: In the constrained signature, type "P.t" is defined to be "M.t".
Package "with" constraints may only be used on abstract types.
|}];;

Copy link
Member

@Octachron Octachron Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it may be useful in order to be future proof to test the private row case:

(* Ghosts haunted type definitions *)
module type Private_row = sig
  type a
  and t = private [< `A | `B ]
  and b
  and d = private [< `C ]
 end
 (* This is fine, the ghost type `t#row` is removed silently *)
 module type Test = Private_row with type t = [ `A ]
 (* This fail currently, but make sure that if no ghost appear here)
 type fail = (module Private_row with type t = [ `A ] )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea - test added.

| With_package_manifest (lid, ty) ->
Location.errorf ~loc
"@[In the constrained signature, type %a is defined to be %a.@ \
Package %a constraints may only be used on abstract types.@]"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extremely minor nitpicking (because I have the context in mind): the global error reporter currents prints error message inside a b box (@[%t@]), thus opening a second box inside the message has currently no effects at all and it might be better to avoid it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed - fixed.

let merge_package_constraint initial_env loc sg lid cty =
let rec patch_item namelist outer_sig_env sg_for_env ~ghosts item =
let return replace_by =
Some ((), {Signature_group.ghosts; replace_by})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are by construction not in the destructive substitution case of the signature patching , maybe:

let return replace_by =
      Some ((), {Signature_group.ghosts; replace_by=Some replace_by})

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I agree that is clearer, and have changed it.

Copy link
Contributor

@goldfirere goldfirere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a happy medium, indeed. Shall we merge? Or do we want to wait for a review from e.g. @garrigue? (To be clear, I'm pretty confident that the fix is correct. The question is whether the design is the best, but I'm even more confident than I was before.)

@gasche gasche added this to the 5.2 milestone Feb 10, 2024
@gasche
Copy link
Member

gasche commented Feb 10, 2024

I would propose to wait a few more days and then merge if no feedback.
I added this to the 5.2 milestone so that we don't forget.

@gasche gasche merged commit 0d5157f into ocaml:trunk Feb 22, 2024
@Octachron
Copy link
Member

@gasche , is there a reason why you ended up not cherry-picking the change to 5.2 ?

@gasche
Copy link
Member

gasche commented Mar 22, 2024

I think that I marked this as 5.2 to ensure that we revisit this in time (and we did), but I am not sure that we should cherry-pick it to 5.2. (I don't have the expertise and haven't invested the review time to feel confident in the change, and I suspect that the fixed issues are not particularly likely to affect users of the upstream compiler before the 5.3 release.) You read this in details so you would make a better decision.

@Octachron
Copy link
Member

I agree that this is not an urgent fix, however the changes are well delimited and not likely to affect other part of the typechecker. I am in favor in merging it now in 5.2.0 .

@ccasin
Copy link
Contributor Author

ccasin commented Mar 22, 2024

For what it's worth, I am in the process of rolling this fix out internally at Jane Street. The only slightly unexpected consequence is that with type constraints on packages now less eagerly count as a use of the constrained type, so after this change we detect more unused type declarations in modules (that is, warning 34 is improved). This occured twice in our approximately 60m lines of OCaml.

(So, I agree it is fine to eagerly cherry-pick this)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants