Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explicitly rescope idents when substituting signatures #8586

Merged
merged 15 commits into from Apr 16, 2019

Conversation

Projects
None yet
6 participants
@trefis
Copy link
Contributor

commented Apr 4, 2019

Fixes #8548.

The issue was indeed introduced by #1980: we need to be careful (that we properly scope idents) when "freshening" signatures. In #1980 we were only careful in two of several cases, when opening or including a signature.
This PR covers other cases:

  • when constraining a signature
  • when "simplifying" a signature
  • when running checks in Includemod
  • when checking that a recursive module is well typed (this one was partially handled in #1980)
  • when populating components tables (in Env)

There are still calls to Subst functions which do not explicitly pass a scope argument:

  • calls that happen outside the typechecker (in bytecomp, in the debugger, etc.): the scope doesn't matter anymore at this point
  • calls for which the scope argument would clearly have no impact.

Note: before #1980 idents were implicitly "rescoped" to "the current time" whenever we ran a substitution. In almost all the cases listed above they are now being explicitly rescoped to the moral equivalent of "the current time" (i.e. we're creating a fresh now scope, and using that).
That is not what happens when computing components of modules, there I chose to reuse the scope of the module, which seems more correct in general.
There is one situation where it's not so obviously correct though: when computing components of a functor application that we encounter in a path (cf. Env.find_module and Env.find_module_descr).
In this case the scope I use is the one of the functor, but I wonder if using the narrowest of scope of the functor or of its argument (that is: using ~scope:(max f.fcomp_scope (Path.scope p2))) might not be better; I couldn't find any example where this made a difference in practice.


This doesn't apply cleanly on 4.08, but I have already written that version locally, to be able to compile janestreet's codebase with 4.08.
I'll push it directly on top of 4.08 when/if this PR is merged.

@trefis trefis added this to the 4.08 milestone Apr 4, 2019

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Apr 4, 2019

I'd like to understand this problem a bit better. The components of modules should be able to be given "local" scope, since they are local to the module and are thus essentially generalised, so I'd like to know how doing that is able to cause a problem.

I had been thinking of making Mtype.lower_gen explicitly make all the idents local because that is logically the place where we "generalize" the module type, but now I'd be worried it might trigger #8548.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 5, 2019

I had the same intuition as you, and my initial version of the patch didn't change Env. However upon running it upon jane I found that the problem persisted, that's why I added the last two commits.

Perhaps the changes I made are too invasive, but given that I have been surprised once, I decided to be on the safe side. That seems like the right choice for now, given that this is indeed for 4.08, and can be revisited later.

If you want to look at this more however, I am not going to stop you :)

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Apr 5, 2019

Don't you have a "decreasing" invariant for scopes, similar as the one for levels?
I would expect something like "every path in the body of a type definition must have a scope lower than the type it defines". Otherwise you cannot ensure that update_level will expand all paths that have a higher scope.
This should mean that, as you wrote, the scope for type definitions inside a functor application should be the maximum of all the identifiers in this application.

The lesson from before the introduction of Mtype.lower_gen is that some problems with the module system are very hard to trigger, so the fact that something happens to work is no guarantee that it is correct. Here I agree with @lpw25 that we really need to understand what is happening.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 8, 2019

I would expect something like "every path in the body of a type definition must have a scope lower than the type it defines". [...]
This should mean that, as you wrote, the scope for type definitions inside a functor application should be the maximum of all the identifiers in this application.

Yes, that sounds right.
I will fix that part.


As for whether we should care about the scope when populating components tables, I'd be happy to look at this in more details. But I'd suggest delaying that a bit, since:

  • I believe the current implementation to be safe, as it is closer to what was done in 4.07
  • #8548 is blocking 4.08
  • I have weak hopes that it might be easier to reason about that part of the code once #2127 is merged.

If either of you (or someone else!) disagrees, then I'll spend more time on this, and try and come up with a clear explanation of what's going on.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

I'm fine with this plan, for now.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2019

I'm also fine with this plan, although I would appreciate it if you could find time to work out what is happening in the not too distant future.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2019

although I would appreciate it if you could find time to work out what is happening in the not too distant future.

Yes, I would like to avoid having to page everything back in in a year. I'm hoping to have a look once 4.08 is "done".

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2019

That being said, I still need someone to look at the code of this PR.

@trefis trefis force-pushed the trefis:pr8548 branch from 502890f to 041189a Apr 9, 2019

@garrigue
Copy link
Contributor

left a comment

This change introduces extra complexity, and I think you should document the API changes.
Also, I couldn't find any use of max or Path.scope in the diff, while I believe this appeared in the discussion.

Show resolved Hide resolved typing/env.ml Outdated
Show resolved Hide resolved typing/includemod.ml Outdated
Show resolved Hide resolved typing/subst.mli Outdated
@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 10, 2019

Thanks for looking at this!
You are correct that:

  • I forgot to push the change we discussed previously; this is now fixed
  • this PR does add complexity

I added some of the documentation you wanted (in Subst and in Env).
I didn't add any in Include* yet, but when I thought about it, I wondered if another API in Subst might not be better.
Instead of val signature : ?scope:int -> ..., perhaps I want:

type scoping =
  | Make_local
  | Preserve_scope
  | Rescope of int

val signature : scoping -> ...

at which point Include* can just pass Preserve_scope. My intuition is that this would be correct, and make more sense than what is done right now.
I will look more closely tomorrow. (But if you have any comment to make on that idea, I do welcome them)

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Apr 10, 2019

I agree that having a Subst API that encapsulates things better (and is more readily understandable) would improve readability and remove part of the extra complexity. In particular it is good if one doesn't have to wonder about what to pass where it's not really relevant.

@trefis trefis force-pushed the trefis:pr8548 branch from afe45d6 to 8046696 Apr 12, 2019

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 12, 2019

Alright, you'll be happy to know that I took the time to look more closely at what was happening with the components and have subsequently rewritten this PR (it turns out that these things are easier when one is not jetlagged) [1].

It turns out that one indeed needed to be mindful of the scope in Env.components_of_module_maker, but only when applying the prefixing substitution, not for the freshening one (as the freshened signature is indeed local).

I also changed the subst API to always take the scoping argument that I described above, instead of optionally taking a scope. This forces one to always think about what happens to scopes when applying a substitution, and makes everything clearer.

As a result, the final diff is a lot smaller and simpler than it previously was.

For reading this new diff I recommend reading commit per commit: the approach I chose is to make a commit for each scoping decision that had to be made, as a result the individual commits don't build [2], but are self-contained and easy to reason about.
In the few cases where the scoping decision wasn't necessarily obvious, I added a comment in the code to explain why I believe the choice I made is correct.


[1]: The previous diff of this PR is preserved here
[2]: I think we will probably want to squash everything before merging, to make bisecting easier? Though I'm not sure it's strictly necessary. @gasche: do you have an opinion on the matter?

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 12, 2019

Oh, I forgot to say one thing: I made one change which is unrelated to this PR: I disabled warning 40 on the codebase.
I was tired of having to write Subst.modtype Subst.Make_local sub mty. Given that we build with -principal, I think it's acceptable to use constructor disambiguation.

@trefis trefis force-pushed the trefis:pr8548 branch from 8046696 to ee50ff6 Apr 12, 2019

@garrigue
Copy link
Contributor

left a comment

I approve because the code itself looks fine, and I don't want to delay more, but you should give more information about how to decide which scoping information to use where.
At this point this looks a bit like trial and error...
Someday we may have a fuller understanding of why, but at least we need to understand the invariants you are enforcing.

@@ -1853,8 +1853,10 @@ and type_module_aux ~alias sttn funct_body anchor env smod =
let mty_appl =
match path with
Some path ->
Subst.modtype (Subst.add_module param path Subst.identity)
mty_res
let scope = Ctype.create_scope () in

This comment has been minimized.

Copy link
@garrigue

garrigue Apr 15, 2019

Contributor

Why not use the scope of the path of this applicative functor?

This comment has been minimized.

Copy link
@trefis

trefis Apr 15, 2019

Author Contributor

I think it would have been sound, but seems less correct that creating a new scope, because you'd end up with:

module M = F(A)

where the scope of M is bigger than the one of the items of the resulting signature.

This comment has been minimized.

Copy link
@garrigue

garrigue Apr 16, 2019

Contributor

I see. One day, M will be handled as a module alias, and then this may make a difference.
But I agree that at this point there seems to be no way to distinguish between the two, and creating a new scope is more in line with the current semantics.

Show resolved Hide resolved Makefile Outdated
| Keep
| Make_local
| Rescope of int

This comment has been minimized.

Copy link
@garrigue

garrigue Apr 15, 2019

Contributor

Could you provide a rationale for which one should be used where?
Something like: inside the environment, and when extracting from it, one should use the scope of the path of the enclosing module, etc ...
Currently this looks a bit like magic.
Even if you don't yet understand fully why this has to be this way, this would give a hint.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Apr 15, 2019

BTW, I'm not sure I'm qualified for approving the -w -40 part, since I'm originally on the don't care side of the discussion.

@Drup

This comment has been minimized.

Copy link
Contributor

commented Apr 15, 2019

I don't know if I'm qualified either, but I'm on the "I don't understand why it wasn't done eons ago" side, so I think it's a good idea, and I've wanted it for quite a while now.

@mshinwell

This comment has been minimized.

Copy link
Contributor

commented Apr 15, 2019

Agreed re. warning 40.

@gasche

This comment has been minimized.

Copy link
Member

commented Apr 15, 2019

Re. warning: I think it wasn't the nice thing to do to bundle the change with a PR like that, and personally I still believe that this feature should not be the default, but I have the feeling that I'm basically the only one with this opinion, so maybe it's fine to let you all do whatever you want?

@mshinwell

This comment has been minimized.

Copy link
Contributor

commented Apr 15, 2019

Agreed that this part should go in a separate patch.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 15, 2019

Your wish is my command: the warning part was extracted to #8617.

@trefis trefis force-pushed the trefis:pr8548 branch from ee50ff6 to af8d7c7 Apr 16, 2019

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Apr 16, 2019

#8617 was merged, I rebased this PR, I will merge once travis and appveyor come back green.
I will then make a follow up PR for 4.08.


With regard to the following comments from Jacques:

  • you should give more information about how to decide which scoping information to use where.
    At this point this looks a bit like trial and error...

  • Could you provide a rationale for which one should be used where?
    Something like: inside the environment, and when extracting from it, one should use the scope of the path of the enclosing module, etc ...
    Currently this looks a bit like magic.

These are fair critics, which are partly due to the fact that I was trying to more-or-less mimic what was happening before #1980.

As for what the correct behavior should be, I discussed it offline with Leo and he suggested that: all the idents that are bundled inside a module (or module type) should always be made local, and that the signature should be rescoped properly whenever we open it up. I agree with him.
Concretely that means that of the Keep constructor should disappear from Subst.scoping, and that the callers of Includemod should be careful to pass correctly scoped module types as argument.
I didn't want to include such a change into 4.08 so late into the cycle, but I hope to send a PR against trunk implementing it soon (though I have many things on my plate right now, so this might have to wait a while).

In the meantime, the approach I used was:

  • if you know the signature shouldn't be used as is (i.e. without being refreshed), make the idents local
  • if there is a clear scope to give to the item, use Rescope
  • if it's not clear what should be done, or if the scope doesn't matter, use Keep

@trefis trefis merged commit a8fd17f into ocaml:trunk Apr 16, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

trefis added a commit to trefis/ocaml that referenced this pull request Apr 16, 2019

Merge pull request ocaml#8586
Explicitly rescope idents when substituting signatures

trefis added a commit to trefis/ocaml that referenced this pull request Apr 16, 2019

Merge pull request ocaml#8586
Explicitly rescope idents when substituting signatures

@trefis trefis deleted the trefis:pr8548 branch Apr 16, 2019

trefis added a commit that referenced this pull request Apr 16, 2019

Merge pull request #8586
Explicitly rescope idents when substituting signatures
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.