Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New attempt at fixing MPR#7726 #1676

Merged
merged 4 commits into from Jul 12, 2018

Conversation

Projects
None yet
4 participants
@garrigue
Copy link
Contributor

commented Mar 23, 2018

After failed attempt #1639, this is a new attempt at fixing MPR#7726, which came from the incorrect detection of cycles in types when substituting in a recursive module signature.

The new approach works by checking explicitly for such types after applying non-trivial substitutions, namely functor applications and with declarations.

The error reporting is a bit weak, but this PR is already open for discussion.

@garrigue garrigue requested a review from lpw25 Mar 23, 2018

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Mar 23, 2018

The main question here is where to check. I currently only check after applying with constraints and functors. Is it needed somewhere else? Can you imagine a case where there would be a stack overflow before the check is run?

@damiendoligez

This comment has been minimized.

Copy link
Member

commented May 30, 2018

@lpw25

This comment has been minimized.

Copy link
Contributor

commented May 30, 2018

Sorry, for my slow reply on this PR. The reason is that what I actually wanted to do is to implement the alternative suggested by Jacques:

Well, actually one could also assume that any non-manifest types in the result of a functor application taking a module of the recursion as argument potentially introduces a non-contractive recursion

because I strongly dislike the approach taken in this current PR. It is completely different from how we deal with recursive type checks everywhere else in the compiler, and type-checking of abstract types more generally.

If a functor definition is accepted then any module which has the module type of the parameter should be allowed. This PR instead adds an extra condition that has nothing to do with the parameter type of the functor. We should not be checking that a substitution is valid after the fact -- this is the approach to substitution of C++ templates.

However, I haven't had time to implement the alternative, and this change is clearly forwards compatible with doing the more principled fix later, so I guess we can put it in for now.

I'll review the code itself for correctness this afternoon.

@lpw25
Copy link
Contributor

left a comment

The code looks fine in that it does what it is intended to. However, it's not sufficient to prevent all instances of MPR#7726. For example:

        OCaml version 4.07.0+dev0-2017-09-18

# module type T = sig type t end
module Fix(F:(T -> T)) = struct
  module rec Fixed : T with type t = F(Fixed).t = F(Fixed)
end;;
      module type T = sig type t end
module Fix :
  functor (F : T -> T) ->
    sig module rec Fixed : sig type t = F(Fixed).t end end
# module Id (X:sig type t end) = struct type t = X.t end;;
module Id : functor (X : sig type t end) -> sig type t = X.t end
# let f (x : Fix(Id).Fixed.t) = x;;
val f : Fix(Id).Fixed.t -> Fix(Id).Fixed.t = <fun>
# f 5;;
Fatal error: exception Stack overflow

@garrigue garrigue force-pushed the garrigue:fix_mpr7726a branch from e72226c to be0caff Jun 1, 2018

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jun 1, 2018

Well spotted. Actually, this is related to the fact module applications inside paths are not properly typed.
So I fixed this by calling Typemode.type_module on all module application nodes when translating a type. This should actually solve other strange behaviors where invalid type paths were accepted (for instance, applications of generative functors).

# module F() = struct type t end;;
module F : functor () -> sig type t end
# module M = struct end;;
module M : sig  end
# type t = F(M).t;;
Error: The functor F is generative, it cannot be applied in type expressions

Concerning your doubts about checking after substitution, I think this is perfectly fine from a theoretical point of view. Usually, we just cannot do that because of abstraction and separate compilation. But in the case of recursive modules, what you see is what you get. All recursive modules must be declared simultaneously and typed together. Actually, one could say that what I did with Keiko is similar: assume that we know the types or aliases of all the recursive modules, and from that check well-foundedness. The difference is that, in presence of abstraction, one has to redo it every time something is made concrete.
A modular approach, where this check is done only once, would be much more complex: we would need a way to annotate types and modules with their contractiveness for instance. And it would be less precise: we would end up rejecting perfectly safe programs.
Another way to see it, is that it is actually a lucky case, where we can check the well-foundedness exactly, at the cost of a few re-computations.

A question is whether some other checks could benefit from this approach. Something that I have delayed for a long time is recomputing variances after functor applications. There could be some others, but all this strongly depends on working only at the level of types: we cannot break abstraction, so this is only about cases where making thing concrete changes some behavior.

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jun 1, 2018

Oops, the error message shows that actually the problem with generative functors was already fixed :)
Yet, this change ensures that all checks done during typing of module applications are done for type paths too.
Now, I have to find a way to improve the error messages. Now, the physical location is more or less correct, but the logical location may be hard to relate to the code.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Jun 6, 2018

This is still clearly not sufficient. You'll need to re-check every type recursively through every functor application. For example:

# module type T = sig type t end;;
module type T = sig type t end

# module Fix(F:(T -> T)) = struct
     module rec Fixed : T with type t = F(Fixed).t = F(Fixed)
  end;;
module Fix :
  functor (F : T -> T) ->
    sig module rec Fixed : sig type t = F(Fixed).t end end

# module Id (X:sig type t end) = struct type t = X.t end;;
module Id : functor (X : sig type t end) -> sig type t = X.t end

# module Foo (F : T -> T) = struct
    let f (x : Fix(F).Fixed.t) = x
  end;;
module Foo :
  functor (F : T -> T) -> sig val f : Fix(F).Fixed.t -> Fix(F).Fixed.t end

# module M = Foo(Id);;
module M : sig val f : Fix(Id).Fixed.t -> Fix(Id).Fixed.t end

# M.f 5;;
Fatal error: exception Stack overflow

I really don't think this is a good approach to solving this issue. We should just be conservatively assuming that the the results of functor applications might be equal to any of the types passed in to them as arguments.

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jun 7, 2018

Thank you for this new example. Indeed, by commuting one more abstraction, one sees that we would have to recheck all types (or rather all modules in type constructor paths). This is doable, but might be costly.

I'd still very much want to see an example of unsoundness before introducing a drastic restriction such as assuming the non-contractiveness of all abstract functors, as this would prevent the definition of useful functors building fixpoints. I do not exclude that such unsoundness exists, maybe doing something like MPR#5343.

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jun 7, 2018

Actually, there is an easy enough solution: call check_recmod_typedecls when accessing a path containing an application in Env.components_of_functor_appl. Since this application is cached, this shouldn't be expensive. This is what I was going to do, before wrongly thinking that it was enough to do it in Typetexp. There are drawbacks: the error may be delayed until we access this path, and it is not easy to get a location, so it may be good to keep an early check where possible. But it seems still better than just getting a stack overflow.

Do you still see cases that wouldn't be caught by that approach?

Another approach, conform with what is done for the module system, is to accept that since type checking is undecidable, we may sometimes have non-termination...

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jun 8, 2018

I implemented it, and improved the error messages.

@garrigue garrigue force-pushed the garrigue:fix_mpr7726a branch from dd5e55b to 4491247 Jun 23, 2018

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jun 23, 2018

I've rebased again and made sure the cost is reasonable by only building the environment when needed.
It's ready for a new review.

@dra27
Copy link
Contributor

left a comment

Small correction needed to ..gitattributes or #1148 will complain...

module T1 = Fix(functor (X:sig type t end) -> struct type t = X.t option end);;
[%%expect{|
Line _, characters 12-77:
module T1 = Fix(functor (X:sig type t end) -> struct type t = X.t option end);;

This comment has been minimized.

Copy link
@dra27

dra27 Jun 30, 2018

Contributor

This over-long line I think is unavoidable, so this test file needs an ocaml-typo=long-line entry in .gitattributes

This comment has been minimized.

Copy link
@garrigue

garrigue Jul 7, 2018

Author Contributor

Adding long-line was not enough. I also had to add missing-header. Looks like adding an exception discards the previous ones.

garrigue added some commits Mar 23, 2018

Fix MPR#7726
* Check module applications when translating types in Typetexp
* Check all results of functor applications lazily
* Reduce the overhead of checking module types by building the environment lazily

@garrigue garrigue force-pushed the garrigue:fix_mpr7726a branch from 48c1d83 to 4d01385 Jul 2, 2018

@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jul 7, 2018

Since this PR has changed quite a bit since its first version, here is an explanation of the goal, the choices, and the limitations.

As stated before, the fundamental problem is that when a recursive module is defined inside a functor, the well-foundedness check for types sees abstract types coming from functor arguments as external types, and assumes that they cannot cause loops. However, when one of the functor arguments is a functor, and it is given one of the modules in the recursion as argument, it may actually return an internal type from this module, and by instantiation the functor one could create a loop.

My understanding is that, contrary to undetected non-contractiveness, which can be used to equate two arbitrary types, and break soundness, this problem only can cause non-termination of the type-checker (usually stack overflow). But of course, even if we know that this is not satisfactory.

A fix would be to be conservative, and just say that indeed any type returned by the argument functor can be any type from the modules it was given, rejecting all programs where this could result in ill-foundedness.

module type T = sig type t end
module Fix(F:(T -> T)) = struct
  module rec Fixed : T with type t = F(Fixed).t = F(Fixed)
end;;
module F1(X : T) = struct type t = X.t end;;
module M1 = Fix(F1);;

Here trying to expand M1.Fixed.t would cause a stack overflow, so we would end up rejecting Fix as potentially dangerous.

However, this restriction is actually pretty drastic. It prohibits safe uses of the Fix functor.

module F2(X:T) = struct type t = Z | S of X.t end;;
module M2 = Fix(F2);;
let x : M2.Fixed.t = S Z;;

The problem here is that, due to the contractiveness condition, it is already impossible to have a functor construct the fixpoint of a parameterized type given as argument, so that the above Fix functor is actually the only way to create such a recursive type (without adding a spurious closing constructor, or using polymorphic variants).

An alternative approach is to delay the check until the application of the outside functor. However, as @lpw25 pointed, since functor applications can appear inside type constructors, this requires checking all the types exported by (almost) any functor for all functor applications included in their expansion.

What this PR implements is a lazy version of this alternative approach. We still re-check recursive modules in the result of functor applications, in module expressions and type annotations, but we delay the check of functor applications inside type constructors to their lookup in the environment (which is cached, and already applies a substitution to the result type of the functor). The cost of this check is further reduced by constructing the internal environment lazily during this check, i.e. the cost only appears when the functor return type contains recursive modules.

Pros:

  • No new restriction other than strictly necessary.
  • We keep much freedom in the building of functor fixpoints.
  • The cost is low (hopefully dominated by substitution).
  • This is better than a stack overflow.

Cons:

  • The applicability of a functor to an argument does not only depend on the types of the formal and concrete arguments, but also on the type of the result of the functor.
  • Due to the laziness of the implementation, errors can be delayed further, until a type is really used.
  • As a result, the error message bears the location of the use, rather than that of the functor application.
  • More generally, since in general we do not assume that lookup may uncover an error, errors may be reported or not in a seemingly arbitrary way (much as what happens when there is no cmi for a module).

My opinion is that, since recursive modules are still an experimental feature, we can live with these limitations.

Note that the last 3 points could be alleviated by checking all constructor paths in principal mode.
On the other hand, one could argue that the laziness is an advantage, as one can discard the ill-founded types with an explicit signature. So it might be nicer to eagerly check the paths only in the resulting binding.

Of course, it might still be worth exploring the first approach, to see whether we really need to keep all this expressiveness.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Jul 7, 2018

Thanks for the full explanation. I'll review the code some time this week.

@lpw25

This comment has been minimized.

Copy link
Contributor

commented Jul 12, 2018

Right, I've read the code and I think that it does what it is intended to and that it will prevent the stack overflows from the MPR.

Against my better judgement I'm going to approve this PR, based on this reasoning:

My opinion is that, since recursive modules are still an experimental feature, we can live with these limitations.

I really hope that we can remove this code at some point and replace it with something more principled, but for an experimental feature like recursive modules -- which already have plenty of ugly corner cases -- I think it is fine like this for now.

My only remaining concern is that it might adversely effect the compile-time performance of code bases with a lot of functors (e.g. Jane Street's). I think it is probably fine, and I'll do a general check on the performance of the compiler on our code base before 4.08 is released so we'll see then if there are any issues.

@lpw25

lpw25 approved these changes Jul 12, 2018

@garrigue garrigue merged commit 7aa377a into ocaml:trunk Jul 12, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@garrigue

This comment has been minimized.

Copy link
Contributor Author

commented Jul 12, 2018

Thank you Leo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.