Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type_let: be more careful generalizing parts of the pattern #2317

Merged
merged 3 commits into from Mar 25, 2019

Conversation

trefis
Copy link
Contributor

@trefis trefis commented Mar 12, 2019

This fixes a bug noticed when trying to build merlin on 4.08: bindings to polymorphic record fields introduced by lets of expansive expressions are not polymorphic anymore (they were in 4.07).
See the example added by the first commit.

In #1748, we removed some calls to generalize from type_pat, assuming that they were needed only for the check that followed in the next few lines. Indeed, the variables would get generalized later from type_cases, so there is no need to generalize then and there.

However, in the case of a let, since we're first typing the pattern and then the expression, we might end up calling generalize_expansive on that binding before calling generalize, which will then do nothing.

The issue could be fix by reintroducing the call to generalize in type_pat, but that felt a bit weird. Also, it didn't answer the question: why is generalize_expansive called on the type of the pattern variable? Question which can also be read as: "why is generalize_expansive called on every subpattern?".

As a result, the patch proposed here ends up "cleaning up" the generalization story inside type_let in a way that is very similar to #1745 (the subtleties come from the fact that the binding might be recursive).

And then some tests (the first two real tests of testsuite/tests/typing-warnings/application.ml) started failing, because the toplevel/expect_test has some special handling of let _ =, and so some plumbing was added to make sure the type which is printed is still generic (search for "if toplevel").

@trefis trefis changed the title Fix poly rec pattern type_let: be more careful generalizing parts of the pattern Mar 12, 2019
@gasche
Copy link
Member

gasche commented Mar 13, 2019

When reading this PR description and code I was very confused as to why the interactive toplevel would need special a type-checking path. By toplevel you actually mean "at the toplevel of a module", or said otherwise "as a structure item rather than local let..in", which makes much more sense. The reason why we cannot see where ~toplevel:true is set in the diff is that the rest of the type-checker plumbing would already pass this flag around, for other reasons.

@trefis
Copy link
Contributor Author

trefis commented Mar 13, 2019

No, I really mean the toplevel as in "REPL", and it does because of this: https://github.com/ocaml/ocaml/blob/trunk/toplevel/toploop.ml#L286

@trefis
Copy link
Contributor Author

trefis commented Mar 13, 2019

Btw: I could move the changes related to the toplevel to a separate commit @gasche, it would probably produce a nicer history.

Also, I don't think I said it explicitly, but I think this should go in 4.08.

@trefis
Copy link
Contributor Author

trefis commented Mar 15, 2019

Note: @lpw25 and I looked at this/decided on the fix together, so he can't really be a reviewer of this PR.
I think either @gasche or @garrigue will have to look at this.

@gasche
Copy link
Member

gasche commented Mar 15, 2019

I don't feel qualified to review this PR. One person that I think would have useful things to say about this is François @fpottier Pottier (I remember discussing generalization of patterns with him, for example during Jonathan Protzenko's M2 internship), but given that he is not involved in the nitty-gritty details of the implementation that may require a bit more high-level presentation of the situation. (And I don't know if he would be interested.)

@trefis trefis requested a review from garrigue March 15, 2019 12:42
@garrigue
Copy link
Contributor

Ok, I'll look at this next week.

@garrigue
Copy link
Contributor

I'm not sure I understand your argument for not reverting to the previous behavior of iterating on all pattern types, rather than just types associated to pattern variables (I think I had missed this change).
The cost of this iteration is essentially amortized, so I don't see it introducing a significant overhead.
Moreover, remember that for many manipulations on types, it is essential that the levels go down inside the type, so preserving this invariant in patterns has some value in itself.

@trefis
Copy link
Contributor Author

trefis commented Mar 19, 2019

I'm not sure I understand your argument for not reverting to the previous behavior of iterating on all pattern types

I'm not sure what you're referring to. If you are saying I should revert #1745, then there is some misunderstanding, because that is not the source of the issue here.

The issue manifested when we removed this line and the one that follows it, and is IMO due to this line that I remove in the present PR.
I believe the question I asked in my first message still stands: why are we calling generalize_expansive on subpatterns? While I can understand why we call it on the top one, calling it on subpatterns seems wrong.


To explain again why #1745 was needed, and why I think this PR has the right approach, let me start by reacting to:

Moreover, remember that for many manipulations on types, it is essential that the levels go down inside the type, so preserving this invariant in patterns has some value in itself.

That is correct. However I believe it is expected that people fetching types from cmt files will call correct_levels before doing any manipulation of these types.

Indeed, while I'm not sure whether the invariant you stated was already violated for patterns (though I'd guess it was), I can say it definitely is for expressions, take this example:

type 'a t = { x: 'a list }

let l = { x = [] }.x

If you look at the type of [], then you can see it is:

exp_type =
 {Types.desc =
   Types.Tconstr (Path.Pident list/9,
    [{Types.desc =
       Types.Tlink
        {Types.desc = Types.Tvar None; level = 100000000;
         scope = None; id = 786};
      level = 1004; scope = None; id = 781}],
    {contents = Types.Mnil});
  level = 1005; scope = None; id = 782};

The only part of the type which is generalized is the part that survives the expression.

As a result of #1745 and the present PR, we would be in a similar state for patterns: the only parts that are generalized, are the ones that escape the patterns, that is: the types of the pattern variables.

The reason why we want to be in this state for patterns (that is: the initial reason for #1745), is that before generalizing any pattern type, we're checking that it can actually escape the pattern, which might not always be the case: remember that we now allow GADTs under or-patterns.
Take an example like this one:

type 'a t =
  | I : int * bool -> int t
  | C: char * bool -> char t

let f (type a) (x : a t) =
  match x with
  | I (3, b) 
  | C('c', b) -> b 
  | _ -> false

Here b is the only thing that survives the pattern, and it is allowed to do so, since its type is simply bool.
However, 3 has type a = int, and that is clearly scoped to the or-pattern, it wouldn't be allowed to escape. So if we were doing iter_pattern (as was the case prior to #1745), we would get an error.

In type_let, as there can be no gadts introducing incompatible equations under or-pattern, we could in theory keep calling iter_pattern. But it seems better not to, if only for consistency.

@garrigue
Copy link
Contributor

I'm of course aware of the situation for expressions, where the assumption is that one will not reuse the internal type of an expression after type_exp left it. Until now, this assumption didn't apply to patterns, due to the iteration.

Your explanation about why one cannot iterate on patterns with generalize_expansive any more is convincing. I also understand why it would be problematic if you allow instantiation of record fields.

However, I'm still wondering why you wouldn't keep the following iterative call to generalize, which should be no problem and will at least ensure that the levels are properly kept.
And the addition of the ~toplevel flag doesn't seem so nice to me.

As to why there was a call to generalize_expansive in the first place, I think it just comes from its replacing an iterative call to make_nongen (see ocaml 2.04 for instance). Maybe @vouillon knows the reason (there was no iteration in csl-1.15). This iterative call was sound, whether it was overly restrictive may be another question.

@trefis
Copy link
Contributor Author

trefis commented Mar 21, 2019

However, I'm still wondering why you wouldn't keep the following iterative call to generalize, which should be no problem and will at least ensure that the levels are properly kept.

Well, it might not be a problem... as long as you also call iter_pattern_variables_types, because iter_pattern on its own is not complete: it won't allow you to reach the type of "as-variables".

So at this point, iter_pattern (fun p -> generalize p.pat_type) only use would be to preserve an invariant which we don't care about for other parts of the AST.
Why can't we make the assumption that no one will reuse the internal type of a pattern after type_cases or type_let left it? Looking at the rest of the compiler, it seems like pat_type is used in exactly the same way as exp_type, that is: it is passed to Typeopt.value_kind.
So, do we actually care?

Sure, that invariant is cute, but I'm honestly not very enthusiastic about adding gratuitous operations just to preserve an invariant that isn't documented anywhere.

And the addition of the ~toplevel flag doesn't seem so nice to me.

Yes, I'm not a fan either.
I'd be happy to remove it and just generalize the type of the expression unconditionally.

@garrigue
Copy link
Contributor

I agree that there is not clear reason that the behavior with patterns should be different from that with expressions. I'm always afraid of removing an invariant without knowing why it was there in the first place, but if you have done enough checks and tests, then it is probably fine. At this point it is hard to justify making the code more complex than it was before.

Then I think I would be happy with doing as you suggest, i.e. replacing the toplevel case by a comment that this generalization is required for toplevel lets.

@trefis
Copy link
Contributor Author

trefis commented Mar 25, 2019

I did the change we talked about, and rebased on top of trunk.

I'll wait for your explicit approval before merging.
Thanks for the review!

@damiendoligez (or @gasche as substitute): since this fixes a regression I'm planning to cherry-pick to 4.08.

Copy link
Contributor

@garrigue garrigue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine to merge. I would prefer a shorter comment, but I leave that to you.

typing/typecore.ml Show resolved Hide resolved
@gasche gasche merged commit 1211b9c into ocaml:trunk Mar 25, 2019
gasche added a commit that referenced this pull request Mar 25, 2019
type_let: be more careful generalizing parts of the pattern

(cherry picked from commit 1211b9c)
@gasche
Copy link
Member

gasche commented Mar 25, 2019

I merged and rebased on 4.08 ( 8c85ca6 ).

(I had the impression that the comment change had been done, but in fact it looks like it wasn't. Oh well. @trefis, feel free to push directly to trunk to fix that.)

@trefis
Copy link
Contributor Author

trefis commented Mar 26, 2019

Thanks @gasche.
I pushed the comment change on trunk (I don't think we need to cherry-pick that to 4.08).

@trefis trefis deleted the fix-poly-rec-pattern branch March 26, 2019 03:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants