Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type_let: be more careful generalizing parts of the pattern #2317

Merged
merged 3 commits into from Mar 25, 2019

Conversation

Projects
None yet
4 participants
@trefis
Copy link
Contributor

commented Mar 12, 2019

This fixes a bug noticed when trying to build merlin on 4.08: bindings to polymorphic record fields introduced by lets of expansive expressions are not polymorphic anymore (they were in 4.07).
See the example added by the first commit.

In #1748, we removed some calls to generalize from type_pat, assuming that they were needed only for the check that followed in the next few lines. Indeed, the variables would get generalized later from type_cases, so there is no need to generalize then and there.

However, in the case of a let, since we're first typing the pattern and then the expression, we might end up calling generalize_expansive on that binding before calling generalize, which will then do nothing.

The issue could be fix by reintroducing the call to generalize in type_pat, but that felt a bit weird. Also, it didn't answer the question: why is generalize_expansive called on the type of the pattern variable? Question which can also be read as: "why is generalize_expansive called on every subpattern?".

As a result, the patch proposed here ends up "cleaning up" the generalization story inside type_let in a way that is very similar to #1745 (the subtleties come from the fact that the binding might be recursive).

And then some tests (the first two real tests of testsuite/tests/typing-warnings/application.ml) started failing, because the toplevel/expect_test has some special handling of let _ =, and so some plumbing was added to make sure the type which is printed is still generic (search for "if toplevel").

@trefis trefis changed the title Fix poly rec pattern type_let: be more careful generalizing parts of the pattern Mar 12, 2019

@gasche

This comment has been minimized.

Copy link
Member

commented Mar 13, 2019

When reading this PR description and code I was very confused as to why the interactive toplevel would need special a type-checking path. By toplevel you actually mean "at the toplevel of a module", or said otherwise "as a structure item rather than local let..in", which makes much more sense. The reason why we cannot see where ~toplevel:true is set in the diff is that the rest of the type-checker plumbing would already pass this flag around, for other reasons.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 13, 2019

No, I really mean the toplevel as in "REPL", and it does because of this: https://github.com/ocaml/ocaml/blob/trunk/toplevel/toploop.ml#L286

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 13, 2019

Btw: I could move the changes related to the toplevel to a separate commit @gasche, it would probably produce a nicer history.

Also, I don't think I said it explicitly, but I think this should go in 4.08.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 15, 2019

Note: @lpw25 and I looked at this/decided on the fix together, so he can't really be a reviewer of this PR.
I think either @gasche or @garrigue will have to look at this.

@gasche

This comment has been minimized.

Copy link
Member

commented Mar 15, 2019

I don't feel qualified to review this PR. One person that I think would have useful things to say about this is François @fpottier Pottier (I remember discussing generalization of patterns with him, for example during Jonathan Protzenko's M2 internship), but given that he is not involved in the nitty-gritty details of the implementation that may require a bit more high-level presentation of the situation. (And I don't know if he would be interested.)

@trefis trefis requested a review from garrigue Mar 15, 2019

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 15, 2019

Ok, I'll look at this next week.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 18, 2019

I'm not sure I understand your argument for not reverting to the previous behavior of iterating on all pattern types, rather than just types associated to pattern variables (I think I had missed this change).
The cost of this iteration is essentially amortized, so I don't see it introducing a significant overhead.
Moreover, remember that for many manipulations on types, it is essential that the levels go down inside the type, so preserving this invariant in patterns has some value in itself.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

I'm not sure I understand your argument for not reverting to the previous behavior of iterating on all pattern types

I'm not sure what you're referring to. If you are saying I should revert #1745, then there is some misunderstanding, because that is not the source of the issue here.

The issue manifested when we removed this line and the one that follows it, and is IMO due to this line that I remove in the present PR.
I believe the question I asked in my first message still stands: why are we calling generalize_expansive on subpatterns? While I can understand why we call it on the top one, calling it on subpatterns seems wrong.


To explain again why #1745 was needed, and why I think this PR has the right approach, let me start by reacting to:

Moreover, remember that for many manipulations on types, it is essential that the levels go down inside the type, so preserving this invariant in patterns has some value in itself.

That is correct. However I believe it is expected that people fetching types from cmt files will call correct_levels before doing any manipulation of these types.

Indeed, while I'm not sure whether the invariant you stated was already violated for patterns (though I'd guess it was), I can say it definitely is for expressions, take this example:

type 'a t = { x: 'a list }

let l = { x = [] }.x

If you look at the type of [], then you can see it is:

exp_type =
 {Types.desc =
   Types.Tconstr (Path.Pident list/9,
    [{Types.desc =
       Types.Tlink
        {Types.desc = Types.Tvar None; level = 100000000;
         scope = None; id = 786};
      level = 1004; scope = None; id = 781}],
    {contents = Types.Mnil});
  level = 1005; scope = None; id = 782};

The only part of the type which is generalized is the part that survives the expression.

As a result of #1745 and the present PR, we would be in a similar state for patterns: the only parts that are generalized, are the ones that escape the patterns, that is: the types of the pattern variables.

The reason why we want to be in this state for patterns (that is: the initial reason for #1745), is that before generalizing any pattern type, we're checking that it can actually escape the pattern, which might not always be the case: remember that we now allow GADTs under or-patterns.
Take an example like this one:

type 'a t =
  | I : int * bool -> int t
  | C: char * bool -> char t

let f (type a) (x : a t) =
  match x with
  | I (3, b) 
  | C('c', b) -> b 
  | _ -> false

Here b is the only thing that survives the pattern, and it is allowed to do so, since its type is simply bool.
However, 3 has type a = int, and that is clearly scoped to the or-pattern, it wouldn't be allowed to escape. So if we were doing iter_pattern (as was the case prior to #1745), we would get an error.

In type_let, as there can be no gadts introducing incompatible equations under or-pattern, we could in theory keep calling iter_pattern. But it seems better not to, if only for consistency.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2019

I'm of course aware of the situation for expressions, where the assumption is that one will not reuse the internal type of an expression after type_exp left it. Until now, this assumption didn't apply to patterns, due to the iteration.

Your explanation about why one cannot iterate on patterns with generalize_expansive any more is convincing. I also understand why it would be problematic if you allow instantiation of record fields.

However, I'm still wondering why you wouldn't keep the following iterative call to generalize, which should be no problem and will at least ensure that the levels are properly kept.
And the addition of the ~toplevel flag doesn't seem so nice to me.

As to why there was a call to generalize_expansive in the first place, I think it just comes from its replacing an iterative call to make_nongen (see ocaml 2.04 for instance). Maybe @vouillon knows the reason (there was no iteration in csl-1.15). This iterative call was sound, whether it was overly restrictive may be another question.

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 21, 2019

However, I'm still wondering why you wouldn't keep the following iterative call to generalize, which should be no problem and will at least ensure that the levels are properly kept.

Well, it might not be a problem... as long as you also call iter_pattern_variables_types, because iter_pattern on its own is not complete: it won't allow you to reach the type of "as-variables".

So at this point, iter_pattern (fun p -> generalize p.pat_type) only use would be to preserve an invariant which we don't care about for other parts of the AST.
Why can't we make the assumption that no one will reuse the internal type of a pattern after type_cases or type_let left it? Looking at the rest of the compiler, it seems like pat_type is used in exactly the same way as exp_type, that is: it is passed to Typeopt.value_kind.
So, do we actually care?

Sure, that invariant is cute, but I'm honestly not very enthusiastic about adding gratuitous operations just to preserve an invariant that isn't documented anywhere.

And the addition of the ~toplevel flag doesn't seem so nice to me.

Yes, I'm not a fan either.
I'd be happy to remove it and just generalize the type of the expression unconditionally.

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2019

I agree that there is not clear reason that the behavior with patterns should be different from that with expressions. I'm always afraid of removing an invariant without knowing why it was there in the first place, but if you have done enough checks and tests, then it is probably fine. At this point it is hard to justify making the code more complex than it was before.

Then I think I would be happy with doing as you suggest, i.e. replacing the toplevel case by a comment that this generalization is required for toplevel lets.

@trefis trefis force-pushed the trefis:fix-poly-rec-pattern branch from 511a4ac to 433c13d Mar 25, 2019

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 25, 2019

I did the change we talked about, and rebased on top of trunk.

I'll wait for your explicit approval before merging.
Thanks for the review!

@damiendoligez (or @gasche as substitute): since this fixes a regression I'm planning to cherry-pick to 4.08.

@garrigue
Copy link
Contributor

left a comment

Fine to merge. I would prefer a shorter comment, but I leave that to you.

Show resolved Hide resolved typing/typecore.ml

@gasche gasche merged commit 1211b9c into ocaml:trunk Mar 25, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

gasche added a commit that referenced this pull request Mar 25, 2019

Merge pull request #2317 from trefis/fix-poly-rec-pattern
type_let: be more careful generalizing parts of the pattern

(cherry picked from commit 1211b9c)
@gasche

This comment has been minimized.

Copy link
Member

commented Mar 25, 2019

I merged and rebased on 4.08 ( 8c85ca6 ).

(I had the impression that the comment change had been done, but in fact it looks like it wasn't. Oh well. @trefis, feel free to push directly to trunk to fix that.)

@trefis

This comment has been minimized.

Copy link
Contributor Author

commented Mar 26, 2019

Thanks @gasche.
I pushed the comment change on trunk (I don't think we need to cherry-pick that to 4.08).

@trefis trefis deleted the trefis:fix-poly-rec-pattern branch Mar 26, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.