New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add explanation for errors caused by a type constraint propagated from a keyword #1510

Merged
merged 3 commits into from Dec 21, 2017

Conversation

Projects
None yet
8 participants
@Armael
Contributor

Armael commented Dec 3, 2017

Another change extracted from PR #102 . This adds specific explanation for type errors that involve type information propagated by a keyword such as if, for, while...

For example,

if true then 42

highlights "42" and prints:

Error: This expression has type int but an expression was expected of type
         unit because it is the result of a conditional with no else branch

The idea for this patch comes from #102 but the implementation proposed here is different. The original patch changed the order of propagation in order to get more precise information -- but this doesn't interact well with non-principality AFAIU. The approach followed in this patch is as follows: in the type_expect function, we add an "explanation" optional argument. If unification fails between something and the expected type ty_expected, then this explanation (if present) explains why this type was expected (e.g. "because it was the body of a for-loop"). When type_expect calls itself recursively with a different expected type, the explanation is dropped.

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Dec 3, 2017

Member

Some more comments on the design choices:

The explanation type is actually a list of approximations of program contexts, that explain why we propagate a certain expected type. For example, you propagate bool when you go under the context if [_] then e1 else e2, where [_] denotes the hole of the sub-term that is going to be type-checked.

Arthur's implementation is careful to delay the unification of the expected type until the last possible moment. This implementation does not do it, which means that in some case the explanation will not be used while it should be. If you write if b then e1 else e2 and b was already determined to have type int, there will be an error when unifying b's type with bool, and the explanation will be given. But if b has an unknown point at this time, the unification will succeed and the unification will be dropped, while a different propagation order could have resulted in the error being detected there (and will be reliably detected there with Arthur's change-of-propagation-order approach).
One example is the term fun b -> if true then (if b then 1 else 2) else (print_int b): one of the two branches of if true will be type-checked first, and the refined explanation will or will not be used depending on the order (so if you also look at the term with inversed branch order, then one of them will have a nice error message and the other won't.)

The reasoning for having this feature even though the propagation order makes it more fragile than with Arthur's approach is:

  • It is already helpful in the cases where it works (and it works in a lot of situations that beginners may encounter, see the tests).
  • if we later decide to change the type propagation order along Arthur's line (and find a way to make it not break most programs), the work by Armaël in this PR can be reused as-is in this new world, it won't need big changes. So there is no clear reason to delay the merge of this part.

Finally: Armaël commented above that the explanation is not reused for recursive calls to the type-checker to a different expected type. Of course! The context of those sub-calls is different so it would be incorrect to reuse the explanation. The question is rather of why it is correct to reuse the explanation for the sub-calls with the same expected types. For example, if you have a program of the form if (a; b) then e1 else e2, then (a; b) will be type-checked with the if [_] then e1 else e2 explanation, and in fact b will be type-checked recursively with the same explanation, while technically its own context is rather if (a; [_]) then e1 else e2.

In the case of moving from if [_] to if (a; [_]), we considered that the error message for the if context was vague enough that this would not be confused. If b is highlighted with the explanation "this is the condition of an if", users will understand, even if technically (a; b) is the condition. Armaël and I reviewed each case of recursive sub-call with same expected type, and we considered that, each time, the approximation is reasonable. We could define a more refined type of type-forcing contexts (so the explanation would be eg. Sequence_right (If_condition) instead of just (If_condition), and print more precise messages, but we judged that it was not worth it.

This is obvious for if- and while-conditions of for-bounds for examples, which are always very small expressions anyway. It is less clear for the context while cond do [_] done, which forces unit. For example if you write

while cond do
  match foo with
  | K1 -> ()
  | K2 -> 2
  | K3 -> ()
done

then 2 will have the error explanation "... expected of type unit because it is the body of a while". (TODO @Armael: use "it is in the body of a while loop" instead.)

Member

gasche commented Dec 3, 2017

Some more comments on the design choices:

The explanation type is actually a list of approximations of program contexts, that explain why we propagate a certain expected type. For example, you propagate bool when you go under the context if [_] then e1 else e2, where [_] denotes the hole of the sub-term that is going to be type-checked.

Arthur's implementation is careful to delay the unification of the expected type until the last possible moment. This implementation does not do it, which means that in some case the explanation will not be used while it should be. If you write if b then e1 else e2 and b was already determined to have type int, there will be an error when unifying b's type with bool, and the explanation will be given. But if b has an unknown point at this time, the unification will succeed and the unification will be dropped, while a different propagation order could have resulted in the error being detected there (and will be reliably detected there with Arthur's change-of-propagation-order approach).
One example is the term fun b -> if true then (if b then 1 else 2) else (print_int b): one of the two branches of if true will be type-checked first, and the refined explanation will or will not be used depending on the order (so if you also look at the term with inversed branch order, then one of them will have a nice error message and the other won't.)

The reasoning for having this feature even though the propagation order makes it more fragile than with Arthur's approach is:

  • It is already helpful in the cases where it works (and it works in a lot of situations that beginners may encounter, see the tests).
  • if we later decide to change the type propagation order along Arthur's line (and find a way to make it not break most programs), the work by Armaël in this PR can be reused as-is in this new world, it won't need big changes. So there is no clear reason to delay the merge of this part.

Finally: Armaël commented above that the explanation is not reused for recursive calls to the type-checker to a different expected type. Of course! The context of those sub-calls is different so it would be incorrect to reuse the explanation. The question is rather of why it is correct to reuse the explanation for the sub-calls with the same expected types. For example, if you have a program of the form if (a; b) then e1 else e2, then (a; b) will be type-checked with the if [_] then e1 else e2 explanation, and in fact b will be type-checked recursively with the same explanation, while technically its own context is rather if (a; [_]) then e1 else e2.

In the case of moving from if [_] to if (a; [_]), we considered that the error message for the if context was vague enough that this would not be confused. If b is highlighted with the explanation "this is the condition of an if", users will understand, even if technically (a; b) is the condition. Armaël and I reviewed each case of recursive sub-call with same expected type, and we considered that, each time, the approximation is reasonable. We could define a more refined type of type-forcing contexts (so the explanation would be eg. Sequence_right (If_condition) instead of just (If_condition), and print more precise messages, but we judged that it was not worth it.

This is obvious for if- and while-conditions of for-bounds for examples, which are always very small expressions anyway. It is less clear for the context while cond do [_] done, which forces unit. For example if you write

while cond do
  match foo with
  | K1 -> ()
  | K2 -> 2
  | K3 -> ()
done

then 2 will have the error explanation "... expected of type unit because it is the body of a while". (TODO @Armael: use "it is in the body of a while loop" instead.)

@@ -49,7 +66,7 @@ val check_partial:
Location.t -> Typedtree.case list -> Typedtree.partial
val type_expect:
?in_function:(Location.t * type_expr) ->
Env.t -> Parsetree.expression -> type_expr -> Typedtree.expression
Env.t -> Parsetree.expression -> type_expected -> Typedtree.expression

This comment has been minimized.

@alainfrisch

alainfrisch Dec 4, 2017

Contributor

Wouldn't it be lighter to pass the explanation as an (optional) extra argument, instead of wrapping it in a record with the expected type?

@alainfrisch

alainfrisch Dec 4, 2017

Contributor

Wouldn't it be lighter to pass the explanation as an (optional) extra argument, instead of wrapping it in a record with the expected type?

This comment has been minimized.

@Armael

Armael Dec 4, 2017

Contributor

Well, that's what I was doing at first, but then @gasche suggested that I attach the explanation to the expected type...

@Armael

Armael Dec 4, 2017

Contributor

Well, that's what I was doing at first, but then @gasche suggested that I attach the explanation to the expected type...

This comment has been minimized.

@Armael

Armael Dec 4, 2017

Contributor

More precisely, commit 12d26f3 implements @gasche suggestion, so the commits before that correspond to what you mention.

@Armael

Armael Dec 4, 2017

Contributor

More precisely, commit 12d26f3 implements @gasche suggestion, so the commits before that correspond to what you mention.

This comment has been minimized.

@alainfrisch

alainfrisch Dec 4, 2017

Contributor

Ah ok, sorry, I did not see @gasche 's comment. I've a preference for the old version, but I don't mind if the new one is preferred. If we go for the record solution: (i) would it make sense to include in_function in it? (ii) perhaps use a function to build the record instead of using literal; it will simplify adding more optional fields later if needed.

@alainfrisch

alainfrisch Dec 4, 2017

Contributor

Ah ok, sorry, I did not see @gasche 's comment. I've a preference for the old version, but I don't mind if the new one is preferred. If we go for the record solution: (i) would it make sense to include in_function in it? (ii) perhaps use a function to build the record instead of using literal; it will simplify adding more optional fields later if needed.

This comment has been minimized.

@gasche

gasche Dec 4, 2017

Member

My intuition is that we will more towards more and more structured representations of the bidirectional information propagated during type-checking, and that having it packed in one value is the right way for nice code evolution.

@gasche

gasche Dec 4, 2017

Member

My intuition is that we will more towards more and more structured representations of the bidirectional information propagated during type-checking, and that having it packed in one value is the right way for nice code evolution.

This comment has been minimized.

@Armael

Armael Dec 5, 2017

Contributor

I just pushed a commit that provides a function to build the record as @alainfrisch suggested.

@Armael

Armael Dec 5, 2017

Contributor

I just pushed a commit that provides a function to build the record as @alainfrisch suggested.

@sliquister

This comment has been minimized.

Show comment
Hide comment
@sliquister

sliquister Dec 4, 2017

Contributor

I wonder if, in general, these messages talking about syntax will end up being undesirable in generated code.

I suppose they can be disabled by replacing [if e1 then e2] by [if e1 then e2 else ()], or [while b do e done] by [while (b : bool) do e; (); done] etc, so it's not too bad.

But with the rec one, the hints are going to be weird (not tested with an actual ppx, but the result should be the same), since it will say "you are missing the rec keyword" on a line that says type nonrec:

type sexp

type t (* forgot [@@deriving sexp_of] *)

module M = struct
  type nonrec t = t [@@deriving sexp_of]
  let sexp_of_t : sexp -> t = sexp_of_t
end

Error: Unbound value sexp_of_t
      Hint: You are probably missing the `rec' keyword on line 7.
Contributor

sliquister commented Dec 4, 2017

I wonder if, in general, these messages talking about syntax will end up being undesirable in generated code.

I suppose they can be disabled by replacing [if e1 then e2] by [if e1 then e2 else ()], or [while b do e done] by [while (b : bool) do e; (); done] etc, so it's not too bad.

But with the rec one, the hints are going to be weird (not tested with an actual ppx, but the result should be the same), since it will say "you are missing the rec keyword" on a line that says type nonrec:

type sexp

type t (* forgot [@@deriving sexp_of] *)

module M = struct
  type nonrec t = t [@@deriving sexp_of]
  let sexp_of_t : sexp -> t = sexp_of_t
end

Error: Unbound value sexp_of_t
      Hint: You are probably missing the `rec' keyword on line 7.
@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 4, 2017

Contributor

One way of addressing that issue would be to add a let nonrec construction; which would be used by ppx_deriving when called on a type nonrec. Then, the "missing rec" error message could be disabled on lec nonrec constructions.
I assume pull requests implementing this are welcome :-).

Contributor

Armael commented Dec 4, 2017

One way of addressing that issue would be to add a let nonrec construction; which would be used by ppx_deriving when called on a type nonrec. Then, the "missing rec" error message could be disabled on lec nonrec constructions.
I assume pull requests implementing this are welcome :-).

@sliquister

This comment has been minimized.

Show comment
Hide comment
@sliquister

sliquister Dec 5, 2017

Contributor

It's clearly not worth adding syntax for my concern. I suppose the rec hint could be disabled as well if necessary by adding some let _ = sexp_of_t before the real code.

Contributor

sliquister commented Dec 5, 2017

It's clearly not worth adding syntax for my concern. I suppose the rec hint could be disabled as well if necessary by adding some let _ = sexp_of_t before the real code.

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 5, 2017

Contributor

I had forgotten to add an explanation for the left-hand side of a sequence. The last commit implements this.

Contributor

Armael commented Dec 5, 2017

I had forgotten to add an explanation for the left-hand side of a sequence. The last commit implements this.

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 5, 2017

Contributor

I tweaked the formatting to insert a break hint before the "because it is ..." explanation. This is probably better that no break hint; however now the messages in the tests look a bit weird, because of the staggering produced by the structural hovbox that contains the messages.

Contributor

Armael commented Dec 5, 2017

I tweaked the formatting to insert a break hint before the "because it is ..." explanation. This is probably better that no break hint; however now the messages in the tests look a bit weird, because of the staggering produced by the structural hovbox that contains the messages.

Show outdated Hide outdated typing/typecore.ml
@Drup

This comment has been minimized.

Show comment
Hide comment
@Drup

Drup Dec 7, 2017

Contributor

Isn't it one of those cases where we should check if the associated expression is ghost before using it as an explanation ?

Contributor

Drup commented Dec 7, 2017

Isn't it one of those cases where we should check if the associated expression is ghost before using it as an explanation ?

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Dec 8, 2017

Member

@Drup: we agree with that proposal, thanks! @Armael plans to implement it.

@alainfrisch would you be willing to "drive" this issue, and in particular approve it if you do approve of the proposed change? (I have handled previous change proposals by Armaël but I'm basically a co-author of this one so I don't think it would be best.)

Member

gasche commented Dec 8, 2017

@Drup: we agree with that proposal, thanks! @Armael plans to implement it.

@alainfrisch would you be willing to "drive" this issue, and in particular approve it if you do approve of the proposed change? (I have handled previous change proposals by Armaël but I'm basically a co-author of this one so I don't think it would be best.)

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Dec 8, 2017

Contributor

would you be willing to "drive" this issue

I only looked superficially to the implementation, and don't have any opinion on the proposal itself. I think @gasche would do a better job here (as often, and unfortunately for him).

Contributor

alainfrisch commented Dec 8, 2017

would you be willing to "drive" this issue

I only looked superficially to the implementation, and don't have any opinion on the proposal itself. I think @gasche would do a better job here (as often, and unfortunately for him).

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 10, 2017

Contributor

I added a commit that (I think) implements @Drup suggestion.

Contributor

Armael commented Dec 10, 2017

I added a commit that (I think) implements @Drup suggestion.

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Dec 10, 2017

Member

(I believe this commit is correct as well.)

The search for a voluntary review goes on: @sliquister, @Drup, would one of you be willing to review this patch and formally "approve" it if you believe it is correct?

Member

gasche commented Dec 10, 2017

(I believe this commit is correct as well.)

The search for a voluntary review goes on: @sliquister, @Drup, would one of you be willing to review this patch and formally "approve" it if you believe it is correct?

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Dec 12, 2017

Member

( @let-def, maybe you would be interested in reviewing this as well? )

Member

gasche commented Dec 12, 2017

( @let-def, maybe you would be interested in reviewing this as well? )

@Octachron

This comment has been minimized.

Show comment
Hide comment
@Octachron

Octachron Dec 12, 2017

Contributor

I am wondering if ? should count as a keyword in:

let f g = g ?x:1 ();;

Error: This expression has type int but an expression was expected of type
'a option

Contributor

Octachron commented Dec 12, 2017

I am wondering if ? should count as a keyword in:

let f g = g ?x:1 ();;

Error: This expression has type int but an expression was expected of type
'a option

@let-def

The implementation looks good to me.
I don't have a strong opinion about the effect on end-users, though I believe it is a improvement.

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 13, 2017

Contributor

@let-def : thanks for your comments, I just added commits that address them (I think).

I am wondering if ? should count as a keyword in [optional arguments]

@Octachron : maybe -- though after a quick glance at the existing code, supporting that doesn't seem to be an simple extension of the current patch. So I think it better belongs in a separate PR.

Contributor

Armael commented Dec 13, 2017

@let-def : thanks for your comments, I just added commits that address them (I think).

I am wondering if ? should count as a keyword in [optional arguments]

@Octachron : maybe -- though after a quick glance at the existing code, supporting that doesn't seem to be an simple extension of the current patch. So I think it better belongs in a separate PR.

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Dec 13, 2017

Member

There is a minor typo your last commit message: cnotext -> context.

(I wonder if your comments shouldn't go in the .mli rather the .ml. I think that both are fine, but maybe you could manage to start a trend of actually, you know, documenting interfaces, and that would be Big!)

Member

gasche commented Dec 13, 2017

There is a minor typo your last commit message: cnotext -> context.

(I wonder if your comments shouldn't go in the .mli rather the .ml. I think that both are fine, but maybe you could manage to start a trend of actually, you know, documenting interfaces, and that would be Big!)

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 13, 2017

Contributor

Good point. I amended the last commit and put the comments in the .mli rather than in the .ml.

Contributor

Armael commented Dec 13, 2017

Good point. I amended the last commit and put the comments in the .mli rather than in the .ml.

@Octachron

This comment has been minimized.

Show comment
Hide comment
@Octachron

Octachron Dec 13, 2017

Contributor

maybe -- though after a quick glance at the existing code, supporting that doesn't seem to be an simple extension of the current patch. So I think it better belongs in a separate PR.

Thanks for having a look, I was just asking in case it was an one-liner extension.

Contributor

Octachron commented Dec 13, 2017

maybe -- though after a quick glance at the existing code, supporting that doesn't seem to be an simple extension of the current patch. So I think it better belongs in a separate PR.

Thanks for having a look, I was just asking in case it was an one-liner extension.

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 14, 2017

Contributor

I rebased on top of trunk, to benefit from the super awesome error highlighting in expect tests.

Contributor

Armael commented Dec 14, 2017

I rebased on top of trunk, to benefit from the super awesome error highlighting in expect tests.

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 21, 2017

Contributor

Thanks for the approval @let-def ! Should I rebase on top of trunk and clean-up the history now?

Contributor

Armael commented Dec 21, 2017

Thanks for the approval @let-def ! Should I rebase on top of trunk and clean-up the history now?

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Dec 21, 2017

Member

Yes. (I didn't ping explicitly as I didn't know how much OCaml you wanted during your holidays.)

Member

gasche commented Dec 21, 2017

Yes. (I didn't ping explicitly as I didn't know how much OCaml you wanted during your holidays.)

@Armael

This comment has been minimized.

Show comment
Hide comment
@Armael

Armael Dec 21, 2017

Contributor

Rebased.

Contributor

Armael commented Dec 21, 2017

Rebased.

@gasche gasche merged commit c8c4fb5 into ocaml:trunk Dec 21, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@gasche gasche referenced this pull request Mar 24, 2018

Open

Improved error messages #102

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment