Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add explanation for errors caused by a type constraint propagated from a keyword #1510

Merged
merged 3 commits into from Dec 21, 2017

Conversation

@Armael
Copy link
Member

@Armael Armael commented Dec 3, 2017

Another change extracted from PR #102 . This adds specific explanation for type errors that involve type information propagated by a keyword such as if, for, while...

For example,

if true then 42

highlights "42" and prints:

Error: This expression has type int but an expression was expected of type
         unit because it is the result of a conditional with no else branch

The idea for this patch comes from #102 but the implementation proposed here is different. The original patch changed the order of propagation in order to get more precise information -- but this doesn't interact well with non-principality AFAIU. The approach followed in this patch is as follows: in the type_expect function, we add an "explanation" optional argument. If unification fails between something and the expected type ty_expected, then this explanation (if present) explains why this type was expected (e.g. "because it was the body of a for-loop"). When type_expect calls itself recursively with a different expected type, the explanation is dropped.

@Armael Armael force-pushed the Armael:type_expect_unif_explanation branch from cf803a1 to 3b3ddc7 Dec 3, 2017
bool because it is the condition of a if-statement
|}];;

fun b -> if true then (if b then ()) else (print_int b);;

This comment has been minimized.

@gasche

gasche Dec 3, 2017
Member

You should explicit point out that this is a case where the refined message doesn't show up because of propagation order.

let loc = sexp.pexp_loc in
(* Record the expression type before unifying it with the expected type *)
let unify_with_explanation = unify_with_explanation explanation in

This comment has been minimized.

@gasche

gasche Dec 3, 2017
Member

This should be with_explanation


and type_expect_
?in_function ?(recarg=Rejected) ?explanation
env sexp ty_expected =

This comment has been minimized.

@gasche

gasche Dec 3, 2017
Member

Instead of a separate explanation argument, have you considered carrying the information within ty_expected, turning it into a pair of a type and an optional explanation?

@gasche
Copy link
Member

@gasche gasche commented Dec 3, 2017

Some more comments on the design choices:

The explanation type is actually a list of approximations of program contexts, that explain why we propagate a certain expected type. For example, you propagate bool when you go under the context if [_] then e1 else e2, where [_] denotes the hole of the sub-term that is going to be type-checked.

Arthur's implementation is careful to delay the unification of the expected type until the last possible moment. This implementation does not do it, which means that in some case the explanation will not be used while it should be. If you write if b then e1 else e2 and b was already determined to have type int, there will be an error when unifying b's type with bool, and the explanation will be given. But if b has an unknown point at this time, the unification will succeed and the unification will be dropped, while a different propagation order could have resulted in the error being detected there (and will be reliably detected there with Arthur's change-of-propagation-order approach).
One example is the term fun b -> if true then (if b then 1 else 2) else (print_int b): one of the two branches of if true will be type-checked first, and the refined explanation will or will not be used depending on the order (so if you also look at the term with inversed branch order, then one of them will have a nice error message and the other won't.)

The reasoning for having this feature even though the propagation order makes it more fragile than with Arthur's approach is:

  • It is already helpful in the cases where it works (and it works in a lot of situations that beginners may encounter, see the tests).
  • if we later decide to change the type propagation order along Arthur's line (and find a way to make it not break most programs), the work by Armaël in this PR can be reused as-is in this new world, it won't need big changes. So there is no clear reason to delay the merge of this part.

Finally: Armaël commented above that the explanation is not reused for recursive calls to the type-checker to a different expected type. Of course! The context of those sub-calls is different so it would be incorrect to reuse the explanation. The question is rather of why it is correct to reuse the explanation for the sub-calls with the same expected types. For example, if you have a program of the form if (a; b) then e1 else e2, then (a; b) will be type-checked with the if [_] then e1 else e2 explanation, and in fact b will be type-checked recursively with the same explanation, while technically its own context is rather if (a; [_]) then e1 else e2.

In the case of moving from if [_] to if (a; [_]), we considered that the error message for the if context was vague enough that this would not be confused. If b is highlighted with the explanation "this is the condition of an if", users will understand, even if technically (a; b) is the condition. Armaël and I reviewed each case of recursive sub-call with same expected type, and we considered that, each time, the approximation is reasonable. We could define a more refined type of type-forcing contexts (so the explanation would be eg. Sequence_right (If_condition) instead of just (If_condition), and print more precise messages, but we judged that it was not worth it.

This is obvious for if- and while-conditions of for-bounds for examples, which are always very small expressions anyway. It is less clear for the context while cond do [_] done, which forces unit. For example if you write

while cond do
  match foo with
  | K1 -> ()
  | K2 -> 2
  | K3 -> ()
done

then 2 will have the error explanation "... expected of type unit because it is the body of a while". (TODO @Armael: use "it is in the body of a while loop" instead.)

@Armael Armael force-pushed the Armael:type_expect_unif_explanation branch 5 times, most recently from e0ea009 to 597bc79 Dec 3, 2017
@@ -49,7 +66,7 @@ val check_partial:
Location.t -> Typedtree.case list -> Typedtree.partial
val type_expect:
?in_function:(Location.t * type_expr) ->
Env.t -> Parsetree.expression -> type_expr -> Typedtree.expression
Env.t -> Parsetree.expression -> type_expected -> Typedtree.expression

This comment has been minimized.

@alainfrisch

alainfrisch Dec 4, 2017
Contributor

Wouldn't it be lighter to pass the explanation as an (optional) extra argument, instead of wrapping it in a record with the expected type?

This comment has been minimized.

@Armael

Armael Dec 4, 2017
Author Member

Well, that's what I was doing at first, but then @gasche suggested that I attach the explanation to the expected type...

This comment has been minimized.

@Armael

Armael Dec 4, 2017
Author Member

More precisely, commit 12d26f3 implements @gasche suggestion, so the commits before that correspond to what you mention.

This comment has been minimized.

@alainfrisch

alainfrisch Dec 4, 2017
Contributor

Ah ok, sorry, I did not see @gasche 's comment. I've a preference for the old version, but I don't mind if the new one is preferred. If we go for the record solution: (i) would it make sense to include in_function in it? (ii) perhaps use a function to build the record instead of using literal; it will simplify adding more optional fields later if needed.

This comment has been minimized.

@gasche

gasche Dec 4, 2017
Member

My intuition is that we will more towards more and more structured representations of the bidirectional information propagated during type-checking, and that having it packed in one value is the right way for nice code evolution.

This comment has been minimized.

@Armael

Armael Dec 5, 2017
Author Member

I just pushed a commit that provides a function to build the record as @alainfrisch suggested.

@sliquister
Copy link
Contributor

@sliquister sliquister commented Dec 4, 2017

I wonder if, in general, these messages talking about syntax will end up being undesirable in generated code.

I suppose they can be disabled by replacing [if e1 then e2] by [if e1 then e2 else ()], or [while b do e done] by [while (b : bool) do e; (); done] etc, so it's not too bad.

But with the rec one, the hints are going to be weird (not tested with an actual ppx, but the result should be the same), since it will say "you are missing the rec keyword" on a line that says type nonrec:

type sexp

type t (* forgot [@@deriving sexp_of] *)

module M = struct
  type nonrec t = t [@@deriving sexp_of]
  let sexp_of_t : sexp -> t = sexp_of_t
end

Error: Unbound value sexp_of_t
      Hint: You are probably missing the `rec' keyword on line 7.
@Armael
Copy link
Member Author

@Armael Armael commented Dec 4, 2017

One way of addressing that issue would be to add a let nonrec construction; which would be used by ppx_deriving when called on a type nonrec. Then, the "missing rec" error message could be disabled on lec nonrec constructions.
I assume pull requests implementing this are welcome :-).

@sliquister
Copy link
Contributor

@sliquister sliquister commented Dec 5, 2017

It's clearly not worth adding syntax for my concern. I suppose the rec hint could be disabled as well if necessary by adding some let _ = sexp_of_t before the real code.

@Armael
Copy link
Member Author

@Armael Armael commented Dec 5, 2017

I had forgotten to add an explanation for the left-hand side of a sequence. The last commit implements this.

@Armael
Copy link
Member Author

@Armael Armael commented Dec 5, 2017

I tweaked the formatting to insert a break hint before the "because it is ..." explanation. This is probably better that no break hint; however now the messages in the tests look a bit weird, because of the staggering produced by the structural hovbox that contains the messages.

@@ -3163,7 +3164,8 @@ and type_expect_
exp_env = env }
end
| Pexp_sequence(sexp1, sexp2) ->
let exp1 = type_statement env sexp1 in
let exp1 = type_statement ~explanation:Sequence_left_hand_side
env sexp1 in

This comment has been minimized.

@gasche

gasche Dec 6, 2017
Member

I would have expected a test for -strict-sequence to show up in the patch. Where does this happen?

This comment has been minimized.

@trefis

trefis Dec 6, 2017
Contributor

It doesn't appear in this patch because it is already present in type_statement which was modified previously (loop bodies are handled in exactly the same way as the lhs of a sequence).

This comment has been minimized.

@gasche

gasche Dec 6, 2017
Member

Thanks!

@Drup
Copy link
Contributor

@Drup Drup commented Dec 7, 2017

Isn't it one of those cases where we should check if the associated expression is ghost before using it as an explanation ?

@gasche
Copy link
Member

@gasche gasche commented Dec 8, 2017

@Drup: we agree with that proposal, thanks! @Armael plans to implement it.

@alainfrisch would you be willing to "drive" this issue, and in particular approve it if you do approve of the proposed change? (I have handled previous change proposals by Armaël but I'm basically a co-author of this one so I don't think it would be best.)

@alainfrisch
Copy link
Contributor

@alainfrisch alainfrisch commented Dec 8, 2017

would you be willing to "drive" this issue

I only looked superficially to the implementation, and don't have any opinion on the proposal itself. I think @gasche would do a better job here (as often, and unfortunately for him).

@Armael
Copy link
Member Author

@Armael Armael commented Dec 10, 2017

I added a commit that (I think) implements @Drup suggestion.

@gasche
Copy link
Member

@gasche gasche commented Dec 10, 2017

(I believe this commit is correct as well.)

The search for a voluntary review goes on: @sliquister, @Drup, would one of you be willing to review this patch and formally "approve" it if you believe it is correct?

@gasche
Copy link
Member

@gasche gasche commented Dec 12, 2017

( @let-def, maybe you would be interested in reviewing this as well? )

@Octachron
Copy link
Member

@Octachron Octachron commented Dec 12, 2017

I am wondering if ? should count as a keyword in:

let f g = g ?x:1 ();;

Error: This expression has type int but an expression was expected of type
'a option

Copy link
Contributor

@let-def let-def left a comment

The implementation looks good to me.
I don't have a strong opinion about the effect on end-users, though I believe it is a improvement.

type type_expected = private {
ty: type_expr;
explanation: type_forcing_context option;
}

This comment has been minimized.

@let-def

let-def Dec 13, 2017
Contributor

For future reference, would you mind adding a small comment to explain the point of type_forcing_context and type_expected?
At least that type_forcing_context is not affecting type checking itself but is used for generating error messages.

Line _, characters 3-4:
Error: This expression has type int but an expression was expected of type
bool
because it is in the condition of a if-statement

This comment has been minimized.

@let-def

let-def Dec 13, 2017
Contributor

an if statement?

@Armael
Copy link
Member Author

@Armael Armael commented Dec 13, 2017

@let-def : thanks for your comments, I just added commits that address them (I think).

I am wondering if ? should count as a keyword in [optional arguments]

@Octachron : maybe -- though after a quick glance at the existing code, supporting that doesn't seem to be an simple extension of the current patch. So I think it better belongs in a separate PR.

@gasche
Copy link
Member

@gasche gasche commented Dec 13, 2017

There is a minor typo your last commit message: cnotext -> context.

(I wonder if your comments shouldn't go in the .mli rather the .ml. I think that both are fine, but maybe you could manage to start a trend of actually, you know, documenting interfaces, and that would be Big!)

@Armael Armael force-pushed the Armael:type_expect_unif_explanation branch from c96053d to b2e7c75 Dec 13, 2017
@Armael
Copy link
Member Author

@Armael Armael commented Dec 13, 2017

Good point. I amended the last commit and put the comments in the .mli rather than in the .ml.

@Octachron
Copy link
Member

@Octachron Octachron commented Dec 13, 2017

maybe -- though after a quick glance at the existing code, supporting that doesn't seem to be an simple extension of the current patch. So I think it better belongs in a separate PR.

Thanks for having a look, I was just asking in case it was an one-liner extension.

@Armael Armael force-pushed the Armael:type_expect_unif_explanation branch from b2e7c75 to 73d4e8b Dec 14, 2017
@Armael
Copy link
Member Author

@Armael Armael commented Dec 14, 2017

I rebased on top of trunk, to benefit from the super awesome error highlighting in expect tests.

@Armael
Copy link
Member Author

@Armael Armael commented Dec 21, 2017

Thanks for the approval @let-def ! Should I rebase on top of trunk and clean-up the history now?

@gasche
Copy link
Member

@gasche gasche commented Dec 21, 2017

Yes. (I didn't ping explicitly as I didn't know how much OCaml you wanted during your holidays.)

@Armael Armael force-pushed the Armael:type_expect_unif_explanation branch from 73d4e8b to 55624d7 Dec 21, 2017
@Armael
Copy link
Member Author

@Armael Armael commented Dec 21, 2017

Rebased.

@gasche gasche merged commit c8c4fb5 into ocaml:trunk Dec 21, 2017
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

8 participants
You can’t perform that action at this time.