New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Preserve literal formatting in "Did you mean ...?" #10767
Conversation
This is a draft. In the first step, I have implemented the change to preserve the original formatting of Edit: the PR now includes changes for |
(function Constant(Const_int32 (i, _)) -> i | _ -> assert false) | ||
(function i -> Tpat_constant(Const_int32 (i, Int32.to_string i))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know how "hot" this function is. If it is "hot", then it may be worth considering changing Const_int32
from Const_int32 of int32 * string
to Const_int32 of int32 * string option
and using None
here instead of Int32.to_string
to avoid the overhead of conversion. This would complicate handling of Const_int32
in some other places.
(Same for Const_int64
and Const_nativeint
.)
TODO: Run a benchmark to see if this is an issue at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not really about performance; this function is only called once an appropriate candidate counter-example is found. (In some case this part of the space search may be reached several times due to GADT troubles occurring earlier, but still I think you would be hard-pressed to find a counter-example where the performance actually matter.) On the other hand, None
has the benefit of providing an easy way out for constants that are produced by code, instead of parsed from user input, and this may be convenient in other places at well. You would know better, as you have a good grasp of this slice of functionality right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gasche! For future reference: the function
(function i -> Tpat_constant(Const_int32 (i, Int32.to_string i)))
is passed as the make
argument to build_other_constant
, and make
is only called once the counter-example has been established. I'm no longer worried about performance in this instance.
On the other hand,
None
has the benefit of providing an easy way out for constants that are produced by code, instead of parsed from user input, and this may be convenient in other places at well.
This makes me wonder where the string comes from in contexts where there is no source code, e.g. in dumpobj
. So far in this PR, this place in parmatch.ml
is the only place where we have to synthesise the string representation from the integer value. I would have expected there to be other places (e.g. in dumpobj
) where this needs to happen as well, but haven't found any.
If there were more places where we would have to "fake" the original string using something like Int32.to_string
then that would strengthen the case for string option
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(For reference: #10767 (comment) explains more of what is going on in build_other
generally and why synthesising a decimal is a sensible choice here.)
| Const_base(Const_int32 (i, _)) -> printf "%ldl" i | ||
| Const_base(Const_nativeint (i, _)) -> printf "%ndn" i | ||
| Const_base(Const_int64 (i, _)) -> printf "%LdL" i |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In all other places where pretty-printing takes place, the original string formatting (the second tuple element) is now used. I somewhat arbitrarily went for "normalisation to decimal" here, as this is "dumpobj", so the input is a binary, which means that we don't have the original formatting at hand anyway (I have not read the surrounding code in depth, so this is an assumption -- e.g. I'm not sure where the string representation actually comes from in this context).
| Const_int32 i -> Printf.sprintf "%ldl" i | ||
| Const_int64 i -> Printf.sprintf "%LdL" i | ||
| Const_nativeint i -> Printf.sprintf "%ndn" i | ||
| Const_int32 (_, s) -> Printf.sprintf "%sl" s | ||
| Const_int64 (_, s) -> Printf.sprintf "%sL" s | ||
| Const_nativeint (_, s) -> Printf.sprintf "%sn" s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Where is this used? Is the old behaviour (conversion to decimal) more useful here?
| Const_base(Const_int32 n) -> fprintf ppf "%lil" n | ||
| Const_base(Const_int64 n) -> fprintf ppf "%LiL" n | ||
| Const_base(Const_nativeint n) -> fprintf ppf "%nin" n | ||
| Const_base(Const_int32 (_, s)) -> fprintf ppf "%sl" s | ||
| Const_base(Const_int64 (_, s)) -> fprintf ppf "%sL" s | ||
| Const_base(Const_nativeint (_, s)) -> fprintf ppf "%sn" s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Would it be more useful to keep the previous behaviour (conversion to decimal) here?
Before: 1 | let _ = Int64.(add 0xAA_BBl 2L);; ^^^^^^^^ Error: This expression has type int32 but an expression was expected of type int64 Hint: Did you mean `43707L'? After: 1 | let _ = Int64.(add 0xAA_BBl 2L);; ^^^^^^^^ Error: This expression has type int32 but an expression was expected of type int64 Hint: Did you mean `0xAA_BBL'?
fc5fec1
to
480af00
Compare
| Const_int32 (i) -> fprintf f "Const_int32 %ld" i; | ||
| Const_int64 (i) -> fprintf f "Const_int64 %Ld" i; | ||
| Const_nativeint (i) -> fprintf f "Const_nativeint %nd" i; | ||
| Const_int32 (_, s) -> fprintf f "Const_int32 %sl" s; | ||
| Const_int64 (_, s) -> fprintf f "Const_int64 %sL" s; | ||
| Const_nativeint (_, s) -> fprintf f "Const_nativeint %sn" s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: Would it be more useful to keep the previous behaviour (conversion to decimal) here?
Ready for review. I have left a few questions ("Q: ...?") as comments where I could use feedback. They essentially boil down to this question: given that we now carry the original literal around as a string, whenever we output the literal, we have a choice: we can either use the original literal formatting, or we can normalise to decimal formatting. As of right now, this PR changes the formatting so instead of decimals without underscores, the original formatting is used everywhere except in The more conservative approach would be to keep normalising to decimals without underscores everywhere (the current behaviour) except in the type hint (as preserving the literal formatting in the type hint is the goal of this PR). I don't really have an opinion or good arguments either way, so let me know if you do! |
I am not that fond of this PR current approach of modifying the |
So far, existing approaches to keep more information on literals have changed things in the On the other hand, maybe we want to split the notion of constants used in the typedtree from the one of the middle-end, so we would have a separate type |
Thanks @Octachron and @gasche for taking a look! I don't have enough context to judge the pros and cons of
I am willing to prototype though. Would you find it helpful if I had a go at implementing a proof-of-concept PR for the improved error message on top of just a change to I'm afraid that proposal (C) is a bit too abstract for me to go and create a proof-of-concept, I would need more details on that idea. In short, I appreciate that there are design issues at play here that I don't have much context on. But I would like to keep this PR moving. If there is anything I can do to to move the discussion forward, please let me know. |
Mis-clicked, sorry. I did not mean to close the PR. |
I'll let @Octachron make a suggestion :-) (Your proposal to prototype the |
I agree that modifying the typedtree for the error message is not ideal. But then, I would argue that a better fix would be to make sure that the error-handling part of the functions In particular, the |
@Octachron wrote:
I have an implementation based on this idea in #10818. It is a smaller, more local change that still achieves the goal of preserving the original integer formatting in the type hint. Let me know what you think -- my hunch is that #10818 is preferable, in which case this PR can be closed. |
Closing in favour of #10818. |
Implements "Proposal 1" in #10766.
Alternative implementation: see #10818.
User-visible changes
The goal of this PR is to preserve the original formatting of integer literals (
int32
,int64
andnativeint
) in the "Did you mean ...?" hint.Before
After
API changes
Asttypes.constant
changes:Const_int32
,Const_int64
andConst_nativeint
are now tuples; the second value is a string containing the original formatting of the literal (but without the suffix).Note that
Asttypes.mli
warns about this module being unstable and part ofcompiler-libs
.