-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Build an explicit abstract environment for check_regularity (fix #12334) #12368
Conversation
messages (#12334) *) | ||
let abs_env = | ||
List.fold_left2 (enter_type ~abstract_abbrevs:false rec_flag) | ||
env sdecl_list ids_list in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected abstract_abbrevs:true
here. Are you using true
anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a whole new env here seems potentially expensive. Is there instead a way to catch and reinterpret the error message in check_regularity
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gasche Well-spotted. I did various testing, and using true or false doesn't seem to have much impact, but it should be true
here for keeping the original behavior. Fixed now
@goldfirere Actually, this is not very expensive. We are just building a dummy env here, which contains only abstract types, and whose type parameters are not constrained. Using the original environment was really a dirty hack.
1 | type 'a t = 'a foo foo | ||
^^^^^^^^^^ | ||
Error: Constraints are not satisfied in this type. | ||
Type "'a foo foo" should be an instance of "int foo" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be very nice to get an additional
Constraints cannot be inferred within a mutually recursive group of definitions
or something. Because 'a foo foo
is actually an instance of int foo
.
Not sure how hard it would be to arrange for the better message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having an exact error message is going to be difficult here. We could track the recursive use, and point it as a potential cause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that tracking the recursive use would be very helpful, if it's not too hard.
messages (#12334) *) | ||
let abs_env = | ||
List.fold_left2 (enter_type ~abstract_abbrevs:false rec_flag) | ||
env sdecl_list ids_list in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Creating a whole new env here seems potentially expensive. Is there instead a way to catch and reinterpret the error message in check_regularity
?
Ok, new attempt. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the new implementation. Can you explain why you decided against the previous approach? (Is it related to performance concerns on creating an environment?)
Type "M.b" is abstract because no corresponding cmi file | ||
was found in path. | ||
Type "M.a" is abstract because no corresponding cmi file | ||
was found in path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line breaks here read like a regression caused by the use of @
instead of
in the error message. Why the change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the addition of Type
at the beginning makes the line potentially too long for a standard terminal.
If you think there is no use for line breaks, we can also go back to
.
typing/printtyp.ml
Outdated
| _ -> | ||
fprintf ppf | ||
"@,@[Type %a is abstract because no corresponding cmi file@ \ | ||
was found in path.@]" (Style.as_inline_code path) p |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like you are catching all errors on missing references, and implicitly assuming that non-qualified paths all come from the specific errors of constraint checking, and specializing the error message for that. This sounds fragile and hard to understand and not a good idea.
This implementation is an answer to @goldfirere point that it would be nice to have a warning that we used the abstract version of the types. This seems to be the easiest way to do that. Performance is not an issue here. I'm definitely not suggesting that we merge this code in its current state, I am rather exploring the design space. |
Doing this properly probably needs to put some kind of "reason" on the |
@lpw25 That's precisely what I was trying to do yesterday, but I had no time to push then. Done now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I didn't do a full review but just glanced at the last commit)
typing/printtyp.ml
Outdated
@@ -2432,19 +2432,20 @@ let explain mis ppf = | |||
| None -> () | |||
| Some explain -> explain ppf | |||
|
|||
let warn_on_missing_def env ppf t = | |||
type env_kind = Normal_env | Abstract_env of string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use a variant type for the Abstract_env
payload, turned into a string only in the error-message production code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a new policy ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is a policy yet, but I concur that messages at destination to humans should be emitted in one piece if possible, in order to have the full context available.
I see three major benefits for this constraint on the error message:
- it can be read and copy edited in one place
- it can be formatted as a whole
- it can be translated with enough context to enable a good translation
typing/printtyp.ml
Outdated
match p with | ||
| Path.Pident _ -> | ||
match p, env_kind with | ||
| Path.Pident _, Abstract_env where -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This match should have env_kind
as the only scrutinee.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is not a Pident
, the abstract environment is not the cause.
Of course, if we start to have other cases, this may not be the case, so in some way, your above suggestion of making the payload a variant may make this code future-safe.
The code doesn't look like what I was suggesting. I was suggesting the following in type abstract_reason =
| Missing_cmi
| Recursive_dummy
and ('lbl, 'cstr) type_kind =
Type_abstract of abstract_reason option
| ... |
This seems like a fine approach to me, and I’m happy with the final error message. Thanks, @garrigue! I will Approve, but others may want to chime in if they see a better way. |
I see. This is indeed more elegant, but also more invasive. |
Personally I still feel grumpy about the proposed approach. The problem is that even if we know that the environment we have is was intentionally built with some definitions missing, we can't know when we get a I see two ways to avoid relying on this criterion:
Both approaches avoid the potential confusion that there is today between "Not_found errors on recursive neighbors" and "other Not_found errors". |
Good point @gasche — this approach will get the message wrong if there
really is a missing cmi. I was too eager here!
…On Wed, Jul 19, 2023 at 8:31 AM Gabriel Scherer ***@***.***> wrote:
Personally I still feel grumpy about the proposed approach. The problem is
that even if we know that the environment we have is was intentionally
built with some definitions missing, we can't know when we get a Not_found
error whether this particular error comes from one of the definitions
intentionally hidden from the environment, or from an actually undefined
compilation unit somewhere else. So we do still need to rely on the
criterion that "unqualified paths come from from our fake environment",
which I find fragile and inelegant. (When I reviewed the change earlier I
was under the impression that the match on p, env_kind was just a match
on env_kind with the condition on p redundant, but in fact this is not
the case.)
I see two ways to avoid relying on this criterion:
- Leo's proposal to mark abstract types with information, so that
lookup of "types from the current recursive nest" return an abstract type
marked as a dummy instead of failing with Not_found -- but I agree
that it is more invasive.
- As a workaround, the Partial_env_check_regularity construct could
come with a list/set of paths that are intentionally missing in the
environment, and we could check membership into this list/set instead of
relying on the Pident _ catch-all.
Both approaches avoid the potential confusion that there is today between
"Not_found errors on recursive neighbors" and "other Not_found errors".
—
Reply to this email directly, view it on GitHub
<#12368 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAO6A4ZQ3MQSYO3766GK6Z3XQ7HTZANCNFSM6AAAAAA2ECW65Y>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
So we discussed this during a type-system meeting and @garrigue was basically interested in trying out Leo's suggestions, with tweak to the definitions. (There is no |
8f2a2cb
to
778fe23
Compare
778fe23
to
78bf6fa
Compare
Switched to Leo's approach. Of course this requires changes in many more files. I kept the previous approach at https://github.com/cocti/ocaml/tree/flag_in_Constraint_failed . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the new approach better, thanks!
Reading the patch, I understand @lpw25's original proposal better. When he proposed to add a Missing_cmi
reason, I think the idea was that the Env.find_type*
lookup functions would return Type_abstract Missing_cmi
instead of raising Not_found
. This is a yet more invasive change but it would be fairly nice, because it could remove a lot of Not_found
handling in the codebase (sharing the logic with the Type_abstract _
case), and make it more robust in places where we don't currently catch Not_found
.
This would also let us get rid of the warn_on_missing_def*
machinery on printtyp: if each abstract type comes with an explanation of why it is abstract, we can print it (in non-obvious cases) when printing the explanation of the kind difference, instead of adding an extra explanation in an ad-hoc way as we do currently.
I think that these improvements could be implemented as separate / follow-up PRs, and that the current state is already an improvement. See minor comments on the code still.
| None, _ -> Abstract_def, None | ||
| Some _, None -> Abstract_def, Some (Btype.newgenvar ()) | ||
| Some _, Some reason -> reason, None | ||
in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is obscure. Instead of taking an optional abstract_abbrevs
parameter, could you take a non-optional abstract_reason
parameter? There are two calls to enter_type
, one with Abstract_def
and one with Abstract_rec_check_regularity
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea was that this is more versatile: one could eventually build abstract environments with other reasons.
But if you think this is an overgeneralization, I can go back to the original approach of using a boolean for abstract_abbrevs
, and hard-coding the use of Abstract_rec_check_regularity
here.
Or do you just mean that we should pass a value of type abstract_reason option
at call sites ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading the code again, I understand the idea better, and I agree that my "simplification" proposed above was not such a good idea. Some documentation of the abstract_abbrevs
parameter would have helped.
My understanding of the intended behavior: enter_type
will turn datatype declarations into abstract types, and type synonyms into "dummy type synonym" of the form type t = <newgenvar()>
. If abstract_abbrevs
is provided, then all types (datatypes or synonyms) are made abstract with the provided reason.
Documenting this intention could be nice, because the code is not so clear. I also wonder what is the purpose of those 'dummy type synonyms" -- why not just turn type synonyms into abstract types as datatype declarations? (Ideally this could be explained in a comment.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added some explanations.
I thought of this too, but there is a problem: when we look up a type in the environment, we do not give its arity. So the dummy type returned in case of missing cmi would be nullary, requiring extra care when reporting. I suppose this may still be better than raising |
This PR was hit by a flaky CI failure: on AppVeyor (Windows) the test |
Imported in a simplified form from ocaml#12368
The call to
check_regularity
inTypecore
was using the original environment, which does not contain the newly added types. From a typing point of view, this is correct (those types will be handled as abstract), but this could cause strange error messages such as in #12334.We now build an explicit abstract environment, extending
enter_type
with an extra switch.Also fixed a bud in
check_regularity
which reported wrong constraints in a reversed order.Note: it might be correct to used the partially abstract
temp_env
built above, but this requires a thorough examination.