Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(PUP-5964) Improve how type mismatches are asserted and reported #4723

Conversation

thallgren
Copy link
Contributor

This PR contains several commits targeted to improve reporting of type mismatches and to make them more consistent. The details are in the commit messages.

This PR is for stable and supersedes PR-4709

@thallgren thallgren added Language blocked PRs blocked on work external to the PR itself labels Feb 26, 2016
@thallgren thallgren force-pushed the issue/pup-5964/improve-type-assertion-reporting branch from e01f30f to ef3e353 Compare February 26, 2016 12:45
@@ -66,7 +72,42 @@ def to_s
end
end

# Module to handle present/past tense.
#
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed? Don't understand.

@hlindberg
Copy link
Contributor

I don't get all the stuff with pasta and present tense - why are they not all in present tense?

@thallgren
Copy link
Contributor Author

Type assertion has always used past tense. Consider:

"The default value returned from the block has wrong type: expected an Integer value, got String"

"The default value returned from the block has wrong type: expects an Integer value, got String"

the latter doesn't look too good. This didn't use to be a problem because the TypeAsserter always used past tense. Now it uses the TypeMismatchDescriber. Incidentally, the mismatch describer is also used when validating parameter assignments. So we see things like:

"Function 'a' has no parameter named 'x'"

this could of course be changed to always use past tense:

"Function 'a' did not have a parameter named 'x'"

but as you noticed above, that too looks odd. There is also a common ground:

"Function 'a' parameter 'x' expects an Integer value, got String"

@hlindberg
Copy link
Contributor

It must say: "Function 'a' has no parameter named 'x'", since if it says "Function 'a' did not have a parameter named 'x'", it is undefined how far back in the past it did not have it, did it loose it, does it have it now, but not a while ago?
Past tense on the expectancy makes sense as it refers to the just failed assertion.
Thus, this works: "The default value returned from the block has wrong type: expected an Integer value, got String".

So why not always "expected" ?

@thallgren
Copy link
Contributor Author

Parameter validation always uses present tense. The test is just there to verify that the mechanism works for past tense as well. I can remove that test if you don't want it there. The mechanism is generic and past tense is only used for failed assertions.

@hlindberg
Copy link
Contributor

I am not referring to the test. The handling of "tense" seems like something that should not be needed.
The condition is naturally in present tense since it was and still is the same way (e.g. "has no parameter"). The check just took place and it then expected. I don't understand what it is I am not getting here. Can you give an example of an error that would look bad with "expected" ?

@thallgren
Copy link
Contributor Author

Yes I can. But it requires an explanation.

Things are often nested. When multiple signatures are used, an none of them matches, each mismatch is reported as the signature 'expects' something. That "something" can be complex in itself and if it doesn't use the same tense as the top level, things will look odd.

You concluded above that "Function 'a' did not have a parameter named 'x'" looks very odd. This is one of the "somethings" I refer to in the previous paragraph. So past tense is out for a "something". Consequently, we cannot use "expected" here. All parameter mismatch reporting must be in present tense (as it always has been).

Type assertion is different. I gave examples of why it has to be past tense above.

@hlindberg
Copy link
Contributor

Can you show one that would look odd for the multiple signatures case? (it needs to state what it exects since it still expects the same signatures after the error is reported). I am not sure it really looks odd to say expected after that.

@thallgren
Copy link
Contributor Author

Does it matter that the tense is correct throughout?

@hlindberg
Copy link
Contributor

I am trying to understand what you consider correct tense and why all of this is needed.

@thallgren
Copy link
Contributor Author

Before I made the TypeAsserter take advantage of the TypeMismatchDescriber, the type asserter always used past tense and the TypeMismatchDescriber always used present tense. When I made the TypeAsserter use the TypeMismatchDescriber, some constructs became bad.

I'm sure there are several ways this can be resolved. I chose a way that made it possible to control whether or not the TypeMismatchDescriber should use past or present tens. This works in all cases. Unless there's a compelling reason not to do it this way, I would prefer if we let that solution stand rather than wasting more time coming up with other solutions.

@hlindberg
Copy link
Contributor

Seems a bad thing that they use different tense, and then fixing it with more code. Are people helped by messages being in different tense? I wondered why it was this way, your answer is basically that they are different because they are different and with a parameter they can continue to be different. I got that, you would not have added the parameter otherwise.

I asked for examples to be able to asses if it is best to do what you did, or if messages of one kind should be changed to the other (or if that is worse), or if there is middle ground. The example showing a change to past tense in the first part and present in the second did indeed look strange - also because the condition that is mentioned in the first part still exist. Maybe the messages need to make a difference because they have to make distinction between conditions that still exist, and things that were produced (and do now not exist) - in such sentences there can be a difference of tense.

Back to my question: is it really needed (or is the parameter there only for the reason to bridge between two different styles of messages)? I have not said one word about that I think we must change this, I just asked why.

I am happy to help edit messages if that makes it a better solution for users. I don't think it is a waste of time to reduce complexity and making things clear.

@thallgren
Copy link
Contributor Author

OK. Let's leave this PR unmerged for now then. We have more important things to do. It would have been nice to get this in now (I obviously need some of it) but it's not that important.

@hlindberg
Copy link
Contributor

Stable is blocked at the moment yes. Has nothing to do with asking about the "tense" and messages.

@hlindberg
Copy link
Contributor

This needs a rebase. And we can then talk about the handling of "tense" in person.

@hlindberg hlindberg removed the blocked PRs blocked on work external to the PR itself label Mar 2, 2016
@thallgren thallgren force-pushed the issue/pup-5964/improve-type-assertion-reporting branch from ef3e353 to dfe4a48 Compare March 3, 2016 12:40
The TypeAsserter methods takes a `subject` parameter that explains in
what context the assertion takes place. It is often desirable to create
this subject string from things in that context (i.e. format a string).
Formatting strings that may or may not be used is however not a good
practice since the formatting itself is resource consuming.

This commit makes it possible to defer the formatting of the subject
string until an error actually has been detected.

This commit enables deferred formatting in one of two ways:

1. As an alternative to passing the subject as a String, it can now also
   be passed as an Array. When an Array is used, the first element is
   treated as a format string and the remaining elements are passed as
   arguments to that format string.
2. A block can be given provided to the assert call. The result of
   calling that block with the given subject as an argument will become
   the new subject. If the result is an Array, then it will be formatted
   as described in #1.
Before this commit, no Tuple was assignable to a PArrayType that had
no defined element type. This was obviously wrong since without an
element type, the Array should have no opinion about it and instead
allow all Tuples.

The fix will have little or no impact since the type parser is unable
to produce Array[undef]. It will parse the string 'Array' to mean
Array[Data,0,default].
Before this commit, the TypeMismatchDescriber would sometimes
generalize too much because the classes of the expected and actual
types didn't match (and PStructType != PHashType) or not generalize
although it was possible (generalized form of actual type still not
assignable).

This commit improves the logic that decides where or not to generalize
the types before reporting the mismatch.
Before this commit, TypeAsserter had it's own poor-mans solution for
how to format the string used for reporting a mismatch. This changes
in this commit so that the TypeMismatchDescriber is utilized instead.

The commit also changes a couple of places where the TypeAsserter
is used since the generated messages no longer made sense.
Before this commit, the assert_type() function had it's own way of
presenting a type mismatch. This commit replaces that and instead
delegates the formatting to the TypeMismatchDescriber.
This commit adds a 'tense' argument to the TypeMismatchDescriptor. The
tense can be either :past or :present and controls the wording of the
created sentences accordingly.

The commit also adjusts the wording of two unit test descriptions for
the mismatch describer.
@thallgren thallgren force-pushed the issue/pup-5964/improve-type-assertion-reporting branch from dfe4a48 to 7970636 Compare March 3, 2016 14:27
@hlindberg
Copy link
Contributor

This was manually rebased (well git did the work), and merged to stable at f1a0301 and then on to master. Closing this PR.

@hlindberg hlindberg closed this Mar 3, 2016
@thallgren thallgren deleted the issue/pup-5964/improve-type-assertion-reporting branch March 9, 2016 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants