Enhance error messages of conversion errors #1212

Askaholic · 2020-10-03T22:02:42Z

Adds the name of arguments that generate conversion errors as mentioned in bullet 1 of #1055 and changes the message of PyDowncastError to match more closely with existing python errors. I'm very new to procedural macros so this is likely not the best way to go about things. Here's what it looks like now:

#[pyfunction]
fn tuple_func(foo: (i64, &str)) {}

>>> tuple_func("bar")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument 'foo': 'str' object cannot be converted to 'PyTuple'

However I did notice an oddity. When converting to an integer type it doesn't seem to trigger the same pyo3 code and instead generates a different error from within python.

#[pyfunction]
fn int_func(foo: i64) {}

>>> int_func("bar")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: argument 'foo': 'str' object cannot be interpreted as an integer

I looked for possible sources of the error and it might be coming from here:
https://github.com/python/cpython/blob/fb0a4651f1be4ad936f8277478f73f262d8eeb72/Objects/abstract.c#L1265
But there are actually a number of different places where this same text is present in an error string.

davidhewitt

Thanks very much for this PR, very pleased to see it! I'll do my best to find time to review it tomorrow evening.

kngwyu · 2020-10-04T13:58:39Z

So.... what is the motivation?
I don't think this is very helpful.

davidhewitt · 2020-10-04T20:15:32Z

I think for users of libraries this can be very nice when figuring out what went wrong. C++'s pybind11 has similar.

I'm afraid I didn't find time to do any reviewing today, will do my best to find time asap.

kngwyu · 2020-10-05T14:00:19Z

But I think users don't know argument names...

Askaholic · 2020-10-07T04:26:41Z

So.... what is the motivation?

Go look at the issue ticket #1055

I don't think this is very helpful.

I entirely disagree. This is one of the first features that I noticed was missing when I started using PyO3. It's extremely annoying to debug your code when your function calls are throwing type errors, but they don't tell you what the cause is. Like, imagine if you have some code that takes function arguments from the user:

some_args = get_from_cli()

pyo3_function(**some_args)
# raises TypeError: str cannot be interpreted as int

That doesn't tell the user which one of their arguments has an error. There's lots of crazy stuff you can do in python, and there are lots of situations in which you may have a function with a lot of arguments, or a few arguments with the same type. TBH even if I have 2 arguments like:

def print_range(lo: int, hi: int):
    ...

lo, hi = do_some_stuff()
print_range(lo, hi)
# raises TypeError: str cannot be interpreted as int

It's annoying to have to go back and debug your code just to figure out which argument has the wrong type, when the library really has access to that information already, but refuses to tell you.

Just as you may look at this code:

def foo(bar, **kwargs):
    pass

foo(1, bar=10)
# raises TypeError: foo() got multiple values for argument 'bar'

and say, "well the error message really doesn't need to include the name of the function or the name of the argument because it's obvious from the code," well, there's a reason why the python developers chose to include all of that information in the error message: people will do crazy things, context will get lost, and it helps if your error messages tell you the complete reason for an error.

Rant over. For now...

C++'s pybind11 has similar.

Yes, although there are a few things I don't like about pybind11's approach. For instance I don't like that it prints out the value of the argument and not just the type. This is pretty annoying for arguments that have large string representations e.g. numpy arrays, as a simple type error can suddenly print out 50 lines of text.

Also I don't think pybind11 handles function signatures properly, it just includes annotations in the doc string.

@davidhewitt BTW what would be a good file to add unit tests in? I'm also not sure how the spec.is_args branch is taken. For my simple test example it returned false and I'm not exactly sure when it's supposed to be true as there's no doc comment for that function.

And lastly, is it normal for some of the tests to fail? Even the master branch had a few failing tests when I ran them locally after cloning.

kngwyu · 2020-10-07T04:37:45Z

OK, thank you for the explanation.

But I feel TypeError: argument 'foo': Can't convert 'bar' to PyTuple confusing. What is bar?
I think the original one by David (TypeError: A to B (argument a)) is clearer.
If you want to include the local variable name, please make it clear that it is not an argument name.

BTW what would be a good file to add unit tests in?

I recommend tests/test_pyfunction.rs.

And lastly, is it normal for some of the tests to fail?

No. The failure is because you changed the error message.
Please change this test: https://github.com/PyO3/pyo3/blob/0a346dfa7cb45ebc04aec54e2168699f3ad84bee/tests/test_string.rs

Askaholic · 2020-10-07T04:54:23Z

Oh I see. I thought you were objecting to including the argument name in error messages. I also agree that the error message of conversion errors isn't ideal, but that part of the message already exists on the master, it's not part of my changes. I think again, the main problem is that the error message includes the value of the argument instead of the type. If I were to write the error, I would model it off of the format of the builtin python errors which afaik only use object types:

None.foo
# AttributeError: 'NoneType' object has no attribute 'foo'

"foo" + 1
# TypeError: can only concatenate str (not "int") to str

1 + "foo"
# TypeError: unsupported operand type(s) for +: 'int' and 'str'

"".join([1, 2])
# TypeError: sequence item 0: expected str instance, int found

So I'd probably go with something like TypeError: argument 'foo': 'str' object cannot be converted to 'PyTuple'. Not sure if quotes around the types are necessary as the python native errors are not consistent on this.

No. The failure is because you changed the error message.

Well as I said, even when building the master branch locally (before I made any changes), I got 2 errors.

kngwyu · 2020-10-07T05:05:30Z

I also agree that the error message of conversion errors isn't ideal, but that part of the message already exists on the master, it's not part of my changes.

Oops, I'm sorry I misunderstood it!

TypeError: argument 'foo': 'str' object cannot be converted to 'PyTuple'

Yeah, I agree to use the type name instead.

Well as I said, even when building the master branch locally (before I made any changes), I got 2 errors.

I know there happenes some errors with pyenv, but if not, please file a separate issue with error messages.

davidhewitt

Many thanks again for implementing this, and sorry for my delay in reviewing. Finally had a chance to think about this properly!

I'm definitely persuaded that using the type name is better than using the repr. I was thinking about this further and it's easy enough for debugging to print the problematic value. We should change this in PyDowncastError.

It seems we have two different arrangements for the argument name already proposed:

TypeError: argument 'foo': 'str' object cannot be converted to 'PyTuple'
TypeError: 'str' object cannot be converted to 'PyTuple' (for argument 'foo')

Thinking about this further, I also quite like the idea of using Python exception chaining to add context. I think if we did this right, the effect could be something like the below:

TypeError: 'str' object cannot be converted to 'PyTuple'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
    ...
ValueError: invalid value for argument 'foo'

Which style does everyone prefer?

pyo3-derive-backend/src/pymethod.rs

Askaholic · 2020-10-12T00:45:38Z

I thought about the exception chaining before, but I decided that it's uglier. Having a chain of exceptions kindof implies that there are multiple pieces of code that are failing, but really there is only one: the argument conversion. It feels a bit more like an implementation detail that the argument name isn't available further down the chain, and therefore needs to be injected into the error message later. So, I prefer just modifying the error message.

kngwyu · 2020-10-12T05:46:45Z

I agree @Askaholic. I feel the chained message confusing.

davidhewitt · 2020-10-12T08:07:38Z

👍 happy to agree with that. I guess the choice is then between the argument name at the front, or at the end.

I am personally not fond of multiple colons in exception messages, which is why I suggested adding (for argument 'foo') at the end of the message. But I am not sure what is best, so would like to listen to everyone's opinion.

Askaholic · 2020-10-13T07:45:00Z

I prefer the multiple colon way, even though it's kindof weird having multiple layers of colons. It's almost like a replacement for the error chaining where it gives you the same context as the error chain, but more condensed. It also exists in some native python error messages like the one I posted earlier:

"".join([1, 2])
# TypeError: sequence item 0: expected str instance, int found

This brings up another interesting point, which is that in order for the approach we're taking here (catch an internal error and reformat like "{context}: {old_msg}") to look right, it makes sense to always start error messages with a lower case letter, as seen above.

davidhewitt · 2020-10-13T08:02:08Z

It also exists in some native python error messages like the one I posted earlier:

👍 I am persuaded that the colon way is the right style!

it makes sense to always start error messages with a lower case letter, as seen above

Agreed. Feel free to amend the change the existing message for PyDowncastError as part of this PR (I guess you will do so anyway to change from repr to type name):

pyo3/src/err/mod.rs

Line 493 in 95cebd8

"Can't convert {} to {}",

Askaholic · 2020-10-13T08:05:09Z

Yea, I wasn't sure if I should do that in this PR or not, but I totally can. I'll just have to expand the scope a bit and rename it.

Displays type(obj) instead of repr(obj) and uses `cannot` instead of `can't` to be more consistent with existing python error messages. See discussion at PyO3#1212.

Askaholic · 2020-10-13T21:57:29Z

I'm running into a bit of a tricky situation whenever extract returns something other than a TypeError. For example the test with the bad unicode string returns a UnicodeEncodeError, however, this does not store it's error messages as a single string, but rather as a tuple of the relevant components that are formatted to the error message in __str__:

>>> e = UnicodeEncodeError('utf-8', '\ud800', 0, 1, 'surrogates not allowed')
>>> str(e)
"'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed"

I see 3 possible solutions:

Error chain everything. If we just create a new error that's caused by the other error then we can handle anything, but the messages are uglier and harder to read especially for cases where there is just a simple TypeError.
Error chain for non-TypeErrors. Use the colon in method for TypeErrors and error chaining for everything else. The downside is then the error reporting becomes somewhat inconsistent, although I would guess that TypeErrors would be the most common error type.
Only add argument context for TypeErrors. We just leave all other error types untouched as we don't know if we can manipulate their error messages easily.

I'm not really sure what the best choice is.

Displays type(obj) instead of repr(obj) and uses `cannot` instead of `can't` to be more consistent with existing python error messages. See discussion at PyO3#1212.

kngwyu · 2020-10-14T05:49:14Z

Only add argument context for TypeErrors

👍 for this

davidhewitt · 2020-10-14T07:39:44Z

Only add argument context for TypeErrors

+1 for this. The only other built-in exception commonly used for invalid arguments is ValueError. Up to you if you want to handle that case also or leave that for the future - I don't think we have many uses of it in pyO3 for argument conversion (if any) so much less important to handle than TypeError

Askaholic · 2020-10-14T20:21:26Z

Just thought of another possible solution:

Create our own wrapper exception ArgumentError which displays the way we want. This error could either take the error message as a single string argument, or it could have 2 fields arg_name (or just arg?) and cause and format them in the __str__ method. Using 2 fields would allow the user to more easily interact with the error programmatically, although I imagine the use cases for doing so would be quite rare.

This wrapper exception basically does the same thing as python error chaining, but it displays nicer. The only thing is if the user is trying to handle exceptions within a try-except block, they can't directly catch the underlying error and instead need to catch ArgumentError first and then check the type of cause manually.

davidhewitt · 2020-10-14T20:52:11Z

Have to say I'm 👎 on creating a custom exception type here. I know we've got a couple already in the codebase (and I kind of wish we didn't, but they're necessary for other reasons). It's very awkward for Python users to be able to import these exceptions in Python (see #805). And being unable to import them makes it very difficult for Python users to figure out even what super type they have that they can catch.

Having used boost.Python and pybind11 in C++, it was so much nicer to just catch TypeError in pybind11 compared to digging through the boost.Python C++ source code to eventually find boost.python.ArgumentError inherited from TypeError and I could catch that.

So in summary I'm in favour of keeping the design here as simple as possible.

(If we were to create a new argument type, I think the correct design is to release a pyo3 Python package on pypi and expect users to import the exception that way. However that has its own complications - we would never be able to have two different PyO3 python packages in the same project, so we'd have to keep the contents of that package as stable as possible.)

Displays type(obj) instead of repr(obj) and uses `cannot` instead of `can't` to be more consistent with existing python error messages. See discussion at PyO3#1212.

Askaholic · 2020-10-15T01:30:56Z

Alright, I'll stick with only adding the argument name to TypeErrors.

…sion

Displays type(obj) instead of repr(obj) and uses `cannot` instead of `can't` to be more consistent with existing python error messages. See discussion at PyO3#1212.

davidhewitt

Great, this LGTM! Just needs an ## Added entry in the changelog please.

I see this is still in draft; anything else you want to do to this PR before it's merged?

davidhewitt · 2020-10-15T11:31:26Z

pyo3-derive-backend/src/from_pyobject.rs

-                .map(|s| format!("{} ({})", s.to_string_lossy(), type_name))
-                .unwrap_or_else(|_| type_name.to_string());
-            let err_msg = format!("Can't convert {} to {}", from, #error_names);
+            let err_msg = format!("'{}' object cannot be converted to '{}'", type_name, #error_names);


kngwyu

Thanks. I think it's now nicely shaped up, too.

kngwyu · 2020-10-15T13:16:02Z

tests/common.rs

        if !err.matches($py, $py.get_type::<pyo3::exceptions::$err>()) {
            panic!("Expected {} but got {:?}", stringify!($err), err)
        }
+        err
+    }};
+    ($py:expr, $val:ident, $code:expr, $err:ident, $err_msg:expr) => {{


Thank you for this 👍

Askaholic · 2020-10-15T19:53:29Z

Yea last bullet would be:

docs to all new functions and / or detail in the guide

Anything I should add there?

davidhewitt · 2020-10-16T08:06:15Z

docs to all new functions and / or detail in the guide

Ah yes good memory. I would suggest just doing a quick search of the repository for the old error message "Can't convert" and replace all uses in docs with the equivalent new error message. That should be enough 👍

Thanks again for this PR!

Askaholic · 2020-10-16T19:49:16Z

I can't find any more references to the old error message. I don't think it was included in the docs/guide anywhere so I guess this is ready.

It would be awesome if you could add a "hacktoberfest-accepted" label so this will count towards my hacktoberfest PR's. https://hacktoberfest.digitalocean.com/details#details

davidhewitt · 2020-10-18T08:15:01Z

Perfect. Thanks very much again, will definitely add the label!

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from a80c86f to f3b8257 Compare October 3, 2020 22:03

davidhewitt reviewed Oct 3, 2020

View reviewed changes

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from f3b8257 to f0c5dc7 Compare October 3, 2020 22:36

davidhewitt reviewed Oct 11, 2020

View reviewed changes

pyo3-derive-backend/src/pymethod.rs Outdated Show resolved Hide resolved

davidhewitt mentioned this pull request Oct 11, 2020

Function argument error handling improvements #1055

Open

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from f0c5dc7 to 067c5d0 Compare October 13, 2020 07:42

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch 3 times, most recently from 3929845 to e0a80e4 Compare October 13, 2020 07:58

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from e0a80e4 to a006efd Compare October 13, 2020 08:03

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from a006efd to 01e22f0 Compare October 13, 2020 08:20

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from 01e22f0 to 31b64cd Compare October 13, 2020 21:41

Refactor py_expect_exception to also verify error string representation

007bfb7

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from 31b64cd to 3b6a5d0 Compare October 13, 2020 22:02

Askaholic changed the title ~~Add argument name to error messages in conversion errors~~ Enhance error messages of conversion errors Oct 13, 2020

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from 3b6a5d0 to ca908ba Compare October 15, 2020 01:30

Askaholic added 2 commits October 14, 2020 17:32

Add argument name to TypeError messages caused during argument conver…

19889bc

…sion

Change wording of PyDowncastError display implementation

6724783

Displays type(obj) instead of repr(obj) and uses `cannot` instead of `can't` to be more consistent with existing python error messages. See discussion at PyO3#1212.

Askaholic force-pushed the issue/#1055-add-arg-name-to-conversion-error branch from ca908ba to 6724783 Compare October 15, 2020 01:32

davidhewitt reviewed Oct 15, 2020

View reviewed changes

kngwyu approved these changes Oct 15, 2020

View reviewed changes

Add entries to changelog

1d70344

Askaholic marked this pull request as ready for review October 15, 2020 20:03

davidhewitt added the hacktoberfest-accepted label Oct 18, 2020

davidhewitt merged commit 27f2d0e into PyO3:master Oct 18, 2020

davidhewitt mentioned this pull request Nov 19, 2020

Char support #1282

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhance error messages of conversion errors #1212

Enhance error messages of conversion errors #1212

Askaholic commented Oct 3, 2020 •

edited

Loading

davidhewitt left a comment

kngwyu commented Oct 4, 2020 •

edited

Loading

davidhewitt commented Oct 4, 2020

kngwyu commented Oct 5, 2020

Askaholic commented Oct 7, 2020 •

edited

Loading

kngwyu commented Oct 7, 2020 •

edited

Loading

Askaholic commented Oct 7, 2020

kngwyu commented Oct 7, 2020

davidhewitt left a comment

Askaholic commented Oct 12, 2020

kngwyu commented Oct 12, 2020

davidhewitt commented Oct 12, 2020

Askaholic commented Oct 13, 2020

davidhewitt commented Oct 13, 2020

Askaholic commented Oct 13, 2020

Askaholic commented Oct 13, 2020

kngwyu commented Oct 14, 2020

davidhewitt commented Oct 14, 2020

Askaholic commented Oct 14, 2020

davidhewitt commented Oct 14, 2020 •

edited

Loading

Askaholic commented Oct 15, 2020

davidhewitt left a comment

davidhewitt Oct 15, 2020

kngwyu left a comment

kngwyu Oct 15, 2020

Askaholic commented Oct 15, 2020

davidhewitt commented Oct 16, 2020 •

edited

Loading

Askaholic commented Oct 16, 2020

davidhewitt commented Oct 18, 2020

Enhance error messages of conversion errors #1212

Enhance error messages of conversion errors #1212

Conversation

Askaholic commented Oct 3, 2020 • edited Loading

davidhewitt left a comment

Choose a reason for hiding this comment

kngwyu commented Oct 4, 2020 • edited Loading

davidhewitt commented Oct 4, 2020

kngwyu commented Oct 5, 2020

Askaholic commented Oct 7, 2020 • edited Loading

kngwyu commented Oct 7, 2020 • edited Loading

Askaholic commented Oct 7, 2020

kngwyu commented Oct 7, 2020

davidhewitt left a comment

Choose a reason for hiding this comment

Askaholic commented Oct 12, 2020

kngwyu commented Oct 12, 2020

davidhewitt commented Oct 12, 2020

Askaholic commented Oct 13, 2020

davidhewitt commented Oct 13, 2020

Askaholic commented Oct 13, 2020

Askaholic commented Oct 13, 2020

kngwyu commented Oct 14, 2020

davidhewitt commented Oct 14, 2020

Askaholic commented Oct 14, 2020

davidhewitt commented Oct 14, 2020 • edited Loading

Askaholic commented Oct 15, 2020

davidhewitt left a comment

Choose a reason for hiding this comment

davidhewitt Oct 15, 2020

Choose a reason for hiding this comment

kngwyu left a comment

Choose a reason for hiding this comment

kngwyu Oct 15, 2020

Choose a reason for hiding this comment

Askaholic commented Oct 15, 2020

davidhewitt commented Oct 16, 2020 • edited Loading

Askaholic commented Oct 16, 2020

davidhewitt commented Oct 18, 2020

Askaholic commented Oct 3, 2020 •

edited

Loading

kngwyu commented Oct 4, 2020 •

edited

Loading

Askaholic commented Oct 7, 2020 •

edited

Loading

kngwyu commented Oct 7, 2020 •

edited

Loading

davidhewitt commented Oct 14, 2020 •

edited

Loading

davidhewitt commented Oct 16, 2020 •

edited

Loading