pyfunction: refactor argument extraction #1440

davidhewitt · 2021-02-20T09:12:36Z

This PR fixes a bug in #[pyfunction] where a keyword-only argument is incorrectly rejected when used in combination with *args:

This is an example of the problem code:

#[pyfunction(args = "*", param)]
fn my_func(
    args: &PyTuple,
    param: bool,
) -> PyResult<(&PyTuple, bool)> {
    Ok((args, param))
}

And this is the call which didn't work:

>>> my_func(1, param=True)
TypeError: my_func() got multiple values for argument: param

At the same time I also changed error messages from argument: foo to argument 'foo', so that they match what CPython errors look like.

davidhewitt · 2021-02-20T09:15:03Z

cc @1frag thanks for reporting on Gitter

davidhewitt · 2021-02-20T12:22:35Z

Given the clippy failure I might rework this PR to also provide a solution for #1439 at the same time.

davidhewitt · 2021-02-21T18:05:01Z

I've refactored the argument extraction in derive_utils to be future-compatible with positional-only arguments. I'm happy with the overall implementation I've written, however I think there's a couple of kinks that need to be worked out before this is review-worthy.

davidhewitt · 2021-02-22T00:39:12Z

Ok, tests all passing, and I added a benchmark for curiosity. Performance hasn't really changed versus the original implementation. However the new implementation seems easier to maintain to me - it should be much easier to add support for vectorcall (#684) and positional-only arguments (#1439).

examples/rustapi_module/tests/test_misc.py

kngwyu

Good catch, thanks!
I think required/total_positional_parameters are nice but don't understand some parts in the implementation.

src/derive_utils.rs

pyo3-macros-backend/src/pymethod.rs

examples/rustapi_module/tests/test_datetime.py

davidhewitt · 2021-02-22T21:59:46Z

Thanks for the reviews. I pushed a couple of commits addressing the comments and making the keyword arguments portion clearer (hopefully).

kngwyu · 2021-02-23T15:30:32Z

I'm sorry, but I'm struggling to understand all changes. Comments may delay.

davidhewitt · 2021-02-23T21:19:56Z

No problem. Feel free to ask as many questions on the code as necessary. I might need to add better inline comments & documentation.

kngwyu · 2021-02-25T16:00:24Z

Just curious, why did you change the current

for param in positional_params + keyword_params:
    if param in kwargs:
        ...

loop to

for name, value in kwargs:
    if name in positional_params:
        ....
    elif name in keyword_params:
        ....

?
(I meant the kwargs matching code in extract_keyword_arguments)

davidhewitt · 2021-02-25T21:35:55Z

Great question. There's a few reasons I did this, however I am happy to change this design:

This is exactly the same as what cpython does
When we implement Vectorcall support then we get keyword names as an array, and we don't have a dict/hashmap to look up any more in the way the code on master currently does.
Looking up the parameter name in the kwargs dict requires making a new PyString for each lookup. With the new way, we don't make any allocations.

I know the algorithmic complexity of the new method looks worse. For most functions we're never going to have more than 10 parameters, so the fact that the new algorithm is O(n) whereas the old one was O(1) matters less. The cost of allocating the PyString (or even for Vectorcall, a whole new HashMap) is relevant.

I'm primarily motivated by benchmarks - in the end we should pick whichever algorithm is fastest. But I think we should first implement vectorcall and the owned objects API, because they will also have big performance effects. So for now I wanted to keep the algorithm the same as cpython for simplicity of maintenance.

kngwyu · 2021-02-26T05:56:38Z

Thank you for your explanation. I don't care about the computational complexity here, let's choose a simpler one :)

kngwyu

Thanks, I left some comments.

pyo3-macros-backend/src/pymethod.rs

src/derive_utils.rs

davidhewitt · 2021-02-26T07:32:03Z

Thanks for the review! I will add the extra tests and try out all the suggested changes over the weekend. As (imo) this is performance-sensitive code I'll try each change in turn and see whether it has any noticeable effect on performance. I suspect that your suggested .find() / .position() is probably better than using .zip(), we'll see ;)

src/derive_utils.rs

kngwyu · 2021-03-01T14:08:05Z

examples/pyo3_benchmarks/src/lib.rs

@@ -0,0 +1,33 @@
+use pyo3::prelude::*;


I'm not sure how this benchmark is useful.
It may be difficult to observe some speedup when trying to extract only a few args.

Actually although the variance is quite high, there's still a reasonably clear signal as to how good a change is using these benchmarks. I wouldn't take these numbers super seriously for any kind of scientific conclusion. However I can see (for example) that there's still some overhead calling pyo3 functions vs CPython functions, and also that returning Vec from extract_keyword_arguments leads to noticeable slowdown.

Some example numbers I see at the moment:

-------------------------------------------------------------------------------------------- benchmark: 6 tests ------------------------------------------------------------------------------------------- Name (time in ns) Min Max Mean StdDev Median IQR Outliers OPS (Mops/s) Rounds Iterations ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- test_no_args_py 60.0000 (1.0) 4,658.0000 (2.86) 67.6030 (1.0) 34.3033 (1.08) 66.0000 (1.0) 2.0000 (inf) 1857;3556 14.7923 (1.0) 149254 100 test_no_args 70.0000 (1.17) 1,627.0000 (1.0) 78.7283 (1.16) 31.8182 (1.0) 74.0000 (1.12) 2.0000 (inf) 3261;8374 12.7019 (0.86) 128206 100 test_args_and_kwargs_py 219.0476 (3.65) 5,566.6667 (3.42) 245.2637 (3.63) 57.6773 (1.81) 242.8571 (3.68) 4.7619 (inf) 1156;10254 4.0772 (0.28) 196079 21 test_mixed_args_py 288.2353 (4.80) 6,329.4118 (3.89) 320.9264 (4.75) 83.6458 (2.63) 317.6471 (4.81) 5.8824 (inf) 1406;18940 3.1160 (0.21) 192308 17 test_args_and_kwargs 299.9999 (5.00) 86,200.0001 (52.98) 412.2963 (6.10) 466.4672 (14.66) 400.0000 (6.06) 0.0000 (1.0) 359;47106 2.4254 (0.16) 163935 1 test_mixed_args 340.0000 (5.67) 21,400.0000 (13.15) 385.6509 (5.70) 207.9585 (6.54) 380.0000 (5.76) 15.0000 (inf) 1187;1616 2.5930 (0.18) 128206 20 -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

At the moment it's clear we're a bit slower than CPython for function calls, but I hope vectorcall support and #1308 will improve things.

Later I would like to add more numbers to these benchmarks, for example #661.

kngwyu · 2021-03-03T15:15:22Z

LGTM 💯 , thank you!

davidhewitt changed the title ~~pyfunction: fix args conflicting with keyword only arg~~ pyfunction: refactor argument extraction Feb 21, 2021

davidhewitt marked this pull request as draft February 21, 2021 18:05

davidhewitt force-pushed the fix-multiple-kw-only-arg branch 6 times, most recently from 6e5e6af to 8d4f8f4 Compare February 22, 2021 00:36

davidhewitt marked this pull request as ready for review February 22, 2021 00:38

1frag reviewed Feb 22, 2021

View reviewed changes

examples/rustapi_module/tests/test_misc.py Outdated Show resolved Hide resolved

davidhewitt force-pushed the fix-multiple-kw-only-arg branch 2 times, most recently from c072b29 to 9a01a81 Compare February 22, 2021 07:18

kngwyu requested changes Feb 22, 2021

View reviewed changes

1frag reviewed Feb 22, 2021

View reviewed changes

examples/rustapi_module/tests/test_datetime.py Outdated Show resolved Hide resolved

kngwyu reviewed Feb 26, 2021

View reviewed changes

pyo3-macros-backend/src/pymethod.rs Outdated Show resolved Hide resolved

src/derive_utils.rs Outdated Show resolved Hide resolved

src/derive_utils.rs Show resolved Hide resolved

src/derive_utils.rs Outdated Show resolved Hide resolved

src/derive_utils.rs Outdated Show resolved Hide resolved

davidhewitt force-pushed the fix-multiple-kw-only-arg branch 2 times, most recently from 3e2a5a5 to 7c5b162 Compare March 1, 2021 09:16

kngwyu reviewed Mar 1, 2021

View reviewed changes

davidhewitt force-pushed the fix-multiple-kw-only-arg branch from 7c5b162 to f74daa6 Compare March 1, 2021 15:18

davidhewitt force-pushed the fix-multiple-kw-only-arg branch from f74daa6 to 52ca30a Compare March 1, 2021 17:49

davidhewitt added 2 commits March 1, 2021 17:49

pyfunction: fix args conflicting with keyword only arg

ffd5874

pyfunction: refactor argument extraction

29a525b

davidhewitt force-pushed the fix-multiple-kw-only-arg branch from 52ca30a to 0e29ef5 Compare March 1, 2021 17:49

davidhewitt added 2 commits March 1, 2021 19:12

pyo3_benchmarks: benchmarks as separate Python example

fe74160

[review] kngwyu

e86c16d

davidhewitt force-pushed the fix-multiple-kw-only-arg branch from 0e29ef5 to e86c16d Compare March 1, 2021 19:12

kngwyu approved these changes Mar 3, 2021

View reviewed changes

kngwyu merged commit 21b26fc into PyO3:master Mar 3, 2021

davidhewitt deleted the fix-multiple-kw-only-arg branch March 4, 2021 19:54

davidhewitt mentioned this pull request Mar 6, 2021

Why is the plugins built by pyo3 run slower than cython ones? #1470

Closed

scalexm mentioned this pull request Mar 18, 2021

Hygiene issues with quote_spanned! #1506

Closed

davidhewitt mentioned this pull request May 15, 2021

Support positional-only arguments in #[pyfunction] #1439

Closed

alex mentioned this pull request Jul 15, 2021

-Zinstrument-coverage showing missed coverage on _part_ of function signature (with proc-macro?) rust-lang/rust#86972

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pyfunction: refactor argument extraction #1440

pyfunction: refactor argument extraction #1440

davidhewitt commented Feb 20, 2021

davidhewitt commented Feb 20, 2021

davidhewitt commented Feb 20, 2021

davidhewitt commented Feb 21, 2021 •

edited

Loading

davidhewitt commented Feb 22, 2021

kngwyu left a comment

davidhewitt commented Feb 22, 2021

kngwyu commented Feb 23, 2021

davidhewitt commented Feb 23, 2021

kngwyu commented Feb 25, 2021 •

edited

Loading

davidhewitt commented Feb 25, 2021 •

edited

Loading

kngwyu commented Feb 26, 2021

kngwyu left a comment

davidhewitt commented Feb 26, 2021

kngwyu Mar 1, 2021

davidhewitt Mar 1, 2021

kngwyu commented Mar 3, 2021

pyfunction: refactor argument extraction #1440

pyfunction: refactor argument extraction #1440

Conversation

davidhewitt commented Feb 20, 2021

davidhewitt commented Feb 20, 2021

davidhewitt commented Feb 20, 2021

davidhewitt commented Feb 21, 2021 • edited Loading

davidhewitt commented Feb 22, 2021

kngwyu left a comment

Choose a reason for hiding this comment

davidhewitt commented Feb 22, 2021

kngwyu commented Feb 23, 2021

davidhewitt commented Feb 23, 2021

kngwyu commented Feb 25, 2021 • edited Loading

davidhewitt commented Feb 25, 2021 • edited Loading

kngwyu commented Feb 26, 2021

kngwyu left a comment

Choose a reason for hiding this comment

davidhewitt commented Feb 26, 2021

kngwyu Mar 1, 2021

Choose a reason for hiding this comment

davidhewitt Mar 1, 2021

Choose a reason for hiding this comment

kngwyu commented Mar 3, 2021

davidhewitt commented Feb 21, 2021 •

edited

Loading

kngwyu commented Feb 25, 2021 •

edited

Loading

davidhewitt commented Feb 25, 2021 •

edited

Loading