Faster parsing keyword arguments #71761
Parsing keyword arguments is much slower than parsing positional arguments. Parsing time can exceed the useful execution time.
$ ./python -m timeit "b'a:b:c'.split(b':', 1)"
1000000 loops, best of 3: 0.638 usec per loop
$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 1.64 usec per loop
The main culprit is that Python strings are created for every keyword name on every call. The proposed patch adds an alternative API that caches keyword names as Python strings in a special object. Argument Clinic is changed to use this API in generated files.
The effect of the optimization:
$ ./python -m timeit "b'a:b:c'.split(b':', maxsplit=1)"
1000000 loops, best of 3: 0.826 usec per loop
Invocations of PyArg_ParseTupleAndKeywords() in non-generated code are kept, since the API is not stable yet. Later I'm going to cache parsed format strings and speed up parsing of positional arguments too. |
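For anyone who wants to repeat the comparison from a script rather than the command line, here is a minimal sketch using the standard timeit module. The helper name best_usec and the loop counts are assumptions made for illustration; absolute numbers depend on the interpreter build and will not match the figures quoted above.

```python
import timeit

# The same call, spelled positionally and with a keyword argument.
POSITIONAL = "b'a:b:c'.split(b':', 1)"
KEYWORD = "b'a:b:c'.split(b':', maxsplit=1)"


def best_usec(stmt, number=100_000, repeat=3):
    """Return the best per-call time in microseconds over `repeat` runs.

    `best_usec` is a hypothetical helper, not part of timeit itself.
    """
    timings = timeit.repeat(stmt, number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    pos = best_usec(POSITIONAL)
    kw = best_usec(KEYWORD)
    print(f"positional: {pos:.3f} usec per loop")
    print(f"keyword:    {kw:.3f} usec per loop")
    print(f"ratio:      {kw / pos:.2f}x")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that the timeit command line reported at the time.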
I haven't reviewed the patch, but the idea is great as I know one of Larry's hopes of using Argument Clinic was to allow for this kind of speed-up. |
Ping. |
The updated patch addresses Antoine's comments. All checks of the format string are moved into parser_init. I experimented with Antoine's idea of making vgetargskeywords a simple wrapper around vgetargskeywordsfast with a one-shot parser, but this slows down parsing of positional arguments too much (due to creating Python strings for unused keyword names). |
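The trade-off described in this comment can be modelled in a few lines of Python. This is only a rough sketch under stated assumptions, not the C implementation: the names Parser, bind and parse_one_shot are hypothetical, bytes objects stand in for the C keyword-name strings, and decoding them stands in for creating Python strings.

```python
def bind(names, args, kwargs):
    """Map positional and keyword arguments onto the declared names."""
    if len(args) > len(names):
        raise TypeError("too many positional arguments")
    bound = dict(zip(names, args))
    for key, value in kwargs.items():
        if key not in names:
            raise TypeError(f"unexpected keyword argument {key!r}")
        if key in bound:
            raise TypeError(f"argument {key!r} given twice")
        bound[key] = value
    return bound


class Parser:
    """Toy model of a parser object that caches keyword names as str."""

    def __init__(self, raw_names):
        self.raw_names = raw_names  # bytes model the C-level keyword names
        self.names = None           # filled in on first use
        self.strings_created = 0    # counts simulated string creations

    def init(self):
        """Convert every keyword name once; later calls are no-ops."""
        if self.names is None:
            self.names = tuple(raw.decode("ascii") for raw in self.raw_names)
            self.strings_created += len(self.names)

    def parse(self, args, kwargs):
        self.init()
        return bind(self.names, args, kwargs)


def parse_one_shot(raw_names, args, kwargs):
    """Model of the wrapper idea: a fresh parser per call, nothing cached."""
    parser = Parser(raw_names)
    return parser.parse(args, kwargs), parser.strings_created


if __name__ == "__main__":
    raw = [b"sep", b"maxsplit"]
    cached = Parser(raw)
    one_shot_total = 0
    for _ in range(1000):
        cached.parse((b":", 1), {})  # purely positional call
        _, created = parse_one_shot(raw, (b":", 1), {})
        one_shot_total += created
    print(cached.strings_created, one_shot_total)
```

In this model the cached parser converts the two names once in total, while the one-shot wrapper converts them on every call even though the call is purely positional and never looks at them.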
See also the old issue bpo-17170 "string method lookup is too slow". |
Indeed, in bpo-17170 this issue was discussed first. |
Normally, LGTM is an almost useless comment, but the patch does in fact look good to me. I like how compact and straightforward the changes are to the individual parsing calls. |
New changeset e527715bd0b3 by Serhiy Storchaka in branch 'default': |
The issue can now be closed, no? |
I left this issue open for three reasons.
|
I think for converting uses to Argument Clinic it can be done in a more iterative process on a per-module basis. How many modules do we have left to convert? If it isn't ridiculously huge we could open individual issues to convert them each. |
Yes, I came to the conclusion that we need to push the existing issues for separate files. I'm sure there are ready patches waiting for review. Now there is an additional reason for converting to Argument Clinic. But some files contain only one PyArg_ParseTupleAndKeywords() call; I think we can convert those in one patch. |
Just for the record, there are two alternative patches. They unpack keyword arguments into a linear array. I expected this approach could add more optimization, but actually the benefit is too small or negative. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.