Faster utf-32 decoder #58830
Comments
I propose two patch variants that accelerate the UTF-32 decoder. With PEP 393 the UTF-32 decoder slowed down by up to 2x; these patches restore performance to the level of Python 3.2 and even much higher (2-3x over 3.2). Variant A is simpler, but variant B is a little faster (+8-15%).
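For context, these are the semantics the decoder has to preserve, exercised from Python (a quick illustration, not part of the patches themselves):

```python
# UTF-32-LE stores each code point as four little-endian bytes.
data = "héllo".encode("utf-32-le")
assert data == b"h\x00\x00\x00\xe9\x00\x00\x00l\x00\x00\x00l\x00\x00\x00o\x00\x00\x00"
assert data.decode("utf-32-le") == "héllo"

# Even an optimized decoder must still validate its input: surrogate
# code points (U+D800..U+DFFF) are ill-formed in UTF-32 and must raise.
try:
    b"\x00\xd8\x00\x00".decode("utf-32-le")  # U+D800, a lone surrogate
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

The validation step (range and surrogate checks on every 32-bit unit) is what the patched C loop has to perform without giving up the memcpy-like throughput shown in the benchmarks below.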
See also bpo-14624 for the UTF-16 decoder.
Here are the results of benchmarking (numbers in MB/s).

On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:

utf-32le  'A'*10000  461 (+215%)  454 (+220%)  292 (+398%)  1213 (+20%)  1454
utf-32be  'A'*10000  461 (+216%)  454 (+221%)  292 (+399%)  1209 (+20%)  1456

On 32-bit Linux, Intel Atom N570 @ 1.66GHz:

utf-32le  'A'*10000  165 (+173%)  165 (+173%)  100 (+350%)  389 (+16%)  450
utf-32be  'A'*10000  165 (+172%)  165 (+172%)  100 (+348%)  390 (+15%)  448

For scripts see bpo-14624.
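The actual benchmark scripts live in bpo-14624; the measurement they perform can be sketched as a best-of-N decoding throughput in MB/s (the function name and repeat count here are my own, not from the issue):

```python
import time

def throughput_mb_s(codec, data, repeat=5):
    """Best-of-`repeat` decoding throughput for `data` (bytes), in MB/s."""
    best = float("inf")
    for _ in range(repeat):
        t0 = time.perf_counter()
        data.decode(codec)           # the operation under test
        best = min(best, time.perf_counter() - t0)
    return len(data) / best / 2**20

# Same input as the tables above: 10000 ASCII characters.
payload = ("A" * 10000).encode("utf-32-le")
print(round(throughput_mb_s("utf-32-le", payload)), "MB/s")
```

Taking the minimum over several runs reduces noise from the OS scheduler, which matters when a single decode of 40 KB takes only microseconds.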
The patches have been updated for stylistic conformity with the UTF-8 decoder. Patch B is significantly faster for aligned input data (i.e. almost always), especially for native byte order. The UTF-32 decoder can now be faster than the ASCII decoder! Maybe it is time to change the title to "Amazingly faster UTF-32 decoding"? ;)

utf-32le  'A'*10000  162 (+462%)  100 (+810%)  391 (+133%)  910
utf-32be  'A'*10000  162 (+199%)  100 (+384%)  393 (+23%)   484
Patches updated to 3.4.
I suggest applying patch A to 3.3, as it fixes a 2x performance regression and is very simple.
Very simple? You're changing most of the code there.
The old code was too complicated. The patched code is actually smaller: 1 file changed, 71 insertions(+), 80 deletions(-). The UTF-16 codec was modified in a similar way.
That the new code is smaller is no guarantee that it's as correct :) That is exactly the reason we don't put optimizations in bugfix releases.
ASCII and UTF-8 are the two most common codecs in the world, so it's justified to have heavily optimized encoders and decoders for them. I don't know of any application using UTF-32-LE or UTF-32-BE, so I don't want to waste Python memory/code size on a heavily optimized decoder. Patch A looks to be enough. -- 32-bit units are commonly used with wchar_t, but that format already has a fast decoder, PyUnicode_FromWideChar(), which uses memcpy() or _PyUnicode_CONVERT_BYTES().
Agree. I had the same doubts. That's why I proposed two patches, for your choice.
New changeset 9badfe3a31a7 by Victor Stinner in branch 'default':
I applied patch "A" with minor changes: replaced multiple gotos with classic break/continue and if/else.
Looks good. Thanks.