replace_interleave can be optimized for single character byte strings #70761
replace_interleave in Objects/bytesobject.c and Objects/bytearrayobject.c can be optimized for the special case where the interleaving byte string is a single character. Here are some quick results from timeit showing that it's about three times faster for the special case.
* Before (cold start):
>>> timeit.timeit('(b"x" * 2000000).replace(b"", b".")', number=1000)
7.619218342995737
* After (cold start):
>>> timeit.timeit('(b"x" * 2000000).replace(b"", b".")', number=1000)
2.7605581780080684

For the non-special case, running timeit.timeit('(b"x" * 2000000).replace(b"", b".0")', number=10000) takes ~173 seconds on both versions.
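For readers who have not used this form of replace: replacing the empty byte string inserts the replacement before every byte and once at the end. The following is a minimal pure-Python model of that behaviour, not the C code from the patch. The function name interleave_model is hypothetical, and the single-byte branch (pre-fill the output, then copy the source bytes into every second slot) is only an assumption about why a one-byte replacement can skip the per-element copy of the general case.

```python
def interleave_model(data: bytes, sep: bytes) -> bytes:
    """Pure-Python model of data.replace(b"", sep) (hypothetical helper)."""
    if len(sep) == 1:
        # Single-byte case: pre-fill the output with the separator byte,
        # then place the source bytes at the odd positions.
        out = bytearray(sep * (2 * len(data) + 1))
        out[1::2] = data
        return bytes(out)
    # General case: copy the separator before each source byte,
    # then once more at the end.
    out = bytearray()
    for byte in data:
        out += sep
        out.append(byte)
    out += sep
    return bytes(out)


if __name__ == "__main__":
    for sep in (b".", b".0"):
        # The model should agree with the built-in method.
        assert interleave_model(b"abc", sep) == b"abc".replace(b"", sep)
    print(interleave_model(b"abc", b"."))
```

Both branches produce the same result as the built-in method; only the amount of work per output byte differs.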
I reviewed your patch on Rietveld (you should get an email notification).
Addresses review comments.
I wrote a microbenchmark with my benchmark.py tool. The patch always makes bytes.replace(b'', char) and bytearray.replace(b'', char) faster, even for strings of 10 bytes. The speedup on strings of 1000 bytes or more is very interesting, even though I have never used this Python instruction :-)

-------------+-------------+---------------
length=10    |  250 ns (*) |  211 ns (-15%)
length=10**3 | 4.67 us (*) | 1.07 us (-77%)
length=10**5 |  441 us (*) | 78.2 us (-82%)
-------------+-------------+---------------
Total        |  446 us (*) | 79.5 us (-82%)
-------------+-------------+---------------

---------------+-------------+---------------
length=10      |  266 ns (*) |  224 ns (-16%)
length=10**3   | 4.67 us (*) | 1.08 us (-77%)
length=10**5   |  441 us (*) | 78.3 us (-82%)
---------------+-------------+---------------
Total          |  446 us (*) | 79.6 us (-82%)
---------------+-------------+---------------
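The benchmark.py script itself is not shown in this thread. A rough way to take similar measurements with only the standard timeit module might look like the sketch below. The helper name time_replace and the number/repeat defaults are assumptions, and the lambda adds a small constant call overhead to every measurement, so absolute numbers will not match the tables above.

```python
import timeit


def time_replace(kind, length, number=1000, repeat=3):
    """Best per-call time in seconds for kind(b"x" * length).replace(b"", b".")."""
    data = kind(b"x" * length)
    timer = timeit.Timer(lambda: data.replace(b"", b"."))
    # Take the minimum over the repeats to reduce noise from other processes.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for kind in (bytes, bytearray):
        for length in (10, 10**3, 10**5):
            per_call = time_replace(kind, length)
            print(f"{kind.__name__:9} length={length:<6} {per_call * 1e6:10.3f} us")
```

Running it once on an unpatched build and once on a patched build gives the two columns to compare.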
New changeset 62e3b7af0697 by Victor Stinner in branch 'default':
I pushed my latest patches, thanks for your contribution Josh.
Is it worth optimizing this pretty rare special case?
There was a TODO in the code, so I guess that the author wanted to write specialized code for 1-char replacement. Since the patch is short (adds 8 lines of C code), I consider that it's ok to optimize it.