Optimize UTF-8 encoder with error handlers #69454
Comments
The attached patch optimizes the UTF-8 encoder for the error handlers ignore, replace, surrogateescape and surrogatepass. It is based on the patch faster_surrogates_hadling.patch written by Serhiy Storchaka in issue bpo-24870. It also modifies unicode_encode_ucs1() to use memset() for the replace error handler. This should be faster for long sequences of unencodable characters, but it may be slower for short ones. The patch adds new unit tests and fixes existing unit tests to ensure that the utf-8-sig codec is also well tested. TODO: write a benchmark. See also issue bpo-25227, which optimized the ASCII and latin1 encoders with the surrogateescape error handler.
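For readers less familiar with these error handlers, here is a small Python sketch of what each one produces when the UTF-8 encoder meets a lone surrogate. The sample string is an assumption chosen for illustration; it is not taken from the patch or its tests.

```python
# Text with one lone surrogate (U+DC80), which strict UTF-8 refuses to encode.
text = "a\udc80b"

# Encode the same text with each of the four handlers named in the issue.
results = {
    handler: text.encode("utf-8", handler)
    for handler in ("ignore", "replace", "surrogateescape", "surrogatepass")
}

for handler, data in results.items():
    print(f"{handler:16} {data!r}")

# Without an error handler (strict mode) the encoder raises instead.
try:
    text.encode("utf-8")
    strict_failed = False
except UnicodeEncodeError as exc:
    strict_failed = True
    print("strict:", exc.reason)
```

ignore drops the surrogate, replace substitutes "?", surrogateescape maps U+DC80 back to the single byte 0x80, and surrogatepass emits the three-byte sequence for the surrogate itself.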
Oh, there is a bug in utf8_encoder() (not in my patch!): newpos was not used after calling the error handler. It's now fixed in the new patch.
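The effect of newpos can be observed from Python with a custom error handler. This is a sketch only: the handler name example.skip_rest and the sample string are hypothetical, and the expected result assumes an interpreter that includes the fix.

```python
import codecs


def skip_rest(exc):
    """Hypothetical handler: emit nothing and resume at the end of the input."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # The second item is newpos, the index where the encoder must resume.
    # Returning len(exc.object) asks it to skip everything after the error.
    return ("", len(exc.object))


codecs.register_error("example.skip_rest", skip_rest)

encoded = "ab\udc80cd".encode("utf-8", "example.skip_rest")
print(encoded)
```

If the encoder honors newpos, "cd" is skipped and only b"ab" is produced. An encoder that ignores newpos would resume right after the surrogate and also emit "cd".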
Benchmark results. Sorry for the very long output. There are some (corner?) cases where the patched Python is a little bit slower. I consider that it's ok since it's *much* faster in the other cases. What do you think?

Common platform:
Platform of campaign before:
Platform of campaign after:

-------------+-------------+----------------
length=10**1 | 3.16 us (*) | 279 ns (-91%)
length=10**3 | 241 us (*)  | 1.08 us (-100%)
length=10**2 | 23.9 us (*) | 346 ns (-99%)
length=10**4 | 2.39 ms (*) | 6.48 us (-100%)
-------------+-------------+----------------
Total        | 2.66 ms (*) | 8.19 us (-100%)

-------------+-------------+----------------
length=10**1 | 1.12 us (*) | 295 ns (-74%)
length=10**3 | 2.2 us (*)  | 1.57 us (-29%)
length=10**2 | 1.21 us (*) | 408 ns (-66%)
length=10**4 | 10.4 us (*) | 12.3 us (+18%)
-------------+-------------+----------------
Total        | 15 us (*)   | 14.6 us

-------------+-------------+----------------
length=10**1 | 238 us (*)  | 2.46 us (-99%)
length=10**3 | 23.7 ms (*) | 234 us (-99%)
length=10**2 | 2.38 ms (*) | 20.8 us (-99%)
length=10**4 | 238 ms (*)  | 2.56 ms (-99%)
-------------+-------------+----------------
Total        | 265 ms (*)  | 2.82 ms (-99%)

-------------+-------------+----------------
length=10**1 | 239 us (*)  | 1.29 us (-99%)
length=10**3 | 23.8 ms (*) | 80.9 us (-100%)
length=10**2 | 2.4 ms (*)  | 8.44 us (-100%)
length=10**4 | 236 ms (*)  | 839 us (-100%)
-------------+-------------+----------------
Total        | 263 ms (*)  | 930 us (-100%)

-------------+-------------+----------------
length=10**1 | 1.09 us (*) | 297 ns (-73%)
length=10**3 | 2.19 us (*) | 1.58 us (-28%)
length=10**2 | 1.19 us (*) | 409 ns (-66%)
length=10**4 | 10.5 us (*) | 12.3 us (+17%)
-------------+-------------+----------------
Total        | 14.9 us (*) | 14.6 us

-------------+-------------+----------------
length=10**1 | 3.47 us (*) | 317 ns (-91%)
length=10**3 | 263 us (*)  | 1.07 us (-100%)
length=10**2 | 26.4 us (*) | 383 ns (-99%)
length=10**4 | 2.65 ms (*) | 6.75 us (-100%)
-------------+-------------+----------------
Total        | 2.94 ms (*) | 8.52 us (-100%)

-------------+-------------+----------------
length=10**1 | 1.16 us (*) | 319 ns (-72%)
length=10**3 | 2.25 us (*) | 1.62 us (-28%)
length=10**2 | 1.25 us (*) | 432 ns (-65%)
length=10**4 | 13.4 us (*) | 12.4 us (-7%)
-------------+-------------+----------------
Total        | 18 us (*)   | 14.7 us (-18%)

-------------+-------------+----------------
length=10**1 | 267 us (*)  | 2.52 us (-99%)
length=10**3 | 26.2 ms (*) | 210 us (-99%)
length=10**2 | 2.63 ms (*) | 21.3 us (-99%)
length=10**4 | 264 ms (*)  | 2.98 ms (-99%)
-------------+-------------+----------------
Total        | 293 ms (*)  | 3.21 ms (-99%)

-------------+-------------+----------------
length=10**1 | 263 us (*)  | 1.29 us (-100%)
length=10**3 | 26.1 ms (*) | 86.6 us (-100%)
length=10**2 | 2.63 ms (*) | 9.02 us (-100%)
length=10**4 | 261 ms (*)  | 925 us (-100%)
-------------+-------------+----------------
Total        | 290 ms (*)  | 1.02 ms (-100%)

-------------+-------------+----------------
length=10**1 | 1.14 us (*) | 317 ns (-72%)
length=10**3 | 2.24 us (*) | 1.6 us (-28%)
length=10**2 | 1.23 us (*) | 428 ns (-65%)
length=10**4 | 10.5 us (*) | 12.3 us (+17%)
-------------+-------------+----------------
Total        | 15.1 us (*) | 14.7 us

-------------+-------------+----------------
length=10**1 | 3.48 us (*) | 281 ns (-92%)
length=10**3 | 267 us (*)  | 1.77 us (-99%)
length=10**2 | 26.7 us (*) | 424 ns (-98%)
length=10**4 | 2.67 ms (*) | 13.9 us (-99%)
-------------+-------------+----------------
Total        | 2.97 ms (*) | 16.3 us (-99%)

-------------+-------------+----------------
length=10**1 | 1.14 us (*) | 277 ns (-76%)
length=10**3 | 2.32 us (*) | 1.57 us (-32%)
length=10**2 | 1.24 us (*) | 391 ns (-68%)
length=10**4 | 10.6 us (*) | 12.3 us (+17%)
-------------+-------------+----------------
Total        | 15.3 us (*) | 14.6 us

-------------+-------------+----------------
length=10**1 | 266 us (*)  | 3.26 us (-99%)
length=10**3 | 26.4 ms (*) | 285 us (-99%)
length=10**2 | 2.65 ms (*) | 28.9 us (-99%)
length=10**4 | 266 ms (*)  | 3.73 ms (-99%)
-------------+-------------+----------------
Total        | 295 ms (*)  | 4.04 ms (-99%)

-------------+-------------+----------------
length=10**1 | 265 us (*)  | 2.04 us (-99%)
length=10**3 | 26.2 ms (*) | 165 us (-99%)
length=10**2 | 2.64 ms (*) | 17 us (-99%)
length=10**4 | 263 ms (*)  | 1.75 ms (-99%)
-------------+-------------+----------------
Total        | 292 ms (*)  | 1.93 ms (-99%)

-------------+-------------+----------------
length=10**1 | 1.12 us (*) | 278 ns (-75%)
length=10**3 | 2.25 us (*) | 1.59 us (-29%)
length=10**2 | 1.21 us (*) | 389 ns (-68%)
length=10**4 | 10.5 us (*) | 12.3 us (+17%)
-------------+-------------+----------------
Total        | 15.1 us (*) | 14.6 us

-------------+-------------+----------------
length=10**1 | 3.71 us (*) | 306 ns (-92%)
length=10**3 | 289 us (*)  | 2.61 us (-99%)
length=10**2 | 28.9 us (*) | 532 ns (-98%)
length=10**4 | 2.88 ms (*) | 22.4 us (-99%)
-------------+-------------+----------------
Total        | 3.2 ms (*)  | 25.8 us (-99%)

-------------+-------------+----------------
length=10**1 | 1.16 us (*) | 299 ns (-74%)
length=10**3 | 2.36 us (*) | 1.59 us (-32%)
length=10**2 | 1.27 us (*) | 413 ns (-68%)
length=10**4 | 10.6 us (*) | 12.3 us (+16%)
-------------+-------------+----------------
Total        | 15.4 us (*) | 14.6 us (-5%)

-------------+-------------+----------------
length=10**1 | 289 us (*)  | 3.99 us (-99%)
length=10**3 | 28.5 ms (*) | 362 us (-99%)
length=10**2 | 2.86 ms (*) | 36.7 us (-99%)
length=10**4 | 287 ms (*)  | 5.18 ms (-98%)
-------------+-------------+----------------
Total        | 319 ms (*)  | 5.59 ms (-98%)

-------------+-------------+----------------
length=10**1 | 288 us (*)  | 2.91 us (-99%)
length=10**3 | 28.5 ms (*) | 242 us (-99%)
length=10**2 | 2.86 ms (*) | 24.7 us (-99%)
length=10**4 | 284 ms (*)  | 2.53 ms (-99%)
-------------+-------------+----------------
Total        | 316 ms (*)  | 2.8 ms (-99%)

-------------+-------------+----------------
length=10**1 | 1.13 us (*) | 301 ns (-73%)
length=10**3 | 2.3 us (*)  | 1.59 us (-31%)
length=10**2 | 1.24 us (*) | 409 ns (-67%)
length=10**4 | 10.6 us (*) | 12.1 us (+15%)
-------------+-------------+----------------
Total        | 15.2 us (*) | 14.4 us (-5%)

-------------+-------------+----------------
length=10**1 | 4.28 us (*) | 1.58 us (-63%)
length=10**3 | 320 us (*)  | 11.1 us (-97%)
length=10**2 | 32.3 us (*) | 2.56 us (-92%)
length=10**4 | 3.17 ms (*) | 96.6 us (-97%)
-------------+-------------+----------------
Total        | 3.52 ms (*) | 112 us (-97%)

-------------+-------------+----------------
length=10**1 | 1.44 us (*) | 1.47 us
length=10**3 | 2.43 us (*) | 2.77 us (+14%)
length=10**2 | 1.52 us (*) | 1.64 us (+8%)
length=10**4 | 10.6 us (*) | 13.3 us (+25%)
-------------+-------------+----------------
Total        | 16 us (*)   | 19.2 us (+20%)

-------------+-------------+----------------
length=10**1 | 316 us (*)  | 16 us (-95%)
length=10**3 | 31.3 ms (*) | 1.46 ms (-95%)
length=10**2 | 3.14 ms (*) | 147 us (-95%)
length=10**4 | 313 ms (*)  | 15.3 ms (-95%)
-------------+-------------+----------------
Total        | 347 ms (*)  | 16.9 ms (-95%)

-------------+-------------+----------------
length=10**1 | 317 us (*)  | 14.7 us (-95%)
length=10**3 | 31.3 ms (*) | 1.34 ms (-96%)
length=10**2 | 3.17 ms (*) | 135 us (-96%)
length=10**4 | 313 ms (*)  | 13.8 ms (-96%)
-------------+-------------+----------------
Total        | 347 ms (*)  | 15.3 ms (-96%)

-------------+-------------+----------------
length=10**1 | 1.43 us (*) | 1.45 us
length=10**3 | 2.36 us (*) | 2.58 us (+9%)
length=10**2 | 1.51 us (*) | 1.57 us
length=10**4 | 10.5 us (*) | 13.2 us (+26%)
-------------+-------------+----------------
Total        | 15.8 us (*) | 18.8 us (+19%)
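A benchmark of this shape can be approximated with the standard timeit module. The sketch below is not the script that produced the numbers above: the two workloads, the helper names bench and run, and the repeat counts are all assumptions made for illustration.

```python
import timeit

HANDLERS = ("ignore", "replace", "surrogateescape", "surrogatepass")


def bench(text, handler, repeat=3, number=100):
    """Return the best per-call time, in seconds, of text.encode('utf-8', handler)."""
    timer = timeit.Timer(lambda: text.encode("utf-8", handler))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def run(lengths=(10, 10**2, 10**3, 10**4)):
    """Time every handler on two assumed workloads for each input length."""
    results = {}
    for handler in HANDLERS:
        for length in lengths:
            # Assumed workloads: every character unencodable, or only the last one.
            cases = {
                "all surrogates": "\udc80" * length,
                "ascii + one surrogate": "a" * (length - 1) + "\udc80",
            }
            for name, text in cases.items():
                results[(handler, name, length)] = bench(text, handler)
    return results


if __name__ == "__main__":
    for (handler, name, length), seconds in run().items():
        print(f"{handler:16} {name:22} length={length:<6} {seconds * 1e6:10.2f} us")
```

Running the same script under the unpatched and the patched interpreter and comparing the per-case times would give a before/after comparison similar in spirit to the tables above.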
Oh, the default code path for error handlers uses a loop to check for non-ASCII characters. It can be replaced with PyUnicode_IS_ASCII(str), which has O(1) complexity. Done in the new patch.
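The same idea can be illustrated at the Python level. This is only an analogy to the C change, not code from the patch: the function names are hypothetical, and it relies on str.isascii() (Python 3.7+), which in CPython reads the ASCII flag stored in the string object instead of scanning it.

```python
def is_ascii_loop(text):
    """O(n) check: inspect every character, analogous to a per-character loop."""
    return all(ord(ch) < 128 for ch in text)


def is_ascii_flag(text):
    """O(1) check in CPython: str.isascii() consults the string's ASCII flag."""
    return text.isascii()


samples = ["", "plain ascii", "caf\u00e9", "a\udc80b"]
for sample in samples:
    # Both checks must agree; only their cost differs.
    assert is_ascii_loop(sample) == is_ascii_flag(sample)
```

Both functions return the same answer for any string; the flag-based version simply avoids walking the characters.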
New changeset 2b5357b38366 by Victor Stinner in branch 'default':
I pushed my optimization.