Use _PyBytesWriter for bytes%args #69536
The attached patch is a work in progress that uses the new private _PyBytesWriter API in bytes % args. Using the _PyBytesWriter API will allow further optimizations. For example, it avoids creating a temporary bytes object to format b'%f' % 1.2. The _PyBytesWriter API allocates a small 512-byte buffer on the stack to delay the allocation of the final bytes object. It can avoid the need to call _PyBytes_Resize() completely, or at least reduce the number of calls. See also issue bpo-25318, which added the _PyBytesWriter API.
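To make the buffering strategy concrete, here is a minimal Python model of a writer that starts with a small fixed buffer and grows only when it has to. It is an illustration only: the class name `BytesWriterModel`, its attributes, and the 25% growth policy are assumptions made for this sketch, not the C `_PyBytesWriter` API.

```python
class BytesWriterModel:
    """Toy model of a buffered bytes writer (hypothetical, not the C API)."""

    SMALL_BUFFER_SIZE = 512  # initial buffer size, as described above

    def __init__(self):
        self.buffer = bytearray(self.SMALL_BUFFER_SIZE)
        self.size = 0             # number of bytes written so far
        self.overallocate = True  # a caller may switch this off before the final write
        self.resize_count = 0     # how many times the buffer had to grow

    def write(self, data):
        needed = self.size + len(data)
        if needed > len(self.buffer):
            new_size = needed
            if self.overallocate:
                # Assumed growth policy for this sketch: 25% extra room.
                new_size += new_size // 4
            self.buffer.extend(bytes(new_size - len(self.buffer)))
            self.resize_count += 1
        self.buffer[self.size:needed] = data
        self.size = needed

    def finish(self):
        # Only here is the final, exactly sized bytes object created.
        return bytes(self.buffer[:self.size])


# Short output fits in the initial buffer: no resize at all.
writer = BytesWriterModel()
for chunk in (b"hello ", b"x" * 10):
    writer.write(chunk)
result = writer.finish()

# Long output: one overallocating resize, then the next write fits.
big = BytesWriterModel()
big.write(b"x" * 1000)
big.write(b"y")

# With overallocation disabled, the buffer grows to the exact size needed.
exact = BytesWriterModel()
exact.overallocate = False
exact.write(b"z" * 600)
```

In this model the short case never resizes, which mirrors the goal of avoiding _PyBytes_Resize() calls where possible.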
See also PEP 461, "Adding % formatting to bytes and bytearray". FYI, bytes % args is tested by test_format (good to know for quickly testing changes).
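For readers who have not used the feature, a few examples of the bytes % args behaviour specified by PEP 461 may help. This is only a sketch of the observable results on Python 3.5 or later; the variable names are arbitrary.

```python
# Byte-string formatting as specified by PEP 461 (Python 3.5+).
float_result = b"%f" % 1.2                  # float, default 6 decimals
int_result = b"%d" % 12345                  # decimal integer
bytes_result = b"hello %s" % b"x"           # %s accepts bytes-like objects
hex_result = b"x=%x" % 0xabcdef             # lowercase hexadecimal
array_result = bytearray(b"%i items") % 3   # bytearray % args gives a bytearray

print(float_result, int_result, bytes_result, hex_result, array_result)
```

From a CPython checkout, the corresponding regression tests can usually be run with `python -m test test_format`.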
bench_bytes_format.py: a micro-benchmark testing a few formats. Some tests focus on the implementation of _PyBytesWriter to ensure that the optimization is efficient.

Except for a single test (which is not really relevant, as it takes less than 500 nanoseconds), all tests are faster. The b"xxxxxx %s" % b"y" test confirms that the optimization disabling overallocation for the last write is effective.

Results, comparing campaign "orig" (the reference, marked (*)) with campaign "writer":

Tests                                            | orig        | writer
-------------------------------------------------+-------------+---------------
fmt = b"hello %s"; arg = b"x" * 10; fmt % arg    | 98 ns (*)   | 86 ns (-12%)
fmt = b"hello %s"; arg = b"x" * 100; fmt % arg   | 85 ns (*)   | 87 ns
fmt = b"hello %s"; arg = b"x" * 10**3; fmt % arg | 298 ns (*)  | 208 ns (-30%)
fmt = b"hello %s"; arg = b"x" * 10**5; fmt % arg | 4.8 us (*)  | 4.39 us (-9%)
-------------------------------------------------+-------------+---------------
Total                                            | 5.28 us (*) | 4.77 us (-10%)

Tests                                  | orig        | writer
---------------------------------------+-------------+---------------
fmt = b"x" * 10 + b"%s"; fmt % b"y"    | 99 ns (*)   | 81 ns (-18%)
fmt = b"x" * 100 + b"%s"; fmt % b"y"   | 189 ns (*)  | 87 ns (-54%)
fmt = b"x" * 10**3 + b"%s"; fmt % b"y" | 1.12 us (*) | 209 ns (-81%)
fmt = b"x" * 10**5 + b"%s"; fmt % b"y" | 88.4 us (*) | 8.49 us (-90%)
---------------------------------------+-------------+---------------
Total                                  | 89.8 us (*) | 8.87 us (-90%)

Tests                                                     | orig        | writer
----------------------------------------------------------+-------------+---------------
n = 200; fmt = b"%f" * n; arg = tuple([1.2]*n); fmt % arg | 37.2 us (*) | 29.6 us (-21%)

Tests                                                       | orig        | writer
------------------------------------------------------------+-------------+---------------
n = 200; fmt = b"%f" * n; arg = tuple([12345]*n); fmt % arg | 49.4 us (*) | 42.8 us (-13%)
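The bench_bytes_format.py script itself is not reproduced in this thread. As a rough stand-in, the following sketch times a few of the same statements with the standard `timeit` module. The helper names `best_time` and `format_time`, the repeat counts, and the choice of cases are assumptions of this sketch, and absolute timings will differ from the figures above.

```python
import timeit

# (setup, statement) pairs taken from the benchmark results above.
CASES = [
    ('fmt = b"hello %s"; arg = b"x" * 10', "fmt % arg"),
    ('fmt = b"hello %s"; arg = b"x" * 10**3', "fmt % arg"),
    ('fmt = b"x" * 10**3 + b"%s"', 'fmt % b"y"'),
    ('n = 200; fmt = b"%f" * n; arg = tuple([1.2]*n)', "fmt % arg"),
]


def best_time(setup, stmt, number=10_000, repeat=3):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(timings) / number


def format_time(seconds):
    """Format a duration in nanoseconds or microseconds."""
    if seconds < 1e-6:
        return f"{seconds * 1e9:.0f} ns"
    return f"{seconds * 1e6:.3g} us"


results = {f"{setup}; {stmt}": best_time(setup, stmt) for setup, stmt in CASES}
for test, seconds in results.items():
    print(f"{test:<60} | {format_time(seconds)}")
```

Running the same script under an unpatched and a patched interpreter gives two columns that can be compared in the same way as "orig" and "writer".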
New changeset b2f3cbdc0f2d by Victor Stinner in branch 'default':
bytes_formatlong.patch: Fast-path for b'%d' % int and other integer formatters. It avoids creating a temporary bytes object by writing directly into the writer, as '%d' % int (Unicode) already does.
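The fast-path changes how the result is built, not what it contains. The sketch below shows only the observable behaviour of the integer formatters, which must stay the same with or without the patch. The helper `format_int_slow` is a hypothetical name used here for comparison; it is not part of the patch.

```python
def format_int_slow(value):
    """Reference result built through a temporary str, for comparison only."""
    return str(value).encode("ascii")


# b'%d' and b'%i' must match the decimal representation for any int size.
for value in (0, 12345, -42, 10 ** 50 - 1):
    assert b"%d" % value == format_int_slow(value)
    assert b"%i" % value == format_int_slow(value)

hex_lower = b"%x" % 0xabcdef   # lowercase hexadecimal
hex_upper = b"%X" % 0xabcdef   # uppercase hexadecimal
octal = b"%o" % 8              # octal
padded = b"%05d" % 42          # zero padding to a width of 5
```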
I wrote the bench_bytes_int.py micro-benchmark; results are below. I didn't expect a real difference even for simple code like b'%d' % 12345 (32% faster), so I consider that enough to apply the optimization.

Results, comparing campaign "orig" (the reference, marked (*)) with campaign "writer":

Tests                                                       | orig        | writer
------------------------------------------------------------+-------------+---------------
n = 1; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg   | 155 ns (*)  | 105 ns (-32%)
n = 5; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg   | 546 ns (*)  | 306 ns (-44%)
n = 10; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg  | 1.03 us (*) | 543 ns (-47%)
n = 25; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg  | 2.49 us (*) | 1.27 us (-49%)
n = 100; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 10.1 us (*) | 5.25 us (-48%)
n = 200; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 20.5 us (*) | 10.8 us (-47%)
n = 500; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 48.8 us (*) | 24.6 us (-50%)
------------------------------------------------------------+-------------+---------------
Total                                                       | 83.6 us (*) | 42.9 us (-49%)

Tests                                                          | orig        | writer
---------------------------------------------------------------+-------------+---------------
n = 1; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg   | 173 ns (*)  | 123 ns (-29%)
n = 5; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg   | 602 ns (*)  | 372 ns (-38%)
n = 10; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg  | 1.14 us (*) | 668 ns (-42%)
n = 25; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg  | 2.8 us (*)  | 1.56 us (-44%)
n = 100; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 11.1 us (*) | 6.12 us (-45%)
n = 200; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 21.5 us (*) | 12.1 us (-44%)
n = 500; fmt = b"x=%d " * n; arg = tuple([12345]*n); fmt % arg | 53.5 us (*) | 29.8 us (-44%)
---------------------------------------------------------------+-------------+---------------
Total                                                          | 90.8 us (*) | 50.7 us (-44%)

Tests                                                       | orig        | writer
------------------------------------------------------------+-------------+---------------
n = 1; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg   | 155 ns (*)  | 105 ns (-32%)
n = 5; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg   | 545 ns (*)  | 306 ns (-44%)
n = 10; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg  | 1.03 us (*) | 543 ns (-47%)
n = 25; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg  | 2.49 us (*) | 1.26 us (-49%)
n = 100; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 9.9 us (*)  | 5.07 us (-49%)
n = 200; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 19.8 us (*) | 10.1 us (-49%)
n = 500; fmt = b"%d" * n; arg = tuple([12345]*n); fmt % arg | 48.9 us (*) | 24.5 us (-50%)
------------------------------------------------------------+-------------+---------------
Total                                                       | 82.8 us (*) | 41.9 us (-49%)

Tests                                                             | orig        | writer
------------------------------------------------------------------+-------------+---------------
n = 1; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg   | 183 ns (*)  | 132 ns (-28%)
n = 5; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg   | 651 ns (*)  | 419 ns (-36%)
n = 10; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg  | 1.23 us (*) | 761 ns (-38%)
n = 25; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg  | 2.96 us (*) | 1.79 us (-40%)
n = 100; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 11.9 us (*) | 7.13 us (-40%)
n = 200; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 23.5 us (*) | 14 us (-41%)
n = 500; fmt = b"x=%d " * n; arg = tuple([0xabcdef]*n); fmt % arg | 58.3 us (*) | 34.3 us (-41%)
------------------------------------------------------------------+-------------+---------------
Total                                                             | 98.6 us (*) | 58.5 us (-41%)

Tests                                       | orig        | writer
--------------------------------------------+-------------+---------------
fmt = b"%i"; arg = 10 ** 0 - 1; fmt % arg   | 115 ns (*)  | 74 ns (-36%)
fmt = b"%i"; arg = 10 ** 50 - 1; fmt % arg  | 288 ns (*)  | 242 ns (-16%)
fmt = b"%i"; arg = 10 ** 100 - 1; fmt % arg | 538 ns (*)  | 494 ns (-8%)
fmt = b"%i"; arg = 10 ** 150 - 1; fmt % arg | 865 ns (*)  | 812 ns (-6%)
fmt = b"%i"; arg = 10 ** 200 - 1; fmt % arg | 1.33 us (*) | 1.28 us
--------------------------------------------+-------------+---------------
Total                                       | 3.14 us (*) | 2.9 us (-8%)

Tests                                         | orig        | writer
----------------------------------------------+-------------+---------------
fmt = b"x=%i"; arg = 10 ** 0 - 1; fmt % arg   | 140 ns (*)  | 100 ns (-28%)
fmt = b"x=%i"; arg = 10 ** 50 - 1; fmt % arg  | 298 ns (*)  | 249 ns (-16%)
fmt = b"x=%i"; arg = 10 ** 100 - 1; fmt % arg | 548 ns (*)  | 502 ns (-8%)
fmt = b"x=%i"; arg = 10 ** 150 - 1; fmt % arg | 874 ns (*)  | 822 ns (-6%)
----------------------------------------------+-------------+---------------
Total                                         | 1.86 us (*) | 1.67 us (-10%)
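The percentages in the results can be checked from the two timing columns. The sketch below assumes the relative change is computed against the "orig" timing and rounded to the nearest integer, which is consistent with the rows checked here; the function name `percent_change` is hypothetical.

```python
def percent_change(orig, writer):
    """Relative change of `writer` against `orig`, as a rounded percentage."""
    return round((writer - orig) / orig * 100)


# b"%d" with n = 1: 155 ns -> 105 ns
single_int = percent_change(155, 105)
# b"%d" with n = 5: 546 ns -> 306 ns
five_ints = percent_change(546, 306)
# Total of the first b"%d" table: 83.6 us -> 42.9 us
total_ints = percent_change(83.6, 42.9)
# b"%i" with arg = 10 ** 0 - 1: 115 ns -> 74 ns
small_int = percent_change(115, 74)

print(single_int, five_ints, total_ints, small_int)
```

Both arguments must use the same unit, so a row that mixes ns and us has to be converted first.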
New changeset d9a89c9137d2 by Victor Stinner in branch 'default':
New changeset 4d46d1588629 by Victor Stinner in branch 'default':
OK, I implemented all the optimizations that were already implemented in str % args. I am closing the issue.
New changeset 090502a0c69c by Victor Stinner in branch 'default':