gh-119396: Optimize PyUnicode_FromFormat() UTF-8 decoder #119398

Merged: 3 commits into python:main from utf8_writer, May 22, 2024

@vstinner vstinner commented May 22, 2024

Add unicode_decode_utf8_writer(), which writes decoded characters directly into a _PyUnicodeWriter and so avoids creating a temporary string. Optimize PyUnicode_FromFormat() by using the new unicode_decode_utf8_writer().

Rename unicode_fromformat_write_cstr() to unicode_fromformat_write_utf8().

Microbenchmark on the code:

return PyUnicode_FromFormat(
    "%s %s %s %s %s.",
    "format", "multiple", "utf8", "short", "strings");

Result: 620 ns +- 8 ns -> 382 ns +- 2 ns: 1.62x faster.
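
To make the direct-write idea above concrete, here is a rough Python model. It is not the C implementation in Objects/unicodeobject.c: a plain list stands in for _PyUnicodeWriter, the names decode_utf8_into and format_s are hypothetical, only %s is handled, and validation is deliberately incomplete (no overlong or surrogate checks). The point is only that each decoded character goes straight into the shared output buffer, without building an intermediate string per argument.

```python
def decode_utf8_into(writer: list, data: bytes) -> None:
    """Decode UTF-8 `data` and append each character directly to `writer`.

    Simplified for illustration: it rejects malformed lead and continuation
    bytes but does not check for overlong encodings or surrogates.
    """
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:              # 1-byte sequence (ASCII)
            size, cp = 1, lead
        elif lead >> 5 == 0b110:     # 2-byte sequence
            size, cp = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:    # 3-byte sequence
            size, cp = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:   # 4-byte sequence
            size, cp = 4, lead & 0x07
        else:
            raise ValueError(f"invalid start byte at position {i}")
        if i + size > n:
            raise ValueError(f"truncated sequence at position {i}")
        for cont in data[i + 1:i + size]:
            if cont >> 6 != 0b10:
                raise ValueError(f"invalid continuation byte near position {i}")
            cp = (cp << 6) | (cont & 0x3F)
        writer.append(chr(cp))       # written straight into the shared buffer
        i += size


def format_s(fmt: bytes, *args: bytes) -> str:
    """Hypothetical helper: substitute UTF-8 `args` for each %s in `fmt`."""
    pieces = fmt.split(b"%s")
    if len(pieces) - 1 != len(args):
        raise ValueError("number of %s placeholders and arguments differ")
    writer = []                      # stand-in for _PyUnicodeWriter
    for k, piece in enumerate(pieces):
        decode_utf8_into(writer, piece)
        if k < len(args):
            decode_utf8_into(writer, args[k])
    return "".join(writer)


print(format_s(b"%s %s %s %s %s.",
               b"format", b"multiple", b"utf8", b"short", b"strings"))
```

The single join at the end plays the role of finishing the writer; the format string and every argument share one output buffer.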

vstinner commented May 22, 2024

Benchmark:

diff --git a/Modules/_testcapimodule.c b/Modules/_testcapimodule.c
index f99ebf0dde..0752b2b1d2 100644
--- a/Modules/_testcapimodule.c
+++ b/Modules/_testcapimodule.c
@@ -3312,6 +3312,14 @@ function_set_warning(PyObject *Py_UNUSED(module), PyObject *Py_UNUSED(args))
     Py_RETURN_NONE;
 }
 
+static PyObject *
+bench(PyObject *Py_UNUSED(module), PyObject *Py_UNUSED(args))
+{
+    return PyUnicode_FromFormat(
+        "%s %s %s %s %s.",
+        "format", "multiple", "utf8", "short", "strings");
+}
+
 static PyMethodDef TestMethods[] = {
     {"set_errno",               set_errno,                       METH_VARARGS},
     {"test_config",             test_config,                     METH_NOARGS},
@@ -3454,6 +3462,7 @@ static PyMethodDef TestMethods[] = {
     {"check_pyimport_addmodule", check_pyimport_addmodule, METH_VARARGS},
     {"test_weakref_capi", test_weakref_capi, METH_NOARGS},
     {"function_set_warning", function_set_warning, METH_NOARGS},
+    {"bench", bench, METH_NOARGS},
     {NULL, NULL} /* sentinel */
 };

Command:

./python -m venv env
env/bin/python -m pip install pyperf
env/bin/python -m pyperf timeit -s 'import _testcapi; func=_testcapi.bench' 'func()' -v -o ref.json

Result, Python built with gcc -O3:

620 ns +- 8 ns -> 382 ns +- 2 ns: 1.62x faster
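
For reference, the "1.62x faster" figure is just the ratio of the two mean timings reported above. A minimal sketch of that arithmetic (the helper name speedup is made up for illustration, and the +- spreads are ignored):

```python
def speedup(ref_ns: float, new_ns: float) -> float:
    """Return how many times faster the new timing is than the reference."""
    return ref_ns / new_ns


ref_ns = 620.0   # mean before the change, from the result above
new_ns = 382.0   # mean after the change, from the result above

ratio = speedup(ref_ns, new_ns)
saved = (ref_ns - new_ns) / ref_ns   # fraction of the original time saved

print(f"{ratio:.2f}x faster")        # prints "1.62x faster"
print(f"{saved:.1%} less time per call")
```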

@vstinner

Oh, there was a performance regression on b"abc".decode(): I fixed it.

Benchmark:

import pyperf
import _testcapi
runner = pyperf.Runner()

utf8 = b'abc'
runner.bench_func('abc', utf8.decode)

utf8 = 'abcé'.encode()
runner.bench_func('abc + UTF-8', utf8.decode)

utf8 = 'éabc'.encode()
runner.bench_func('UTF-8 + abc', utf8.decode)

utf8 = b'x' * (1024 * 1024)
runner.bench_func('ASCII 1 MiB', utf8.decode)

utf8 = ('x' * (1024 * 1024) + 'é').encode()
runner.bench_func('ASCII 1 MiB + UTF-8', utf8.decode)

utf8 = ('é' + 'x' * (1024 * 1024)).encode()
runner.bench_func('UTF-8 + ASCII 1 MiB', utf8.decode)

utf8 = ('€' + 'x' * (1024 * 1024)).encode()
runner.bench_func('UTF-8 euro + ASCII 1 MiB', utf8.decode)

Results (Python built with gcc -O3, with CPU isolation):

+---------------------+---------+-----------------------+
| Benchmark           | ref     | change                |
+=====================+=========+=======================+
| abc                 | 73.7 ns | 74.7 ns: 1.01x slower |
+---------------------+---------+-----------------------+
| abc + UTF-8         | 167 ns  | 172 ns: 1.03x slower  |
+---------------------+---------+-----------------------+
| ASCII 1 MiB         | 118 us  | 118 us: 1.00x faster  |
+---------------------+---------+-----------------------+
| ASCII 1 MiB + UTF-8 | 1.08 ms | 1.07 ms: 1.00x faster |
+---------------------+---------+-----------------------+
| UTF-8 + ASCII 1 MiB | 572 us  | 570 us: 1.00x faster  |
+---------------------+---------+-----------------------+
| Geometric mean      | (ref)   | 1.00x slower          |
+---------------------+---------+-----------------------+

Benchmark hidden because not significant (2): UTF-8 + abc, UTF-8 euro + ASCII 1 MiB

=> There is no significant impact on bytes.decode() performance (no slowdown).
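
As a rough cross-check of the "Geometric mean" row, the sketch below recomputes a geometric mean of the change/ref ratios from the rounded values shown in the table. This is an approximation, not pyperf's own calculation: it uses only the five visible rows and the rounded timings, so it will not exactly reproduce pyperf's figure, which works from the raw measurements.

```python
import math

# (ref, change) pairs copied from the table; each pair uses the same unit,
# so the ratio is unit-free.
rows = {
    "abc": (73.7, 74.7),                   # ns
    "abc + UTF-8": (167.0, 172.0),         # ns
    "ASCII 1 MiB": (118.0, 118.0),         # us
    "ASCII 1 MiB + UTF-8": (1.08, 1.07),   # ms
    "UTF-8 + ASCII 1 MiB": (572.0, 570.0), # us
}

# A ratio above 1.0 means the change is slower, below 1.0 means faster.
ratios = [change / ref for ref, change in rows.values()]

# Geometric mean: exponential of the mean of the logarithms.
geo_mean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

print(f"geometric mean of change/ref: {geo_mean:.3f}")
```

A value within about 1% of 1.0 is consistent with the conclusion that the change has no meaningful effect on bytes.decode().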

@vstinner

cc @serhiy-storchaka

@serhiy-storchaka serhiy-storchaka left a comment

LGTM.

@vstinner

I enabled automerge. Thanks for the review @serhiy-storchaka.

@vstinner vstinner disabled auto-merge May 22, 2024 20:45
@vstinner vstinner enabled auto-merge (squash) May 22, 2024 20:45
@vstinner vstinner changed the title gh-119182: Optimize PyUnicode_FromFormat() UTF-8 decoder gh-119398: Optimize PyUnicode_FromFormat() UTF-8 decoder May 22, 2024
@vstinner vstinner changed the title gh-119398: Optimize PyUnicode_FromFormat() UTF-8 decoder gh-119396: Optimize PyUnicode_FromFormat() UTF-8 decoder May 22, 2024
@vstinner vstinner merged commit 9b422fc into python:main May 22, 2024
34 checks passed
@vstinner vstinner deleted the utf8_writer branch May 22, 2024 21:05
estyxx pushed a commit to estyxx/cpython that referenced this pull request Jul 17, 2024