gh-119396: Optimize unicode_decode_utf8_writer() #119957

Merged 1 commit on Jun 3, 2024

vstinner (Member) commented Jun 2, 2024

Take the ascii_decode() fast-path even if dest is not aligned on size_t bytes.
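
As an illustration of an ASCII fast path that does not care about the alignment of `dest`, here is a minimal sketch. It is not CPython's `ascii_decode()`: the helper name `ascii_prefix_copy`, its signature, and the use of `memcpy()` for word-sized loads and stores are assumptions made for this example. It tests `sizeof(size_t)` bytes at a time for a set high bit, copies them if they are all ASCII, and falls back to a byte loop for the tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration only; not CPython's ascii_decode().
 * Copies the leading ASCII bytes of [start, end) to dest and returns how
 * many bytes were copied. Neither pointer needs to be size_t-aligned,
 * because every word access goes through memcpy(). */
static size_t
ascii_prefix_copy(const char *start, const char *end, unsigned char *dest)
{
    assert(start <= end);

    const char *p = start;
    /* 0x80 repeated in every byte of a size_t: 0x8080...80. */
    const size_t high_bits = (size_t)-1 / 0xFF * 0x80;

    /* Word-at-a-time loop: stop at the first word containing a non-ASCII byte. */
    while ((size_t)(end - p) >= sizeof(size_t)) {
        size_t word;
        memcpy(&word, p, sizeof(word));      /* alignment-safe load */
        if (word & high_bits) {
            break;
        }
        memcpy(dest, &word, sizeof(word));   /* alignment-safe store */
        p += sizeof(word);
        dest += sizeof(word);
    }

    /* Byte loop: finish the tail, or locate the exact non-ASCII byte. */
    while (p < end && ((unsigned char)*p & 0x80) == 0) {
        *dest++ = (unsigned char)*p++;
    }
    return (size_t)(p - start);
}
```

Calling it with a deliberately misaligned destination such as `buf + 1` still goes through the word loop; compilers typically lower a fixed-size `memcpy()` like this to a single load or store on platforms that allow unaligned access.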

vstinner commented Jun 2, 2024

Benchmark (C part, _testcapimodule.c):

static PyObject *
bench(PyObject *Py_UNUSED(module), PyObject *args)
{
    const char *str;
    Py_ssize_t loops;
    // "n": loop count passed by pyperf; "y": bytes object as a C string
    if (!PyArg_ParseTuple(args, "ny", &loops, &str)) {
        return NULL;
    }

    PyTime_t t1, t2;
    (void)PyTime_PerfCounterRaw(&t1);
    for (Py_ssize_t i = 0; i < loops; i++) {
        // %s decodes str as UTF-8; the result is discarded immediately
        PyObject *obj = PyUnicode_FromFormat("=%s", str);
        if (obj == NULL) {
            return NULL;
        }
        Py_DECREF(obj);
    }
    (void)PyTime_PerfCounterRaw(&t2);
    // Elapsed time in seconds, as bench_time_func() expects
    return PyFloat_FromDouble(PyTime_AsSecondsDouble(t2 - t1));
}

Benchmark (Python part):

import pyperf
import _testcapi
runner = pyperf.Runner()

arg = b'x' * 10
runner.bench_time_func("FromFormat('x'*10)", _testcapi.bench, arg)
arg = b'x' * 1_000
runner.bench_time_func("FromFormat('x'*1_000)", _testcapi.bench, arg)

Result:

+-----------------------+---------+-----------------------+
| Benchmark             | ref     | change                |
+=======================+=========+=======================+
| FromFormat('x'*10)    | 75.9 ns | 68.7 ns: 1.11x faster |
+-----------------------+---------+-----------------------+
| FromFormat('x'*1_000) | 200 ns  | 191 ns: 1.05x faster  |
+-----------------------+---------+-----------------------+
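
For readers who want to check the arithmetic in the table, here is a small sketch that recomputes the speedup factor and the absolute saving from the rounded timings shown. The function names `speedup`, `saved_ns` and `print_row` are made up for this example and are not part of pyperf.

```c
#include <stdio.h>

/* Speedup factor: reference time divided by the time after the change. */
static double
speedup(double ref_ns, double change_ns)
{
    return ref_ns / change_ns;
}

/* Absolute time saved per call, in nanoseconds. */
static double
saved_ns(double ref_ns, double change_ns)
{
    return ref_ns - change_ns;
}

/* Print one row in a form comparable to the table above. */
static void
print_row(const char *name, double ref_ns, double change_ns)
{
    printf("%s: %.2fx faster, %.1f ns saved\n",
           name, speedup(ref_ns, change_ns), saved_ns(ref_ns, change_ns));
}
```

With the rounded values from the table, `speedup(75.9, 68.7)` is roughly 1.10 and `speedup(200, 191)` is roughly 1.05, while the savings are about 7 ns and 9 ns. The reported 1.11x for the short string is presumably computed by pyperf from unrounded means, so recomputing from the rounded table can differ in the last digit. The absolute saving is of similar size in both rows, so it is a larger share of the short-string call.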

vstinner changed the title from "Optimize unicode_decode_utf8_writer()" to "gh-119396: Optimize unicode_decode_utf8_writer()" on Jun 2, 2024
vstinner merged commit 3ea9b92 into python:main on Jun 3, 2024
36 checks passed
vstinner deleted the ascii_decode branch on June 3, 2024 06:45
mliezun pushed a commit to mliezun/cpython that referenced this pull request Jun 3, 2024
barneygale pushed a commit to barneygale/cpython that referenced this pull request Jun 5, 2024