Optimize _PyUnicodeWriter implementation #119396
Comments
Add fast paths for str, int and float object types.

Benchmark on %S and %R formats:

+----------------+--------+----------------------+
| Benchmark      | ref    | change               |
+================+========+======================+
| str()          | 654 ns | 556 ns: 1.18x faster |
+----------------+--------+----------------------+
| repr()         | 722 ns | 627 ns: 1.15x faster |
+----------------+--------+----------------------+
| Geometric mean | (ref)  | 1.16x faster         |
+----------------+--------+----------------------+
I do not know how much this is needed, but the code looks correct. Could you please collect some data about types used with |
Sure, here you go (I didn't check width or precision):
For str, int and float:
Patch:

diff --git a/Objects/unicodeobject.c b/Objects/unicodeobject.c
index 480b671390..d2df7bf62d 100644
--- a/Objects/unicodeobject.c
+++ b/Objects/unicodeobject.c
@@ -2750,6 +2750,7 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,
         PyObject *obj = va_arg(*vargs, PyObject *);
         PyObject *str;
         assert(obj);
+fprintf(stderr, "@@@ PyUnicode_FromFormat(%%S): %s\n", Py_TYPE(obj)->tp_name);
         str = PyObject_Str(obj);
         if (!str)
             return NULL;
@@ -2766,6 +2767,7 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,
         PyObject *obj = va_arg(*vargs, PyObject *);
         PyObject *repr;
         assert(obj);
+fprintf(stderr, "@@@ PyUnicode_FromFormat(%%R): %s\n", Py_TYPE(obj)->tp_name);
         repr = PyObject_Repr(obj);
         if (!repr)
             return NULL;
@@ -2782,6 +2784,7 @@ unicode_fromformat_arg(_PyUnicodeWriter *writer,
         PyObject *obj = va_arg(*vargs, PyObject *);
         PyObject *ascii;
         assert(obj);
+fprintf(stderr, "@@@ PyUnicode_FromFormat(%%A): %s\n", Py_TYPE(obj)->tp_name);
         ascii = PyObject_ASCII(obj);
         if (!ascii)
             return NULL;
Thank you. According to these results we can ignore Unfortunately, |
I also wrote PR gh-119398 for this issue: "Optimize PyUnicode_FromFormat() UTF-8 decoder".
Add unicode_decode_utf8_writer() to write characters directly into a _PyUnicodeWriter writer, avoiding the creation of a temporary string.

Optimize PyUnicode_FromFormat() by using the new unicode_decode_utf8_writer(). Rename unicode_fromformat_write_cstr() to unicode_fromformat_write_utf8().

Microbenchmark on the code:

    return PyUnicode_FromFormat(
        "%s %s %s %s %s.",
        "format", "multiple", "utf8", "short", "strings");

Result: 620 ns +- 8 ns -> 382 ns +- 2 ns: 1.62x faster.
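To illustrate the idea of skipping the temporary string, here is a minimal sketch in plain C. It is not CPython code: `toy_writer`, `toy_write_via_temp()` and `toy_write_direct()` are hypothetical names invented for this example, and the sketch only copies bytes (it does no real UTF-8 decoding). It contrasts allocating an intermediate copy with reserving space in the writer and copying into it directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy writer; NOT CPython's _PyUnicodeWriter. */
typedef struct {
    char *buf;
    size_t len;   /* bytes used */
    size_t cap;   /* bytes allocated */
} toy_writer;

/* Make room for `extra` more bytes; returns 0 on success, -1 on error. */
static int toy_writer_reserve(toy_writer *w, size_t extra)
{
    if (w->len + extra <= w->cap) {
        return 0;
    }
    size_t cap = w->cap ? w->cap : 16;
    while (cap < w->len + extra) {
        cap *= 2;
    }
    char *p = realloc(w->buf, cap);
    if (p == NULL) {
        return -1;
    }
    w->buf = p;
    w->cap = cap;
    return 0;
}

/* Old shape: build a temporary copy first, then append it to the writer. */
static int toy_write_via_temp(toy_writer *w, const char *s)
{
    size_t n = strlen(s);
    if (n == 0) {
        return 0;
    }
    char *tmp = malloc(n);              /* extra allocation */
    if (tmp == NULL) {
        return -1;
    }
    memcpy(tmp, s, n);                  /* extra copy */
    int rc = toy_writer_reserve(w, n);
    if (rc == 0) {
        memcpy(w->buf + w->len, tmp, n);
        w->len += n;
    }
    free(tmp);
    return rc;
}

/* New shape: reserve space and write straight into the writer buffer. */
static int toy_write_direct(toy_writer *w, const char *s)
{
    size_t n = strlen(s);
    if (n == 0) {
        return 0;
    }
    if (toy_writer_reserve(w, n) < 0) {
        return -1;
    }
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    return 0;
}
```

Both functions produce the same buffer contents; the direct variant simply saves one allocation and one copy per argument, which is the kind of overhead the microbenchmark above exercises with several short strings.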
Use stringlib to specialize unicode_repr() for each string kind (UCS1, UCS2, UCS4).

Benchmark:

+-------------------------------------+---------+----------------------+
| Benchmark                           | ref     | change2              |
+=====================================+=========+======================+
| repr('abc')                         | 100 ns  | 103 ns: 1.02x slower |
+-------------------------------------+---------+----------------------+
| repr('a' * 100)                     | 369 ns  | 369 ns: 1.00x slower |
+-------------------------------------+---------+----------------------+
| repr(('a' + squote) * 100)          | 1.21 us | 946 ns: 1.27x faster |
+-------------------------------------+---------+----------------------+
| repr(('a' + nl) * 100)              | 1.23 us | 907 ns: 1.36x faster |
+-------------------------------------+---------+----------------------+
| repr(dquote + ('a' + squote) * 100) | 1.08 us | 858 ns: 1.25x faster |
+-------------------------------------+---------+----------------------+
| Geometric mean                      | (ref)   | 1.16x faster         |
+-------------------------------------+---------+----------------------+
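As a rough illustration of per-kind specialization, the sketch below uses a macro as a poor man's template to stamp out one function per character width. This is an assumption-laden toy: CPython's stringlib works by re-including headers with different type macros, and the names `DEFINE_REPR_SIZE` and `repr_size_ucs1/2/4` are hypothetical. The escaping rule is also simplified (only backslash, single quote, `\n`, `\t` and `\r` count as two-character escapes), so it does not reproduce the real `repr()` output size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro "template": generates one specialized function per
   character type, so each loop works on a fixed-width element type. */
#define DEFINE_REPR_SIZE(NAME, CHAR_T)                              \
    static size_t NAME(const CHAR_T *s, size_t len)                 \
    {                                                               \
        size_t size = 2;                                            \
        for (size_t i = 0; i < len; i++) {                          \
            CHAR_T ch = s[i];                                       \
            if (ch == '\'' || ch == '\\' || ch == '\n'              \
                || ch == '\t' || ch == '\r') {                      \
                size += 2;                                          \
            }                                                       \
            else {                                                  \
                size += 1;                                          \
            }                                                       \
        }                                                           \
        return size;                                                \
    }

/* One instantiation per string kind. The initial size of 2 accounts for
   the opening and closing quote. */
DEFINE_REPR_SIZE(repr_size_ucs1, uint8_t)
DEFINE_REPR_SIZE(repr_size_ucs2, uint16_t)
DEFINE_REPR_SIZE(repr_size_ucs4, uint32_t)
```

A caller would pick the variant matching the string's kind once, outside the loop, instead of branching on the kind for every character.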
Optimize unicode_decode_utf8_writer()

Take the ascii_decode() fast-path even if dest is not aligned on size_t bytes.
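The general technique behind an ASCII fast path that does not depend on alignment can be sketched as follows. This is not the CPython `ascii_decode()` implementation; `ascii_prefix_len()` and `copy_ascii_prefix()` are hypothetical helpers. The sketch assumes that loading a word through `memcpy()` is an acceptable way to read unaligned data, and it tests a whole `size_t` at a time for any byte with the high bit set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 0x80 repeated in every byte of a size_t (0x8080...80). */
#define ASCII_HIGH_BITS ((size_t)-1 / 0xFF * 0x80)

/* Return the length of the leading run of ASCII bytes in src[0..len). */
static size_t ascii_prefix_len(const char *src, size_t len)
{
    size_t i = 0;

    /* Word-at-a-time scan. memcpy() performs the load, so neither src
       nor any destination buffer needs to be aligned on size_t. */
    while (len - i >= sizeof(size_t)) {
        size_t word;
        memcpy(&word, src + i, sizeof(word));
        if (word & ASCII_HIGH_BITS) {
            break;  /* at least one non-ASCII byte in this word */
        }
        i += sizeof(size_t);
    }

    /* Byte-at-a-time tail, also locates the exact first non-ASCII byte. */
    while (i < len && ((unsigned char)src[i] & 0x80) == 0) {
        i++;
    }
    return i;
}

/* Copy the ASCII prefix of src into dest (dest may be unaligned) and
   return how many bytes were copied. dest must hold at least len bytes. */
static size_t copy_ascii_prefix(char *dest, const char *src, size_t len)
{
    size_t n = ascii_prefix_len(src, len);
    memcpy(dest, src, n);
    return n;
}
```

The caller would fall back to a full UTF-8 decoder from the returned offset onward whenever the prefix is shorter than the input.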
To prepare for the implementation of gh-119182, I propose first optimizing the _PyUnicodeWriter implementation. For example, optimize the UTF-8 decoder in PyUnicode_FromFormat() by avoiding the creation of a temporary buffer and writing characters directly into the writer.