Optimize PyUnicode_FromString for short ASCII strings #81529
_PyUnicode_FromASCII(s, len) is faster than PyUnicode_FromString(s), because PyUnicode_FromString() uses a temporary _PyUnicodeWriter to support UTF-8. But _PyUnicode_FromASCII("hello", strlen("hello")) is verbose. _PyUnicode_FROM_ASCII() is a simple macro which wraps _PyUnicode_FromASCII() and calls strlen() automatically:
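As an illustration, a minimal sketch of what such a wrapper macro could look like (my reconstruction, not necessarily the exact definition from the patch):

```c
/* Hypothetical sketch, not the exact macro from the patch: wrap
 * _PyUnicode_FromASCII() so callers do not have to spell out strlen()
 * for NUL-terminated ASCII input such as string literals. */
#define _PyUnicode_FROM_ASCII(s) \
    _PyUnicode_FromASCII((s), strlen(s))
```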
I believe recent compilers optimize away calls to strlen().
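To illustrate that point (my example, not from the issue): strlen() applied to a string literal is folded to a constant by optimizing compilers, so a strlen()-based macro adds no runtime cost for literal arguments.

```c
#include <string.h>

/* With -O1 or higher, GCC and Clang compile this to `return 5;`:
 * the strlen() call on a literal is evaluated at compile time. */
size_t literal_len(void)
{
    return strlen("hello");
}
```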
LGTM.
I confirmed at least one measurable speedup:

./python -m pyperf timeit -s 'd={}' -- 'repr(d)'
FromString: Mean +- std dev: 157 ns +- 3 ns

>>> (157-132)/157
0.1592356687898089
Should we make these APIs public?
Most of the changes are in code that is not performance sensitive. I do not think there is a benefit to using the new macro there. If PyUnicode_FromString() is slow, I would prefer to optimize it instead of adding yet another esoteric private function for internal needs. In the case of dict.__repr__() we can get an even bigger gain by using _Py_IDENTIFIER or the more general API proposed by Victor.
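For context, a rough sketch of the _Py_IDENTIFIER pattern mentioned above (a private CPython API; the identifier name here is just an example): the interned string is created once and cached in a static structure, so later lookups are essentially free.

```c
#include <Python.h>

/* Sketch of the _Py_IDENTIFIER caching pattern (private CPython API).
 * The first call interns "hello" and stores it in PyId_hello; later
 * calls just return the cached object. */
_Py_IDENTIFIER(hello);

static PyObject *
get_hello(void)
{
    /* Returns a borrowed reference to the cached interned string,
     * or NULL on error. */
    return _PyUnicode_FromId(&PyId_hello);
}
```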
Because I used just sed.
OK. There is some code like
Of course, I used it just for micro benchmarking. Optimizing it is not a goal. In the case of PR 14273:

$ ./python -m pyperf timeit -s 'd={}' -- 'repr(d)'
.....................
Mean +- std dev: 138 ns +- 2 ns
Oh, wait. Why do we use _PyUnicodeWriter here?
I don't understand how _PyUnicodeWriter could be slow. It does not overallocate by default. It's just a wrapper to implement efficient memory management.
To optimize decoding errors: the error handler can use a replacement string longer than one character. Overallocation is used in that case.
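As a rough illustration of the writer being discussed (private CPython API, simplified sketch): code that appends pieces of unknown total length, such as a decoder calling an error handler, can enable overallocation so the buffer is not resized on every append, and Finish() trims it to the exact size.

```c
#include <Python.h>

/* Simplified sketch of the _PyUnicodeWriter pattern (private CPython API).
 * A decoder appending data of unknown total length can let the writer
 * overallocate its buffer to avoid repeated resizes. */
static PyObject *
build_example(void)
{
    _PyUnicodeWriter writer;
    _PyUnicodeWriter_Init(&writer);
    writer.overallocate = 1;  /* hint that more data is likely to follow */

    if (_PyUnicodeWriter_WriteASCIIString(&writer, "hello", 5) < 0) {
        _PyUnicodeWriter_Dealloc(&writer);
        return NULL;
    }
    /* An error handler could append a multi-character replacement here. */
    if (_PyUnicodeWriter_WriteChar(&writer, 0xFFFD) < 0) {
        _PyUnicodeWriter_Dealloc(&writer);
        return NULL;
    }
    return _PyUnicodeWriter_Finish(&writer);  /* trims to the final size */
}
```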
I misunderstood _PyUnicodeWriter. I thought it caused one more allocation, but it doesn't. But _PyUnicodeWriter is still slow, because gcc and clang are not smart enough to optimize _PyUnicodeWriter_Init() & _PyUnicodeWriter_Prepare(). See this example:
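Something along these lines (my reconstruction of the kind of loop being timed; the iteration count and details are assumptions, not the original snippet):

```c
#include <Python.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical reconstruction of the microbenchmark: build the same short
 * ASCII string many times with each API and compare wall-clock time. */
#define N 100000000L  /* iteration count is a guess */

int main(void)
{
    Py_Initialize();

    clock_t t0 = clock();
    for (long i = 0; i < N; i++) {
        PyObject *s = PyUnicode_FromString("hello");
        Py_DECREF(s);
    }
    double from_string = (double)(clock() - t0) / CLOCKS_PER_SEC;

    t0 = clock();
    for (long i = 0; i < N; i++) {
        PyObject *s = _PyUnicode_FromASCII("hello", 5);
        Py_DECREF(s);
    }
    double from_ascii = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("PyUnicode_FromString: %.2f s\n", from_string);
    printf("_PyUnicode_FromASCII: %.2f s\n", from_ascii);

    Py_FinalizeEx();
    return 0;
}
```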
PyUnicode_FromString() takes about 4 sec on my machine. _PyUnicode_FromASCII() is about 2 sec.
Another micro benchmark:
I'm confused by the issue title: PyUnicode_GetString() doesn't exist, it's PyUnicode_FromString() :-) I changed the title.
Could you please measure the performance for long strings (1000, 10000 and 100000 characters): a long ASCII string and a long ASCII string ending with a non-ASCII character?
This optimization is only for short strings. There is no significant difference for long and non-ASCII strings.