Description
CPython has a specific optimization when concatenating strings: it checks the reference count of the first operand and tries to concatenate in place if possible. This is done in ceval: https://github.com/python/cpython/blob/309d7cc5df4e2bf3086c49eb2b1b56b929554500/Python/ceval.c#L5354. In some specific cases this can make a big performance difference; see https://stackoverflow.com/questions/35787022/cython-string-concatenation-is-super-slow-what-else-does-it-do-poorly
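For illustration, here is a minimal pure-Python sketch of the pattern that benefits. The function names `build_with_concat` and `build_with_join` are made up for this example, and whether the in-place path is actually taken depends on the interpreter version and on the left operand having no other references, so treat the comments as a description of the intent rather than a guarantee.

```python
def build_with_concat(n):
    """Build a string with repeated +=, the pattern the optimization targets."""
    s = ""
    for _ in range(n):
        # 's' holds the only reference to the string here, so CPython may be
        # able to extend it in place instead of copying it on every iteration.
        s += "x"
    return s


def build_with_join(n):
    """Produce the same string without relying on in-place concatenation."""
    return "".join("x" for _ in range(n))
```

Both functions return the same string; the difference is only in how much copying may happen along the way, which is what the linked Stack Overflow question observes when the `+=` loop is compiled with Cython.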
I had an initial go at it here: #3451. However, there are definite failure paths, since it can NULL out variables that Cython isn't expecting to be NULL.
A couple of possible options:
- It might be possible to create something that basically re-implements PyUnicode_Append but without clearing operand1 (i.e. remove this line: https://github.com/python/cpython/blob/b146568dfcbcd7409c724f8917e4f77433dd56e4/Objects/unicodeobject.c#L11517).
- (probably easier) Ensure that operand1 is always set to something on exit, even if it's a dummy value like an empty string. This could mostly be based on the current PR, but it would occasionally lead to unexpected behaviour (mostly when exceptions are caught and handled), e.g.:
    cdef unicode val = "X"
    try:
        val += "x"
    except:
        pass
    return val  # wouldn't crash, but would be an odd placeholder string.