Unicode/Bytes concatenation is inefficient #3453

da-woods opened this issue Mar 22, 2020 · 1 comment

da-woods commented Mar 22, 2020

CPython has a specific optimization when concatenating strings: it checks the reference count of the first operand and tries to concatenate in place if possible. This is done in ceval. For some specific cases this can make a big performance difference.
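To make the reference-count condition concrete, here is a minimal pure-Python sketch. The function names are made up for illustration. Both loops produce the same string; the only difference is whether a second reference to the left operand exists at the moment of the `+=`, which is the condition that decides whether CPython may try to resize in place. Whether the fast path is actually taken depends on the interpreter version, so treat this as an illustration rather than a benchmark.

```python
def build_with_inplace_add(parts):
    """Append each piece with +=; `result` is the only reference to the string."""
    result = ""
    for part in parts:
        result += part  # sole reference: CPython may extend the buffer in place
    return result


def build_with_extra_reference(parts):
    """Same loop, but an alias keeps a second reference alive during each +=."""
    result = ""
    for part in parts:
        alias = result  # second reference: the string must not be mutated
        result += part  # so a fresh string has to be allocated and copied
    return result


parts = ["ab"] * 1000
assert build_with_inplace_add(parts) == build_with_extra_reference(parts)
```

Timing the two functions on large inputs (for example with `timeit`) is one way to see how much the in-place path matters on a given build.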

I had an initial go at it here: #3451. However, there are definite failure paths, since it can NULL out variables that Cython isn't expecting to be NULL.

A couple of possible options:

  1. It might be possible to create something that basically re-implements PyUnicode_Append but without clearing operand1 (i.e. remove this line)

  2. (probably easier) ensure that operand1 is always set to something on exit, even if it's a dummy value like an empty string. This could mostly be based on the current PR, but it would occasionally lead to unexpected behaviour (mostly when exceptions are caught and handled)

      cdef unicode val = "X"
           val += "x"
      return val  # wouldn't crash, but would be an odd placeholder string.
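The behaviour of option 2 can be modelled in plain Python. This is only a sketch under stated assumptions: the one-element list `slot` stands in for the C pointer to operand1, `None` stands in for NULL, and `failing_concat` and `append_with_placeholder` are hypothetical helpers, not Cython or CPython functions. It shows why a caller that catches the exception would see a placeholder instead of the original value.

```python
def failing_concat(left, right):
    """Hypothetical stand-in for an allocation failure during concatenation."""
    raise MemoryError("simulated allocation failure")


def append_with_placeholder(slot, right, concat=lambda a, b: a + b):
    """Model of option 2: slot[0] is never left as None (the NULL stand-in)."""
    left, slot[0] = slot[0], None  # the left operand is consumed up front
    try:
        slot[0] = concat(left, right)
    except MemoryError:
        slot[0] = ""  # dummy value instead of NULL, then propagate the error
        raise


slot = ["X"]
try:
    append_with_placeholder(slot, "x", concat=failing_concat)
except MemoryError:
    pass  # the exception is caught and handled by the caller

assert slot[0] == ""  # not "X": the original value is gone, but nothing is NULL
```

On the success path the same helper simply leaves the concatenated string in `slot[0]`, so the placeholder only becomes visible when the error is swallowed.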

da-woods commented Apr 2, 2020

Closed in #3451

@da-woods da-woods closed this as completed Apr 2, 2020
@scoder scoder added this to the 3.0 milestone Apr 2, 2020