Only used PyUnicode_Concat on unicode objects #3433

da-woods · 2020-03-15T17:49:53Z

(I'm getting a segmentation fault on the fstring tests with the current master, which makes it difficult to run this through a thorough set of tests. It happens with and without this PR. I'm not sure whether this is due to something being messed up on my side or a real bug, but I'll look into it and report it separately. Just noting it here because it means I couldn't test this PR with "runtest.py string".)

Update: this was a caching issue to do with Cython.inline. I'd clearly ended up with a broken version cached, so ignore the note above.

cython#3426

Cython/Compiler/ExprNodes.py

tests/run/test_unicode_string_tests.pxi

Co-Authored-By: Stefan Behnel <stefan_ml@behnel.de>

Optimized inplace operations for bytes and unicode so that they're genuinely done in place if no-one else needs the object. This is what CPython tries to do (and string concatenation was a point where it significantly beat Cython at times). This only works if the types are known at compile time, so with unknown types CPython will still be faster in some cases.
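To make the idea in this commit message concrete, here is a minimal Python model of "append in place if nobody else needs the object". It is only a sketch: `Buffer`, its explicit `refcount` field, and `inplace_concat` are hypothetical names invented for illustration, and the real optimization works on C-level reference counts rather than on a Python class.

```python
class Buffer:
    """Toy stand-in for a string object with an explicit reference count.

    Hypothetical: real objects keep this count at the C level.
    """

    def __init__(self, data, refcount=1):
        self.data = bytearray(data)
        self.refcount = refcount


def inplace_concat(left, right_bytes):
    """Model of `left += right`: reuse `left` only if it is unshared."""
    if left.refcount == 1:
        # Nobody else can observe the change, so extend the existing storage.
        left.data.extend(right_bytes)
        return left
    # The object is shared: other holders must keep seeing the old value,
    # so build a new object and drop our reference to the old one.
    left.refcount -= 1
    return Buffer(bytes(left.data) + right_bytes)


unshared = Buffer(b"ab")
grown = inplace_concat(unshared, b"cd")   # same object, extended in place

shared = Buffer(b"ab", refcount=2)
copied = inplace_concat(shared, b"cd")    # new object, original untouched
```

The fast path avoids allocating and copying a new object on every `+=`, which is where the speed-up in loops that build up a string comes from. It is only safe when the operand types are known, which matches the compile-time restriction mentioned above.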

da-woods · 2020-03-16T21:57:16Z

I've updated this to also optimize str_type where it's clear that it's unicode (i.e. when language_level = 3).
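A rough Python model of the rule described here, and of why the unicode-only call must be restricted to unicode operands. This is a sketch, not the code in ExprNodes.py: the type tags and the helpers `is_unicode_concat` and `choose_add_function` are hypothetical, and a later commit in this PR removed an explicit language_level test, so the merged logic differs in detail. `PyUnicode_Concat` and `PyNumber_Add` are the real C-API function names.

```python
# Hypothetical tags standing in for the compiler's known operand types.
UNICODE, STR, BYTES, OBJECT = "unicode", "str", "bytes", "object"


def is_unicode_concat(type1, type2, language_level):
    """True only when both operands are known to be unicode at compile time."""
    known_unicode = {UNICODE}
    if language_level >= 3:
        # Assumption for this sketch: str is unicode under language_level 3.
        known_unicode.add(STR)
    return type1 in known_unicode and type2 in known_unicode


def choose_add_function(type1, type2, language_level=3):
    """Pick the C-API call to emit for `a + b`."""
    if is_unicode_concat(type1, type2, language_level):
        return "PyUnicode_Concat"
    # Generic protocol: lets the other operand's __radd__ take part.
    return "PyNumber_Add"


class Wrapper:
    """A non-unicode object that still supports `"text" + wrapper`."""

    def __init__(self, text):
        self.text = text

    def __radd__(self, other):
        return Wrapper(other + self.text)


# Plain Python uses the generic add protocol, so __radd__ is called here.
result = "abc" + Wrapper("def")
```

With only one operand known to be unicode, the generic call is the safe choice: a unicode-only concatenation would reject the `Wrapper` operand with a `TypeError` instead of deferring to its `__radd__`.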

I've also added in some optimization of inplace concatenation. This does give a significant speed-up in some (probably mostly artificial) cases. However, it also uses some slightly awkward macros to make it fit. If you'd rather drop it then I can just remove 7d2608e.

Edit: that commit has been reverted because it breaks stuff. I'll investigate some more and see if it is usable, but it doesn't need to be in this PR.

This reverts commit 7d2608e.

Better if unicode concat is done with more settings of str_type

Cython/Compiler/ExprNodes.py

scoder · 2020-03-21T08:32:42Z

CPython has a specific optimization to check the reference count of the RHS and append rather than concat if possible

Yes, that seems worth doing on our side as well. Especially since 2-value f-strings are already replaced by a concatenation rather than joining because it's much faster. Making this even faster with this "hack" would be nice.
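For readers unfamiliar with the f-string point, the following Python sketch shows the equivalence being relied on: a two-value f-string produces the same result as concatenating the two formatted values, or as joining them. The function names are made up for illustration, and the sketch says nothing about how Cython generates the code.

```python
def fstring_two_values(a, b):
    """The source-level form: an f-string with exactly two values."""
    return f"{a}{b}"


def via_concat(a, b):
    """Format each value, then concatenate the two strings."""
    return format(a) + format(b)


def via_join(a, b):
    """Format each value, then join the parts (the general f-string strategy)."""
    return "".join([format(a), format(b)])


samples = [(1, "x"), ("left", "right"), (3.5, None)]
results = [
    (fstring_two_values(a, b), via_concat(a, b), via_join(a, b))
    for a, b in samples
]
```

All three forms agree on every sample, so the choice between them is purely a performance decision. Relative timings can be checked locally with `timeit`; no numbers are claimed here.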

Cython/Compiler/ExprNodes.py

da-woods · 2020-03-21T08:52:59Z

CPython has a specific optimization to check the reference count of the RHS and append rather than concat if possible

Yes, that seems worth doing on our side as well. Especially since 2-value f-strings are already replaced by a concatenation rather than joining because it's much faster. Making this even faster with this "hack" would be nice.

~~I've pushed my patch for this. I had managed to get it working but was a bit undecided about if it should go in this PR. Notes from the relevant commit below~~

Will be submitted separately

Cython/Utility/ObjectHandling.c

Only used PyUnicode_Concat on unicode objects

f8e3968

cython#3426

scoder reviewed Mar 16, 2020

Cython/Compiler/ExprNodes.py (outdated, resolved)

tests/run/test_unicode_string_tests.pxi (resolved)

da-woods and others added 3 commits March 16, 2020 15:52

Update tests/run/test_unicode_string_tests.pxi

6c4a0d9

Co-Authored-By: Stefan Behnel <stefan_ml@behnel.de>

Optimized for string_type as well as unicode_type

7ef92d3

da-woods added 2 commits March 16, 2020 22:30

Revert "Optimized (some) inplace operations"

1e55b85

This reverts commit 7d2608e.

Corrected choice with str_type

e8ea61d

Better if unicode concat is done with more settings of str_type

scoder reviewed Mar 20, 2020

Cython/Compiler/ExprNodes.py (outdated, resolved)

Removed unnecessary language_level test

0c571a8

scoder reviewed Mar 21, 2020

Cython/Compiler/ExprNodes.py (resolved)

scoder reviewed Mar 21, 2020

Cython/Utility/ObjectHandling.c (outdated, resolved)

da-woods force-pushed the liststr branch from 58a1efd to 0c571a8 Compare March 21, 2020 08:56

da-woods and others added 2 commits March 21, 2020 09:53

Re-added FormattedValueNode check

5dcb4b6

Refactor conditions to make them more extensible and (hopefully) also clearer.

19738c2

da-woods mentioned this pull request Mar 21, 2020

Try to handle string concatenation in place if possible #3451

Merged

scoder merged commit d0d3673 into cython:master Mar 21, 2020

scoder added the Code Generation and defect labels Mar 21, 2020

scoder added this to the 0.29.16 milestone Mar 21, 2020

scoder pushed a commit that referenced this pull request Mar 21, 2020

Only use PyUnicode_Concat on unicode object operations (GH-3433)

6642d16

da-woods deleted the liststr branch March 26, 2020 12:22
