fix isequal_normalized for combining-char reordering #52447

stevengj · 2023-12-08T02:29:25Z

(Note that this function was added in Julia 1.8, in #42493.)

In the future it would be good to further optimize this function by adding a fast path for the common case of strings that are mostly ASCII characters. Perhaps simply skip ahead to the first byte that doesn't match before we begin doing decomposition etcetera.
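The fast path suggested above can be sketched roughly as follows. This is an illustrative Python sketch, not the Julia implementation in `stdlib/Unicode/src/Unicode.jl`. The function name `isequal_normalized_sketch` is hypothetical, and two simplifying assumptions are made: it compares code points rather than bytes, and it only skips a common prefix of ASCII characters. Skipping ASCII is safe because ASCII characters have no canonical decomposition and combining marks are never reordered across them.

```python
import unicodedata


def isequal_normalized_sketch(s1: str, s2: str) -> bool:
    """Hypothetical sketch: canonical-equivalence test with an ASCII fast path."""
    # Fast path: skip the longest common prefix of identical ASCII characters.
    # NFD leaves ASCII characters untouched, so this prefix cannot affect
    # whether the two strings are canonically equivalent.
    n = min(len(s1), len(s2))
    i = 0
    while i < n and s1[i] == s2[i] and ord(s1[i]) < 0x80:
        i += 1
    # Slow path: decompose and canonically reorder only the remainders (NFD),
    # then compare them.
    rest1 = unicodedata.normalize("NFD", s1[i:])
    rest2 = unicodedata.normalize("NFD", s2[i:])
    return rest1 == rest2


# Combining marks in different orders: dot below (U+0323) and dot above (U+0307)
# have different combining classes, so NFD puts them in one canonical order.
print(isequal_normalized_sketch("a\u0323\u0307", "a\u0307\u0323"))
# Precomposed versus decomposed form after a long ASCII prefix.
print(isequal_normalized_sketch("hello caf\u00e9", "hello cafe\u0301"))
```

For mostly-ASCII inputs the loop consumes almost the whole string, so only a short tail, if any, reaches the normalization step.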

stdlib/Unicode/src/Unicode.jl

stevengj · 2023-12-09T05:12:41Z

CI failures seem unrelated.

stevengj · 2023-12-12T19:57:36Z

Should be good to merge?

StefanKarpinski

LGTM (already merged)

Fixes #52408. (Note that this function was added in Julia 1.8, in #42493.) In the future it would be good to further optimize this function by adding a fast path for the common case of strings that are mostly ASCII characters. Perhaps simply skip ahead to the first byte that doesn't match before we begin doing decomposition etcetera. (cherry picked from commit 3b250c7)

fix isequal_normalized for combining-char reordering

9721022

stevengj added 6 commits December 7, 2023 21:29

add missing compat entry

a6d4827

another test

d03b7b5

slight optimization for ASCII case (25% faster)

daf1b45

fix ascii path to include casefolding

a2e5ba9

tweak

6955182

further slight improvements

895c9ed

stevengj requested a review from StefanKarpinski December 8, 2023 21:31

Update stdlib/Unicode/src/Unicode.jl

89bfafa

Merge branch 'master' into sgj/isequal_normalized_fix

b6a8d0c

KristofferC mentioned this pull request Dec 12, 2023

Backports release 1.10 #52503

Merged

17 tasks

StefanKarpinski merged commit 3b250c7 into master Dec 19, 2023
7 checks passed

StefanKarpinski deleted the sgj/isequal_normalized_fix branch December 19, 2023 12:55

StefanKarpinski reviewed Dec 19, 2023

aviatesk removed the "backport 1.10" label Dec 24, 2023
