perf: improve utf_char2cells() performance #27353

vanaigr · 2024-02-05T23:13:38Z

utf_char2cells() calls utf_printable() twice (sometimes indirectly, through vim_isprintc()) for characters >= 128. The function can be refactored to call it only once.

utf_printable() uses binary search on ranges of unprintable characters to determine if a given character is printable. Since there are only 9 ranges, and the first range contains only one character, binary search can be replaced with SSE2 SIMD comparisons that check the other 8 ranges at once, with the first range checked separately.
SSE2 is enabled by default in GCC, Clang and MSVC for x86-64.
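
To illustrate the idea described above, here is a minimal sketch, not the code from this PR. The range table is written from memory of the Vim/Neovim `nonprint` table and should be checked against src/nvim/mbyte.c; the names `printable_bsearch`, `printable_simd`, `range_first`, `range_last` and `SKETCH_HAVE_SSE2` are hypothetical, and the argument is taken as `uint32_t` for simplicity. SSE2 only has signed 16-bit comparisons, so the sketch flips the top bit of every operand to get unsigned ordering.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__) || defined(_M_X64)
# include <emmintrin.h>
# define SKETCH_HAVE_SSE2 1
#endif

// ASSUMPTION: sorted, non-overlapping unprintable ranges, written from memory.
// Entry 0 is the single-character range; entries 1..8 fill one SSE2 register.
static const uint16_t range_first[9] = {
  0x070f, 0x180b, 0x200b, 0x202a, 0x206a, 0xd800, 0xfeff, 0xfff9, 0xfffe
};
static const uint16_t range_last[9] = {
  0x070f, 0x180e, 0x200f, 0x202e, 0x206f, 0xdfff, 0xfeff, 0xfffb, 0xffff
};

// Reference version: binary search over all 9 ranges.
bool printable_bsearch(uint32_t c)
{
  size_t lo = 0, hi = 9;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < range_first[mid]) {
      hi = mid;
    } else if (c > range_last[mid]) {
      lo = mid + 1;
    } else {
      return false;  // c lies inside an unprintable range
    }
  }
  return true;
}

// SIMD version: one scalar test for range 0, one vector test for ranges 1..8.
bool printable_simd(uint32_t c)
{
#ifdef SKETCH_HAVE_SSE2
  if (c > 0xffff) {
    return true;  // every range in the table fits in 16 bits
  }
  if (c == range_first[0]) {
    return false;  // the first range holds exactly one character
  }
  // Flipping bit 15 maps unsigned 16-bit order onto signed 16-bit order.
  const __m128i bias = _mm_set1_epi16(INT16_MIN);
  const __m128i cv = _mm_set1_epi16((short)((int)c - 0x8000));
  const __m128i first = _mm_xor_si128(
      _mm_loadu_si128((const __m128i *)(range_first + 1)), bias);
  const __m128i last = _mm_xor_si128(
      _mm_loadu_si128((const __m128i *)(range_last + 1)), bias);
  // A lane is "outside" when c < first or c > last for that range.
  const __m128i outside = _mm_or_si128(_mm_cmpgt_epi16(first, cv),
                                       _mm_cmpgt_epi16(cv, last));
  // Printable only if c is outside all 8 ranges (all 16 mask bits set).
  return _mm_movemask_epi8(outside) == 0xffff;
#else
  return printable_bsearch(c);  // portable fallback without SSE2
#endif
}
```

Both functions are meant to return the same result for every input; on targets without SSE2 the sketch simply falls back to the binary search.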

I tested the function and it returns the same results for all 2^32 values.
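
An exhaustive comparison of this kind could look roughly like the sketch below. It is not the test that was actually run for this PR: `count_mismatches`, `printable_linear`, `printable_bsearch` and `always_printable` are hypothetical names, and the range table is again written from memory and should be checked against src/nvim/mbyte.c. The loop counter is 64-bit so that a range ending at `UINT32_MAX` terminates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef bool (*pred_fn)(uint32_t);

struct interval { uint32_t first, last; };

// ASSUMPTION: unprintable ranges written from memory, sorted and non-overlapping.
static const struct interval nonprint[] = {
  { 0x070f, 0x070f }, { 0x180b, 0x180e }, { 0x200b, 0x200f },
  { 0x202a, 0x202e }, { 0x206a, 0x206f }, { 0xd800, 0xdfff },
  { 0xfeff, 0xfeff }, { 0xfff9, 0xfffb }, { 0xfffe, 0xffff }
};
#define NONPRINT_LEN (sizeof(nonprint) / sizeof(nonprint[0]))

// Obviously-correct reference: scan every range.
bool printable_linear(uint32_t c)
{
  for (size_t i = 0; i < NONPRINT_LEN; i++) {
    if (c >= nonprint[i].first && c <= nonprint[i].last) {
      return false;
    }
  }
  return true;
}

// Candidate under test: binary search over the same table.
bool printable_bsearch(uint32_t c)
{
  size_t lo = 0, hi = NONPRINT_LEN;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (c < nonprint[mid].first) {
      hi = mid;
    } else if (c > nonprint[mid].last) {
      lo = mid + 1;
    } else {
      return false;
    }
  }
  return true;
}

// Deliberately wrong predicate, useful for checking that the harness notices.
bool always_printable(uint32_t c)
{
  (void)c;
  return true;
}

// Counts the inputs in [first, last] on which the two predicates disagree.
uint64_t count_mismatches(pred_fn a, pred_fn b, uint32_t first, uint32_t last)
{
  uint64_t bad = 0;
  for (uint64_t c = first; c <= last; c++) {  // 64-bit counter: no wraparound
    if (a((uint32_t)c) != b((uint32_t)c)) {
      bad++;
    }
  }
  return bad;
}
```

Calling `count_mismatches(printable_linear, printable_bsearch, 0, UINT32_MAX)` walks all 2^32 inputs and should return 0 when the two implementations agree; expect it to take a noticeable amount of time.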

Benchmarks:

I modified the benchmarks from #26813 to better reflect the performance difference and added a new benchmark that measures 3-byte UTF-8 separately.

| | ascii | 2 byte | 3 byte | random |
|---|---|---|---|---|
| before | 14 | 85 | 194 | 170 |
| after | 13 | 83 | 172 | 154 |

src/nvim/mbyte.c

zeertzjq · 2024-02-06T00:11:24Z

You may also want to take #18690 into consideration.

vanaigr · 2024-02-07T02:58:58Z

Should I make characters above 10FFFF unprintable?

zeertzjq · 2024-02-07T03:05:33Z

Hmm, actually it's not hard to add that check later, so not really necessary.

marcelbeumer · 2024-02-08T20:35:54Z

Thanks so much everyone!

test: add 3 byte utf8 to screenpos_spec benchmark

76c1509

zeertzjq reviewed Feb 5, 2024

zeertzjq added the performance and unicode 💩 labels Feb 6, 2024

vanaigr force-pushed the faster-char2cells branch from f94aafe to ce39694 Compare February 7, 2024 02:27

zeertzjq added and then removed the ci:s390x label Feb 7, 2024

zeertzjq reviewed Feb 7, 2024

zeertzjq requested a review from ii14 February 7, 2024 04:03

vanaigr force-pushed the faster-char2cells branch from ce39694 to c2491d7 Compare February 7, 2024 05:14

zeertzjq reviewed Feb 7, 2024

perf: improve utf_char2cells() performance

cc1ddce

ii14 approved these changes Feb 7, 2024

vanaigr force-pushed the faster-char2cells branch from c2491d7 to cc1ddce Compare February 7, 2024 06:32

zeertzjq merged commit cca8a78 into neovim:master Feb 7, 2024
24 checks passed

vanaigr deleted the faster-char2cells branch March 29, 2024 21:13
