Add support for unicode width #46

ratmice · 2022-09-09T14:32:03Z

While testing some byte-offset to char-index conversion code, I noticed #41 when testing with "🦀".
Naively modifying char_width() alone prints multiple instances of the Unicode character in question: "🦀" becomes "🦀🦀".

I left this behavior enabled for whitespace since I believe it is a part of the treatment of '\t' tab characters.
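
To illustrate the behavior described above, here is a minimal sketch, not the code in this PR. `display_width` is a hypothetical stand-in for a real width lookup (for example the `unicode-width` crate), and `char_width` and `render` are simplified, assumed shapes rather than the crate's actual functions. The point is only that whitespace is repeated to fill its width (which is what expands tabs), while any other character is drawn once and merely advances the column by its width.

```rust
/// Hypothetical stand-in for a display-width lookup. A real implementation
/// would consult Unicode data; only the cases used in this sketch are covered.
fn display_width(c: char) -> usize {
    match c {
        '🦀' => 2, // wide character: occupies two terminal columns
        _ => 1,
    }
}

/// Returns the character to draw and the number of columns it occupies.
/// Assumes `tab_width > 0`. Tabs become spaces up to the next tab stop.
fn char_width(c: char, col: usize, tab_width: usize) -> (char, usize) {
    match c {
        '\t' => (' ', tab_width - col % tab_width),
        c if c.is_whitespace() => (' ', 1),
        c => (c, display_width(c)),
    }
}

/// Renders a line, repeating only whitespace to fill its width.
fn render(line: &str, tab_width: usize) -> String {
    let mut out = String::new();
    let mut col = 0;
    for c in line.chars() {
        let (drawn, width) = char_width(c, col, tab_width);
        if drawn.is_whitespace() {
            // Repetition is kept for whitespace so that tabs expand correctly.
            for _ in 0..width {
                out.push(drawn);
            }
        } else {
            // A wide character is emitted once; repeating it `width` times
            // is what turns "🦀" into "🦀🦀".
            out.push(drawn);
        }
        col += width;
    }
    out
}
```

With a tab width of 4, `render("\t🦀", 4)` expands the tab to four spaces and emits the crab once.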

This perhaps doesn't fully fix the issue as reported. It might need a boolean Config setting that selects between calling width and width_cjk, but I don't know the right behavior to aim for regarding whitespace in CJK.
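
A rough sketch of what such a setting could look like, purely as an assumption: the `cjk_width` field and `configured_width` helper are hypothetical names, and `width` / `width_cjk` here are tiny stand-ins for the real lookups, covering only the characters shown. The difference between the two modes shows up for East Asian ambiguous-width characters such as '¡', which occupy one column in a non-CJK context and two in a CJK context.

```rust
/// Hypothetical configuration; `cjk_width` is an assumed field name.
#[derive(Clone, Copy, Default)]
struct Config {
    cjk_width: bool,
}

/// Stand-in for a non-CJK width lookup (only covers this sketch's inputs).
fn width(c: char) -> usize {
    match c {
        '🦀' => 2, // wide in every context
        _ => 1,    // includes ambiguous-width characters such as '¡'
    }
}

/// Stand-in for a CJK width lookup: ambiguous-width characters count as 2.
fn width_cjk(c: char) -> usize {
    match c {
        '¡' => 2,
        c => width(c),
    }
}

/// Picks the lookup according to the (hypothetical) configuration flag.
fn configured_width(cfg: Config, c: char) -> usize {
    if cfg.cjk_width {
        width_cjk(c)
    } else {
        width(c)
    }
}
```

How whitespace should be treated in the CJK case is left open here, as it is in the comment above.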

zesterer · 2022-09-10T12:27:41Z

Thanks, this seems like it's an improvement on master's behaviour if nothing else!

Add support for unicode width

f423f90

ratmice force-pushed the unicode_width branch from 7835860 to f423f90 Compare September 10, 2022 04:42

zesterer merged commit efbdf6b into zesterer:main Sep 10, 2022

ratmice deleted the unicode_width branch September 10, 2022 12:41
