Add support for unicode width #46

Merged
merged 1 commit into from
Sep 10, 2022


Contributor

@ratmice ratmice commented Sep 9, 2022

While testing some byte-offset-to-char-index conversion code with "馃", I ran into #41.

Naively modifying only char_width() prints multiple instances of the Unicode character in question: "馃" becomes "馃馃".

I left the repetition enabled for whitespace, since I believe it is part of how '\t' (tab) characters are handled.
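A minimal sketch of that distinction, not the code in this PR: a wide character advances the column by its display width but is written once, while whitespace is repeated to fill its width so that tabs expand to the next tab stop. The names `toy_width`, `char_width`, `render`, and `TAB_WIDTH` are hypothetical, and `toy_width` is a rough stand-in for a real Unicode width lookup such as the one in the `unicode-width` crate.

```rust
/// Assumed tab stop, for illustration only.
const TAB_WIDTH: usize = 4;

/// Toy stand-in for a real Unicode width table (hypothetical; it only
/// treats a few well-known wide ranges as two columns).
fn toy_width(c: char) -> usize {
    match c as u32 {
        // CJK ideographs, Hangul syllables, common emoji blocks
        0x4E00..=0x9FFF | 0xAC00..=0xD7A3 | 0x1F300..=0x1FAFF => 2,
        _ => 1,
    }
}

/// Returns the glyph to draw and how many columns it occupies at `col`.
fn char_width(c: char, col: usize) -> (char, usize) {
    match c {
        // A tab becomes spaces up to the next tab stop.
        '\t' => (' ', TAB_WIDTH - col % TAB_WIDTH),
        c if c.is_whitespace() => (' ', 1),
        c => (c, toy_width(c)),
    }
}

/// Renders a line, returning the output text and the columns used.
fn render(line: &str) -> (String, usize) {
    let mut out = String::new();
    let mut col = 0;
    for c in line.chars() {
        let (glyph, width) = char_width(c, col);
        if glyph.is_whitespace() {
            // Whitespace is repeated so that tabs fill their full width.
            for _ in 0..width {
                out.push(glyph);
            }
        } else {
            // A wide glyph is written once; repeating it would duplicate it.
            out.push(glyph);
        }
        col += width;
    }
    (out, col)
}

fn main() {
    let (text, cols) = render("a\t\u{1F600}b");
    println!("{:?} occupies {} columns", text, cols);
}
```

Here the tab expands to three spaces and the emoji counts as two columns, so the line occupies seven columns while the emoji appears only once in the output.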

This perhaps doesn't fully fix the issue as reported. The reporter might need a boolean Config setting that selects between calling width and width_cjk, but I don't know what the right behavior is for whitespace in CJK.
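One possible shape for such a setting, sketched under assumptions: the `cjk` field, `with_cjk`, and `width_of` are hypothetical names, not part of this PR. The two `toy_*` functions stand in for `UnicodeWidthChar::width` and `UnicodeWidthChar::width_cjk` from the `unicode-width` crate, which return `Option<usize>` and differ in how they treat East Asian Ambiguous characters.

```rust
/// Hypothetical configuration with a flag selecting CJK width rules.
#[derive(Clone, Copy, Default)]
struct Config {
    cjk: bool,
}

impl Config {
    /// Builder-style setter (hypothetical name).
    fn with_cjk(mut self, cjk: bool) -> Self {
        self.cjk = cjk;
        self
    }

    /// Picks the width function according to the flag.
    fn width_of(&self, c: char) -> usize {
        if self.cjk {
            toy_width_cjk(c)
        } else {
            toy_width(c)
        }
    }
}

/// Toy stand-in for the non-CJK width lookup.
fn toy_width(c: char) -> usize {
    match c as u32 {
        0x4E00..=0x9FFF | 0xAC00..=0xD7A3 | 0x1F300..=0x1FAFF => 2,
        _ => 1,
    }
}

/// Toy stand-in for the CJK lookup: a few East Asian Ambiguous
/// characters are treated as two columns instead of one.
fn toy_width_cjk(c: char) -> usize {
    match c {
        '\u{00A7}' | '\u{03B1}'..='\u{03C1}' => 2,
        _ => toy_width(c),
    }
}

fn main() {
    let narrow = Config::default();
    let wide = Config::default().with_cjk(true);
    println!("alpha: {} vs {}", narrow.width_of('α'), wide.width_of('α'));
}
```

In this sketch a space is one column under both settings, which sidesteps rather than answers the open question about whitespace in CJK.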

@zesterer
Owner

Thanks, this seems like an improvement on master's behaviour if nothing else!

@zesterer zesterer merged commit efbdf6b into zesterer:main Sep 10, 2022
@ratmice ratmice deleted the unicode_width branch September 10, 2022 12:41