Correctly handle utf-8 string arguments in alignment #628

Closed
vitaut opened this issue Dec 10, 2017 · 3 comments

@vitaut
Contributor

vitaut commented Dec 10, 2017

As pointed out by Артём Голубихин in the comments to https://www.youtube.com/watch?v=ptba_AqFYCM:

Alignment is the problem, because a UTF-8 symbol is treated as several chars, so, for example, u8"{:>2}"_format(u8"я") produces "я" instead of " я". And, of course, it is not only single Unicode code points that should be considered.
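
To make the byte-versus-code-point mismatch concrete, here is a minimal sketch that pads by code point count instead of byte count. It is not the library's implementation: `count_code_points` and `pad_right_aligned` are hypothetical helper names, and the input is assumed to be valid UTF-8. The letter "я" (U+044F) is written as its two UTF-8 bytes, `\xD1\x8F`, to keep the example independent of source encoding.

```cpp
#include <cstddef>
#include <string>

// Counts Unicode code points in a UTF-8 string by skipping continuation
// bytes (bit pattern 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t count_code_points(const std::string& s) {
  std::size_t n = 0;
  for (unsigned char c : s) {
    if ((c & 0xC0) != 0x80) ++n;
  }
  return n;
}

// Right-aligns s in a field of `width` code points, padding with spaces.
// Hypothetical helper for illustration only.
std::string pad_right_aligned(const std::string& s, std::size_t width) {
  std::size_t len = count_code_points(s);
  if (len >= width) return s;
  return std::string(width - len, ' ') + s;
}
```

With these helpers, `std::string("\xD1\x8F").size()` is 2 while `count_code_points("\xD1\x8F")` is 1, so `pad_right_aligned("\xD1\x8F", 2)` adds the one leading space that byte-based padding omits. Counting code points still ignores sequences of several code points that render as one symbol.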

vitaut changed the title from "Correctly handle utf-8 in alignment" to "Correctly handle utf-8 string arguments in alignment" Dec 10, 2017
vitaut added three commits that referenced this issue Oct 4, 2018
@vitaut
Contributor Author

vitaut commented Oct 4, 2018

Fixed in 3832524 for UTF-8 strings.

vitaut closed this as completed Oct 4, 2018
@gulrak

gulrak commented Sep 12, 2019

Actually, this still seems to be wrong for some code points: u8"작", for example, is now treated as having a width of 1 (since it is one code point), but when printed on a terminal it is two columns wide.
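
The gap between code point count and terminal columns can be sketched as follows. This is only an illustration under stated assumptions: `is_wide` and `estimate_display_width` are hypothetical names, the range list is a deliberately incomplete subset of double-width blocks, combining (zero-width) characters are not handled, and the input is assumed to be valid UTF-8. The syllable "작" (U+C791) is written as its UTF-8 bytes `\xEC\x9E\x91`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true for a few well-known double-width ranges. The list is an
// illustrative assumption and is not complete.
bool is_wide(std::uint32_t cp) {
  return (cp >= 0x1100 && cp <= 0x115F) ||  // Hangul Jamo initial consonants
         (cp >= 0x2E80 && cp <= 0xA4CF && cp != 0x303F) ||  // CJK through Yi
         (cp >= 0xAC00 && cp <= 0xD7A3) ||  // Hangul Syllables
         (cp >= 0xF900 && cp <= 0xFAFF) ||  // CJK Compatibility Ideographs
         (cp >= 0xFF00 && cp <= 0xFF60) ||  // Fullwidth Forms
         (cp >= 0xFFE0 && cp <= 0xFFE6);    // Fullwidth signs
}

// Decodes valid UTF-8 and sums an estimated column width per code point:
// 2 for the ranges above, 1 for everything else.
std::size_t estimate_display_width(const std::string& s) {
  std::size_t width = 0;
  std::size_t i = 0;
  while (i < s.size()) {
    unsigned char b = static_cast<unsigned char>(s[i]);
    std::uint32_t cp;
    std::size_t len;
    if (b < 0x80)      { cp = b;        len = 1; }
    else if (b < 0xE0) { cp = b & 0x1F; len = 2; }
    else if (b < 0xF0) { cp = b & 0x0F; len = 3; }
    else               { cp = b & 0x07; len = 4; }
    // Fold in the payload bits of each continuation byte.
    for (std::size_t k = 1; k < len && i + k < s.size(); ++k) {
      cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
    }
    width += is_wide(cp) ? 2 : 1;
    i += len;
  }
  return width;
}
```

Here `estimate_display_width("\xEC\x9E\x91")` is 2 even though the string holds a single code point, which is the discrepancy described above.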

On systems that support it, int wcwidth(wchar_t c); from <wchar.h> gives the visual width in character columns, provided the current locale supports Unicode. Otherwise, something along the lines of https://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c might be needed.
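
A rough sketch of the wcwidth route, assuming a POSIX system where wchar_t can hold any Unicode code point (it is not available on Windows). `use_environment_locale` and `column_width_or` are hypothetical wrapper names; only `setlocale` and `wcwidth` are real library calls.

```cpp
#include <clocale>
#include <wchar.h>

// Switches the character classification locale to the one configured in
// the environment. Returns false if that locale is unavailable.
bool use_environment_locale() {
  return std::setlocale(LC_CTYPE, "") != nullptr;
}

// Returns the column width reported by wcwidth, or `fallback` when
// wcwidth returns -1 (non-printable, or unknown in the current locale).
int column_width_or(wchar_t c, int fallback) {
  int w = wcwidth(c);
  return w < 0 ? fallback : w;
}
```

After a successful `use_environment_locale()` call under a UTF-8 locale, `column_width_or(L'\uC791', 1)` would be expected to report two columns, but the result depends on the platform's locale data, which is why a table-based fallback may still be needed.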

@vitaut
Contributor Author

vitaut commented Sep 15, 2019

@gulrak, you are right, some characters have double display width. Thanks for the link! I plan to improve handling of width and the ideas from the wcwidth implementation are very useful.
