Doesn't respect Unicode character width; all characters take up one cell #265
Comments
Thank you for the thorough bug report! This is an essential feature that needs to be supported.
Is this related to trying to use certain fonts too? For example I'm trying to replicate my gnome-terminal settings and setting the font to "Ubuntu Mono Regular" but all characters take up two cells for some reason.
@medwards that sounds like it's an issue with how we use font metrics. The glyphs likely don't occupy two cells, it's just that single cells are calculated to be far too wide.
OK, let me know if that is in another issue or if I should open a new bug for it. Thanks for the fast reply :)
I think #83 covers it.
When you do implement character width, please use the unicode9 width tables (not You may want to optionally enable pre-unicode9 widths as a configuration option (or vice versa, iTerm.app nightlies has this option), but the repo I listed will give correct unicode widths for all characters including East Asian and emoji. Many terminal apps are unicode9 ready including tmux, vim and neovim.
Thanks @joshuarubin. It looks like Rust's unicode-width crate is up-to-date with Unicode 9, so we are set on that front.
Is there a lot of demand for this?
Based on the size of the tables in Rust's unicode-width, it definitely seems like there are things missing. Also, an issue there states that it doesn't calculate emoji width (and doesn't seem to think it should). There are additional issues of how to handle East Asian "ambiguous" width characters (iTerm defaults to 1, but has a config option, discouraged, to change to 2) and "private use area" characters. My library (for C, don't know much about Rust [...yet]) lets the user decide on a char by char basis how to handle things. Its return values:
Dunno. I've been fighting for unicode9 widths for months across a bunch of projects. Any change causes users to complain. As this is a new project, maybe just include unicode9 and see how it works for people?
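On the East Asian "ambiguous" width point raised above, a minimal Python sketch of what a configurable policy could look like. This is only an illustration built on the standard `unicodedata` module; the function name `width_with_policy` and the `ambiguous_width` parameter are invented for this example and do not come from any of the libraries mentioned in this thread. Emoji and private use area handling are left out.

```python
import unicodedata

def width_with_policy(ch: str, ambiguous_width: int = 1) -> int:
    """Approximate cell width of one code point (hypothetical helper).

    ambiguous_width is an assumed config knob: 1 or 2 cells for characters
    whose East Asian Width property is "A" (ambiguous).
    """
    # Combining marks stack on the previous character and take no cell.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    eaw = unicodedata.east_asian_width(ch)
    # Wide and Fullwidth characters always take two cells.
    if eaw in ("W", "F"):
        return 2
    # Ambiguous characters follow the user's choice.
    if eaw == "A":
        return ambiguous_width
    return 1

print(width_with_policy("\u00b1"))                     # PLUS-MINUS SIGN, default policy
print(width_with_policy("\u00b1", ambiguous_width=2))  # same character, wide policy
```

Defaulting to 1 mirrors the iTerm default described above; the wide setting is opt-in.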
We handle double width characters now. There is still lots of work left for full Unicode support. Closing in favor of #306. Please subscribe there if you want to follow development on this subject.
According to alacritty/alacritty#265 (comment), Unicode characters in a monospace context can only have width 0, 1, or 2, so just accounting for width 2 (in addition to what we already do) should be enough.
In Unicode, "monospaced" characters are meant to take up 0, 1, or 2 character cells based on what kind of character they are. Combining characters take up 0 cells because they stack on top of the previous character. CJK characters (except for the ones designated as "halfwidth") take up 2 cells, because that's how the original CJK terminal displays were designed.
In alacritty, all characters take up 1 cell, even the ones that aren't supposed to. This causes display problems, such as incorrect scrolling and wrapping in tmux.
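As a rough illustration of the 0/1/2 rule described above, here is a minimal Python sketch. It approximates the rule with the standard `unicodedata` module and is not what alacritty or `wcwidth(3)` actually do; the helper name `cell_width` is made up for this example, and ambiguous-width characters, emoji, and control characters get no special treatment.

```python
import unicodedata

def cell_width(ch: str) -> int:
    """Approximate number of terminal cells for one code point (hypothetical helper)."""
    # Combining marks stack on the previous character, so they take no cell.
    if unicodedata.category(ch) in ("Mn", "Me"):
        return 0
    # East Asian Wide (W) and Fullwidth (F) characters take two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else, including halfwidth forms, takes one cell in this sketch.
    return 1

print(cell_width("a"))       # 1: plain Latin letter
print(cell_width("\u0308"))  # 0: COMBINING DIAERESIS
print(cell_width("\u3042"))  # 2: HIRAGANA LETTER A
print(cell_width("\uff71"))  # 1: HALFWIDTH KATAKANA LETTER A
```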
To demonstrate the problem, I defined this string using python3:
Without Python, you could just try pasting in this string, I believe (but don't use the pre-combined character `ü`):
My terminal is 240 characters wide, so when I `print(text)`, the result should fit on one line for me. Instead, not only does it wrap, but the unexpected wrapping causes tmux to glitch, so the entire window gets filled with columns of un-combined `u` and `¨` characters.
The Japanese word `ありがとう` should take up 10 character cells. Instead, it takes up 5 character cells, with the characters overlapping each other.
This is not a font issue, it's an issue with the actual behavior of the terminal. The lack of fallback fonts just makes the issue harder to see.
C code uses the `wcwidth(3)` function to determine how wide a character is. This function seems to have at one point been in the Rust standard library, then moved out. http://unicode-rs.github.io/unicode-width/unicode_width/index.html is a crate that seems to provide it.
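The expected numbers in this report can be checked with a short Python sketch. It uses the standard `unicodedata` module as a stand-in for `wcwidth(3)`, which is an approximation; `text_width` and `rows_needed` are hypothetical helper names, and the `sample` string is made up for this example rather than being the exact string from the report. The row count also ignores the fact that a wide character cannot be split across the end of a line.

```python
import math
import unicodedata

def text_width(text: str) -> int:
    """Approximate number of terminal cells a string should occupy."""
    total = 0
    for ch in text:
        # Combining marks add no width; they attach to the previous character.
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue
        # Wide and Fullwidth characters take two cells, everything else one.
        total += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return total

def rows_needed(text: str, columns: int = 240) -> int:
    """Rough number of rows the text should fill in a terminal of the given width."""
    return max(1, math.ceil(text_width(text) / columns))

print(text_width("ありがとう"))  # 10 cells for 5 wide characters

# 200 copies of "u" followed by COMBINING DIAERESIS: 400 code points, 200 cells.
sample = "u\u0308" * 200
print(len(sample), text_width(sample))  # 400 200
print(rows_needed(sample))              # 1 row on a 240-column terminal
print(math.ceil(len(sample) / 240))     # 2 rows if every code point took one cell
```

The last two lines show the mismatch: a terminal that gives every code point one cell wraps text that should fit on a single line.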