Fish mis-identifies rightmost point of prompt with utf8 characters #2199
Comments
Is the emoji double width? We may need to teach |
I don't have those characters in any of my fonts! Certainly some UTF-8 characters are okay: two non-ASCII symbols work fine in my git prompt. |
The happy face emoji is definitely double width. I'm not sure what the other one is. So it looks like |
What font are you using? It has rare characters that I seldom find in terminals. |
I'm using a monospace font and both the happy face and the triangle are single width in the font. (I edited a local copy of monofur for my own use for the happy face - the triangle is coming from Deja Vu Mono) I'll admit that I'm somewhat beyond my depth re the width of characters etc. That said, the prompt is all me, so I should be able to dig out e.g. codepoints if that'll help. In the meantime, is there some way I can determine what characters would be safe to use? I suspect the branch symbol I'm using in my git status is "wide" - it makes the offset worse. |
There's this comment:
That probably answers your question in far more detail than you want :) |
Do you mean I personally use a CJK double-width font and I did not find any issue with it when using |
@pickfire - do you mean a recent patch to st? I tried out st-0.5 direct from suckless for a second but specifically couldn't get these characters to render and dropped it. For what it's worth, the positioning is worse in st than urxvt. At least urxvt is consistent; in st, cycles of deleting and re-completing put the cursor in different places. Comparing directory-specific behavior of my prompt provides two more data points:
It's not conclusive, but the number of > 7bit characters in the prompt == the offset from the correct position of the cursor (synthesized here by |) Per charmap and http://www.unicode.org/versions/Unicode7.0.0/ch02.pdf So: none are in C0/C1, Mn, Me, Cf, medial vowels or CJK. The latter two, charmap reports as "non printable" but according to the above comment I think they should all be width 1. Here's a thing: I'm on Gentoo Linux, which for better or worse means that fish and the libs it depends on can have different e.g. configure settings. (For instance, I just tried recompiling ncurses with the tinfo use variable set, which also didn't help...) Are there versions and configurations of fish's deps that might impact how it calculates where to move the cursor and draw? |
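For anyone wanting to repeat the charmap lookup above without clicking through each character, here is a minimal sketch in Python. It is not glibc's `wcwidth()` and not fish's code: `estimated_width` and `report_non_ascii` are hypothetical helpers that only approximate the usual rules (C0/C1 controls have no width, Mn/Me/Cf are zero width, East Asian wide and fullwidth are two cells, everything else is one), and the sample prompt is made up.

```python
import unicodedata


def estimated_width(ch: str) -> int:
    """Approximate cell width of one code point; NOT glibc's wcwidth()."""
    cp = ord(ch)
    if cp == 0:
        return 0
    if cp < 0x20 or 0x7F <= cp < 0xA0:
        return -1  # C0/C1 control characters have no printable width
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0   # combining marks and format characters
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2   # East Asian wide / fullwidth
    return 1


def report_non_ascii(prompt: str):
    """One (code point, category, East Asian width, estimated width) row per non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch),
         unicodedata.east_asian_width(ch), estimated_width(ch))
        for ch in prompt
        if ord(ch) > 0x7F
    ]


if __name__ == "__main__":
    # Hypothetical prompt; substitute the characters from your own fish_prompt.
    for row in report_non_ascii("~/src \u25b6 "):
        print(row)
```

The number of rows is the count of "> 7bit" characters, so it can be compared directly against the observed cursor offset. The results depend on the Unicode tables shipped with the Python build, which may differ from what the terminal or libc uses.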
Okay, looking deeper: my build sets HAS_WCWIDTH, so it seems like I'm using glibc's wcwidth() function. Which leads to the next question: I have glibc-2.19-r1 installed. I'd think that would be current enough, but who can say for sure. Maybe I'll have a few minutes later tonight to build a wcwidth tester and see how it goes. |
I'm plainly out of my depth. After 30 minutes of hacking I've got:
Which gets me:
I've tried switching my LC_CTYPE around to no avail. The whole point here was to try to see what widths the various characters are reported with, but I seem to have some fundamental misunderstanding of processing wide characters in C. |
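A wcwidth tester along these lines can also be sketched from Python, which avoids the wide-character I/O pitfalls in C. This is an assumption-laden sketch, not the tester from the comment above: it assumes a Unix-like system where `ctypes.CDLL(None)` exposes the C library's `wcwidth()`, and `libc_wcwidth` and `use_environment_locale` are hypothetical helper names. The one thing worth noting is that the C library generally needs `LC_CTYPE` set from the environment before `wcwidth()` reports anything useful for non-ASCII characters.

```python
import ctypes
import locale


def use_environment_locale() -> bool:
    """Switch LC_CTYPE to the environment's locale; report whether that worked."""
    try:
        locale.setlocale(locale.LC_CTYPE, "")
        return True
    except locale.Error:
        return False


def libc_wcwidth(ch: str) -> int:
    """Ask the C library for wcwidth() of a single character."""
    # Assumption: Unix-like system; None loads the symbols of the running process.
    libc = ctypes.CDLL(None)
    libc.wcwidth.argtypes = [ctypes.c_wchar]
    libc.wcwidth.restype = ctypes.c_int
    return libc.wcwidth(ch)


if __name__ == "__main__":
    if not use_environment_locale():
        print("could not set LC_CTYPE from the environment; results may be -1")
    # Hypothetical sample characters; replace them with the ones from your prompt.
    for ch in "a\u25b6\u4e2d":
        print(f"U+{ord(ch):04X}", libc_wcwidth(ch))
```

The printed widths depend on the installed libc and the active locale, so they are deliberately not listed here.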
One thing that came of this experiment: fish gets even more confused when commands have multibyte characters in them. |
I tried looping through every value and could not find any that returned a |
The upshot there would be forcing HAVE_WCWIDTH 0 (or undefined) or BROKEN_WCWIDTH, right? That might be doable. |
How does this look since #2217 was merged? |
Please also check this: https://github.com/JuliaLang/utf8proc |
@nyarly: Any improvement here with 2.3.0? |
I wasn't having this issue until 2.3.0. Now I am. |
Does #3143 fix it? |
@floam Yes! Strange that those changes are related. I just happened to have |
Cool! I am closing this as a duplicate of #3124 - we'll follow it there. |
Minor cleanup related to issue #2199.
Remaining places that call |
Handling of e.g. some color emojis does seem wrong. Do we simply need to update our rules for newer unicode specs? |
Maybe. I checked the source of our wcwidth implementation and it hasn't been modified since we copied it. Googling "markus kuhn wcwidth" didn't turn up any documents saying it was out of date. I don't think we need to spend more effort on this until someone reports a problem that isn't due to my screwup in the 2.3.0 release setting the locale when a local locale var goes out of scope. |
Google "Unicode 7" or crib off https://github.com/jquast/wcwidth/ - we're several revisions out of date I think? I think it'd serve us well to follow what they figured out in the above github project. |
I looked at the changes in https://github.com/jquast/wcwidth/ and they appear to be refinements to how combining characters are handled. An edge case for us that probably only affects Cygwin users using characters outside the BMP. I'm also quite leery of using random sources of "the truth" such as the CharWidths.txt link you added. If you're passionate about this please open a new issue and take ownership of it. Personally I think we should stop using our own implementation of wcwidth. If someone is using a distro that has a broken implementation they should be demanding the distro maintainer to fix their implementation. Unicode is no longer bleeding edge technology. Having our own implementation might have made sense in 2007. It doesn't make any sense now. |
This issue is not fixed at all. I cannot understand why it was so eagerly closed in favor of another issue, just because nobody kept complaining and because of someone's indifference. It is just luck that those emojis do not cause layout problems on the terminals that some people are using. An update to the hard-coded width table may cause problems again, because the terminal may use a different table.
I don't see any hope for such an infamous bug about
But it is a monster, hence the ICU project. I mentioned a similar project, utf8proc, before. The solution mentioned by @floam in #2484 is quite flexible, although fish would need to update the width table whenever users change the font settings of their terminals. I think that gives a good direction. (EDIT: fix typos) |
@jakwings, I don't know what you expect us to do. We should be able to rely on the unicode implementation provided by a given distro at this point in time. As you said in your last update to issue #2484, which is still open,
I'm happy to review any changes you, or anyone else, submits to improve how fish handles the problematic characters. Since issue #2484 is still open I don't understand what you're complaining about. |
Also, this issue was not "eagerly closed". It's been open for 11 months. And the original problem was fixed seven days after the issue was opened. That the fix may not solve all unicode character width problems isn't relevant to this issue. |
Sorry, in #2484, I was just replying to floam for that multibyte issue. Now I've appended the name floam to it.
I don't think so, even @ridiculousfish (he added the code for
Really? I am suspicious. |
OK, I remember the fix submitted by the reporter. I have no complaint about closing this issue now. But |
The issue of character cell width reported by Given that various terminals, including pseudo terminals like tmux, have their own private wcwidth implementations it is impossible for fish to do the right thing in every single circumstance. I'm more than happy to review and merge any changes to make fish friendlier to people using characters not in the ASCII or European char sets. If you're someone who cares about this issue and has expertise on the topic I encourage you to create pull-requests for review or open issues with enough information to allow someone else to make the necessary changes. |
Not decided yet, the solution from floam may not be universal. Need investigation. |
Forgive me for my ignorance, but might there be a way to have the right prompt positioned correctly without getting |
Did you trip it up in the right prompt? Was the total width wrong (i.e. you didn't have two wrong characters, one over- and one undercounted)? Here's what fish does to write the right prompt:
    s_move(scr, &output, (int)(screen_width - right_prompt_width), (int)i);
    s_set_color(scr, &output, 0xffffffff);
    s_write_str(&output, right_prompt);
i.e. it moves the cursor to the starting position of the right prompt (which is screen_width - right_prompt_width), sets the color (which by definition has a width of 0 since no character shows up on screen) and then writes the prompt. This depends on three things:
Now, there might be other ways of doing it (not that I can come up with one), but that's essentially what zsh does as well (at least that's what an However, note that this bug is closed. I'd direct you to #2652 instead. Or open a new one and explain your actual problem there in detail. |
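The arithmetic behind the `screen_width - right_prompt_width` start position can be sketched as follows. This is an illustration only, not fish's implementation: all function names are hypothetical, and the two width functions are stand-ins for "what the shell thinks" and "what the terminal draws". It shows how a one-cell disagreement about a single character shifts the end of the right prompt past (or short of) the right edge.

```python
import unicodedata


def count_one_cell_each(text: str) -> int:
    """A naive width: every code point is assumed to occupy one cell."""
    return len(text)


def count_east_asian_wide(text: str) -> int:
    """Stand-in for a terminal that draws wide/fullwidth code points in two cells."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def right_prompt_start(screen_width: int, right_prompt_width: int) -> int:
    """Column where the right prompt starts, as in screen_width - right_prompt_width."""
    return screen_width - right_prompt_width


def overflow(screen_width, right_prompt, shell_width, terminal_width) -> int:
    """Cells by which the drawn prompt overshoots (positive) or undershoots (negative) the edge."""
    start = right_prompt_start(screen_width, shell_width(right_prompt))
    end = start + terminal_width(right_prompt)
    return end - screen_width


if __name__ == "__main__":
    prompt = "\u4e2d 12:00"  # hypothetical right prompt with one wide character
    print(overflow(80, prompt, count_one_cell_each, count_east_asian_wide))
    print(overflow(80, prompt, count_east_asian_wide, count_one_cell_each))
```

When both sides use the same width function the overflow is zero, which is why the shell and the terminal have to agree on character widths for the right prompt to line up.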
FWIW, I had an issue with color emoji in the left prompt, which messed things up, but turned out to be an issue with tmux. Making tmux use libutf8proc version >2 solved that one for me. The right prompt is still not aligned with the right border, though, so I searched here for leads. Maybe this helps. |
Heeeeeey, everybody!! So yeah, I hit this issue, in iTerm2 on a Mac. Eventually figured out that iTerm has an option under "Text" to "Use Unicode version 9 widths", which is now checked by default. After unchecking, the problem was fixed. I imagine this sort of thing will keep coming up until everyone standardizes on the same widths, but at least I understand why it was wonky now and will have a sense of where to go looking next time it hits me. Oh, and for more context, read iTerm's tooltip, which I captured in the screenshot. |
We will never standardize on the same widths. fish 3.0 has the |
If I start entering a command:
and start moving through the pager:
where what I'd expect would be:
Hitting ^c removes the last few characters of the prompt.
I've experimented with my fish_prompt - if I replace the two non-ASCII characters (the happy face and the triangle) with simple ASCII characters, all behaves as expected.
I'd tried instead removing all terminal coloring from the prompt - helps not at all, and in the absence of UTF8 characters, works fine (I'm using set_color everywhere.)
Using one of the git_status prompts is worse, since they use checkmarks etc.
Relevant environment:
I'm on Gentoo Linux, using urxvt.
I've done a quick tryout of a couple of other terminals and get similarly broken results. Weirdest is st, which doesn't display the characters but gets really confused about the correct insertion point.