Powerline symbols appear as '?' #3124

Closed
letheed opened this Issue Jun 7, 2016 · 18 comments

Projects

None yet

4 participants

@letheed
letheed commented Jun 7, 2016 edited

The regression appeared with commit 0b385f1 a few days ago. The theme is bobthefish using oh-my-fish (up-to-date). I'm not sure if the fix introduced this bug or uncovered one, or if the theme is to blame.

I'm using the fish nightly PPA builds on wily (15.10).
Tell me if the env or the config files are needed.

capture du 2016-06-07 20-47-02


For 2.3.1 release:

  • Update release notes
@krader1961 krader1961 self-assigned this Jun 7, 2016
@krader1961
Member

What does locale output?

@letheed
letheed commented Jun 7, 2016
$ locale
LANG=fr_FR.UTF-8
LANGUAGE=fr_FR:en
LC_CTYPE="fr_FR.UTF-8"
LC_NUMERIC=fr_FR.UTF-8
LC_TIME=fr_FR.UTF-8
LC_COLLATE="fr_FR.UTF-8"
LC_MONETARY=fr_FR.UTF-8
LC_MESSAGES="fr_FR.UTF-8"
LC_PAPER=fr_FR.UTF-8
LC_NAME=fr_FR.UTF-8
LC_ADDRESS=fr_FR.UTF-8
LC_TELEPHONE=fr_FR.UTF-8
LC_MEASUREMENT=fr_FR.UTF-8
LC_IDENTIFICATION=fr_FR.UTF-8
LC_ALL=
@krader1961
Member

Okay, your locale looks reasonable with the exception of LANGUAGE but that's probably not relevant. Too, I'd have to check the docs but that value is probably legal.

I can't reproduce on Ubuntu 16.04. I'm not using PowerLine fonts but if I insert \u2311\u2612 in my prompt a small square box and ballot box w/X shows up as expected. No question marks.

How are the PowerLine chars being inserted into your prompt? Presumably as UTF-8 sequences but please confirm. If you replace those with unicode literals as in my example above do you see the expected PowerLine glyphs? Obviously you'll need to figure out what codepoints correspond to the symbols in your sample screenshot.

P.S., The change in question has probably uncovered a bug as the new code should be functionally equivalent to the old code. I simply removed the logic for choosing which locale vars to use from fish in favor of the setlocale() function which already does that when called in the idiomatic setlocale(LC_ALL, "") manner.

@floam
Member
floam commented Jun 7, 2016

I'm getting a "?" with the built-in Terlar in Terminal.app but not iTerm2.

On Tue, Jun 7, 2016 at 3:21 PM Kurtis Rader notifications@github.com
wrote:

Okay, your locale looks reasonable with the exception of LANGUAGE but
that's probably not relevant. Too, I'd have to check the docs but that
value is probably legal.

I can't reproduce on Ubuntu 16.04. I'm not using PowerLine fonts but if I
insert \u2311\u2612 in my prompt a small square box and ballot box w/X
shows up as expected. No question marks.

How are the PowerLine chars being inserted into your prompt? Presumably as
UTF-8 sequences but please confirm. If you replace those with unicode
literals as in my example above do you see the expected PowerLine glyphs?
Obviously you'll need to figure out what codepoints correspond to the
symbols in your sample screenshot.

P.S., The change in question has probably uncovered a bug as the new code
should be functionally equivalent to the old code. I simply removed the
logic for choosing which locale vars to use from fish in favor of the
setlocale() function which already does that when called in the idiomatic setlocale(LC_ALL,
"") manner.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#3124 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AARxRvY-ajIBQlYb6txpMwhk7gT4xf3Lks5qJe8EgaJpZM4IwREW
.

@floam
Member
floam commented Jun 7, 2016

Hm. It goes back to normal if I get rid of the iTerm2 terminal integration
plugin. It didn't used to cause problems with Terminal.app.

On Tue, Jun 7, 2016 at 3:35 PM Aaron Gyes me@aaron.gy wrote:

I'm getting a "?" with the built-in Terlar in Terminal.app but not iTerm2.

On Tue, Jun 7, 2016 at 3:21 PM Kurtis Rader notifications@github.com
wrote:

Okay, your locale looks reasonable with the exception of LANGUAGE but
that's probably not relevant. Too, I'd have to check the docs but that
value is probably legal.

I can't reproduce on Ubuntu 16.04. I'm not using PowerLine fonts but if I
insert \u2311\u2612 in my prompt a small square box and ballot box w/X
shows up as expected. No question marks.

How are the PowerLine chars being inserted into your prompt? Presumably
as UTF-8 sequences but please confirm. If you replace those with unicode
literals as in my example above do you see the expected PowerLine glyphs?
Obviously you'll need to figure out what codepoints correspond to the
symbols in your sample screenshot.

P.S., The change in question has probably uncovered a bug as the new code
should be functionally equivalent to the old code. I simply removed the
logic for choosing which locale vars to use from fish in favor of the
setlocale() function which already does that when called in the
idiomatic setlocale(LC_ALL, "") manner.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#3124 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AARxRvY-ajIBQlYb6txpMwhk7gT4xf3Lks5qJe8EgaJpZM4IwREW
.

@zanchey zanchey added this to the fish-future milestone Jun 7, 2016
@krader1961
Member

I replaced the unicode literal \u2311 with (/bin/echo -ne "\xe2\x8c\x91") and observed the expected small square glyph. So decoding UTF-8 on input (e.g., reading the output of a subcommand) and subsequently encoding it to UTF-8 when we write it is working correctly.

Could someone who can reproduce this run fish inside the script command then exit after a single incorrect prompt and attach the resulting typescript file? Please also include a screen capture of the typescript session.

@floam
Member
floam commented Jun 7, 2016
@krader1961
Member

There is a literal ASCII question-mark in the prompt. This suggests that the code point conflicts with the unicode private-use range reserved by fish for its special tokens. When that happens fish replaces the character with a question-mark so as not to confuse its internal state. That's a change I made back in march (see issue #2684) and the recent change to how the locale is set shouldn't have changed the behavior.

@floam can you confirm that if you build a fish binary without commit 0b385f1 the behavior changes? If it does please repeat your earlier typescript experiment and attach the data again.

P.S., If the issue is due to a character that conflicts with fish's use of private-use code points it's an example of another problem that would be solved by using UTF-8 internally rather than wide-chars.

@floam
Member
floam commented Jun 7, 2016

screenshot 2016-06-07 at 4 43 18 pm

[typescript.txt](https://github.com/fish-shell/fish-shell/files/303748/typescript.txt)
@krader1961
Member

Okay, the character in @floam's experiment is \u27A4 which is not one of the code points that fish reserves for internal use. So my change from March isn't a factor. Nonetheless, I still can't reproduce the problem. I added that character to my prompt:

echo -ns (set_color $fish_color_git) (__fish_git_prompt) (/bin/echo -ne "\xe2\x9e\xa4") \u27A4

Using a fish built from git head (so it includes my setlocale change) I see two instances of the expected glyph. My locale is

LANG=en_US.UTF-8
LANGUAGE=en_US:
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

I tested on Ubuntu 16.04 and OS X 10.11.6.

Bingo! If I switch to LANG= fr_FR.UTF-8 the unicode literal \u27A4 becomes a question-mark yet the UTF-8 sequence is correctly handled. That is bizarre to say the least. Okay, at this point I've got enough info to dig deeper now that I can reproduce.

@floam, Does locale on your system report anything other than en_US.UTF-8?

@floam
Member
floam commented Jun 8, 2016
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
@floam
Member
floam commented Jun 8, 2016

I'll say again though, this goes away if I delete ~/.iterm2_shell_integration.fish - and only occurs in Terminal.app (which is set to export the locale).

@krader1961
Member

A fish built without my setlocale change still behaves weird. @floam, can you test whether a fish built from git checkout 0b385f145ce6144b5812bd89fa8f73369bcbe57f^ still behaves incorrectly on your system?

@floam
Member
floam commented Jun 8, 2016

git checkout 0b385f145ce6144b5812bd89fa8f73369bcbe57f: broken
git checkout 0b385f145ce6144b5812bd89fa8f73369bcbe57f^: fixed

@floam floam added the bug label Jun 8, 2016
@floam
Member
floam commented Jun 8, 2016

On working fish:

echo \u27A4
➤
string length \u27A4
1
string length ➤
1

On broken fish:

echo \u27A4
?
string length \u27A4
1
string length ➤
3

And actually pasting in a ➤ causes all kinds of disturbances with the line editor. A wcwidth() problem?

@letheed
letheed commented Jun 8, 2016

I can reproduce @floam's results.
Additionally, on 32a585a:

$ /bin/echo -ne "\xe2\x8c\x91"
⌑⏎
$ /bin/echo -ne "\xe2\x8c\x91" | string length
1

and on 0b385f1:

$ /bin/echo -ne "\xe2\x8c\x91"
⌑~
$ /bin/echo -ne "\xe2\x8c\x91" | string length
3
@floam
Member
floam commented Jun 8, 2016 edited

Here's the effect screwing with my locale has on the rendering. Long story short: LC_ALL=C seems to fix it? Kind of backwards.

https://asciinema.org/a/4jf626hm23snnn8yrp87kexuv

Mostly it's just a video of me confusing the heck out of myself, as I don't really understand this stuff well enough to know exactly what the behavior is supposed to be when some of these "set themselves" after I unset them or set a different variable and start a new shell instance.

@krader1961
Member
krader1961 commented Jun 14, 2016 edited

The reason I saw the problem on my Ubuntu 16.04 system was because it didn't have locale fr_FR.UTF-8 installed so fish was falling back to the C locale which causes non-ASCII unicode literals to be illegal and replaced with a question mark. Installing that locale fixed the problem and I am, once again, unable to reproduce this issue on Ubuntu 16.04.

The good news is that I can reproduce on OS X using Terminal.app but not iTerm.app which is consistent with what @floam is seeing. The difference in behavior is because _share/functions/_fish_urlencode.fish, via __update_vte_cwd is being executed in Terminal.app but not iTerm.app. That function is setting LC_ALL to C but it's not being reset when we leave the scope of the function. Thus leaving fish in a single-byte locale sending multibyte UTF-8 sequences that the terminal renders as a single character. Which means fish's idea of how many characters are on the line is wrong.

@krader1961 krader1961 modified the milestone: 2.3.1, fish-future Jun 14, 2016
@krader1961 krader1961 added a commit to krader1961/fish-shell that referenced this issue Jun 14, 2016
@krader1961 krader1961 remove unset vars from the environment
Fixes #3124
2aeeade
@krader1961 krader1961 referenced this issue Jun 14, 2016
Closed

remove unset vars from the environment #3143

2 of 3 tasks complete
@floam floam added the regression label Jun 14, 2016
@krader1961 krader1961 added a commit to krader1961/fish-shell that referenced this issue Jun 14, 2016
@krader1961 krader1961 remove unset vars from the environment
Fixes #3124
ec1c318
@krader1961 krader1961 added a commit to krader1961/fish-shell that referenced this issue Jun 16, 2016
@krader1961 krader1961 remove unset vars from the environment
Fixes #3124
74431f1
@krader1961 krader1961 added a commit that closed this issue Jun 16, 2016
@krader1961 krader1961 remove unset vars from the environment
Remove vars from the environment that are no longer set. Simplify the code by
removing an unnecessary loop. Add some tests.

Fixes #3124
0ca1036
@krader1961 krader1961 added a commit that referenced this issue Jun 21, 2016
@krader1961 krader1961 remove unset vars from the environment
Remove vars from the environment that are no longer set. Simplify the code by
removing an unnecessary loop. Add some tests.

Fixes #3124
d8dd718
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment