symbols changed when using nerdfonts in fish_prompt #7723

Closed
simao opened this issue Feb 16, 2021 · 14 comments
Labels
bug Something that's not working as intended
Milestone

Comments

@simao

simao commented Feb 16, 2021

~ ❯ fish --version
fish, version 3.1.2
~ ❯ echo $version 
3.1.2
~ ❯ uname -a
Linux asterix 5.10.13-arch1-1 #1 SMP PREEMPT Wed, 03 Feb 2021 23:44:07 +0000 x86_64 GNU/Linux
~ ❯ echo $TERM
xterm-termite

Tried with sh -c 'env HOME=$(mktemp -d) fish' but no change in behavior. Tried with alacritty and termite and get the same result.

Fish seems to change characters in the prompt when using nerd fonts.

For example, with the nf-mdi-chevron_right symbol (), fish changes it to . These characters don't render properly in the browser, so here is an image:

image

@faho
Member

faho commented Feb 16, 2021

What does locale say? How about fish_prompt | od -t x1z?

Note that fish has very little to do with font rendering - it just sends the bytes to the terminal and that decides how to render them.

Also, things like nerd fonts are very hard to support because they abuse the Unicode private use area. That is tricky because the terminal and fish need to agree on each character's width, but there's no standard width for the private use area. The best fish can do is let you set it via $fish_ambiguous_width (either 1 or 2) and hope the terminal agrees. If the glyphs aren't all the same width, it's unsupportable.

> xterm-termite

As a sidenote, termite's development is dead, I recommend picking a different terminal.

@simao
Author

simao commented Feb 16, 2021

Here is fish_prompt | od -t x1z:

fish_prompt | od -t x1z
0000000 ef 99 81 0a                                      >....<
0000004
~ ❯ locale
LANG=en_IE.UTF-8
LC_CTYPE="en_IE.UTF-8"
LC_NUMERIC="en_IE.UTF-8"
LC_TIME="en_IE.UTF-8"
LC_COLLATE="en_IE.UTF-8"
LC_MONETARY="en_IE.UTF-8"
LC_MESSAGES="en_IE.UTF-8"
LC_PAPER="en_IE.UTF-8"
LC_NAME="en_IE.UTF-8"
LC_ADDRESS="en_IE.UTF-8"
LC_TELEPHONE="en_IE.UTF-8"
LC_MEASUREMENT="en_IE.UTF-8"
LC_IDENTIFICATION="en_IE.UTF-8"
LC_ALL=

I understand that nerd fonts are kind of a mess. Is there any way around this? I bumped into this while using the default starship prompt character, which renders differently in bash and fish. bash doesn't seem to touch it, but fish changes it to a slightly different character.

Yeah, I know termite is dead and I am looking for a new terminal, but none has URL navigation like termite's, so I miss that in other terminals.

@faho
Member

faho commented Feb 16, 2021

Ah, okay, that's a U+F641, which is part of our "ENCODE_DIRECT" area.
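
As a quick check of that reading, the bytes from the `od` output can be decoded by hand. This is only a sketch: the private use area bounds are standard Unicode, but the direct-encoding range used here (256 code points starting at U+F600) is an assumption taken from the description later in this thread, not from the fish source.

```python
# Bytes reported by `od -t x1z` in the thread, minus the trailing newline (0x0a).
prompt_bytes = bytes([0xEF, 0x99, 0x81])

char = prompt_bytes.decode("utf-8")
codepoint = ord(char)

# The Basic Multilingual Plane private use area is U+E000..U+F8FF.
in_private_use_area = 0xE000 <= codepoint <= 0xF8FF

# Assumption: the direct-encoding range is the 256 code points from U+F600,
# one per possible byte value, as described later in the thread.
in_direct_range = 0xF600 <= codepoint <= 0xF6FF

print(f"U+{codepoint:04X}", in_private_use_area, in_direct_range)
# prints: U+F641 True True
```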

Fish uses some private use area codepoints for its own purposes, to encode things like "this is a variable expansion" in-band. If it encounters a codepoint from that range in its input, it encodes it in that other area (also private use). It seems there's a place where this isn't reversed correctly.

Can you show the byte value you get in bash?

> I understand that nerd fonts are kind of a mess. Is there any way around this?

The way around this would be for nerd fonts to get this stuff into unicode, as actual assigned code points. Anything that uses the private use area in this way is going to be messy.

@simao
Author

simao commented Feb 16, 2021

I see.
Here is what I get in bash:

$ export PS1=
echo $PS1 | od -t x1z
0000000 ef 99 81 0a                                      >....<
0000004

So the bytes are the same before fish renders them, I guess.

@faho
Member

faho commented Feb 16, 2021

So... it's actually in that area.

Can you see what running fish_prompt manually prints?

@simao
Author

simao commented Feb 16, 2021

running just fish_prompt prints it properly:

image

@faho
Member

faho commented Feb 16, 2021

Ah, okay.

What happens here is that, when fish does output itself, it will try to sanitize that string by reversing the ENCODE_DIRECT encoding, but the prompt never went through that, so it mangles the characters.

I think we just want to skip that bit with everything we write via s_write, but I'm not entirely sure.

@ridiculousfish: Any comment?

faho added a commit to faho/fish-shell that referenced this issue Feb 16, 2021
If e.g. the prompt includes codepoints in the ENCODE_DIRECT area, this
would mangle them, because the prompt never goes through that encoding
process.

Now I'm not sure if this is necessary *at all* in the outputter, but
let's be conservative for now and only do it for anything coming from
the screen.

Might fix fish-shell#7723
@faho
Member

faho commented Feb 16, 2021

@simao
Author

simao commented Feb 16, 2021

Yep that works:

2021-02-16-165409_773x92_scrot

@faho
Member

faho commented Feb 16, 2021

Alright, I think I got it the wrong way around - wcs2string does the decoding, the current outputter_t::writestr doesn't.

writech does, but for some reason this was never added to the version that does an entire string in one go?

So I apparently missed an encoding step somewhere.

@ridiculousfish
Member

ridiculousfish commented Feb 17, 2021

When fish executes some command and captures its output (say fish_prompt or external commands), it tries to convert the output from the raw bytes to a Unicode string using UTF-8 (actually using the user's locale, but assume UTF-8 here). For any bytes which are not part of a valid UTF-8 sequence, fish encodes them as Unicode characters in the private use area starting at U+F600. For example the single byte 0xFE is not valid UTF-8, so it would be encoded as U+F6FE. fish calls this "direct encoding."
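
The scheme described above can be sketched in a few lines of Python. This is not fish's implementation, just an illustration under the stated assumptions: the input is treated as UTF-8, and `ENCODE_DIRECT_BASE` and `decode_with_direct_encoding` are hypothetical names chosen for this example.

```python
ENCODE_DIRECT_BASE = 0xF600  # start of the range named in the comment above


def decode_with_direct_encoding(raw: bytes) -> str:
    """Decode UTF-8, mapping each undecodable byte b to chr(0xF600 + b)."""
    out = []
    i = 0
    while i < len(raw):
        # A UTF-8 sequence is 1 to 4 bytes long; take the shortest valid one.
        for length in range(1, 5):
            try:
                out.append(raw[i:i + length].decode("utf-8"))
                i += length
                break
            except UnicodeDecodeError:
                continue
        else:
            # No valid sequence starts here: keep the byte, shifted into
            # the private use area, and move on by one byte.
            out.append(chr(ENCODE_DIRECT_BASE + raw[i]))
            i += 1
    return "".join(out)


text = decode_with_direct_encoding(b"a\xfeb")
print([f"U+{ord(c):04X}" for c in text])
# prints: ['U+0061', 'U+F6FE', 'U+0062']
```

Python's built-in `surrogateescape` error handler follows a similar idea, except that it maps undecodable bytes to lone surrogates rather than to the private use area.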

I think there are two bugs. One is that outputter_t::writestr is failing to decode these. I think that was my mistake from many years ago, when I didn't understand what was happening, and probably an attempt to work around the second bug.
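
The first bug can be illustrated with a small Python sketch. It assumes, based on the description in this thread rather than on the fish source, that the three prompt bytes for U+F641 were captured as one directly encoded character per byte. `write_naive` and `write_decoding` are hypothetical stand-ins for an output path that skips the decoding step and one that performs it.

```python
ENCODE_DIRECT_BASE = 0xF600  # assumed base, taken from the comment above

original = bytes([0xEF, 0x99, 0x81])  # U+F641 as UTF-8, from the od output

# Assumed result of capturing those bytes: one direct-encoded char per byte.
captured = "".join(chr(ENCODE_DIRECT_BASE + b) for b in original)


def write_naive(text: str) -> bytes:
    """Output path that never reverses the direct encoding."""
    return text.encode("utf-8")


def write_decoding(text: str) -> bytes:
    """Output path that turns direct-encoded characters back into bytes."""
    out = bytearray()
    for ch in text:
        offset = ord(ch) - ENCODE_DIRECT_BASE
        if 0 <= offset < 256:
            out.append(offset)          # direct-encoded: emit the raw byte
        else:
            out += ch.encode("utf-8")   # ordinary character
    return bytes(out)


print(write_naive(captured).hex(" "))     # ef 9b af ef 9a 99 ef 9a 81
print(write_decoding(captured).hex(" "))  # ef 99 81
```

Only the decoding path hands the terminal the bytes the prompt originally produced; the naive path emits different private use characters.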

The second bug is in handling the following: if the input is a valid UTF-8 sequence which collides with our direct encoding, we try to encode that sequence directly as well. For example, if we capture 0xFE, we encode it as U+F6FE; if we capture 0xEF9BBE, that is a valid UTF-8 sequence which also decodes to U+F6FE. fish tries to detect this case, but that looks wrong to me: it is only storing the first byte. edit: I was wrong about this, fish is properly round-tripping characters in the private use area.
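
To see why storing one byte at a time is enough to round-trip, here is a hedged Python sketch of the collision case. The names `to_wide` and `to_bytes` are hypothetical, the range is assumed to be U+F600..U+F6FF, and the logic follows the description in this thread rather than fish's actual code.

```python
ENCODE_DIRECT_BASE = 0xF600            # assumed base of the direct range
ENCODE_DIRECT_END = ENCODE_DIRECT_BASE + 256  # exclusive upper bound


def is_direct(codepoint: int) -> bool:
    return ENCODE_DIRECT_BASE <= codepoint < ENCODE_DIRECT_END


def to_wide(raw: bytes) -> str:
    """Bytes to text; invalid bytes and colliding sequences go in directly."""
    out = []
    i = 0
    while i < len(raw):
        decoded = None
        for length in range(1, 5):
            try:
                decoded = raw[i:i + length].decode("utf-8")
                break
            except UnicodeDecodeError:
                continue
        if decoded is None or is_direct(ord(decoded)):
            # Store only this byte; any continuation bytes that follow are
            # invalid on their own and get stored directly on later passes.
            out.append(chr(ENCODE_DIRECT_BASE + raw[i]))
            i += 1
        else:
            out.append(decoded)
            i += length
    return "".join(out)


def to_bytes(text: str) -> bytes:
    """Text back to bytes, reversing the direct encoding."""
    out = bytearray()
    for ch in text:
        if is_direct(ord(ch)):
            out.append(ord(ch) - ENCODE_DIRECT_BASE)
        else:
            out += ch.encode("utf-8")
    return bytes(out)


for sample in (b"\xfe", b"\xef\x9b\xbe", b"\xef\x99\x81"):
    assert to_bytes(to_wide(sample)) == sample

print(len(to_wide(b"\xfe")), len(to_wide(b"\xef\x9b\xbe")))
# prints: 1 3
```

The lone byte 0xFE and the three-byte sequence for U+F6FE stay distinguishable internally (one character versus three), so both convert back to their original bytes.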

Nerd Fonts trips this bug because it assigns glyphs to the private use area.

I think we should fix both writestr and str2wcs.

@ridiculousfish
Member

ridiculousfish commented Feb 17, 2021

This sort of bisects to f2246df but that just shifted the value of ENCODE_DIRECT_BASE, changing which glyphs were affected. I think writestr has always been busted in this way.

The failure to decode raw bytes in output dates back to the original git import 149594f, while the raw bytes encoding scheme was added later, in 1977d3b. writestr was just never taught about it. So technically this is a regression in 1977d3b 😂

@ridiculousfish
Member

Since the screen output code predates the encode direct scheme, it's just an oversight that it failed to handle directly encoded bytes, so I feel confident in just using the wcs2string family of functions. Thanks for filing, this was a neat bug!

@zanchey zanchey added this to the fish 3.2.0 milestone Feb 18, 2021
@zanchey zanchey added bug Something that's not working as intended and removed needs more info labels Feb 18, 2021
@simao
Author

simao commented Feb 18, 2021

Thanks a lot both for fixing it and for working on fish!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 17, 2021
4 participants