Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test the emoji width at runtime #5722

Closed
wants to merge 3 commits into from

Conversation

faho
Copy link
Member

@faho faho commented Mar 5, 2019

Since Unicode 9, the width of some characters changed to 2.

Depending on the system, it might have support for it, or it might
not.

Instead of hardcoding specific glibc etc versions, we compile a small
program and check an actual value (the one for "馃槂", U+1F603 "Grinning Face With Big Eyes").

To skip the check, the USE_UNICODE9_WIDTH option can be used to default emoji width to 2.

The intention is to, in most cases, make setting $fish_emoji_width
unnecessary, but since it sets the "guessed_emoji_width", that variable still takes precedence if it is set.

Unfortunately this approach has some caveats:

  • It relies on the locale being set to a unicode-supporting one.
    (C.UTF-8 is unfortunately not standard, so we can't use it)
  • It relies on the terminal's wcwidth having unicode9 support IFF the
    system wcwidth does.

Some results:

Ubuntu Xenial (16.04) on Travis correctly detects 1 (glibc is 2.23,
the cut-off for unicode9 support is 2.26).

Archlinux detects 2.

macOS should mostly be unaffected as we have special heuristics for
iTerm and Terminal.app. Since the result is 2 for macOS 10.13.3, we could probably drop the Terminal.app check. iTerm defaults to unicode8 widths, and we might want to emit the special escape that switches it to unicode9 (https://gitlab.com/gnachman/iterm2/wikis/unicodeversionswitching).

Alpine edge (which uses musl) also returns 2.

NetBSD returns -1, even with LC_ALL set to utf-8, OpenIndiana/Illumos/Solaris/SunOS returns 1 (with the default en_US.UTF-8 locale).

TODOs:

  • [N/A] Changes to fish usage are reflected in user documentation/manpages.
  • [N/A] Tests have been added for regressions fixed
  • User-visible changes noted in CHANGELOG.md

@faho faho added this to the fish 3.1.0 milestone Mar 5, 2019
@zanchey
Copy link
Member

zanchey commented Mar 5, 2019

The platform that worries me here is WSL, but in general I am not sure we can count on the build system having similar characteristics to the running system.

@faho
Copy link
Member Author

faho commented Mar 5, 2019

The platform that worries me here is WSL

IIRC, WSL is Ubuntu 16.04 currently, so nothing changes right now. Once they update to 18.04, they'll get a newer wcwidth, in which case they should also fix their terminal to conform to that, or those users will still have to set -g fish_emoji_width 1.

in general I am not sure we can count on the build system having similar characteristics to the running system.

The build system should usually have the same wcwidth(). Even if there are differences, I don't see how it's any worse to sometimes get the newer widths. Remember, the current default is 1, for everyone except those using a new-ish Terminal.app.

And regarding the locale, I'm about to push a commit that tries to get a UTF-8 locale.


Really, the idea here is to default to the same width that the system's wcwidth would give, which is a guess, but unfortunately the best we can do.

faho added 2 commits March 5, 2019 17:58
Since Unicode 9, the width of some characters changed to 2.

Depending on the system, it might have support for it, or it might
not.

Instead of hardcoding specific glibc etc versions, we compile a small
program and check an actual value.

The intention is to, in most cases, make setting $fish_emoji_width
unnecessary.

Unfortunately this approach has some caveats:

- It relies on the locale being set to a unicode-supporting one.
  (C.UTF-8 is unfortunately not standard, so we can't use it)
- It relies on the terminal's wcwidth having unicode9 support IFF the
  system wcwidth does.

Some results:

Ubuntu Xenial (16.04) on Travis correctly detects 1 (glibc is 2.23,
the cut-off for unicode9 support is 2.26).

macOS should mostly be unaffected as we have special heuristics for
iTerm and Terminal.app. Regardless, the result is 2 for macOS 10.13.3.

Alpine edge (which uses musl) also returns 2.
This should fix the situation where the system has no locale set, in
which case the "C" locale is used in which `wcwidth(L'馃槂')` is basically meaningless.
That sucky g++ on Travis' old Ubuntu apparently can't handle the
awesomeness without additional configuration.
@floam
Copy link
Member

floam commented Mar 6, 2019

I think we might prefer to do checks at runtime, not build time. I think this is going to cause confusion in a lot of instances where the build system does not match the actual target system, either by a different wcwidth() or different locales.

@faho
Copy link
Member Author

faho commented Mar 6, 2019

Yeah, as it turns out the runtime version is much simpler. It'll have issues if you use a C locale, but if you use a C locale yet your terminal still displays emoji, your system is weird enough that we don't need to support it with heuristics.

I'll just commit that.

@faho faho closed this in c633c06 Mar 6, 2019
@floam floam changed the title Test the emoji width at compiletime Test the emoji width at runtime Mar 6, 2019
@faho faho mentioned this pull request Mar 27, 2019
@faho faho modified the milestones: fish 3.1.0, fish 3.0.3 Apr 6, 2019
@zanchey zanchey modified the milestones: fish 3.0.3, fish 3.1.0 Aug 14, 2019
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 16, 2020
@faho faho deleted the compiletest-wcwidth branch February 21, 2022 13:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants