Add a wrapper of wcwidth() that picks the best implementation #917
This adds an i_wcwidth() function that replaces mk_wcwidth(), and a 'wcwidth_implementation' setting to pick which one it wraps.

Values:

- old: uses our local mk_wcwidth(), which implements Unicode 5.0.
- system: uses the libc-provided wcwidth(), which may be better or worse than ours depending on how up to date the system is.
- auto: tests the system one against two characters that became fullwidth in Unicode 5.2 and 9.0 respectively. If either of them passes, pick the system implementation, otherwise pick ours.

It defaults to auto.

mk_wcwidth() is still preferable in some cases, since the way it uses ranges for fullwidth characters means most CJK blocks are covered even if their characters didn't exist back then.

The "system" implementation is also wrapped to never return -1, but to assume those unknown characters use one cell. Quoting the code:

    /* Treat all unknown characters as taking one cell. This is
     * the reason mk_wcwidth and other outdated implementations
     * mostly worked with newer unicode, while glibc's wcwidth
     * needs updating to recognize new characters.
     *
     * Instead of relying on that, we keep the behavior of assuming
     * one cell even for glibc's implementation, which is still
     * highly accurate and less of a headache overall. */
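For readers who want the selection logic spelled out, here is a rough sketch of how the three values could map onto a function pointer. It is not the code from this PR: `old_width()` is an abridged stand-in for mk_wcwidth(), the names `width_fn`, `pick_width_fn` and the `IMPL_*` enum are hypothetical, and the two probe code points (U+A960 and U+1F600) are my own picks for "wide since 5.2" and "wide since 9.0", which may differ from the ones the patch actually tests.

```c
#define _XOPEN_SOURCE 700 /* exposes wcwidth() in <wchar.h> on glibc */
#include <wchar.h>

/* Hypothetical names, used only for this sketch. */
typedef int (*width_fn)(unsigned int ucs);
enum wcwidth_impl { IMPL_OLD, IMPL_SYSTEM, IMPL_AUTO };

/* Assumed probe characters, not necessarily the ones used in the PR:
 * U+A960 was added as a wide character in Unicode 5.2,
 * U+1F600 became wide in Unicode 9.0. */
#define PROBE_UNICODE_5_2 0xA960u
#define PROBE_UNICODE_9_0 0x1F600u

/* Abridged stand-in for mk_wcwidth(): a few of its wide ranges,
 * everything else counts as one cell. */
static int old_width(unsigned int ucs)
{
	if ((ucs >= 0x1100 && ucs <= 0x115f) ||   /* Hangul Jamo */
	    (ucs >= 0x2e80 && ucs <= 0xa4cf) ||   /* CJK ... Yi */
	    (ucs >= 0xac00 && ucs <= 0xd7a3) ||   /* Hangul Syllables */
	    (ucs >= 0x20000 && ucs <= 0x3fffd))   /* supplementary CJK planes */
		return 2;
	return 1;
}

/* libc wcwidth(), wrapped so unknown characters take one cell
 * instead of returning -1. */
static int system_width(unsigned int ucs)
{
	int w = wcwidth((wchar_t) ucs);
	return w < 0 ? 1 : w;
}

/* Map the setting value to an implementation. "auto" trusts the
 * system function only if it knows at least one probe is wide. */
static width_fn pick_width_fn(enum wcwidth_impl impl, width_fn system_fn)
{
	switch (impl) {
	case IMPL_OLD:
		return old_width;
	case IMPL_SYSTEM:
		return system_fn;
	case IMPL_AUTO:
	default:
		if (system_fn(PROBE_UNICODE_5_2) == 2 ||
		    system_fn(PROBE_UNICODE_9_0) == 2)
			return system_fn;
		return old_width;
	}
}
```

A caller would run something like `width_fn f = pick_width_fn(IMPL_AUTO, system_width);` once at settings-read time and then call `f(ucs)` per character. Note that wcwidth() depends on the LC_CTYPE locale, so the probe is only meaningful after `setlocale()` has been called with a UTF-8 locale.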
One way in which I saw #720 manifest:
It's invisible outside of mosh, which is interesting, but it's clearly our fault. Using CJK characters that mk_wcwidth knows about, such as
The other CJK character I used in the system wcwidth tests of this PR is
Neat, thanks for the changes.
I'm not really sure why we need utf8proc, but I guess it would make sense to force newer Unicode on macOS, whose libc is stuck in the past forever. The problem with doing that is that it might make mismatches with the terminal / tmux / mosh / etc. more likely. I guess we could just document the risks and benefits.
jillest left a comment
I suppose this conceptually makes sense, given that many terminal emulators seem to find it more important to be "correct" and up to date rather than to match the text-based applications exactly.