Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Align lines with emojis correctly #163

Closed
wants to merge 1 commit into from

Conversation

mikelorant
Copy link
Contributor

Alignment issues occur when some lines have certain emojis in them. This is caused by incorrectly calculating the width of a string.

For a string with emojis in them, it is important to understand that the string width is different to the rune width. Many functions within Lipgloss base the string length on the printable rune width which is incorrect. The correct method requires finding the string width.

The reflow package has a printable rune width function. While the ANSI sequences need to be removed, we do not want the rune width as this function returns.

This change adds a new function to get the printable string width. The code was taken from reflow and modified for use in Lipgloss. Since the printable rune width function in reflow works correctly, it was more reasonable to add the function to Lipgloss.

@muesli
Copy link
Member

muesli commented Dec 11, 2022

Please do fix this upstream if you found any issues. At first glance I don't see a difference however?

@mikelorant
Copy link
Contributor Author

The key change is the use of runewidth.StringWidth instead of runewidth.RuneWidth

I will provide a repository shortly showcasing the problem and why this pull request resolves the problem.

@mikelorant
Copy link
Contributor Author

I want to clarify that there is no bug with reflow. It is just providing a different answer to the question. Lipgloss needs to know the printable string width, not the printable rune width. Are you willing to accept a pull request to reflow to add this function? Lipgloss still needs to be updated to use it however.

@muesli
Copy link
Member

muesli commented Dec 11, 2022

Yeah, definitely willing to accept PRs there, so we don't end up copying similar code all over the place.

@muesli
Copy link
Member

muesli commented Dec 11, 2022

I will provide a repository shortly showcasing the problem and why this pull request resolves the problem.

Cool, a few test cases might be nice here.

@mikelorant
Copy link
Contributor Author

Test case with both code and images.

https://github.com/mikelorant/emoji-alignment

Hopefully these results are consistent with other platforms and terminals.

@meowgorithm
Copy link
Member

meowgorithm commented Dec 11, 2022

Unfortunately, I don't believe this is a valid solution. I'm already seeing different results with Kitty, however the core of the issue is with multi-sequence emojis such as the following.

// alembic
"⚗️"
"\U00002697\U0000FE0F"

// kiss: woman, man, medium-dark skin tone, light skin tone (with zero width joiners)
"👩🏾‍❤️‍💋‍👨🏻"
"\U0001F469\U0001F3FE\U0000200D\U00002764\U0000FE0F\U0000200D\U0001F48B\U0000200D\U0001F468\U0001F3FB"

Note that I'm providing both emoji literals and unicode sequences here, which Go will evaluate equally.

@mikelorant
Copy link
Contributor Author

Tested the two emojis you provided into my example code.

example

I think the complexity was always going to be the terminals and operating system support. However, on a Mac you can see this improves the results.

@mikelorant
Copy link
Contributor Author

Within Lipgloss the 12 byte emoji has some issues. I will continue looking into that issue and hopefully be able to work out what is going on.

I think my patch still improves the situation even if it does not solve the problem fully. Emojis are extremely complicated, so instead of looking for the perfect solution its best to continually improve things.

The good news is that the problem is not in Lipgloss. It all relates to go-runewidth.

@mikelorant
Copy link
Contributor Author

Even more interesting is my patch seems to get much closer to handling the 12 byte emoji. Normally the main issue is table borders break. This did not occur which indicates this may just be an issue with calculating maximum line length.
example

If we can identify more emojis that fail this will allow us to build a small bubbletea app that can be used for testing purposes. Its hard to identify all the edge cases.

@mikelorant
Copy link
Contributor Author

mikelorant commented Dec 11, 2022

Finding some very unusual results when using kitty. Providing some comparison shots with iTerm2 and Terminal for reference.

iTerm2
iTerm2

Terminal
Terminal

kitty
kitty

WezTerm
WezTerm

Hyper
Hyper

Not sure what all this means except more research is required.

@mikelorant
Copy link
Contributor Author

mikelorant commented Dec 11, 2022

Kitty has its own wcwidth implementation while Terminal and iTerm2 use the system wcwidth. This may not be solvable if rendering is done differently by terminals.

Open issue for kitty.
kovidgoyal/kitty#1978

Specific comment about wcwidth:
kovidgoyal/kitty#1978 (comment)

@praagyajoshi
Copy link

👋 Hello, thanks for your work on this @mikelorant.

I'm running into this exact issue without using multi-sequence emojis.
CleanShot 2023-09-26 at 00 19 06@2x

Is it possible to move ahead with this fix? Failing that, is there a workaround available?
cc: @meowgorithm.

@meowgorithm
Copy link
Member

meowgorithm commented Sep 25, 2023

What I would recommend for now is to simply avoid any emojis that are causing you problems.

The crux of the issue here is that emoji widths must both be correctly calculated in both Go and on the terminal level, and neither side is perfect yet. Even Kitty, which is among the best renderers I've seen, doesn't render every single emoji correctly. Going further, even some desktop environments don't support newer emojis.

We have some tooling in the works that should alleviate the problem a bit on the Go side, but there's not ETA yet.

@praagyajoshi
Copy link

Got it, thanks for the additional context @meowgorithm.

simply avoid any emojis that are causing you problems.

Just a clarification on that front, almost all the emojis I tried are suffering from the same issue.
Some of them include: ✨, 🌊, 💫, ✅, 🚀

@meowgorithm
Copy link
Member

@praagyajoshi which terminal, OS, and window manager (if relevant) are you using? As I mentioned this varies widely depending on terminal and can be affected by other OS-level factors.

Also, we still need to test properly but I suspect that switching from go-runewidth to rivo/uniseg directly under the hood may solve for this, particularly as rivo/uniseg now supports both the latest version of Unicode (15.0.0) and the ability to return cell widths.

@mikelorant
Copy link
Contributor Author

I'd like to bring this issue up again based on some feedback from @rivo.

That's because a single code point (rune) is not always a complete character. That's the whole basis of the uniseg package. The README explains this in detail. One example given there is the country flag emoji. These emojis must always consist of two runes. What's the width of only one of these runes then? It's basically undefined because it's an incomplete grapheme cluster. The mattn/go-runewidth package gets this fundamentally wrong and I have tried for a long time to help them do it right but there was never much interest in following up on it.
rivo/uniseg#48 (comment)

Using the rune width anywhere is something we need to be very careful about. In most cases I would expect we should only be using the string width.

It is likely we will not see the full benefits unless we replace RuneWidth functions across nearly all of the package as well as its dependencies. This makes this quite a complicated and time consuming task.

My recommendation would be to create a new issue for either bubbles or bubbletea and determine all the dependencies that would need to be fixed. This would then allow for issues and hopefully pull requests in each of the repositories.

@meowgorithm
Copy link
Member

meowgorithm commented Jan 22, 2024

Hey! We actually began work on switching to rivo.StringWidth today. It's not too daunting of a task as we maintain a fairly tight level of control over our dependency chain, at least as far as rendering is concerned. Here’s a few PRs for it so far:

@mikelorant
Copy link
Contributor Author

@meowgorithm Been tracking the great work on the StringWidth improvements. 👍

This is why I have now turned my attention to RuneWidth. This is just so much more complex because we can't just swap it out. While we can convert the rune to a string we now know it is not always going to be valid.

As you are one of the project maintainers, would be great to get some guidance on where we should start. Something experimental to prove this will give us improvements.

@meowgorithm
Copy link
Member

meowgorithm commented Jan 23, 2024

So I think ultimately we want to be counting and iterating over graphemes like you suggest. For example, ⚗️ is 6 bytes, 2 runes, and 1 grapheme cluster playground.

I'd also, have a look at this excellent writeup on grapheme clusters in terminals by Mitchell Hashimoto. It's the best reference I've seen on this subject.

Alignment issues occur when some lines have certain emojis in them. This
is caused by incorrectly calculating the width of a string.

For a string with emojis in them, it is important to understand that
the string width is different to the rune width. Many functions within
Lipgloss base the string length on the printable rune width which is
incorrect. The correct method requires finding the string width.

The reflow package has a printable rune width function. While the ANSI
sequences need to be removed, we do not want the rune width as this
function returns.

This change adds a new function to get the printable string width. The
code was taken from reflow and modified for use in Lipgloss. Since the
printable rune width function in reflow works correctly, it was more
reasonable to add the function to Lipgloss.
@mikelorant
Copy link
Contributor Author

mikelorant commented Jan 26, 2024

I've put up a preliminary pull request for muesli/reflow to iterate over grapheme clusters instead of runes. There are some complexities that will likely require further discussion but we can deal with them via an issue on that repository instead.

@meowgorithm
Copy link
Member

Awesome. For reference that's muesli/reflow#71 (also FYI @muesli is on the Charm team).

@maas, heads up re: textarea.

@mikelorant
Copy link
Contributor Author

Closing this pull request as this is going to be fixed with muesli/reflow#71.

@mikelorant mikelorant closed this Jan 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants