Wrong character counting on Emojis #12316

Open
kb10uy opened this issue Nov 6, 2019 · 3 comments
Labels
bug Something isn't working

Comments

@kb10uy

kb10uy commented Nov 6, 2019

Some emojis (e.g. 🏴󠁧󠁢󠁳󠁣󠁴󠁿 :flag-scotland: ) are counted as multiple characters (7 for that), which should be 1.
(omg GitHub paints them all with black ink :-( )

The stringz library does not seem to support Emoji 5.0 (2017). Perhaps we should use another one for character counting to support the latest Unicode specification.
AFAIK, grapheme-splitter will do that properly.

Expected behaviour

All emojis are counted as 1 character.

Actual behaviour

Some flag emoji are counted as 2 or more characters.

Steps to reproduce the problem

Input :flag-england: , :flag-scotland: , or :flag-wales: .

Specifications

Mastodon: 3.0.1
Browser: Firefox 70.0.1 (Windows)

@MaciekBaron
Contributor

Emojis are, in fact, several characters, and they are counted the same way on Twitter et al. A lot of emojis are combinations of emojis that happen to be displayed as a single image, but also, some browsers might not display emojis at all, or display them as several glyphs.

The bottom line is, while it might seem counter-intuitive, the character count is correct and should stay this way.

MaciekBaron added the status/wontfix label (This will not be worked on) on Nov 10, 2019
@kb10uy
Author

kb10uy commented Nov 11, 2019

Yes, I understand that emoji are constructed from several code points. The problem is that some emoji are counted by code points while others are counted by grapheme clusters. The inconsistency is what I am pointing out.

  • 👨‍👨‍👧‍👦 consists of 7 codepoints, 1 grapheme cluster (so counted as 1)
  • :flag-scotland: also consists of 7 codepoints, 1 grapheme cluster (but counted as 7)

By the way, the backend always counts the length of posts by grapheme cluster.
Wouldn't simply replacing stringz with grapheme-splitter work?
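
For reference, the two ways of counting mentioned above can be compared with a minimal sketch. It assumes a runtime that provides `Intl.Segmenter` (recent browsers, Node.js 16+). It uses neither stringz nor grapheme-splitter and is not Mastodon's actual counting code; `countCodePoints` and `countGraphemes` are names made up for illustration.

```javascript
// Assumption: Intl.Segmenter is available in the runtime.
// This is an illustration only, not the code Mastodon uses.

// Scotland flag: U+1F3F4, the tag characters for "gbsct", then a cancel tag.
const flagScotland =
  "\u{1F3F4}\u{E0067}\u{E0062}\u{E0073}\u{E0063}\u{E0074}\u{E007F}";

// Family: man, man, girl, boy joined by U+200D (zero width joiner).
const family = "\u{1F468}\u200D\u{1F468}\u200D\u{1F467}\u200D\u{1F466}";

// Count Unicode code points. Iterating a string walks code points,
// not UTF-16 code units, so surrogate pairs count once.
function countCodePoints(text) {
  return Array.from(text).length;
}

// Count extended grapheme clusters (user-perceived characters).
function countGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  return Array.from(segmenter.segment(text)).length;
}

for (const [name, text] of [
  ["flag-scotland", flagScotland],
  ["family", family],
]) {
  console.log(name, {
    codePoints: countCodePoints(text),
    graphemes: countGraphemes(text),
  });
}
```

Both strings contain 7 code points. A counter that follows the Unicode grapheme cluster rules should treat each of them as a single cluster, which is the consistent behaviour this issue asks for.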

stale bot removed the status/wontfix label (This will not be worked on) on Nov 11, 2019
@MaciekBaron
Contributor

Sorry, I read your comment too generally. You are raising a valid point if there is a discrepancy between the backend and the frontend.

vmstan added the bug label (Something isn't working) on Nov 17, 2023