`center` and `truncate` should be grapheme cluster and unicode width aware #183

Kijewski · 2024-09-23T13:37:31Z

Let's have a look at the letter 'ễ' in the surname "Nguyễn". You can find it either as the single precomposed Unicode character U+1EC5, or decomposed as "e\u{302}\u{303}". When truncating a text, the letter ễ should stay ễ, and not be cut down to ê or e. Or take the emoji "👯‍♂️", which is composed of "\u{1f46f}\u{200d}\u{2642}\u{fe0f}", a sequence that must not be split up.
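To make the problem concrete, here is a minimal sketch of what goes wrong when a string is cut by `char` count. `truncate_chars` is a hypothetical helper written only for this illustration; it is not the filter's actual implementation.

```rust
/// Hypothetical helper: keeps the first `n` Unicode scalar values of `s`.
/// It knows nothing about combining marks or ZWJ sequences.
fn truncate_chars(s: &str, n: usize) -> String {
    s.chars().take(n).collect()
}

/// Number of `char`s (Unicode scalar values) in `s`.
fn char_len(s: &str) -> usize {
    s.chars().count()
}
```

With the decomposed spelling "Nguye\u{302}\u{303}n", `char_len` reports 8 instead of 6, `truncate_chars(.., 6)` yields "Nguyê", and `truncate_chars(.., 5)` yields "Nguye". Applied to the emoji with a limit of 1, only "\u{1f46f}" survives.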

It would be nice if one could make `|center` and `|truncate` understand Unicode widths (ễ = 1 display column; 👯‍♂️ = 2 display columns) and grapheme clusters. It should be opt-in, because the lookup tables are big.

Maybe, instead of modifying the existing functions, new ones should be introduced.
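As a rough illustration of the behavior I have in mind, here is a std-only sketch of a cluster-preserving truncation. It is a deliberate simplification: `extends_cluster`, `approx_clusters`, and `truncate_clusters` are hypothetical names, and the hand-picked ranges cover only the two examples above. It is not the full UAX #29 segmentation algorithm, which is exactly the part that needs the big lookup tables, and it ignores display width entirely.

```rust
/// Assumption: only these code points glue onto the preceding cluster.
/// Real grapheme segmentation (UAX #29) uses far larger tables.
fn extends_cluster(c: char) -> bool {
    matches!(
        c,
        '\u{0300}'..='\u{036F}'     // combining diacritical marks
            | '\u{200D}'            // zero width joiner
            | '\u{FE00}'..='\u{FE0F}' // variation selectors
    )
}

/// Splits `s` into approximate clusters using the rule above, plus
/// "whatever follows a zero width joiner stays in the same cluster".
fn approx_clusters(s: &str) -> Vec<&str> {
    let mut out = Vec::new();
    let mut start = 0;
    let mut prev_zwj = false;
    for (i, c) in s.char_indices() {
        // Close the current cluster when `c` begins a new one.
        if i != 0 && !extends_cluster(c) && !prev_zwj {
            out.push(&s[start..i]);
            start = i;
        }
        prev_zwj = c == '\u{200D}';
    }
    if !s.is_empty() {
        out.push(&s[start..]);
    }
    out
}

/// Keeps at most `max` approximate clusters, never splitting one.
fn truncate_clusters(s: &str, max: usize) -> String {
    approx_clusters(s).into_iter().take(max).collect()
}
```

With this, `truncate_clusters("Nguye\u{302}\u{303}n", 5)` keeps "Nguye\u{302}\u{303}" intact, and the emoji sequence counts as a single cluster. Whether something like this lives in new filters or behind a feature flag on the existing ones is the open question.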
