Font.cpp: Use UCS-4 on all platforms, maybe precompose? #359

jlnr · 2017-01-06T04:10:31Z

Gosu::Font assumes that every wchar_t in a string should be rendered as a single character. That leads to inconsistencies between UNIX (wchar_t = 32 bit) and Windows (wchar_t = 16 bit).
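
To make the inconsistency concrete, here is a minimal C++ sketch. It uses `char16_t` and `char32_t` as stand-ins for the 16-bit and 32-bit `wchar_t`, and `units_drawn` is a hypothetical helper for illustration, not Gosu code:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the number of "characters" a renderer would draw if it
// treated every code unit as one glyph, as described above for wchar_t.
template <typename Char>
std::size_t units_drawn(const std::basic_string<Char>& text)
{
    return text.size();
}

// U+1F600 (outside the Basic Multilingual Plane) followed by the letter A.
const std::u16string as_utf16 = u"\U0001F600A"; // 16-bit units, like wchar_t on Windows
const std::u32string as_utf32 = U"\U0001F600A"; // 32-bit units, like wchar_t on UNIX
```

With 16-bit units, U+1F600 is stored as a surrogate pair, so `units_drawn(as_utf16)` is 3 while `units_drawn(as_utf32)` is 2 for the same text.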

The correct fix would be to split the string into grapheme clusters, but I don't know if that is possible with Gosu's current dependencies (Win32 API, macOS API, iconv). It is also frustrating that the definition of a grapheme cluster changes every year. This page looks completely different on macOS 10.11 and 10.12.

The next best thing is to convert the UTF-8 string into precomposed code points. This avoids Ä being rendered as A¨ and also ensures that UTF-16 surrogates are not rendered as separate characters, as is the case on Windows right now.

And if precomposing is impossible without external dependencies, Gosu should at least use UCS-4 on Windows for consistency with UNIX.
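
A rough sketch of that fallback, assuming the input is UTF-8 and the target is a `std::u32string` on every platform. `utf8_to_ucs4` is a hypothetical name, not an existing Gosu function, and the sketch does not precompose anything, because that would need Unicode data tables:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of Gosu): decodes UTF-8 into UCS-4 code points.
// Malformed bytes are replaced with U+FFFD. No normalization is performed, so
// "A" followed by U+0308 stays two code points.
std::u32string utf8_to_ucs4(const std::string& utf8)
{
    std::u32string result;
    std::size_t i = 0;
    while (i < utf8.size()) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t code_point;
        std::size_t length;
        if (lead < 0x80)                { code_point = lead;        length = 1; }
        else if ((lead & 0xE0) == 0xC0) { code_point = lead & 0x1F; length = 2; }
        else if ((lead & 0xF0) == 0xE0) { code_point = lead & 0x0F; length = 3; }
        else if ((lead & 0xF8) == 0xF0) { code_point = lead & 0x07; length = 4; }
        else { result += U'\uFFFD'; ++i; continue; } // stray continuation or invalid lead byte

        // Fold in the continuation bytes, which must all look like 10xxxxxx.
        bool valid = i + length <= utf8.size();
        for (std::size_t k = 1; valid && k < length; ++k) {
            const unsigned char next = static_cast<unsigned char>(utf8[i + k]);
            if ((next & 0xC0) != 0x80) valid = false;
            else code_point = (code_point << 6) | (next & 0x3F);
        }

        // Reject overlong encodings, surrogates, and values beyond U+10FFFF.
        static const char32_t min_for_length[] = {0, 0, 0x80, 0x800, 0x10000};
        if (valid && (code_point < min_for_length[length] || code_point > 0x10FFFF ||
                      (code_point >= 0xD800 && code_point <= 0xDFFF))) {
            valid = false;
        }
        if (!valid) { result += U'\uFFFD'; ++i; continue; }

        result += code_point;
        i += length;
    }
    return result;
}
```

Decoding the UTF-8 bytes of U+1F600 this way yields one element on every platform, which is the consistency asked for here. The decomposed sequence A + U+0308 still comes out as two elements, so the A¨ rendering problem would remain.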

jlnr · 2017-04-02T20:31:44Z

This library looks nice, Gosu could just bundle it and start using grapheme clusters internally: https://github.com/JuliaLang/utf8proc

pmer · 2017-04-03T01:12:07Z

Another library to check out might be HarfBuzz, which is MIT licensed and which claims to have no dependencies. Unfortunately IMHO the project does a terrible job marketing itself and it's hard to understand exactly what parts of Unicode handling it solves.

pmer · 2017-04-03T04:30:42Z

Okay, I've been doing some reading about HarfBuzz. It's a high-level library that takes a Unicode string and tells you where to draw each glyph when you render it. It seems like a pretty big addition, so I'm not sure how well it will fit with the existing Font class, but it's something to check out.

Documentation:

Links:

jlnr · 2017-04-03T06:13:41Z

Similarly, each Arabic character has four different variants: within a font, there will be glyphs for the initial, medial, final, and isolated forms of each letter. Unicode only encodes one codepoint per character, and so a Unicode string will not tell you which glyph to use. Text shaping chooses the correct form of the letter and returns the correct glyph from the font that you need to render.

Ouch. It seems like the API I'm thinking about for Gosu::Font is inherently unsuitable for Arabic then:

font = Gosu::Font.new(...) { |cluster| Image.from_text(cluster, ...) }

I.e. Font's core responsibility is the segmentation of input strings, and it's up to the user how each cluster (substring) is rendered. Font just happens to use Image.from_text by default if you don't pass a custom block to it.
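
The same split of responsibilities could be sketched in C++ roughly as follows. Everything here is hypothetical: `for_each_cluster` is not a Gosu function, and `is_combining_mark` is a deliberately crude stand-in for real grapheme segmentation (UAX #29) that only knows the Combining Diacritical Marks block:

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Simplified stand-in for real grapheme segmentation: only code points in the
// Combining Diacritical Marks block (U+0300..U+036F) are attached to the
// preceding code point. A real implementation needs full Unicode property data.
bool is_combining_mark(char32_t c)
{
    return c >= 0x0300 && c <= 0x036F;
}

// Hypothetical helper: splits text into clusters and hands each one to
// render_cluster, mirroring the block passed to Gosu::Font.new above.
void for_each_cluster(const std::u32string& text,
                      const std::function<void(const std::u32string&)>& render_cluster)
{
    std::size_t start = 0;
    while (start < text.size()) {
        std::size_t end = start + 1;
        // Extend the cluster over any combining marks that follow the base.
        while (end < text.size() && is_combining_mark(text[end])) ++end;
        render_cluster(text.substr(start, end - start));
        start = end;
    }
}
```

The callback only ever sees substrings, which is the design choice under discussion: the segmentation lives in one place, and the rendering of each cluster can be swapped out.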

This wouldn't work with Harfbuzz, which is not based on substrings, but on glyph IDs.

Of course that's a theoretical problem right now, as Gosu doesn't handle bidirectional text (like Arabic) at all and Harfbuzz won't be of help for that either:

https://pdm.me/harfbuzz/hello-harfbuzz.html#what-harfbuzz-doesnt-do

I think it's good enough if Font is based on grapheme clusters (which I can pass around as strings), plus maybe a little bidi algorithm later. For all kinds of ligatures, we still have Image.from_text.

Thanks for the recommendation, though. I have been playing around with OSM map rendering and I didn't know what the libharfbuzz dependency was used for.

jlnr · 2017-04-03T06:23:58Z

Refactoring Font to use substrings instead of wchar_t internally is not trivial, so this should probably be fixed together with #304.

jlnr added the enhancement, graphics, and platform-windows labels on Jan 6, 2017

jlnr self-assigned this on Jan 6, 2017

jlnr mentioned this issue on Mar 21, 2017: Print a warning when relying on case insensitivity #363 (Open)

jlnr mentioned this issue on Nov 26, 2017: Use stb_truetype for all text rendering #422 (Merged)

jlnr closed this as completed in #422 on Feb 6, 2018

jlnr mentioned this issue on Feb 7, 2018: Refactor Gosu::Font #429 (Merged)

jlnr mentioned this issue on May 4, 2018: Font refactoring, font fallbacks #448 (Merged)
