examples: add shaped-text2svg for generating SVGs from shaped Unicode text. #70

Draft
wants to merge 1 commit into
base: master

eddyb
Contributor

@eddyb eddyb commented Jun 22, 2023

This is based on the ttf-parser example font2svg, and uses a combination of unicode-bidi and rustybuzz on top of it, to offer a relatively compact but (hopefully) complete usage example for rustybuzz.
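To make the "shape, then draw outlines" step concrete, here is a minimal standard-library sketch of how shaped glyphs might be placed before their outlines are written to the SVG. It is not the code from this PR: `ShapedGlyph` and `place_glyphs` are hypothetical names, and the fields only mirror the HarfBuzz-style advance/offset data (in font units) that a shaper is assumed to return.

```rust
/// Hypothetical stand-in for one shaped glyph; all values are in font units.
#[derive(Debug, Clone, Copy)]
struct ShapedGlyph {
    glyph_id: u16,
    x_advance: i32,
    y_advance: i32,
    x_offset: i32,
    y_offset: i32,
}

/// Returns (glyph_id, x, y) per glyph, scaled to `font_size`.
/// Assumption: the font's y axis points up and SVG's points down, so y is negated.
fn place_glyphs(glyphs: &[ShapedGlyph], units_per_em: u16, font_size: f32) -> Vec<(u16, f32, f32)> {
    let scale = font_size / units_per_em as f32;
    let (mut pen_x, mut pen_y) = (0i32, 0i32);
    let mut placed = Vec::with_capacity(glyphs.len());
    for g in glyphs {
        // Offsets move only this glyph; advances move the pen for the next one.
        let x = (pen_x + g.x_offset) as f32 * scale;
        let y = -((pen_y + g.y_offset) as f32) * scale;
        placed.push((g.glyph_id, x, y));
        pen_x += g.x_advance;
        pen_y += g.y_advance;
    }
    placed
}
```

Each `(x, y)` pair could then become a `translate(x, y)` on the path that holds that glyph's outline.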

While discussing such a self-contained "complete example" with @Manishearth, he mentioned that it may be possible for rustybuzz to offer a "complete Unicode bidirectional shaping solution", to avoid having the user correctly use unicode-bidi etc.


Using Go Noto Universal 6.0's GoNotoCurrent.ttf, and the UDHR, I was able to get some examples:
(all images are chosen samples, with links above them for the full original version, due to GitHub limitations)

| Lang | shaped-text2svg output | diff w/ browser rendering |
| --- | --- | --- |
| eng | full SVG | full HTML (--- are misaligned; all languages hit this) |
| arb | full SVG | full HTML (Latin glyphs appear to misalign Arabic ones) |
| hin | full SVG | full HTML (no idea what's going on here, more investigation needed) |
| cmn_hans | full SVG | full HTML ((III) confirmed to shape differently in browser vs rustybuzz) |

A few notes about the diff in the last column:

  • I haven't published the script I'm using because it's frankly a mess and less automated than I'd like, but I suspect some people might want it even integrated into the example itself (or at least available somewhere)
  • I'm overlapping the SVG and HTML text 1:1 and using CSS mix-blend-mode: difference;
  • it's not perfect because of what I assume is anti-aliasing/fine-positioning differences between the SVG paths and the HTML text, but it's close enough that you only see the outline (i.e. where grayscale anti-aliasing is used, not the fill of the glyphs) when shaping matches "perfectly"
  • something weird is going on with this font and its browser rendering of e.g. (III), compared to rustybuzz
    • (III) has all 5 glyphs aligned at the top in the browser, but vertically centered in rustybuzz
    • I think a lot of the mismatches are just oddities like that entirely confined to ASCII/Latin, which then cause the rest of the non-ASCII/Latin text to be misaligned
    • hopefully this is just me misusing rustybuzz and/or ttf-parser APIs, but at this point I'm not sure
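For readers unfamiliar with the blend mode used for the diff: `difference` takes the per-channel absolute difference of the two layers, so identical pixels turn black and only mismatches stay visible. Below is a rough sketch of the same idea on raw pixels, assuming fully opaque 8-bit RGB; `blend_difference` and `count_differing` are hypothetical helpers, not part of the unpublished script.

```rust
/// Per-channel "difference" blend for opaque 8-bit RGB: |backdrop - source|.
fn blend_difference(backdrop: [u8; 3], source: [u8; 3]) -> [u8; 3] {
    let diff = |a: u8, b: u8| a.max(b) - a.min(b);
    [
        diff(backdrop[0], source[0]),
        diff(backdrop[1], source[1]),
        diff(backdrop[2], source[2]),
    ]
}

/// Counts pixels whose difference exceeds `threshold` in any channel.
/// A small threshold ignores faint anti-aliasing outlines and keeps real mismatches.
fn count_differing(a: &[[u8; 3]], b: &[[u8; 3]], threshold: u8) -> usize {
    a.iter()
        .zip(b)
        .filter(|(p, q)| blend_difference(**p, **q).iter().any(|&c| c > threshold))
        .count()
}
```

Counting above a threshold is one possible way to automate the comparison; the right threshold would have to be found by experiment.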

TODO: try more languages, maybe emoji (hard to mix emoji & non-emoji w/o font fallback), try to improve diffing against browser rendering

@eddyb
Contributor Author

eddyb commented Jun 23, 2023

Update: I've narrowed down most of the weird differences caused by ASCII to locl - some differences go away if I do font-feature-settings: "locl" 0; in the browser and likewise disabling locl in rustybuzz.
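As an illustration of what `"locl" 0` means to a shaper, the sketch below parses that CSS-style setting into a four-byte OpenType tag plus a value (0 disables the feature). `FeatureSetting` and `parse_feature_setting` are hypothetical names using only the standard library; the equivalent rustybuzz call is deliberately not shown here.

```rust
/// Hypothetical representation of one feature toggle: packed tag plus value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FeatureSetting {
    tag: u32,   // four ASCII bytes packed big-endian, e.g. b"locl"
    value: u32, // 0 = off, 1 = on, larger values select alternates
}

/// Parses a single CSS-like setting such as `"locl" 0;` or `"liga" off`.
/// Per CSS, a missing value or `on` means 1, and `off` means 0.
fn parse_feature_setting(css: &str) -> Option<FeatureSetting> {
    let css = css.trim().trim_end_matches(';').trim();
    let rest = css.strip_prefix('"')?;
    let (name, rest) = rest.split_once('"')?;
    let b = name.as_bytes();
    if b.len() != 4 {
        return None; // OpenType tags are exactly four bytes
    }
    let value = match rest.trim() {
        "" | "on" => 1,
        "off" => 0,
        n => n.parse::<u32>().ok()?,
    };
    Some(FeatureSetting {
        tag: u32::from_be_bytes([b[0], b[1], b[2], b[3]]),
        value,
    })
}
```

Applying the same tag and value on both sides (browser and shaper) is what keeps the comparison like for like.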

Another way to control this is with the lang property in the browser: if I set document.body.lang = "zh" on the cmn_hans example, all the differences in the bulk of the text go away, and new differences appear in the English header at the top.

At this point I would have to port this example to use harfbuzz to be able to tell, but I suspect the default of leaving the language unset is simply different from what browsers do (which may be using additional heuristics?).

EDIT: given that I see no changes when I force en on either side, I think that's quite literally the default (or equivalent to it in whatever OpenType terms) and there's a behavior mismatch within it, without browsers doing anything more sophisticated.

@RazrFalcon
Owner

RazrFalcon commented Jun 23, 2023

Oh wow, thanks! Wasn't expecting someone to dive into this. I was planning to write something like this myself, but didn't have time.

I'm not sure we need full browser compatibility in this demo/example. Even resvg has a far simpler implementation. And it's the reason rustybuzz exists.

As for language and bidi - harfbuzz/rustybuzz are pretty low-level libraries. You cannot use them directly. You do need a text layout library on top of them. Like pango on Linux.

Honestly, I'm not even sure we need bidi in this example. Either way, it's good enough for me already. And if you want to improve it a bit, I do not mind. But we should not try implementing a text layout library in a simple example.

@Manishearth

Manishearth commented Jun 23, 2023

I would recommend having bidi in the example because it's a useful illustration of all the parts needed to handle text right, and prevents people from using the library naively.

(And because bidi is weird and complicated and the integration of a bidi algorithm implementation with a shaping engine is not necessarily immediately obvious)
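To give a feel for that integration step, here is a minimal standard-library sketch. It assumes a bidi implementation (such as unicode-bidi) has already resolved one embedding level per character, then splits the line into level runs and puts them in visual order following rule L2 of UAX #9. `LevelRun`, `level_runs` and `visual_order` are hypothetical names, not APIs of unicode-bidi or rustybuzz.

```rust
use std::ops::Range;

/// A maximal run of characters sharing one bidi embedding level.
#[derive(Debug, Clone, PartialEq)]
struct LevelRun {
    range: Range<usize>, // indices into the per-character level slice
    level: u8,
}

impl LevelRun {
    /// Odd levels are right-to-left; this would pick the shaping direction.
    fn is_rtl(&self) -> bool {
        self.level % 2 == 1
    }
}

/// Splits per-character levels (assumed to come from a bidi algorithm) into runs.
fn level_runs(levels: &[u8]) -> Vec<LevelRun> {
    let mut runs: Vec<LevelRun> = Vec::new();
    for (i, &level) in levels.iter().enumerate() {
        if let Some(last) = runs.last_mut() {
            if last.level == level {
                last.range.end = i + 1;
                continue;
            }
        }
        runs.push(LevelRun { range: i..i + 1, level });
    }
    runs
}

/// Reorders runs from logical to visual order (UAX #9 rule L2, applied to runs):
/// from the highest level down to the lowest odd level, reverse every
/// contiguous sequence of runs at that level or higher.
fn visual_order(mut runs: Vec<LevelRun>) -> Vec<LevelRun> {
    let max = runs.iter().map(|r| r.level).max().unwrap_or(0);
    let min_odd = match runs.iter().map(|r| r.level).filter(|l| l % 2 == 1).min() {
        Some(level) => level,
        None => return runs, // purely left-to-right: nothing to reorder
    };
    for level in (min_odd..=max).rev() {
        let mut i = 0;
        while i < runs.len() {
            if runs[i].level >= level {
                let start = i;
                while i < runs.len() && runs[i].level >= level {
                    i += 1;
                }
                runs[start..i].reverse();
            } else {
                i += 1;
            }
        }
    }
    runs
}
```

In a full pipeline each run would be shaped on its own, with the direction taken from `is_rtl()`, and the shaped runs would be laid out in the returned order. Line breaking, script itemization and font fallback are all left out of this sketch.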

@RazrFalcon
Owner

@Manishearth Depending on your definition of text layout, one can have thousands of lines of code on top of rustybuzz.
Sure, I don't really mind having bidi in this example, but it's still pretty far from a proper text layout.

I do have plans on writing an easy to use text layout/rustybuzz wrapper eventually, but time is not on my side.

> prevents people from using the library naively

Meanwhile I keep telling people to stop using rustybuzz... In the sense that it must not be used directly. You do need a higher-level wrapper for it.
