Fonts can have nonsensical or undefined Ligatures #136

Open
rcombs opened this issue Sep 20, 2014 · 21 comments

Comments

@rcombs
Member

rcombs commented Sep 20, 2014

https://www.dropbox.com/s/fkl0zt127c8qsgw/AB.mkv

Ewps. Happens only when harfbuzz is enabled at runtime.

@astiob
Member

astiob commented Sep 24, 2014

That’s just how the font is. We’re not doing anything wrong. I don’t think there’s even any way to detect this. I mean, see that dot on the left of the “e”? Yeah, that’s the glyph for “ti”. We’d need an AI to determine that it’s a nonsensical glyph for “ti”.

astiob added the "fonts" label and removed the "bug" label on Sep 24, 2014
@Cyberbeing

> That’s just how the font is. We’re not doing anything wrong. I don’t think there’s even any way to detect this.

While that font does have bogus glyphs named as ll & ti in the unicode private_use_area, is there a good reason why they are being automatically used in the first place? The script itself does not specify any combined unicode glyphs, so I think it'd be wise to just disable this automated ligature replacement behavior in Harfbuzz by default for compatibility reasons. If someone really wanted a ligature displayed, they could always add the combined glyph manually to the script.

@rcombs
Member Author

rcombs commented Sep 26, 2014

Having it always disabled kinda defeats the purpose of having a complex shaper that can handle ligatures automatically, but could be a good argument for having it off by default unless some flag is set in the header.

@Cyberbeing

> Having it always disabled kinda defeats the purpose of having a complex shaper that can handle ligatures automatically

Do you have a good example of automatic ligature replacement yielding a clear positive benefit over GPOS kerning pairs with Harfbuzz? If you don't want to disable ligature glyph replacement entirely, maybe using it as a fallback only when a kerning pair does not exist would be a reasonable compromise.

@rcombs
Member Author

rcombs commented Sep 26, 2014

I'm not 100% sure if Harfbuzz can handle it with kerning pairs, but Garamond has e.g. "fi", and handwriting fonts (often used in typesetting!) often have extensive ligatures; take Zapfino for an extreme example. Still, I think we're in agreement that we should disable automatic ligatures by default, at least.

@Cyberbeing

> handwriting fonts (often used in typesetting!) often have extensive ligatures

If these are most critical, another possible option is to only enable automatic ligatures by default for fonts specifying type as 'script' in OS/2 or 'handwritten' in Panose.
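To make the suggestion above concrete, here is a minimal sketch of such a heuristic. It assumes the caller has already read the `sFamilyClass` field and the 10-byte PANOSE array from the font's OS/2 table; the function name `ligatures_on_by_default` is hypothetical and not part of libass.

```python
# OS/2 sFamilyClass: the high byte is the class ID, the low byte the subclass.
OS2_CLASS_SCRIPTS = 10
# PANOSE byte 0 (bFamilyType): 3 means "Latin Hand Written".
PANOSE_FAMILY_HAND_WRITTEN = 3


def ligatures_on_by_default(s_family_class: int, panose: bytes) -> bool:
    """Hypothetical heuristic: enable automatic ligatures only for fonts
    that declare themselves as script or handwritten."""
    class_id = (s_family_class >> 8) & 0xFF
    family_type = panose[0] if panose else 0
    return (class_id == OS2_CLASS_SCRIPTS
            or family_type == PANOSE_FAMILY_HAND_WRITTEN)
```

A heuristic like this depends on the font filling in these fields correctly, which a font with broken ligatures may well not do.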

@torque
Collaborator

torque commented Sep 27, 2014

Clearly the best and only solution is to have a giant internal lookup table of known bad fonts which have broken ligatures.

After all, it's very important that libass go out of its way to make sure that broken fonts are displayed correctly.

@line0
Contributor

line0 commented Sep 27, 2014

You can't possibly be serious about disabling OpenType features, is this the 80s or what? Ligatures (and other GSUBs for that matter) are there for a reason, it's how the font is meant to be displayed. If the font specifies garbage glyphs as replacements, then that's obviously how the font is meant to be rendered by a modern text renderer/shaper. Having users manually add ligature glyphs to their scripts is possibly the most retarded idea I've ever heard of. Just because this font just so happens to have assigned unicode code points from the private use area to the replacement glyphs, this is by no means true for all fonts (which may not assign code points to those glyphs at all). Just because VSFilter is permanently stuck in the early 90s with regards to text rendering, that doesn't mean modern renderers have to adapt to make theirs equally shitty, just to support that one broken font.

@rcombs
Member Author

rcombs commented Sep 27, 2014

I think the best option is to disable automatic ligatures by default in current ASS scripts for compatibility with broken fonts, which are probably somewhat prevalent with scripts that were only tested with VSFilter and fonts that were possibly poorly-stripped, but enable them by default in ASSv5 with a tag to turn them off, and possibly provide a script-wide flag in current ASS to enable them. With existing scripts, the worst effect would be that text renders the same way as in VSFilter, which is probably what the original author expected anyway. Yes, manually adding ligature glyphs in PUA is ridiculous.

@Cyberbeing

At the very least there needs to be a toggle somewhere to work around fonts with broken ligatures in existing scripts, even if this workaround is disabled by default. As long as a workaround is easy enough for an end-user to enable on demand, I don't see it as a major issue if fonts like this one are broken by default.

Though you have to keep in mind that it's not just about compatibility for scripts authored with VSFilter, but also those authored with Libass FriBidi, since up to this point Aegisub has never built Libass with Harfbuzz support. This makes it very unlikely that any advanced shaping issues would be discovered during the authoring stage, which is a problem in itself.

> You can't possibly be serious about disabling OpenType features, is this the 80s or what?

Wouldn't be the first time this has happened because of broken fonts. Ever since 2009, Libass has had kerning support disabled by default on ASS scripts, along with an optional script header to re-enable kerning if desired: 0d3ddc1

> this is by no means true for all fonts (which may not assign code points to those glyphs at all).

I came to realize this after talking to rcombs on IRC yesterday and seeing the actual Zapfino font, which has a huge number of ligatures without code points. Most fonts I had looked at before that only had ff, fl, fi ligatures and such, which actually have standardized Unicode code points. Disregard my prior comment, which naively assumed all fonts added extra ligatures to the PUA.

> I think the best option is to disable automatic ligatures by default in current ASS scripts for compatibility with broken fonts, which are probably somewhat prevalent

Well, that's an open question. How prevalent really are font issues such as this? Disabling by default for all fonts may be a bit extreme depending on the answer, and also assuming ligatures cause no other issues with ASS override tags. If ligatures are generally unharmful(?) to ASS scripts, there would be no reason to completely restrict them to a future ASSv5 spec.

@astiob
Member

astiob commented Sep 27, 2014

I’m normally completely pro-VSFilter-compatibility, but frankly, if it isn’t shown that this problem is more widespread, I’m −½ on disabling ligatures by default. Not −1 because I’m still pro-VSFilter-compatibility and because it probably isn’t very easy to gauge how common this problem in fact is. Let’s start with some small steps: does anyone have at least one other similarly broken sample? And how popular is the sample that we have here? Is it even an actual public release? And did the group—or are they now willing to—release a patch that fixes the font?

I’d be perfectly happy with a switch to disable them while keeping them on by default. Also, if it does get shown that the issue is actually widespread, I’ll want them disabled by default after all, even if it makes me somewhat sad.

Also, horrible hackish workaround for lazy typesetters: adding any non-zero \fsp, even very small, disables ligatures in recent versions of libass.
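As a rough illustration of that workaround, the sketch below prefixes the Text field of an ASS `Dialogue:` line with a tiny `\fsp` override. The helper name `add_tiny_fsp` and the value `0.01` are assumptions made for this example; whether a given libass version really drops ligatures for such a value is taken from the comment above and not verified here.

```python
def add_tiny_fsp(dialogue_line: str, spacing: str = "0.01") -> str:
    """Prefix the Text field of an ASS 'Dialogue:' line with {\\fsp<spacing>}.

    Lines that are not Dialogue events are returned unchanged.
    """
    prefix = "Dialogue:"
    if not dialogue_line.startswith(prefix):
        return dialogue_line
    # Text is the 10th field and may itself contain commas,
    # so split on at most the first nine commas.
    fields = dialogue_line[len(prefix):].split(",", 9)
    if len(fields) < 10:
        return dialogue_line
    fields[9] = "{\\fsp" + spacing + "}" + fields[9]
    return prefix + ",".join(fields)
```

Note that a later `\fsp0` or `\r` override inside the same line would presumably undo the effect, so this only covers the simple case.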

> ASSv5

This joke is getting old. But no, sure, it doesn’t hurt to think about it, as long as we also fully consider the reality of ASS. Just saying. (Also, I’m certain I’ve already said this on IRC, but the versions go like this: ASS = ASSv1 = SSAv4+ = SSAv5; ASSv2 = SSAv4++ = SSAv6. Yes, ASSv2 is a thing. No, noöne uses it, but it is a thing nevertheless. Indeed, I find it strange that noöne uses it. Regardless, “ASSv5” makes no sense.)

> with a tag to turn them off

Pls no. Why should users ever be able to turn ligatures off? …Hmm, maybe for typesetting; this needs more thought. But certainly not for regular subtitles. Neither ligatures nor kerning, nor any other great OpenType features. …Hmm, broken fonts… b-but why is anyone using broken fonts in the first place? Why should we (when we’re not restrained by backwards compatibility) be more tolerant to fonts than any other modern piece of software that renders text?

@line0
Contributor

line0 commented Sep 27, 2014

> Pls no. Why should users ever be able to turn ligatures off? …Hmm, maybe for typesetting; this needs more thought.

Turning off individual OpenType features is indeed useful for typesetting purposes (think contextual/random alternates etc.), and the feature should not be limited to ligatures only.

@rcombs
Member Author

rcombs commented Sep 29, 2014

Re: the file in question, it's from joseole99's Angel Beats 720p release, which is on BakaBT and uses tormaid's subs.

@sl1pkn07

sl1pkn07 commented Dec 10, 2014

Here is another example:

https://sl1pkn07.wtf/libass/ligature_example%28spanish%29.mkv

Error loading glyph, index 714

libass (linux)
[screenshot: shot0001]

vsfilter (windows)
[screenshot: anaf-ynk harmonie bd 720p 33f277d7 mkv_snapshot_02 23_ 2014 12 10_15 18 32]

@grigorig grigorig self-assigned this Jun 17, 2015
@sl1pkn07

Any news on this?

@sl1pkn07

sl1pkn07 commented Feb 8, 2019

Hi

Any news on a fix?

greetings

@astiob
Member

astiob commented Feb 8, 2019

The font in the last example doesn’t have the ligature glyph at all, which is different from the first example. @grigorig @behdad Is it possible to make HarfBuzz ignore ligatures that transform into undefined glyphs?

@sl1pkn07

sl1pkn07 commented Feb 8, 2019

Re-uploaded the vsfilter/windows pic.

@behdad
Contributor

behdad commented Feb 8, 2019

> Is it possible to make HarfBuzz ignore ligatures that transform into undefined glyphs?

No.

TheOneric changed the title from "Ligatures go missing in sample" to "Fonts can have nonsensical or undefined Ligatures" on Dec 18, 2021
@TheOneric
Member

TheOneric commented Oct 9, 2022

Summarising information from past discussion here and at other places:

  • Some (very few?) broken fonts have nonsensical ligatures, the only example so far being the one from OP
  • Some (few?) broken fonts have ligatures whose glyphs are undefined, the two examples being the one from sl1pkn07's comment and the one in issue #228, “Square character on libass, correct character on vsfilter”.
  • Some fonts require ligatures to display sensibly, a notable common'ish example being most handwriting fonts.
  • It is not possible to make HarfBuzz ignore ligatures transforming to undefined glyphs
  • libass always uses “required ligatures” (rlig, preënabled by HarfBuzz), “standard ligatures” (liga) and “contextual ligatures” (clig). libass currently never uses “discretionary ligatures” (dlig) and “historical ligatures” (hlig). These settings match Microsoft's recommendations for default settings.
  • In its regular rendering path, VSFilter does not use "ligatures", though it hasn't been tested whether this actually covers all ligature groups; in particular, rlig might be applied anyway
  • Sometimes VSFilter does use ligatures. In a different issue it was discovered that when some magic heuristic kicks in, GDI falls back to a different rendering path aware of (more) OpenType features and this supposedly includes ligatures though it isn't clear which specific groups of ligatures. E.g. including U+200B ZERO WIDTH SPACE in every glyph run should trigger this.
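The ligature defaults listed above can be written down as HarfBuzz-style feature strings (the `+tag` / `-tag` syntax accepted by `hb_feature_from_string` and `hb-shape --features`). This is only a sketch: the `disable_optional` switch stands in for the hypothetical toggle discussed in this thread and does not correspond to an existing libass option.

```python
from typing import List

# Defaults as summarised above: True = always applied by libass,
# False = currently never applied.
DEFAULT_LIGATURE_FEATURES = {
    "rlig": True,   # required ligatures (pre-enabled by HarfBuzz)
    "liga": True,   # standard ligatures
    "clig": True,   # contextual ligatures
    "dlig": False,  # discretionary ligatures
    "hlig": False,  # historical ligatures
}


def feature_strings(disable_optional: bool = False) -> List[str]:
    """Return feature strings such as "+liga" or "-dlig".

    With disable_optional=True (hypothetical toggle), everything except
    the required ligatures is switched off.
    """
    out = []
    for tag, enabled in DEFAULT_LIGATURE_FEATURES.items():
        if disable_optional and tag != "rlig":
            enabled = False
        out.append(("+" if enabled else "-") + tag)
    return out
```

Keeping `rlig` on even when the toggle is set reflects the idea that required ligatures are needed for the font to display sensibly at all; that choice is an assumption of this sketch.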

@astiob
Member

astiob commented Oct 10, 2022

> though it hasn't been tested if this actually includes all ligature groups, in particular rlig might be applied anyway

I’m 99% certain that it isn’t, that the GDI path doesn’t know what “OpenType” is, much less an “OpenType feature”, and that it is unable to perform any kind of glyph transformations. (Oh, that makes me wonder… Do vertical fonts always use Uniscribe?)

> when some magic heuristic kicks in, GDI falls back to a different rendering path aware of (more) OpenType features

The different path employs Uniscribe, which is Microsoft’s (legacy) OpenType engine. The magic heuristic is presumably ScriptIsComplex. The real question is how that function is implemented.
