Broken Kannada shaping? #347

Closed
brawer opened this issue Oct 11, 2016 · 16 comments

Comments

@brawer
Contributor

brawer commented Oct 11, 2016

For NotoSerifKannada-Regular.ttf, HarfBuzz emits broken output while CoreText output looks good for the same font. This might be a bug in HarfBuzz. However, it might also be that CoreText has a bug in its Kannada shaping; I don’t know if Noto’s font designers have tested their font on multiple OpenType implementations or just on CoreText. So it might well be that the font (and CoreText) are buggy. If somebody who knows more about Kannada shaping could analyze this, it would be great.

I’ve added this as a test case to Unicode’s text rendering test suite, using CoreText to generate the expected renderings. Should it turn out that the HarfBuzz implementation is correct and CoreText (and the Noto font) are both wrong, I’ll of course adjust the test suite.

[Screenshot of broken Kannada shaping]

See https://github.com/googlei18n/noto-fonts/issues/759 for the corresponding bug report on the NotoSerifKannada font.

@KrasnayaPloshchad

I’m sure this is a bug in HarfBuzz; see the discussion in notofonts/noto-fonts#341.

@lianghai

I wrote GSUB rules for all Noto Serif Indic fonts, and I can confirm it's a HarfBuzz issue. HarfBuzz's handling of the "GDEF" table seems to be nonstandard, which leads to this issue.

Noto Serif Kannada relies on lookupflag IgnoreMarks; for some lookups to work properly. The lookup containing rules such as sub knKA knmI by knKI; also has lookupflag IgnoreMarks; (to skip Nukta). However, HarfBuzz wrongly treats vowel signs as marks (derived from UCD?), even though the "GDEF" table does not define vowel signs as GDEF marks.

A patched Noto Serif Kannada with a minimal change (setting the <LookupFlag value="8"/> in <Lookup index="4"> to <LookupFlag value="0"/>) demonstrates how this works:
NotoSerifKannada-Regular-patched.ttf.zip
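
For readers unfamiliar with the numeric form: the value 8 in that LookupFlag is the IgnoreMarks bit, so setting it to 0 turns mark skipping off for that lookup. Below is a minimal Python sketch that decodes a LookupFlag value using the bit assignments listed in the OpenType specification. The helper name decode_lookup_flag is made up for illustration and is not part of any font tool.

```python
from typing import List

# LookupFlag bit assignments as listed in the OpenType specification.
LOOKUP_FLAG_BITS = {
    0x0001: "RightToLeft",
    0x0002: "IgnoreBaseGlyphs",
    0x0004: "IgnoreLigatures",
    0x0008: "IgnoreMarks",
    0x0010: "UseMarkFilteringSet",
}


def decode_lookup_flag(value: int) -> List[str]:
    """Return the names of the flags set in a numeric LookupFlag value."""
    names = [name for bit, name in LOOKUP_FLAG_BITS.items() if value & bit]
    # The high byte holds the mark attachment type filter, if any.
    mark_attachment_type = (value & 0xFF00) >> 8
    if mark_attachment_type:
        names.append(f"MarkAttachmentType={mark_attachment_type}")
    return names


if __name__ == "__main__":
    print(decode_lookup_flag(8))  # the original flag on Lookup index 4
    print(decode_lookup_flag(0))  # the patched flag
```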

@lianghai

@KrasnayaPloshchad I don't think this issue is related to notofonts/noto-fonts#341 though.

@lianghai

While patching SamsungOne fonts I've noticed HarfBuzz's exact issue seems to be:

When a Unicode-mapped glyph's GDEF class is undefined in the GDEF table, HarfBuzz defaults its GDEF class to a value derived from UCD (e.g., Mn becomes GDEF class 3 "mark glyph").

However, according to the OT spec (https://www.microsoft.com/typography/otspec/GDEF.htm):

> The font developer does not have to classify every glyph in the font, but any glyph not assigned a class value falls into Class zero (0).

This is indeed the common behavior of text engines (including earlier versions of HarfBuzz, e.g., 0.9.40 and 1.0.5, according to Samsung's report). With the expected default of class 0, a glyph not assigned a class value in the GDEF table should not be skipped by lookupflag IgnoreMarks; (which ignores class 3).

For now, my workaround is to patch the fonts by assigning non-mark glyphs the class value of 1 "base glyph", as was done (or was intended to be done?) in Nirmala UI.
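
To make the difference between the two defaulting policies concrete, here is a small Python simulation. It is only a sketch under stated assumptions, not HarfBuzz code: the glyph name knNukta, the two lookup tables, and the simplified ligature matcher are illustrative, and the Unicode categories are entered by hand to mirror what the thread describes.

```python
from typing import Callable, List, Sequence

GDEF_UNCLASSIFIED = 0  # class zero: glyph not assigned a class in GDEF
GDEF_MARK = 3          # GDEF class 3, "mark glyph"

# Assumed font data for illustration: only the nukta is classified in GDEF.
GDEF_CLASSES = {"knNukta": GDEF_MARK}
# Assumed Unicode general categories of the characters behind each glyph.
UNICODE_CATEGORY = {"knKA": "Lo", "knNukta": "Mn", "knmI": "Mn"}


def glyph_class(glyph: str, synthesize_from_unicode: bool) -> int:
    """Return the GDEF class of a glyph under one of two defaulting policies."""
    if glyph in GDEF_CLASSES:
        return GDEF_CLASSES[glyph]
    if synthesize_from_unicode and UNICODE_CATEGORY.get(glyph) == "Mn":
        return GDEF_MARK  # the behavior reported in this issue
    return GDEF_UNCLASSIFIED  # the behavior the OT spec describes


def apply_ligature(glyphs: Sequence[str], components: Sequence[str],
                   ligature: str, is_ignored: Callable[[str], bool]) -> List[str]:
    """Apply one ligature rule, skipping ignored glyphs between components."""
    out: List[str] = []
    i = 0
    while i < len(glyphs):
        matched: List[int] = []
        j = i
        if not is_ignored(glyphs[i]):
            for want in components:
                while j < len(glyphs) and is_ignored(glyphs[j]):
                    j += 1
                if j == len(glyphs) or glyphs[j] != want:
                    matched = []
                    break
                matched.append(j)
                j += 1
        if matched:
            out.append(ligature)
            # Skipped glyphs (e.g. a nukta) are kept after the ligature.
            out.extend(glyphs[k] for k in range(i, j) if k not in matched)
            i = j
        else:
            out.append(glyphs[i])
            i += 1
    return out


def shape(glyphs: Sequence[str], synthesize_from_unicode: bool) -> List[str]:
    """Run 'sub knKA knmI by knKI;' with lookupflag IgnoreMarks."""
    def is_mark(glyph: str) -> bool:
        return glyph_class(glyph, synthesize_from_unicode) == GDEF_MARK
    return apply_ligature(glyphs, ("knKA", "knmI"), "knKI", is_mark)


if __name__ == "__main__":
    print(shape(["knKA", "knmI"], synthesize_from_unicode=False))
    print(shape(["knKA", "knmI"], synthesize_from_unicode=True))
```

With class-zero defaulting the vowel sign takes part in the rule and the ligature forms. With Unicode-derived classes the vowel sign is itself skipped as a mark, so the rule never matches.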

@behdad
Member

behdad commented Dec 18, 2016

@lianghai The GDEF behavior you mention is definitely not expected. HarfBuzz only fills in GDEF classes from Unicode data if there is no GDEF table present...

Ugh. Now I see that this was changed in March, in: 69f9fbc
I suppose I can special-case Indic-like shapers to disable this and use the previous logic.

@lianghai

@behdad Ah, I see… Though, given the clearly defined behavior of "any glyph not assigned a class value falls into Class zero (0)" in the OT spec, shouldn't the special case actually be Hebrew, which does synthesize GDEF glyph classes (rather than Indic-like shapers, which do not)?

@behdad
Member

behdad commented Dec 22, 2016

@lianghai Yeah, I'm going to revert this, and make the Hebrew shaper ignore GDEF and GPOS (and use fallback for both) if the GPOS Hebrew script system is not found.
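
The decision described here can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the code that landed in HarfBuzz: the function name and the idea of passing in the font's GPOS script tags as a plain list are assumptions. The only fixed fact is that "hebr" is the OpenType script tag for Hebrew.

```python
from typing import Iterable

HEBREW_SCRIPT_TAG = "hebr"  # OpenType script tag for Hebrew


def hebrew_should_use_fallback(gpos_script_tags: Iterable[str]) -> bool:
    """Return True when the Hebrew shaper should ignore the font's GDEF and
    GPOS tables and rely on fallback logic for both (hypothetical helper)."""
    return HEBREW_SCRIPT_TAG not in set(gpos_script_tags)


if __name__ == "__main__":
    # A font whose GPOS table covers Latin and Arabic only: use fallback.
    print(hebrew_should_use_fallback(["latn", "arab"]))
    # A font with a Hebrew script system in GPOS: trust the font's tables.
    print(hebrew_should_use_fallback(["latn", "hebr"]))
```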

@behdad behdad closed this as completed in 1a0f4aa Dec 22, 2016
behdad added a commit that referenced this issue Dec 22, 2016
New approach to fix this:
69f9fbc

Previous approach was reverted as it was too broad.  See context:
#347 (comment)

With U+05E9,U+05B8,U+05C1,U+05DC and Arial Unicode, we now (correctly) disable
GDEF and GPOS, so we get results very close to Uniscribe, but slightly different
since our fallback position logic is not exactly the same:

Before:		[gid1166=3+991|gid1142=0+737|gid5798=0+1434]
After:		[gid1166=3+991|gid1142=0@402,-26+0|gid5798=0+1434]
Uniscribe:	[gid1166=3+991|gid1142=0@348,0+0|gid5798=0+1434]
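
The three lines above use the glyph-record notation printed by HarfBuzz's hb-shape utility: a glyph name, then =cluster, an optional @x_offset,y_offset, and finally +x_advance. The following Python sketch parses that notation so the before/after difference is easier to see. The Glyph type and parse_buffer function are made up for this example and handle only the fields that appear in these lines.

```python
import re
from typing import List, NamedTuple


class Glyph(NamedTuple):
    name: str
    cluster: int
    x_offset: int
    y_offset: int
    x_advance: int


# name=cluster[@x_offset,y_offset]+x_advance
_RECORD = re.compile(
    r"^(?P<name>[^=]+)=(?P<cluster>\d+)"
    r"(?:@(?P<xo>-?\d+),(?P<yo>-?\d+))?"
    r"\+(?P<adv>-?\d+)$"
)


def parse_buffer(text: str) -> List[Glyph]:
    """Parse a serialized buffer such as '[gid1=0+500|gid2=0@10,-5+0]'."""
    glyphs = []
    for record in text.strip().strip("[]").split("|"):
        m = _RECORD.match(record)
        if m is None:
            raise ValueError(f"unrecognized glyph record: {record!r}")
        glyphs.append(Glyph(
            name=m["name"],
            cluster=int(m["cluster"]),
            x_offset=int(m["xo"] or 0),  # offsets default to 0 when omitted
            y_offset=int(m["yo"] or 0),
            x_advance=int(m["adv"]),
        ))
    return glyphs


if __name__ == "__main__":
    before = parse_buffer("[gid1166=3+991|gid1142=0+737|gid5798=0+1434]")
    after = parse_buffer("[gid1166=3+991|gid1142=0@402,-26+0|gid5798=0+1434]")
    for b, a in zip(before, after):
        if b != a:
            print(f"{b.name}: advance {b.x_advance} -> {a.x_advance}, "
                  f"offset ({b.x_offset},{b.y_offset}) -> ({a.x_offset},{a.y_offset})")
```

Read this way, the only glyph that changes is gid1142: with fallback positioning it is given zero advance and an offset, where before it took up 737 units of advance width.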
@MayuraVerma

Is this fixed now?
When will this change be reflected in the Chrome browser?

iongchun pushed a commit to iongchun/harfbuzz that referenced this issue Jan 12, 2017
@MayuraVerma

Thank you for addressing this issue.

I see that we have resolved this.

But when will we see this update in Chrome? When do they incorporate the updated HarfBuzz?

Could you please help me understand what version of HarfBuzz is currently used in Chrome?

@behdad
Member

behdad commented Mar 1, 2017

> Could you please help me understand what version of HarfBuzz is currently used in Chrome?

I don't know, but it should eventually make it to Chrome Stable. No further action is needed.

@drott yet another nag to have the HarfBuzz version visible in chrome://version or wherever...

@drott
Collaborator

drott commented Mar 1, 2017

https://chromium.googlesource.com/chromium/src/ can be used to look at the HarfBuzz version that Chromium is compiled with. The first release version of HarfBuzz after e2b8780 is 1.4.0. In Chrome 57 we have rolled to HarfBuzz 1.4.1. @MayuraVerma you can try with Chrome Beta, which is at version 57.

@MayuraVerma

@drott Thank you. This is with the Chrome beta version. NotoSerifKannada is finally rendering text correctly with the updated HarfBuzz.

[Screenshot, 2017-03-04 at 4:08:05 pm]

A few questions, which may be off-topic; please bear with me.

  1. Could we request that Chrome on iOS use HarfBuzz? Apple CoreText is very buggy with Indian languages.
  2. Could Chrome on iOS allow changing fonts for specific languages, so that users can use "Google Noto Sans/Serif fonts" instead of Apple fonts for Indian languages (which are again super buggy)? On Mac, I am using "advanced font settings" to achieve this. If this could be built into both the desktop and mobile versions, it would save users from India a lot of headaches.

Thanks.

@brawer
Contributor Author

brawer commented Mar 5, 2017

> Apple CoreText is very buggy with Indian languages.

@MayuraVerma, do you know specific Unicode sequences that are broken? Feel free to add test cases to Unicode’s text rendering tests. Or simply file a bug with a list of strings that can be converted to test cases. Apple keeps an eye on the Unicode test suite, and they’ve fixed bugs in the past.

@MayuraVerma

@brawer sure, I will do that.

@drott
Collaborator

drott commented Mar 6, 2017

> Could we request that Chrome on iOS use HarfBuzz? Apple CoreText is very buggy with Indian languages.

Chrome on iOS uses Safari's WebKit as its layout engine; this is not something we can currently change.

> Could Chrome on iOS allow changing fonts for specific languages, so that users can use "Google Noto Sans/Serif fonts" instead of Apple fonts for Indian languages (which are again super buggy)? On Mac, I am using "advanced font settings" to achieve this. If this could be built into both the desktop and mobile versions, it would save users from India a lot of headaches.

For changing default fonts in iOS Chrome (which would still go through CoreText), I would suggest filing a feature request on crbug.com.

Otherwise, @brawer 's suggestion is probably your best bet for getting CoreText bugs fixed.

@MayuraVerma

@drott

Is it a limitation of iOS that browsers are not allowed to use their own text engine?

For changing default fonts: I will submit the cr right away. Besides the Apple CoreText issue with the Kannada language, Apple's system default font KannadasangamMN (TTF) has a lot of bugs in it.

Solving the font issue with Google fonts and reporting CoreText issues is the right way for me to proceed. Apple is slowly replacing its TTF fonts with a new set of OTF fonts for Indian languages. When they do that, the CoreText engine should not have issues.
