Balinese shaping seems broken #387

Closed
brawer opened this issue Jan 5, 2017 · 11 comments

@brawer
Contributor

brawer commented Jan 5, 2017

Compare the rendering of ᬓ᭄ᬓᬼ (U+1B13 U+1B44 U+1B13 U+1B3C) in HarfBuzz 1.4.0 versus CoreText on macOS 10.12.2, using NotoSansBalinese-Regular.ttf:

HarfBuzz: image

CoreText: image
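
For readers who want to confirm which characters make up the test string, here is a minimal sketch using only Python's standard `unicodedata` module. The helper name `describe` is made up for illustration; the sketch only lists the code points quoted in the report and does not perform any shaping.

```python
import unicodedata

# The four code points quoted in the report.
TEXT = "\u1B13\u1B44\u1B13\u1B3C"

def describe(text):
    """Return (code point label, Unicode character name) pairs, one per character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

for label, name in describe(TEXT):
    print(label, name)
```

Feeding the same string to a shaper and comparing glyph output is a separate step that this sketch does not cover.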

@punchcutter
Collaborator

This is the same as seen in https://github.com/googlei18n/noto-fonts/issues/572

@punchcutter
Collaborator

I forgot to mention: even if I fix the font issue, the shaping still doesn't work because of the update to hb-ot-shape-complex-use.cc where now there's a decomposition for Balinese 1B3C. I don't understand what that's there for, since there's no canonical decomposition and, from what I can tell, the decomposition should be done in the font. This version of the font decomposes 1B3C into 1B42 and g170 (an unencoded glyph for the bottom half), but if I switch the order to g170 1B42, shaping works as expected.
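
The claim that 1B3C has no canonical decomposition can be checked against the Unicode data bundled with Python. This is a minimal sketch; the function name `canonical_decomposition` is an assumption introduced here, and the result reflects whatever Unicode version the Python build ships.

```python
import unicodedata

def canonical_decomposition(cp):
    """Return the canonical decomposition of a code point as a list of ints (empty if none)."""
    raw = unicodedata.decomposition(chr(cp))
    # Compatibility mappings start with a tag such as "<compat>"; those are not canonical.
    if not raw or raw.startswith("<"):
        return []
    return [int(part, 16) for part in raw.split()]

# An empty list means Unicode defines no canonical decomposition for U+1B3C.
print(canonical_decomposition(0x1B3C))
```

An empty result only says that normalization will not split the character; any split into top and bottom halves then has to come from the shaper or from the font's own substitutions.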

@behdad
Member

behdad commented Jan 9, 2017

> the update to hb-ot-shape-complex-use.cc where now there's a decomposition for Balinese 1B3C

Err. That's my bad. Let me fix.

behdad added a commit that referenced this issue Jan 9, 2017
We had added this in the Indic shaper to assist shaping these scripts.
In the Universal Shaping Engine, however, it is up to the font designer
to decompose them.  Hence moving them from the Indic shaper to USE was
wrong.

Fixup for f6ba63b

Part of fixing #387
@behdad
Member

behdad commented Jan 9, 2017

@punchcutter Better now?

@punchcutter
Collaborator

This looks good on the harfbuzz side, but the font also needs to be updated from what I can tell.

@behdad
Member

behdad commented Jan 9, 2017

Humm. Why does CoreText get it right then, any guess?

@punchcutter
Collaborator

CoreText doesn't seem to care about the order of the marks. I tried both possible orders of decomposing 1B3C and they are both fine in CoreText, but only one order works in harfbuzz (bottom followed by top). The same goes for Edge on Windows 10. The attached font doesn't work in harfbuzz or Windows 10, but if the decomposition order is swapped, both work fine. 1B3C is Top_And_Bottom in IndicPositionalCategory.txt, but when it is split, the only order that works correctly is bottom followed by top. The top mark 1B42 is encoded and considered a top mark, but the below-base half is not encoded.
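
To make the IndicPositionalCategory.txt lookup concrete, here is a rough sketch of a parser for the usual UCD line format (`range ; value # comment`). The `SAMPLE` lines are a hypothetical excerpt written for this example, not copied from the real file: they encode only the two facts stated in the comment above (1B3C is Top_And_Bottom, 1B42 is treated as a top mark). The function name is also made up.

```python
def parse_positional_categories(lines):
    """Map code points to category names from UCD-style 'range ; value # comment' lines."""
    table = {}
    for line in lines:
        data = line.split("#", 1)[0].strip()  # drop the trailing comment, if any
        if not data:
            continue
        span, value = (field.strip() for field in data.split(";"))
        first, _, last = span.partition("..")
        start = int(first, 16)
        end = int(last, 16) if last else start
        for cp in range(start, end + 1):
            table[cp] = value
    return table

# Hypothetical excerpt in the file's format; values taken from the thread, not from the UCD.
SAMPLE = [
    "# illustrative lines only",
    "1B3C ; Top_And_Bottom # split vowel sign discussed in the thread",
    "1B42 ; Top            # encoded top half",
]

cats = parse_positional_categories(SAMPLE)
print(cats.get(0x1B3C), cats.get(0x1B42))
```

An unencoded glyph such as g170 has no code point, so a table keyed by code point cannot say how the bottom half is classified once the font has split 1B3C.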

iongchun pushed a commit to iongchun/harfbuzz that referenced this issue Jan 12, 2017
@brawer
Contributor Author

brawer commented Jan 17, 2017

Hm, should the USE spec be clearer on ordering?

brawer added a commit to unicode-org/text-rendering-tests that referenced this issue Jan 17, 2017
@punchcutter
Collaborator

I think the USE spec is pretty clear on ordering, but this particular situation is a little vague because once 1B3C is decomposed, the bottom half becomes an unencoded mark. If the bottom mark is interpreted as Blw, then it should work according to the spec, but it looks to me like the bottom mark is being interpreted as Abv. Still, it's odd that switching the decomposition order in the ccmp makes it work, even though the order is then supposedly Blw Abv, which is against the USE spec. This font has no Mark Attachment classes in the GDEF, and even when I add these marks to Mark Attachment classes, they don't seem to be interpreted any differently.
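
The ordering argument can be sketched as a toy check. It assumes, as the comment above implies, that the spec expects above marks (Abv) before below marks (Blw). The labels, the `ASSUMED_ORDER` list, and the function name are illustrative only; this is not how HarfBuzz or any USE implementation is actually written.

```python
# Assumed ordering, inferred from the comment: above marks come before below marks.
ASSUMED_ORDER = ["Abv", "Blw"]

def in_assumed_order(classes):
    """True if the class labels never step backwards in ASSUMED_ORDER."""
    ranks = [ASSUMED_ORDER.index(c) for c in classes]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

# Hypothetical classifications of the two halves of a decomposed 1B3C.
print(in_assumed_order(["Abv", "Blw"]))  # top then bottom, bottom half read as Blw
print(in_assumed_order(["Blw", "Abv"]))  # bottom then top, bottom half read as Blw
print(in_assumed_order(["Abv", "Abv"]))  # bottom half read as Abv, either order
```

This checks ordering only. It does not model how a shaper classifies the unencoded bottom glyph, which is the open question in the comment.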

@KrasnayaPloshchad

> CoreText doesn't seem to care about the order of the marks. I tried both possible orders of decomposing 1B3C and they are both fine in CoreText, but only one order works in harfbuzz (bottom followed by top).

Maybe Core Text tries to support as many so-called canonical equivalents as possible.

@behdad
Member

behdad commented Jul 14, 2017

Uniscribe produces the same output as HarfBuzz. As such, this is a text encoding problem and, arguably, a CoreText bug for accepting it. cc @nedley

behdad closed this as completed Jul 14, 2017