mkmk blocked by base+mark ligature formation #1109

Closed
punchcutter opened this issue Jul 25, 2018 · 9 comments

Comments

@punchcutter
Collaborator

After some discussion @behdad summed it up nicely:

So, if you have base1,mark1,base2,mark2, and you ligate base1,base2 you'd get lig,mark1,mark2 but HarfBuzz remembers that mark1 and mark2 belong to first and second components of the ligature respectively and does NOT apply mkmk between them.
Now, in your case, you are ligating a base with a mark.

With the sequence base1,mark1,base2,mark2 it makes sense that both marks apply to the resulting ligature of base1+base2. However, what we have here is base1,mark1,mark2,mark3. A ligature is formed from base1+mark2; mark1 should then attach to that ligature with the mark feature, and mark3 should attach to mark1 with the mkmk feature. This is probably an unusual situation, but in this case all marks are post-base marks. They are actually all Spacing Marks, but we can't classify them as bases in the GDEF because they need to be marks so that other marks (including Vedic marks) can follow them. That leaves us relying on mark and mkmk attachment to place them appropriately.
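To make the blocking concrete, here is a small toy model in Python of the component tracking described above. It is only a sketch and not HarfBuzz's actual code: the `Glyph` class, the `ligate` and `mkmk_allowed` functions, and the `lig_id`/`lig_comp` field names are all assumptions made for illustration, and the real check has additional cases.

```python
from dataclasses import dataclass, replace

@dataclass
class Glyph:
    name: str
    is_mark: bool
    lig_id: int = 0    # 0 means "not part of any ligature" (toy convention)
    lig_comp: int = 0  # which ligature component a mark belongs to

def ligate(glyphs, indices, lig_name, lig_id):
    """Merge the glyphs at `indices` into one ligature glyph.

    Marks that follow a component are tagged with that component's number,
    which is the bookkeeping that later stops mkmk across components.
    """
    out, comp, tracking = [], 0, False
    for i, g in enumerate(glyphs):
        if i in indices:
            comp += 1
            tracking = True
            if comp == 1:  # emit the ligature once, in place of the first component
                out.append(Glyph(lig_name, is_mark=False, lig_id=lig_id))
            continue
        if g.is_mark and tracking:
            out.append(replace(g, lig_id=lig_id, lig_comp=comp))
        else:
            tracking = False  # a non-mark glyph ends the ligature's mark run
            out.append(g)
    return out

def mkmk_allowed(m1, m2):
    """Simplified rule: two marks may attach only if they sit on the same
    base, or on the same component of the same ligature."""
    if m1.lig_id != m2.lig_id:
        return False
    return m1.lig_id == 0 or m1.lig_comp == m2.lig_comp

# The sequence from this issue: base1, mark1, mark2, mark3 with base1+mark2 ligating.
seq = [Glyph("base1", False), Glyph("mark1", True),
       Glyph("mark2", True), Glyph("mark3", True)]
shaped = ligate(seq, [0, 2], "base1_mark2", lig_id=1)
print([(g.name, g.lig_comp) for g in shaped])
print("mkmk mark1/mark3 allowed:", mkmk_allowed(shaped[1], shaped[2]))
```

In this model mark1 ends up on component 1 and mark3 on component 2, so the mkmk check between them fails, which matches the behaviour reported here.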

Here's a little visual explanation showing what happens in various situations. The third one shows how the ligature stops the mkmk. The fourth shows what should happen. There mark2 doesn't ligate so it's showing base1, mark1 (post-base), mark2 (top), mark3 (mkmk attached to mark1).

[image001: illustration of the four situations described above]

@behdad
Member

behdad commented Jul 25, 2018

Right. Thanks for the great report.

I think we should do this:

  • If there is no GDEF GlyphClass table, then detect a base+mark ligature and mark that (no pun!) as a base instead of a ligature. If there is a GDEF GlyphClass table, use its glyph category for the result.

  • If the result is a base, instead of a ligature, then disable the mark-ligature-component-tracking logic.
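The two rules above could be sketched roughly as follows. This is a minimal Python illustration of the proposal only, not the code that eventually landed in HarfBuzz; the function names, the `gdef_class=None` convention for "no GlyphClass table", and the assumption that each component's class is already known are all hypothetical.

```python
# GDEF glyph class values as defined by the OpenType specification.
BASE, LIGATURE, MARK = 1, 2, 3

def result_class(component_classes, gdef_class=None):
    """Pick the glyph class of a ligature result.

    component_classes: classes of the glyphs being ligated, in order.
    gdef_class: class the font's GDEF assigns to the result glyph, or
                None when the font has no GlyphClass table (assumption).
    """
    if gdef_class is not None:
        return gdef_class  # rule 1, second half: trust the font's GDEF
    first, rest = component_classes[0], component_classes[1:]
    if first == BASE and all(c == MARK for c in rest):
        return BASE        # rule 1, first half: base+mark acts as a base
    return LIGATURE

def tracking_enabled(component_classes, gdef_class=None):
    """Rule 2: skip mark-to-component tracking when the result is a base."""
    return result_class(component_classes, gdef_class) != BASE

print(tracking_enabled([BASE, MARK]))                  # base+mark, no GDEF
print(tracking_enabled([BASE, BASE]))                  # true ligature, no GDEF
print(tracking_enabled([BASE, MARK], gdef_class=BASE)) # font says "base"
```

With tracking disabled, the marks that follow a base+mark result are treated as marks on a single base, so mkmk between them is no longer blocked.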

@punchcutter
Collaborator Author

That sounds good. We always set the GDEF for this kind of ligature to base because it's not meant to act like a ligature with components. If we expect that kind of behavior we make sure to set the GDEF to ligature.

@behdad
Member

behdad commented Jul 25, 2018

Do you happen to have a font I can use for testing and add to test suite?

@punchcutter
Collaborator Author

I sent you a font on June 28 with test strings. I can send a new one if you need. The font hasn't been delivered yet, so I can't post it here.

@behdad
Member

behdad commented Jul 25, 2018 via email

@Richard57

Richard57 commented Aug 1, 2018

The situation is not so unusual. It's one way of dealing with Tai Tham <NA, TONE-2, SIGN AA, MAI KANG> 'water'. NA and SIGN AA ligate, and then TONE-2 and MAI KANG interact. SIGN AA is a spacing post-base mark that mostly acts like a base, and the other two marks are marks above.

@mhosken
Contributor

mhosken commented Aug 9, 2018

@Richard57 SIGN AA is a base, isn't it? I know it's not a consonant, but it takes up space and it doesn't 'attach' as such.

Would cursive attachment help at all with the original issue?

@Richard57

So far as I can tell, it is entirely a matter of choice as to whether U+1A63 TAI THAM SIGN AA is a base or a mark with an advance width to be restored by the dist feature. I thought Tai Tham <SAKOT, BA> was also a base, but Ed Trager's Hariphunchai font treats it as a mark, and in my Lamphun font derived from it, that seems to work well enough if I use dist to offset a following letter. (I may need to make use of the facility for ad hoc groups of marks for some of the finer touches.) I intend to experiment with making U+1A63 map by cmap to a mark, and see if that makes the handling of <SIGN AA, MAI KANG> in a Bangkok style less horrendous. The way I do it in OpenType at the moment would not work with Graphite's split cursor.

@behdad
Member

behdad commented Oct 2, 2018

Fixed in 9efddb9

However, @punchcutter the font you sent me has the iMatra_gran as a base in GDEF, not mark. So we couldn't verify the fix. Please test, and send me the corrected font so I can add a test to the test suite. Thanks.
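For anyone wanting to check how a font classifies a glyph such as iMatra_gran before testing, one possible approach is to read the GDEF glyph classes with fontTools. This is a sketch under assumptions: the helper names are made up for illustration, the font path in the usage note is hypothetical, and fontTools must be installed separately.

```python
# GDEF glyph class values from the OpenType specification; 0 means the
# glyph is not listed in the class definition at all.
GDEF_CLASS_NAMES = {0: "unclassified", 1: "base", 2: "ligature",
                    3: "mark", 4: "component"}

def gdef_class_name(class_defs, glyph_name):
    """Return a readable class name from a {glyph name: class} mapping."""
    return GDEF_CLASS_NAMES[class_defs.get(glyph_name, 0)]

def load_class_defs(font_path):
    """Read the GDEF glyph classes of a font as a plain dict."""
    # Imported here so the helper above works without fontTools installed.
    from fontTools.ttLib import TTFont
    font = TTFont(font_path)
    if "GDEF" not in font or font["GDEF"].table.GlyphClassDef is None:
        return {}  # no glyph class information in this font
    return dict(font["GDEF"].table.GlyphClassDef.classDefs)

# Stand-in data showing the situation described above, where the glyph
# was classified as a base rather than a mark.
print(gdef_class_name({"iMatra_gran": 1}, "iMatra_gran"))
```

With a real font this would be called as `gdef_class_name(load_class_defs("MyFont.ttf"), "iMatra_gran")`, where `MyFont.ttf` is a placeholder path.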

@behdad behdad closed this as completed Oct 2, 2018