Cursive attachment intermixed with mark attachment is different from Uniscribe #211

Closed
behdad opened this issue Jan 6, 2016 · 9 comments

behdad commented Jan 6, 2016

Uniscribe seems to allow cursive attachment on marks. We don't. Enabling it is not as easy as changing
CursivePosFormat1::apply(), as currently we do deferred mark attachment but cursive is immediate. So we get wrong results.

Test font attached in ttx format. Here's what we get currently:

$ ./hb-unicode-encode 0B1F 0B4D 0B1A 0B4D 0B1A | hb-shape NotoSansOriya-Regular.ttf.subset
[ttaorya=0+1307|casubscriptorya=0@-242,104+0|casubscriptnarroworya=0+487]

Here's what Uniscribe gets:

$ ./hb-unicode-encode 0B1F 0B4D 0B1A 0B4D 0B1A | hb-shape NotoSansOriya-Regular.ttf.subset --shaper uniscribe
[ttaorya=0+1307|casubscriptorya=0@-242,104+-211|casubscriptnarroworya=0+487]

Here's what we get if I just allow marks in CursivePosFormat1::apply():

$ ./hb-unicode-encode 0B1F 0B4D 0B1A 0B4D 0B1A | hb-shape NotoSansOriya-Regular.ttf.subset
[ttaorya=0+1307|casubscriptorya=0@-242,104+1076|casubscriptnarroworya=0@20,104+507]
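
To compare the three outputs above field by field, here is a small sketch that parses hb-shape's default text output (`glyphname=cluster@x_offset,y_offset+x_advance`) and accumulates advances into glyph origins. `parse_hb_shape` and `origins` are hypothetical helpers written for this note, not part of HarfBuzz; the sketch assumes a horizontal run and treats an omitted offset or advance as 0.

```python
import re

# One record in hb-shape's default text output:
#   glyphname=cluster[@x_offset,y_offset][+x_advance]
_GLYPH = re.compile(
    r"(?P<name>[^=]+)=(?P<cluster>\d+)"
    r"(?:@(?P<dx>-?\d+),(?P<dy>-?\d+))?"
    r"(?:\+(?P<adv>-?\d+))?"
)


def parse_hb_shape(line):
    """Parse one '[a=0+10|b=0@1,2+3]' line into a list of dicts."""
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        m = _GLYPH.fullmatch(item)
        if m is None:
            raise ValueError(f"unrecognised glyph record: {item!r}")
        glyphs.append({
            "name": m["name"],
            "cluster": int(m["cluster"]),
            "dx": int(m["dx"] or 0),       # omitted offset -> 0 (assumption)
            "dy": int(m["dy"] or 0),
            "advance": int(m["adv"] or 0),  # omitted advance -> 0 (assumption)
        })
    return glyphs


def origins(glyphs):
    """Return the (x, y) origin of each glyph in a horizontal run."""
    pen_x, out = 0, []
    for g in glyphs:
        out.append((pen_x + g["dx"], g["dy"]))
        pen_x += g["advance"]
    return out


harfbuzz = "[ttaorya=0+1307|casubscriptorya=0@-242,104+0|casubscriptnarroworya=0+487]"
uniscribe = "[ttaorya=0+1307|casubscriptorya=0@-242,104+-211|casubscriptnarroworya=0+487]"
patched = "[ttaorya=0+1307|casubscriptorya=0@-242,104+1076|casubscriptnarroworya=0@20,104+507]"

for label, line in [("harfbuzz", harfbuzz), ("uniscribe", uniscribe), ("patched", patched)]:
    print(label, origins(parse_hb_shape(line)))
```

Read this way, the last glyph's origin is (1307, 0) in the current HarfBuzz output, (1096, 0) with Uniscribe, and (2403, 104) with the naive patch, which makes the stray 1076-unit advance in the third run easy to spot.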

Note that the Uniscribe output looks wrong to me. The font has the same y anchor point in the cursive for the last two glyphs in our sequence, so their y positions should align. But it looks like Uniscribe is not doing that. So, I'm thinking that Uniscribe is also doing some things deferred, but in another order (doing cursive before mark attachment, for example).

At any rate, I don't think we need to fix this necessarily. Just documenting it here.
NotoSansOriya-Regular.ttf.txt

behdad added the wontfix label Jan 6, 2016

behdad commented Jan 6, 2016

> Note that the Uniscribe output looks wrong to me. The font has the same y anchor point in the cursive for the last two glyphs in our sequence, so their y positions should align. But it looks like Uniscribe is not doing that. So, I'm thinking that Uniscribe is also doing some things deferred, but in another order (doing cursive before mark attachment, for example).

Apparently lookup 4 is moving the first mark upwards. We apply cursive connections late, but Uniscribe applies them immediately, so it doesn't move the second mark up while we do. Not sure how to resolve this. But at least now we know what's going on and can reconsider this.
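
The difference described here can be pictured with a toy model. This is a sketch under stated assumptions, not HarfBuzz's or Uniscribe's actual code: `position`, the lookup tuples and the glyph indices are hypothetical. It assumes the two subscripts share the same cursive y anchor (so the cursive y delta is 0) and that a later lookup raises the first subscript by 104 units.

```python
def position(lookups, n_glyphs, deferred_cursive):
    """Toy y-positioning: return the final y offset of each glyph.

    lookups is an ordered list of
      ("cursive", child, parent, dy)  # child's y follows parent's y + dy
      ("shift_y", glyph, dy)          # move one glyph vertically
    """
    y = [0] * n_glyphs
    links = []  # cursive links remembered for late resolution
    for lookup in lookups:
        if lookup[0] == "cursive":
            _, child, parent, dy = lookup
            if deferred_cursive:
                links.append((child, parent, dy))
            else:
                # Immediate: use the parent's position as it is right now.
                y[child] = y[parent] + dy
        elif lookup[0] == "shift_y":
            _, glyph, dy = lookup
            y[glyph] += dy
    # Late resolution: children follow wherever their parents ended up.
    # Assumes every parent comes before its child in the glyph run.
    for child, parent, dy in sorted(links):
        y[child] += y[parent] + dy
    return y


# Glyph 0 = base, glyph 1 = first subscript, glyph 2 = second subscript.
lookups = [
    ("cursive", 2, 1, 0),   # second subscript connects to the first
    ("shift_y", 1, 104),    # a later lookup raises the first subscript
]
print(position(lookups, 3, deferred_cursive=True))   # [0, 104, 104]
print(position(lookups, 3, deferred_cursive=False))  # [0, 104, 0]
```

With late resolution the second subscript follows the first one up; with immediate resolution it stays where it was when the cursive lookup ran, which matches the y offsets seen in the two outputs.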

behdad removed the wontfix label Jan 6, 2016

behdad commented Jan 6, 2016

Here's a note from Jelle about how to reproduce this with shipping Noto fonts:


I think you should test with the shipping font. But it differs from the
test font.

The shipping font will use spacing marks with cursive attachment only if
the sequence passes through rules that do not contain combinations where
cursive attachment may lead to the syllable as a whole getting a decrease
of advance width. So the original test combination tta-ca-ca, which has a
base with a central anchor for mark attachment would use mark-to-mark. If
you add a third ca-subscript then it is safe and a spacing mark is used
with cursive attachment. With pa-ca-ca, there is no problem and a spacing
mark can be used for the second already.

So if the sequence is tta-ca-ca-ca

  • GSUB replaces second (and third) subscript by a glyph that is lower.
  • The first subscript attaches with mark-to-base, moving in x only
  • The second subscript attaches to the first mark-to-mark moving in x only
  • The third attaches with cursive attachment moving in x only
  • A final context rule moves the first subscript up by 104 units in the
    regular

In the original test font the second ca used cursive attachment already.


@khaledhosny
Collaborator

Not sure if it is related, but I have noticed that if my kerning lookups (and probably anything that changes the glyph placement) are ordered after the mark lookups, Uniscribe applies the mark first and then moves the base glyphs; reordering the lookups gives the expected output. HarfBuzz gives the same output regardless of the order of the lookups.
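
One way to picture this ordering effect, as a toy model rather than either engine's real code: a mark's x offset is measured from the pen position after the glyphs before it, so it depends on the base's advance. If the offset is computed immediately and a later lookup changes that advance, the mark drifts off its anchor; if it is computed after all lookups have run, the lookup order stops mattering. `shape_x`, the lookup tuples and all numbers below are made up for illustration.

```python
def shape_x(lookups, advances, deferred_marks):
    """Return each glyph's x origin after applying lookups in order.

    lookups is an ordered list of
      ("mark", mark, base, base_anchor_x, mark_anchor_x)
      ("kern", glyph, delta)   # change that glyph's advance by delta
    """
    adv = list(advances)
    off = [0] * len(adv)
    pending = []  # mark attachments postponed until the end

    def attach(mark, base, base_anchor_x, mark_anchor_x):
        # The mark is drawn at pen + offset, so undo the advances that
        # lie between the base's origin and the mark's pen position.
        between = sum(adv[base:mark])
        off[mark] = base_anchor_x - mark_anchor_x - between

    for lookup in lookups:
        if lookup[0] == "mark":
            if deferred_marks:
                pending.append(lookup[1:])
            else:
                attach(*lookup[1:])
        elif lookup[0] == "kern":
            adv[lookup[1]] += lookup[2]
    for args in pending:
        attach(*args)

    origins, pen = [], 0
    for a, o in zip(adv, off):
        origins.append(pen + o)
        pen += a
    return origins


advances = [600, 0, 500]            # base, mark, next base
mark = ("mark", 1, 0, 300, 0)       # anchor at x=300 on the base
kern = ("kern", 0, -50)             # tighten the base's advance

print(shape_x([mark, kern], advances, deferred_marks=False))  # [0, 250, 550]
print(shape_x([kern, mark], advances, deferred_marks=False))  # [0, 300, 550]
print(shape_x([mark, kern], advances, deferred_marks=True))   # [0, 300, 550]
```

In the immediate model the mark ends up 50 units away from its anchor when kerning runs after it; in the deferred model both lookup orders give the same result.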


behdad commented Jan 7, 2016

> Not sure if it is related, but I have noticed that if my kerning lookups (and probably anything that changes the glyph placement) are ordered after the mark lookups, Uniscribe applies the mark first and then moves the base glyphs; reordering the lookups gives the expected output. HarfBuzz gives the same output regardless of the order of the lookups.

Right, that's known. We might change that if we end up wanting to match Uniscribe for this bug.

@JelleBosmaMT

For the record: the only requirement for the intended display is that all the GPOS lookups are applied, and applied in lookup order. Of course, this is the default assumption of how the GPOS is intended to function.

It may appear counterintuitive to use cursive attachment with anchor points on marks. But all that is done is to position the shape of a spacing glyph in a line of text in relation to shapes to the left of it, with the help of the coordinates of anchor points. It is the value of the coordinate that matters, and it matters not if the value is stored as an anchor of a base glyph or as an anchor of a mark glyph positioned relative to a base glyph.

I am also tempted to make the comment: Although the GPOS has lookup types that have "attachment" and "connection" in the name, nothing gets attached or connected by these lookups. It is just positioning with the help of coordinates. They may be used for features that intend to "attach" or "connect" glyphs. But even so, there is always the need to correct general attachment or connection rules for exceptional cases with additional context rules.

behdad added a commit that referenced this issue Feb 12, 2016

behdad commented Feb 12, 2016

> Not sure if it is related, but I have noticed that if my kerning lookups (and probably anything that changes the glyph placement) are ordered after the mark lookups, Uniscribe applies the mark first and then moves the base glyphs; reordering the lookups gives the expected output. HarfBuzz gives the same output regardless of the order of the lookups.

So, I'm going to change HarfBuzz to be more like Uniscribe here. I know it's inferior behavior, but it is the simplest way to make cursive attachment to marks work reasonably.


behdad commented Feb 12, 2016

> For the record: the only requirement for the intended display is that all the GPOS lookups are applied, and applied in lookup order. Of course, this is the default assumption of how the GPOS is intended to function.

All of this only works in Indic shaper because that's the only shaper that does NOT zero mark advances... In all other shapers, using cursive to attach marks will probably behave the same as using mark and mkmk features.

behdad closed this as completed in 86c68c7 Feb 16, 2016
@JelleBosmaMT

On 12-feb.-16 07:18, "Behdad Esfahbod" notifications@github.com wrote:

> All of this only works in Indic shaper because that's the only shaper
> that does NOT zero mark advances... In all other shapers, using cursive
> to attach marks will probably behave the same as using mark and mkmk
> features.

Cursive connection moves the advance of the first glyph, mark attachment
moves the outlines of the second. That is a fundamental difference. In the
other "shapers", the non-zero advance width of the second glyph is
modified, which is neither cursive connection, nor mark attachment. One
would expect even then that the advance of the first is changed by cursive
connection although with the advance of the second being changed, it has
gone wrong already ;-(
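
A simplified, x-only, left-to-right sketch of that distinction follows. It is an illustration under assumptions, not the GPOS specification text or any engine's code; `Glyph`, `cursive_connect`, `mark_attach` and `layout` are hypothetical names, and the anchor values are invented. Cursive connection is modeled as changing the first glyph's advance so that the second glyph's entry anchor lands on the first glyph's exit anchor, while mark attachment leaves every advance alone and shifts the second glyph's offset.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    advance: int
    x_offset: int = 0


def cursive_connect(first, second, exit_x, entry_x):
    # Move the pen, not the outline: the line gets longer or shorter.
    # The second glyph itself is left untouched in this model.
    first.advance = exit_x - entry_x


def mark_attach(base, mark, base_anchor_x, mark_anchor_x):
    # Move the outline, not the pen: the line length is unchanged.
    mark.x_offset = base_anchor_x - mark_anchor_x - base.advance


def layout(glyphs):
    """Return (x origin of each glyph, total advance of the run)."""
    pen, origins = 0, []
    for g in glyphs:
        origins.append(pen + g.x_offset)
        pen += g.advance
    return origins, pen


# Same anchors, two mechanisms.
a, b = Glyph(600), Glyph(400)
cursive_connect(a, b, exit_x=550, entry_x=30)
print(layout([a, b]))   # ([0, 520], 920)

c, d = Glyph(600), Glyph(400)
mark_attach(c, d, base_anchor_x=550, mark_anchor_x=30)
print(layout([c, d]))   # ([0, 520], 1000)
```

Both calls put the second glyph's origin at 520, so the outlines land in the same place, but only the cursive version changes the total advance of the run.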

Best regards,
Jelle

jsonn pushed a commit to jsonn/pkgsrc that referenced this issue Feb 21, 2016
Overview of changes leading to 1.2.0
Friday, February 19, 2016
====================================

- Fix various issues (hangs mostly) in case of memory allocation failure.
- Change mark zeroing types of most shapers from BY_UNICODE_LATE to
  BY_GDEF_LATE.  This seems to be what Uniscribe does.
- Change mark zeroing of USE shaper from NONE to BY_GDEF_EARLY.  That's
  what Windows does.
- Allow GPOS cursive connection on marks, and fix the interaction with
  mark attachment.  This work resulted in some changes to how mark
  attachments work.  See:
  harfbuzz/harfbuzz#211
  harfbuzz/harfbuzz@86c68c7
- Graphite2 shaper: improved negative advance handling (eg. Nastaliq).
- Add nmake-based build system for Windows.
- Minor speedup.
- Misc. improvements.
behdad added a commit that referenced this issue Feb 24, 2016
That commit moved the advance adjustment for mark positioning to
be applied immediately, instead of late as before.  This breaks
if mark advances are zeroed late, like in Arabic.  It is also easier to
hit in RTL scripts, since a single mark with non-zero advance is
enough to hit the bug, whereas in LTR at least two marks are needed.

This reopens #211
The cursive+mark interaction is broken again.  To be fixed in a
different way.

behdad commented Feb 24, 2016

Reopening, since I partially reverted the fix from 86c68c7.

behdad reopened this Feb 24, 2016
behdad added the bug label Jul 13, 2016
behdad self-assigned this Mar 2, 2017
behdad closed this as completed Sep 2, 2021