Strange behavior with ccmp lookup & marks #121

Closed
Pecita opened this Issue Jul 23, 2015 · 7 comments

Pecita commented Jul 23, 2015

Hello.
It is not very important but I am curious to understand ...
In this minimal font the Latin dz digraph is transformed into a d followed by a z by a ccmp lookup.
When an accent is applied to the digraph, it is attached to the first letter, so GPOS mark positioning ignores the separation made by the ccmp lookup.
http://pecita.eu/ccmpDigraph/ccmpDigraph.tar.bz2
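
The behaviour described above can be sketched as a toy model. This is not HarfBuzz or Uniscribe code: the `Glyph` class, the `DECOMPOSE` table, the glyph names and the two policy names (`first`, `nearest`) are all hypothetical, chosen only to show how a mark that follows a one-to-many substitution can end up on either component.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical glyph names; a real font defines its own.
MARKS = {"acutecomb"}
# Hypothetical one-to-many substitution, standing in for a ccmp lookup.
DECOMPOSE = {"dz": ["d", "z"]}


@dataclass
class Glyph:
    name: str
    is_mark: bool = False
    component: int = 0                 # position inside a decomposed sequence
    attached_to: Optional[int] = None  # index of the base glyph (marks only)


def decompose(glyphs: List[Glyph]) -> List[Glyph]:
    """Apply the one-to-many substitution, remembering each part's position."""
    out: List[Glyph] = []
    for g in glyphs:
        parts = DECOMPOSE.get(g.name)
        if parts is None:
            out.append(Glyph(g.name, g.is_mark))
        else:
            out.extend(Glyph(p, component=k) for k, p in enumerate(parts))
    return out


def attach_marks(glyphs: List[Glyph], policy: str) -> List[Glyph]:
    """Attach each mark to a base glyph.

    'nearest' uses the immediately preceding base glyph.
    'first' walks back to the first glyph of a decomposed sequence.
    """
    for i, g in enumerate(glyphs):
        if not g.is_mark:
            continue
        j = i - 1
        while j >= 0 and glyphs[j].is_mark:
            j -= 1
        if j < 0:
            continue  # no base glyph before this mark
        if policy == "first":
            while glyphs[j].component > 0:
                j -= 1
        g.attached_to = j
    return glyphs


def shape(names: List[str], policy: str) -> List[Tuple[str, Optional[str]]]:
    """Return (glyph name, name of the base it is attached to) pairs."""
    glyphs = [Glyph(n, is_mark=n in MARKS) for n in names]
    glyphs = attach_marks(decompose(glyphs), policy)
    return [
        (g.name, glyphs[g.attached_to].name if g.attached_to is not None else None)
        for g in glyphs
    ]
```

With the `first` policy, `shape(["dz", "acutecomb"], "first")` attaches the accent to `d`, which matches the behaviour reported here; with `nearest` it attaches to `z`, which is what the separation done by the lookup might lead one to expect.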

Collaborator

behdad commented Jul 26, 2015

Right. We have found that this is the behavior that Uniscribe exposes, at least in the Arabic shaper, so we have implemented it that way. I understand it might not be desirable...

behdad closed this Jul 26, 2015

Collaborator

behdad commented Jul 26, 2015

Or does Windows expose a different behavior in your testing?

Pecita commented Jul 27, 2015

Behdad, thank you for the explanation.
In my view this is wrong for LCG scripts. It is useful only for digraphs, and for digraphs the diacritic always applies to the second letter (try switching the sample text to a classic ttf font).
I only have Linux computers for testing.

Collaborator

behdad commented Jul 27, 2015

Thanks. I'll test and reconsider.

behdad reopened this Jul 27, 2015

Collaborator

jfkthame commented Jul 27, 2015

I think this mostly serves to illustrate the fact that it's a bad idea to use the digraph characters in general, at least in conjunction with combining marks. Given a digraph <xy> followed by a combining accent such as <acute>, it's unclear in principle whether the accent is expected to appear on the first component of the digraph, on the second component, centered over the digraph as a whole, or perhaps even duplicated and rendered on both components.

(FWIW, I suspect the last of these would be the most appropriate thing to do in Dutch if presented with the text sequence <U+0133, U+0301>, for example: it's wrong to write either iȷ́ or íj; the correct presentation is íȷ́, which is i-acute j-acute, in case it doesn't render well for you. But this isn't up to an engine like harfbuzz; if anything, it'd be up to the 'locl' feature to decide which rendering to implement.)
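
To make the candidate renderings concrete, here is a minimal sketch that spells three of them out as plain Unicode strings. It assumes the digraph character has a two-letter compatibility decomposition (true of U+0133 and U+01F3); the function name `accent_placements` and the dictionary keys are made up for illustration. The fourth option, one accent centered over the whole digraph, cannot be expressed in plain text and is left out.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def accent_placements(digraph: str, mark: str = ACUTE) -> dict:
    """Return candidate spellings for a digraph character followed by a mark.

    The digraph is split with NFKD compatibility decomposition, which is an
    assumption about how the components are obtained, not a description of
    what any shaping engine does.
    """
    parts = unicodedata.normalize("NFKD", digraph)
    if len(parts) != 2:
        raise ValueError("expected a character that decomposes into two letters")
    first, second = parts
    return {
        "first": first + mark + second,          # accent on the first letter
        "second": first + second + mark,         # accent on the second letter
        "both": first + mark + second + mark,    # accent duplicated on both
    }
```

For the Dutch case above, `accent_placements("\u0133")["both"]` gives `"i\u0301j\u0301"`, the i-acute j-acute spelling; which of the variants a font actually shows is still decided by the font and the shaping engine, not by this string manipulation.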

Pecita commented Jul 27, 2015

I agree that this case is very marginal and that the digraphs are silly.
I am thinking only in terms of consistency. If we want to make a clean typeface, we need tools that behave logically.

Collaborator

behdad commented Aug 20, 2015

I double-checked that Uniscribe does the same. Closing on that basis.

behdad closed this Aug 20, 2015
