Glyph hash spec is different from some implementations #224

Open
jenskutilek opened this issue Jan 23, 2024 · 5 comments

Comments

@jenskutilek
Contributor

When I adapted the HashPointPen for use in ufo2ft, it was discussed and improved in fonttools/fonttools#2005, but the UFO spec hasn't been updated to reflect the changes.

So there are now at least two different implementations of the glyph hash calculation: one in Adobe's psautohint, the other in fontTools.

*Hint ID computation.* The hash string is initialized with the width value as a decimal string, with the prefix 'w'. The glyph outline element is converted to a string by iterating through all the child elements.

If the child is a contour element, it is converted to a string and added to the hash string by iterating through all the point elements. Any contour with a length of less than 2 is skipped. For each point, the point 'type' attribute, if present, is written; else a space is written. The x and y values are then written as decimal strings separated by a comma. The x and y values are rounded to a precision of no more than 3 decimal places.

If the child is a component element, first the transform values are written, if any. These are written by writing 't' followed by the comma-separated decimal values of the transform attributes in the following order: ["xScale", "xyScale", "yxScale", "yScale", "xOffset", "yOffset"]. The letter 'h' is then written, followed by the hash ID for the component glyph. The four scale values are rounded to a precision of 8 decimal places, and the offset values are rounded to a precision of at most 3 decimal places.

Once the hash string is built, it is used as is for the Hint ID if it is less than 128 characters. Otherwise, a SHA-512 hash is computed, and this is used as the Hint ID for the hint dict. The SHA-512 hash is written with lowercase hexadecimal digits.
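
To make the quoted text a bit more concrete, here is a rough Python sketch of the algorithm as that description reads. The glyph object shape (.width, .contours, .components, point .type/.x/.y, component .transformation/.baseGlyph) and the function names are assumptions for illustration only; this is neither implementation's actual code.

```python
import hashlib

def fmt(value, ndigits):
    """Write a number as a decimal string, rounded to at most `ndigits` places."""
    rounded = round(float(value), ndigits)
    if rounded == int(rounded):
        return str(int(rounded))
    return repr(rounded)

def build_hash_string(glyph, glyph_set):
    """Build the raw hash string for one glyph as described in the quoted text."""
    data = ["w" + fmt(glyph.width, 3)]
    for contour in glyph.contours:
        if len(contour.points) < 2:        # contours shorter than 2 points are skipped
            continue
        for point in contour.points:
            data.append(point.type or " ")  # point 'type' attribute, or a space
            data.append(fmt(point.x, 3) + "," + fmt(point.y, 3))
    for component in glyph.components:
        xx, xy, yx, yy, dx, dy = component.transformation
        if (xx, xy, yx, yy, dx, dy) != (1, 0, 0, 1, 0, 0):  # transform values, if any
            scales = [fmt(v, 8) for v in (xx, xy, yx, yy)]
            offsets = [fmt(v, 3) for v in (dx, dy)]
            data.append("t" + ",".join(scales + offsets))
        # 'h' followed by the (recursively computed) hash ID of the base glyph
        data.append("h" + hint_id(glyph_set[component.baseGlyph], glyph_set))
    return "".join(data)

def hint_id(glyph, glyph_set):
    """Use the raw string if it is shorter than 128 characters, else its SHA-512."""
    hash_string = build_hash_string(glyph, glyph_set)
    if len(hash_string) < 128:
        return hash_string
    return hashlib.sha512(hash_string.encode("ascii")).hexdigest()  # lowercase hex
```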

What's the way to resolve this? My vote would go to updating the spec to describe the algorithm used in the fontTools HashPointPen.

@benkiel
Contributor

benkiel commented Jan 23, 2024

Hey Jens, that part of the spec was written by Adobe (@readroberts iirc). I am happy to get it updated, but this should be worked out with Adobe as well (@skef @kaydeearts @miguelsousa). It might be useful to give us a tl;dr digest of what differs between ps/otfautohint and fontTools here, and where each deviates from the written spec (or a PR that folks can debate).

The other question would be how to handle backwards compatibility, if that is needed (not sure it is?)

@skef

skef commented Jan 24, 2024

I'm not seeing Adobe raising significant objections to changing this aspect of the UFO spec, as long as it's the actual algorithm that is documented rather than a pointer off to the HashPointPen code.

On that front: we should probably confirm with fontTools that either things are currently as they want them to be for the foreseeable future, or see if they're willing to add a parameter (or whatever) that ensures that pen matches the UFO spec, in case they want to support other algorithms at some point.

The current form of psautohint is the Python-only otfautohint port in AFDKO. The hash algorithm is implemented in this object. I haven't looked into whether it still matches the UFO spec.

@jenskutilek
Contributor Author

jenskutilek commented Jan 25, 2024

Here's a summary of the changes from fonttools/fonttools#2005:

  • Output y coordinates with prefixed sign so that e.g. (11, 1) can be distinguished from (1, 11): w500l1+11l10+10...
  • Output composite glyphs as the base glyph's outline followed by its transform in parentheses, with each component and its transform grouped in brackets

The change regarding composite glyphs, which were decomposed in the Adobe implementation, was made to facilitate using the HashPointPen to check whether TrueType instructions still match the outline, as with the UFO glyph's public.truetype.instructions.
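
A minimal sketch of that kind of check, assuming a defcon/ufoLib2-style glyph that implements drawPoints(); where the previously stored hash comes from (for example, the glyph's public.truetype.instructions data) is left to the caller:

```python
from fontTools.pens.hashPointPen import HashPointPen

def outline_matches_stored_hash(glyph, glyph_set, stored_id):
    """Recompute the fontTools-style glyph hash and compare it with a
    previously stored one, e.g. the id saved when the glyph was hinted."""
    pen = HashPointPen(glyph.width, glyph_set)  # glyph_set resolves components
    glyph.drawPoints(pen)
    return pen.hash == stored_id
```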

Example outputs (without applying the sha512 hashing):

A simple TTF glyph:

w626l335+458o327+484o313+535q308+559l306+559o301+535o287+483q280+459l210+247l405+247|l480+0l434+144l180+144l133+0l2+0l228+675l397+675l624+0|

A composite glyph:

w500[l0+0l10+110o50+75o60+50c50+0|(+2+0+0+3-10+5)]

A nested composite glyph:

w500[[l0+0l10+110o50+75o60+50c50+0|(+1+0+0+1+0+0)](+2+0+0+3-10+5)]

A glyph with outline and component:

w500l0+0l10+110o50+75o60+50c50+0|[l0+0l10+110o50+75o60+50c50+0|(+2+0+0+2-10+5)]
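
For reference, a hash of this kind can be computed with the fontTools pen roughly as follows. The UFO path and glyph name are placeholders; note that pen.hash returns the raw string only while it stays short, switching to the sha512 hex digest once it grows past that limit, so the strings above are the pre-digest form.

```python
from fontTools.pens.hashPointPen import HashPointPen
import ufoLib2  # any UFO library whose glyphs implement drawPoints() works the same way

ufo = ufoLib2.Font.open("MyFont.ufo")  # placeholder path
glyph = ufo["A"]                       # placeholder glyph name

pen = HashPointPen(glyph.width, ufo)   # the glyph set is needed to resolve components
glyph.drawPoints(pen)
print(pen.hash)
```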

@jenskutilek
Contributor Author

jenskutilek commented Jan 25, 2024

Decomposing composite glyphs, as the Adobe version does, makes sense for hinting CFF glyphs, where it doesn't matter how the outline ended up in the glyph.

The change in the fontTools implementation has now proved unsatisfactory (fonttools/fonttools#3421) because of how the transform values are stored in UFO vs. TTF: UFO can use arbitrary precision for any value, but TTF quantizes the transformation matrix elements to F2Dot14. This quantization must also be applied to the values in the stored hash if you want to compare glyphs between UFO and TTF, which is necessary for how the hash is used for TrueType hinting.
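
As a rough illustration of that mismatch (the transform values are made up), the scale/skew part of a component transform could be snapped to F2Dot14 before hashing, using fontTools' fixed-point helpers:

```python
from fontTools.misc.fixedTools import floatToFixedToFloat

# A component transform as a UFO might store it, with arbitrary precision.
ufo_transform = (0.333333, 0.0, 0.0, 0.333333, 10.0, 5.0)

# In a TTF, the four scale/skew values are stored as F2Dot14, so the same
# quantization has to be applied before hashing if the hash stored in the UFO
# is to remain comparable with the compiled glyf outline.
quantized = tuple(
    floatToFixedToFloat(v, precisionBits=14) for v in ufo_transform[:4]
) + ufo_transform[4:]

print(quantized)  # the scale values snap to the nearest multiple of 1/16384
```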

Maybe we need different requirements for how to build the hash depending on whether it is used in PS or TT hinting?

@skef

skef commented Jan 25, 2024

@jenskutilek I thought about this, especially because at first glance at the description I wondered whether the new hash covered only the "local" composite information rather than also taking the components into account. But the components are included, they're just not "unpacked".

The only relevant gap between the two hashes would be if the patterns of compositing changed but the component outlines, including their ordering, did not. From the perspective of CFF(2) you'd get a false negative on the identity check. I think that's fine -- there would just be some extra calculation in such cases.
