Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

error while subsetting Bungee COLR #2461

Closed
anthrotype opened this issue Dec 2, 2021 · 7 comments · Fixed by #2462
Closed

error while subsetting Bungee COLR #2461

anthrotype opened this issue Dec 2, 2021 · 7 comments · Fixed by #2462

Comments

@anthrotype
Copy link
Member

Take this font from https://github.com/google/fonts/blob/17fd8411fad0c1fb14e4c8bd18c87f45f65454f8/ofl/bungeecolor/BungeeColor-Regular.ttf

and do

$ fonttools subset BungeeColor-Regular.ttf --text="À" --glyph-names --verbose

you get this struct.error:

Traceback (most recent call last):
[...]
  File "python3.9/site-packages/fontTools/ttLib/tables/otConverters.py", line 356, in write
    writer.writeValue(self.typecode, font.getGlyphID(value))
  File "python3.9/site-packages/fontTools/ttLib/tables/otBase.py", line 438, in writeValue
    self.items.append(struct.pack(f">{typecode}", value))
struct.error: ('required argument is not an integer', 'str', 2, 'LayerRecord[]', 'LayerRecordArray')
@anthrotype
Copy link
Member Author

anthrotype commented Dec 2, 2021

A bit of context is required.

The font contains a COLR version 0 table.
The glyph "Agrave" (mapped from character 0x00C0 in cmap) in COLR uses two layer glyphs, "Agrave.alt001" and "Agrave.alt002".
In the glyf table, the glyph "Agrave" is a composite glyphs that uses "A" and "grave" as components.
The glyph "grave" is also a COLR base glyph, and uses "grave.alt001" and "grave.alt002" as layers.

The subsetter does a series of "glyph closures" to progressively extend the intial set and determine a final subset. These are done in a specific order that takes care of inter-table dependencies. It first intersects the requesed unicodes with the cmap tables to find the respective glyph names, then processes the GSUB table extending the set to include the glyphs that are reachable via substitutions from these.
After cmap and GSUB closures, it then searches the COLR table, when present, and extends the set to include all glyphs that are referenced from the base color glyphs (used as COLRv0 layers or COLRv1 PaintGlyph or PaintColrGlyph).
Only after COLR, it does a closure over glyf and/or CFF, to resolve component references in composite glyphs. This is the correct logical order, because COLR glyphs may reference glyf/CFF glyphs but not viceversa.

Now the problem in this particular case is that the subsetter is using the final subset to subset the COLR table's base glyph records, this also including the glyph set extended by the glyf closure that follows the COLR glyph closure; instead of only including the set resulting from the COLR glyph closure. For this reason, it's retaining the "grave" COLR base record because glyf's glyph "Agrave" uses "A" and "grave" as components, so "grave" happens to be in the final subset.
Now, including "grave" COLR glyph requires that we also include the layer glyphs "grave.alt001" and "grave.alt002", but one of these "grave.alt001" is missing from the final subset because it's only used in COLR's "grave", and by the time we did the COLR glyph closure, we didn't consider "grave", only the initial "Agrave"...
So when compiling the COLR table, the font.getGlyphID("grave.alt001") lookup fails with struct.error, because it's unexpectedly returning None, and struct.pack(..., None) is invalid as "the required argument is not an integer".

The fix is not to do another round of closures and include the missing "grave.alt001". We don't actually need the COLR "grave" color glyph, the initial request was to retain the glyph "Agrave", which in COLR table doesn't refer to or make use of the COLR glyph "grave" at all; "Agrave" only refers to glyf glyph "grave", via "Agrave" => ("A", "grave") composite glyf closure.

The fix is to subset COLR base glyphs only based on the gyphs subset that results after the COLR glyph closure but before the glyf closure. This way we don't include "grave" as COLR base record, and neither we need to include any layer glyphs referenced by it, because it was not requesed to begin with.

The subsetter uses a pattern to do similar thing, e.g. for GSUB, the current subset is not the final s.glyphs but s.glyphs_gsubed, similarly for MATH table, it's s.glyphs_mathed (these are all computed in the Subsetter._closure_glyphs() method). We basically need to save the equivalent s.glyphs_colred, i.e. the set of glyphs after (cmap + GSUB +) COLR glyph closure, and use that one when subsetting COLR table.

@rsheeter
Copy link
Collaborator

rsheeter commented Dec 2, 2021

Nice writeup, ty. A clearer error message would be nice too if possible. The struct complaint is not tremendously informative wrt what went wrong.

@anthrotype
Copy link
Member Author

The struct complaint is not tremendously informative wrt what went wrong.

yes, especially the fact that the error message mentions 'str', but the type of the offending value is None, not 'str'.. that's just the result of the too clever exception chaining in otTables

@anthrotype
Copy link
Member Author

And the reason it does not error when --no-glyph-names (default) option is used with the same font and input is because TTFont.getGlyphID method doesn't return None when the parameter is something like "glyph00840", but will just turn that back into an integer (in layout tables one can have "virtual" GIDs).
With --no-glyph-names the subsetter complets without error, but the resulting COLR table will be invalid because it references a glyph index which is either wrong, or out of bounds and missing.

@rsheeter
Copy link
Collaborator

rsheeter commented Dec 2, 2021

the resulting COLR table will be invalid because it references a glyph index which is either wrong, or out of bounds and missing

Ouch. Well this'll be nice to have fixed! Is there a point it would be appropriate to validate color does not reference fantasy glyphs?

@anthrotype
Copy link
Member Author

Is there a point it would be appropriate to validate color does not reference fantasy glyphs?

maybe, but that's tangential and better tackled as separate feature request. here we just want to exclude glyphs that are only added after glyf closure from the set used to subset COLR, to avoid including more color glyphs than requested.

anthrotype added a commit that referenced this issue Dec 2, 2021
this currently fails with struct.error. Fix will ensue shortly
anthrotype added a commit that referenced this issue Dec 2, 2021
@behdad
Copy link
Member

behdad commented Dec 3, 2021

The fix is to subset COLR base glyphs only based on the gyphs subset that results after the COLR glyph closure but before the glyf closure. This way we don't include "grave" as COLR base record, and neither we need to include any layer glyphs referenced by it, because it was not requesed to begin with.

Correct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants