Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VF compilation: post-process fonts once #486

Merged
merged 3 commits into from Apr 1, 2021
Merged

VF compilation: post-process fonts once #486

merged 3 commits into from Apr 1, 2021

Conversation

madig
Copy link
Collaborator

@madig madig commented Mar 18, 2021

Tackle #485.

Font reloading does more than prepare for glyph renaming, not sure how to best tackle this in a sane manner.

@madig
Copy link
Collaborator Author

madig commented Mar 18, 2021

My idea here is to maybe target the reloading, i.e. when compiling a VF, don't reload the constituent masters but do it once before post-processing the VF. This unfortunately results in a

fontmake: Error: In 'NotoSans-MM.designspace': Generating fonts from Designspace failed: ('Base master not found.', '.Value1', 'Class2Record', '[35]', 'list', '.Class2Record', 'Class1Record', '[3]', 'PairPos', '[1]', 'Lookup', '[2]', 'list', '.Lookup', 'LookupList', '.LookupList', 'GPOS', '.table', 'table_G_P_O_S_')

😐

@madig
Copy link
Collaborator Author

madig commented Mar 18, 2021

I also tried otf.recalcBBoxes = False before reloading to cut down on bbox recalculation, but that crashes because maxp doesn't have all the data...

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

Trying to trace what recalcBBoxes does where.

  • vhea and hhea will both recalcBounds(glyfTable) independently of each other. Probably not an issue for most fonts, but can maybe make CJK compilation even slower? Also, both tables will recalc CFF1/2 bounds, which CFFFontSet.compile also does.
  • maxp does not recalc bound, just its own values, but will also loop over all glyphs again.
  • glyf will recalc glyph bounds again in Glyph.compile
  • head will recalc CFF1/2 glyph bounds in compile

Maybe TTFont needs a central method to do all the necessary computations once for all tables that need it. Would also cull duplicated code across tables...

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

So, this cute little hack makes Noto Sans VF compile in 53s instead of 63s on my machine.

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

So, 11% of the time compiling NS VF is spent in setupTable_head. That may not have anything to do with font reloading, but I noticed that those 11% divide evenly into 1) drawing a TTGlyph 2) computing its bbox, separately. Some time could probably be saved by computing it inside the pen, becasuse the necessery info might be there already... This may also need to be done for drawing CFF glyphs.

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

I'll focus on figuring out the GPOS issue you are facing. I suggest you look into the italicAngle / names and make what ufo2ft generate match the reloaded version if feasible.

For recalcBoundingBoxes, I suggest we make glyf/cff tables calculate the boxes and update in whatever table needs it; make the compilation of the downstream tables dependent on the outline tables.

Alternatively, leave fonttools as is re recalcing bounding boxes, but in ufo2ft somehow fetch those from glyf/cff and create the correct other tables and compile with recalc=false.

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

and I almost always get into issues like this. It isn’t hopeless, though

Agree. We had the same issue for loading from XML; I added populateDefaults to the otData-built tables. We can fix these things, just need to know.

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

Ok, so I completely remove reloading again and focus on making the test suite pass?

How do we tackle the renaming problem? The TTX dumps used for testing don't have production names when being compared.

@anthrotype
Copy link
Member

Don't change the expected TTX dumps, try to make it pass without the reloading

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

How do we tackle the renaming problem? The TTX dumps used for testing don't have production names when being compared.

I honestly never fully understood the production-names issue. So, ufo2ft takes an extra set of names, called production-names, and renames glyphset to those? So far so good. What test suite uses is tangent.

Now, IIRC one problem we had early on was that the feature files were written to production names I think? So we had to apply them at a certain time.

I don't have the full picture to the production-names issue. But just looking at the code, it looks to me like fontmake might be able to call ufo2ft in a way to avoid postProcessing at all.

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

The featureWriters seem to be taking their time as well. We'll find ways to speed those up.

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

Ok, will focus on the test suite then.

Yeah, feature writing is taking the biggest chunk after my cute hack (followed by glyph loading):

image

Time order:

image

Each master has its own mostly identical featuer file. Maybe there's savings in having one external "family.fea" that is parsed just once :)

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

I've reproduced the GPOS fail. Digging into.

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

Each master has its own mostly identical featuer file. Maybe there's savings in having one external "family.fea" that is parsed just once :)

Looks to me like scanning all glyphs to extract anchors, etc is taking time...

If we could change UFO loading to, even instead of loading into full objects, call a function that just collects various types of data where they are needed, and passes those to next layers, we save quite some time I think. But that fits more into a task-based fontmake than into ufo2ft which takes UFO as object.

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

Yeah, feature writing is taking the biggest chunk after my cute hack (followed by glyph loading):

Humm. I see. Not going through .fea file for generated stuff; and only "compiling" a .fea file once even if shared across masters, should save most of that time.

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

(I have reset my branch to slowly work my way through the test suite)

I like the collect-data-as-we-go strategy, but I think we must first eliminate wasteful reloading to see how much time the other time wasters truly take 😯 I must focus 😬

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

Yep. Let's jump on the bulldozer and clean this up :))

@moyogo
Copy link
Collaborator

moyogo commented Mar 19, 2021

Filters and feature writers should probably work on multiple fonts at a time, doing things only once when all the fonts have the same data, this would also help failing early when stuff is not interpolatable.

@moyogo
Copy link
Collaborator

moyogo commented Mar 19, 2021

Looks to me like scanning all glyphs to extract anchors, etc is taking time...

We're scanning all glyphs for anchors several times, propagateAnchors filter, mark feature writer and now GDEF feature writer. Maybe these should be scanned once and be kept in the compiler.

behdad added a commit to fonttools/fonttools that referenced this pull request Mar 19, 2021
Previously otlLib was generating None if the values themselves were
empty even if the value format was non-empty.  This happened to work
for compiling to binary since the compiler handles Value=None.

But this was confusing varLib.merger module (as in when building
variable fonts from such otlLib-built master GSUB/GPOS tables, without
roundtripping to OTF/TTF binary first), because in varLib.merger,
a None means "this master doesn't provide that info; skip it"; whereas
in a PairPos table a None as generated by otlLib simply meant "all
values are zero", which is different from "this master doesn't
provide this value".

This fixes that, such that ufo2ft can build variable-font without
saving masters to binary.

Part of googlefonts/ufo2ft#486
@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

I've reproduced the GPOS fail. Digging into.

Here you go:
fonttools/fonttools#2229

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

Cool, thanks 👍 will get to it either soon or next week.

We're scanning all glyphs for anchors several times, propagateAnchors filter, mark feature writer and now GDEF feature writer. Maybe these should be scanned once and be kept in the compiler.

Maybe! Once we eliminated the most obvious waste, let's pick new juicy profiler-guided targets.

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

So, with your patch and my patch, it fails because of glyph names. So I'm invoking with --no-production-names.

real	1m6.709s
user	0m56.932s
sys	0m3.293s

Without your patch:

real	1m23.260s
user	1m11.965s
sys	0m3.630s

That's a %20 speedup right there. One babelfont as I predicted in fonttools/fonttools#1095 (comment) :)).

@madig
Copy link
Collaborator Author

madig commented Mar 19, 2021

Truly worthy of a 🚀. Did you test with my branch as it is right now or with the previous one where I recompiled GPOS specifically to make noto compile?

@behdad
Copy link
Collaborator

behdad commented Mar 19, 2021

Did you test with my branch as it is right now

Now. The one that just nukes all reloading. With my fonttools fix PR>

@madig
Copy link
Collaborator Author

madig commented Mar 23, 2021

Still drowning in glyph name TTX diffs. Unsure how to proceed.

@madig
Copy link
Collaborator Author

madig commented Mar 23, 2021

Maybe I shouldn't call them hacks but rather being dilligent?

What's left is to figure out why glyph renaming does not work for the VFs whose tests fail :(

@behdad
Copy link
Collaborator

behdad commented Mar 23, 2021

What's left is to figure out why glyph renaming does not work for the VFs whose tests fail :(

I'm cleaning up the Format changes today. Will get to the name stuff after.

@behdad
Copy link
Collaborator

behdad commented Mar 24, 2021

Maybe I shouldn't call them hacks but rather being dilligent?

Yep.

Which makes me think now, can we make ufo2ft use fontTools.fontBuilder for smaller tables? Then roundtrippy diligence can be concentrated and tested there.

# NOTE: We must reload the font here, as loaded tables will keep using
# old glyph names. Compilation will convert them to glyph indices,
# which we can then rename by modifying the `post` or `CFF*` tables.
otf = self.otf = _reloadFont(self.otf)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be renaming should be done after compiling the font? Compile, open the font and only modify the post table (which I suppose should load the other heavy tables). Unless this is already what is happening here.

Copy link
Collaborator Author

@madig madig Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cosimo told me to do it only when we truly want to rename glyphs, i.e. inside the rename function. For some reason this still creates mismatches with some variable font TTX dumps so I have to reload when returning from compileVariable*, too 😭 I'm sure there are better data flows (i.e. rename right from the start and give a mapping to feaLib or something) but I dunno 😢

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ufo2ft integration tests only check the TTX dump of the TTFont that comes out ufo2ft's own compile functions. Usually that's going to be saved (compiled) to disk. The compile operation will also take care dropping the glyph names if post was set to format 3.0 (the tables which had been previously decompiled will be recompiled using the old names, post 3.0 will no longer contain the glyph names, such that upon reopening the same font the glyph names will be made up by fonttools from cmap or AGL names).

In general the problem when renaming or dropping glyph names is that there may already be tables that had been loaded using the old glyph names, and thus need to be compiled using the old names. Which is why renaming has to happen as first thing before any other table has been loaded, or else one needs to first reload the font (i.e compile all tables that had been loaded using old names, or simply copy the ones that haven't been loaded and are still binary blobs in the SFNTReader), and then they can change the names on the new TTFont instance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this is already what is happening here

well, the rename operation is the last thing that the postProcessor does, and the post-processor already is supposed to be "at the end", so it is kind of already happening

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem I see is that we are compiling masters, each getting their postProcessor running, and then we merge those masters into a varfont. I think we should only run postProcessor on the varfont, not on the masters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should only run postProcessor on the varfont, not on the masters.

indeed, the postProcessor is not actually doing anything when the compileInterpolatable*FromDS funcs are being called from the respective compileVariable{TTF,CFF2} functions, we are deliberately turning off subroutinization (for CFF only) as well as glyph renaming (via useProductionNames=False).
We could make so that we skip the whole post-processor step when building the interpolatable master TTFs/OTFs and only run at the very end of the (static/variable) pipeline. Then we could keep a single reloadFont in the postProcessor, just before we rename glyphs or immediatley after we drop them (setting post 3.0), which should make sure the TTX dumps contain the expected names.
To keep it simple, we could change the compileInterpolatable*FromDS to never call the postProcessor, basically ignoring the userProductionNames parameter. The problem is those same functions are not only used internally by compileVariable*, but are also used by fontmake for the -o {o,t}tf-interpolatable targets and currently they do respect the userProductionNames parameter (even though it doesn't make much sense to do glyph renaming on these intermediate build products). Maybe one can argue that the whole point of the interpolatable binary master is to build variable fonts so ignoring the glyph renaming process should be the default and only behavior -- given that the renaming has to happen on the final variable font anyway.

@madig
Copy link
Collaborator Author

madig commented Mar 25, 2021

Here's some measurements: https://daltonmaag.github.io/pipeline-perf-tracker/results/, see the result for "git-main-ufo2ft-avoid-reloading" :)

@behdad
Copy link
Collaborator

behdad commented Mar 25, 2021 via email

@behdad
Copy link
Collaborator

behdad commented Mar 25, 2021 via email

@madig madig force-pushed the avoid-font-reloading branch 2 times, most recently from 5f58221 to 1c88cab Compare March 26, 2021 17:36
@madig
Copy link
Collaborator Author

madig commented Mar 26, 2021

The code now skips reloading fonts when compiling variable fonts until the final font object. I moved the reloading code in PostProcessor into the process_glyph_names method, where it is actually needed. This PR now depends on fontTools main.

Completely avoiding glyph renaming might need a data flow rewrite or at least invasive refactoring, potentially including changes to fontTools, so maybe that's a longer term goal.

@madig madig marked this pull request as ready for review March 26, 2021 17:39
@behdad
Copy link
Collaborator

behdad commented Mar 28, 2021

I don't know what you did, but tests seem to be passing. so lgtm for whatever it means.

@madig
Copy link
Collaborator Author

madig commented Mar 28, 2021

Then I have failed at explanations!

  1. Compiling a variable TTF or CFF2 now signals to the lesser compile functions to not run the post-processor on the intermediates.
  2. Instead of unconditionally reloading the incoming font in PostProcessor.__init__, we now reload inside the glyph processing function, where it's actually needed.

Lib/ufo2ft/__init__.py Outdated Show resolved Hide resolved
@anthrotype
Copy link
Member

I think this PR will also fix #462 /cc @punchcutter

stream = BytesIO()
font.save(stream)
stream.seek(0)
return TTFont(stream)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would passing lazy=True to TTFont constructor make any measurable difference here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can measure no appreciable wall time improvement but have not profiled in depth.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lazy=True doesn't mix well with varLib merger. Don't use for now.

This helps with not having to reload some tables for the final values.
@madig madig requested a review from anthrotype April 1, 2021 14:12
@madig madig changed the title Avoid unnecessary font reloading VF compilation: post-process fonts once Apr 1, 2021
Copy link
Member

@anthrotype anthrotype left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@madig madig merged commit a5267d1 into main Apr 1, 2021
@madig madig deleted the avoid-font-reloading branch April 1, 2021 14:27
behdad added a commit to fonttools/fonttools that referenced this pull request Apr 8, 2021
Previously otlLib was generating None if the values themselves were
empty even if the value format was non-empty.  This happened to work
for compiling to binary since the compiler handles Value=None.

But this was confusing varLib.merger module (as in when building
variable fonts from such otlLib-built master GSUB/GPOS tables, without
roundtripping to OTF/TTF binary first), because in varLib.merger,
a None means "this master doesn't provide that info; skip it"; whereas
in a PairPos table a None as generated by otlLib simply meant "all
values are zero", which is different from "this master doesn't
provide this value".

This fixes that, such that ufo2ft can build variable-font without
saving masters to binary.

Part of googlefonts/ufo2ft#486
simoncozens pushed a commit to simoncozens/fonttools that referenced this pull request Oct 22, 2021
Previously otlLib was generating None if the values themselves were
empty even if the value format was non-empty.  This happened to work
for compiling to binary since the compiler handles Value=None.

But this was confusing varLib.merger module (as in when building
variable fonts from such otlLib-built master GSUB/GPOS tables, without
roundtripping to OTF/TTF binary first), because in varLib.merger,
a None means "this master doesn't provide that info; skip it"; whereas
in a PairPos table a None as generated by otlLib simply meant "all
values are zero", which is different from "this master doesn't
provide this value".

This fixes that, such that ufo2ft can build variable-font without
saving masters to binary.

Part of googlefonts/ufo2ft#486
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants