
GPOS: feature writers should split lookups based on language system #619

Closed
cmyr opened this issue Dec 1, 2023 · 4 comments

Comments

@cmyr
Member

cmyr commented Dec 1, 2023

This is a significant project, and includes splitting based on writing direction. A good place to start will be kernFeatureWriter2.py.

@cmyr cmyr added this to the matching Oswald GPOS (mark/kern) milestone Dec 1, 2023
@rsheeter rsheeter modified the milestones: matching Oswald GPOS (mark/kern), Compile Oswald and compare to fontmake Dec 4, 2023
@belluzj

belluzj commented Jan 25, 2024

Hello, here I think you're first talking about splitting by direction (LTR vs RTL), which is necessary.

As a second step, or while you're doing this, you might be tempted to also split into one lookup per script, as we're doing in recent ufo2ft. After discussing this issue several times with Cosimo, here are my thoughts: splitting scripts into different lookups was a compile-time performance and file-size optimization, and it's not the best one available, so I don't think it should be adopted here.

Pros of splitting into one lookup per script:

  • it happens early in the pipeline (as opposed to GPOS compaction near the end) so it saves processing time early by keeping small lookup sizes and avoiding overflow resolution
  • in terms of file-size savings, it's a very good approximation of the probable best split (we observed that splitting into one lookup per script is as good for file size as GPOS compaction, which doesn't know about scripts and instead splits based on clustering the numeric data in the table)
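The per-script split can be caricatured as follows. This is a hypothetical sketch, not ufo2ft's actual API: the `split_kerning_by_script` function and the `glyph_script` mapping are invented for illustration.

```python
# Hypothetical sketch of per-script lookup splitting; names and the
# glyph-to-script mapping are illustrative, not ufo2ft's actual API.
from collections import defaultdict

def split_kerning_by_script(pairs, glyph_script):
    """Bucket (left, right, value) kern pairs into one lookup per script.

    Pairs whose two sides resolve to different concrete scripts are
    cross-script pairs; this naive split has no lookup to put them in,
    which is exactly the regression discussed below.
    """
    lookups = defaultdict(list)
    dropped = []
    for left, right, value in pairs:
        ls = glyph_script.get(left, "DFLT")
        rs = glyph_script.get(right, "DFLT")
        if ls == rs or "DFLT" in (ls, rs):
            # a DFLT side (e.g. punctuation) follows the concrete script
            script = ls if ls != "DFLT" else rs
            lookups[script].append((left, right, value))
        else:
            dropped.append((left, right, value))  # cross-script pair, lost
    return dict(lookups), dropped
```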

Con: It's not functionally equivalent to the old way of having just one big lookup for all LTR kerning, and another big lookup for all RTL kerning. The functional differences are minor but they are regressions:

  • it prevents cross-script kerning: Khaled implemented a fix here, but the fix is to go back to grouping scripts together, so I believe it loses the performance and file size benefits, in favour of correctness
  • it causes regressions in Adobe InDesign (one and two) which arguably are Adobe's fault, but unfortunately will matter to customers. I tried to fix these by registering the lookups everywhere, and while that preserves the compile-time performance and small file size, it creates a runtime performance issue, as profiled by Behdad, so it's not good either.

Proposed solution: instead of splitting into lookups, consider making one big lookup and splitting it into subtables. As done in the GPOS compaction, it's possible to split into subtables while preserving functional equivalence, and while driving down the file size. So in that respect, subtables are the best tool to split.

Making one big lookup will allow cross-script kerning, which turns out to be desirable, and will allow the Adobe InDesign dumb composer to keep working.
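The equivalence-preserving split can be sketched like this (a toy model, not fontTools' actual compaction code): if each subtable covers a disjoint set of first glyphs, every pair is handled by exactly one subtable, so the split lookup positions identically to the single big one.

```python
# Toy illustration of equivalence-preserving subtable splitting: partition
# the pairs by first glyph so subtable coverages are disjoint. The function
# name is hypothetical, not the fontTools API.

def split_by_first_glyph(kerning, buckets):
    """kerning: {(left, right): value}; buckets: disjoint sets of left glyphs."""
    subtables = []
    for bucket in buckets:
        sub = {pair: v for pair, v in kerning.items() if pair[0] in bucket}
        if sub:
            subtables.append(sub)
    return subtables
```

Because the coverages are disjoint, subtable order inside the lookup doesn't matter, unlike splits where coverages overlap.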

I'm not sure, however, which criterion is best for splitting into subtables: by script, or by clustering as in the GPOS compaction. Maybe in fontc you don't have the same constraints as in fontTools, and you could apply the GPOS compaction on the IR directly, skipping the cost of overflow resolution (assuming your IR supports bigger offsets than TTF does, which could be nice anyway: it would let big data pass at no cost from one step to the next, even if that data would require overflow resolution to be serialized to TTF).

Sorry about the long comment. FYI @anthrotype

@cmyr
Member Author

cmyr commented Jan 25, 2024

Okay that is very useful @belluzj, thank you for taking the time to spell that all out.

Doing one lookup per writing direction sounds reasonable, and we can use some heuristics to add subtable splits so as to minimize the chances that we're going to overflow at compile time. I also suspect that we want to minimize the number of subtables, since the shaper will potentially need to inspect each individual subtable, and that has runtime costs.
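One such heuristic could look like the sketch below. The per-pair byte count and header size are rough assumed figures (the real PairPos encoding is class-based and more compact), so this deliberately over-approximates to stay clear of the limit.

```python
# Sketch of a pre-emptive split heuristic: offsets within a GPOS lookup
# are 16 bits, so cap each subtable's estimated size below that limit.
# bytes_per_pair and header are rough assumptions, not exact OpenType sizes.

OFFSET_LIMIT = 0xFFFF  # 16-bit offsets within a GPOS lookup

def chunk_pairs(pairs, bytes_per_pair=10, header=64):
    """Split a flat pair list into chunks whose estimated size fits."""
    budget = OFFSET_LIMIT - header
    per_chunk = max(1, budget // bytes_per_pair)
    return [pairs[i:i + per_chunk] for i in range(0, len(pairs), per_chunk)]
```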

@belluzj

belluzj commented Jan 25, 2024

Yes exactly, you want to find the right middle ground between one big subtable and the other extreme, one tiny subtable per row of the original kerning table. The code in this PR: fonttools/fonttools#2326 finds that middle ground by starting with one tiny subtable per row and agglomerating them into bigger subtables as long as the file size goes down. It does not take shaping speed into account, only file size; I think at the time we found that shaping speed was not much affected.
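The agglomeration idea can be sketched as below. The cost model is a made-up stand-in, loosely imitating a PairPosFormat2-style class matrix (a header plus one 2-byte cell per left-by-right combination), not the real size estimate used in that PR.

```python
# Greedy agglomeration sketch: start with one subtable per kerning row
# and merge neighbours while the estimated total size strictly shrinks.
# The size model is a toy (header + one 2-byte cell per left x right
# combination), not fontTools' real cost function.

def est_size(subtable, header=8, cell=2):
    lefts = {l for l, _ in subtable}
    rights = {r for _, r in subtable}
    return header + cell * len(lefts) * len(rights)

def agglomerate(rows):
    """rows: list of ((left, right), value) kerning entries."""
    subtables = [dict([row]) for row in rows]
    merged = True
    while merged and len(subtables) > 1:
        merged = False
        for i in range(len(subtables) - 1):
            combined = {**subtables[i], **subtables[i + 1]}
            if est_size(combined) < est_size(subtables[i]) + est_size(subtables[i + 1]):
                subtables[i:i + 2] = [combined]
                merged = True
                break
    return subtables
```

With this toy model, rows that share left or right glyphs coalesce (the shared matrix amortizes the header), while an unrelated cluster stays in its own subtable once merging would inflate the matrix more than it saves in headers.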

@cmyr
Member Author

cmyr commented Mar 18, 2024

closed by #731

@cmyr cmyr closed this as completed Mar 18, 2024