GPOS pack fails (or hangs with uharfbuzz) when LookupList overflow needs splitting a SinglePos (type 1) subtable #4091

@AmitMY

Summary

When the GPOS table contains enough large SinglePos Format 2 subtables to overflow the LookupList's uint16 offsets, fontTools' compile fails because the dispatch table for overflow recovery has no entry for lookup type 1 — splitSinglePos is commented out at Lib/fontTools/ttLib/tables/otTables.py:2514.

The user-visible failure mode depends on whether uharfbuzz is installed, but the underlying gap is the same:

  • Without uharfbuzz: TTFont.save() raises OTLOffsetOverflowError (after the splitter logs Don't know how to split GPOS lookup type 1 and tryResolveOverflow gives up). At least it terminates with a clear failure.
  • With uharfbuzz installed: fontTools delegates to hb.repack first, which raises RepackerError. fontTools then falls back to the same (incomplete) splitter, but the outer retry loop in BaseTTXConverter.compile never gives up: it keeps retrying with the same unsplittable overflow record, so save() hangs indefinitely.

Same shape as #1328 (the type 6 / MarkMarkPos variant, open since 2020 — splitMarkMarkPos is also missing from the dispatch table).
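
To make the hang concrete, here is a minimal sketch of a retry loop that gives up once the same overflow record fails to resolve twice. It is a plain-Python model, not fontTools code: `compile_with_retry`, `try_pack`, `try_resolve` and `UnresolvableOverflow` are hypothetical names invented for illustration, and the overflow record is assumed to be hashable.

```python
from typing import Callable, Hashable, Optional, Set


class UnresolvableOverflow(RuntimeError):
    """Hypothetical error: the same overflow could not be resolved."""


def compile_with_retry(
    try_pack: Callable[[], Optional[Hashable]],
    try_resolve: Callable[[Hashable], bool],
    max_attempts: int = 1000,
) -> int:
    """Pack repeatedly until no overflow is reported.

    try_pack returns None on success, or a hashable overflow record.
    try_resolve returns True if it changed the table, so a retry is useful.
    Returns the number of packing attempts that were needed.
    """
    unresolved: Set[Hashable] = set()
    for attempt in range(1, max_attempts + 1):
        record = try_pack()
        if record is None:
            return attempt
        if try_resolve(record):
            continue  # the table changed, so packing again may succeed
        if record in unresolved:
            # Second failure on the same record: retrying cannot help.
            raise UnresolvableOverflow(f"cannot resolve overflow {record!r}")
        unresolved.add(record)
    raise UnresolvableOverflow(f"gave up after {max_attempts} attempts")
```

With a guard of this shape, an unsplittable record ends in an exception after two attempts instead of looping; the `max_attempts` cap is an extra safety net and an assumption of this sketch.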

Verification matrix

Both arms verified on fresh venvs.

| fontTools | uharfbuzz | Result with the reproducer below |
| --- | --- | --- |
| 4.62.1 | not installed | OTLOffsetOverflowError after ~70 s ❌ |
| 4.62.1 | 0.54.1 | hangs indefinitely ❌ |
| 4.63.0 | not installed | OTLOffsetOverflowError after ~70 s ❌ |
| 4.63.0 | 0.54.1 | hangs indefinitely ❌ |

The bare-fontTools packer does handle smaller GPOS that hb.repack would refuse (e.g. N=800 in this repro), so swapping in/out uharfbuzz is not a real workaround — past the bare-packer threshold the only difference is whether you get a clean exception or a hang.

Reproducer (self-contained)

"""N SinglePos Format 2 lookups, each with M glyphs in coverage.
N=3500 saves cleanly without uharfbuzz. N=4000 fails as described."""
import time
import fontTools
from fontTools.ttLib import TTFont, newTable
from fontTools.ttLib.tables import otTables as ot

N_LOOKUPS = 4000
N_GLYPHS = 15000

font = TTFont()
font.setGlyphOrder([".notdef"] + [f"G{i:05d}" for i in range(N_GLYPHS)])

lookups = []
for i in range(N_LOOKUPS):
    cov = ot.Coverage()
    cov.glyphs = [f"G{j:05d}" for j in range(N_GLYPHS)]
    vr = ot.ValueRecord()
    vr.XPlacement = -(i + 1)
    sp = ot.SinglePos()
    sp.Format = 2
    sp.Coverage = cov
    sp.ValueFormat = 1
    sp.Value = [vr] * N_GLYPHS
    lk = ot.Lookup()
    lk.LookupType = 1
    lk.LookupFlag = 0
    lk.SubTable = [sp]
    lookups.append(lk)

feat = ot.Feature()
feat.FeatureParams = None
feat.LookupListIndex = list(range(N_LOOKUPS))
frec = ot.FeatureRecord(); frec.FeatureTag = "kern"; frec.Feature = feat
fl = ot.FeatureList(); fl.FeatureRecord = [frec]

ls = ot.DefaultLangSys()
ls.LookupOrder = None; ls.ReqFeatureIndex = 0xFFFF; ls.FeatureIndex = [0]
sc = ot.Script(); sc.DefaultLangSys = ls; sc.LangSysRecord = []
srec = ot.ScriptRecord(); srec.ScriptTag = "DFLT"; srec.Script = sc
sl = ot.ScriptList(); sl.ScriptRecord = [srec]

ll = ot.LookupList(); ll.Lookup = lookups
gpos = ot.GPOS()
gpos.Version = 0x00010000
gpos.ScriptList = sl; gpos.FeatureList = fl; gpos.LookupList = ll

t = newTable("GPOS"); t.table = gpos
font["GPOS"] = t

print(f"fontTools {fontTools.__version__}")
t0 = time.time()
font.save("/tmp/repro_out.ttf")
print(f"saved in {time.time() - t0:.1f} s")
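
Because one arm of the matrix never returns, it can help to run the reproducer in a child process with a timeout. The following is a minimal sketch using only the standard library; `run_with_timeout` and the file name `repro.py` are hypothetical, and the timeout value is a judgment call (the non-hanging arm took about 70 s here).

```python
import subprocess
import sys
from typing import List


def run_with_timeout(argv: List[str], timeout_s: float) -> str:
    """Run a command and classify the outcome as 'ok', 'error' or 'timeout'."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising, so nothing lingers.
        return "timeout"
    return "ok" if proc.returncode == 0 else "error"


# Example call, assuming the reproducer was saved as repro.py (hypothetical name):
# print(run_with_timeout([sys.executable, "repro.py"], timeout_s=600))
```

A "timeout" result then corresponds to the hang, and "error" to the clean OTLOffsetOverflowError exit.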

Without uharfbuzz — observed traceback

Traceback (most recent call last):
  File ".../fontTools/ttLib/tables/otBase.py", line 438, in getData
    items[i] = packUShort(item.subWriter.pos - pos)
  File ".../fontTools/ttLib/tables/otBase.py", line 862, in packUShort
    return struct.pack(">H", value)
struct.error: 'H' format requires 0 <= number <= 65535

During handling of the above exception, another exception occurred:

  ...
  File ".../fontTools/ttLib/tables/otBase.py", line 445, in getData
    raise OTLOffsetOverflowError(overflowErrorRecord)
fontTools.ttLib.tables.otBase.OTLOffsetOverflowError: ('GPOS', 'LookupIndex:', <n>, 'SubTableIndex:', None, 'ItemName:', None, 'ItemIndex:', None)
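
The struct.error at the top of this traceback is the uint16 limit itself: an offset above 65535 cannot be packed with the ">H" format. A small standalone check (the helper name `fits_uint16` is made up for this sketch) shows the boundary:

```python
import struct


def fits_uint16(offset: int) -> bool:
    """Return True if the offset can be packed as a big-endian uint16."""
    try:
        struct.pack(">H", offset)
    except struct.error:
        return False
    return True
```

Here `fits_uint16(65535)` is True and `fits_uint16(65536)` is False, which is the condition the packer hits before raising OTLOffsetOverflowError.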

With uharfbuzz — stderr emits forever

hb.repack failed to serialize 'GPOS', attempting fonttools resolutions ; the error message was: RepackerError
Don't know how to split GPOS lookup type 1
Don't know how to split GPOS lookup type 1
... (never returns)

The hb.repack failed warning is from Lib/fontTools/ttLib/tables/otBase.py:198 in tryPackingHarfbuzz; the Don't know how to split log comes from fixSubTableOverFlows dispatching through the (incomplete) splitTable dict.
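
As a rough illustration of that dispatch, here is a toy model of a per-table, per-lookup-type split table with a missing entry. It is not the fontTools source: `SPLIT_TABLE`, `fix_subtable_overflow` and the stub splitter are hypothetical stand-ins, and only the log message is taken from the output above.

```python
import logging

log = logging.getLogger("split_sketch")


def split_pair_pos_stub(subtable: object) -> bool:
    """Placeholder for a real splitter; pretends the split succeeded."""
    return True


# Toy dispatch table: GPOS type 2 has a splitter, type 1 deliberately has none.
SPLIT_TABLE = {"GPOS": {2: split_pair_pos_stub}}


def fix_subtable_overflow(table_tag: str, lookup_type: int, subtable: object) -> bool:
    """Look up a splitter; report failure when the table has no entry."""
    try:
        split_func = SPLIT_TABLE[table_tag][lookup_type]
    except KeyError:
        log.error("Don't know how to split %s lookup type %s", table_tag, lookup_type)
        return False
    return split_func(subtable)
```

In this model a type 1 subtable always yields False, so a caller that retries unconditionally on False would never make progress.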

Possible fixes

Independent, additive:

  1. Implement splitSinglePos (currently commented out at otTables.py:2514). SinglePos Format 2's ValueArray is 1:1 with the Coverage, so splitting it along the Coverage chunks is structurally simple — splitPairPos looks like a close template. This fix benefits both arms.
  2. Make the retry loop give up when tryResolveOverflow returns False on the same overflow record twice. Right now the loop just retries forever (only visible when uharfbuzz is installed; without it the bare packer re-raises naturally). Either re-raise OTLOffsetOverflowError or break with a clear error.
  3. Optional: investigate the hb.repack RepackerError on this input shape. fontTools' bare packer handles e.g. N=800 in this repro fine; with uharfbuzz installed the same input hangs because hb.repack raises and the fallback splitter can't recover. That looks like a regression / blind spot worth a parallel issue against uharfbuzz / harfbuzz.
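
The structural idea behind fix 1 can be sketched without fontTools. The model below uses a plain dataclass as a stand-in for a SinglePos Format 2 subtable and assumes a simple halving strategy; `SinglePosF2` and `split_single_pos_f2` are hypothetical names, and a real splitSinglePos would have to operate on otTables objects and the overflow record instead.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SinglePosF2:
    """Stand-in for a SinglePos Format 2 subtable (not a fontTools class)."""

    glyphs: List[str]  # Coverage, in glyph order
    values: List[int]  # one value (e.g. XPlacement) per covered glyph


def split_single_pos_f2(old: SinglePosF2) -> Tuple[SinglePosF2, SinglePosF2]:
    """Split a subtable in half, keeping coverage and values aligned 1:1."""
    if len(old.glyphs) != len(old.values):
        raise ValueError("coverage and value array must have the same length")
    if len(old.glyphs) < 2:
        raise ValueError("need at least two glyphs to split")
    mid = len(old.glyphs) // 2
    first = SinglePosF2(old.glyphs[:mid], old.values[:mid])
    second = SinglePosF2(old.glyphs[mid:], old.values[mid:])
    return first, second
```

Because each glyph keeps its own value, the two halves together position exactly the same glyphs as the original subtable.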

Why this matters (context)

SinglePos Format 1 (one shared value across the whole coverage) is the natural compact encoding when every glyph in a ChainContextPos input coverage gets the same adjustment — but harfbuzz silently drops such lookups at scale, so the documented workaround is to emit SinglePos Format 2 with N identical ValueRecords. This is what volt2ttf does, and what we ended up doing in sign-language-processing/signwriting-fonts#5 for the 2D SignWriting font's GPOS. We hit this bug trying to scale that font's coordinate range past 150 X × 150 Y positions — the full 250–749 range produces ~3000 of these SinglePos lookups and pushes us over the LookupList uint16 cap.
