[ttx] Option to preserve PUSHB/PUSHW etc #92

behdad · 2014-01-19T09:24:40Z

From Werner:

"""
[FontTools-2.4-696-gfea81ee]

I'm not happy that the TrueType instruction dump in ttx no longer
shows the difference between PUSHB and PUSHW. I agree that this
information isn't relevant to the functionality, however, it's a
valuable information while debugging and comparing dumps.

Similarly, it is quite helpful to see the formats of tables (Coverage,
GlyphClassDef, etc.)

Can you revert that change? Or maybe add an option to ttx which
increases the verbosity of table dumps in general.
"""

And avoid merging consequent pushes. No option to control this externally right now, but the logic is there. Part of #92

avpanov · 2014-12-02T01:33:47Z

Please, revert the behavior of FontTools-2.4. Translation of the PUSHB, PUSHW, NPUSHB and NPUSHW instructions to non-existing PUSH one makes ttx incompatible with xgridfit which utilizes fonttols in some cases.

behdad · 2014-12-02T01:41:47Z

Can you please explain exactly how xgridfit uses FontTools?

avpanov · 2014-12-02T02:29:17Z

See http://xgridfit.sourceforge.net/convert.html

schriftgestalt · 2016-03-09T20:05:19Z

I agree that this information isn't relevant to the functionality,

In the case there is a JMPR command, the difference is indeed vital to the fictionality of the code. Compiling the instruction again will change the length of the code and thous invalidates the jump address. This will break all TTF generated with Glyphs.

schriftgestalt · 2016-03-09T20:09:27Z

And an option to switch of the optimisation is not sufficient as everybody is using it to import/export the fonts with ttx will get broken fonts and will not know to set that flag. Maybe the other way around, add an option to optimise (and check if there is a jump command).

lemzwerg · 2016-03-09T20:23:00Z

Indeed, I haven't thought of JMPR.

On the other hand: Making ttx squeeze a sequence of PUSHB and PUSHW as much as possible would be a good thing as an optimization, in case it can also properly adjust JMPR offsets.

schriftgestalt · 2016-03-09T20:26:54Z

If someone figures out how to fix the jump offsets I’m more than happy with the optimisation.

behdad · 2016-03-10T00:45:11Z

Ok I definitely had not thought about JMPR instruction.

Note that the original "optimize at write time" was done before my time. What I did was change the bytecode to reflect that. Ie. before, bytecode was being corrupted silently at write time!

I think we should disable this feature and enable it as an optimization option somehow.

Fixing the jump offset is NOT impossible, but a lot more work than this feature entails. I suppose maybe we can still do this for the burst of push operations at the beginning of each bytecode string but never in the middle?

lemzwerg · 2016-03-10T06:49:49Z

Yes. It shouldn't be too difficult to scan for JMPR instructions to find out the negative offset where PUSHX optimization has to stop. Or, as a starter, simply refrain from PUSHX optimization as soon as a JMPR instruction is found.

justvanrossum · 2016-03-10T06:57:38Z

Wow, it seems FontTools has indeed been broken for long time... (I never thought of JMPR, either.) And indeed, disabling the optimization when a JMPR instruction is there makes sense.

schriftgestalt · 2016-03-10T08:17:21Z

Or, as a starter, simply refrain from PUSHX optimization as soon as a JMPR instruction is found.

I suppose maybe we can still do this for the burst of push operations at the beginning of each bytecode string but never in the middle?

You need to disable optimization before the jump target. The jump address is an absolute code address. It is only safe after the the biggest jump address. Because that is not known at the beginning of the parsing, the optimization (and potentially the offset corrections) need to be done in a second step. First parse with preserve=True and then run the optimization in a second step. For now, it could just disable the optimization run if a jump was found. The jump will most likely be used in the prep or fpm but not in the instructions per glyph. So "lost" space saving is minimal.

lemzwerg · 2016-03-10T08:28:41Z

Well, addresses for jumps might be stored in registers or in CVT. In other words, you need a real bytecode compiler that considers all branches before optimization is really possible.

anthrotype · 2016-03-10T08:47:58Z

thanks @schriftgestalt for catching this bug!
If any of you have an idea how to keep the PUSH optimization without breaking the JMPR instructions, please send a pull request. In the meantime, I think it'd safer to just disable the optimization altogether.
Shall we set preserve=True by default for now?

justvanrossum · 2016-03-10T09:55:35Z

Btw. I'm not happy with the optimization happening in the decompile phase. That way you'll never see what changed if you round-trip through TTX. Yes, earlier, FontTools was already broken with its previous PUSHB and PUSHW optimization, but at least one could see if a font had changed when recompiled. So the current situation is worse in that it silently breaks things and makes it impossible to detect.

This is somewhat similar to the recent changes with the Format thing in the OTL subtables. Sure, the compiler may decide with format to generate, but by omitting the Format field in the TTX output one can no longer see if something has changed since recompilation. Which was a main purpose of TTX to begin with.

justvanrossum · 2016-03-11T10:24:42Z

I think the differences in our views can be summarized as follows:

TTX is an instruction for how to compile a font (Behdad)
TTX is a truthful dump from an existing font (Just)
Of course it's a bit of both, but I think we're losing important (debugging/analysis) functionality if we move away too much from 2 through abstraction.

justvanrossum · 2016-03-11T11:20:13Z

I want people to be able to use TTX to debug other tools. What did font editor XYZ emit? PUSHB or PUSHW? Which OTL subtable format did it generate? That has always been a core feature of TTX and I'd hate to lose that. These subtleties need not always be round-trippable by FontTools (because of optimizations or other reasons) but I want to be able to detect them. Sure, FontTools has not always been consistent with that view (eg. the 'loca' offsets have never been dumped to TTX), but I think the PUSHB/PUSHW abstraction takes TTX to an abstraction level that is too high.

behdad · 2016-03-15T23:22:25Z

The jump address is an absolute code address.

I just checked. The jump address is a relative offset.

twardoch · 2016-03-16T03:20:29Z

I am with Werner and Just on the “philosophical” level. TTX has been an important debugging tool in the past.

When Adobe, Apple, Mcrosoft, Tiro and myself were, in 2001, working on an alternative XML representation of SFNT during a two-day sprint in Redmond (one that got fully specified with an XSD schema but never really got any tools implemented apart from some portions of Apple's ftx tools), we had a big debate on whether fields in SFNT tables like head or OS/2 should be stored as attributes of a big XML element or as separate XML elements.

Microsoft’s Paul Nelson pushed for attributes while I argued for elements (actually I proposed just using TTX). I was overruled. Of course that notation never got properly, widely adopted while TTX has, even in AFDKO (despite Adobe’s Eric Muller being at that meeting back then). One of the “I told you” moments for me.

The reason I argued for elements was that the TTX notation stored the sequence of the fields which was useful because it reflected the SFNT structure more closely, so it was more low-level and allowed a person to follow the SFNT more closely, be educated about SFNT, and find problems easier. Code that was to be written could emit a partial table on decompilation and then fail, and you as a person could find the place of the flaw easier. And elements could hold their own optional attributes including those from other namespaces, which would be more future-proof. I argued all those points.

Paul backed the solution where any unique fields only needed to be expressed as attributes of a larger element (think <head version="1.0" unitsPerEm="2048" xMax="986" macStyle="0"> etc.). We had a very heated debate on this :) He named certain advantages of the approach. He mostly made the point that the structure was for authoring, and as such, the order of the fields within a table was not significant in the source notation, and that a tool should take care of it.

Basically, Paul wanted a smarter tool and a simpler notation, I wanted a more detailed notation and a dumber tool.

In the end, I think TTX prevailed not only because it was already written and was implemented in Python (rather than some C#), but also because Just’s original design of the format was the “just right” compromise between the “low-level debugginess” and “portability” of the TTX format.

Just’s format sits between the needs of super-clever engineers like Behdad who may prefer a simpler format because they can do super-clever tools, and of designers/coders like myself who can do copy-paste and learn simple programming techniques but often need simple “serialized” dumps of data to make sense of what’s going on. I think Just designed TTX just right because he’s “not a rocket scientist” if you know what I mean.

Also — after >10 years of the virtual domination of FontLab 4.x/Studio 5 as a font-building tool, where most .otfs and .ttfs were built by just one app, we now have a multitude of libraries for font building and manipulation, i.e. a multitude of potential binary outputs from seemingly the same “sources”. The debugging functionality of TTX is just as useful now as it was useful when Just originally wrote it.

justvanrossum · 2016-03-20T19:47:24Z

Any progress on this? I actually think the PUSH optimizations should be removed completely (possibly reimplemented as a separate tool), and that the instruction disassembly should again show exactly what's in the bytecode.

schriftgestalt · 2016-03-20T21:26:50Z

I have send a pull request.

justvanrossum · 2016-03-21T07:08:56Z

Merged. Thanks Georg. Leaving the issue open as the preserve=False route is still broken.

justvanrossum · 2016-03-21T07:25:32Z

Also: the issue of the Format attribute of OTL subtables has not yet been addressed. I propose to delete all lines

del self.Format # Don't need this anymore

in otTables.py. I can work on a pull request.

behdad · 2016-03-21T07:51:39Z

A pull request to removw the del format change lgtm, so feel free to merge.

I really want to expand on the design choices we make on this, but I've been busy and then on vacation. Will try to get my perspective in soon but definitely don't want to block anyone.

…the compiler but is useful for debugging. Part of fixing #92.

justvanrossum · 2016-03-21T11:13:57Z

Thanks, committed as ad386ee. I was a little surprised how little tests needed to be fixed (just one), I hope I didn't overlook anything.

behdad added a commit that referenced this issue Jan 19, 2014

Add logic for keeping PUSHB/PUSHW distinction

a36c74e

And avoid merging consequent pushes. No option to control this externally right now, but the logic is there. Part of #92

behdad referenced this issue Mar 11, 2016

[TTGlyphPen] Add component flags argument

fd5b6de

behdad mentioned this issue Mar 16, 2016

old ttProgram.py used to check and fix PUSHs mnemonics #248

Closed

justvanrossum added a commit that referenced this issue Mar 21, 2016

Keep Format attribute for OT subtables. This value may be ignored by …

ad386ee

…the compiler but is useful for debugging. Part of fixing #92.

behdad mentioned this issue Mar 23, 2021

[feaLib] Test that G* table compilation does not change data #2236

Merged

behdad closed this as completed Apr 1, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[ttx] Option to preserve PUSHB/PUSHW etc #92

[ttx] Option to preserve PUSHB/PUSHW etc #92

behdad commented Jan 19, 2014

avpanov commented Dec 2, 2014

behdad commented Dec 2, 2014

avpanov commented Dec 2, 2014

schriftgestalt commented Mar 9, 2016

schriftgestalt commented Mar 9, 2016

lemzwerg commented Mar 9, 2016

schriftgestalt commented Mar 9, 2016

behdad commented Mar 10, 2016

lemzwerg commented Mar 10, 2016

justvanrossum commented Mar 10, 2016

schriftgestalt commented Mar 10, 2016

lemzwerg commented Mar 10, 2016

anthrotype commented Mar 10, 2016

justvanrossum commented Mar 10, 2016

justvanrossum commented Mar 11, 2016

justvanrossum commented Mar 11, 2016

behdad commented Mar 15, 2016

twardoch commented Mar 16, 2016

justvanrossum commented Mar 20, 2016

schriftgestalt commented Mar 20, 2016

justvanrossum commented Mar 21, 2016

justvanrossum commented Mar 21, 2016

behdad commented Mar 21, 2016

justvanrossum commented Mar 21, 2016

[ttx] Option to preserve PUSHB/PUSHW etc #92

[ttx] Option to preserve PUSHB/PUSHW etc #92

Comments

behdad commented Jan 19, 2014

avpanov commented Dec 2, 2014

behdad commented Dec 2, 2014

avpanov commented Dec 2, 2014

schriftgestalt commented Mar 9, 2016

schriftgestalt commented Mar 9, 2016

lemzwerg commented Mar 9, 2016

schriftgestalt commented Mar 9, 2016

behdad commented Mar 10, 2016

lemzwerg commented Mar 10, 2016

justvanrossum commented Mar 10, 2016

schriftgestalt commented Mar 10, 2016

lemzwerg commented Mar 10, 2016

anthrotype commented Mar 10, 2016

justvanrossum commented Mar 10, 2016

justvanrossum commented Mar 11, 2016

justvanrossum commented Mar 11, 2016

behdad commented Mar 15, 2016

twardoch commented Mar 16, 2016

justvanrossum commented Mar 20, 2016

schriftgestalt commented Mar 20, 2016

justvanrossum commented Mar 21, 2016

justvanrossum commented Mar 21, 2016

behdad commented Mar 21, 2016

justvanrossum commented Mar 21, 2016