Speedup pdftex.map parsing. #19538

As a reminder, pdftex.map is a file that maps TeX font names ("cmr10") to filesystem font names ("cmr10.pfb"), together with additional metadata (font encoding, PostScript special commands). When using PDF output with usetex, we parse usetex-generated DVI files and then need to locate and load these fonts for embedding into the PDF file, and hence need to parse pdftex.map. On some systems (likely those with large texlive installs), pdftex.map can be really large (>10^4 entries), and parsing it is quite slow (>500ms on the matplotlib macos). This patch implements a new (simpler?) parser, which is ~25% faster (so it can cut hundreds of ms on systems with large maps). The patch additionally handles entries of the form `foo <bar.pfb` correctly (i.e., entries with no PostScript font name -- in that case the docs say that the PostScript font name is the same as the tfm name). On the other hand, the patch drops support for quotes around anything but the PostScript specials. This is in accordance with the psfonts.map docs and with the actual pdftex implementation in `src/texk/web2c/pdftexdir/mapfile.c`, where `case '"': /* opening quote */` only handles PostScript specials. See also the changes to test.map for the changes in supported syntax.
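To make the entry syntax described above concrete, here is a minimal sketch of a per-line parser. It is not the parser from this patch: `MapEntry` and `parse_map_line` are hypothetical names, only `%` comment lines are skipped, and the quoted PostScript specials are kept as a raw string rather than interpreted.

```python
import re
from typing import NamedTuple, Optional


class MapEntry(NamedTuple):
    # Hypothetical container, for illustration only.
    texname: str
    psname: str
    specials: str              # raw PostScript specials, left uninterpreted
    encoding: Optional[str]
    filename: Optional[str]


def parse_map_line(line: str) -> Optional[MapEntry]:
    line = line.strip()
    # Assumption: only blank lines and "%" comments are skipped here.
    if not line or line.startswith("%"):
        return None
    # Only the PostScript specials may be quoted; pull them out first.
    specials = " ".join(re.findall(r'"([^"]*)"', line))
    rest = re.sub(r'"[^"]*"', " ", line)
    # Glue "< file" into "<file" so each file reference is a single word.
    rest = re.sub(r"(<<|<\[|<)\s+", r"\1", rest)
    texname = psname = encoding = filename = None
    for word in rest.split():
        if word.startswith("<"):
            name = word.lstrip("<[")
            if word.startswith("<[") or name.endswith(".enc"):
                encoding = name
            else:
                filename = name
        elif texname is None:
            texname = word
        elif psname is None:
            psname = word
    if texname is None:
        return None
    # With no PostScript name given, fall back to the tfm (TeX) name.
    return MapEntry(texname, psname or texname, specials, encoding, filename)
```

With this sketch, `parse_map_line("foo <bar.pfb")` yields an entry whose `psname` is `"foo"` and whose `filename` is `"bar.pfb"`, which is the fallback case mentioned in the commit message.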

See the previous commit for a description of pdftex.map. The vast majority of entries (tens of thousands) in pdftex.map actually end up being unused, and parsing them is just wasted work. This patch takes advantage of the fact that we can quickly recover the TeX font name from a pdftex.map entry (it's just the first word), so we can very quickly build a mapping of TeX font names to unparsed pdftex.map entries, and then parse on demand only the few entries that we actually need. This speeds up e.g.
```
python -c 'from pylab import *; rcParams["text.usetex"] = True; plot(); savefig("/tmp/test.pdf")'
```
by ~700ms (~20%) on the matplotlib macos.
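The on-demand scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this patch: `LazyFontMap`, `parse_entry`, and `parse_calls` are hypothetical names, the real per-entry parser is replaced by a trivial stand-in, and letting later duplicate entries overwrite earlier ones is an arbitrary choice made for the sketch.

```python
parse_calls = []  # demo only: records which entries were actually parsed


def parse_entry(line):
    """Stand-in for the real (expensive) per-entry parser."""
    parse_calls.append(line)
    return line.split()


class LazyFontMap:
    """Map TeX font names to entries, parsing each entry on first access."""

    def __init__(self, lines):
        self._unparsed = {}
        self._parsed = {}
        for line in lines:
            # The TeX font name is just the first word of the entry.
            words = line.split(None, 1)
            if not words or words[0].startswith("%"):
                continue  # skip blank lines and "%" comments
            self._unparsed[words[0]] = line

    def __getitem__(self, texname):
        if texname not in self._parsed:
            # Raises KeyError for fonts that are not in the map at all.
            self._parsed[texname] = parse_entry(self._unparsed.pop(texname))
        return self._parsed[texname]


fonts = LazyFontMap([
    "% a comment line",
    "cmr10 CMR10 <cmr10.pfb",
    "cmmi10 CMMI10 <cmmi10.pfb",
])
entry = fonts["cmr10"]  # only this entry is parsed; cmmi10 stays unparsed
```

Building the map costs one `split` per line, and the full parse is paid only for fonts that a document actually uses; repeated lookups reuse the cached result.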


Commits on Apr 6, 2021