Reusing fonts over multiple runs against the same target PDF document #683

Open
stechio opened this issue Mar 31, 2021 · 0 comments

[version: 1.0.9-SNAPSHOT; commit: ccd29f03ede2aecadac9c39fda95a5fedfb23645]

In my use case, I need to run OHTP multiple times against the same target PDF document. As far as I understand, this workflow raises some critical issues, especially regarding resource reuse.

Getting familiar with the configuration

According to the wiki, I can map CSS font families during renderer building; for example, reusing live instances of PDFont seems to work this way:

import com.openhtmltopdf.pdfboxout.PDFontSupplier;
import com.openhtmltopdf.pdfboxout.PdfRendererBuilder;
import org.apache.pdfbox.pdmodel.font.PDFont;

. . .

PDFont myFont = . . .;
new PdfRendererBuilder().useFont(new PDFontSupplier(myFont), "myfamily"). . .;

While the concept is cool per se, I think that mapping fonts statically one-by-one is quite limiting: it would be nice if users could alternatively pass a mapping function providing dynamic resolution of font families, something like this:

new PdfRendererBuilder().useFont((fontFamily, fontWeight, fontStyle, subset) -> {. . . return font; }). . .;
. . .

@FunctionalInterface
public interface FontMapper {
	PDFont map(String fontFamily, Integer fontWeight, FontStyle fontStyle, boolean subset);
}

public class PdfRendererBuilder extends BaseRendererBuilder<PdfRendererBuilder, PdfRendererBuilderState> {
. . .
	public PdfRendererBuilder useFont(FontMapper mapper) {
. . .
	}
}
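To make the idea more concrete, here is a minimal, self-contained sketch of how such a mapper could be built on the caller side. It is not OHTP code: the generic parameter F stands in for PDFont so that the snippet runs without PDFBox, FontStyle is simplified to a String, and the byFamily helper and the "null means no match" convention are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Illustration only: F stands in for PDFont so the sketch runs without PDFBox. */
final class FontMapperSketch {

    /** Hypothetical callback shaped like the proposed FontMapper (style simplified to a String). */
    @FunctionalInterface
    interface FontMapper<F> {
        F map(String fontFamily, Integer fontWeight, String fontStyle, boolean subset);
    }

    /** Hypothetical helper: resolves families case-insensitively from a caller-owned table. */
    static <F> FontMapper<F> byFamily(Map<String, F> fonts) {
        Map<String, F> normalized = new HashMap<>();
        fonts.forEach((family, font) -> normalized.put(family.toLowerCase(Locale.ROOT), font));
        // Assumption: returning null means "no match", so the renderer can fall back to its own lookup.
        return (family, weight, style, subset) -> normalized.get(family.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        Map<String, String> fonts = new HashMap<>();
        fonts.put("MyFont", "<font handle for MyFont>");
        FontMapper<String> mapper = byFamily(fonts);
        System.out.println(mapper.map("myfont", 400, "normal", true));  // known family
        System.out.println(mapper.map("Unknown", 400, "normal", true)); // unknown family
    }
}
```

The point of the callback form is that the lookup logic (here a simple table, but it could equally consult a font directory or a document-level registry) stays in user code instead of being enumerated family by family at build time.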

At this point, I think my API exploration gave me enough black-box clues to try OHTP, so let's start evaluating some common scenarios...

Evaluating common scenarios

To probe the limits and capabilities of OHTP, I have arranged some font reuse scenarios, in which multiple source HTML files referencing the same fonts are rendered against the same target PDF document. I read about metrics caching in the wiki, but wasn't sure whether it also covers some kind of font caching that reuses font structures: if a font has already been imported into the target PDF document, subsequent runs should recognize the imported structure rather than import it again.

For the sake of simplicity, in these test cases I rendered the same source HTML file ("/path/to/source.html") twice against the same target PDF document.

1. Font referenced by source HTML files (via @font-face rules): BAD

When a font is referenced by source HTML files via @font-face rules, the resulting PDF has two copies of the font structure (one per run!), which is obviously unacceptable: is there any workaround?

Having no means to pass the state of loaded fonts across runs, OHTP is oblivious to resource recurrence: even if a font was loaded from the same location in the previous run against the same target PDF document, it will be happily loaded again and again...

It's essential that the mapping of font families to corresponding PDFont instances is shared across multiple runs to affect the font resolver: if a font family referenced by a @font-face rule (see MainFontStore.addFontFaceFont(..)) was already mapped, the resource loader should get the already-existing PDFont instance from the map; otherwise, it should load the font from the location specified by the @font-face rule and put it into the map.
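The get-or-load behaviour described above could look roughly like the following sketch. SharedFontRegistry is a hypothetical class, not part of OHTP; F again stands in for PDFont, and keying by font family alone is a simplification (a real implementation would presumably also have to account for weight and style).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical registry tied to one target document; F stands in for PDFont. */
final class SharedFontRegistry<F> {
    private final Map<String, F> fontsByFamily = new HashMap<>();

    /** Returns the font already mapped to the family, loading and recording it only on first use. */
    F resolve(String family, String location, Function<String, F> loader) {
        return fontsByFamily.computeIfAbsent(family, unused -> loader.apply(location));
    }

    public static void main(String[] args) {
        SharedFontRegistry<String> registry = new SharedFontRegistry<>();
        int[] loads = {0};
        // Stand-in loader that counts how often the font file would actually be read.
        Function<String, String> loader = location -> {
            loads[0]++;
            return "font loaded from " + location;
        };
        // Two "runs" against the same target document share the same registry.
        for (int run = 1; run <= 2; run++) {
            registry.resolve("MyFont", "/path/to/MyFont.ttf", loader);
        }
        System.out.println("loads: " + loads[0]); // prints "loads: 1"
    }
}
```

Since the registry outlives a single renderer, the second run gets back the very same instance instead of embedding a second copy.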

Test source:

<html>
<head>
<style>
@font-face {
    font-family: 'MyFont';
    src: url(/path/to/MyFont.ttf);
}
body {
    font-family: 'MyFont';
    font-size: 24pt
}
</style>
</head>
<body>
    <p>HELLO WORLD</p>
</body>
</html>

Test code:

try (PDDocument doc = new PDDocument()) {
    FSCacheEx<String, FSCacheValue> cache = new FSDefaultCacheStore();
    try (PdfBoxRenderer renderer = new PdfRendererBuilder().withFile(new java.io.File("/path/to/source.html"))
            .usePDDocument(doc).useCacheStore(CacheStore.PDF_FONT_METRICS, cache)
            .buildPdfRenderer()) {
        renderer.createPDFWithoutClosing();
    }
    try (PdfBoxRenderer renderer = new PdfRendererBuilder().withFile(new java.io.File("/path/to/source.html"))
            .usePDDocument(doc).useCacheStore(CacheStore.PDF_FONT_METRICS, cache)
            .buildPdfRenderer()) {
        renderer.createPDFWithoutClosing();
    }
    try (OutputStream os = new FileOutputStream("/path/to/target.pdf")) {
        doc.save(os);
    }
}

2. Mapping (via PdfRendererBuilder.useFont(..)) a resourceless font family: GOOD

When a resourceless font family (that is, one with no @font-face rule behind it) is mapped via PdfRendererBuilder.useFont(..), the font family is matched by the supplied font and the resulting PDF has just one copy of the font structure.

Test source:

<html>
<head>
<style>
body {
    font-family: 'MyFont';
    font-size: 24pt
}
</style>
</head>
<body>
    <p>HELLO WORLD</p>
</body>
</html>

Test code:

try (PDDocument doc = new PDDocument()) {
    FSCacheEx<String, FSCacheValue> cache = new FSDefaultCacheStore();
    PDFontSupplier fontSupplier = new PDFontSupplier(PDType0Font.load(doc, new File("/path/to/MyFont.ttf")));
    try (PdfBoxRenderer renderer = new PdfRendererBuilder().withFile(new java.io.File("/path/to/source.html"))
            .usePDDocument(doc).useCacheStore(CacheStore.PDF_FONT_METRICS, cache)
            .useFont(fontSupplier, "MyFont")
            .buildPdfRenderer()) {
        renderer.createPDFWithoutClosing();
    }
    try (PdfBoxRenderer renderer = new PdfRendererBuilder().withFile(new java.io.File("/path/to/source.html"))
            .usePDDocument(doc).useCacheStore(CacheStore.PDF_FONT_METRICS, cache)
            .useFont(fontSupplier, "MyFont")
            .buildPdfRenderer()) {
        renderer.createPDFWithoutClosing();
    }
    try (OutputStream os = new FileOutputStream("/path/to/target.pdf")) {
        doc.save(os);
    }
}

3. Overriding (via PdfRendererBuilder.useFont(..)) a font referenced by source HTML files (via @font-face rules): BAD

PdfRendererBuilder.useFont(..) fails to override a (possibly missing) font referenced by source HTML files. @font-face rules are processed before fonts supplied via PdfRendererBuilder.useFont(..), so they take precedence in FontFamily.match(int, IdentValue) when candidates within the same font family ("MyFont") are evaluated: the fonts are appended to an ArrayList (com.openhtmltopdf.outputdevice.helper.FontFamily._fontDescriptions) in execution order and evaluated (FontFamily.getStyleMatches(IdentValue, List<T>)) in that same order.

Because the matching mechanism picks the first candidate and ignores all others, when the @font-face-imported font fails to load, OHTP falls back to the default serif font instead of using the alternative supplied font associated with the same font family (in this case, "MyFont").

It's essential that fonts supplied via PdfRendererBuilder.useFont(..) take precedence over source CSS rules.
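As a rough illustration of the desired precedence, the sketch below orders the candidates of one font family by origin rather than by registration order, and skips candidates that fail to load. All names (FontPrecedenceSketch, Origin, Candidate, pick) are hypothetical and F stands in for PDFont; this is not how FontFamily is implemented today, just one way the selection could behave.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Supplier;

final class FontPrecedenceSketch {

    /** Declaration order doubles as precedence: user-supplied fonts come first. */
    enum Origin { USER_SUPPLIED, CSS_FONT_FACE }

    /** One candidate within a font family; F stands in for PDFont. */
    static final class Candidate<F> {
        final Origin origin;
        final Supplier<F> loader; // assumption: returns null when loading fails

        Candidate(Origin origin, Supplier<F> loader) {
            this.origin = origin;
            this.loader = loader;
        }
    }

    /** Picks the first candidate that loads, trying user-supplied fonts before CSS-imported ones. */
    static <F> Optional<F> pick(List<Candidate<F>> candidates) {
        return candidates.stream()
                // Stable sort: registration order is preserved within the same origin.
                .sorted(Comparator.comparing((Candidate<F> c) -> c.origin))
                .map(c -> c.loader.get())
                .filter(Objects::nonNull)
                .findFirst();
    }

    public static void main(String[] args) {
        List<Candidate<String>> myFont = new ArrayList<>();
        // Registered first, as @font-face rules are, but with a bad location.
        myFont.add(new Candidate<>(Origin.CSS_FONT_FACE, () -> null));
        myFont.add(new Candidate<>(Origin.USER_SUPPLIED, () -> "user-supplied MyFont"));
        System.out.println(pick(myFont).orElse("default serif"));
    }
}
```

With this ordering, the default serif font is only reached when no candidate of the family loads at all.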

Test source:

<html>
<head>
<style>
@font-face {
    font-family: 'MyFont';
    src: url(bad/path/to/force/load/failure);
}
body {
    font-family: 'MyFont';
    font-size: 24pt
}
</style>
</head>
<body>
    <p>HELLO WORLD</p>
</body>
</html>

Test code:

try (PDDocument doc = new PDDocument()) {
    FSCacheEx<String, FSCacheValue> cache = new FSDefaultCacheStore();
    try (PdfBoxRenderer renderer = new PdfRendererBuilder().withFile(new java.io.File("/path/to/source.html"))
            .usePDDocument(doc).useCacheStore(CacheStore.PDF_FONT_METRICS, cache)
            .useFont(new File("/path/to/MyFont.ttf"), "MyFont")
            .buildPdfRenderer()) {
        renderer.createPDFWithoutClosing();
    }
    try (PdfBoxRenderer renderer = new PdfRendererBuilder().withFile(new java.io.File("/path/to/source.html"))
            .usePDDocument(doc).useCacheStore(CacheStore.PDF_FONT_METRICS, cache)
            .useFont(new File("/path/to/MyFont.ttf"), "MyFont")
            .buildPdfRenderer()) {
        renderer.createPDFWithoutClosing();
    }
    try (OutputStream os = new FileOutputStream("/path/to/target.pdf")) {
        doc.save(os);
    }
}

Summary

To recap, here is my proposal for effective font management over multiple OHTP runs against the same target PDF document ("user-supplied fonts" means fonts registered via PdfRendererBuilder.useFont(..), while "CSS-imported fonts" means fonts imported via @font-face rules):

  • font reuse (to solve font structure duplicates): font families should map to corresponding PDFont instances. PdfBoxFontResolver should keep track of both user-supplied and CSS-imported PDFont instances, making them available to the next PdfRendererBuilder targeting the same PDF document. If a font family referenced by a @font-face rule (see MainFontStore.addFontFaceFont(..)) was already mapped, the resource loader should get the already-existing PDFont instance from the map; otherwise, it should load the font from the location specified by the @font-face rule and put it into the map.
  • font override (to allow full control over the font resolution process): user-supplied fonts should take precedence over CSS-imported ones.