PDF Renderer Converter

Vladimir Schneider edited this page Oct 19, 2019 · 4 revisions

flexmark-java PDF Renderer extension

Overview

The flexmark-pdf-converter module renders HTML to PDF using Open HTML To PDF, so the full conversion takes two steps: render the Markdown to HTML, then pass the HTML to the PDF converter extension.

Usage

A sample application is provided in PdfConverter.java.

Rendering non-latin character sets

The fonts bundled with the OpenHtmlToPDF library support only basic Latin Unicode characters. To render other character sets, additional fonts must be included in the HTML being converted.

A solution to the font problem is to define an embedded TrueType font in the style or stylesheet and set the body tag to use this font. OpenHtmlToPDF will use the characters from the font which has them defined.

For example, include the Noto Serif/Sans/Mono fonts and add the noto-serif, noto-sans and noto-mono families to the CSS so that the PDF can use them for rendering text.

However, the PDF converter requires TrueType fonts, and the Noto CJK fonts are OpenType fonts, which cannot be used. The solution is to download a TrueType Unicode font that supports the CJK character set and add it to the custom rendering profile used for PDF.

For my test I used arialuni.ttf from https://www.wfonts.com/font/arial-unicode-ms

If the fonts are installed in /usr/local/fonts/, add the following to the stylesheet or to the page's <head> <style>...</style> </head>:

ℹ️ Don't forget to change the path to match your OS and installation directory. On Windows the path should start with file:/X:/... where X:/... is the drive letter followed by the full installation path.

@font-face {
  font-family: 'noto-cjk';
  src: url('file:///usr/local/fonts/arialuni.ttf');
  font-weight: normal;
  font-style: normal;
}

@font-face {
  font-family: 'noto-serif';
  src: url('file:///usr/local/fonts/NotoSerif-Regular.ttf');
  font-weight: normal;
  font-style: normal;
}

@font-face {
  font-family: 'noto-serif';
  src: url('file:///usr/local/fonts/NotoSerif-Bold.ttf');
  font-weight: bold;
  font-style: normal;
}

@font-face {
  font-family: 'noto-serif';
  src: url('file:///usr/local/fonts/NotoSerif-BoldItalic.ttf');
  font-weight: bold;
  font-style: italic;
}

@font-face {
  font-family: 'noto-serif';
  src: url('file:///usr/local/fonts/NotoSerif-Italic.ttf');
  font-weight: normal;
  font-style: italic;
}

@font-face {
  font-family: 'noto-sans';
  src: url('file:///usr/local/fonts/NotoSans-Regular.ttf');
  font-weight: normal;
  font-style: normal;
}

@font-face {
  font-family: 'noto-sans';
  src: url('file:///usr/local/fonts/NotoSans-Bold.ttf');
  font-weight: bold;
  font-style: normal;
}

@font-face {
  font-family: 'noto-sans';
  src: url('file:///usr/local/fonts/NotoSans-BoldItalic.ttf');
  font-weight: bold;
  font-style: italic;
}

@font-face {
  font-family: 'noto-sans';
  src: url('file:///usr/local/fonts/NotoSans-Italic.ttf');
  font-weight: normal;
  font-style: italic;
}


@font-face {
  font-family: 'noto-mono';
  src: url('file:///usr/local/fonts/NotoMono-Regular.ttf');
  font-weight: normal;
  font-style: normal;
}

body {
    font-family: 'noto-sans', 'noto-cjk', sans-serif;
    overflow: hidden;
    word-wrap: break-word;
    font-size: 14px;
}

var,
code,
kbd,
pre {
    font: 0.9em 'noto-mono', Consolas, "Liberation Mono", Menlo, Courier, monospace;
}
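
The @font-face rules above all follow one pattern, so they can also be generated rather than written by hand. The following is a minimal sketch in plain Java, not part of flexmark-java: the class FontFaceCss and its methods fontUrl, rule and family are hypothetical names introduced here for illustration. It assumes the font files follow the Noto naming used above (for example NotoSerif-Regular.ttf, NotoSerif-Bold.ttf) and relies on Path.toUri() to produce a file: URL suited to the current OS.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of flexmark-java) that builds @font-face rules. */
final class FontFaceCss {
    private FontFaceCss() {}

    /** Converts a font directory and file name into a file: URL for use in src: url(...). */
    static String fontUrl(String fontDir, String fileName) {
        Path font = Paths.get(fontDir, fileName).toAbsolutePath().normalize();
        return font.toUri().toString();
    }

    /** Builds a single @font-face rule in the same layout as the stylesheet above. */
    static String rule(String family, String url, boolean bold, boolean italic) {
        return "@font-face {\n"
                + "  font-family: '" + family + "';\n"
                + "  src: url('" + url + "');\n"
                + "  font-weight: " + (bold ? "bold" : "normal") + ";\n"
                + "  font-style: " + (italic ? "italic" : "normal") + ";\n"
                + "}\n";
    }

    /**
     * Builds the regular, bold, bold-italic and italic rules for one family.
     * Assumes files named <filePrefix>-Regular.ttf, -Bold.ttf, -BoldItalic.ttf, -Italic.ttf.
     */
    static String family(String family, String fontDir, String filePrefix) {
        StringBuilder css = new StringBuilder();
        css.append(rule(family, fontUrl(fontDir, filePrefix + "-Regular.ttf"), false, false)).append('\n');
        css.append(rule(family, fontUrl(fontDir, filePrefix + "-Bold.ttf"), true, false)).append('\n');
        css.append(rule(family, fontUrl(fontDir, filePrefix + "-BoldItalic.ttf"), true, true)).append('\n');
        css.append(rule(family, fontUrl(fontDir, filePrefix + "-Italic.ttf"), false, true)).append('\n');
        return css.toString();
    }

    public static void main(String[] args) {
        // The font directory is an assumption; pass your own installation path as the first argument.
        String fontDir = args.length > 0 ? args[0] : "/usr/local/fonts";
        StringBuilder css = new StringBuilder();
        css.append(rule("noto-cjk", fontUrl(fontDir, "arialuni.ttf"), false, false)).append('\n');
        css.append(family("noto-serif", fontDir, "NotoSerif"));
        css.append(family("noto-sans", fontDir, "NotoSans"));
        css.append(rule("noto-mono", fontUrl(fontDir, "NotoMono-Regular.ttf"), false, false));
        System.out.print(css);
    }
}
```

The generated text can be placed in the page's <style> element together with the body and code rules shown above. Deriving the URL from a Path avoids hand-editing the file: prefix when the same code runs on different operating systems.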

Details

Use class PdfConverterExtension from artifact flexmark-pdf-converter.

The following options are available:

Defined in PdfConverterExtension class:

| Static Field | Default Value | Description |
|:-------------|:--------------|:------------|
| DEFAULT_TEXT_DIRECTION | (PdfRendererBuilder.TextDirection) null | default text direction for document |
| PROTECTION_POLICY | (ProtectionPolicy) null | document protection policy PDF Box Documentation, Related OpenHtmlToPdf |
| DEFAULT_CSS | default embedded CSS for improved TOC generation | default.css |

ℹ️ The default CSS is only added if PdfConverterExtension.exportToPdf(OutputStream, String, String, DataHolder) is used to provide options. For all other calls, you need to embed the default CSS into your HTML string before exporting to PDF. You can use PdfConverterExtension.embedCss(String html, String css) for this purpose and get the contents of the default CSS using PdfConverterExtension.DEFAULT_CSS.getFrom(null).

ℹ️ If you are using the TocExtension, you will need to set TocExtension.LIST_CLASS to PdfConverterExtension.DEFAULT_TOC_LIST_CLASS.