MIT licensed Build Status Scrutinizer Code Quality codecov


Generates pdf files without any dependencies. An MIT licensed alternative to

Targets of the compiler:

  1. Correctness
    • high unit test coverage (codecov)
    • developed following the Adobe standard (PDF 1.7)
    • tested with multiple viewers (Adobe Acrobat, Firefox, Edge, Evince)
    • tested on multiple OS (Linux, Windows)
    • output cross-referenced with other generators (ITextSharp, TCPDF)
  2. Maintainability
    • low cyclomatic complexity (scrutinizer)
    • no code smells (scrutinizer)
    • classic compiler patterns (frontend, intermediate representation (IR) & backend)
    • clearly defined purpose per namespace (see below for an overview)
  3. Small resulting file size
    • elements are only included in the file if they are referenced somewhere (ensured by the Backend)
    • a new element is only created if the content cannot be appended to an existing one (ensured by the IR)
  4. Speed of compilation
    • only a single way is provided to accomplish something
    • no HTML/CSS parsing
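
The "only included if referenced" rule can be pictured as a reachability pass over the object graph. The README does not say how the Backend enforces it, so the following is only a minimal sketch under assumptions: the object names and the `refs` mapping are hypothetical, and the traversal is a plain depth-first search.

```python
def reachable(root, refs):
    """Return the ids reachable from root; refs maps an id to the ids it references."""
    seen = set()
    stack = [root]
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(refs.get(obj, []))
    return seen


# Hypothetical object graph: the catalog references the page tree,
# one page uses one font, and a second font is registered but never used.
refs = {
    "catalog": ["pages"],
    "pages": ["page1"],
    "page1": ["font_a", "content1"],
    "font_b": [],
}

# Only these objects would be written to the file; font_b is left out.
written = reachable("catalog", refs)
```

Starting from the catalog keeps the check independent of the order in which elements were created.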

Architecture of the project

Like any compiler, it is divided primarily into Frontend, Intermediate Representation (IR) and Backend.


Frontend

The frontend contains the API the user works with. It should be easy to understand, but must not prevent the user from accessing any backend feature they may need.

To arrive at an API that is truly easy to understand and that focuses on the content the user wants to print rather than on PDF-specific details, the frontend is designed in close coordination with the document-generator project.

The document-generator project will provide a way to generate documents using an easy-to-test API which supports multiple generators. One of these generators will be this pdf-generator, another one will generate HTML/CSS.

The frontend currently builds on a Layout-Print-Transaction model.


Layout

The library provides layouts to position elements on the final document. For more complex documents, multiple layouts can be combined.


Print

Using the layout, a cursor can be positioned.

Then a print action can be executed, which prints text, images or other content inside the boundaries set by the layout.


Transaction

Before any printed content appears on the document, a transaction has to be created.

The transaction knows its position and appearance on the final document. It is therefore useful for checking that everything will appear as expected, and for positioning future layouts.

If the transaction is of the expected form, it can be committed and will be written to the final document.
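
The Layout-Print-Transaction flow described above can be sketched as follows. This is not the library's API: every class and method name here (`Document`, `ColumnLayout`, `Transaction`, `print_text`, `advance`) is hypothetical, and the fixed line height is an assumption made to keep the example short.

```python
class Document:
    """Collects the transactions that were actually committed."""

    def __init__(self):
        self.committed = []


class Transaction:
    """A prepared print that knows where it would appear on the document."""

    def __init__(self, document, text, x, y, width, height):
        self.document = document
        self.text = text
        self.x, self.y = x, y
        self.width, self.height = width, height

    def commit(self):
        # Only now does the content become part of the final document.
        self.document.committed.append(self)


class ColumnLayout:
    """Hypothetical layout: a single column with a cursor moving downwards."""

    def __init__(self, document, x, width, line_height=12.0):
        self.document = document
        self.x = x
        self.width = width
        self.line_height = line_height
        self.cursor_y = 0.0

    def print_text(self, text):
        # Printing yields a transaction; nothing is written yet.
        return Transaction(self.document, text, self.x, self.cursor_y,
                           self.width, self.line_height)

    def advance(self, transaction):
        # Position the cursor below the committed transaction.
        self.cursor_y = transaction.y + transaction.height


doc = Document()
layout = ColumnLayout(doc, x=20.0, width=200.0)
tx = layout.print_text("Hello")
if tx.height <= 20.0:  # inspect the transaction before committing
    tx.commit()
    layout.advance(tx)
```

Separating the print from the commit lets the caller discard a transaction that does not fit, for example to retry it on a new page.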


Intermediate Representation (IR)

The Intermediate Representation provides an API that is convenient for the frontend to use. Its structure could apply to any paginated document, so it does not expose PDF-specific details.


Contains the structure of the IR. It can convert the user input to data supported by the backend (text encoding, image resizing).


Backend

The Backend itself is divided into multiple parts.


Contains a minimal representation of the structures supported by the backend.
It renders content types such as text and images into streams consumable by a PDF. It creates the catalog structure of the PDF.


Contains the logical structure of a PDF. It converts this logical structure into a structure using only streams and dictionaries (the first higher-level structures of a PDF).


Contains the structure of a PDF as it can be written to a file. It can set up the file header, trailer and cross-reference table given the body of the file (streams/dictionaries). It converts the body to tokens and then writes the content of the resulting file.
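
To illustrate how the header, cross-reference table and trailer follow from the body, here is a minimal sketch that is not taken from the project. The function name `write_pdf` is hypothetical, the object bodies are passed in as ready-made bytes (so the tokenization step is skipped), and all objects are assumed to have generation 0.

```python
def write_pdf(objects, root_id=1):
    """Serialize object bodies (bytes, numbered from 1) into a complete PDF file."""
    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is always the free-list head
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # every entry is exactly 20 bytes

    out += b"trailer\n<< /Size %d /Root %d 0 R >>\n" % (len(objects) + 1, root_id)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


# A body with a catalog, a page tree and one empty A4 page.
objects = [
    b"<< /Type /Catalog /Pages 2 0 R >>",
    b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
    b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 595 842] >>",
]
pdf = write_pdf(objects)
```

The cross-reference table can only be written after the body, because each entry records the byte offset at which its object begins.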


The project will be developed in multiple phases.

Render PDF Milestone

First, the backend will be created, closely following the Adobe standard.

  • print & style text
  • print images
  • print & style drawings (lines & rectangles)
  • use TTF fonts
  • print UTF-8 text
    • check if all characters correctly included in font (like ä)
    • make font dimensions available in the IR (to measure text)

Minimal IR Base

To be able to print to the PDF sensibly, some initial work needs to be done to see what is doable and what is not.

  • calculate dimensions of text
  • place text on pages with automatic breaks
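
The two items above depend on each other: breaks can only be placed once text can be measured. A minimal sketch of greedy breaking, assuming a hypothetical `measure` callback (here a fixed 6 units per character, whereas real widths would come from the font metrics) and a fixed number of lines per page:

```python
def break_lines(text, max_width, measure):
    """Greedy line breaking: put as many words on each line as fit."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and measure(candidate) > max_width:
            lines.append(current)  # the word does not fit, start a new line
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def paginate(lines, lines_per_page):
    """Distribute lines over pages, breaking after a fixed number of lines."""
    return [lines[i:i + lines_per_page]
            for i in range(0, len(lines), lines_per_page)]


def measure(text):
    # Assumption for this sketch: every character is 6 units wide.
    return 6 * len(text)


lines = break_lines("the quick brown fox jumps", 60, measure)
pages = paginate(lines, 2)
```

A word that is wider than the line on its own is still placed on a line by itself rather than being dropped.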

Text API Milestone

Then, the public API will be defined.

text printing:

  • paragraphs
  • different styles within the same paragraph
  • measurement
  • automatic line-breaking
  • alignment (center, right-align, justify)


layout:

  • fixed-position printing (printer)
  • columns
  • rows
  • grid
  • table


styling:

  • margin
  • padding
  • border (color, thickness, stroke style)
  • background (color)

Multimedia API Milestone

Extend functionality to more use cases.

content types:

  • svg, eps, ...
  • video, audio, ... (?)


  • circles
  • polynomials

PDF features:

  • forms

Fun Milestone

What does not need to be done, but could be.


  • better line breaking (Knuth & Plass line breaking instead of greedy)
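
To show why a global approach can beat greedy breaking, here is a sketch of a minimum-raggedness dynamic program. It is a much simplified relative of the Knuth & Plass algorithm, not the algorithm itself: there is no stretchable glue and no hyphenation, widths are plain character counts, and the function name `break_optimal` is hypothetical.

```python
def break_optimal(words, max_chars):
    """Choose breaks that minimise the sum of squared trailing space per line.

    The last line is free of charge, as is usual for paragraphs.
    """
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost of laying out words[i:]
    next_break = [n] * (n + 1)       # next_break[i]: first word of the following line
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1                  # cancels the space added before the first word
        for j in range(i, n):
            length += len(words[j]) + 1
            if length > max_chars and j > i:
                break                # an over-long single word still gets its own line
            slack = max_chars - length
            cost = 0 if j == n - 1 else slack * slack + best[j + 1]
            if cost < best[i]:
                best[i], next_break[i] = cost, j + 1

    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:next_break[i]]))
        i = next_break[i]
    return lines


balanced = break_optimal(["aaa", "bb", "cc", "ddddd"], 6)
```

With a width of 6, greedy breaking would produce "aaa bb", "cc", "ddddd" and leave a very short middle line. The dynamic program instead accepts a shorter first line to obtain "aaa", "bb cc", "ddddd".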


  • compress string streams
  • optimize rectangle position (do not modify transform matrix)
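
Stream compression in PDF is commonly done with the `FlateDecode` filter, which expects zlib-compressed data. A minimal sketch using Python's standard `zlib` module; the helper name `flate_stream` and the sample content are assumptions made for illustration, not part of the project.

```python
import zlib


def flate_stream(data: bytes) -> bytes:
    """Wrap data in a PDF stream object body compressed with FlateDecode."""
    compressed = zlib.compress(data)
    # /Length must state the size of the compressed bytes, not of the original.
    header = b"<< /Length %d /Filter /FlateDecode >>\n" % len(compressed)
    return header + b"stream\n" + compressed + b"\nendstream"


# Repetitive text-drawing operators, as a page with many similar lines would produce.
content = b"BT /F1 12 Tf 72 720 Td (Hello) Tj ET\n" * 50
stream = flate_stream(content)
```

Content streams tend to repeat the same operators many times, so they usually compress well.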


fast pdf generation with a high-level api