PDF Exporter

Introduction

This simple library performs the task of exporting a web document to either PDF or PNG, and optionally postprocessing the result. It accomplishes this by sending a document (via url) to a Chromium-type instance, and performing either a PDF export, or snapping a screenshot.

Quick Start

<dependency>
  <groupId>nl.aerius</groupId>
  <artifactId>webdocument-exporter</artifactId>
  <version>1.0</version>
</dependency>

and make sure to have a chromium-type instance running, for example:

docker run -d --net=host -p 9222:9222 --cap-add=SYS_ADMIN justinribeiro/chrome-headless

Usage

PDF Export

To export a web page to PDF;

ExportJob job = ExportJob.create()
  .url(url)
  .print();
  
// A byte[] array of the resulting job
byte[] bytes = job.result();

// The printed document will be available at 'job.outputDocument()'
job.save();

Or, to post-process the resulting PDF document:

job
  .toProcessor()
  .target(destination)
  // Add a front page - by prepending the first page of another PDF document to the document being processed.
  // This can be useful because correctly formatting a front page inside a web document can be hard
  .frontPage(frontPageDocument)
  // Add a title to the bottom-left of the document
  .documentTitle(title)
  // Add a subtitle to the bottom-left of the document, right under the title
  .documentSubtitle(subtitle)
  // Add a Map<String, String> type meta data to the document
  .metaData(meta)
  // Add page numbers to the bottom-right of the document
  .pageNumbers()
  // Process the document using the definitions set above
  .process();

// The post-processed document will be available at 'destination'

It is possible to add custom processors to the post-processing step, via:

job
  .toProcessor()
  .documentProcessor((document) -> {
    // Manipulate the document in some way
  })
  .pageProcessor((document, page, number) -> {
    // Manipulate each page (by number) of the document in some way
  })
  .process();

It is also possible to forego the print job, and process an existing PDF, as per the following example:

PdfProcessingHandle.create(pdfDocument)
  .target(destination)
  .pageNumbers()
  .title(title)
  .process();

Snapshot Export

To export a web page to PNG;

ExportJob job = ExportJob.create()
  .url(url)
  .snapshot();

// A byte[] array of the resulting job
byte[] bytes = job.result();

// The screenshotted image will be available at 'job.outputDocument()'
job.save();

Chromium

A chromium-headless server (or fork) must be running to facilitate the exporting of the document. By default, this server is assumed to be running on localhost:9222

The simplest way to get this type of server up and running for development environments is via docker:

docker run -d --net=host -p 9222:9222 --cap-add=SYS_ADMIN justinribeiro/chrome-headless

If the chromium-headless server ends up running somewhere other than localhost:9222, it can be configured with ExportJob.chromeHost(String).

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
.github/workflows		.github/workflows
src		src
.gitignore		.gitignore
README.md		README.md
pom.xml		pom.xml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PDF Exporter

Introduction

Quick Start

Usage

PDF Export

Snapshot Export

Chromium

About

Releases

Packages

Languages

Gijsvanhorn/webdocument-exporter

Folders and files

Latest commit

History

Repository files navigation

PDF Exporter

Introduction

Quick Start

Usage

PDF Export

Snapshot Export

Chromium

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages