Nice wrapper of PDFBox in Clojure
dotemacs Upgraded PDFBox to 2.1.19
Keeping up with the stable PDFBox release
Latest commit 8d41ea3 Feb 23, 2020

pdfboxing

Clojure PDF manipulation library & wrapper for PDFBox.

Usage

Extract text

(require '[pdfboxing.text :as text])
(text/extract "test/pdfs/hello.pdf")

Merge multiple PDFs

(require '[pdfboxing.merge :as pdf])
(pdf/merge-pdfs :input ["test/pdfs/clojure-1.pdf" "test/pdfs/clojure-2.pdf"] :output "foo.pdf")

Split a PDF into multiple PDDocuments

(require '[pdfboxing.split :as pdf])

Return a list of PDDocuments for pages 1 through 8:

(pdf/split-pdf :input "test/pdfs/multi-page.pdf" :start 1 :end 8)

Split the PDF into single pages, returned as a list of PDDocuments:

(pdf/split-pdf :input "test/pdfs/multi-page.pdf")

Split the PDF in half and write the two halves to disk as clojure-1.pdf and clojure-2.pdf:

(pdf/split-pdf-at :input "test/pdfs/multi-page.pdf")

Split into two PDFs, the first containing 5 pages and the second containing the rest:

(pdf/split-pdf-at :input "test/pdfs/multi-page.pdf" :split 5)

List form fields of a PDF

To list fields and values:

(require '[pdfboxing.form :as form])
(form/get-fields "test/pdfs/interactiveform.pdf")
{"Emergency_Phone" "", "ZIP" "", "COLLEGE NO DEGREE" "", ...}
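The result shown above is an ordinary Clojure map of field names to values, so it can be processed with core functions. A minimal sketch, assuming a hypothetical sample map `fields` that stands in for the value returned by `form/get-fields` (the "12345" value is made up for illustration), lists the fields that are still blank:

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for the map returned by form/get-fields.
;; The field names come from the output above; the ZIP value is invented.
(def fields {"Emergency_Phone" ""
             "ZIP" "12345"
             "COLLEGE NO DEGREE" ""})

(defn blank-fields
  "Return a sorted vector of the field names whose value is blank."
  [field-map]
  (->> field-map
       (filter (fn [[_ v]] (str/blank? v)))
       (map key)
       sort
       vec))

(blank-fields fields)
;; => ["COLLEGE NO DEGREE" "Emergency_Phone"]
```

`blank-fields` is a helper name chosen for this sketch, not part of pdfboxing.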

Fill in PDF forms

To fill in a form's fields, supply a hash map of field names and desired values. This creates a copy of fillable.pdf as new.pdf with the fields filled in:

(require '[pdfboxing.form :as form])
(form/set-fields "test/pdfs/fillable.pdf" "test/pdfs/new.pdf" {"Text10" "My first name"})
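If the data to fill in lives in a keyword-keyed map, it can be converted to the string-keyed shape used in the example above. This is a minimal sketch in plain Clojure; `->field-map` is a hypothetical helper, and it assumes `form/set-fields` expects string keys and string values, as the example suggests:

```clojure
(defn ->field-map
  "Convert a map with keyword or string keys into a map of
  string field names to string values."
  [data]
  (into {}
        (map (fn [[k v]] [(name k) (str v)]))
        data))

(->field-map {:Text10 "My first name"})
;; => {"Text10" "My first name"}

;; The resulting map could then be passed as the third argument
;; to form/set-fields, in place of the literal map shown above.
```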

Rename form fields of a PDF

To rename PDF form fields, supply a hash map where the keys are the current names and the values are the new names:

(require '[pdfboxing.form :as form])
(form/rename-fields "test/pdfs/interactiveform.pdf" "test/pdfs/addr1.pdf" {"Address_1" "NewAddr"})
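Because the rename map goes from current names to new names, swapping its keys and values gives the map that would undo the rename. A minimal sketch using `clojure.set/map-invert`, assuming the new names are all distinct (otherwise entries would collide when inverted):

```clojure
(require '[clojure.set :as set])

;; The rename map from the example above.
(def renames {"Address_1" "NewAddr"})

;; Swap keys and values to get the reverse mapping.
(def undo-renames (set/map-invert renames))

undo-renames
;; => {"NewAddr" "Address_1"}
```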

Get page count of a PDF document

(require '[pdfboxing.info :as info])
(info/page-number "test/pdfs/interactiveform.pdf")

Get info about a PDF document

Retrieve metadata such as title, author, subject, keywords, creator, and producer:

(require '[pdfboxing.info :as info])
(info/about-doc "test/pdfs/interactiveform.pdf")

Draw lines on a PDF document

Supply the input PDF, a name for the output PDF, and the coordinates of the line, including the page number on which it should be drawn:

(require '[pdfboxing.draw :as draw])
(draw/draw-line :input-pdf "test/pdfs/clojure-1.pdf"
                :output-pdf "ninja.pdf"
                :coordinates {:page-number 0
                              :x 0
                              :y 160
                              :x1 650
                              :y1 160})
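The coordinates map can be built programmatically when several lines are needed. The sketch below assumes that `:x`/`:y` and `:x1`/`:y1` are the start and end points of the line, as the example above suggests; `horizontal-line` is a hypothetical helper, not part of pdfboxing:

```clojure
(defn horizontal-line
  "Build a coordinates map for a horizontal line at height y,
  running from x-start to x-end on the given page."
  [page-number y x-start x-end]
  {:page-number page-number
   :x x-start
   :y y
   :x1 x-end
   :y1 y})

;; Reproduces the coordinates map used in the example above.
(horizontal-line 0 160 0 650)
;; => {:page-number 0, :x 0, :y 160, :x1 650, :y1 160}
```

The returned map could be supplied as the `:coordinates` argument to `draw/draw-line`.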

Compatibility with PDFBox's PDDocuments

The following functions referenced above have direct compatibility with PDFBox's internal PDDocument type:

  • text/extract
  • pdf/split-pdf
  • form/get-fields
  • form/set-fields
  • form/rename-fields
  • info/page-number
  • draw/draw-line

This allows you to pass a PDDocument in place of the file path wherever one of these functions takes an input. This is helpful, for example, if you want to split a PDF into pages and then extract the text from only the third page:

(require '[pdfboxing.text :as text])
(require '[pdfboxing.split :as split])
(-> (split/split-pdf :input "test/pdfs/multi-page.pdf")
    (nth 2)
    text/extract)