pdfcpu: a golang pdf processor

Package pdfcpu is a simple PDF processing library written in Go with support for encryption. It provides both an API and a CLI, and supports all PDF versions up to 1.7 (ISO 32000).

Motivation

The goal is to reduce the size of large PDF files for mass mailings by optimizing them down to the bare minimum. This is achieved by analyzing a PDF's cross reference table, removing redundant embedded resources such as font files or images, and always writing the file back with maximum PDF compression. I also wanted my own Swiss Army knife for PDFs, written entirely in Go, that lets me trim, split and merge PDF content.
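
The idea of removing redundant embedded resources can be illustrated with a small sketch. It is not pdfcpu's optimizer, which may identify duplicates differently; the `resource` type and the `dedupe` function are hypothetical names introduced here. The sketch assumes two resources are redundant when their bytes are identical and records which object numbers could be redirected to an earlier copy.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// resource is a hypothetical stand-in for an embedded object
// such as a font file or an image stream.
type resource struct {
	objNr int
	data  []byte
}

// dedupe maps the object number of every duplicate resource to the
// object number of the first resource with identical bytes.
// Resources are compared by their SHA-256 digest.
func dedupe(resources []resource) map[int]int {
	first := make(map[[sha256.Size]byte]int)
	redirect := make(map[int]int)
	for _, r := range resources {
		sum := sha256.Sum256(r.data)
		if keep, seen := first[sum]; seen {
			redirect[r.objNr] = keep
			continue
		}
		first[sum] = r.objNr
	}
	return redirect
}

func main() {
	res := []resource{
		{objNr: 4, data: []byte("font-A")},
		{objNr: 9, data: []byte("font-A")},
		{objNr: 12, data: []byte("image-B")},
	}
	// Object 9 duplicates object 4, so references to 9 could point at 4.
	fmt.Println(dedupe(res)) // map[9:4]
}
```

After such a pass, the duplicate objects no longer need to be written, which is where the size reduction comes from.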

Features

Validate (validates PDF files up to version 1.7)
Read (builds xref table from PDF file)
Write (writes xref table to PDF file)
Optimize (gets rid of redundancies like duplicate fonts, images)
Split (split a multi page PDF file into single page PDF files)
Merge (merges a set of PDF files into one consolidated PDF file)
Extract Images (extract all embedded images of a PDF file into a given dir)
Extract Fonts (extract all embedded fonts of a PDF file into a given dir)
Extract Pages (extract specific pages into a given dir)
Extract Content (extract the PDF source into a given dir)
Trim (generate a custom version of a PDF file)
Manage (add, remove, list, extract) embedded file attachments
Encrypt (sets password protection)
Decrypt (removes password protection)
Change user/owner password
Manage (add, list) user access permissions

Demo Screencast

Installation

Required build version: go1.8 and up

go get github.com/hhrutter/pdfcpu/cmd/...

Usage

pdfcpu validate [-verbose] [-mode strict|relaxed] [-upw userpw] [-opw ownerpw] inFile
pdfcpu optimize [-verbose] [-stats csvFile] [-upw userpw] [-opw ownerpw] inFile [outFile]
pdfcpu split [-verbose] [-upw userpw] [-opw ownerpw] inFile outDir
pdfcpu merge [-verbose] outFile inFile...
pdfcpu extract [-verbose] -mode image|font|content|page [-pages pageSelection] [-upw userpw] [-opw ownerpw] inFile outDir
pdfcpu trim [-verbose] -pages pageSelection [-upw userpw] [-opw ownerpw] inFile outFile

pdfcpu attach list [-verbose] [-upw userpw] [-opw ownerpw] inFile
pdfcpu attach add [-verbose] [-upw userpw] [-opw ownerpw] inFile file...
pdfcpu attach remove [-verbose] [-upw userpw] [-opw ownerpw] inFile [file...]
pdfcpu attach extract [-verbose] [-upw userpw] [-opw ownerpw] inFile outDir [file...]

pdfcpu encrypt [-verbose] [-mode rc4|aes] [-key 40|128] [-perm none|all] [-upw userpw] [-opw ownerpw] inFile [outFile]
pdfcpu decrypt [-verbose] [-upw userpw] [-opw ownerpw] inFile [outFile]
pdfcpu changeupw [-verbose] [-opw ownerpw] inFile upwOld upwNew
pdfcpu changeopw [-verbose] [-upw userpw] inFile opwOld opwNew

pdfcpu perm list [-verbose] [-upw userpw] [-opw ownerpw] inFile
pdfcpu perm add [-verbose] [-perm none|all] [-upw userpw] -opw ownerpw inFile

pdfcpu version
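
For callers who prefer to drive the CLI from Go, the following is a minimal sketch that assembles a `pdfcpu validate` invocation from the flags listed above and runs it with `os/exec`. The helper name `validateArgs` and the file name `in.pdf` are assumptions made for illustration, and the sketch assumes the `pdfcpu` binary is on the `PATH`.

```go
package main

import (
	"fmt"
	"os/exec"
)

// validateArgs builds the argument list for "pdfcpu validate".
// It uses only flags shown in the usage listing; flags precede inFile.
// An empty mode or userPW leaves the corresponding flag out.
func validateArgs(inFile, mode, userPW string) []string {
	args := []string{"validate"}
	if mode != "" {
		args = append(args, "-mode", mode)
	}
	if userPW != "" {
		args = append(args, "-upw", userPW)
	}
	return append(args, inFile)
}

func main() {
	args := validateArgs("in.pdf", "strict", "")
	// CombinedOutput captures both stdout and stderr of the command.
	out, err := exec.Command("pdfcpu", args...).CombinedOutput()
	if err != nil {
		fmt.Println("pdfcpu failed:", err)
	}
	fmt.Print(string(out))
}
```

The same approach extends to the other commands by swapping the subcommand and flags for those in the listing.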

Please read the documentation

Status

Version: 0.1.9

The extraction API has been redesigned to return the extracted data rather than writing it out itself.
How the extracted data is processed is up to the API consumer.

func ImageData(ctx *types.PDFContext, objNr int) (*types.ImageObject, error)
func FontData(ctx *types.PDFContext, objNr int) (*types.FontObject, error)
func ContentData(ctx *types.PDFContext, objNr int) (data []byte, err error)

Contributing

Please open an issue if you find a bug or want to propose a change.
Pull requests, bug fixes and issues are always welcome.

Disclaimer

Usage of pdfcpu assumes you know about and respect all copyrights of any PDF content you may be processing. This applies to the PDF files as such, their content and in particular all embedded resources like font files or images. Credit goes to Renee French for creating our beloved Gopher.

License

MIT
