
PDFImgEx

A command line tool for extracting image files from a PDF file.

Usage

PDFImgEx PDF_File_Name [Output_Path] [-o(verwrite)] [-t(itle) Title_Prefix]

PDF_File_Name: Name of the PDF file to parse.

Output_Path: Directory to write the images to. Optional; if omitted, images are written to the same directory as the source file.

-overwrite: Overwrite files in the output directory that have the same name as an extracted image. Optional; by default this is disabled and any image whose file name already exists is skipped.

-title: Add an optional prefix to the output image file names. Image files are normally named _Page_TotalImageCount.Format, where Page is the page in the document, TotalImageCount is a sequential count of images in the document, and Format is the file format extension.
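
The naming and overwrite rules above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual code: the function names image_file_name and should_write are hypothetical, and the sketch assumes that numbers are not zero-padded and that the title prefix is placed directly before the leading underscore.

```python
from pathlib import Path


def image_file_name(page: int, total_image_count: int, fmt: str, title_prefix: str = "") -> str:
    """Build a name of the form [Title_Prefix]_Page_TotalImageCount.Format."""
    # Assumption: no zero-padding, and the prefix sits directly before the underscore.
    return f"{title_prefix}_{page}_{total_image_count}.{fmt}"


def should_write(target: Path, overwrite: bool = False) -> bool:
    """Mirror the documented -overwrite rule: skip existing files unless overwriting."""
    return overwrite or not target.exists()


if __name__ == "__main__":
    print(image_file_name(3, 7, "jp2"))            # _3_7.jp2
    print(image_file_name(3, 7, "jp2", "Report"))  # Report_3_7.jp2
```

Under these assumptions, the seventh image in the document, found on page 3 and stored as JPEG 2000, would be written as _3_7.jp2, or Report_3_7.jp2 with a title prefix of Report.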

Dependencies

Requires the PDFImageExtract.Core library, which in turn requires iTextSharp 5.5.

https://github.com/seaweedfactory/PDFImageExtract.Core

File Conversion

Images are written in the format in which they are stored in the PDF. The resulting files often require conversion before they can be used in other programs. Use another tool, such as ImageMagick, to do this:

https://www.imagemagick.org

For example, if .jp2 (JPEG 2000) files are created, run the following ImageMagick command in the output directory to convert all of them to PNG files:

magick mogrify -format png *.jp2
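
If the conversion needs to be scripted, for instance as a step after extraction, the same idea can be sketched in Python by calling magick once per file. This is a minimal sketch, not part of PDFImgEx: the helper names conversion_commands and convert_all are hypothetical, and it assumes ImageMagick 7 is installed so that the magick command is available on PATH.

```python
import shutil
import subprocess
from pathlib import Path


def conversion_commands(directory, src_ext=".jp2", dst_ext=".png"):
    """Return one `magick <input> <output>` argument list per matching file."""
    commands = []
    for src in sorted(Path(directory).glob(f"*{src_ext}")):
        # The output keeps the same base name and only changes the extension.
        commands.append(["magick", str(src), str(src.with_suffix(dst_ext))])
    return commands


def convert_all(directory, src_ext=".jp2", dst_ext=".png"):
    """Run every conversion and return the number of files converted."""
    if shutil.which("magick") is None:
        raise RuntimeError("ImageMagick 7 ('magick') was not found on PATH")
    commands = conversion_commands(directory, src_ext, dst_ext)
    for command in commands:
        subprocess.run(command, check=True)  # raises if a conversion fails
    return len(commands)
```

Building the command lists separately from running them makes it easy to inspect what would be converted before anything is executed.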