PDF2JSON is a conversion library based on XPDF (3.02) that can be used for high-performance, page-by-page conversion of PDF files to JSON and XML. It also supports compressing the output to minimize its size. PDF2JSON is available for Windows, OSX and Linux. See https://flowpaper.com for more information.
Latest commit 330067c Jul 14, 2017 @flexpaper committed on GitHub Update pdf2json.cc
Version bump

PDF2JSON

PDF2JSON is a conversion library based on XPDF (3.02) that can be used for high-performance, page-by-page conversion of PDF files to JSON and XML. It also supports compressing the output to minimize its size. PDF2JSON is available for Windows and Linux.

Usage

Usage: pdf2json [options] <PDF-file> [<output-file>]
  -f <int>          : first page to convert
  -l <int>          : last page to convert
  -compress         : Use compressed mode
  -q                : don't print any messages or errors
  -h                : print usage information
  -help             : print usage information
  -i                : ignore images
  -noframes         : use standard output
  -xml              : output for XML post-processing
  -split <int>      : split the output into separate files on every X page. 
                      use '%' as part of output file name to specify where page number should appear (e.g. Paper_%.js)
  -hidden           : output hidden text
  -enc <string>     : output text encoding name
  -v                : print copyright and version info
  -opw <string>     : owner password (for encrypted files)
  -upw <string>     : user password (for encrypted files)
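
When pdf2json is driven from a script, it can help to assemble the command line programmatically. The following is a minimal sketch in Python, not part of PDF2JSON itself: the helper name `build_pdf2json_args` and its parameters are hypothetical, and it only emits flags that appear in the usage text above. It assumes the executable is called `pdf2json` and is on the PATH.

```python
from typing import List, Optional


def build_pdf2json_args(
    pdf_path: str,
    output_path: Optional[str] = None,
    first_page: Optional[int] = None,
    last_page: Optional[int] = None,
    compress: bool = False,
    quiet: bool = False,
    ignore_images: bool = False,
    split_every: Optional[int] = None,
    encoding: Optional[str] = None,
    user_password: Optional[str] = None,
) -> List[str]:
    """Build an argument list from the flags documented in the usage text.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if first_page is not None and last_page is not None and first_page > last_page:
        raise ValueError("first_page must not exceed last_page")
    # Design choice of this sketch (possibly stricter than the tool itself):
    # when splitting, insist on a '%' placeholder in the output name so each
    # chunk gets its own file name, e.g. Paper_%.js.
    if split_every is not None and (output_path is None or "%" not in output_path):
        raise ValueError("-split needs an output file name containing '%'")

    args = ["pdf2json"]  # assumed executable name
    if first_page is not None:
        args += ["-f", str(first_page)]
    if last_page is not None:
        args += ["-l", str(last_page)]
    if compress:
        args.append("-compress")
    if quiet:
        args.append("-q")
    if ignore_images:
        args.append("-i")
    if split_every is not None:
        args += ["-split", str(split_every)]
    if encoding is not None:
        args += ["-enc", encoding]
    if user_password is not None:
        args += ["-upw", user_password]

    # Options come first, then the PDF file, then the optional output file.
    args.append(pdf_path)
    if output_path is not None:
        args.append(output_path)
    return args


if __name__ == "__main__":
    # Placeholder file names; prints the command instead of running it.
    print(" ".join(build_pdf2json_args("Paper.pdf", "Paper_%.js",
                                       compress=True, split_every=10)))
```

The returned list can be passed to `subprocess.run(args, check=True)` once the binary is installed. Building a list rather than a shell string avoids quoting problems with file names that contain spaces.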

Compiling & Installing on Linux

./configure
make
sudo make install
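
After `make install`, a quick way to confirm that the binary is reachable is to look it up on the PATH and ask for its version banner with the `-v` flag from the usage text. The sketch below is illustrative only: the function name `pdf2json_version` is hypothetical, and it assumes the install step placed an executable named `pdf2json` on the PATH. Whether the banner is written to standard output or standard error is not stated here, so both streams are collected.

```python
import shutil
import subprocess
from typing import Optional


def pdf2json_version(executable: str = "pdf2json") -> Optional[str]:
    """Return the text printed by `<executable> -v`, or None if it is not on PATH.

    Hypothetical helper for checking an installation.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True)
    # Combine both streams because the banner's destination is an unknown here.
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    banner = pdf2json_version()
    if banner is None:
        print("pdf2json was not found on PATH")
    else:
        print(banner)
```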

Installing on Windows

Download the latest installer (.msi file) and run through the installation steps.

Compiling on Windows

Make sure you have Visual Studio installed, then run ms_make.bat.

License

GNU GPL v2


For the latest news about this PDF to JSON converter, see the FlowPaper home page (https://flowpaper.com).