A poppler-based PDF processing tool to extract document data and save it in EDN format. It supports:
- Font and glyph remapping via user-defined font map configurations (in JSON format). These allow glyph substitutions for Type 1 or TrueType fonts with invalid or incorrect Unicode tables, and even for embedded CID fonts with missing tables.
- Path data extraction.
- Transformed image output, written directly to disk in PNG format.
- Annotations.
- PDF outlines.
Process a PDF document and write its output to output_file.edn:

pdftoedn -o output_file.edn input_file.pdf
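
When the tool is called from a script rather than a shell, the invocation above can be wrapped as in the following minimal Python sketch. It uses only the `-o` option shown above. The helper names `build_command` and `convert` are hypothetical, and defaulting the output name to the input name with an `.edn` suffix is an assumption of this sketch, not documented behavior of pdftoedn.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path, edn_path):
    """Return the argument list for: pdftoedn -o <edn_path> <pdf_path>."""
    return ["pdftoedn", "-o", str(edn_path), str(pdf_path)]


def convert(pdf_path, edn_path=None):
    """Run pdftoedn on one PDF and return the output path.

    Raises FileNotFoundError if pdftoedn is not on PATH, and
    subprocess.CalledProcessError if it exits with a non-zero code.
    """
    pdf_path = Path(pdf_path)
    # Assumption of this sketch: name the output after the input file.
    edn_path = Path(edn_path) if edn_path else pdf_path.with_suffix(".edn")
    if shutil.which("pdftoedn") is None:
        raise FileNotFoundError("pdftoedn not found on PATH")
    subprocess.run(build_command(pdf_path, edn_path), check=True)
    return edn_path
```

Calling `convert("input_file.pdf", "output_file.edn")` runs the same command as above. With `check=True`, a non-zero exit raises `CalledProcessError`, whose `returncode` attribute can be compared against the exit error code reference in the wiki.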
Refer to the wiki for:
- more usage examples.
- exit error code reference.
- installation instructions.
- output file format.
- overview of font map substitution and sample font configuration file.
- list of internal font maps and internal glyph maps.