wanglrebe/pdf2ofd

pdf2ofd

Convert PDF files to OFD (Open Fixed-layout Document) format — a Chinese national standard for electronic documents (GB/T 33190-2016).

Features

  • Full vector path conversion (SVG → OFD), supporting all SVG path commands: M L H V Q C S T A Z
  • Native OFD path commands: M L Q B A C — no lossy approximation
  • Image extraction with RGBA transparency support
  • Multi-page PDF support
  • Command-line interface and Python API

Installation

pip install pdf2ofd

Note: Requires pymupdf>=1.24.0,<1.25.0. Newer versions have SVG export issues that affect conversion accuracy.
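Because the supported PyMuPDF range is narrow, a script may want to check the installed version before converting. The following is a minimal sketch of such a check. The helper names `pymupdf_version_ok` and `installed_pymupdf_ok` are hypothetical and not part of pdf2ofd, and the parsing assumes ordinary dotted version strings such as 1.24.9.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def pymupdf_version_ok(version_str: str) -> bool:
    """Return True if version_str satisfies >=1.24.0,<1.25.0."""
    parts = []
    for piece in version_str.split(".")[:3]:
        # Keep only the leading digits of each component ("10rc1" -> 10).
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)  # treat "1.24" as "1.24.0"
    return (1, 24, 0) <= tuple(parts) < (1, 25, 0)


def installed_pymupdf_ok() -> bool:
    """Check the PyMuPDF distribution installed in the current environment."""
    try:
        return pymupdf_version_ok(version("pymupdf"))
    except PackageNotFoundError:
        return False
```

Calling `installed_pymupdf_ok()` at the start of a batch job gives an early warning if a different PyMuPDF release has been installed alongside pdf2ofd.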

Usage

Command Line

# Convert all pages
pdf2ofd input.pdf output.ofd

# Convert specific pages (0-indexed)
pdf2ofd input.pdf output.ofd --pages 0,1,2

# Suppress progress output
pdf2ofd input.pdf output.ofd --quiet

# Show version
pdf2ofd --version
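The `--pages` value is a comma-separated list of 0-indexed page numbers. As an illustration of that format only, the sketch below parses such a string into integers. `parse_pages` is a hypothetical helper written for this example; it is not taken from pdf2ofd, whose own parsing may differ.

```python
def parse_pages(spec: str) -> list[int]:
    """Turn a string such as "0,1,2" into a list of 0-indexed page numbers."""
    pages = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas such as "0,1,"
        number = int(item)  # raises ValueError on non-numeric input
        if number < 0:
            raise ValueError(f"page index must be >= 0, got {number}")
        pages.append(number)
    return pages
```

A helper like this is handy for validating user input before passing the original string on to the converter.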

Python API

from pdf2ofd import convert

# Convert all pages
convert("input.pdf", "output.ofd")

# Convert specific pages (0-indexed, comma-separated string)
convert("input.pdf", "output.ofd", pages="0,1,2")

# Suppress progress output
convert("input.pdf", "output.ofd", verbose=False)
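For converting many files at once, the `convert` function can be wrapped in a small loop. The sketch below is one possible approach. `convert_folder` and its `convert_fn` parameter are hypothetical names introduced here, and the only pdf2ofd call it relies on is `convert(input, output, verbose=False)` as shown above.

```python
from pathlib import Path


def convert_folder(src_dir, dst_dir, convert_fn=None):
    """Convert every *.pdf in src_dir to an .ofd file in dst_dir."""
    if convert_fn is None:
        # Imported lazily so the helper can also be exercised with a stub.
        from pdf2ofd import convert as convert_fn
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        target = dst / (pdf.stem + ".ofd")
        convert_fn(str(pdf), str(target), verbose=False)
        written.append(target)
    return written
```

Passing the converter in as a parameter keeps the loop testable without real PDF files; in normal use the argument is omitted and `pdf2ofd.convert` is used.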

How It Works

PDF
 ├── PyMuPDF SVG export  →  vector paths (glyphs, lines, curves)
 └── PyMuPDF image API   →  images (with RGBA/transparency)
          ↓
        OFD
  • Vector paths are extracted from PyMuPDF's SVG export and converted to OFD AbbreviatedData format
  • Images are extracted directly from PDF resources via get_image_rects(), preserving original quality and supporting multiple placements of the same image
  • All SVG path commands are mapped to native OFD equivalents — cubic Bézier (C) maps to OFD B, arcs (A) map to OFD A
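The command mapping described above can be illustrated with a small, simplified converter. This is a sketch written for this README and not the code pdf2ofd ships. It assumes absolute SVG commands only, covers M, L, H, V, Q, C and Z, and leaves out S, T, A and all relative (lowercase) commands. `svg_path_to_ofd` is a hypothetical name.

```python
import re

# Absolute SVG commands handled by this sketch, with their operand counts.
_ARITY = {"M": 2, "L": 2, "H": 1, "V": 1, "Q": 4, "C": 6, "Z": 0}
_TOKEN = re.compile(r"-?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?|[A-Za-z]")


def svg_path_to_ofd(d: str) -> str:
    """Map an SVG path string to OFD AbbreviatedData-style commands."""
    tokens = _TOKEN.findall(d)
    out = []
    x = y = sx = sy = 0.0  # current point and start of the current subpath
    cmd = None
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.isalpha():
            if tok not in _ARITY:
                raise ValueError(f"unsupported SVG command {tok!r}")
            cmd = tok
            i += 1
        elif cmd is None:
            raise ValueError(f"number {tok!r} without a command")
        n = _ARITY[cmd]
        args = [float(t) for t in tokens[i:i + n]]
        if len(args) != n:
            raise ValueError(f"command {cmd} needs {n} operands")
        i += n
        if cmd == "M":
            x, y = sx, sy = args
            out.append(f"M {x:g} {y:g}")
            cmd = "L"  # extra coordinate pairs after M are implicit line-tos
        elif cmd == "L":
            x, y = args
            out.append(f"L {x:g} {y:g}")
        elif cmd == "H":  # horizontal line: OFD has no H, so emit L
            x = args[0]
            out.append(f"L {x:g} {y:g}")
        elif cmd == "V":  # vertical line: OFD has no V, so emit L
            y = args[0]
            out.append(f"L {x:g} {y:g}")
        elif cmd == "Q":
            out.append("Q " + " ".join(f"{v:g}" for v in args))
            x, y = args[2:]
        elif cmd == "C":  # SVG cubic Bézier becomes OFD B
            out.append("B " + " ".join(f"{v:g}" for v in args))
            x, y = args[4:]
        else:  # "Z": SVG close-path becomes OFD C
            out.append("C")
            x, y = sx, sy
            cmd = None
    return " ".join(out)
```

For example, `svg_path_to_ofd("M 10 20 H 30 V 40 C 1 2 3 4 5 6 Z")` yields `M 10 20 L 30 20 L 30 40 B 1 2 3 4 5 6 C`. Note that the letter C means a cubic curve in SVG but "close" in OFD, which is why the curve is emitted as B.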

Requirements

  • Python >= 3.10
  • pymupdf >= 1.24.0, < 1.25.0
  • lxml >= 4.9
  • pillow >= 9.0

License

MIT
