penquify

OCR in reverse. Make your documents worse.

A Python toolkit that generates photorealistic smartphone photos of logistics documents
with coffee stains, folds, blur, skew — and verified ground truth for every field.

penquify.com · Docs · GitHub

From Chilean slang "penca" (lousy, worse) — because your document photos should look realistically bad, not studio-perfect.


How it works

You start with data. Penquify builds everything else.

ERP purchase order          penquify generates           penquify generates
(or any JSON/PDF)    ──►    dispatch guide PDF     ──►   realistic photos
                            with supplier jargon,        with verified
                            unit mismatches,             ground truth +
                            realistic discrepancies      occlusion manifest

You don't build the PDF. You don't design the document. You give penquify a purchase order number (OC, from the Spanish "orden de compra"), a JSON payload, or an existing PDF, and it:

  1. Generates a realistic document with supplier-style names (not your ERP master data names), realistic unit mismatches (CJ vs KG, UN vs L), and configurable quantity discrepancies
  2. Renders a clean PDF from Jinja2 templates (dispatch guides, invoices, POs, BOLs)
  3. Produces N photorealistic photos — each a different failure mode (blur, fold, stain, crop, angle)
  4. Verifies every field by blind-extracting from the photo and comparing programmatically against source data
  5. Generates an occlusion manifest explaining which fields are hidden in each variation and why
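
Steps 4 and 5 come down to a field-by-field comparison between what can be read back from the photo and the source data. A minimal sketch of that idea, assuming the blind extraction has already produced a plain dict (the extractor is not shown). `verify_fields` and the report layout are hypothetical illustrations, not penquify's actual API or manifest format:

```python
from typing import Optional


def verify_fields(source: dict, extracted: dict[str, Optional[str]]) -> dict:
    """Compare blind-extracted values against source data, field by field.

    Hypothetical helper: a value of None (or a missing key) in `extracted`
    means the field could not be read from the photo.
    """
    report = {"match": [], "mismatch": [], "occluded": []}
    for field, expected in source.items():
        got = extracted.get(field)
        if got is None:
            report["occluded"].append(field)
        elif str(got).strip() == str(expected).strip():
            report["match"].append(field)
        else:
            report["mismatch"].append(field)
    return report


source = {"doc_number": "00847291", "date": "16/04/2026", "oc_number": "4500000316"}
# Pretend the header was cropped (doc_number unreadable) and one digit was misread.
extracted = {"doc_number": None, "date": "16/04/2026", "oc_number": "4500000816"}

report = verify_fields(source, extracted)
print(report)
```

Fields that land in `occluded` are the ones an occlusion manifest would need to explain (cropped, stained, folded over), while `mismatch` flags a photo whose visible content no longer agrees with the source.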

Before → After

[Image pair: clean PDF (auto-generated) | realistic photo (verified)]

Every variation from the same document

[Image gallery: full_picture, folded_skewed, strong_oblique, coffee_stain, stain + angle, galaxy_s7]

8 built-in presets, plus unlimited custom variations via JSON or natural language.

# From scratch — penquify generates the document AND the photos
penquify demo

# From an existing PDF — penquify detects the schema and generates variations
penquify upload --image existing_invoice.pdf

# From a description — no JSON needed
penquify config --text "folded paper with grease, shot on old Motorola"

Getting Started

Install

pip install penquify

# or from source
git clone https://github.com/MAXMARDONES/penquify.git
cd penquify && pip install -e ".[all]"

# browser engine for HTML → PDF rendering
playwright install chromium

Environment

export GEMINI_API_KEY="your-key"   # required for photo generation
export PENQUIFY_OUTPUT="./output"  # where files go (default: ~/penquify-output)
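
A script that wraps penquify can check these two variables up front. A minimal sketch, assuming only the variable names and the default path documented above; `resolve_settings` is a hypothetical helper, not part of penquify:

```python
import os
from pathlib import Path


def resolve_settings(env=None) -> dict:
    """Read the two documented variables from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "")
    # Fall back to the documented default output location.
    output_dir = Path(env.get("PENQUIFY_OUTPUT", "~/penquify-output")).expanduser()
    return {
        "api_key": api_key,
        "output_dir": output_dir,
        # Photo generation needs the key; PDF rendering is not documented as needing it.
        "can_generate_photos": bool(api_key),
    }


settings = resolve_settings({"PENQUIFY_OUTPUT": "./output"})
print(settings["output_dir"], settings["can_generate_photos"])
```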

Run

# Full demo: PDF + 8 photo variations
penquify demo

# PDF only from JSON
penquify pdf --doc-json invoice.json

# Photos from any document image
penquify photos --image scan.png --presets full_picture blurry coffee_stain

# Full dataset: 10 documents x 3 variations each
penquify dataset --doc-json docs.json --presets full_picture folded_skewed blurry
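
The dataset command pairs every document with every preset, so 10 documents and 3 presets give 30 photos. The sketch below enumerates that plan in plain Python. The document numbers and the file name pattern are illustrative assumptions, not penquify's actual output layout:

```python
from itertools import product

# Illustrative document numbers, zero-padded to 8 digits.
doc_numbers = [f"{847291 + i:08d}" for i in range(10)]
presets = ["full_picture", "folded_skewed", "blurry"]

# One job per (document, preset) pair; the file name pattern is an assumption.
jobs = [
    {"doc_number": doc, "preset": preset, "filename": f"{doc}_{preset}.jpg"}
    for doc, preset in product(doc_numbers, presets)
]

print(len(jobs))
print(jobs[0])
```

Sizing the run this way is useful before launching it, since each photo is a separate image-generation call.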

Python

import asyncio

from penquify.models import Document, DocHeader, DocItem, PhotoVariation, Stain
from penquify.generators.pdf import generate_document_files
from penquify.generators.photo import generate_dataset

doc = Document(
    header=DocHeader(doc_type="guia_despacho", doc_number="00847291", date="16/04/2026",
                     emitter_name="ACME FOODS LTDA.", oc_number="4500000316"),
    items=[
        DocItem(pos=1, code="AF-001", description="FROZEN POTATO WEDGES",
                qty=12, unit="CJ", unit_price=15000, total=180000),
    ],
)


async def main():
    # Both generators are awaited, so they must run inside an event loop.
    files = await generate_document_files(doc, "output/")
    # Feed the rendered PNG into the photo generator, one photo per preset.
    return await generate_dataset(files["png"], preset_names=["full_picture", "blurry"])


photos = asyncio.run(main())
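
In the example above the line total equals qty × unit_price (12 × 15000 = 180000). A small sketch that checks this on hand-written input and serializes the document to JSON. It uses plain dicts that mirror the fields shown above; whether `--doc-json` accepts exactly this layout is an assumption, and `inconsistent_items` is a hypothetical helper:

```python
import json

doc = {
    "header": {
        "doc_type": "guia_despacho",
        "doc_number": "00847291",
        "date": "16/04/2026",
        "emitter_name": "ACME FOODS LTDA.",
        "oc_number": "4500000316",
    },
    "items": [
        {"pos": 1, "code": "AF-001", "description": "FROZEN POTATO WEDGES",
         "qty": 12, "unit": "CJ", "unit_price": 15000, "total": 180000},
    ],
}


def inconsistent_items(document: dict) -> list[int]:
    """Return the positions of items whose total differs from qty * unit_price."""
    return [
        item["pos"]
        for item in document["items"]
        if item["qty"] * item["unit_price"] != item["total"]
    ]


bad = inconsistent_items(doc)
payload = json.dumps(doc, indent=2, ensure_ascii=False)
print(bad)
```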

Document Templates

Template        Description                                            Status
guia_despacho   Chilean dispatch guide (guia de despacho electronica)  Done
factura_sii     Chilean tax invoice (DTE tipo 33, SII XML)             Planned
purchase_order  Standard purchase order                                Planned
bill_of_lading  Transport bill of lading (BOL)                         Planned
nota_credito    Credit note (DTE tipo 61)                              Planned
remito          Argentine dispatch note                                Planned

Templates are Jinja2 HTML — add your own:

penquify pdf --template my_template.html --doc-json data.json

Photo Variations

A fixed system instruction handles base realism (paper physics, camera behavior, operational context). The variation config controls specifics. Every field is optional — override only what you need.

8 Built-in Presets

Preset          What it tests
full_picture    Baseline: clean handheld shot, 90% frame coverage
folded_skewed   Geometric distortion: dog-ear, crease, 6° tilt
zoomed_detail   Close-up OCR: tight crop, oblique 25-30°
blurry          Motion blur: rushed capture, partial legibility
cropped_header  Missing data: top 10-15% cut off
strong_oblique  Extreme angle: 45°, strong curvature
coffee_stain    Contamination: stain over text
stapled_stack   Multi-page: stapled with sheets behind

Full Variation Schema

{
  "name": "my_variation",
  "camera": "Samsung Galaxy S8",
  "year_device_style": "2017 Android",
  "aspect_ratio": "4:3",
  "document_coverage": "90% of frame",
  "background": "blurred warehouse at edges",
  "curvature": "slight",
  "folds": "dog_ear",
  "wrinkles": "medium",
  "angle": "45 degree oblique",
  "skew": "strong",
  "rotation_degrees": 8,
  "motion_blur": true,
  "glare": "strong",
  "shadow_from_hand": true,
  "jpeg_compression": "heavy",
  "hand_visible": true,
  "grip_type": "both hands",
  "glove": "warehouse glove",
  "stain": {"type": "coffee", "location": "upper_right", "opacity": "heavy", "text_obstruction": "partial"},
  "cropped_header": true,
  "stapled": true,
  "stacked_sheets_behind": 2
}

Every string field is free text — cameras, angles, backgrounds, grip types. Use presets or write whatever describes your scenario.
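
Because every field is optional, a custom variation can be expressed as a base config plus a few overrides. A minimal sketch with plain dicts, assuming nothing about penquify's internals: the `base` values are illustrative rather than the real preset definition, and `override` is a hypothetical helper:

```python
import copy
import json

# Illustrative base values; the real preset definitions live in penquify itself.
base = {
    "name": "coffee_stain",
    "camera": "Samsung Galaxy S8",
    "stain": {"type": "coffee", "location": "upper_right", "opacity": "heavy"},
}


def override(variation: dict, **changes) -> dict:
    """Return a copy of `variation` with `changes` applied.

    Nested dicts (such as "stain") are merged one level deep instead of replaced.
    """
    merged = copy.deepcopy(variation)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key].update(value)
        else:
            merged[key] = value
    return merged


custom = override(base, name="my_variation", angle="45 degree oblique",
                  stain={"opacity": "light"})
print(json.dumps(custom, indent=2))
```

The resulting dict uses only keys from the schema above and can be saved as JSON.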

22 Camera Presets (+ free text)

galaxy_s7 galaxy_s8 galaxy_a5_2017 moto_g5 iphone_7 iphone_8 pixel_2 huawei_p10 xiaomi_note4 galaxy_s9 iphone_xr galaxy_a10 galaxy_a50 iphone_11 galaxy_a21s iphone_12 pixel_4a galaxy_a13 iphone_14 pixel_7 warehouse_generic field_worker

Or any free text: PhotoVariation(camera="Nokia 3310 with cracked screen")
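
One way to picture the "preset or free text" rule is a lookup that falls back to the raw string. This is a sketch only: the preset names are copied from the list above, but `resolve_camera` and its case-insensitive matching are assumptions, not how penquify is documented to behave:

```python
CAMERA_PRESETS = {
    "galaxy_s7", "galaxy_s8", "galaxy_a5_2017", "moto_g5", "iphone_7", "iphone_8",
    "pixel_2", "huawei_p10", "xiaomi_note4", "galaxy_s9", "iphone_xr", "galaxy_a10",
    "galaxy_a50", "iphone_11", "galaxy_a21s", "iphone_12", "pixel_4a", "galaxy_a13",
    "iphone_14", "pixel_7", "warehouse_generic", "field_worker",
}


def resolve_camera(value: str) -> tuple[str, str]:
    """Classify a camera value as a built-in preset name or free text."""
    key = value.strip().lower()
    if key in CAMERA_PRESETS:
        return ("preset", key)
    # Anything else is passed through unchanged as a free-text description.
    return ("free_text", value.strip())


print(resolve_camera("iphone_11"))
print(resolve_camera("Nokia 3310 with cracked screen"))
```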

Natural Language Config

Don't know the schema? Just describe it:

import asyncio

from penquify.generators.config import text_to_variation

# text_to_variation is awaited, so run it inside an event loop.
config = asyncio.run(text_to_variation(
    "blurry photo with coffee stain, strong angle, old Samsung, paper folded in half"
))
# → returns valid PhotoVariation JSON

REST API

uvicorn penquify.api.server:app --port 8080

Method  Path                    Description
POST    /generate/document      Document JSON → PDF + PNG
POST    /generate/photos        Image → realistic photos
POST    /generate/dataset       Document → PDF → photos (full pipeline)
POST    /generate/config        Natural language → variation JSON
GET     /documents              List generated runs
GET     /documents/{id}/{file}  Download file
GET     /presets                Photo presets
GET     /templates              Document templates
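
A client for these endpoints needs nothing beyond the standard library. The sketch below builds a GET request for `/presets` and a JSON POST for `/generate/config`. The paths and port come from the table and command above; the `"text"` key in the request body is a guess, so check the server's actual request schema before relying on it:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # matches the uvicorn command above


def build_request(path: str, payload=None) -> urllib.request.Request:
    """Build a GET request, or a JSON POST request when a payload is given."""
    url = f"{BASE_URL}{path}"
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )


def send(request: urllib.request.Request):
    """Send the request and decode a JSON response (requires a running server)."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)


presets_req = build_request("/presets")
# The "text" key is an assumed body field, not taken from the penquify docs.
config_req = build_request("/generate/config", {"text": "folded paper with grease"})
print(presets_req.get_method(), presets_req.full_url)
print(config_req.get_method(), config_req.full_url)
```

With the server running, `send(presets_req)` would return the decoded response. Nothing is sent until `send` is called.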

MCP Server

5 tools for Claude Desktop, Cursor, Windsurf, or any MCP client:

{
  "mcpServers": {
    "penquify": {
      "command": "python3",
      "args": ["-m", "penquify.mcp"],
      "env": {"GEMINI_API_KEY": "your-key"}
    }
  }
}

Tools: penquify_generate_document penquify_generate_photos penquify_generate_dataset penquify_text_to_config penquify_list_presets
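
If the MCP client already has other servers configured, the penquify entry has to be merged in rather than pasted over the file. A minimal sketch of that merge on an in-memory config; `add_penquify_server` is a hypothetical helper, the `other` server is made up for the example, and the config file location depends on the client and is not covered here:

```python
import json


def add_penquify_server(config: dict, api_key: str) -> dict:
    """Return a copy of an MCP client config with the penquify entry added."""
    servers = dict(config.get("mcpServers", {}))
    servers["penquify"] = {
        "command": "python3",
        "args": ["-m", "penquify.mcp"],
        "env": {"GEMINI_API_KEY": api_key},
    }
    return {**config, "mcpServers": servers}


# A made-up existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = add_penquify_server(existing, "your-key")
print(json.dumps(updated, indent=2))
```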


Claude Code Skills

/penquify          # Full reference: presets, cameras, variation schema
/generate          # Generate a document from description or JSON
/dataset           # Generate large synthetic datasets
/add-template      # Add a new document template

Agent SDK Plugin

from penquify.agent_plugin import penquify_tools

# `Agent` is provided by the Agent SDK you are using; import it before this line.
agent = Agent(model="claude-sonnet-4-6", tools=penquify_tools)

Deployment

Docker

docker build -t penquify .
docker run -p 8080:8080 -e GEMINI_API_KEY=xxx penquify

docker-compose (with PostgreSQL)

GEMINI_API_KEY=xxx docker-compose up

Kubernetes

kubectl apply -f k8s/secret.yaml   # set GEMINI_API_KEY first
kubectl apply -f k8s/deployment.yaml

Architecture

penquify/
  templates/         Jinja2 HTML per doc type
  generators/
    pdf.py           HTML → PDF/PNG (Playwright)
    photo.py         PNG → realistic photo (Gemini image gen)
    config.py        text → variation JSON (Gemini text)
  models/
    document.py      DocHeader + DocItem + Document
    variation.py     PhotoVariation + Stain + 8 presets
    cameras.py       22 camera presets + free text
  api/server.py      FastAPI REST
  mcp.py             MCP server (5 tools)
  agent_plugin.py    Agent SDK plugin
  storage/s3.py      AWS S3 upload
  cli.py             CLI entry point

Roadmap

  • Jinja2 templates + Playwright PDF/PNG
  • Gemini photo gen with system instruction + variation config
  • 8 photo presets + 22 camera presets
  • CLI (penquify demo/pdf/photos/dataset)
  • FastAPI REST server (8 endpoints)
  • MCP server (5 tools)
  • Agent SDK plugin
  • Claude Code skills (4 commands)
  • Natural language → variation JSON (Gemini)
  • S3 upload support
  • Dockerfile + docker-compose + K8s manifests
  • GitHub Actions CI
  • CODE_OF_CONDUCT + CONTRIBUTING + LICENSE
  • PostgreSQL persistent storage
  • PostgREST auto-API
  • More templates: factura SII, PO, BOL
  • SII DTE XML generation
  • Batch dataset generation with progress bar
  • PyPI publish
  • Demo images in README

License

MIT

