dwocr

dwocr batches PDF OCR requests against an OpenAI-compatible API, with defaults tuned for Doubleword-hosted Qwen OCR models. It renders each PDF page locally, submits page-level requests through autobatcher, and writes one markdown file per source PDF.

By default it uses:

base URL: https://api.doubleword.ai/v1
model: Qwen/Qwen3.5-397B-A17B-FP8
prompt: the built-in Qwen 397 OCR benchmark prompt in src/dwocr/prompts.py

Install

pip install -e .

This exposes two commands:

dwocr
dwocr-web

CLI Usage

Basic usage:

dwocr INPUT_PATH

INPUT_PATH can be either:

a single PDF file
a directory, in which case dwocr recursively processes *.pdf files below it

The CLI looks for an API key in this order:

--api-key
DOUBLEWORD_API_KEY
OPENAI_API_KEY
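
The lookup order above can be sketched as a small helper. This is a minimal illustration, not dwocr's actual code: the function name `resolve_api_key` is hypothetical, and treating an empty string as "not set" is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    cli_value: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the first usable key: --api-key, DOUBLEWORD_API_KEY, OPENAI_API_KEY."""
    candidates = (
        cli_value,                      # value passed via --api-key
        env.get("DOUBLEWORD_API_KEY"),  # first environment fallback
        env.get("OPENAI_API_KEY"),      # second environment fallback
    )
    for candidate in candidates:
        # Assumption: empty strings count as unset.
        if candidate:
            return candidate
    return None
```

Passing the environment as a parameter keeps the helper easy to test without mutating `os.environ`.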

If you do not pass --output-dir, output is written to a sibling dwocr_output/ directory next to the input root. Relative paths are preserved, so nested PDFs produce nested markdown files.
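
As a rough sketch of that path mapping, assuming the output files use a `.md` extension and that a single-file input is written next to that file's own `dwocr_output/` sibling (neither detail is stated above; `markdown_path_for` is a hypothetical name):

```python
from pathlib import Path
from typing import Optional


def markdown_path_for(
    pdf: Path,
    input_root: Path,
    output_dir: Optional[Path] = None,
) -> Path:
    """Map a source PDF to its markdown output path, preserving relative paths."""
    if output_dir is None:
        # Default: a dwocr_output/ directory that is a sibling of the input root.
        output_dir = input_root.parent / "dwocr_output"
    if pdf == input_root:
        # Assumption: a single-file input keeps only its file name.
        relative = Path(pdf.name)
    else:
        relative = pdf.relative_to(input_root)
    # Assumption: outputs use the .md extension.
    return output_dir / relative.with_suffix(".md")
```

For example, with an input root of `/data/papers`, the PDF `/data/papers/a/b.pdf` would map to `/data/dwocr_output/a/b.md` under these assumptions.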

Example: Doubleword + Qwen 3.5 397B

export DOUBLEWORD_API_KEY=...

dwocr ./papers \
  --base-url https://api.doubleword.ai/v1 \
  --model Qwen/Qwen3.5-397B-A17B-FP8 \
  --output-dir ./ocr_output \
  --render-images \
  --batch-size 512 \
  --batch-window-seconds 5 \
  --poll-interval-seconds 5 \
  --completion-window 24h \
  --target-longest-image-dim 1024 \
  --render-concurrency 8 \
  --overwrite

Important Options

dwocr INPUT_PATH [options]

--api-key TEXT                     API key; falls back to DOUBLEWORD_API_KEY, then OPENAI_API_KEY
--base-url TEXT                    OpenAI-compatible API base URL
--model TEXT                       OCR model name
--output-dir TEXT                  Output directory for markdown files
--prompt-file TEXT                 Replace the built-in OCR prompt
--temperature FLOAT                Default: 0.0
--max-tokens INT                   Default: 4096
--batch-size INT                   Default: 512
--batch-window-seconds FLOAT       Default: 5.0
--poll-interval-seconds FLOAT      Default: 5.0
--completion-window {1h,24h}       Default: 24h
--target-longest-image-dim INT     Default: 1024
--render-concurrency INT           Default: min(16, max(4, cpu_count))
--render-images                    Save cropped image regions and rewrite markdown image tags
--overwrite                        Allow writing into a non-empty output directory
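
The --render-concurrency default clamps the CPU count to the range 4 to 16. A minimal sketch of that formula follows; the function name is hypothetical, and falling back to 1 when the CPU count cannot be determined is an assumption.

```python
import os
from typing import Optional


def default_render_concurrency(cpu_count: Optional[int] = None) -> int:
    """Compute min(16, max(4, cpu_count)) as listed for --render-concurrency."""
    if cpu_count is None:
        # os.cpu_count() may return None; assume 1 in that case.
        cpu_count = os.cpu_count() or 1
    return min(16, max(4, cpu_count))
```

So a 2-core machine would render with 4 workers, an 8-core machine with 8, and a 64-core machine with 16.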

Output

Each source PDF becomes one markdown file containing page outputs in order:

<!-- source: some/file.pdf -->
<!-- model: Qwen/Qwen3.5-397B-A17B-FP8 -->
<!-- generated_at: 2026-03-19T12:34:56+00:00 -->

<!-- page 1 -->
...page markdown...

<!-- page 2 -->
...page markdown...
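
Because each page is introduced by a `<!-- page N -->` comment, downstream tools can split an output file back into pages. The following is a sketch of one way to do that, not part of dwocr itself; `split_pages` is a hypothetical helper, and it assumes each marker sits alone on its own line as in the layout above.

```python
import re
from typing import Dict

# Matches a page marker that occupies a whole line, e.g. "<!-- page 2 -->".
PAGE_MARKER = re.compile(r"^<!-- page (\d+) -->[ \t]*$", re.MULTILINE)


def split_pages(markdown: str) -> Dict[int, str]:
    """Return a mapping of page number to that page's markdown text."""
    pages: Dict[int, str] = {}
    matches = list(PAGE_MARKER.finditer(markdown))
    for index, match in enumerate(matches):
        # A page runs until the next marker, or to the end of the file.
        if index + 1 < len(matches):
            end = matches[index + 1].start()
        else:
            end = len(markdown)
        pages[int(match.group(1))] = markdown[match.end():end].strip()
    return pages
```

The header comments (source, model, generated_at) come before the first page marker and are therefore ignored by this helper.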

When --render-images is enabled and the model emits markers such as:

image[[120,300,520,700]] Figure caption

dwocr will:

crop that region from the rendered source page
write the crop into a sibling asset directory such as document_images/
rewrite the OCR output to a normal markdown image link
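
The rewrite step can be illustrated as follows. This sketch only parses the marker and produces the markdown link; it does not crop anything, because the coordinate convention of the four numbers is not specified here. The function name, the asset file naming scheme (`page{N}_img{K}.png`), and the assumption that a marker starts its own line are all hypothetical.

```python
import re
from typing import List, Tuple

# Matches lines such as: image[[120,300,520,700]] Figure caption
IMAGE_MARKER = re.compile(
    r"^image\[\[(\d+),\s*(\d+),\s*(\d+),\s*(\d+)\]\][ \t]*(.*)$",
    re.MULTILINE,
)

Box = Tuple[int, int, int, int]


def rewrite_image_markers(
    page_markdown: str,
    page_number: int,
    asset_dir: str = "document_images",
) -> Tuple[str, List[Box]]:
    """Replace image markers with markdown links and collect their boxes."""
    boxes: List[Box] = []

    def replace(match: "re.Match[str]") -> str:
        box = (
            int(match.group(1)),
            int(match.group(2)),
            int(match.group(3)),
            int(match.group(4)),
        )
        boxes.append(box)
        # Hypothetical naming scheme for the cropped asset.
        filename = f"page{page_number}_img{len(boxes)}.png"
        caption = match.group(5).strip()
        return f"![{caption}]({asset_dir}/{filename})"

    return IMAGE_MARKER.sub(replace, page_markdown), boxes
```

The returned boxes could then be handed to whatever routine crops the rendered page, once the coordinate convention is known.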

If any pages fail, dwocr still finishes the remaining work, prints the list of failed pages to stderr, and exits with a non-zero status.
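
This "keep going, then report" behavior follows a common pattern, sketched below under stated assumptions: `run_pages` and its `process` callback are hypothetical stand-ins, the stderr message format is invented for illustration, and the exit code 1 is an assumption (the text above only says non-zero).

```python
import sys
from typing import Callable, Iterable, List, Tuple


def run_pages(
    pages: Iterable[Tuple[str, int]],
    process: Callable[[str, int], None],
) -> int:
    """Process every (source, page) pair; return a process exit code."""
    failed: List[Tuple[str, int, str]] = []
    for source, page in pages:
        try:
            process(source, page)
        except Exception as exc:
            # Record the failure and continue with the remaining pages.
            failed.append((source, page, str(exc)))
    if failed:
        print("Failed pages:", file=sys.stderr)
        for source, page, reason in failed:
            print(f"  {source} page {page}: {reason}", file=sys.stderr)
        return 1  # assumption: any non-zero code signals failure
    return 0
```

A caller would typically end with `sys.exit(run_pages(...))` so that scripts and CI jobs can detect partial failures.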

Web UI

Run:

dwocr-web

Then open http://127.0.0.1:8123.

The web UI lets you:

submit OCR jobs for a PDF or directory
set model, base URL, batching, and rendering options
monitor active jobs and logs
inspect recent job details and the exact generated CLI command
