Blazing Fast PDF to Markdown Converter

Convert PDF to Markdown with layout detection — preserving images, tables, formulas, captions, headers, and footnotes. Built with Rust, NCNN, and MuPDF for maximum performance.

Try the free online converter: pdf2md.deepdiy.net

Features

Layout-aware Markdown: Uses YOLO-based DocLayNet detection to understand document structure. Output preserves headings, paragraphs, tables, lists, formulas, captions, and more in proper reading order.
Images & Assets: Automatically extracts embedded images and saves them alongside the Markdown output.
Clean Output: No unnecessary line breaks within paragraphs. Produces readable, well-formatted Markdown.
Self-hostable: Pre-built binaries for macOS, Linux, and Windows. No Docker or external services required.
Free Web API: No API key needed. Send a PDF and get back Markdown, image links, and a ZIP download.

Performance Comparison

Faster than other PDF to Markdown tools on equivalent hardware.

Runs efficiently on a 1-core 1GB RAM VPS.

DocLayNet-based layout detection keeps the original layout intact.

No broken inline text — every paragraph stays together.

No sign-up required. Upload and convert instantly.

Pre-built Binaries

Download pre-compiled binaries for 4 platforms from the dist/ directory:

| Platform | Binary |
| --- | --- |
| macOS (Apple Silicon) | `dist/pdf2md-macos-arm64` |
| Linux (x86_64) | `dist/pdf2md-x86_64-unknown-linux-gnu` |
| Linux (ARM64) | `dist/pdf2md-aarch64-unknown-linux-gnu` |
| Windows (x86_64) | `dist/pdf2md-win10-x64.exe` |

Step 1 — Move files to your working directory

mv dist/pdf2md-<platform> <workdir>/
mv yolo26n-doclaynet_ncnn_model/ <workdir>/

Step 2 — Run conversion

cd <workdir>
./pdf2md-<platform> <input.pdf>

Arguments

| Argument | Description |
| --- | --- |
| `input.pdf` | Input PDF file |
| `output.md` | Output Markdown file (optional, defaults to stdout) |

Extra options

| Option | Description |
| --- | --- |
| `--asset-dir DIR` | Directory to export page assets |
| `--detect-dpi N` | DPI for layout detection (default: `72`) |
| `--asset-dpi N` | DPI for asset export (default: `150`) |
| `--page N` | Process only the specified page |
| `--model-dir PATH` | Path to the model directory (default: `./yolo26n-doclaynet_ncnn_model/`) |
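
If you want to drive the binary from a script, the arguments and options above can be assembled programmatically. The following is a minimal Python sketch, not part of the project: the function names `build_pdf2md_command` and `run_pdf2md` are hypothetical, and placing options after the positional arguments is an assumption (swap the order if your build expects options first).

```python
import subprocess
from typing import List, Optional


def build_pdf2md_command(
    binary: str,
    input_pdf: str,
    output_md: Optional[str] = None,
    asset_dir: Optional[str] = None,
    detect_dpi: Optional[int] = None,
    asset_dpi: Optional[int] = None,
    page: Optional[int] = None,
    model_dir: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list using only the documented arguments and options."""
    cmd = [binary, input_pdf]
    if output_md is not None:
        cmd.append(output_md)
    # Assumption: options are accepted after the positional arguments.
    if asset_dir is not None:
        cmd += ["--asset-dir", asset_dir]
    if detect_dpi is not None:
        cmd += ["--detect-dpi", str(detect_dpi)]
    if asset_dpi is not None:
        cmd += ["--asset-dpi", str(asset_dpi)]
    if page is not None:
        cmd += ["--page", str(page)]
    if model_dir is not None:
        cmd += ["--model-dir", model_dir]
    return cmd


def run_pdf2md(cmd: List[str]) -> str:
    """Run the command and return stdout (the Markdown when no output file is given)."""
    done = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return done.stdout
```

For example, `run_pdf2md(build_pdf2md_command("./pdf2md-macos-arm64", "paper.pdf", asset_dir="assets"))` would return the Markdown from stdout, since no output file is passed.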

Build from Source

cargo build --release --bin pdf2md

The compiled binary will be at target/release/pdf2md.

Run from Source

cargo run --release --bin pdf2md -- ./input.pdf ./output.md

Self Hosting Streamlit App

A browser-based UI for uploading PDFs and previewing Markdown output with images.

The app automatically detects your OS and architecture to find the right binary in dist/. Install Streamlit and start the app:

pip install streamlit
streamlit run streamlit_app.py

Specify a custom binary or model directory:

streamlit run streamlit_app.py -- \
  --pdf2md-bin ./dist/pdf2md-<platform> \
  --model-dir /path/to/yolo26n-doclaynet_ncnn_model

Free PDF to Markdown API

No API key required. Submit a PDF and receive Markdown, extracted images, and a downloadable ZIP.

Endpoint

POST https://pdf2md.deepdiy.net/v1/convert
Content-Type: application/pdf

curl example

curl -X POST "https://pdf2md.deepdiy.net/v1/convert" \
  -H "Content-Type: application/pdf" \
  --data-binary @paper.pdf

Success response

{
  "status": "succeeded",
  "markdown": "# Paper title\n\nConverted Markdown...",
  "images": [
    {
      "path": "assets/page_0001_order_0001_class_6.png",
      "url": "https://..."
    }
  ],
  "zip_url": "https://...",
  "download_url": "https://...",
  "expires_in": 300
}
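
A success response shaped like the sample above can be unpacked with a few lines of Python. This is a sketch under assumptions: the `ConversionResult` type and `parse_success` function are hypothetical names, only the fields shown in the sample are read, and `expires_in` is treated as seconds because the sample value 300 matches the 5-minute ZIP expiry.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ConversionResult:
    markdown: str
    images: List[Tuple[str, str]]  # (relative asset path, download URL)
    zip_url: str
    expires_in: int  # assumed to be seconds


def parse_success(body: str) -> ConversionResult:
    """Parse a JSON success body; raise if the status is not "succeeded"."""
    data = json.loads(body)
    if data.get("status") != "succeeded":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    images = [(img["path"], img["url"]) for img in data.get("images", [])]
    return ConversionResult(
        markdown=data["markdown"],
        images=images,
        zip_url=data.get("zip_url", ""),
        expires_in=int(data.get("expires_in", 0)),
    )
```

Because the links expire, it is safer to download the images or the ZIP right after parsing rather than storing the URLs for later.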

Error response (HTTP 429)

{
  "error": "busy"
}

The server processes one request at a time across all users. If it is busy, it returns HTTP 429 with the body shown above; wait 1 second and retry. Because each conversion runs for at most 120 seconds, a slot will likely open within that window.
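
The retry behaviour described here can be sketched in Python with the standard library only. Everything beyond the endpoint, the `Content-Type` header, the 429 status, and the 1-second wait is an assumption: the names `post_pdf` and `convert_with_retry` are hypothetical, any 2xx status is treated as success, and the 120-second budget counts only time spent sleeping between attempts.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

API_URL = "https://pdf2md.deepdiy.net/v1/convert"


def post_pdf(pdf_bytes: bytes, timeout: float = 180.0) -> Tuple[int, bytes]:
    """POST the PDF once and return (HTTP status, raw body)."""
    req = urllib.request.Request(
        API_URL,
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; return the status so the caller can retry on 429.
        return err.code, err.read()


def convert_with_retry(
    pdf_bytes: bytes,
    send: Callable[[bytes], Tuple[int, bytes]] = post_pdf,
    max_wait: float = 120.0,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry on HTTP 429 every `delay` seconds until `max_wait` seconds of waiting."""
    waited = 0.0
    while True:
        status, body = send(pdf_bytes)
        if 200 <= status < 300:  # assumption: success is any 2xx status
            return json.loads(body)
        if status != 429:
            raise RuntimeError(f"conversion failed with HTTP {status}")
        if waited >= max_wait:
            raise TimeoutError(f"server still busy after {max_wait:.0f} s of waiting")
        sleep(delay)
        waited += delay
```

The `send` and `sleep` parameters exist so the loop can be exercised without network access; in normal use, `convert_with_retry(open("paper.pdf", "rb").read())` is enough.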

API Limits

| Item | Value |
| --- | --- |
| Price | Free |
| Max PDF size | 20 MB |
| Concurrency | One request at a time (returns 429 if busy) |
| Max task duration | 120 seconds |
| Conversion timeout | 150 seconds |
| Request timeout | 180 seconds |
| ZIP download expiry | 5 minutes |
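
A client can check the size limit locally before uploading, which avoids sending a file the server would reject. This is a small Python sketch with a hypothetical `check_pdf_size` helper; it reads "20 MB" as 20,000,000 bytes, which is an assumption chosen because it is the stricter of the two common interpretations.

```python
import os

# Assumption: "20 MB" is interpreted as decimal megabytes (the stricter reading).
MAX_PDF_BYTES = 20 * 1000 * 1000


def check_pdf_size(path: str, limit: int = MAX_PDF_BYTES) -> int:
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes, over the {limit}-byte limit")
    return size
```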

Detection Classes

You can use these class IDs to filter or block specific elements (e.g., Page-header, Footnote) from the output:

0: Caption, 1: Footnote, 2: Formula, 3: List-item, 4: Page-footer, 5: Page-header, 6: Picture, 7: Section-header, 8: Table, 9: Text, 10: Title
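
One way to act on these IDs without relying on any undocumented option is to filter exported assets by filename. The sketch below is Python and rests on an assumption: that every asset name ends in `_class_<id>.<ext>`, as in the single sample path `assets/page_0001_order_0001_class_6.png` from the API response. The names `class_id_of` and `drop_classes` are hypothetical.

```python
import re
from typing import Iterable, List, Optional

CLASS_NAMES = {
    0: "Caption", 1: "Footnote", 2: "Formula", 3: "List-item",
    4: "Page-footer", 5: "Page-header", 6: "Picture",
    7: "Section-header", 8: "Table", 9: "Text", 10: "Title",
}

# Assumption: asset filenames end in "_class_<id>.<ext>", inferred from one sample path.
_CLASS_RE = re.compile(r"_class_(\d+)\.[A-Za-z0-9]+$")


def class_id_of(path: str) -> Optional[int]:
    """Return the class ID encoded in an asset filename, or None if absent."""
    match = _CLASS_RE.search(path)
    return int(match.group(1)) if match else None


def drop_classes(paths: Iterable[str], blocked: Iterable[str]) -> List[str]:
    """Keep only paths whose class name is not in `blocked`; unlabelled paths are kept."""
    blocked_names = set(blocked)
    blocked_ids = {cid for cid, name in CLASS_NAMES.items() if name in blocked_names}
    return [p for p in paths if class_id_of(p) not in blocked_ids]
```

For instance, `drop_classes(paths, {"Page-header", "Page-footer"})` would discard header and footer crops while keeping pictures and tables.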
