
openresearchtools/PDF-Markdown-Studio


PDF Markdown Studio

PDF Markdown Studio Demo

PDF Markdown Studio is a desktop app for converting PDFs and images into clean Markdown.

It is built as an example integration client for OpenResearchTools Engine, showing how to run Engine PDF and VLM workflows from a GUI.

The PDF Markdown Studio application source code is licensed under the MIT License; third-party dependencies and bundled components remain licensed under their respective original licenses.

What You Can Do

  • Add multiple files (PDFs and images) into one workspace.
  • Preview the source document and generated Markdown side-by-side.
  • Search by text and jump through matching pages.
  • Convert selected files using either fast PDF extraction or VLM-based extraction.
  • Edit Markdown in place and save changes.

Supported Files

  • PDF (.pdf)
  • Images (.png, .jpg, .jpeg)

Conversion Modes (How To Choose)

1) FAST PDF

Use this for machine-readable digital PDFs.

  • Very fast.
  • Best when text is selectable in the PDF.
  • Output quality is strong for standard digital documents.
  • Limitation: table content is often flattened or inlined in the output; complex tables can come out misstructured.

2) PDF VLM + Image VLM

Use this when the layout is complex, scanned, or visually heavy, or when FAST output is poor.

  • Uses your selected VLM model + MMProj.
  • Works for both PDFs and images.
  • Slower than FAST, but better for difficult pages.
  • Engine applies a per-page quality gate for PDF VLM output. Pages that end up obviously truncated, stuck in repetition/looping, or otherwise fail the gate are retried automatically before final output is written.
  • This is meant to reduce bad outputs on difficult pages, but manual inspection is still recommended for important documents.
  • Limitation: table quality depends on the selected model's capabilities.
  • On complex tables with heavy formatting/whitespace, models can misattribute values to wrong cells or rows.
  • If downstream automation depends on table values, compare Markdown against the original document before automated extraction.
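The per-page quality gate described above can be sketched roughly as follows. This is an illustrative sketch, not Engine's actual implementation: the function names and thresholds (`looks_truncated`, `looks_repetitive`, `max_retries`) are hypothetical stand-ins for Engine's internal heuristics.

```rust
/// Heuristic: a page that produced almost no text, or that ends
/// without any terminating punctuation, may be truncated.
fn looks_truncated(md: &str) -> bool {
    let t = md.trim_end();
    t.len() < 10 || !t.ends_with(|c: char| ".!?:;)]}\"'".contains(c))
}

/// Heuristic: a page stuck in a repetition loop tends to emit the
/// same line over and over; reject if one line dominates the output.
fn looks_repetitive(md: &str) -> bool {
    let lines: Vec<&str> = md.lines().filter(|l| !l.trim().is_empty()).collect();
    if lines.len() < 8 {
        return false;
    }
    let mut max_count = 0;
    for l in &lines {
        let c = lines.iter().filter(|x| *x == l).count();
        if c > max_count {
            max_count = c;
        }
    }
    max_count * 2 > lines.len() // more than 50% identical lines
}

/// Convert one page, retrying a bounded number of times while the
/// output fails the gate, before the final result is written.
fn convert_with_gate<F: FnMut() -> String>(mut convert_page: F, max_retries: u32) -> String {
    let mut out = convert_page();
    for _ in 0..max_retries {
        if !looks_truncated(&out) && !looks_repetitive(&out) {
            break;
        }
        out = convert_page(); // retry the page
    }
    out
}
```

Even with a gate like this, subtle errors (e.g. a plausible but wrong table cell) pass undetected, which is why the document still recommends manual inspection.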

3) FAST PDF + VLM fallback

Runs FAST first, then automatically switches to VLM if FAST detects non-machine-readable content.

  • Good default when PDF quality is mixed.
  • Balances speed and robustness.
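The fallback decision can be sketched as a simple per-page text-density check. This is an assumption about how such a detector might work, not Engine's actual logic; `MIN_CHARS_PER_PAGE` and `choose_mode` are hypothetical names.

```rust
/// A page with almost no selectable text is likely scanned.
const MIN_CHARS_PER_PAGE: usize = 32;

#[derive(Debug, PartialEq)]
enum Mode {
    Fast,
    Vlm,
}

/// Decide between FAST extraction and the VLM path based on how many
/// characters of machine-readable text each page yielded.
fn choose_mode(extracted_chars_per_page: &[usize]) -> Mode {
    let sparse = extracted_chars_per_page
        .iter()
        .filter(|&&chars| chars < MIN_CHARS_PER_PAGE)
        .count();
    // If most pages carry almost no text, fall back to VLM.
    if sparse * 2 > extracted_chars_per_page.len() {
        Mode::Vlm
    } else {
        Mode::Fast
    }
}
```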

Quick Start

  1. Open the app.
  2. In Settings, make sure runtime is healthy and model paths are set.
  3. Click File -> Add PDFs / Images.
  4. Tick files in the Documents sidebar.
  5. Select a conversion mode.
  6. Click Convert Selected.

Output files are written next to the source document:

  • filenameFAST.md
  • filenameVLM.md
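Based on the names listed above, the output path appears to be the source file's stem plus a mode suffix and a .md extension, written in the source directory. A minimal helper reproducing that convention (a sketch; the app's actual path logic may differ):

```rust
use std::path::{Path, PathBuf};

/// Build the sibling output path for a converted document:
/// e.g. /docs/report.pdf with suffix "FAST" -> /docs/reportFAST.md
fn output_path(source: &Path, mode_suffix: &str) -> PathBuf {
    let stem = source.file_stem().and_then(|s| s.to_str()).unwrap_or("output");
    let file = format!("{stem}{mode_suffix}.md");
    match source.parent() {
        Some(dir) => dir.join(file),
        None => PathBuf::from(file),
    }
}
```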

Viewing and Navigation

  • Left pane: original PDF/image.
  • Right pane: Markdown preview/edit.
  • Find, Prev/Next Hit, and page controls are above the workspace.
  • View zoom affects both panes together.

Runtime and Model Location

Default Engine runtime path:

  • Windows: C:\Users\<user>\AppData\Roaming\OpenResearchTools\PDF Markdown Studio\Engine
  • macOS: ~/Library/Application Support/OpenResearchTools/PDF Markdown Studio/Engine
  • Linux: ~/.local/share/OpenResearchTools/PDF Markdown Studio/Engine

Default app settings/data path:

  • Windows config/data: C:\Users\<user>\AppData\Roaming\OpenResearchTools\PDF Markdown Studio
  • macOS config/data: ~/Library/Application Support/OpenResearchTools/PDF Markdown Studio
  • Linux config: ~/.config/OpenResearchTools/PDF Markdown Studio
  • Linux data: ~/.local/share/OpenResearchTools/PDF Markdown Studio

Shared VLM model folder:

  • Windows: C:\Users\<user>\AppData\Roaming\OpenResearchTools\models
  • macOS: ~/Library/Application Support/OpenResearchTools/models
  • Linux: ~/.local/share/OpenResearchTools/models

Each selected Qwen3.5 model family is downloaded into its own repo folder under that shared models root, together with the required MMProj file for the chosen family.
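The per-platform defaults above follow the standard OS data directories. A sketch of how that resolution might look in code (the function name is hypothetical; `home` stands in for the user's home or profile directory):

```rust
/// Resolve the default Engine runtime directory for a platform,
/// mirroring the defaults listed above.
fn engine_runtime_dir(home: &str, os: &str) -> String {
    let app = "OpenResearchTools/PDF Markdown Studio/Engine";
    match os {
        "windows" => format!(
            "{home}\\AppData\\Roaming\\OpenResearchTools\\PDF Markdown Studio\\Engine"
        ),
        "macos" => format!("{home}/Library/Application Support/{app}"),
        // Linux and other Unix platforms use the XDG data directory.
        _ => format!("{home}/.local/share/{app}"),
    }
}
```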

GPU / CPU Execution

  • CPU mode: run without GPU acceleration.
  • GPU mode: select one GPU in settings.
  • The app passes the single selected GPU to the Engine runtime for VLM execution paths.

Troubleshooting

Unsigned Build Notice

This app is an open-source hobby project by the repository owner. We do not currently have funding for paid code-signing and notarization pipelines across all platforms and releases.

Because of that, operating-system protections or hardened security environments (for example Windows SmartScreen, enterprise endpoint controls, or macOS Gatekeeper policies) may block unsigned binaries.

If your environment blocks unsigned binaries, the recommended path is:

  • build this desktop app from source on the target device,
  • build Openresearchtools-Engine from source on the same target device,
  • and use those locally-built artifacts in your deployment.

Windows (when blocked)

  • If SmartScreen shows "Windows protected your PC", use More info -> Run anyway only if your policy allows it.
  • In the app, go to Settings -> Runtime Setup and run:
    • Download/Repair runtime
    • Unblock unsigned runtime
    • Recheck
  • The Windows unblock script clears Mark-of-the-Web flags in the selected runtime directory by running Unblock-File recursively on runtime files.

macOS (when blocked)

  • Try Right click -> Open on first launch.
  • If blocked by Gatekeeper, use System Settings -> Privacy & Security -> Open Anyway when available and policy permits.
  • In the app, after runtime install/repair, click Unblock unsigned runtime then Recheck.
  • The macOS unblock script removes quarantine attributes recursively (xattr -dr com.apple.quarantine) and restores executable bits for runtime binaries/scripts where needed (chmod +x on relevant files).
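The restore-executables half of that script can be sketched in Rust. This is an illustrative sketch only: the actual unblock script is a shell script, and the quarantine-attribute removal (`xattr -dr com.apple.quarantine`) is macOS-specific, so it appears here only as a comment.

```rust
use std::fs;
use std::io;
use std::path::Path;
#[cfg(unix)]
use std::os::unix::fs::PermissionsExt;

/// Recursively add executable bits to every regular file under `dir`,
/// the `chmod +x` half of the unblock step. On macOS the script also
/// runs `xattr -dr com.apple.quarantine <dir>` to clear quarantine.
#[cfg(unix)]
fn restore_exec_bits(dir: &Path) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            restore_exec_bits(&path)?;
        } else {
            let mut perms = fs::metadata(&path)?.permissions();
            // Add u+x, g+x, o+x on top of the existing mode.
            perms.set_mode(perms.mode() | 0o111);
            fs::set_permissions(&path, perms)?;
        }
    }
    Ok(())
}
```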

If conversion fails or setup is incomplete

  1. Open Settings.
  2. Use runtime health/check and download/repair actions.
  3. Confirm model and MMProj paths exist.
  4. Check Jobs and logs for the exact error.

If adding many files feels slow

  • Wait for background imports to finish before converting.
  • Large PDFs can take time to rasterize and preview.

Acknowledgements (What This App Uses)

This app uses OpenResearchTools Engine runtime components for PDF and VLM execution. For this app's active feature set, key upstream technologies include:

  • Openresearchtools-Engine: embeddable runtime used by this app (llama-server-bridge, runtime orchestration, and model/device execution path).
  • egui / eframe: native immediate-mode GUI framework used to build this desktop application UI.
  • llama.cpp and ggml: core inference runtime and device/offload mechanics used through Openresearchtools-Engine.
  • Docling: reference logic for VLM document-conversion behavior used by Engine pdfvlm, including page-wise rendering/scaling heuristics (scale, oversample) and Catmull-Rom-style downscaling before inference.
  • PDFium and pdfium-render: PDF rasterization/page access primitives used by the app's native PDF rendering and by Engine PDF conversion paths.

Current VLM Model Lineup

PDF Markdown Studio now uses the Qwen3.5 GGUF + MMProj model family for PDF VLM and Image VLM conversion:

  • Qwen3.5 9B (Q4_K_M and Q8_0)
  • Qwen3.5 4B (Q4_K_M and Q8_0)
  • Qwen3.5 2B (Q4_K_M and Q8_0)

The app downloads the text model and the matching MMProj for the selected family automatically into the shared OpenResearchTools model store.

Note that using these models in our app does not imply affiliation with, or endorsement by, the original model authors. This is simply a personal recommendation after testing many currently available models for output speed and quality. You are also free to use any other vision model that can run on GGML (the llama.cpp backend); the app allows manual model selection.

Recommended guidance:

  • 9B 4-bit is the recommended default for documents that need higher precision, denser layout understanding, or more reliable structure recovery.
  • 2B models often still produce surprisingly strong results at a fraction of the compute cost, and are a good option when you want speed or need to run on lighter hardware.
  • 4B is the middle ground when you want a better quality/speed balance.

openresearchtools/Qwen3.5-9B-GGUF, openresearchtools/Qwen3.5-4B-GGUF, and openresearchtools/Qwen3.5-2B-GGUF: converted GGUF + MMProj model repositories used by the app for PDF VLM and Image VLM conversion.

  • Qwen: upstream Qwen3.5 model family reference used for the app's current vision model lineup.

This project is independent and is not affiliated with, sponsored by, or endorsed by egui, llama.cpp, Docling, PDFium, Qwen, or other upstream projects/vendors.

How to cite

Suggested citation:

Rutkauskas, L. (2026). PDF Markdown Studio (Version 1.0.0) [Computer software]. OpenResearchTools. https://github.com/openresearchtools/pdfmarkdownstudio.

BibTeX:

@software{Rutkauskas_PDFMarkdownStudio_2026,
  author    = {Rutkauskas, L.},
  title     = {PDF Markdown Studio},
  version   = {1.0.0},
  date      = {2026-03-04},
  url       = {https://github.com/openresearchtools/pdfmarkdownstudio},
  publisher = {OpenResearchTools},
  license   = {MIT}
}
