Docling UI

A local Streamlit GUI that exposes Docling's pipeline options through a sidebar-driven interface; no CLI required.

Overview

Docling UI wraps IBM Research's Docling document-intelligence library with a point-and-click frontend. Upload a file, tune every pipeline knob, click Convert, and get structured output (Markdown, JSON, HTML, plain text, or document tokens) ready to download — all without touching the command line.

Built for personal productivity on WSL/Linux. The app runs entirely on localhost and makes no external API calls; the only network access is the one-time model download described under Setup.

Features

Category	Capability
Input	PDF, DOCX, PPTX, HTML, PNG, JPG, JPEG, TIFF, BMP
Pipelines	StandardPdfPipeline (full ML) · SimplePipeline (lightweight)
OCR	RapidOCR · Tesseract · force full-page mode · confidence threshold
Tables	TableFormer fast or accurate mode
Enrichments	Code · Formula/MathML · image extraction
Page selection	All pages or a custom range (e.g. `1-3, 5, 7`)
Output formats	Markdown · JSON · Plain Text · HTML · Document Tokens
Export	One-click download button for every format
Observability	Real-time conversion log viewer (captures the `docling` logger)
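
The custom page range syntax in the table (e.g. `1-3, 5, 7`) can be parsed along the following lines. This is a minimal sketch, not the code in app.py: the function name `parse_page_range` is hypothetical, and how the app maps the resulting pages onto Docling's conversion call is not covered here.

```python
def parse_page_range(spec: str) -> list[int]:
    """Parse a string like "1-3, 5, 7" into sorted, de-duplicated 1-based page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas such as "1-3,,5"
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)  # int() raises ValueError on bad input
            if start < 1 or end < start:
                raise ValueError(f"invalid page range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"invalid page number: {part!r}")
            pages.add(page)
    return sorted(pages)


print(parse_page_range("1-3, 5, 7"))  # [1, 2, 3, 5, 7]
```

Raising `ValueError` for malformed input lets the UI show a single error message instead of starting a conversion with a half-understood range.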

Requirements

Python 3.10 or later
System package: tesseract-ocr (only required when using the Tesseract OCR engine)

All Python dependencies are declared in requirements.txt and installed by the setup commands below.

Setup

# Clone the repository (or navigate to the project directory)
cd docling-ui

# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install Python dependencies
# Note: docling pulls in PyTorch and HuggingFace models (~3–5 GB on first run)
pip install -r requirements.txt

# (Optional) Install the Tesseract binary for the Tesseract OCR engine
sudo apt update && sudo apt install -y tesseract-ocr

First-run note: Docling downloads ML models (layout detection, TableFormer, etc.) on first use and caches them locally. Subsequent runs are significantly faster.

Running

source .venv/bin/activate   # if not already active
streamlit run app.py

Open http://localhost:8501 in your browser.

Usage

1. Upload a document via the sidebar file uploader. File name, size, and detected type are shown immediately.
2. Configure the pipeline in the sidebar:
   - Choose Standard (full ML, best quality) or Simple (fast, no ML).
   - Enable/disable OCR and select the engine.
   - Toggle table recognition and pick fast vs. accurate mode.
   - Activate code, formula, or image enrichments as needed.
   - Optionally restrict conversion to a custom page range.
3. Select an output format (Markdown, JSON, Plain Text, HTML, or Document Tokens).
4. Click Convert.
5. The Preview tab renders the output in the appropriate viewer; the Logs tab shows the full Docling log for the run.
6. Use the Download button to save the result, or copy from the code block.
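
The sidebar choices listed above map naturally onto a single settings object. The sketch below is illustrative only: `SidebarSettings`, its field names, and its defaults are assumptions made for this example and are not taken from app.py, whose `ConversionSettings` dataclass may look different.

```python
from dataclasses import dataclass
from typing import Optional

OUTPUT_FORMATS = ("Markdown", "JSON", "Plain Text", "HTML", "Document Tokens")


@dataclass
class SidebarSettings:
    """Hypothetical stand-in for ConversionSettings; the real fields may differ."""

    pipeline: str = "standard"         # "standard" (full ML) or "simple"
    do_ocr: bool = True
    ocr_engine: str = "rapidocr"       # "rapidocr" or "tesseract"
    force_full_page_ocr: bool = False
    do_tables: bool = True
    table_mode: str = "accurate"       # "fast" or "accurate"
    enrich_code: bool = False
    enrich_formula: bool = False
    extract_images: bool = False
    page_range: Optional[str] = None   # e.g. "1-3, 5, 7"; None means all pages
    output_format: str = "Markdown"

    def __post_init__(self) -> None:
        # Fail early on values the sidebar widgets should never produce.
        if self.pipeline not in ("standard", "simple"):
            raise ValueError(f"unknown pipeline: {self.pipeline!r}")
        if self.ocr_engine not in ("rapidocr", "tesseract"):
            raise ValueError(f"unknown OCR engine: {self.ocr_engine!r}")
        if self.table_mode not in ("fast", "accurate"):
            raise ValueError(f"unknown table mode: {self.table_mode!r}")
        if self.output_format not in OUTPUT_FORMATS:
            raise ValueError(f"unknown output format: {self.output_format!r}")


settings = SidebarSettings(ocr_engine="tesseract", table_mode="fast", output_format="JSON")
print(settings)
```

Keeping every choice in one object means the conversion step takes a single argument and can be exercised without starting Streamlit.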

Project Structure

docling-ui/
├── app.py            # Streamlit application (single file)
├── requirements.txt  # Python dependencies
└── README.md

app.py is organised into clearly separated layers:

- UILogHandler: custom logging.Handler that captures the docling logger into a StringIO buffer so logs appear in the UI rather than only the terminal.
- ConversionSettings: typed dataclass holding every sidebar value; passed as a single argument to convert_document().
- convert_document(): standalone conversion function; writes the upload to a temp file, builds a DocumentConverter with the configured options, runs the conversion, and returns a plain dict.
- render_sidebar(): renders all 9 control sections; returns the uploaded file, the settings, and the button state.
- main(): orchestrates rendering, the conversion trigger, error display, and tab switching.
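
The log-capture idea behind UILogHandler can be sketched with the standard library alone. The names `BufferLogHandler` and `captured_logs` are hypothetical, and the log format string is an arbitrary choice; the actual handler in app.py may be structured differently.

```python
import io
import logging
from contextlib import contextmanager


class BufferLogHandler(logging.Handler):
    """Collect formatted log records in an in-memory text buffer."""

    def __init__(self, level: int = logging.INFO) -> None:
        super().__init__(level)
        self.buffer = io.StringIO()
        self.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))

    def emit(self, record: logging.LogRecord) -> None:
        self.buffer.write(self.format(record) + "\n")

    def text(self) -> str:
        return self.buffer.getvalue()


@contextmanager
def captured_logs(logger_name: str = "docling", level: int = logging.INFO):
    """Attach a BufferLogHandler for one run, then restore the logger."""
    logger = logging.getLogger(logger_name)
    handler = BufferLogHandler(level)
    previous_level = logger.level
    logger.addHandler(handler)
    logger.setLevel(level)
    try:
        yield handler
    finally:
        logger.removeHandler(handler)
        logger.setLevel(previous_level)


# Records from child loggers such as "docling.pipeline" propagate up to "docling".
with captured_logs() as handler:
    logging.getLogger("docling.pipeline").info("page 1 done")

print(handler.text())
```

Detaching the handler after each run prevents duplicate log lines when Streamlit re-executes the script on every interaction.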

Dependency Notes

Package	Purpose
`streamlit`	Web UI framework
`docling`	Document conversion engine
`docling-core`	Core data models
`rapidocr-onnxruntime`	RapidOCR engine (Python-only, no system binary needed)
`pytesseract`	Tesseract Python wrapper (requires system `tesseract-ocr`)
`Pillow`	Image handling
`watchdog`	Streamlit file-watcher backend

Troubleshooting

Tesseract not found

sudo apt install tesseract-ocr
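
To confirm whether the Tesseract pieces are actually visible to Python before selecting that engine, a quick standard-library check can help. This is a sketch under the assumption that the binary is installed as `tesseract` on PATH (the usual name for the `tesseract-ocr` package) and that the wrapper is importable as `pytesseract`; the function name is hypothetical.

```python
import importlib.util
import shutil


def tesseract_status() -> dict[str, bool]:
    """Report whether the Tesseract binary and its Python wrapper can be found."""
    return {
        # shutil.which returns the executable's path, or None if it is not on PATH.
        "binary": shutil.which("tesseract") is not None,
        # find_spec returns None when the module cannot be imported.
        "wrapper": importlib.util.find_spec("pytesseract") is not None,
    }


status = tesseract_status()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```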

RapidOCR import error

pip install rapidocr-onnxruntime

Out of memory on large PDFs

Switch to SimplePipeline in the sidebar, or narrow the page range.

export_to_document_tokens not available

pip install --upgrade docling
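
An export step can also detect a missing method up front and report it clearly rather than failing with an AttributeError mid-run. The sketch below is an assumption-laden illustration: only `export_to_document_tokens` is named in this README, the mapping of the other format labels to method names is a guess at how app.py might dispatch, and `FakeDocument` exists purely so the example runs without Docling installed.

```python
from typing import Any, Callable

# Assumed mapping from UI format labels to document export method names.
EXPORT_METHODS = {
    "Markdown": "export_to_markdown",
    "JSON": "export_to_dict",
    "Plain Text": "export_to_text",
    "HTML": "export_to_html",
    "Document Tokens": "export_to_document_tokens",
}


def resolve_exporter(document: Any, output_format: str) -> Callable[[], Any]:
    """Return the bound export method, or raise a readable error if it is missing."""
    method_name = EXPORT_METHODS[output_format]
    exporter = getattr(document, method_name, None)
    if not callable(exporter):
        raise RuntimeError(
            f"{method_name} is not available in the installed version; "
            "try: pip install --upgrade docling"
        )
    return exporter


class FakeDocument:
    """Test double that only supports Markdown export."""

    def export_to_markdown(self) -> str:
        return "# Title"


doc = FakeDocument()
print(resolve_exporter(doc, "Markdown")())
try:
    resolve_exporter(doc, "Document Tokens")
except RuntimeError as err:
    print(err)
```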

License

MIT — see LICENSE.
