Docling is an open-source toolkit for parsing diverse document formats (PDF, DOCX, PPTX, XLSX, HTML, images, audio, LaTeX, plain text) into a unified, lossless DoclingDocument representation that downstream generative AI and RAG systems can consume directly. It pairs IBM Research's DocLayout and TableFormer models with the GraniteDocling visual language model and pluggable OCR engines, runs entirely locally for air-gapped use, and ships as a Python library and CLI, a FastAPI HTTP service (docling-serve), an MCP server (docling-mcp), and a Kubernetes operator. Docling was originally created by IBM Research Zurich and is now hosted by the LF AI & Data Foundation under the MIT license.
- Documents, Parsing, PDF, OCR, Layout, Tables, RAG, LLM, Open Source, IBM Research, LF AI and Data, MCP, Knowledge Graph, Generative AI
- Created: 2026-05-25
- Modified: 2026-05-25
| Item | Value |
|---|---|
| License | MIT |
| Foundation | LF AI & Data Foundation |
| Origin | IBM Research Zurich — AI for Knowledge team |
| GitHub Org | docling-project |
| Primary repo | docling-project/docling (~60k stars) |
| Python | pip install docling (Python 3.10+) |
| CLI | docling <source> [flags] |
| HTTP service | docling-serve — sync, async, WebSocket |
| MCP server | docling-mcp |
| Kubernetes | docling-operator |
| Bindings | Python, Java (docling-java, docling4j), TypeScript (docling-ts), Swift (docling-snap) |
| Default VLM | GraniteDocling-258M |
Core Python library and docling CLI. Converts PDFs, Office docs, HTML, images, audio, LaTeX, and plain text into DoclingDocument; exports to Markdown, HTML, lossless JSON, DocTags, and WebVTT. Implements page layout, reading order, TableFormer table structure, code/formula recognition, picture classification, OCR, and the GraniteDocling VLM pipeline. Runs locally for air-gapped use.
Docs: docling-project.github.io/docling — Source — PyPI
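As a minimal sketch of the library workflow described above, the snippet below converts one document and writes its Markdown export. It assumes `docling` is installed and uses `DocumentConverter`, `convert`, and `export_to_markdown` as shown in the project's quickstart. The helper names `markdown_path_for` and `convert_to_markdown` and the `out` directory are hypothetical, introduced only for illustration.

```python
from pathlib import Path


def markdown_path_for(source: str, out_dir: str = "out") -> Path:
    """Derive an output path such as out/annual.md from a source path or URL.

    Hypothetical helper: the naming scheme is an illustration, not Docling's.
    """
    # Drop any query string, then take the file name without its last suffix.
    stem = Path(source.split("?", 1)[0].rstrip("/")).stem or "document"
    return Path(out_dir) / f"{stem}.md"


def convert_to_markdown(source: str, out_dir: str = "out") -> Path:
    """Convert one document with Docling and write the Markdown export."""
    # Imported lazily so the path helper above works without docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)  # local path or URL
    target = markdown_path_for(source, out_dir)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(result.document.export_to_markdown(), encoding="utf-8")
    return target
```

Calling `convert_to_markdown("reports/annual.pdf")` would write `out/annual.md`. The first run may download model weights, so a fully air-gapped setup presumably needs the models cached beforehand.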
HTTP service exposing the Docling pipeline. POST /v1/convert/source and POST /v1/convert/file are synchronous; /v1/convert/source/async and /v1/convert/file/async queue the work and return a task_id. Task status can be polled at /v1/status/poll/{task_id} or streamed over WebSocket at /v1/status/ws/{task_id}, and the finished result is retrieved from /v1/result/{task_id}. CPU, CUDA 12.8/13.0, and AMD ROCm 6.3 container variants ship out of the box.
Docs: docling-serve usage — Source
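The async flow above (submit, poll, fetch the result) can be sketched with the standard library alone. The endpoint paths come from the description above; everything else is an assumption to check against the OpenAPI document: the base URL `http://localhost:5001`, the request body shape (`sources` with `kind` and `url`), and the `task_status` values `success` and `failure`. The `send` parameter is a hypothetical seam that lets the flow be exercised without a running server.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Assumed local docling-serve address; adjust to your deployment.
BASE_URL = "http://localhost:5001"

Send = Callable[[str, str, Optional[dict]], dict]


def http_send(method: str, path: str, body: Optional[dict] = None) -> dict:
    """Send one JSON request to docling-serve and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def convert_async(
    url: str,
    send: Send = http_send,
    interval: float = 2.0,
    max_polls: int = 150,
) -> dict:
    """Submit a URL for conversion, poll until it finishes, return the result."""
    # Assumed request shape; verify field names against the Convert Request schema.
    payload = {"sources": [{"kind": "http", "url": url}]}
    task_id = send("POST", "/v1/convert/source/async", payload)["task_id"]

    for _ in range(max_polls):
        status = send("GET", f"/v1/status/poll/{task_id}", None)
        state = status.get("task_status")  # assumed field name and values
        if state == "success":
            return send("GET", f"/v1/result/{task_id}", None)
        if state == "failure":
            raise RuntimeError(f"conversion task {task_id} failed")
        time.sleep(interval)

    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Polling is the simplest option for scripts and batch jobs; the WebSocket status endpoint is likely a better fit for interactive clients that want progress updates without repeated requests.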
- OpenAPI
- JSON Schema — DoclingDocument
- JSON Schema — Convert Request
- JSON-LD Context
- Naftiko Capability — Convert
- Naftiko Capability — Tasks
MCP server (docling-mcp) that exposes Docling parsing as agent tools for Claude, Cursor, Gemini, and any MCP-aware client. Lets agents convert PDFs, Office files, and images into DoclingDocument without bespoke integration code.
Docs / Source: docling-project/docling-mcp
Canonical DoclingDocument data model and serialization primitives shared by the Docling library, Docling Serve, and all language bindings.
Source: docling-project/docling-core — PyPI
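To illustrate what consuming the shared data model can look like, here is a small sketch that summarizes a DoclingDocument from its lossless JSON export using only the standard library. The field names it reads (`name`, `texts`, `label`, `text`, `tables`) and the label values `title` and `section_header` are assumptions about the JSON layout and should be checked against the published JSON Schema. The `summarize` function is hypothetical and not part of docling-core.

```python
from collections import Counter


def summarize(doc: dict) -> dict:
    """Summarize a DoclingDocument JSON export loaded into a dict.

    Field names are assumptions about the export layout, not a guaranteed API.
    """
    texts = doc.get("texts", [])
    # Count text items per label (for example paragraphs versus headings).
    labels = Counter(item.get("label", "unknown") for item in texts)
    # Collect heading text to get a rough outline of the document.
    headings = [
        item.get("text", "")
        for item in texts
        if item.get("label") in ("title", "section_header")
    ]
    return {
        "name": doc.get("name"),
        "labels": dict(labels),
        "headings": headings,
        "tables": len(doc.get("tables", [])),
    }
```

In practice the dict would come from `json.load` on an exported file. For typed access and validation, the Pydantic models in docling-core are the more robust route.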
Native C++ engine that extracts text with precise coordinates from programmatic PDFs. Distributed as a Python extension.
Source: docling-project/docling-parse
Open-weight IBM Research models that power Docling: DocLayout, TableFormer, code/formula heads, picture classifier, and GraniteDocling-258M. Distributed through Hugging Face.
Source: docling-project/docling-ibm-models
End-to-end evaluation framework for document parsing models and services. Standard datasets and metrics for layout, tables, OCR, and reading order.
Source: docling-project/docling-eval
Tools for generating synthetic labeled document data from real corpora for fine-tuning and RAG stress-testing.
Source: docling-project/docling-sdg
Converts unstructured documents (via Docling) into validated, queryable knowledge graphs for GraphRAG.
Source: docling-project/docling-graph
Reference agent that reads, writes, and edits documents using Docling as the IO layer.
Source: docling-project/docling-agent
Go-based operator that deploys and manages Docling Serve workloads — model-cache PVCs, RQ workers, GPU pools, OAuth, sticky sessions.
Source: docling-project/docling-operator
JVM API for Docling.
Source: docling-project/docling-java
Java-native document understanding integrations over Docling.
Source: docling-project/docling4j
TypeScript/JavaScript types and helpers for consuming DoclingDocument JSON and DocTags.
Source: docling-project/docling-ts
First-party LangChain document loader and chunker.
Source: docling-project/docling-langchain
Shared job-runner primitives used by Docling Serve and the operator (RQ workers, Ray).
Source: docling-project/docling-jobkit
- Portal — docling-project.github.io
- Documentation
- GettingStarted — Quickstart
- SourceCode — Main repo
- GitHubOrganization — docling-project
- License — MIT
- SDK — docling (PyPI)
- SDK — docling-core (PyPI)
- SDK — docling-serve (PyPI)
- SDK — Java bindings
- SDK — Docling4j
- SDK — TypeScript / JavaScript
- CLI
- ReleaseNotes
- ChangeLog
- Issues
- Discussions
- ContributionGuide
- CodeOfConduct
- Governance — LF AI & Data project page
- Foundation — LF AI & Data
- Models — IBM DS4SD on Hugging Face
- Models — GraniteDocling-258M
- Blog — IBM Research
- AcademicPaper — Docling Technical Report (arXiv)
- ContainerImage — docling-serve (Quay)
- ContainerImage — docling-serve (GHCR)
- KubernetesOperator
LangChain, LlamaIndex, Haystack, Crew AI, txtai, spaCy, Apify, NVIDIA NIM / NeMo Retriever, InstructLab, Bee Agent Framework, Weaviate, Qdrant, Milvus, OpenSearch.
Machine-readable specifications organized by format.
- docling-serve POST /v1/convert/source
- docling-serve async submit → poll → result
- docling CLI convert
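The CLI conversion listed above can also be driven from a script. The sketch below only builds the argument list for `docling <source> [flags]`. The `--to` and `--output` flags are taken from the CLI's documented options, while the `build_docling_command` helper and its defaults are hypothetical.

```python
import shlex
from typing import List, Optional


def build_docling_command(
    source: str, to: str = "md", output: Optional[str] = None
) -> List[str]:
    """Build an argv list for the docling CLI (hypothetical helper)."""
    command = ["docling", source, "--to", to]
    if output:
        # Directory where the CLI should write its converted files.
        command += ["--output", output]
    return command


if __name__ == "__main__":
    argv = build_docling_command("report.pdf", to="md", output="out")
    # Print the shell-quoted command; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids shell-quoting problems when source paths contain spaces.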
FN: Kin Lane
Email: info@apievangelist.com