doclingclient

A Go client library and CLI for docling. Docling is a deep-learning document analysis and conversion project, which can also be run as a service. This project helps to decouple document processing, which may benefit from a GPU, from the client, which may be a lower-spec machine.

Installation

$ go install github.com/miku/doclingclient/cmd/docli@latest

Packages (deb, rpm) are available; cf. releases. Quick start:

$ docli --server http://docling.city:5001 convert https://arxiv.org/pdf/2110.06595

Background, Prompt

docling-serve supplies an OpenAPI spec, currently using version 3.1.0 of the standard.

$ jq -rc '.paths | keys[]' openapi.json
/health
/openapi-3.0.json
/ready
/v1/chunk/hierarchical/file
/v1/chunk/hierarchical/file/async
/v1/chunk/hierarchical/source
/v1/chunk/hierarchical/source/async
/v1/chunk/hybrid/file
/v1/chunk/hybrid/file/async
/v1/chunk/hybrid/source
/v1/chunk/hybrid/source/async
/v1/clear/converters
/v1/clear/results
/v1/convert/file
/v1/convert/file/async
/v1/convert/source
/v1/convert/source/async
/v1/memory/counts
/v1/memory/stats
/v1/result/{task_id}
/v1/status/poll/{task_id}
/version

Unfortunately, an SDK generated from a spec can be quite large and may have downsides; cf. also this comparison.

Hence, we decided to use a more manual approach. We use an LLM to build a simple, mostly idiomatic client for the core functionality first. For docling this may be just /v1/convert/file and /v1/convert/source, which would already serve most use cases.

Create a minimal Go library, then wrap a nice CLI around the library, so interacting with the docling service becomes easy to integrate into shell scripts or ad-hoc human (and maybe agentic) terminal use.
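To give a feel for how small such a hand-written client can be, here is a minimal sketch of a single call to /v1/convert/source using only the standard library. It is not the code of this library. The request and response field names (sources, kind, url, to_formats, md_content) are assumptions modelled on the docling-serve API, and newStubServer is a hypothetical stand-in for a real server so the example runs offline.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// source and convertRequest mirror only the fields this sketch needs.
// The JSON names are assumptions, not taken from this library's types.go.
type source struct {
	Kind string `json:"kind"`
	URL  string `json:"url"`
}

type convertRequest struct {
	Sources []source       `json:"sources"`
	Options map[string]any `json:"options,omitempty"`
}

type convertResponse struct {
	Status   string `json:"status"`
	Document struct {
		MDContent string `json:"md_content"`
	} `json:"document"`
}

// convertURL posts one document URL to /v1/convert/source and decodes the reply.
func convertURL(ctx context.Context, hc *http.Client, server, docURL string) (*convertResponse, error) {
	body, err := json.Marshal(convertRequest{
		Sources: []source{{Kind: "http", URL: docURL}},
		Options: map[string]any{"to_formats": []string{"md"}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, server+"/v1/convert/source", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := hc.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected HTTP status: %s", resp.Status)
	}
	var out convertResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}

// newStubServer is a hypothetical stand-in for docling-serve.
func newStubServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/v1/convert/source" {
			http.NotFound(w, r)
			return
		}
		var in convertRequest
		if err := json.NewDecoder(r.Body).Decode(&in); err != nil || len(in.Sources) == 0 {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"success","document":{"md_content":"# Title"}}`)
	}))
}

// demo runs convertURL against the stub and returns the Markdown it received.
func demo() (string, error) {
	srv := newStubServer()
	defer srv.Close()
	resp, err := convertURL(context.Background(), srv.Client(), srv.URL, "https://arxiv.org/pdf/2206.01062")
	if err != nil {
		return "", err
	}
	return resp.Document.MDContent, nil
}

func main() {
	md, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(md)
}
```

Pointing convertURL at a real docling-serve base URL instead of the stub should work the same way, provided the assumed field names match the server's spec.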

Status: Library and CLI cover synchronous conversion (/v1/convert/{source,file}), synchronous chunking (/v1/chunk/{hybrid,hierarchical}/{source,file}), and the /health, /ready, and /version routes. Async conversion and async chunking are not yet wrapped.

Requirements: Go 1.24+. A running docling-serve instance (defaults to http://localhost:5001).

Library

import (
    "context"
    "fmt"
    "log"
    "time"

    "github.com/miku/doclingclient"
)

ctx := context.Background()

c := doclingclient.New("http://localhost:5001",
    doclingclient.WithAPIKey("sk-..."),
    doclingclient.WithTimeout(10*time.Minute),
)

// Convert a URL.
resp, err := c.ConvertURL(ctx, "https://arxiv.org/pdf/2206.01062", nil)
if err != nil {
    log.Fatal(err)
}

// Convert a local file (streamed multipart upload).
resp, err = c.ConvertPath(ctx, "paper.pdf", &doclingclient.Options{
    ToFormats: []doclingclient.OutputFormat{
        doclingclient.FormatMD,
        doclingclient.FormatJSON,
    },
    DoOCR:    doclingclient.Ptr(true),
    Pipeline: doclingclient.PipelineStandard,
})
if err != nil {
    log.Fatal(err)
}

// A 200 response can still describe a conversion failure, so check it.
if err := resp.Err(false); err != nil {
    log.Fatal(err)
}
fmt.Println(resp.Document.MDContent)
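Why the extra check matters: the HTTP layer can report 200 while the body carries a failed or partial conversion. The sketch below illustrates the idea on raw response bodies. It is not the library's Err method; checkResult and its allowPartial parameter are hypothetical, and the status values (success, partial_success, failure) and the error_message field are assumptions about the server's response schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// convertResult holds only the fields needed to judge the outcome.
// The JSON names are assumptions about the response schema.
type convertResult struct {
	Status string `json:"status"`
	Errors []struct {
		ErrorMessage string `json:"error_message"`
	} `json:"errors"`
}

// checkResult returns nil for "success", and for "partial_success" only when
// allowPartial is true. Any other status is turned into an error.
func checkResult(body []byte, allowPartial bool) error {
	var r convertResult
	if err := json.Unmarshal(body, &r); err != nil {
		return fmt.Errorf("decode response: %w", err)
	}
	if r.Status == "success" || (r.Status == "partial_success" && allowPartial) {
		return nil
	}
	msgs := make([]string, 0, len(r.Errors))
	for _, e := range r.Errors {
		msgs = append(msgs, e.ErrorMessage)
	}
	if len(msgs) == 0 {
		return fmt.Errorf("conversion status %q (no error details)", r.Status)
	}
	return fmt.Errorf("conversion status %q: %s", r.Status, strings.Join(msgs, "; "))
}

func main() {
	bodies := []string{
		`{"status":"success","errors":[]}`,
		`{"status":"partial_success","errors":[{"error_message":"page 3 timed out"}]}`,
		`{"status":"failure","errors":[{"error_message":"unsupported format"}]}`,
	}
	for _, b := range bodies {
		fmt.Println(checkResult([]byte(b), false))
	}
}
```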

The library covers /v1/convert/source (URL or base64 in-body), /v1/convert/file (streamed multipart upload), and the /health, /ready, and /version routes, plus the synchronous chunking routes noted under Status. The options struct in types.go is a deliberate subset of ConvertDocumentsOptions; extend it as needed for full coverage.

Note on output formats: the docling-serve OutputFormat enum also defines yaml, html_split_page, and vtt, but the ExportDocumentResponse object does not carry corresponding content fields, so this library and CLI do not surface them. The five exposed formats (md, json, html, text, doctags) match what the server actually returns.
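The relationship between a requested format and the response field that carries it can be sketched as a small lookup. This is an illustration only: the JSON field names (md_content, json_content, html_content, text_content, doctags_content) are assumptions about the ExportDocumentResponse object, and exportDocument, content, and pick are hypothetical helpers rather than part of this library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// exportDocument models the five content fields discussed above.
// Pointers distinguish "absent or null" from an empty string.
type exportDocument struct {
	MD      *string         `json:"md_content"`
	JSON    json.RawMessage `json:"json_content"` // kept raw, since it is a nested object
	HTML    *string         `json:"html_content"`
	Text    *string         `json:"text_content"`
	DocTags *string         `json:"doctags_content"`
}

// content returns the payload for one format and whether it was present.
// Formats without a content field (e.g. yaml, vtt) always report false.
func content(doc exportDocument, format string) (string, bool) {
	str := func(p *string) (string, bool) {
		if p == nil {
			return "", false
		}
		return *p, true
	}
	switch format {
	case "md":
		return str(doc.MD)
	case "json":
		if len(doc.JSON) == 0 || bytes.Equal(doc.JSON, []byte("null")) {
			return "", false
		}
		return string(doc.JSON), true
	case "html":
		return str(doc.HTML)
	case "text":
		return str(doc.Text)
	case "doctags":
		return str(doc.DocTags)
	}
	return "", false
}

// pick decodes a document object and extracts one format from it.
func pick(body, format string) (string, bool) {
	var doc exportDocument
	if err := json.Unmarshal([]byte(body), &doc); err != nil {
		return "", false
	}
	return content(doc, format)
}

func main() {
	body := `{"md_content":"# T","json_content":{"a":1},"html_content":null}`
	for _, f := range []string{"md", "json", "html", "yaml"} {
		s, ok := pick(body, f)
		fmt.Printf("%s: present=%v %q\n", f, ok, s)
	}
}
```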

CLI

A minimal command, docli, wraps the library. It is named to avoid collision with the upstream docling CLI.

go install github.com/miku/doclingclient/cmd/docli@latest

# Convert a URL (default output: markdown to stdout).
docli convert https://arxiv.org/pdf/2206.01062 > paper.md

# Convert a local file as JSON.
docli convert --to json paper.pdf > paper.json

# Produce several formats at once and write them to a directory.
docli convert --to md,json,html --output ./out paper.pdf
# => ./out/paper.md, ./out/paper.json, ./out/paper.html

# Talk to a remote docling-serve, with auth.
DOCLING_SERVER=https://docling.example.org \
DOCLING_API_KEY=sk-... \
    docli convert paper.pdf

# Server checks.
docli health
docli ready
docli version

Chunking for RAG / embeddings

docli chunk converts a document and splits it into chunks suitable for feeding into an embedding model. Output is JSONL on stdout — one chunk per line — which composes naturally with jq.

# Default hybrid chunker (tokenization-aware).
docli chunk paper.pdf > chunks.jsonl

# Pick a tokenizer and cap chunks to 512 tokens.
docli chunk --max-tokens 512 \
    --tokenizer Qwen/Qwen3-Embedding-0.6B \
    paper.pdf > chunks.jsonl

# Structural chunks (one per document element, no tokenizer).
docli chunk --chunker hierarchical paper.pdf > chunks.jsonl

# Inspect chunk lengths.
jq -r '.num_tokens // (.text | length)' < chunks.jsonl | sort -n | uniq -c

Each chunk carries text (with headings/captions inlined for context), optional raw_text (with --include-raw-text), num_tokens, headings, captions, page_numbers, and doc_items references into the source document.
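For consumers written in Go rather than jq, the JSONL stream can be read with a plain json.Decoder. The sketch below decodes a subset of the fields listed above and mirrors the jq expression from the previous example: it uses num_tokens when present and falls back to the character count of text. The chunk struct and the sizes helper are illustrative, not part of this library, and the sample input is made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
	"unicode/utf8"
)

// chunk decodes a subset of the per-chunk fields.
type chunk struct {
	Text        string   `json:"text"`
	NumTokens   *int     `json:"num_tokens"` // nil when absent or null
	Headings    []string `json:"headings"`
	PageNumbers []int    `json:"page_numbers"`
}

// size mirrors `.num_tokens // (.text | length)`: token count if known,
// otherwise the number of Unicode code points in the text.
func (c chunk) size() int {
	if c.NumTokens != nil {
		return *c.NumTokens
	}
	return utf8.RuneCountInString(c.Text)
}

// readChunks decodes consecutive JSON objects until EOF.
func readChunks(r io.Reader) ([]chunk, error) {
	dec := json.NewDecoder(r)
	var out []chunk
	for {
		var c chunk
		err := dec.Decode(&c)
		if errors.Is(err, io.EOF) {
			return out, nil
		}
		if err != nil {
			return nil, fmt.Errorf("chunk %d: %w", len(out)+1, err)
		}
		out = append(out, c)
	}
}

// sizes returns the size of every chunk in a JSONL string.
func sizes(jsonl string) ([]int, error) {
	chunks, err := readChunks(strings.NewReader(jsonl))
	if err != nil {
		return nil, err
	}
	out := make([]int, len(chunks))
	for i, c := range chunks {
		out[i] = c.size()
	}
	return out, nil
}

// sample is made-up input standing in for chunks.jsonl.
const sample = `{"text":"Intro text","num_tokens":3,"headings":["1 Introduction"]}
{"text":"héllo","page_numbers":[2]}
`

func main() {
	got, err := sizes(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

To process real output, pass os.Stdin to readChunks and pipe docli chunk into the program.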

Tokenizer choice

The hybrid chunker counts tokens to keep each chunk within a budget. That budget is meaningful only relative to a specific tokenizer — and you almost always want the tokenizer to match the embedding model you'll feed the chunks into downstream, so chunk sizes line up with the embedder's context window.

docling-serve accepts any HuggingFace tokenizer identifier as --tokenizer (OpenAI/tiktoken tokenizers are not reachable through the server). The default is sentence-transformers/all-MiniLM-L6-v2. If you don't pass --max-tokens, the cap is derived from the tokenizer's model_max_length.

A few common picks, biased toward what shows up in docling's own examples and typical RAG stacks:

Tokenizer (HuggingFace ID)	Max tokens	Notes
`sentence-transformers/all-MiniLM-L6-v2`	256	Default. Tiny, fast, English-only. Good baseline.
`sentence-transformers/all-mpnet-base-v2`	384	Higher-quality English embeddings, still small.
`BAAI/bge-small-en-v1.5`	512	Strong small English model, widely used in RAG.
`BAAI/bge-m3`	8192	Multilingual, long-context. Good general-purpose pick.
`intfloat/multilingual-e5-large`	512	Multilingual, balanced quality/size.
`nomic-ai/nomic-embed-text-v1.5`	8192	Long-context English.
`Qwen/Qwen3-Embedding-0.6B`	32768	Long-context, multilingual, newer.

Rule of thumb: pick the tokenizer that ships with the embedding model you plan to call after docli chunk. Mixing them silently misaligns the token count and leads to chunks that overflow (or underfill) the real embedder.

The server needs to fetch the tokenizer the first time it sees it. In air-gapped deployments only models already cached on the server will work.
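The interplay of tokenizer and --max-tokens can be sketched as a small budget check on the client side. This is only an illustration of the rule of thumb: the real cap is derived server-side from the tokenizer's model_max_length, whereas the table here simply copies the "Max tokens" column above. The chunkBudget function is hypothetical and not part of docli.

```go
package main

import "fmt"

const defaultTokenizer = "sentence-transformers/all-MiniLM-L6-v2"

// contextWindow copies the "Max tokens" column from the table above.
var contextWindow = map[string]int{
	"sentence-transformers/all-MiniLM-L6-v2":  256,
	"sentence-transformers/all-mpnet-base-v2": 384,
	"BAAI/bge-small-en-v1.5":                  512,
	"BAAI/bge-m3":                             8192,
	"intfloat/multilingual-e5-large":          512,
	"nomic-ai/nomic-embed-text-v1.5":          8192,
	"Qwen/Qwen3-Embedding-0.6B":               32768,
}

// chunkBudget returns the token cap to use for a tokenizer.
// maxTokens <= 0 means "not set": the tokenizer's window is used.
// A cap larger than the window is rejected, since such chunks would
// overflow the embedder.
func chunkBudget(tokenizer string, maxTokens int) (int, error) {
	if tokenizer == "" {
		tokenizer = defaultTokenizer
	}
	limit, ok := contextWindow[tokenizer]
	if !ok {
		return 0, fmt.Errorf("tokenizer %q is not in this sketch's table", tokenizer)
	}
	if maxTokens <= 0 {
		return limit, nil
	}
	if maxTokens > limit {
		return 0, fmt.Errorf("max tokens %d exceeds the %d-token window of %s", maxTokens, limit, tokenizer)
	}
	return maxTokens, nil
}

func main() {
	fmt.Println(chunkBudget("", 0))
	fmt.Println(chunkBudget("Qwen/Qwen3-Embedding-0.6B", 512))
	fmt.Println(chunkBudget("BAAI/bge-small-en-v1.5", 1024))
}
```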

Conversion flags (shared by `convert` and `chunk`)

These flags tune the underlying document conversion. They apply identically to docli convert and docli chunk. Numeric and boolean defaults marked (auto) are sent only when you set them explicitly, so docling-serve's own defaults stay authoritative on bare invocations.

Flag	Default	Description
`--from`	(auto)	Input formats, e.g. `pdf,docx`; server autodetects if empty.
`--ocr`	`true`	Enable OCR.
`--force-ocr`	`false`	Force OCR over existing text.
`--ocr-lang`	(auto)	Comma-separated OCR languages, e.g. `en,de`.
`--table-mode`	(auto)	`fast` or `accurate`; server default if empty.
`--tables`	(auto)	Extract table structure. Sent only when explicitly set.
`--pages`	(all)	Page range, e.g. `1-10` or `3`.
`--image-export-mode`	(auto)	`placeholder`, `embedded`, or `referenced`. Server default if empty.
`--include-images`	(auto)	Include extracted images. Sent only when explicitly set.
`--images-scale`	(auto)	Scale factor for extracted images (server default ~2.0).
`--abort-on-error`	`false`	Abort on first error. Sent only when explicitly set.
`--document-timeout`	(none)	Per-document timeout in seconds.
`--pdf-backend`	(auto)	`pypdfium2`, `docling_parse`, `dlparse_v1`, `dlparse_v2`, `dlparse_v4`.
`--pipeline`	(auto)	`legacy`, `standard`, `vlm`, or `asr`. Server default if empty.
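One common way to get the "sent only when explicitly set" behaviour in Go is to combine pointer fields with omitempty, so that an unset option never appears in the request body. The sketch below shows the pattern on a hypothetical subset of options. The struct and JSON names are assumptions and may differ from what types.go uses, and ptr is a local helper comparable in spirit to doclingclient.Ptr.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// options is a hypothetical subset of the conversion options.
// Pointer fields have three states: nil (unset), true/false or a number (set).
type options struct {
	DoOCR       *bool    `json:"do_ocr,omitempty"`
	TableMode   string   `json:"table_mode,omitempty"`
	ImagesScale *float64 `json:"images_scale,omitempty"`
	OCRLang     []string `json:"ocr_lang,omitempty"`
}

// ptr returns a pointer to v, handy for literal values.
func ptr[T any](v T) *T { return &v }

// encode renders the options as the JSON that would be sent.
func encode(o options) string {
	b, err := json.Marshal(o)
	if err != nil {
		panic(err) // cannot happen for this struct
	}
	return string(b)
}

func main() {
	// Bare invocation: nothing is sent, server defaults apply.
	fmt.Println(encode(options{})) // {}

	// An explicit false is still sent, because the pointer is non-nil.
	fmt.Println(encode(options{DoOCR: ptr(false), TableMode: "fast"}))
	// {"do_ocr":false,"table_mode":"fast"}
}
```

A plain bool with omitempty would drop an explicit false, which is why the pointer is needed for boolean and numeric options.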

`docli convert` extras

Flag	Default	Description
`--to`, `-t`	`md`	Output formats: `md`, `json`, `html`, `text`, `doctags`.
`--output`, `-o`	(none)	Directory to write all requested formats as `<basename>.<ext>`; stdout is silent.
`--status`	`false`	Emit one status line/object to stderr after the conversion.
`--status-format`	`text`	`text` or `json` (see Caching below).
`--cache-dir`	(XDG)	Override the on-disk cache directory. Env: `DOCLING_CACHE_DIR`.
`--no-cache`	`false`	Disable the on-disk result cache.

`docli chunk` extras

Flag	Default	Description
`--chunker`	`hybrid`	Chunker strategy: `hybrid` or `hierarchical`.
`--max-tokens`	(auto)	Hybrid only. Max tokens per chunk; derived from the tokenizer if unset.
`--tokenizer`	`sentence-transformers/all-MiniLM-L6-v2`	Hybrid only. HuggingFace tokenizer ID. See "Tokenizer choice" above.
`--merge-peers`	`true`	Hybrid only. Merge undersized successive chunks with the same headings.
`--markdown-tables`	`false`	Serialize tables as Markdown instead of triplets.
`--include-raw-text`	`false`	Populate `raw_text` on each chunk alongside the contextualized `text`.
`--pretty`	`false`	Emit the full response as indented JSON instead of one chunk per line.

Note: docli chunk does not cache results; each invocation re-runs the conversion server-side. Only docli convert uses the on-disk cache.

Global flags (any subcommand): --server/-s (env DOCLING_SERVER), --api-key/-K (env DOCLING_API_KEY), --tenant/-T (env DOCLING_TENANT_ID).
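A typical way to wire flags with environment fallbacks is to use the environment value as the flag's default. The sketch below does that with the standard flag package. It is not docli's actual implementation: parseGlobals is hypothetical, the precedence (flag over environment over built-in default) is an assumption, and unlike docli this simple version only accepts global flags before the subcommand.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

type globals struct {
	Server, APIKey, Tenant string
}

// parseGlobals parses the global flags from args. getenv is injected
// (os.Getenv in production) so the lookup is easy to test.
// It returns the settings and the remaining, non-flag arguments.
func parseGlobals(args []string, getenv func(string) string) (globals, []string, error) {
	var g globals
	fs := flag.NewFlagSet("docli", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the way; errors are returned

	// bind registers a long and a short flag that share one variable and
	// default to the environment value, or to fallback if that is empty.
	bind := func(p *string, long, short, env, fallback, usage string) {
		def := fallback
		if v := getenv(env); v != "" {
			def = v
		}
		fs.StringVar(p, long, def, usage)
		fs.StringVar(p, short, def, usage+" (shorthand)")
	}
	bind(&g.Server, "server", "s", "DOCLING_SERVER", "http://localhost:5001", "docling-serve base URL")
	bind(&g.APIKey, "api-key", "K", "DOCLING_API_KEY", "", "API key")
	bind(&g.Tenant, "tenant", "T", "DOCLING_TENANT_ID", "", "tenant ID")

	if err := fs.Parse(args); err != nil {
		return g, nil, err
	}
	return g, fs.Args(), nil
}

func main() {
	g, rest, err := parseGlobals(os.Args[1:], os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("server=%s tenant=%q args=%v\n", g.Server, g.Tenant, rest)
}
```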

Caching

docli convert caches results on disk by default, so repeating a request is near-instant. The cache location follows the XDG Base Directory spec, typically ~/.cache/doclingclient/, and can be overridden with --cache-dir or DOCLING_CACHE_DIR. Disable caching with --no-cache.

Layout:

~/.cache/doclingclient/
├── server_version.json           # /version response, refreshed every 24 h
└── <12-char-server-hash>/
    ├── server_info.json           # full server version map for this namespace
    └── <input-hash>.json.zst     # zstd-compressed ConvertResponse JSON

The cache key fingerprints everything that affects output: source URL or local file content (SHA-256), to_formats, OCR settings, table mode, page range, etc. The server-version directory namespaces cached results, so an upstream docling-serve upgrade naturally falls into a fresh namespace; old results stay around for diffing or can be pruned with rm -rf ~/.cache/doclingclient/<hash>/.
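The fingerprinting idea can be sketched in a few lines: serialize everything that influences the output into a canonical form and hash it. The code below is an assumption-laden illustration, not the derivation used in cache.go. The keyInput fields, the sorting of formats, and the way the 12-character server namespace is computed are all hypothetical choices.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// keyInput lists the request properties that should influence the key.
type keyInput struct {
	Source    string   `json:"source"` // URL, or the result of fileSource for local files
	ToFormats []string `json:"to_formats"`
	DoOCR     *bool    `json:"do_ocr,omitempty"`
	TableMode string   `json:"table_mode,omitempty"`
	PageRange string   `json:"page_range,omitempty"`
}

// fileSource identifies a local file by content, so renaming it keeps the key.
func fileSource(content []byte) string {
	sum := sha256.Sum256(content)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// cacheKey hashes a canonical JSON encoding of the input.
// Formats are sorted on a copy so that "md,json" and "json,md" share a key.
func cacheKey(in keyInput) string {
	formats := append([]string(nil), in.ToFormats...)
	sort.Strings(formats)
	in.ToFormats = formats
	b, err := json.Marshal(in)
	if err != nil {
		panic(err) // cannot happen for this struct
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// serverNamespace derives a 12-character directory name from the server's
// version map. encoding/json emits map keys in sorted order, which keeps
// the result stable.
func serverNamespace(versions map[string]string) string {
	b, err := json.Marshal(versions)
	if err != nil {
		panic(err) // cannot happen for string maps
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])[:12]
}

func main() {
	key := cacheKey(keyInput{
		Source:    fileSource([]byte("%PDF-1.7 example bytes")),
		ToFormats: []string{"md", "json"},
	})
	ns := serverNamespace(map[string]string{"docling-serve": "x.y.z"})
	fmt.Printf("%s/%s.json.zst\n", ns, key)
}
```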

Use --status to see whether a request was served fresh or from cache:

$ docli convert --status paper.pdf > /dev/null
status=success processing_time=12.43s source=fresh
$ docli convert --status paper.pdf > /dev/null
status=success processing_time=12.43s source=cached

For ad-hoc post-processing, add --status-format json to emit a single JSON object per run to stderr (one line, suitable for jq or appending to a log):

$ docli convert --status --status-format json paper.pdf > paper.md
{"status":"success","processing_time":12.43,"source":"fresh","filename":"paper.pdf","errors":[]}

$ docli convert --status --status-format json paper.pdf 2> status.jsonl > paper.md
$ jq -r '.processing_time' < status.jsonl
12.43

Testing

go test ./...
go test -cover ./...

The library exercises its HTTP client against httptest.Server; no live docling-serve instance is required.

A random thought on openapi

OpenAPI was very helpful for getting this client started, in that the LLM could consult the openapi.json file for the spec. However, we did not need any of the OpenAPI generators, of which there are quite a few. A more systematic feature comparison of the various libraries is still outstanding, but one could imagine a client SDK generator based on an LLM, a prompt, and openapi.json.
