ndcorder/smelt

smelt

smelt converts PDF, DOCX, PPTX, XLSX, images, audio, web pages, and YouTube videos into clean, structured text optimized for large language models. Its main entry point is a convert MCP tool that any compatible client (Claude Desktop, Cursor, etc.) can call directly. Multiple backends compete on quality: install one or install them all, and smelt picks the best available converter for each file type. Output is available as markdown, XML, JSON, or DocTags.

Installation

| Use case | Install command | Backends available |
|----------|-----------------|--------------------|
| GPU workstation, best PDF quality | pip install smelt[mineru] | MinerU |
| CPU server, good all-rounder | pip install smelt[docling] | Docling |
| Lightweight/embedded, PDF only | pip install smelt[pymupdf] | PyMuPDF |
| Web scraping + YouTube | pip install smelt[web] | Trafilatura, yt-dlp |
| Audio transcription | pip install smelt[whisper] | faster-whisper |
| Everything | pip install smelt[all] | All backends |
| Dev | pip install -e ".[all,dev]" | All + test tools |

Quick start

  1. Install smelt with your preferred backend:
pip install smelt[docling]
  2. Add to Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS, %APPDATA%\Claude\claude_desktop_config.json on Windows):
{
  "mcpServers": {
    "smelt": {
      "command": "python",
      "args": ["-m", "smelt"]
    }
  }
}
  3. Ask Claude to convert a document:

"Convert report.pdf to markdown"

Claude will call convert automatically.

Output formats

All tools accept a format parameter: "markdown" (default), "xml", "json", or "doctags".

Markdown

# Q1 Revenue Report

Company revenue exceeded targets across all regions in Q1 2025.

| Quarter | Revenue | Growth |
|---------|---------|--------|
| Q1 2024 | $4.2M | 12% |
| Q1 2025 | $5.1M | 21% |

XML

<document>
  <heading level="1">Q1 Revenue Report</heading>
  <paragraph>Company revenue exceeded targets across all regions in Q1 2025.</paragraph>
  <table>
    <header>
      <cell>Quarter</cell>
      <cell>Revenue</cell>
      <cell>Growth</cell>
    </header>
    <row>
      <cell>Q1 2024</cell>
      <cell>$4.2M</cell>
      <cell>12%</cell>
    </row>
    <row>
      <cell>Q1 2025</cell>
      <cell>$5.1M</cell>
      <cell>21%</cell>
    </row>
  </table>
</document>

JSON

{
  "metadata": {
    "source": "report.pdf",
    "backend": "docling",
    "pages": 1
  },
  "elements": [
    {"type": "heading", "level": 1, "text": "Q1 Revenue Report"},
    {"type": "paragraph", "text": "Company revenue exceeded targets across all regions in Q1 2025."},
    {
      "type": "table",
      "headers": ["Quarter", "Revenue", "Growth"],
      "rows": [
        ["Q1 2024", "$4.2M", "12%"],
        ["Q1 2025", "$5.1M", "21%"]
      ]
    }
  ]
}
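
Because the JSON format is a flat list of typed elements, it is easy to post-process. The sketch below assumes only the field names visible in the sample above (type, level, text, headers, rows) and re-renders the elements list as markdown. The function name elements_to_markdown is hypothetical and not part of smelt.

```python
import json

# Sample document copied from the JSON example above.
SAMPLE = """
{
  "metadata": {"source": "report.pdf", "backend": "docling", "pages": 1},
  "elements": [
    {"type": "heading", "level": 1, "text": "Q1 Revenue Report"},
    {"type": "paragraph", "text": "Company revenue exceeded targets across all regions in Q1 2025."},
    {"type": "table",
     "headers": ["Quarter", "Revenue", "Growth"],
     "rows": [["Q1 2024", "$4.2M", "12%"], ["Q1 2025", "$5.1M", "21%"]]}
  ]
}
"""


def elements_to_markdown(doc: dict) -> str:
    """Render a parsed smelt-style JSON document as markdown (hypothetical helper)."""
    blocks = []
    for el in doc.get("elements", []):
        kind = el.get("type")
        if kind == "heading":
            blocks.append("#" * el.get("level", 1) + " " + el["text"])
        elif kind == "paragraph":
            blocks.append(el["text"])
        elif kind == "table":
            headers = el["headers"]
            lines = [
                "| " + " | ".join(headers) + " |",
                "| " + " | ".join("---" for _ in headers) + " |",
            ]
            lines += ["| " + " | ".join(row) + " |" for row in el["rows"]]
            blocks.append("\n".join(lines))
        # Element types not shown in the sample are skipped in this sketch.
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(elements_to_markdown(json.loads(SAMPLE)))
```

Other element types may exist in real output; this sketch ignores anything it does not recognize rather than guessing at its shape.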

Configuration

smelt works out of the box with no configuration. For customization, create a smelt.yaml file. See smelt.yaml.example for a full reference.

Config search order (first found wins):

  1. $SMELT_CONFIG environment variable (explicit path)
  2. ./smelt.yaml (current directory)
  3. ~/.config/smelt/smelt.yaml (XDG config)
  4. /etc/smelt/smelt.yaml (system-wide)
  5. Built-in defaults
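
The search order above can be sketched as a small lookup function. This is an illustration of the documented order, not smelt's actual loader: the name find_config and its parameters are hypothetical, and the behavior when $SMELT_CONFIG points at a missing file (here: move on to the next candidate) is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_config(
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
    system: Path = Path("/etc/smelt/smelt.yaml"),
) -> Optional[Path]:
    """Return the first existing config file, or None to signal built-in defaults."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    candidates = []
    explicit = env.get("SMELT_CONFIG")
    if explicit:
        # 1. Explicit path from the environment.
        candidates.append(Path(explicit))
    candidates += [
        cwd / "smelt.yaml",                        # 2. current directory
        home / ".config" / "smelt" / "smelt.yaml",  # 3. XDG config
        system,                                     # 4. system-wide
    ]

    for path in candidates:
        if path.is_file():
            return path
    return None  # 5. caller falls back to built-in defaults


if __name__ == "__main__":
    print(find_config() or "using built-in defaults")
```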

Environment variable overrides follow the pattern SMELT_{SECTION}_{KEY}:

SMELT_OUTPUT_DEFAULT_FORMAT=xml
SMELT_SERVER_MAX_FILE_SIZE_MB=1000
SMELT_BACKENDS_PDF_PREFERRED=docling
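
One way to read the SMELT_{SECTION}_{KEY} pattern is sketched below. It assumes the first underscore-separated token after the prefix is the section and the remainder is the key, both lowercased; how smelt actually splits names and converts value types is not documented here, so treat the function env_overrides as hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

PREFIX = "SMELT_"


def env_overrides(env: Optional[Mapping[str, str]] = None) -> Dict[str, Dict[str, str]]:
    """Group SMELT_* variables into {section: {key: value}} (assumed split rule)."""
    env = os.environ if env is None else env
    overrides: Dict[str, Dict[str, str]] = {}
    for name, value in env.items():
        # SMELT_CONFIG selects the config file and is not a section override.
        if not name.startswith(PREFIX) or name == "SMELT_CONFIG":
            continue
        section, _, key = name[len(PREFIX):].partition("_")
        if not section or not key:
            continue  # no SECTION_KEY structure, ignore
        # Values are kept as strings; type conversion is left to the caller.
        overrides.setdefault(section.lower(), {})[key.lower()] = value
    return overrides


if __name__ == "__main__":
    print(env_overrides({
        "SMELT_OUTPUT_DEFAULT_FORMAT": "xml",
        "SMELT_SERVER_MAX_FILE_SIZE_MB": "1000",
    }))
```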

Available tools

By default, smelt exposes 5 tools. Set expose_specialized_tools: true in smelt.yaml to also register per-type tools for explicit backend control.

| Tool | Description | Default |
|------|-------------|---------|
| convert | Convert any file, URL, or YouTube link to LLM-ready text | Yes |
| batch_convert | Convert all supported files in a directory | Yes |
| list_backends | List available backends and their capabilities | Yes |
| list_formats | List available output formats | Yes |
| get_text_file | Read a text file directly (txt, md, csv, etc.) | Yes |
| pdf_to_text | Convert a PDF document | Opt-in |
| docx_to_text | Convert a Word document (.docx) | Opt-in |
| pptx_to_text | Convert a PowerPoint (.pptx) | Opt-in |
| xlsx_to_text | Convert an Excel spreadsheet (.xlsx/.xls) | Opt-in |
| image_to_text | Extract text from an image using OCR | Opt-in |
| webpage_to_text | Extract main content from a web page | Opt-in |
| youtube_to_text | Get the transcript of a YouTube video | Opt-in |
| audio_to_text | Transcribe an audio file using Whisper | Opt-in |
| html_to_text | Convert a local HTML file | Opt-in |
| epub_to_text | Convert an EPUB ebook | Opt-in |

Available backends

| Backend | Install extra | Supported types | Priority | GPU |
|---------|---------------|-----------------|----------|-----|
| MinerU | smelt[mineru] | PDF, IMAGE | 100 | Optional |
| Docling | smelt[docling] | PDF, DOCX, PPTX, XLSX, HTML, IMAGE, EPUB | 80 | Optional |
| Trafilatura | smelt[web] | WEBPAGE | 90 | No |
| faster-whisper | smelt[whisper] | AUDIO, VIDEO | 90 | Optional |
| YouTube | smelt[web] | YOUTUBE | 90 | No |
| BeautifulSoup | smelt[web] | WEBPAGE | 30 | No |
| PyMuPDF | smelt[pymupdf] | PDF | 30 | No |
| Pandoc | system install | DOCX, EPUB, HTML | 20 | No |

When backend="auto" (the default), smelt picks the highest-priority installed backend that supports the file type. If that backend fails, smelt automatically falls back to the next backend in priority order.
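
The priority-and-fallback behavior can be illustrated with a short sketch. The Backend dataclass and convert_auto function are hypothetical stand-ins, not smelt's internal API; only the priorities and type names come from the table above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set


@dataclass
class Backend:
    """Hypothetical description of one installed backend."""
    name: str
    priority: int
    types: Set[str]
    convert: Callable[[str], str]


def convert_auto(path: str, file_type: str, installed: Iterable[Backend]) -> str:
    """Try supporting backends from highest to lowest priority until one succeeds."""
    chain = sorted(
        (b for b in installed if file_type in b.types),
        key=lambda b: b.priority,
        reverse=True,
    )
    if not chain:
        raise LookupError(f"no installed backend supports {file_type}")

    errors = []
    for backend in chain:
        try:
            return backend.convert(path)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{backend.name}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


if __name__ == "__main__":
    def broken(path: str) -> str:
        raise ValueError("simulated failure")

    installed = [
        Backend("PyMuPDF", 30, {"PDF"}, lambda p: f"[PyMuPDF] {p}"),
        Backend("MinerU", 100, {"PDF", "IMAGE"}, broken),
    ]
    # MinerU is tried first (priority 100), fails, and PyMuPDF handles the file.
    print(convert_auto("report.pdf", "PDF", installed))
```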

Docker

CPU:

docker compose -f docker/docker-compose.yaml up smelt

GPU (requires NVIDIA Container Toolkit):

docker compose -f docker/docker-compose.yaml up smelt-gpu

Or build directly:

docker build -f docker/Dockerfile -t smelt .
docker build -f docker/Dockerfile.gpu -t smelt-gpu .

Contributing

See CONTRIBUTING.md for development setup, code standards, and how to add backends or formatters.

License

GPL-3.0-or-later. See LICENSE.
