Releases: niclasj79/tldr-g

TLDR-G v0.1.0 - Windows binary

18 Jun 23:44

TLDR-G v0.1.0 — first public build

This is the first public build of TLDR-G — a local-first knowledge rendering engine. It's an early release: it runs end to end, but it's deliberately a v0.1, not a 1.0. We're putting it in your hands now so you can run it, inspect it, and judge the trajectory — not because it's finished.

What it does

TLDR-G builds a structured graph of your documents and renders the context a question needs — preserving the connections and the provenance, under a hard token budget — instead of retrieving disconnected chunks. It runs on your machine, against the model you choose.
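
To make "a hard token budget" concrete, here is a minimal sketch of budget-limited context selection that keeps a source label on every passage. It is an illustration of the idea only, not the engine's algorithm: the `Passage` fields, the greedy strategy, and the whitespace-based `count_tokens` are all assumptions made for this example, and the graph structure the engine builds is not modeled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Passage:
    source: str   # where the text came from (hypothetical provenance field)
    text: str     # verbatim passage text
    score: float  # relevance to the question, higher is better (assumed given)


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def pack_context(passages: List[Passage], budget: int) -> List[Passage]:
    """Greedily keep the highest-scoring passages that fit within `budget` tokens."""
    chosen: List[Passage] = []
    used = 0
    for passage in sorted(passages, key=lambda p: p.score, reverse=True):
        cost = count_tokens(passage.text)
        if used + cost <= budget:  # never exceed the budget
            chosen.append(passage)
            used += cost
    return chosen
```

Because the check is `used + cost <= budget`, the selected context can fall short of the budget but can never exceed it, which is what makes the limit "hard" in this sketch.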

What's in the download

  • The Cockpit desktop app — ingest .txt/.md files, then ask questions and get grounded, source-backed answers. No terminal needed.
  • tp-vrg-mcp — an MCP stdio server, so agent hosts (Claude Desktop, Cursor, custom clients) can use TLDR-G as queryable memory.
  • A local HTTP API — the same engine, over REST.
  • On-device, single-file (SQLite) storage — your data never has to leave the machine.
  • Verifiable provenance — every rendered claim resolves to verbatim source, checkable offline.
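
As a sketch of how an agent host might be pointed at the stdio server: Claude Desktop reads server definitions from an `mcpServers` block in its `claude_desktop_config.json`. The snippet below builds such a block. The install path, the `tldr-g` label, and the empty argument list are placeholders assumed for illustration; these notes do not state where `tp-vrg-mcp` is installed or whether it takes arguments.

```python
import json


def mcp_server_entry(executable: str, name: str = "tldr-g") -> dict:
    """Build an `mcpServers` block that launches a stdio MCP server."""
    return {"mcpServers": {name: {"command": executable, "args": []}}}


# Hypothetical path: replace with wherever tp-vrg-mcp lives on your machine.
config = mcp_server_entry(r"C:\path\to\tp-vrg-mcp.exe")
print(json.dumps(config, indent=2))
```

The printed JSON would be merged into the host's existing configuration file rather than replacing it, so that other configured servers are kept.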

Requirements

  • Windows 10/11 (64-bit). macOS and Linux are a fast-follow.
  • NVIDIA GPU with ≥4 GB VRAM strongly recommended (GTX 1060 6 GB or better). Runs CPU-only, but ingest and query are ~20–50× slower.
  • 16 GB RAM recommended.
  • ~3 GB of models download once on first launch (needs internet the first time).

Bring your own model

The deterministic core needs no LLM at all. For generated answers, choose your provider in the Cockpit:

  • Ollama (local) — keyless, fully on-device. The recommended sovereign path.
  • OpenAI (cloud) — in v0.1, the key is read from the OPENAI_API_KEY environment variable; in-app key entry is a near-term fast-follow.
  • Context-only — no LLM; the engine renders the context and you read it directly.
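
Since the OpenAI key is read from the environment in v0.1, a quick check that `OPENAI_API_KEY` is actually visible to new processes can save a confusing first run. This is a small standalone sketch, not part of TLDR-G; the helper name `has_openai_key` is made up for this example.

```python
import os
from typing import Mapping, Optional


def has_openai_key(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if OPENAI_API_KEY is set to a non-blank value."""
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if __name__ == "__main__":
    print("OPENAI_API_KEY visible:", has_openai_key())
```

On Windows, a variable stored with `setx` is only visible to processes started afterwards, so an app that was already running will need a restart before it sees the key.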

Language support

The engine auto-detects the language of each chunk at ingestion and routes it to a language-specific NLP stack — it adapts to a source's language and format, never its domain. What that means today:

  • English is the shipped, validated stack in v0.1.
  • Other-language content ingests gracefully: the detector is language-aware, so it won't crash and won't produce garbled output, but for now non-English text is processed with English-tuned NLP rather than a native stack.
  • Swedish has a dedicated stack already designed and built, but it is opt-in and not bundled in v0.1 (its models aren't downloaded by default). Native Swedish-quality NLP is a near-term release.
  • Adding a language is adding its stack behind the detector — the architecture generalizes; it isn't English-only by design, it's English-first by shipping order.
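
The detect-then-route design described above can be sketched as a lookup with an English fallback. Everything here is hypothetical and for illustration only: `toy_detect` is a deliberately naive stand-in for a real language detector, and the two stack functions are placeholders, not TLDR-G's NLP stacks.

```python
from typing import Callable, Dict, List, Tuple

Stack = Callable[[str], List[str]]


def toy_detect(chunk: str) -> str:
    # Toy detector: treats the letters å/ä/ö as a sign of Swedish.
    return "sv" if any(c in "åäöÅÄÖ" for c in chunk) else "en"


def english_stack(chunk: str) -> List[str]:
    return chunk.lower().split()  # placeholder for English NLP


def swedish_stack(chunk: str) -> List[str]:
    return chunk.lower().split()  # placeholder for Swedish NLP


def route(
    chunk: str,
    stacks: Dict[str, Stack],
    detect: Callable[[str], str] = toy_detect,
    fallback: str = "en",
) -> Tuple[str, str, List[str]]:
    """Return (detected language, stack actually used, stack output)."""
    detected = detect(chunk)
    used = detected if detected in stacks else fallback
    return detected, used, stacks[used](chunk)


stacks: Dict[str, Stack] = {"en": english_stack}  # only English installed
print(route("Hej världen", stacks)[:2])
stacks["sv"] = swedish_stack                       # opt in to a Swedish stack
print(route("Hej världen", stacks)[:2])
```

In this sketch, adding a language means registering one more entry in `stacks`; the routing code itself does not change.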

Known limitations (v0.1, stated honestly)

  • Unsigned installer — Windows SmartScreen will warn on first run; choose "More info → Run anyway." Code signing comes later.
  • Cloud LLM key is env-var-only for now (see "Bring your own model"); in-app entry is coming.
  • Non-English content is processed with English-tuned NLP until per-language stacks ship.
  • Windows only this release.
  • No published head-to-head benchmark yet — rigorous, reproducible comparison is the next milestone. We'd rather ship the runnable engine now and the proof when it's been hardened.

Open SDK

The integration contracts and the offline verification surface are open source (Apache-2.0) at https://github.com/niclasj79/tldr-g. The engine ships as a free compiled binary. This is not open-core: the engine stays closed, the contracts you build against stay open.
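
To give a feel for what an offline verbatim check involves, the sketch below tests whether a quoted span matches the source text at the stated offsets, with no network or model call. The `Citation` record and its fields are assumptions made for this example; the actual contract is defined in the repository linked above and may look quite different.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Citation:
    source: str  # document identifier (hypothetical field)
    start: int   # character offset where the quote begins
    end: int     # character offset just past the quote
    quote: str   # text the rendered claim says it is quoting


def verify(citation: Citation, documents: Mapping[str, str]) -> bool:
    """True only if the quote matches the source text at the given offsets."""
    text = documents.get(citation.source)
    if text is None:
        return False  # unknown source: nothing to check against
    if not (0 <= citation.start <= citation.end <= len(text)):
        return False  # offsets fall outside the document
    return text[citation.start:citation.end] == citation.quote
```

A check of this shape is an exact string comparison, so any paraphrase or edit of the quoted text makes it fail.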


A roadmap covering what comes after v0.1 will follow.