Releases: IxMxAMAR/raggity

raggity v0.10.0 — model switching, doctor, provider auto-discovery, personas

@IxMxAMAR IxMxAMAR released this 03 Jul 16:34

rag model — switch generation backend/model from the CLI: rag model gemma3:4b -p ollama edits your raggity.toml in place (comments preserved). Providers: claude/anthropic, openai, ollama, lmstudio, llamacpp, vllm, jan, koboldcpp — local OpenAI-compatible servers get the right base_url automatically and don't demand an API key. rag model --list shows every local LLM runtime discovered on your machine (running / installed / models available).

Provider auto-discovery & auto-start — raggity probes known local runtimes and, when your configured backend is Ollama (or LM Studio with its CLI present) and the server isn't running, finds the binary and starts it automatically — no more manual ollama serve. Gated by generation.auto_start (default on).
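The probing step described above can be sketched roughly as follows. This is a minimal illustration, not raggity's actual discovery code: the `KNOWN_RUNTIMES` table, the function names, and the choice of a plain TCP connect are all assumptions. The ports are the usual defaults for Ollama (11434) and LM Studio (1234), and `lms` is LM Studio's CLI binary.

```python
import shutil
import socket

# Hypothetical probe table for illustration; raggity's real table is not shown in these notes.
KNOWN_RUNTIMES = {
    "ollama": {"port": 11434, "binary": "ollama"},
    "lmstudio": {"port": 1234, "binary": "lms"},
}


def is_listening(host: str, port: int, timeout: float = 0.25) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def discover(host: str = "127.0.0.1") -> dict:
    """Report, per runtime, whether its server is up and its binary is on PATH."""
    report = {}
    for name, info in KNOWN_RUNTIMES.items():
        report[name] = {
            "running": is_listening(host, info["port"]),
            "installed": shutil.which(info["binary"]) is not None,
        }
    return report
```

A runtime that is installed but not running is the case where an auto-start step would launch the binary; that launch is left out here because the exact command raggity uses is not stated.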

rag doctor — one command that checks your whole setup: config, sources, index, embedding model, generation backend (Claude auth / Ollama reachable and model pulled / OpenAI key), optional extras, write access — with ok/warn/FAIL markers and fix hints. It doesn't just diagnose a stopped Ollama; it starts it.

Opt-in personalization: generation.personal_kb = true makes first-person questions ('who am I', 'my …') treat the knowledge base as belonging to the user; generation.persona adds free-text user context. Both append to the system prompt without weakening citation/abstention rules, and the default prompt is byte-identical to before. Multi-tenant servers can set per-API-key personas ([server.personas]).
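One way to get the "append only, default byte-identical" property is to build the prompt from a fixed base plus optional trailing parts. The sketch below assumes this shape; `BASE_PROMPT`, `build_system_prompt`, and the appended wording are hypothetical and do not quote raggity's real prompt.

```python
# Hypothetical base prompt; the real default prompt is not quoted in these notes.
BASE_PROMPT = "Answer only from the provided sources and cite them. Abstain if unsure."


def build_system_prompt(base: str = BASE_PROMPT, personal_kb: bool = False, persona: str = "") -> str:
    """Append opt-in personalization after the base prompt, never inside it."""
    parts = [base]
    if personal_kb:
        parts.append("The knowledge base belongs to the user; resolve 'I' and 'my' against it.")
    if persona.strip():
        parts.append("About the user: " + persona.strip())
    # With both options off, parts holds only the base, so the result is unchanged.
    return "\n\n".join(parts)
```

Because the base text is never edited, the citation and abstention rules it contains always come first, and a regression test can assert equality with the old prompt when both options are off.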

Also: ASCII-safe log strings on Windows consoles. 425 tests, 0 warnings. New base dep: tomlkit (comment-preserving config edits).

raggity v0.9.1 — ingest hardening hotfix

@IxMxAMAR IxMxAMAR released this 03 Jul 14:35

Hotfix from a real-world report: running rag ingest from a home directory with broad globs. Fixes: (1) Built-in junk-directory pruning: AppData, node_modules, .git, site-packages, __pycache__, venvs, package caches and the .raggity index itself are skipped when they appear below an include pattern's base (a pattern pointed inside such a dir still works), plus a new user-configurable sources.exclude glob list and a warning when a pattern matches >10,000 files. (2) Embedding OOM crash fixed: the chunker now hard-splits oversized single paragraphs (a multi-MB single-line file previously became one enormous chunk whose attention matrix tried to allocate 3.2GB), and embedding.parallel now defaults to in-process single-model instead of an all-cores multiprocess pool. (3) Quiet ingest: per-file skip messages demoted to info-level (summarized as skipped=N in the ingest summary), pypdf's internal warnings silenced. 387 tests, 0 warnings.
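The hard-split in fix (2) can be illustrated with a small paragraph chunker. This is a sketch under assumptions: it measures size in characters with a hypothetical `max_chars` limit, whereas the real chunker may count tokens and split on smarter boundaries.

```python
def chunk_paragraphs(text: str, max_chars: int = 2000) -> list:
    """Split on blank lines, then hard-split any paragraph longer than max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = []
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # A normal paragraph yields one slice; an oversized one yields several.
        for start in range(0, len(para), max_chars):
            chunks.append(para[start:start + max_chars])
    return chunks
```

With a cap like this, a multi-megabyte single-line file becomes many bounded chunks instead of one, so the embedding model never sees an input whose attention cost grows with the whole file.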

raggity v0.9.0 — performance overhaul

@IxMxAMAR IxMxAMAR released this 03 Jul 14:10

Everything faster, nothing lost — retrieval results are bit-identical to v0.8.0 (reused stored vectors are byte-equal to fresh embeds; every change is behind regression tests; 374 tests, 0 warnings).

Measured on the same machine (v0.8.0 → v0.9.0):

  • rag ask retrieval stage: removes ~2.8s/query of redundant CPU (dedup now reuses the vectors already returned by the store instead of re-embedding all 30 candidates)
  • No-op rag ingest: 1898ms → 248ms (7.7×) — mtime+size fast-path (manifest v2, auto-migrating), and a no-op ingest no longer loads the embedding model or opens the store at all. rag watch benefits on every save.
  • rag status: 1935ms → 1045ms — lazy component construction; status never loads the two ML models.
  • CLI import: 0.387s → 0.152s; rag --help: 522ms → 296ms — the Claude Agent SDK (full MCP stack) now imports only on first LLM use.
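The mtime+size fast-path from the no-op ingest item can be sketched like this. The manifest layout (a dict from path to an `(mtime_ns, size)` pair) and both function names are assumptions for illustration; the actual manifest v2 format is not described here.

```python
import os


def fingerprint(path: str) -> tuple:
    """Cheap change signal: (modification time in ns, size in bytes)."""
    st = os.stat(path)
    return (st.st_mtime_ns, st.st_size)


def changed_files(paths: list, manifest: dict) -> list:
    """Return paths whose fingerprint differs from the manifest entry (or is missing)."""
    return [p for p in paths if manifest.get(p) != fingerprint(p)]
```

When `changed_files` returns an empty list, the caller can return before importing the embedding model or opening the store, which is where the saving on a no-op run would come from. Only files that fail this cheap check need a content hash.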

Streaming: the web UI / session SSE path now streams true incremental token deltas (previously it delivered the whole answer as one chunk — and the Claude backend never token-streamed at all). SSE responses add :ok flush + Cache-Control: no-cache + X-Accel-Buffering: no (proxy-safe) and newline-safe framing.
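For reference, newline-safe SSE framing follows from the event-stream format itself: each line of a payload needs its own `data:` field, and a blank line ends the event. The helper below is an illustrative sketch, not raggity's server code; `sse_event`, `SSE_HEADERS`, and `PRELUDE` are hypothetical names.

```python
def sse_event(data: str) -> str:
    """Frame one Server-Sent Event; each line of data gets its own 'data:' field."""
    # Normalize CRLF and bare CR so a stray carriage return cannot end a field early.
    normalized = data.replace("\r\n", "\n").replace("\r", "\n")
    lines = ["data: " + part for part in normalized.split("\n")]
    return "\n".join(lines) + "\n\n"


# Response headers named in the release note, plus the standard SSE content type.
SSE_HEADERS = {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "X-Accel-Buffering": "no",
}

# A line starting with ':' is an SSE comment; clients ignore it.
PRELUDE = ":ok\n\n"
```

A token delta containing a newline, such as `"a\nb"`, is sent as two `data:` lines and reassembled by the client with the newline restored, so the stream cannot be cut short by the model's own output.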

LLM efficiency: setting_sources=[] (your CLAUDE.md / settings no longer leak into every call), the SDK's per-call claude -v subprocess spawn is skipped, max_turns=1; parallel GraphRAG extraction (retrieval.graph_concurrency, default 8); LLM-judge does 1 call per row instead of 2; optional transform-output caching; the answer-cache lock no longer serializes concurrent generation.
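Bounded parallel extraction of the kind retrieval.graph_concurrency controls is commonly written with a semaphore. The sketch below assumes an async `extract` callable supplied by the caller; `extract_all` is a hypothetical name, and whether raggity uses asyncio or threads for this is not stated.

```python
import asyncio


async def extract_all(chunks, extract, concurrency: int = 8):
    """Run extract(chunk) for every chunk with at most `concurrency` calls in flight."""
    sem = asyncio.Semaphore(concurrency)

    async def guarded(chunk):
        async with sem:
            return await extract(chunk)

    # gather returns results in input order, regardless of completion order.
    return await asyncio.gather(*(guarded(c) for c in chunks))
```

The cap matters because each call is an LLM request: unbounded fan-out over thousands of chunks would hit provider rate limits, while a limit of 1 gives back the old serial behavior.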

Server/multi-tenant: per-tenant instances now share the embedding + reranker models (thread-safe; ~0.8s + one model-copy of RAM saved per tenant). Embed cache moved from whole-file JSON to SQLite (auto-migrates; a 50k-vector cache was 193MB JSON rewritten in full per batch).
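The gain from moving the embed cache to SQLite is that each write touches one row rather than rewriting the whole file. A minimal sketch follows, assuming a single table keyed by model, dimension, and a hash of the text; the schema, the `EmbedCache` class, and the float32 encoding are illustrative choices, not raggity's actual layout.

```python
import hashlib
import sqlite3
from array import array


class EmbedCache:
    """Tiny vector cache keyed by (model, dim, text hash); one row per vector."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS vectors ("
            "model TEXT, dim INTEGER, text_hash TEXT, vec BLOB, "
            "PRIMARY KEY (model, dim, text_hash))"
        )

    @staticmethod
    def _hash(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def put(self, model: str, text: str, vector: list) -> None:
        blob = array("f", vector).tobytes()  # single-precision floats
        with self.db:  # commits on success; only this row is written
            self.db.execute(
                "INSERT OR REPLACE INTO vectors VALUES (?, ?, ?, ?)",
                (model, len(vector), self._hash(text), blob),
            )

    def get(self, model: str, dim: int, text: str):
        row = self.db.execute(
            "SELECT vec FROM vectors WHERE model = ? AND dim = ? AND text_hash = ?",
            (model, dim, self._hash(text)),
        ).fetchone()
        if row is None:
            return None
        out = array("f")
        out.frombytes(row[0])
        return out.tolist()
```

Including the model and dimension in the key also means a vector cached under one embedding model is never returned for another.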

Ingest: atomic manifest writes, batched source deletes (~170× on bulk changes), connector ingests are now incremental (hash-diff + scoped pruning), optimize/ANN skipped on no-op runs.
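An atomic manifest write is usually done by writing to a temporary file in the same directory and renaming it over the target. The sketch below shows that pattern with a JSON manifest; the function name and the JSON format are assumptions, since the notes do not describe the on-disk layout.

```python
import json
import os
import tempfile


def write_manifest_atomic(path: str, manifest: dict) -> None:
    """Write JSON to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(manifest, fh)
            fh.flush()
            os.fsync(fh.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # replaces the target in one step
    except BaseException:
        # Leave the old manifest untouched and clean up the partial temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies mid-write, readers still see the previous complete manifest rather than a truncated file, so an interrupted ingest cannot corrupt the record of what has been indexed.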

No API changes; caches remain opt-in; claude-agent-sdk>=0.2 floor. Default single-user local behavior unchanged.

raggity v0.8.0

@IxMxAMAR IxMxAMAR released this 01 Jul 06:34

First-run onboarding + batteries-included document readers. rag init scaffolds an annotated raggity.toml. Empty-KB guidance: ask/chat/status now tell you to run rag init → rag ingest instead of silently abstaining, and warn when no config is found. Skipped-file hints: ingest reports files that need an optional extra (e.g. "Skipped 3 file(s) needing raggity[ocr]") instead of dropping them silently, surfaced in the CLI and the server /ingest response. Batteries-included install: pip install raggity now reads md/txt/pdf plus docx/html/pptx out of the box (the lightweight readers moved into base deps); only image/scanned-PDF OCR stays opt-in via raggity[ocr]. New raggity[all] installs every optional feature. Also fixed Rich-markup swallowing [extra] in install hints and non-ASCII dashes mangling on the Windows console. 324 tests, 0 warnings.

raggity v0.7.1

@IxMxAMAR IxMxAMAR released this 29 Jun 12:22

Whole-package review hardening. Security: per-identity session isolation (no cross-tenant conversation read/continue/delete), identity-scoped DELETE /session, bounded+closed per-user Raggity cache, SSE errors no longer leak internals, new POST /ingest/content so tenants ingest their own docs. Correctness: embed-cache now keyed by model+dim (no stale-vector corruption on model switch), answer-cache concurrency lock + system-prompt keying + size bound, parent-document retrieval preserves lost-in-the-middle ordering, RRF ordering fixed for hybrid+rerank-off, query transforms resilient to a single LLM failure, loop-safe sync wrappers. Providers: OpenAI/Ollama client closed, empty-choices guard, OpenAI requires a key, subscription mode strips OAuth token too. Storage/connectors: Qdrant local payload indexes, LanceDB version guards, bounded fallbacks, posix/raw-bytes-hash consistency, GitHub SHA refs, web crawl max_pages cap, 16-char citation tags. Tooling: serve --open race fixed, eval input validation, clearer registry errors, resilient watch, pip install -e .[dev] runs the suite, Python 3.13 in CI. 308 tests, 0 warnings. No change to the default single-user/local/Claude path.

raggity v0.7.0

@IxMxAMAR IxMxAMAR released this 28 Jun 16:45

Growth & polish (Phase F). GitHub Actions CI (full test suite on every push/PR) + status/PyPI/Python/license badges. MkDocs Material docs site (auto-deployed to GitHub Pages). CONTRIBUTING / CODE_OF_CONDUCT / SECURITY, issue & PR templates, and example recipes (personal-notes KB, fully-offline Ollama, web/GitHub KB). A VHS demo tape and launch drafts. No behavior change vs v0.6.0; this is the community/onboarding layer.

raggity v0.6.0

@IxMxAMAR IxMxAMAR released this 28 Jun 16:13

Ops & deployment. Optional server auth (API-key; server.auth) + per-user collections (namespaced multi-tenant; server.per_user) — single-user local stays the default. Opt-in OpenTelemetry tracing (raggity[otel], exports to Phoenix/Langfuse/Jaeger via OTLP). Docker + compose (raggity + Qdrant + optional Ollama) and a GHCR image published on release (ghcr.io/ixmxamar/raggity). Bounded server sessions. All opt-in; CLI/local behavior unchanged.

raggity v0.5.0

@IxMxAMAR IxMxAMAR released this 28 Jun 12:33

Lightweight GraphRAG (opt-in, raggity[graph]): set retrieval.graph=true and rag ingest (or rag graph-build) extracts entities+relations per chunk via your LLM into a networkx knowledge graph; at query time the question's entities are linked and their graph neighborhood is merged into retrieval before reranking — better multi-hop/connect-the-dots answers. Not Microsoft-style community summarization; off by default (zero cost unless enabled).

raggity v0.4.0

@IxMxAMAR IxMxAMAR released this 28 Jun 12:10

Interaction: multi-turn conversation memory (rag chat REPL + server sessions, history-aware retrieval) and a minimal vanilla web chat UI served by rag serve --open (zero Node/build deps, SSE streaming at /ask/stream, inline citations). Single-turn rag ask unchanged.

raggity v0.3.0

@IxMxAMAR IxMxAMAR released this 28 Jun 11:50

Ingestion breadth + connectors. New file types: DOCX, HTML, CSV, PPTX (raggity[docs]), plus OCR for images & scanned PDFs via RapidOCR — ONNX, torch-free (raggity[ocr]). New connector framework with web-crawl (rag ingest-url, raggity[web]), GitHub (rag ingest-repo), and Obsidian vault (rag ingest-obsidian) connectors; sources.urls in config. All optional/opt-in; default md/txt/pdf behavior unchanged.