Release v2.0.0 · RodCor/kimetsu

v2.0.0 — Never explore twice

The biggest token sink is RE-EXPLORATION — the agent re-deriving what the brain
(or the repo) already knows. v2.0 attacks it from three flagship directions and
adds a pluggable storage tier. Backward-compatible: existing project.toml and
brain.db files upgrade in place (schema v3 → v6, automatic on open).

Flagship — Session warm-start (never re-establish your work)

ADDED

Episodic work-resume. A new work_episode record (event-sourced,
schema v5) captures your working state at SessionEnd — the task, what you
did, what FAILED and why (dead-ends), open threads, and the working
hypothesis — scoped per-repo (one live episode per repo). kimetsu resume
prints it; kimetsu checkpoint [note] saves one manually. Distilled via
the cheap model when configured, else a rule-based summary (never blocks).
Episodes are LOCAL-ONLY and never sync/export.
Project digest. kimetsu brain digest [--refresh] assembles a compact
(~400-token) digest — repo manifest + top-usefulness memories + current
focus — cached at .kimetsu/digest.md by content hash, refreshed on git/
corpus drift. (The digest is currently rule-based; a cheap-model
distillation hook-point is present but not yet wired.)
SessionStart injection. A new kimetsu brain session-start-hook emits
the digest + resume as additionalContext, gated by [broker] warm_start
(default on). Wired for Claude Code; other hosts' SessionStart context
surface is not yet verified and is intentionally left unwired.

Flagship — The active brain (never wait on the brain)

ADDED

kimetsu brain ask "<question>". Terminal Q&A answered entirely from
the brain — full retrieval + a grounded answer composed by a LOCAL/cheap
model, with memory-id citations. Zero frontier tokens; works offline. Falls
back to distiller credentials, then to verbatim top-capsules when no model
is configured. Grounded-only: refuses rather than hallucinating when memory
has no answer. --json; an answer marked helpful counts like a citation.
kimetsu_brain_answer MCP tool. A read-only synthesis tool the host
agent can call mid-task ("what do we know about X?") for a grounded, cited
brief instead of re-discovering.
Command recall fast-path. "How do I…" queries surface matching
command-kind memories runnable-line-first.
Answer-grade injection. Very-high-confidence capsules (score ≥
[broker] answer_grade_min_score, default 0.92) are prefixed "Verified
answer from project memory:" so the model can act in one turn. Render-time
only — ranking is untouched — and suppressed when the memory was recently a
floor-drop regret.
Proactive pre-fetch. Opt-in [broker] proactive_prefetch (default off)
warms trajectory-relevant memories at PreToolUse.

Flagship — Memory → skill synthesis (never re-derive a solution)

ADDED

Skill synthesis. A memory cited ≥3 times (or a tight cluster) becomes a
synthesis candidate; kimetsu brain skills [--review] drafts an executable
skill from it — grounded strictly in the cited memories — and, on explicit
accept, installs it into the host-native skill dir with provenance back to
the source memory ids. Propose-only (never auto-installed); flagged stale
when a source memory is superseded. Schema v6 (skill_proposals).

Local-model independence

ADDED

Ollama as a first-class provider (provider = "ollama" — OpenAI-compat,
localhost:11434/v1 default, no key) and a single optional [cheap_model]
config that all cheap-model consumers resolve (distiller, consolidation,
digest, resume, skill draft, ask). Back-compatible with
[learning.distiller]; OPTIONAL everywhere — every consumer degrades
gracefully when no model is configured. kimetsu doctor probes the local
endpoint. New guide: docs/LOCAL-MODELS.md (fully-local = zero external
calls).

Pluggable storage backends

ADDED

RetrievalBackend trait with [storage] backend = "flat" | "graph-lite" | "graph" (default flat). The broker stays backend-agnostic; switching
backends re-projects from the event log.
Graph-lite (Tier 1) — a typed-edge projection (memory_edges, schema
v4) with 1–2 hop expansion blended as graph-provenance candidates (a strict
superset of flat — no recall loss; the broker still filters). supersedes
edges populate now; episode-sourced edge types are reserved.
Petgraph (Tier 2, remote-only) behind the graph feature (off in local
lean/embeddings builds — petgraph is never compiled there). In-memory graph
with centrality / shortest-path / community-detection helpers, plus a
cross-backend benchmark harness. Spike verdict: an embedded graph DB
(Kùzu/Cozo) is not justified through ~100k memories.

Continuous self-tuning + ROI v2

ADDED

Re-tune triggers (≥50 new memories since last tune, or elevated regret
rate) surfaced via kimetsu brain tune --status + a Stop-hook one-liner —
proposed, never auto-applied. Model re-selection advisor
(tune --models) with reindex/download cost stated. Regret-driven
objective: the tune objective now penalizes floor configs that produced
regrets. ROI v2: per-memory ROI (roi --top), output-token accounting,
and warm-start (digest_served/resume_served) savings attribution.

Personal brain sync

ADDED

Event-log replication (not file copying). kimetsu brain sync export --since <cursor> / import move durable memory-lifecycle events as
portable JSONL with per-event idempotency; telemetry, raw queries, and
local-only episodes are excluded by an allowlist; redaction is respected.
[sync] dir drives a server-less directory protocol (Dropbox/Syncthing/NAS)
with per-machine batches and per-source cursors; --dry-run and
--status throughout. Object-storage backends (e.g. S3) are deferred.

Hardening & release

FIXED

Analytics/doctor active counts, reindex, peek/undo, and memory list now
consistently exclude superseded rows; config set/tune --apply preserve
toml comments + unknown keys (toml_edit); dropped-capsule + proactive-state
sidecars write atomically. Cursor/Gemini MCP schemas verified against live
docs.
Release pipeline: a version-guard job fails in seconds on a
tag/workspace-version mismatch (and the built binary must self-report the
tag) — closing the gap that produced the v1.5.0 botch; core GitHub Actions
bumped to Node-24-ready versions; scripts/bump-version.sh codifies the
one-step version bump.

KNOWN LIMITATIONS

SessionStart warm-start is wired for Claude Code only; the digest is
rule-based pending the cheap-model distillation wiring; full retrieval-
quality and first-turn-token benchmarks require the embeddings + Docker
environment and were not run in CI; remote-bench process isolation (#23)
remains open.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v2.0.0

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

v2.0.0 — Never explore twice

Flagship — Session warm-start (never re-establish your work)

Flagship — The active brain (never wait on the brain)

Flagship — Memory → skill synthesis (never re-derive a solution)

Local-model independence

Pluggable storage backends

Continuous self-tuning + ROI v2

Personal brain sync

Hardening & release

Uh oh!