Releases · RodCor/kimetsu

22 Jun 06:13

RodCor

v2.0.0

35d115e

v2.0.0 Latest

Latest

v2.0.0 — Never explore twice

The biggest token sink is RE-EXPLORATION — the agent re-deriving what the brain
(or the repo) already knows. v2.0 attacks it from three flagship directions and
adds a pluggable storage tier. Backward-compatible: existing project.toml and
brain.db files upgrade in place (schema v3 → v6, automatic on open).

Flagship — Session warm-start (never re-establish your work)

ADDED

Episodic work-resume. A new work_episode record (event-sourced,
schema v5) captures your working state at SessionEnd — the task, what you
did, what FAILED and why (dead-ends), open threads, and the working
hypothesis — scoped per-repo (one live episode per repo). kimetsu resume
prints it; kimetsu checkpoint [note] saves one manually. Distilled via
the cheap model when configured, else a rule-based summary (never blocks).
Episodes are LOCAL-ONLY and never sync/export.
Project digest. kimetsu brain digest [--refresh] assembles a compact
(~400-token) digest — repo manifest + top-usefulness memories + current
focus — cached at .kimetsu/digest.md by content hash, refreshed on git/
corpus drift. (The digest is currently rule-based; a cheap-model
distillation hook-point is present but not yet wired.)
SessionStart injection. A new kimetsu brain session-start-hook emits
the digest + resume as additionalContext, gated by [broker] warm_start
(default on). Wired for Claude Code; other hosts' SessionStart context
surface is not yet verified and is intentionally left unwired.

Flagship — The active brain (never wait on the brain)

ADDED

kimetsu brain ask "<question>". Terminal Q&A answered entirely from
the brain — full retrieval + a grounded answer composed by a LOCAL/cheap
model, with memory-id citations. Zero frontier tokens; works offline. Falls
back to distiller credentials, then to verbatim top-capsules when no model
is configured. Grounded-only: refuses rather than hallucinating when memory
has no answer. --json; an answer marked helpful counts like a citation.
kimetsu_brain_answer MCP tool. A read-only synthesis tool the host
agent can call mid-task ("what do we know about X?") for a grounded, cited
brief instead of re-discovering.
Command recall fast-path. "How do I…" queries surface matching
command-kind memories runnable-line-first.
Answer-grade injection. Very-high-confidence capsules (score ≥
[broker] answer_grade_min_score, default 0.92) are prefixed "Verified
answer from project memory:" so the model can act in one turn. Render-time
only — ranking is untouched — and suppressed when the memory was recently a
floor-drop regret.
Proactive pre-fetch. Opt-in [broker] proactive_prefetch (default off)
warms trajectory-relevant memories at PreToolUse.

Flagship — Memory → skill synthesis (never re-derive a solution)

ADDED

Skill synthesis. A memory cited ≥3 times (or a tight cluster) becomes a
synthesis candidate; kimetsu brain skills [--review] drafts an executable
skill from it — grounded strictly in the cited memories — and, on explicit
accept, installs it into the host-native skill dir with provenance back to
the source memory ids. Propose-only (never auto-installed); flagged stale
when a source memory is superseded. Schema v6 (skill_proposals).

Local-model independence

ADDED

Ollama as a first-class provider (provider = "ollama" — OpenAI-compat,
localhost:11434/v1 default, no key) and a single optional [cheap_model]
config that all cheap-model consumers resolve (distiller, consolidation,
digest, resume, skill draft, ask). Back-compatible with
[learning.distiller]; OPTIONAL everywhere — every consumer degrades
gracefully when no model is configured. kimetsu doctor probes the local
endpoint. New guide: docs/LOCAL-MODELS.md (fully-local = zero external
calls).

Pluggable storage backends

ADDED

RetrievalBackend trait with [storage] backend = "flat" | "graph-lite" | "graph" (default flat). The broker stays backend-agnostic; switching
backends re-projects from the event log.
Graph-lite (Tier 1) — a typed-edge projection (memory_edges, schema
v4) with 1–2 hop expansion blended as graph-provenance candidates (a strict
superset of flat — no recall loss; the broker still filters). supersedes
edges populate now; episode-sourced edge types are reserved.
Petgraph (Tier 2, remote-only) behind the graph feature (off in local
lean/embeddings builds — petgraph is never compiled there). In-memory graph
with centrality / shortest-path / community-detection helpers, plus a
cross-backend benchmark harness. Spike verdict: an embedded graph DB
(Kùzu/Cozo) is not justified through ~100k memories.

Continuous self-tuning + ROI v2

ADDED

Re-tune triggers (≥50 new memories since last tune, or elevated regret
rate) surfaced via kimetsu brain tune --status + a Stop-hook one-liner —
proposed, never auto-applied. Model re-selection advisor
(tune --models) with reindex/download cost stated. Regret-driven
objective: the tune objective now penalizes floor configs that produced
regrets. ROI v2: per-memory ROI (roi --top), output-token accounting,
and warm-start (digest_served/resume_served) savings attribution.

Personal brain sync

ADDED

Event-log replication (not file copying). kimetsu brain sync export --since <cursor> / import move durable memory-lifecycle events as
portable JSONL with per-event idempotency; telemetry, raw queries, and
local-only episodes are excluded by an allowlist; redaction is respected.
[sync] dir drives a server-less directory protocol (Dropbox/Syncthing/NAS)
with per-machine batches and per-source cursors; --dry-run and
--status throughout. Object-storage backends (e.g. S3) are deferred.

Hardening & release

FIXED

Analytics/doctor active counts, reindex, peek/undo, and memory list now
consistently exclude superseded rows; config set/tune --apply preserve
toml comments + unknown keys (toml_edit); dropped-capsule + proactive-state
sidecars write atomically. Cursor/Gemini MCP schemas verified against live
docs.
Release pipeline: a version-guard job fails in seconds on a
tag/workspace-version mismatch (and the built binary must self-report the
tag) — closing the gap that produced the v1.5.0 botch; core GitHub Actions
bumped to Node-24-ready versions; scripts/bump-version.sh codifies the
one-step version bump.

KNOWN LIMITATIONS

SessionStart warm-start is wired for Claude Code only; the digest is
rule-based pending the cheap-model distillation wiring; full retrieval-
quality and first-turn-token benchmarks require the embeddings + Docker
environment and were not run in CI; remote-bench process isolation (#23)
remains open.

Assets 13

checksums.txt

sha256:ffa4d5b276f8275844818dd247a2bba380ca66327829546ca9d9e1b50b4fdaed

1.13 KB 2026-06-22T06:25:04Z
kimetsu-2.0.0-aarch64-apple-darwin-embeddings.tar.gz

sha256:60374dbd709384701c86ff7af7f69f8ca2dc1b4e61da1b0a7b6dae7be81105e3

16 MB 2026-06-22T06:25:04Z
kimetsu-2.0.0-aarch64-apple-darwin-lean.tar.gz

sha256:5a48f2d74f022b05ee09fe8545ec8edb19f1828065497131c49c33581e8cd2f1

6.55 MB 2026-06-22T06:25:04Z
kimetsu-2.0.0-x86_64-apple-darwin-lean.tar.gz

sha256:5101422163a2458f9832c8410e50d7f8f1ab48428a68432e726ad4c513264b83

6.96 MB 2026-06-22T06:25:04Z
kimetsu-2.0.0-x86_64-pc-windows-msvc-embeddings.zip

sha256:557cba624577727b1d13966c9a086c71d8b0e02a5e79a535f699302141693958

15.4 MB 2026-06-22T06:25:04Z
kimetsu-2.0.0-x86_64-pc-windows-msvc-lean.zip

sha256:3ee83f2c9710a7575519b058ff0846486c4cf512e35ccd62526eae6f68b73c29

6.36 MB 2026-06-22T06:25:04Z
kimetsu-2.0.0-x86_64-unknown-linux-gnu-embeddings.tar.gz

sha256:acf75830edd83ccd3213240db264afcfb518d4afc568a36850a2c4ae838a6c71

18.8 MB 2026-06-22T06:25:04Z
kimetsu-2.0.0-x86_64-unknown-linux-gnu-lean.tar.gz

sha256:0ba03a831e576ca7b4358db7ff88beccef204df424f42b31f6369dd1b2641396

7.3 MB 2026-06-22T06:25:04Z
kimetsu-remote-2.0.0-aarch64-apple-darwin.tar.gz

sha256:a4d924f49774bb128514635216f46b1198b0fe92acbe19fa50b393617a82b027

14.8 MB 2026-06-22T06:25:04Z
kimetsu-remote-2.0.0-x86_64-pc-windows-msvc.zip

sha256:8e5215dd28d0eaf3618f3c5beb4254d9d31a2958e68420b60352bf66ed199ed3

14.1 MB 2026-06-22T06:25:04Z
Source code (zip)

2026-06-22T06:11:59Z
Source code (tar.gz)

2026-06-22T06:11:59Z

12 Jun 18:22

github-actions

v1.5.1

acdeaf2

v1.5.1

v1.5.1 — version-stamp re-release

Identical feature set to v1.5.0. The v1.5.0 artifacts were built before the
workspace version bump, so their binaries self-report 1.0.0 (which also
broke the crates.io publish — kimetsu-core@1.0.0 already existed). npm
forbids reusing a published version, so the corrected release ships as
v1.5.1. If you installed v1.5.0 from npm or the GitHub release, update.

FIXED

Workspace and inter-crate versions stamped correctly (kimetsu --version
now reports the release version; crates.io publish unblocked).

Assets 13

12 Jun 16:55

RodCor

v1.5.0

c718b2e

v1.5.0

Warning

Superseded by v1.5.1 — these artifacts were built before the workspace version bump and self-report 1.0.0. Identical feature set; install v1.5.1 instead.

v1.5.0 — pays for itself

ADDED

Telemetry capture. Raw query text is now stored in context.served
telemetry events when [learning] store_queries = true (default; set
false to revert to the pre-v1.5 query-hash-only behavior). A
session_id field is also written when the host provides one (Claude Code
hooks; absent for hosts that do not emit it). Dropped-capsule sidecar
(~/.kimetsu/cache/<hash>/dropped-recent.json) records capsules that
were floor-filtered out so that a later citation of one is detected as a
retrieval.regret event, feeding the self-tuning loop. All telemetry
stays on-machine; nothing is exported.
ROI ledger — kimetsu brain roi. Conservative per-kind token-savings
estimates (failure_pattern=1500, command=400, convention=300, fact=500,
preference=200 tokens per citation) minus brain-injection overhead give a
net-positive / net-negative verdict. Dollar estimates are shown when the
active model is recognized from a built-in price table (Claude 3/4, GPT-4/5
families, Bedrock routing prefixes) or when [model] price_per_mtok is set
in project.toml. --window 7d|30d|all and --json for scripting. The
Stop hook appends a per-session savings sentence when ≥1 citation occurred;
zero-citation sessions are silent. Calibration methodology and honest
limitations: docs/ROI-METHODOLOGY.md.
Token budget — render-time capsule compression and session dedupe.
Two [broker] toggles, both default true:
- compress_capsules: capsule summaries are compressed at render time
  (strips [tags: ...] / (context: ...) annotations, caps at 3
  sentences). Ranking is never affected — this runs only after retrieval
  and reranking. Set false to inject full memory text.
- session_dedupe: the UserPromptSubmit hook skips capsules whose
  handle was already injected earlier in the same session (tracked via
  the proactive-state sidecar). Soft policy: falls back to injecting all
  capsules when dedupe would produce an empty set. This addresses the
  pre-v1.5 behavior where the main hook re-injected the same top capsule
  on every prompt of a long session.
Self-Tuning Brain — kimetsu_brain_cite MCP tool + kimetsu brain tune.
kimetsu_brain_cite is a new write-gated MCP tool that records a
memory.cited event from inside an MCP session, closing the ground-truth
gap when the model leans on a memory but doesn't explicitly call
cite_memory. Personal eval set: tuneset builds positive eval cases from
context.served + citation joins (exact session_id or ±30-minute window
fallback). kimetsu brain tune --status reports case count and kind
coverage. kimetsu brain tune (dry-run by default) sweeps broker.min_lexical_coverage
∈ {0.3, 0.4, 0.5, 0.6} × broker.min_semantic_score ∈ {-1.0(AUTO), 0.0, 0.25,
0.35, 0.45} against the production embedder; --apply writes only the
floor parameters (not the reranker — that change is recommended separately);
--revert restores the previous tune-history entry. A holdout guardrail
(deterministic 20% split) prevents writing a config that regresses the
holdout objective.
Consolidation — kimetsu brain consolidate + kimetsu brain triage.
Schema migrated to v3 (superseded_by column + index on memories).
brain consolidate (Story 3.1, default): brute-force cosine scan within the
same embedding model; clusters at ≥ 0.92 cosine (configurable with
--threshold) are merged — survivor keeps its id and text, members get
superseded_by set and a memory.superseded event written (rebuild-safe);
citations are reassigned to the survivor. --distill (Story 3.2): looser
clusters (0.75–0.85 cosine band, ≥3 memories, ≥1 shared tag) are fed to
the configured distiller; result lands as a memory proposal for human
review; prints clusters and exits 0 when no distiller is configured.
brain triage (Story 3.3): interactive per-item keep / prune / skip of
memories below a usefulness and age threshold (--score-floor 0.2,
--age-days 30); --prune-all --yes for batch non-interactive pruning.
Reach — export redact, Cursor + Gemini CLI installers, CI embeddings job.
kimetsu brain export --redact strips the (context: …) segment from
exported memory text; --redact-tags (requires --redact) additionally
strips the [tags: …] prefix. Both flags are useful for sharing brains
without leaking workspace-specific file paths or tag metadata.
kimetsu plugin install cursor and kimetsu plugin install gemini-cli
write MCP config (.cursor/mcp.json / .gemini/settings.json) and an
always-on guidance file (.cursor/rules/kimetsu-brain/rule.md for workspace
installs; GEMINI.md merged into the project root or ~/.gemini/GEMINI.md
for global installs). Neither host has a UserPromptSubmit-style hook
system, so MCP + the guidance file are the complete integration surface.
Note: Cursor and Gemini CLI config schemas were inferred from their public
docs as of June 2026 and have not been verified against live host builds —
treat these installers as a starting-point and check the generated files
before committing them. CI: a new test-embeddings job (ubuntu-only,
--features embeddings, HuggingFace + fastembed cache) runs alongside the
existing lean test matrix.

CHANGED

kimetsu.mcp_write_tools gate now covers kimetsu_brain_cite alongside
the existing write tools (same env / config / default-true logic for the
local stdio server; remote server remains env-only, default-deny).
Stop hook output includes a per-session token-savings line when the brain
was cited at least once that session.
brain.db schema advanced from v2 → v3 (automatic on first
read-write open; sidecar backup taken per the existing migration policy).
brain export gains --redact / --redact-tags flags (no behavior
change for existing callers).

FIXED

Tune sweep now runs against the production embedder (not the Noop embedder
used in tests), so floor calibration is on real vectors.

Assets 13

10 Jun 18:00

RodCor

v1.0.0

fca78b3

v1.0.0

v1.0.0 — durable migrations, analytics, semantic retrieval, proactive recall

ADDED

Remote cross-encoder rerank stage. kimetsu-remote serve now applies a
cross-encoder reranker to every kimetsu_brain_context call (--reranker,
default jina-reranker-v1-tiny-en, operator-level; "off" disables; any
curated or HuggingFace ONNX id accepted). The default was chosen by the
100-memory benchmark: jina-tiny MRR 0.931 vs 0.914 for TinyBERT on the local
bench; the remote path has no hook-latency budget so the fastest effective
reranker wins. Benchmark lift with reranker: jina-v2-base-code MRR 0.904 →
0.906, bge-small MRR 0.901 → 0.909 (production floors active).
Model-aware AUTO semantic floor + kimetsu-remote benchmark. kimetsu brain bench --remote boots a real kimetsu-remote server per embedder,
drives the 100-case dataset over HTTP MCP (sequential + concurrent), and
reports quality/latency/throughput/server-RSS. Its first run caught a
real bug: the 0.35 cosine floor (calibrated on bge) was KILLING relevant
jina-v2 results on every production path (remote MRR 0.90 -> 0.77).
broker.min_semantic_score now defaults to -1.0 = AUTO: 0.35 on
bge-family models, disabled elsewhere (jina-v2 own precision keeps noise
low); explicit values still win. Confirmed: jina-v2 remote recovered to
MRR 0.906 / recall@4 0.939 on the 100-memory corpus (with the server
reranker: ~416ms/request, ~5 rps at concurrency 4, ~1.2GB peak RSS).
Benchmark-chosen retrieval defaults: jina-v2-base-code + TinyBERT.
kimetsu brain bench (100 real-memory cases) drove the defaults:
embedder jina-v2-base-code (recovers oblique queries bge-small never
pools; ~4x less off-topic noise) + reranker ms-marco-tinybert-l-2-v2
(~43ms, within noise of the best MRR). Existing brains need
kimetsu brain reindex after upgrading (vector dims 384 -> 768); set
embedder.model/embedder.reranker back to taste and re-judge with
kimetsu brain bench. Lean-RAM alternative: bge-small + tinybert.
Local MCP write tools enabled by default (kimetsu.mcp_write_tools).
The brain's own workflow tells the agent to record lessons
(kimetsu_brain_record, the Stop-hook harvest cue), but the privileged-
write gate default-denied unless KIMETSU_MCP_ENABLE_WRITE_TOOLS=1 was
in the MCP server's env — so every session ended with the agent goaded
into a blocked call. The gate is now config-driven for the LOCAL stdio
server: kimetsu.mcp_write_tools (default true), personalizable via
kimetsu config set kimetsu.mcp_write_tools false. Precedence: the env
var when set always wins (both directions) > config > default. The
REMOTE server is unchanged — env-only, default-deny — because a cloned
repo's project.toml is untrusted input and must never enable writes.
Cross-encoder reranking (opt-in) + retrieval eval harness + 300ms
hook budget. The warm daemon can apply a final cross-encoder rerank
stage: over-fetch a 12-capsule pool, score each (query, memory) pair
jointly with a fastembed reranker ([embedder] reranker = a curated id;
local default ms-marco-tinybert-l-2-v2; "off" disables), drop below a 0.30 relevance floor, truncate to the cap.
Every knob was chosen by measurement: jina-reranker-v1-tiny-en
(default) beat the larger turbo model on both quality and speed in the
head-to-head benchmark, a pool of 6 matches pool-12 quality exactly at
half the latency, and summaries must stay FULL (snippet truncation
cratered recall@4 from 0.83 to 0.66 — worse than FTS). With those
settings a warm rerank answers inside the 300ms hook budget on a real
brain (~265ms measured); slower machines degrade gracefully to
floored-FTS for that turn, or set reranker = "off". Backing the
default path, the cosine floor: broker.min_semantic_score is
now ON by default (0.35) and actually wired (it previously defaulted
to 0.0 and was never populated from config in any production path). The
hook's daemon budget tightens 750ms → 300ms; a warm semantic answer
fits with ~70ms to spare, and misses fall back to floored FTS. Daemon
spawn hygiene for the lazy path: the entrypoint now binds the socket
BEFORE loading models (a redundant spawn exits in ms, not seconds), and
on Windows the hook clears HANDLE_FLAG_INHERIT on its std handles before
spawning so the long-lived daemon can never hold the harness's stdout
pipe open (previously the first prompt of a session could stall until
the host's hook timeout). New kimetsu brain eval [--fixture ...]
measures recall@2/4 + MRR across fts / semantic / semantic+rerank modes
against a committed fixture (fixtures/eval-retrieval.json), so ranking
changes are measurable: baseline shows semantic recall@4 0.90 / MRR 0.91
vs FTS 0.72 / 0.81. --rerankers <ids> benchmarks cross-encoders
head-to-head (quality + per-query latency) and --pool sweeps pool
sizes; non-curated models load as user-defined ONNX from any HuggingFace
repo via hf-hub. Benchmark results: jina-tiny recall@4 0.896 / MRR 0.938
at ~44ms per query (pool 6) vs turbo 0.833 / 0.875 at ~137ms (pool 12);
ms-marco TinyBERT-L-2 is ~5× faster still (8.5ms) but its quality
(0.715) merely matches FTS.
Warm embedder daemon — semantic recall at hook time. The
UserPromptSubmit context-hook can now match memories by meaning, not
just lexically. A single per-user daemon (kimetsu brain embed-daemon,
keyed by embedder model) loads the ONNX model once and serves full
embedding/ANN retrieval to the hook over a local socket / named pipe
(interprocess); the hook is a thin client with a ≤300ms budget and a
hard fall-back to floored-FTS, so the prompt is never blocked. kimetsu brain warm (wired to each harness's startup hook) pre-warms it so a
running session never pays a cold model load. One model in RAM regardless
of how many projects/agents are active. Config: [embedder] daemon and
warm_on_start toggle it; [embedder] model (and kimetsu brain model set) pick the model. KIMETSU_EMBED_DAEMON=0 is a runtime kill switch.
Embeddings builds only; lean builds keep the floored-FTS hook. New
kimetsu brain daemon status|stop to inspect/control it.
Durable schema migrations. brain.db now migrates forward
automatically on open via a versioned, forward-only runner (each
migration applied in one transaction). The DB is backed up to a
brain.db.bak-* sidecar before any version-advancing migration
(skipped for empty brains; the three newest backups are kept).
Read-only opens of an un-migrated brain degrade gracefully; an
event-upcast seam keeps old traces replayable. The project.toml
config version is decoupled from the DB schema version so the
schema can evolve without breaking existing projects.
Proof-of-value analytics. New kimetsu brain insights command
and kimetsu_brain_insights MCP tool: retrieval hit-rate &
skip-rate, citation rate, proposal acceptance rate, usefulness
trend, harvest yield, corpus health, and token economy — computed
over a configurable recent-runs window. A new context.served
event records every retrieval (hit or miss); context.injected
now carries injected-token counts.
Semantic retrieval (usearch HNSW ANN). On the embeddings build, an
approximate-nearest-neighbour index (usearch HNSW) finds memories whose
meaning matches the query even with no shared words — O(log N) per
query, so retrieval stays fast as the corpus grows. The index is
candidate generation only; final ranking is an exact cosine rerank over the
stored f32 vectors, so the index can be quantized (f16 by default,
i8/f32 via KIMETSU_ANN_QUANTIZATION) for a large RAM saving with
negligible quality loss. Retrieval is sharpened with embedding-MMR
(collapses true paraphrase duplicates) and an absolute semantic-relevance
floor (genuinely off-topic queries return nothing). Capsule caps are
config-driven (default 8) and injected tokens drop while the relevant
capsule is preserved. The lean (FTS-only) build is unchanged.
Scales to ~1M memories. A million-memory corpus runs on modest
hardware: ~1.8 s p99 semantic retrieval and ~3 GB RAM at 1M (f16
default; ~2.8 GB with KIMETSU_ANN_QUANTIZATION=i8). Both retrieval and
conflict-detection-on-write are O(log N) via the HNSW index — no
brute-force vector scan. Bulk ingest batches embedding; the index builds in
parallel across cores, maintains itself incrementally, persists a sidecar
so a restarted server loads instead of rebuilding, and Kimetsu Remote
pre-warms each repo's index on startup.
Proactive & cost-shrinking recall (the agent brain). Before the
first implementation attempt, a tight retrieval surfaces a "Known
pitfalls" block (failure patterns / conventions) — proactive, not
just post-failure. Tasks are classified (Debug / Feature / Refactor /
Docs / Investigation) to route recall by kind. A per-run recall
ledger deduplicates capsules across stages (rendered once,
back-referenced after), and the long tail is injected as one-line
headlines the agent expands on demand via a new expand_capsule
tool — so brain overhead shrinks in relative terms as tasks grow
(an adaptive sublinear, per-run-capped budget).
kimetsu config edit and kimetsu run abort are now fully
implemented: config edit opens $EDITOR on project.toml and
re-validates on save; run abort cleanly finalizes a dangling run.
No stub subcommands ship.
kimetsu doctor --selftest proves the brain pipeline works
end-to-end (ingest → retrieve → record) without needing a live
model...

Assets 13

03 Jun 03:34

RodCor

v0.9.0

021200f

v0.9.0

v0.9.0 — auto-harvested memories + SessionEnd distiller

ADDED

Credentialed SessionEnd distiller (opt-in). A second, deterministic
memory-harvest path alongside the v0.9.0 in-agent harvester. kimetsu plugin install claude-code and kimetsu plugin install codex now run an
interactive wizard (on a TTY; skip with --no-setup, force with
--setup-harvest) that configures a cheap distiller
model: Anthropic (recommended claude-haiku-4-5), OpenAI (recommended
gpt-5.4-mini), or compatible endpoints via ANTHROPIC_BASE_URL /
OPENAI_BASE_URL. The key + base URL are written to a gitignored .env;
the selection lands in [learning.distiller]
in project.toml. A new SessionEnd hook for Claude Code runs
kimetsu brain session-end-hook; Codex uses its supported Stop hook with
--distill-on-stop. When enabled + credentialed, the distiller reads the
transcript with that model and records lessons through the confidence-gated
propose_or_merge_memory. When the distiller is enabled, the Stop hook's
end-of-session cue is suppressed (the distiller owns end-of-session; the
mid-session PostToolUse resolved-failure cue stays). AnthropicProvider
gained an optional base URL for the LiteLLM case, and the distiller gained
an OpenAI Responses API provider. With --scope global the
wizard configures the distiller once in ~/.kimetsu/ (config + .env); that
global distiller then runs in every project and records into the user brain
(~/.kimetsu/brain.db, available everywhere), with a per-project distiller
taking precedence when one is configured.
Auto-harvested memories (in-agent). The hooks now drive memory
generation, not just retrieval. When a command that failed earlier in the
session succeeds (PostToolUse), or a non-trivial session ends with nothing
recorded (Stop), Kimetsu emits a [kimetsu-harvest] cue telling the agent to
dispatch a new background kimetsu-memory-harvester subagent (installed
at .claude/agents/ for Claude Code and .codex/agents/ for Codex, pinned
to a cheap model). The subagent distills 0–3
generalizable lessons and records them through the confidence-gated
kimetsu_brain_record path — no separate API key or kimetsu-side model
credentials, billed in-agent at the cheap model's rate, non-blocking. Cues
are throttled (at most ~once per resolved failure / once per session) and can
be disabled with [learning] auto_harvest = false in project.toml.

CHANGED

Cleaner help & tool menus. Stripped internal version-history prefixes
(v0.x:, MP-…:) from --help text and the MCP tool/argument descriptions
so menus read as plain present-tense descriptions. (Internal code comments
are unchanged.)

FIXED

Stop hook read the wrong field. The Stop hook counted
kimetsu_brain_record calls from a non-existent inline transcript array;
Claude Code actually passes a transcript_path to a JSONL file. It now reads
the JSONL (with the inline array kept as a fallback), counts both the bare
and MCP-namespaced (mcp__kimetsu__kimetsu_brain_record) tool names, so the
"lessons recorded" banner and end-of-session cue actually fire. The JSONL is
now streamed line-by-line so a long session's multi-MB transcript never
lands in memory at once.
BOM-tolerant install. kimetsu plugin install now strips a leading
UTF-8 BOM before parsing an existing settings.json / hooks.json /
config.toml / .mcp.json, so a config saved by a BOM-emitting editor
(e.g. older Windows Notepad) no longer fails with "expected value at line 1
column 1".
Install polish. The installer now merges Kimetsu's guidance into an
existing CLAUDE.md — workspace .claude/CLAUDE.md or the global
~/.claude/CLAUDE.md — inside /
markers, appending and upgrading in place, never overwriting the user's
content. --force no longer overwrites CLAUDE.md (the whole install is
idempotent and non-destructive; the flag is retained only for compatibility).
A --scope global on the workspace-only kimetsu target warns instead of
silently doing nothing; --workspace is canonicalized leniently so a global
install doesn't fail on a missing workspace path.

Assets 9

02 Jun 15:16

RodCor

v0.8.4

6e1465e

v0.8.4

v0.8.4 — non-destructive plugin install + global scope

ADDED

Global plugin install. kimetsu plugin install <target> --scope global
installs the Kimetsu surface into the user's home for every session —
~/.claude/ + ~/.claude.json (mcpServers) for Claude Code, and
~/.codex/ for Codex — instead of the workspace. --scope defaults to
workspace (the prior behavior). Also exposed as the scope argument on
the kimetsu_plugin_install MCP tool.

FIXED

Hook install no longer clobbers existing hooks. kimetsu plugin install
now merges its hooks into existing Claude settings.json / Codex
hooks.json instead of replacing them. Hooks you already have — even on the
same events Kimetsu uses (UserPromptSubmit, PreToolUse, …) — are
preserved, with Kimetsu's group added alongside. Re-running install is
idempotent (no duplicate groups) and the MCP config + generated docs refresh
without requiring --force.

Assets 9

01 Jun 21:24

github-actions

v0.8.3

2d8211d

v0.8.3

v0.8.3 — npm distribution

ADDED

npm distribution. Kimetsu now publishes to npm — npm install -g kimetsu-ai
installs the prebuilt native binary for your platform, no Rust toolchain
required. Uses the esbuild/turbo model: per-platform packages
(@kimetsu-ai/linux-x64, @kimetsu-ai/darwin-x64, @kimetsu-ai/darwin-arm64,
@kimetsu-ai/win32-x64) selected via optionalDependencies, with a thin
bin/cli.js launcher that execs the matching binary. No postinstall, so it
works under npm install --ignore-scripts. The semantic build is fetched on
demand when KIMETSU_NPM_FLAVOR=embeddings is set. Published from the
existing release pipeline (publish-npm job, gated on the PUBLISH_NPM
repo variable + NPM_TOKEN secret, mirroring the crates.io gate). Sources
live in npm/. Installs the kimetsu command (also available as
kimetsu-ai).

FIXED

Windows npm package. The publish-npm job now extracts the Windows
archive with 7z instead of unzip (PowerShell Compress-Archive uses
backslash path separators that unzip flattens), and publishing is
idempotent so a re-run after a partial failure skips already-published
versions.

Assets 9

01 Jun 21:05

github-actions

v0.8.2

522f184

v0.8.2

v0.8.2 — npm distribution

ADDED

npm distribution. Kimetsu now publishes to npm — npm install -g kimetsu-ai
installs the prebuilt native binary for your platform, no Rust toolchain
required. Uses the esbuild/turbo model: per-platform packages
(@kimetsu-ai/linux-x64, @kimetsu-ai/darwin-x64, @kimetsu-ai/darwin-arm64,
@kimetsu-ai/win32-x64) selected via optionalDependencies, with a thin
bin/cli.js launcher that execs the matching binary. No postinstall, so it
works under npm install --ignore-scripts. The semantic build is fetched on
demand when KIMETSU_NPM_FLAVOR=embeddings is set. Published from the
existing release pipeline (publish-npm job, gated on the PUBLISH_NPM
repo variable + NPM_TOKEN secret, mirroring the crates.io gate). Sources
live in npm/. Installs the kimetsu command (also available as
kimetsu-ai).

Assets 9

01 Jun 20:30

github-actions

v0.8.1

75f9b40

v0.8.1

v0.8.1 — npm distribution

ADDED

npm distribution. Kimetsu now publishes to npm — npm install -g kimetsu
installs the prebuilt native binary for your platform, no Rust toolchain
required. Uses the esbuild/turbo model: per-platform packages
(@kimetsu/linux-x64, @kimetsu/darwin-x64, @kimetsu/darwin-arm64,
@kimetsu/win32-x64) selected via optionalDependencies, with a thin
bin/cli.js launcher that execs the matching binary. No postinstall, so it
works under npm install --ignore-scripts. The semantic build is fetched on
demand when KIMETSU_NPM_FLAVOR=embeddings is set. Published from the
existing release pipeline (new publish-npm job, gated on the PUBLISH_NPM
repo variable + NPM_TOKEN secret, mirroring the crates.io gate). Sources
live in npm/.

Assets 9

31 May 02:25

RodCor

v0.8.0

6692fa4

v0.8.0

v0.8.0 — proactive recall, selectable embedding model, full MCP control

The release that makes the brain proactive and gives the agent (and user)
full control over it from inside Claude Code / Codex.

ADDED

Proactive recall (mid-work). New PreToolUse / PostToolUse Bash hooks
surface a relevant memory while the agent works, not just on prompt:
- after a failed Bash command, surface a matching failure_pattern /
  command fix;
- before a risky Bash command, warn from a matching failure_pattern;
- on a repeated failing command (loop), lower the score floor and bypass
  the throttle so help arrives sooner.
  Retrieval is lexical-FTS-only (no embedding-model load), gated by a high
  score floor (0.45; loop 0.35), capped at one capsule, with per-session
  dedupe + a refractory throttle. Token cost stays ~0 (silent when nothing
  qualifies). Wired into both Claude Code (.claude/settings.json) and Codex
  (.codex/hooks.json) with a matcher: "Bash"; opt out with
  kimetsu plugin install --no-proactive (or proactive:false over MCP).
  Per-session state lives in <repo>/.kimetsu/proactive/<session_id>.json
  and is garbage-collected after 7 days.
Selectable embedding model. New [embedder] table in project.toml
(precedence: KIMETSU_BRAIN_EMBEDDER env > config > default). Inspect and
change it with kimetsu brain model list / kimetsu brain model set <id>
(the latter writes the config and re-embeds the corpus with the new model).
Curated built-ins: bge-small-en-v1.5 (384d, default), bge-m3 (1024d),
jina-v2-base-code (768d).
Full MCP control surface. New tools so an agent can manage the brain
without leaving Claude Code / Codex: kimetsu_brain_model_list /
kimetsu_brain_model_set (re-embeds in-process with the new model),
kimetsu_brain_reindex, kimetsu_brain_memory_search (FTS over memory
text), kimetsu_brain_conflict_resolve, kimetsu_brain_prune, and
kimetsu_brain_config_show. kimetsu_brain_memory_list and
..._memory_proposals gained limit/offset pagination.

CHANGED

reindex can now run against an explicit embedder
(reindex_all_with_embedder), so a model switch re-embeds with the chosen
model regardless of the process's cached default embedder.

FIXED

Test isolation. Tests created project roots under the system temp dir;
on a machine whose $HOME is itself a git repo, ProjectPaths::discover
climbed to $HOME and made parallel tests share one brain.db +
project.lock. Each test root now gets its own git boundary, so plain
cargo test is hermetic.

Assets 9

Releases: RodCor/kimetsu

v2.0.0

v2.0.0 — Never explore twice

Flagship — Session warm-start (never re-establish your work)

Flagship — The active brain (never wait on the brain)

Flagship — Memory → skill synthesis (never re-derive a solution)

Local-model independence

Pluggable storage backends

Continuous self-tuning + ROI v2

Personal brain sync

Hardening & release

Uh oh!

v1.5.1

v1.5.1 — version-stamp re-release

Uh oh!

v1.5.0

v1.5.0 — pays for itself

Uh oh!

v1.0.0

v1.0.0 — durable migrations, analytics, semantic retrieval, proactive recall

Uh oh!

v0.9.0

v0.9.0 — auto-harvested memories + SessionEnd distiller

Uh oh!

v0.8.4

v0.8.4 — non-destructive plugin install + global scope

Uh oh!

v0.8.3

v0.8.3 — npm distribution

Uh oh!

v0.8.2

v0.8.2 — npm distribution

Uh oh!

v0.8.1

v0.8.1 — npm distribution

Uh oh!

v0.8.0

v0.8.0 — proactive recall, selectable embedding model, full MCP control

Uh oh!