Releases: RodCor/kimetsu

v2.0.0

22 Jun 06:13
35d115e

v2.0.0 — Never explore twice

The biggest token sink is RE-EXPLORATION — the agent re-deriving what the brain
(or the repo) already knows. v2.0 attacks it from three flagship directions and
adds a pluggable storage tier. Backward-compatible: existing project.toml and
brain.db files upgrade in place (schema v3 → v6, automatic on open).

Flagship — Session warm-start (never re-establish your work)

ADDED

  • Episodic work-resume. A new work_episode record (event-sourced,
    schema v5) captures your working state at SessionEnd — the task, what you
    did, what FAILED and why (dead-ends), open threads, and the working
    hypothesis — scoped per-repo (one live episode per repo). kimetsu resume
    prints it; kimetsu checkpoint [note] saves one manually. Distilled via
    the cheap model when configured, else a rule-based summary (never blocks).
    Episodes are LOCAL-ONLY and never sync/export.
  • Project digest. kimetsu brain digest [--refresh] assembles a compact
    (~400-token) digest — repo manifest + top-usefulness memories + current
    focus — cached at .kimetsu/digest.md by content hash, refreshed on git/
    corpus drift. (The digest is currently rule-based; a cheap-model
    distillation hook-point is present but not yet wired.)
  • SessionStart injection. A new kimetsu brain session-start-hook emits
    the digest + resume as additionalContext, gated by [broker] warm_start
    (default on). Wired for Claude Code; other hosts' SessionStart context
    surface is not yet verified and is intentionally left unwired.
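The "cached by content hash" behavior of the project digest can be pictured roughly as follows. This is a minimal sketch, not Kimetsu's implementation: the function names, the SHA-256 choice, and the in-memory dict standing in for .kimetsu/digest.md are all assumptions made for illustration.

```python
import hashlib


def content_key(manifest: str, memories: list[str], focus: str) -> str:
    """Hash every digest input, so drift in any input changes the key."""
    h = hashlib.sha256()
    for part in [manifest, *memories, focus]:
        h.update(part.encode("utf-8"))
        h.update(b"\x00")  # separator so ("ab", "c") and ("a", "bc") differ
    return h.hexdigest()


def get_digest(cache: dict, manifest, memories, focus, build):
    """Return (digest, rebuilt); rebuild only when the content hash changed."""
    key = content_key(manifest, memories, focus)
    if cache.get("key") == key:
        return cache["digest"], False
    digest = build(manifest, memories, focus)  # hypothetical digest builder
    cache["key"], cache["digest"] = key, digest
    return digest, True
```

With an unchanged repo manifest, memory set, and focus, the second call returns the cached digest without rebuilding; adding a memory changes the key and forces a refresh.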

Flagship — The active brain (never wait on the brain)

ADDED

  • kimetsu brain ask "<question>". Terminal Q&A answered entirely from
    the brain — full retrieval + a grounded answer composed by a LOCAL/cheap
    model, with memory-id citations. Zero frontier tokens; works offline. Falls
    back to distiller credentials, then to verbatim top-capsules when no model
    is configured. Grounded-only: refuses rather than hallucinating when memory
    has no answer. Supports --json; an answer marked helpful counts like a citation.
  • kimetsu_brain_answer MCP tool. A read-only synthesis tool the host
    agent can call mid-task ("what do we know about X?") for a grounded, cited
    brief instead of re-discovering.
  • Command recall fast-path. "How do I…" queries surface matching
    command-kind memories runnable-line-first.
  • Answer-grade injection. Very-high-confidence capsules (score ≥
    [broker] answer_grade_min_score, default 0.92) are prefixed "Verified
    answer from project memory:" so the model can act in one turn. Render-time
    only — ranking is untouched — and suppressed when the memory was recently a
    floor-drop regret.
  • Proactive pre-fetch. Opt-in [broker] proactive_prefetch (default off)
    warms trajectory-relevant memories at PreToolUse.
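The answer-grade rule above is a render-time decision only, which a short sketch can make concrete. The Capsule type, the render function, and the way recent regrets are passed in are hypothetical; only the 0.92 default and the prefix text come from these notes.

```python
from dataclasses import dataclass

ANSWER_GRADE_MIN_SCORE = 0.92  # default of [broker] answer_grade_min_score per the notes
PREFIX = "Verified answer from project memory:"


@dataclass
class Capsule:
    memory_id: str
    text: str
    score: float


def render(capsules, recent_regret_ids=frozenset(), min_score=ANSWER_GRADE_MIN_SCORE):
    """Render capsules in the order given; only the displayed text changes."""
    lines = []
    for c in capsules:  # input order is preserved, so ranking is untouched
        answer_grade = c.score >= min_score and c.memory_id not in recent_regret_ids
        lines.append(f"{PREFIX} {c.text}" if answer_grade else c.text)
    return lines
```

A capsule at or above the threshold gets the prefix unless its memory id is in the recent-regret set, in which case it renders as a normal capsule.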

Flagship — Memory → skill synthesis (never re-derive a solution)

ADDED

  • Skill synthesis. A memory cited ≥3 times (or a tight cluster) becomes a
    synthesis candidate; kimetsu brain skills [--review] drafts an executable
    skill from it — grounded strictly in the cited memories — and, on explicit
    accept, installs it into the host-native skill dir with provenance back to
    the source memory ids. Propose-only (never auto-installed); flagged stale
    when a source memory is superseded. Schema v6 (skill_proposals).
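As a rough illustration of the candidate rule and the staleness flag, assuming citations are available as a flat list of memory ids. The function names are hypothetical, and the "tight cluster" path is not modeled here.

```python
from collections import Counter

MIN_CITATIONS = 3  # threshold stated in the notes


def synthesis_candidates(cited_ids, min_citations=MIN_CITATIONS):
    """Memory ids cited at least `min_citations` times, in sorted order."""
    counts = Counter(cited_ids)
    return sorted(mid for mid, n in counts.items() if n >= min_citations)


def is_stale(source_ids, superseded_ids):
    """A proposal is flagged stale once any of its source memories is superseded."""
    return any(mid in superseded_ids for mid in source_ids)
```

Nothing in this sketch installs anything; it only selects candidates, matching the propose-only behavior described above.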

Local-model independence

ADDED

  • Ollama as a first-class provider (provider = "ollama" — OpenAI-compat,
    localhost:11434/v1 default, no key) and a single optional [cheap_model]
    config that all cheap-model consumers resolve (distiller, consolidation,
    digest, resume, skill draft, ask). Back-compatible with
    [learning.distiller]; OPTIONAL everywhere — every consumer degrades
    gracefully when no model is configured. kimetsu doctor probes the local
    endpoint. New guide: docs/LOCAL-MODELS.md (fully-local = zero external
    calls).
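One way to picture the single-resolution idea is a small lookup that every cheap-model consumer would share. This is a sketch under stated assumptions: preferring [cheap_model] over the legacy [learning.distiller] table, the base_url key name, and the http:// scheme on the Ollama default are guesses for illustration, not confirmed behavior.

```python
OLLAMA_DEFAULT_BASE_URL = "http://localhost:11434/v1"  # host/port/path per the notes; scheme assumed


def resolve_cheap_model(config: dict):
    """Return the cheap-model settings, or None so callers can degrade gracefully."""
    # Assumption: the new table wins, the legacy distiller table is the fallback.
    section = config.get("cheap_model") or config.get("learning", {}).get("distiller")
    if not section or "provider" not in section:
        return None  # no model configured: consumers fall back to rule-based paths
    resolved = dict(section)
    if resolved["provider"] == "ollama":
        resolved.setdefault("base_url", OLLAMA_DEFAULT_BASE_URL)  # no key needed
    return resolved
```

Returning None rather than raising mirrors the "OPTIONAL everywhere" rule: a consumer such as the digest or resume summary would take its rule-based path.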

Pluggable storage backends

ADDED

  • RetrievalBackend trait with
    [storage] backend = "flat" | "graph-lite" | "graph" (default flat). The
    broker stays backend-agnostic; switching backends re-projects from the
    event log.
  • Graph-lite (Tier 1) — a typed-edge projection (memory_edges, schema
    v4) with 1–2 hop expansion blended as graph-provenance candidates (a strict
    superset of flat — no recall loss; the broker still filters). supersedes
    edges populate now; episode-sourced edge types are reserved.
  • Petgraph (Tier 2, remote-only) behind the graph feature (off in local
    lean/embeddings builds — petgraph is never compiled there). In-memory graph
    with centrality / shortest-path / community-detection helpers, plus a
    cross-backend benchmark harness. Spike verdict: an embedded graph DB
    (Kùzu/Cozo) is not justified through ~100k memories.
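The graph-lite claim that expansion is a strict superset of flat retrieval can be sketched as a bounded breadth-first walk over typed edges. The tuple layout, the provenance labels, and treating edges as undirected during traversal are assumptions for illustration only.

```python
from collections import deque


def expand(seeds, edges, max_hops=2):
    """Return the flat seeds first, then memories reachable within max_hops.

    `edges` is a list of (src, edge_type, dst) tuples.
    """
    adjacency = {}
    for src, edge_type, dst in edges:
        adjacency.setdefault(src, []).append((edge_type, dst))
        adjacency.setdefault(dst, []).append((edge_type, src))  # assumed undirected walk

    result = [(mid, "flat") for mid in seeds]  # flat hits are never dropped
    seen = set(seeds)
    queue = deque((mid, 0) for mid in seeds)
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop budget
        for edge_type, neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                result.append((neighbour, f"graph:{edge_type}"))
                queue.append((neighbour, hops + 1))
    return result
```

Because the seeds are emitted unchanged before any neighbors are added, the output always contains the flat result, and graph-provenance candidates remain subject to whatever filtering the broker applies afterwards.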

Continuous self-tuning + ROI v2

ADDED

  • Re-tune triggers (≥50 new memories since last tune, or elevated regret
    rate) surfaced via kimetsu brain tune --status + a Stop-hook one-liner —
    proposed, never auto-applied. Model re-selection advisor
    (tune --models) with reindex/download cost stated. Regret-driven
    objective: the tune objective now penalizes floor configs that produced
    regrets. ROI v2: per-memory ROI (roi --top), output-token accounting,
    and warm-start (digest_served/resume_served) savings attribution.
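A sketch of the re-tune trigger check, for illustration. The 50-memory threshold comes from the notes; the regret-rate threshold of 0.10 and the function signature are hypothetical, since the notes say only "elevated regret rate".

```python
NEW_MEMORY_TRIGGER = 50  # ">=50 new memories since last tune" per the notes


def retune_reasons(new_memories, regrets, served, regret_rate_threshold=0.10):
    """Return human-readable reasons to propose a re-tune (empty list means none)."""
    reasons = []
    if new_memories >= NEW_MEMORY_TRIGGER:
        reasons.append(f"{new_memories} new memories since last tune")
    rate = regrets / served if served else 0.0
    if rate > regret_rate_threshold:  # threshold value is an assumption
        reasons.append(f"regret rate {rate:.0%} is elevated")
    return reasons
```

The function only reports reasons; consistent with "proposed, never auto-applied", acting on them is left to the user.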

Personal brain sync

ADDED

  • Event-log replication (not file copying). kimetsu brain sync export
    --since <cursor> / import move durable memory-lifecycle events as
    portable JSONL with per-event idempotency; telemetry, raw queries, and
    local-only episodes are excluded by an allowlist; redaction is respected.
    [sync] dir drives a server-less directory protocol (Dropbox/Syncthing/NAS)
    with per-machine batches and per-source cursors; --dry-run and
    --status throughout. Object-storage backends (e.g. S3) are deferred.
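The export/import contract (allowlist on the way out, per-event idempotency on the way in) might look roughly like this. The event shape with id, seq, and kind fields is hypothetical, as is the exact allowlist; memory.superseded and memory.cited are event names from these notes, while memory.recorded is a placeholder.

```python
import json

# Assumed allowlist: only durable memory-lifecycle events are portable.
ALLOWED_KINDS = {"memory.recorded", "memory.superseded", "memory.cited"}


def export_events(events, since=0):
    """Serialize allowlisted events with seq > since as JSONL; return (jsonl, cursor)."""
    lines, cursor = [], since
    for ev in events:
        if ev["seq"] <= since:
            continue
        cursor = max(cursor, ev["seq"])  # the cursor advances past excluded events too
        if ev["kind"] in ALLOWED_KINDS:
            lines.append(json.dumps(ev, sort_keys=True))
    return "\n".join(lines), cursor


def import_events(jsonl, applied_ids, log):
    """Append each event at most once, keyed by its id; return how many were applied."""
    applied = 0
    for line in jsonl.splitlines():
        if not line.strip():
            continue
        ev = json.loads(line)
        if ev["id"] in applied_ids:
            continue  # already seen: re-importing the same batch is a no-op
        applied_ids.add(ev["id"])
        log.append(ev)
        applied += 1
    return applied
```

Importing the same batch twice applies nothing the second time, which is what makes a shared-directory transport safe to retry.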

Hardening & release

FIXED

  • Analytics/doctor active counts, reindex, peek/undo, and memory list now
    consistently exclude superseded rows; config set/tune --apply preserve
    TOML comments + unknown keys (toml_edit); dropped-capsule + proactive-state
    sidecars write atomically. Cursor/Gemini MCP schemas verified against live
    docs.
  • Release pipeline: a version-guard job fails in seconds on a
    tag/workspace-version mismatch (and the built binary must self-report the
    tag) — closing the gap that produced the v1.5.0 botch; core GitHub Actions
    bumped to Node-24-ready versions; scripts/bump-version.sh codifies the
    one-step version bump.

KNOWN LIMITATIONS

  • SessionStart warm-start is wired for Claude Code only; the digest is
    rule-based pending the cheap-model distillation wiring; full retrieval-
    quality and first-turn-token benchmarks require the embeddings + Docker
    environment and were not run in CI; remote-bench process isolation (#23)
    remains open.

v1.5.1

12 Jun 18:22
acdeaf2

v1.5.1 — version-stamp re-release

Identical feature set to v1.5.0. The v1.5.0 artifacts were built before the
workspace version bump, so their binaries self-report 1.0.0 (which also
broke the crates.io publish — kimetsu-core@1.0.0 already existed). npm
forbids reusing a published version, so the corrected release ships as
v1.5.1. If you installed v1.5.0 from npm or the GitHub release, update.

FIXED

  • Workspace and inter-crate versions stamped correctly (kimetsu --version
    now reports the release version; crates.io publish unblocked).

v1.5.0

12 Jun 16:55
c718b2e

Warning

Superseded by v1.5.1 — these artifacts were built before the workspace version bump and self-report 1.0.0. Identical feature set; install v1.5.1 instead.

v1.5.0 — pays for itself

ADDED

  • Telemetry capture. Raw query text is now stored in context.served
    telemetry events when [learning] store_queries = true (default; set
    false to revert to the pre-v1.5 query-hash-only behavior). A
    session_id field is also written when the host provides one (Claude Code
    hooks; absent for hosts that do not emit it). Dropped-capsule sidecar
    (~/.kimetsu/cache/<hash>/dropped-recent.json) records capsules that
    were floor-filtered out so that a later citation of one is detected as a
    retrieval.regret event, feeding the self-tuning loop. All telemetry
    stays on-machine; nothing is exported.

  • ROI ledger — kimetsu brain roi. Conservative per-kind token-savings
    estimates (failure_pattern=1500, command=400, convention=300, fact=500,
    preference=200 tokens per citation) minus brain-injection overhead give a
    net-positive / net-negative verdict. Dollar estimates are shown when the
    active model is recognized from a built-in price table (Claude 3/4, GPT-4/5
    families, Bedrock routing prefixes) or when [model] price_per_mtok is set
    in project.toml. --window 7d|30d|all and --json for scripting. The
    Stop hook appends a per-session savings sentence when ≥1 citation occurred;
    zero-citation sessions are silent. Calibration methodology and honest
    limitations: docs/ROI-METHODOLOGY.md.

  • Token budget — render-time capsule compression and session dedupe.
    Two [broker] toggles, both default true:

    • compress_capsules: capsule summaries are compressed at render time
      (strips [tags: ...] / (context: ...) annotations, caps at 3
      sentences). Ranking is never affected — this runs only after retrieval
      and reranking. Set false to inject full memory text.
    • session_dedupe: the UserPromptSubmit hook skips capsules whose
      handle was already injected earlier in the same session (tracked via
      the proactive-state sidecar). Soft policy: falls back to injecting all
      capsules when dedupe would produce an empty set. This addresses the
      pre-v1.5 behavior where the main hook re-injected the same top capsule
      on every prompt of a long session.
  • Self-Tuning Brain — kimetsu_brain_cite MCP tool + kimetsu brain tune.
    kimetsu_brain_cite is a new write-gated MCP tool that records a
    memory.cited event from inside an MCP session, closing the ground-truth
    gap when the model leans on a memory but doesn't explicitly call
    cite_memory. Personal eval set: tuneset builds positive eval cases from
    context.served + citation joins (exact session_id or ±30-minute window
    fallback). kimetsu brain tune --status reports case count and kind
    coverage. kimetsu brain tune (dry-run by default) sweeps broker.min_lexical_coverage
    ∈ {0.3, 0.4, 0.5, 0.6} × broker.min_semantic_score ∈ {-1.0(AUTO), 0.0, 0.25,
    0.35, 0.45} against the production embedder; --apply writes only the
    floor parameters (not the reranker — that change is recommended separately);
    --revert restores the previous tune-history entry. A holdout guardrail
    (deterministic 20% split) prevents writing a config that regresses the
    holdout objective.

  • Consolidation — kimetsu brain consolidate + kimetsu brain triage.
    Schema migrated to v3 (superseded_by column + index on memories).
    brain consolidate (Story 3.1, default): brute-force cosine scan within the
    same embedding model; clusters at ≥ 0.92 cosine (configurable with
    --threshold) are merged — survivor keeps its id and text, members get
    superseded_by set and a memory.superseded event written (rebuild-safe);
    citations are reassigned to the survivor. --distill (Story 3.2): looser
    clusters (0.75–0.85 cosine band, ≥3 memories, ≥1 shared tag) are fed to
    the configured distiller; result lands as a memory proposal for human
    review; prints clusters and exits 0 when no distiller is configured.
    brain triage (Story 3.3): interactive per-item keep / prune / skip of
    memories below a usefulness and age threshold (--score-floor 0.2,
    --age-days 30); --prune-all --yes for batch non-interactive pruning.

  • Reach — export redact, Cursor + Gemini CLI installers, CI embeddings job.
    kimetsu brain export --redact strips the (context: …) segment from
    exported memory text; --redact-tags (requires --redact) additionally
    strips the [tags: …] prefix. Both flags are useful for sharing brains
    without leaking workspace-specific file paths or tag metadata.
    kimetsu plugin install cursor and kimetsu plugin install gemini-cli
    write MCP config (.cursor/mcp.json / .gemini/settings.json) and an
    always-on guidance file (.cursor/rules/kimetsu-brain/rule.md for workspace
    installs; GEMINI.md merged into the project root or ~/.gemini/GEMINI.md
    for global installs). Neither host has a UserPromptSubmit-style hook
    system, so MCP + the guidance file are the complete integration surface.
    Note: Cursor and Gemini CLI config schemas were inferred from their public
    docs as of June 2026 and have not been verified against live host builds —
    treat these installers as a starting point and check the generated files
    before committing them.
    CI: a new test-embeddings job (ubuntu-only,
    --features embeddings, HuggingFace + fastembed cache) runs alongside the
    existing lean test matrix.
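The ROI ledger arithmetic described in the list above can be worked through in a few lines. The per-kind values are the ones quoted in these notes; the function shape, the treatment of a zero net as net-negative, and applying price_per_mtok as dollars per million tokens are assumptions for illustration.

```python
# Conservative per-citation savings estimates (tokens), as listed in the notes.
SAVINGS_PER_CITATION = {
    "failure_pattern": 1500,
    "command": 400,
    "convention": 300,
    "fact": 500,
    "preference": 200,
}


def roi(citations_by_kind, injected_tokens, price_per_mtok=None):
    """Net token savings = estimated savings from citations minus injection overhead."""
    gross = sum(SAVINGS_PER_CITATION[k] * n for k, n in citations_by_kind.items())
    net = gross - injected_tokens
    result = {
        "gross_tokens": gross,
        "net_tokens": net,
        "verdict": "net-positive" if net > 0 else "net-negative",
    }
    if price_per_mtok is not None:  # dollar estimate only when a price is known
        result["net_dollars"] = net / 1_000_000 * price_per_mtok
    return result
```

For example, two failure_pattern citations and three command citations give 2 × 1500 + 3 × 400 = 4200 estimated tokens saved; with 1700 tokens of injected brain context the net is 2500 tokens.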

CHANGED

  • kimetsu.mcp_write_tools gate now covers kimetsu_brain_cite alongside
    the existing write tools (same env / config / default-true logic for the
    local stdio server; remote server remains env-only, default-deny).
  • Stop hook output includes a per-session token-savings line when the brain
    was cited at least once that session.
  • brain.db schema advanced from v2 → v3 (automatic on first
    read-write open; sidecar backup taken per the existing migration policy).
  • brain export gains --redact / --redact-tags flags (no behavior
    change for existing callers).
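The gate logic referenced in the first CHANGED item follows the precedence spelled out in the v1.0.0 notes: the environment variable, when set, wins in both directions, then config, then the default. A minimal sketch, assuming "1" enables and any other set value disables (the notes document only the "=1" form) and that the config is available as a nested dict:

```python
ENV_VAR = "KIMETSU_MCP_ENABLE_WRITE_TOOLS"


def write_tools_enabled(env: dict, config: dict, remote: bool = False) -> bool:
    """Resolve the MCP write-tools gate: env > config > default."""
    raw = env.get(ENV_VAR)
    if raw is not None:
        return raw == "1"  # assumption: any other set value means "off"
    if remote:
        return False  # remote server: env-only, default-deny; config is untrusted
    # Local stdio server: config-driven, default true.
    return bool(config.get("kimetsu", {}).get("mcp_write_tools", True))
```

The remote branch never consults config, reflecting the stated reason that a cloned repo's project.toml is untrusted input.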

FIXED

  • Tune sweep now runs against the production embedder (not the Noop embedder
    used in tests), so floor calibration is on real vectors.
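For orientation, the tune sweep and its holdout guardrail can be sketched as below. The grid values are the ones listed for kimetsu brain tune in this release; the hash-based 20% split and the selection function are illustrative assumptions about how a deterministic split and a "never regress the holdout" rule could be implemented.

```python
import hashlib
from itertools import product

LEXICAL = [0.3, 0.4, 0.5, 0.6]            # broker.min_lexical_coverage values
SEMANTIC = [-1.0, 0.0, 0.25, 0.35, 0.45]  # broker.min_semantic_score values (-1.0 = AUTO)


def sweep_grid():
    """Every (lexical, semantic) floor pair the sweep evaluates: 4 x 5 = 20."""
    return list(product(LEXICAL, SEMANTIC))


def is_holdout(case_id: str, fraction: float = 0.20) -> bool:
    """Deterministic split: the same case id always lands on the same side."""
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < round(fraction * 100)


def pick(results, baseline_holdout):
    """results: (config, train_score, holdout_score) triples.

    Return the best-training config that does not regress the holdout, else None.
    """
    ok = [r for r in results if r[2] >= baseline_holdout]
    return max(ok, key=lambda r: r[1])[0] if ok else None
```

A config with the best training score is still rejected if its holdout score falls below the current baseline, which is the point of the guardrail.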

v1.0.0

10 Jun 18:00
fca78b3

v1.0.0 — durable migrations, analytics, semantic retrieval, proactive recall

ADDED

  • Remote cross-encoder rerank stage. kimetsu-remote serve now applies a
    cross-encoder reranker to every kimetsu_brain_context call (--reranker,
    default jina-reranker-v1-tiny-en, operator-level; "off" disables; any
    curated or HuggingFace ONNX id accepted). The default was chosen by the
    100-memory benchmark: jina-tiny MRR 0.931 vs 0.914 for TinyBERT on the local
    bench; the remote path has no hook-latency budget so the fastest effective
    reranker wins. Benchmark lift with reranker: jina-v2-base-code MRR 0.904 →
    0.906, bge-small MRR 0.901 → 0.909 (production floors active).

  • Model-aware AUTO semantic floor + kimetsu-remote benchmark.
    kimetsu brain bench --remote boots a real kimetsu-remote server per embedder,
    drives the 100-case dataset over HTTP MCP (sequential + concurrent), and
    reports quality/latency/throughput/server-RSS. Its first run caught a
    real bug: the 0.35 cosine floor (calibrated on bge) was KILLING relevant
    jina-v2 results on every production path (remote MRR 0.90 -> 0.77).
    broker.min_semantic_score now defaults to -1.0 = AUTO: 0.35 on
    bge-family models, disabled elsewhere (jina-v2's own precision keeps noise
    low); explicit values still win. Confirmed: jina-v2 remote recovered to
    MRR 0.906 / recall@4 0.939 on the 100-memory corpus (with the server
    reranker: ~416ms/request, ~5 rps at concurrency 4, ~1.2GB peak RSS).

  • Benchmark-chosen retrieval defaults: jina-v2-base-code + TinyBERT.
    kimetsu brain bench (100 real-memory cases) drove the defaults:
    embedder jina-v2-base-code (recovers oblique queries bge-small never
    pools; ~4x less off-topic noise) + reranker ms-marco-tinybert-l-2-v2
    (~43ms, within noise of the best MRR). Existing brains need
    kimetsu brain reindex after upgrading (vector dims 384 -> 768); set
    embedder.model/embedder.reranker back to taste and re-judge with
    kimetsu brain bench. Lean-RAM alternative: bge-small + tinybert.

  • Local MCP write tools enabled by default (kimetsu.mcp_write_tools).
    The brain's own workflow tells the agent to record lessons
    (kimetsu_brain_record, the Stop-hook harvest cue), but the privileged-
    write gate default-denied unless KIMETSU_MCP_ENABLE_WRITE_TOOLS=1 was
    in the MCP server's env — so every session ended with the agent goaded
    into a blocked call. The gate is now config-driven for the LOCAL stdio
    server: kimetsu.mcp_write_tools (default true), personalizable via
    kimetsu config set kimetsu.mcp_write_tools false. Precedence: the env
    var when set always wins (both directions) > config > default. The
    REMOTE server is unchanged — env-only, default-deny — because a cloned
    repo's project.toml is untrusted input and must never enable writes.

  • Cross-encoder reranking (opt-in) + retrieval eval harness + 300ms
    hook budget.
    The warm daemon can apply a final cross-encoder rerank
    stage: over-fetch a 12-capsule pool, score each (query, memory) pair
    jointly with a fastembed reranker ([embedder] reranker = a curated id;
    local default ms-marco-tinybert-l-2-v2; "off" disables), drop below a 0.30 relevance floor, truncate to the cap.
    Every knob was chosen by measurement: jina-reranker-v1-tiny-en
    (default) beat the larger turbo model on both quality and speed in the
    head-to-head benchmark, a pool of 6 matches pool-12 quality exactly at
    half the latency, and summaries must stay FULL (snippet truncation
    cratered recall@4 from 0.83 to 0.66 — worse than FTS). With those
    settings a warm rerank answers inside the 300ms hook budget on a real
    brain (~265ms measured); slower machines degrade gracefully to
    floored-FTS for that turn, or set reranker = "off". Backing the
    default path, the cosine floor: broker.min_semantic_score is
    now ON by default (0.35) and actually wired (it previously defaulted
    to 0.0 and was never populated from config in any production path). The
    hook's daemon budget tightens 750ms → 300ms; a warm semantic answer
    fits with ~70ms to spare, and misses fall back to floored FTS. Daemon
    spawn hygiene for the lazy path: the entrypoint now binds the socket
    BEFORE loading models (a redundant spawn exits in ms, not seconds), and
    on Windows the hook clears HANDLE_FLAG_INHERIT on its std handles before
    spawning so the long-lived daemon can never hold the harness's stdout
    pipe open (previously the first prompt of a session could stall until
    the host's hook timeout). New kimetsu brain eval [--fixture ...]
    measures recall@2/4 + MRR across fts / semantic / semantic+rerank modes
    against a committed fixture (fixtures/eval-retrieval.json), so ranking
    changes are measurable: baseline shows semantic recall@4 0.90 / MRR 0.91
    vs FTS 0.72 / 0.81. --rerankers <ids> benchmarks cross-encoders
    head-to-head (quality + per-query latency) and --pool sweeps pool
    sizes; non-curated models load as user-defined ONNX from any HuggingFace
    repo via hf-hub. Benchmark results: jina-tiny recall@4 0.896 / MRR 0.938
    at ~44ms per query (pool 6) vs turbo 0.833 / 0.875 at ~137ms (pool 12);
    ms-marco TinyBERT-L-2 is ~5× faster still (8.5ms) but its quality
    (0.715) merely matches FTS.

  • Warm embedder daemon — semantic recall at hook time. The
    UserPromptSubmit context-hook can now match memories by meaning, not
    just lexically. A single per-user daemon (kimetsu brain embed-daemon,
    keyed by embedder model) loads the ONNX model once and serves full
    embedding/ANN retrieval to the hook over a local socket / named pipe
    (interprocess); the hook is a thin client with a ≤300ms budget and a
    hard fall-back to floored-FTS, so the prompt is never blocked.
    kimetsu brain warm (wired to each harness's startup hook) pre-warms it so a
    running session never pays a cold model load. One model in RAM regardless
    of how many projects/agents are active. Config: [embedder] daemon and
    warm_on_start toggle it; [embedder] model (and kimetsu brain model set)
    pick the model. KIMETSU_EMBED_DAEMON=0 is a runtime kill switch.
    Embeddings builds only; lean builds keep the floored-FTS hook. New
    kimetsu brain daemon status|stop to inspect/control it.

  • Durable schema migrations. brain.db now migrates forward
    automatically on open via a versioned, forward-only runner (each
    migration applied in one transaction). The DB is backed up to a
    brain.db.bak-* sidecar before any version-advancing migration
    (skipped for empty brains; the three newest backups are kept).
    Read-only opens of an un-migrated brain degrade gracefully; an
    event-upcast seam keeps old traces replayable. The project.toml
    config version is decoupled from the DB schema version so the
    schema can evolve without breaking existing projects.

  • Proof-of-value analytics. New kimetsu brain insights command
    and kimetsu_brain_insights MCP tool: retrieval hit-rate &
    skip-rate, citation rate, proposal acceptance rate, usefulness
    trend, harvest yield, corpus health, and token economy — computed
    over a configurable recent-runs window. A new context.served
    event records every retrieval (hit or miss); context.injected
    now carries injected-token counts.

  • Semantic retrieval (usearch HNSW ANN). On the embeddings build, an
    approximate-nearest-neighbour index (usearch HNSW) finds memories whose
    meaning matches the query even with no shared words — O(log N) per
    query, so retrieval stays fast as the corpus grows. The index is
    candidate generation only; final ranking is an exact cosine rerank over the
    stored f32 vectors, so the index can be quantized (f16 by default,
    i8/f32 via KIMETSU_ANN_QUANTIZATION) for a large RAM saving with
    negligible quality loss. Retrieval is sharpened with embedding-MMR
    (collapses true paraphrase duplicates) and an absolute semantic-relevance
    floor (genuinely off-topic queries return nothing). Capsule caps are
    config-driven (default 8) and injected tokens drop while the relevant
    capsule is preserved. The lean (FTS-only) build is unchanged.

  • Scales to ~1M memories. A million-memory corpus runs on modest
    hardware: ~1.8 s p99 semantic retrieval and ~3 GB RAM at 1M (f16
    default; ~2.8 GB with KIMETSU_ANN_QUANTIZATION=i8). Both retrieval and
    conflict-detection-on-write are O(log N) via the HNSW index — no
    brute-force vector scan. Bulk ingest batches embedding; the index builds in
    parallel across cores, maintains itself incrementally, persists a sidecar
    so a restarted server loads instead of rebuilding, and Kimetsu Remote
    pre-warms each repo's index on startup.

  • Proactive & cost-shrinking recall (the agent brain). Before the
    first implementation attempt, a tight retrieval surfaces a "Known
    pitfalls" block (failure patterns / conventions) — proactive, not
    just post-failure. Tasks are classified (Debug / Feature / Refactor /
    Docs / Investigation) to route recall by kind. A per-run recall
    ledger deduplicates capsules across stages (rendered once,
    back-referenced after), and the long tail is injected as one-line
    headlines the agent expands on demand via a new expand_capsule
    tool — so brain overhead shrinks in relative terms as tasks grow
    (an adaptive sublinear, per-run-capped budget).

  • kimetsu config edit and kimetsu run abort are now fully
    implemented: config edit opens $EDITOR on project.toml and
    re-validates on save; run abort cleanly finalizes a dangling run.
    No stub subcommands ship.

  • kimetsu doctor --selftest proves the brain pipeline works
    end-to-end (ingest → retrieve → record) without needing a live
    model...
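The recall@k and MRR figures quoted throughout the v1.0.0 notes follow the standard information-retrieval definitions, which a short sketch makes explicit. The case format here (a ranked id list plus a set of relevant ids) is an assumption for illustration and is not the layout of fixtures/eval-retrieval.json.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant ids that appear in the top k results."""
    return len(set(ranked[:k]) & set(relevant)) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 when none is retrieved."""
    for position, memory_id in enumerate(ranked, start=1):
        if memory_id in relevant:
            return 1.0 / position
    return 0.0


def evaluate(cases, k=4):
    """cases: list of (ranked_ids, relevant_ids). Return (mean recall@k, MRR)."""
    n = len(cases)
    mean_recall = sum(recall_at_k(r, rel, k) for r, rel in cases) / n
    mrr = sum(reciprocal_rank(r, rel) for r, rel in cases) / n
    return mean_recall, mrr
```

With one relevant memory per case, recall@4 reduces to the share of cases whose expected memory appears in the top four, and MRR rewards ranking it first.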

v0.9.0

03 Jun 03:34
021200f

v0.9.0 — auto-harvested memories + SessionEnd distiller

ADDED

  • Credentialed SessionEnd distiller (opt-in). A second, deterministic
    memory-harvest path alongside the v0.9.0 in-agent harvester.
    kimetsu plugin install claude-code and kimetsu plugin install codex now run an
    interactive wizard (on a TTY; skip with --no-setup, force with
    --setup-harvest) that configures a cheap distiller
    model: Anthropic (recommended claude-haiku-4-5), OpenAI (recommended
    gpt-5.4-mini), or compatible endpoints via ANTHROPIC_BASE_URL /
    OPENAI_BASE_URL. The key + base URL are written to a gitignored .env;
    the selection lands in [learning.distiller]
    in project.toml. A new SessionEnd hook for Claude Code runs
    kimetsu brain session-end-hook; Codex uses its supported Stop hook with
    --distill-on-stop. When enabled + credentialed, the distiller reads the
    transcript with that model and records lessons through the confidence-gated
    propose_or_merge_memory. When the distiller is enabled, the Stop hook's
    end-of-session cue is suppressed (the distiller owns end-of-session; the
    mid-session PostToolUse resolved-failure cue stays). AnthropicProvider
    gained an optional base URL for the LiteLLM case, and the distiller gained
    an OpenAI Responses API provider. With --scope global the
    wizard configures the distiller once in ~/.kimetsu/ (config + .env); that
    global distiller then runs in every project and records into the user brain
    (~/.kimetsu/brain.db, available everywhere), with a per-project distiller
    taking precedence when one is configured.
  • Auto-harvested memories (in-agent). The hooks now drive memory
    generation, not just retrieval. When a command that failed earlier in the
    session succeeds (PostToolUse), or a non-trivial session ends with nothing
    recorded (Stop), Kimetsu emits a [kimetsu-harvest] cue telling the agent to
    dispatch a new background kimetsu-memory-harvester subagent (installed
    at .claude/agents/ for Claude Code and .codex/agents/ for Codex, pinned
    to a cheap model). The subagent distills 0–3
    generalizable lessons and records them through the confidence-gated
    kimetsu_brain_record path — no separate API key or kimetsu-side model
    credentials, billed in-agent at the cheap model's rate, non-blocking. Cues
    are throttled (at most ~once per resolved failure / once per session) and can
    be disabled with [learning] auto_harvest = false in project.toml.

CHANGED

  • Cleaner help & tool menus. Stripped internal version-history prefixes
    (v0.x:, MP-…:) from --help text and the MCP tool/argument descriptions
    so menus read as plain present-tense descriptions. (Internal code comments
    are unchanged.)

FIXED

  • Stop hook read the wrong field. The Stop hook counted
    kimetsu_brain_record calls from a non-existent inline transcript array;
    Claude Code actually passes a transcript_path to a JSONL file. It now reads
    the JSONL (with the inline array kept as a fallback) and counts both the bare
    and MCP-namespaced (mcp__kimetsu__kimetsu_brain_record) tool names, so the
    "lessons recorded" banner and end-of-session cue actually fire. The JSONL is
    now streamed line-by-line so a long session's multi-MB transcript never
    lands in memory at once.
  • BOM-tolerant install. kimetsu plugin install now strips a leading
    UTF-8 BOM before parsing an existing settings.json / hooks.json /
    config.toml / .mcp.json, so a config saved by a BOM-emitting editor
    (e.g. older Windows Notepad) no longer fails with "expected value at line 1
    column 1".
  • Install polish. The installer now merges Kimetsu's guidance into an
    existing CLAUDE.md — workspace .claude/CLAUDE.md or the global
    ~/.claude/CLAUDE.md — inside <!-- kimetsu:begin -->/<!-- kimetsu:end -->
    markers, appending and upgrading in place, never overwriting the user's
    content. --force no longer overwrites CLAUDE.md (the whole install is
    idempotent and non-destructive; the flag is retained only for compatibility).
    A --scope global on the workspace-only kimetsu target warns instead of
    silently doing nothing; --workspace is canonicalized leniently so a global
    install doesn't fail on a missing workspace path.
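The marker-based CLAUDE.md merge from the last item can be illustrated with a small function. The markers are the ones named in these notes; the function itself, including how whitespace around the block is handled, is a sketch under assumptions rather than the installer's code.

```python
BEGIN = "<!-- kimetsu:begin -->"
END = "<!-- kimetsu:end -->"


def merge_guidance(existing: str, guidance: str) -> str:
    """Insert or upgrade the marked block without touching the user's own content."""
    block = f"{BEGIN}\n{guidance.strip()}\n{END}"
    start = existing.find(BEGIN)
    end = existing.find(END, start + len(BEGIN)) if start != -1 else -1
    if start != -1 and end != -1:
        # Upgrade in place: replace only what sits between the markers.
        return existing[:start] + block + existing[end + len(END):]
    if not existing.strip():
        return block + "\n"
    # First install: append after the user's content.
    return existing.rstrip("\n") + "\n\n" + block + "\n"
```

Running the merge twice with the same guidance yields the same file, and running it with newer guidance swaps only the marked block, which is the idempotent, non-destructive behavior the notes describe.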

v0.8.4

02 Jun 15:16
6e1465e

v0.8.4 — non-destructive plugin install + global scope

ADDED

  • Global plugin install. kimetsu plugin install <target> --scope global
    installs the Kimetsu surface into the user's home for every session —
    ~/.claude/ + ~/.claude.json (mcpServers) for Claude Code, and
    ~/.codex/ for Codex — instead of the workspace. --scope defaults to
    workspace (the prior behavior). Also exposed as the scope argument on
    the kimetsu_plugin_install MCP tool.

FIXED

  • Hook install no longer clobbers existing hooks. kimetsu plugin install
    now merges its hooks into existing Claude settings.json / Codex
    hooks.json instead of replacing them. Hooks you already have — even on the
    same events Kimetsu uses (UserPromptSubmit, PreToolUse, …) — are
    preserved, with Kimetsu's group added alongside. Re-running install is
    idempotent (no duplicate groups) and the MCP config + generated docs refresh
    without requiring --force.
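A sketch of the merge-instead-of-replace behavior, assuming the Claude Code settings.json layout of hooks keyed by event name, each holding a list of groups. Recognizing Kimetsu's own group by a command that starts with "kimetsu " is an assumption made for this example, and the command string used below is a placeholder.

```python
import copy


def is_kimetsu_group(group: dict) -> bool:
    """Assumed ownership test: any hook command in the group starts with 'kimetsu '."""
    return any(h.get("command", "").startswith("kimetsu ") for h in group.get("hooks", []))


def merge_hooks(settings: dict, kimetsu_groups: dict) -> dict:
    """Add one Kimetsu group per event, preserving every group the user already has."""
    merged = copy.deepcopy(settings)  # never mutate the caller's settings
    hooks = merged.setdefault("hooks", {})
    for event, group in kimetsu_groups.items():
        user_groups = [g for g in hooks.get(event, []) if not is_kimetsu_group(g)]
        # Drop any previous Kimetsu group, then append the fresh one: no duplicates.
        hooks[event] = user_groups + [copy.deepcopy(group)]
    return merged
```

Re-running the merge on its own output changes nothing, and a user's hook on the same event stays in place alongside Kimetsu's group.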

v0.8.3

01 Jun 21:24
2d8211d

v0.8.3 — npm distribution

ADDED

  • npm distribution. Kimetsu now publishes to npm — npm install -g kimetsu-ai
    installs the prebuilt native binary for your platform, no Rust toolchain
    required. Uses the esbuild/turbo model: per-platform packages
    (@kimetsu-ai/linux-x64, @kimetsu-ai/darwin-x64, @kimetsu-ai/darwin-arm64,
    @kimetsu-ai/win32-x64) selected via optionalDependencies, with a thin
    bin/cli.js launcher that execs the matching binary. No postinstall, so it
    works under npm install --ignore-scripts. The semantic build is fetched on
    demand when KIMETSU_NPM_FLAVOR=embeddings is set. Published from the
    existing release pipeline (publish-npm job, gated on the PUBLISH_NPM
    repo variable + NPM_TOKEN secret, mirroring the crates.io gate). Sources
    live in npm/. Installs the kimetsu command (also available as
    kimetsu-ai).

FIXED

  • Windows npm package. The publish-npm job now extracts the Windows
    archive with 7z instead of unzip (PowerShell Compress-Archive uses
    backslash path separators that unzip flattens), and publishing is
    idempotent so a re-run after a partial failure skips already-published
    versions.

v0.8.2

01 Jun 21:05
522f184

v0.8.2 — npm distribution

ADDED

  • npm distribution. Kimetsu now publishes to npm — npm install -g kimetsu-ai
    installs the prebuilt native binary for your platform, no Rust toolchain
    required. Uses the esbuild/turbo model: per-platform packages
    (@kimetsu-ai/linux-x64, @kimetsu-ai/darwin-x64, @kimetsu-ai/darwin-arm64,
    @kimetsu-ai/win32-x64) selected via optionalDependencies, with a thin
    bin/cli.js launcher that execs the matching binary. No postinstall, so it
    works under npm install --ignore-scripts. The semantic build is fetched on
    demand when KIMETSU_NPM_FLAVOR=embeddings is set. Published from the
    existing release pipeline (publish-npm job, gated on the PUBLISH_NPM
    repo variable + NPM_TOKEN secret, mirroring the crates.io gate). Sources
    live in npm/. Installs the kimetsu command (also available as
    kimetsu-ai).

v0.8.1

01 Jun 20:30
75f9b40

v0.8.1 — npm distribution

ADDED

  • npm distribution. Kimetsu now publishes to npm — npm install -g kimetsu
    installs the prebuilt native binary for your platform, no Rust toolchain
    required. Uses the esbuild/turbo model: per-platform packages
    (@kimetsu/linux-x64, @kimetsu/darwin-x64, @kimetsu/darwin-arm64,
    @kimetsu/win32-x64) selected via optionalDependencies, with a thin
    bin/cli.js launcher that execs the matching binary. No postinstall, so it
    works under npm install --ignore-scripts. The semantic build is fetched on
    demand when KIMETSU_NPM_FLAVOR=embeddings is set. Published from the
    existing release pipeline (new publish-npm job, gated on the PUBLISH_NPM
    repo variable + NPM_TOKEN secret, mirroring the crates.io gate). Sources
    live in npm/.

v0.8.0

31 May 02:25
6692fa4

v0.8.0 — proactive recall, selectable embedding model, full MCP control

The release that makes the brain proactive and gives the agent (and user)
full control over it from inside Claude Code / Codex.

ADDED

  • Proactive recall (mid-work). New PreToolUse / PostToolUse Bash hooks
    surface a relevant memory while the agent works, not just on prompt:
    • after a failed Bash command, surface a matching failure_pattern /
      command fix;
    • before a risky Bash command, warn from a matching failure_pattern;
    • on a repeated failing command (loop), lower the score floor and bypass
      the throttle so help arrives sooner.
    Retrieval is lexical-FTS-only (no embedding-model load), gated by a high
    score floor (0.45; loop 0.35), capped at one capsule, with per-session
    dedupe + a refractory throttle. Token cost stays ~0 (silent when nothing
    qualifies). Wired into both Claude Code (.claude/settings.json) and Codex
    (.codex/hooks.json) with a matcher: "Bash"; opt out with
    kimetsu plugin install --no-proactive (or proactive:false over MCP).
    Per-session state lives in <repo>/.kimetsu/proactive/<session_id>.json
    and is garbage-collected after 7 days.
  • Selectable embedding model. New [embedder] table in project.toml
    (precedence: KIMETSU_BRAIN_EMBEDDER env > config > default). Inspect and
    change it with kimetsu brain model list / kimetsu brain model set <id>
    (the latter writes the config and re-embeds the corpus with the new model).
    Curated built-ins: bge-small-en-v1.5 (384d, default), bge-m3 (1024d),
    jina-v2-base-code (768d).
  • Full MCP control surface. New tools so an agent can manage the brain
    without leaving Claude Code / Codex: kimetsu_brain_model_list /
    kimetsu_brain_model_set (re-embeds in-process with the new model),
    kimetsu_brain_reindex, kimetsu_brain_memory_search (FTS over memory
    text), kimetsu_brain_conflict_resolve, kimetsu_brain_prune, and
    kimetsu_brain_config_show. kimetsu_brain_memory_list and
    ..._memory_proposals gained limit/offset pagination.
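The gating rules for proactive recall described at the top of this list (score floor, loop override, one-capsule cap, per-session dedupe, refractory throttle) can be sketched together. The floors come from these notes; the 60-second refractory window, the state layout, and the function name are hypothetical.

```python
from dataclasses import dataclass, field

SCORE_FLOOR = 0.45       # normal floor per the notes
LOOP_SCORE_FLOOR = 0.35  # lowered floor for a repeated failing command


@dataclass
class SessionState:
    shown: set = field(default_factory=set)  # per-session dedupe
    last_fired_at: float = float("-inf")     # for the refractory throttle


def pick_capsule(candidates, state, now, is_loop=False, refractory_secs=60.0):
    """candidates: list of (memory_id, score). Return at most one memory id, or None."""
    floor = LOOP_SCORE_FLOOR if is_loop else SCORE_FLOOR
    # A loop bypasses the throttle so help arrives sooner.
    if not is_loop and now - state.last_fired_at < refractory_secs:
        return None
    eligible = [(score, mid) for mid, score in candidates
                if score >= floor and mid not in state.shown]
    if not eligible:
        return None  # stay silent: nothing qualifies, so no tokens are spent
    _, memory_id = max(eligible)  # cap at one capsule: the best-scoring one
    state.shown.add(memory_id)
    state.last_fired_at = now
    return memory_id
```

In this sketch a second qualifying memory shortly after the first is suppressed by the throttle, while a detected loop both lowers the floor and skips the throttle.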

CHANGED

  • reindex can now run against an explicit embedder
    (reindex_all_with_embedder), so a model switch re-embeds with the chosen
    model regardless of the process's cached default embedder.

FIXED

  • Test isolation. Tests created project roots under the system temp dir;
    on a machine whose $HOME is itself a git repo, ProjectPaths::discover
    climbed to $HOME and made parallel tests share one brain.db +
    project.lock. Each test root now gets its own git boundary, so plain
    cargo test is hermetic.