-
Notifications
You must be signed in to change notification settings - Fork 1
reflection and recovery
github-actions[bot] edited this page Jun 26, 2026
·
3 revisions
parent: memoryvault.md part_slug: reflection-and-recovery title: "Reflection sidecar + crash-recovery markers" status: pending visibility: published author: Alex Herrero contributors: [] created: 2026-05-15 updated: 2026-05-15 last_major_revision: 2026-05-15 dependencies: [write-primitives, recall-loop] estimated_scope: L prd: project:
Parent design: MemoryVault — see Detailed Design §2 (Reflection sidecar) and §6 (Crash recovery — session-id markers) for full architectural context. These two subsections are bundled because they share the Stop/idle hook scaffolding.
This part ships the write loop — the reflection logic that mines conversation transcripts for durable entries + the crash-recovery mechanism that ensures no session's learnings are lost when the Stop hook doesn't fire.
Three trigger surfaces for reflection (all running the same logic):
-
Manual
/memory reflect [--session <path>]— user-initiated; runs against the current session transcript by default or a specified transcript path. Validates the reflection logic against arbitrary input. - Stop-event hook — fires automatically on Claude Code session Stop. Scoped to the just-ended session's transcript.
-
Idle-time hook (new crickets primitive added by this part) — fires when Claude Code has been idle > N minutes (N TBD, default 30); also scans
.harness/session-id-*.startmarker files for orphans (crashed sessions where Stop didn't fire) and runs reflection retroactively on those transcripts.
Reflection logic (shared across all three triggers):
-
Read the session transcript at
~/.claude/projects/<repo>/<session-id>.jsonl. -
Two parallel mining passes:
- 3-category mine (writes to MemoryVault): scan for Successful Workflows, User Preferences, Fixes & Workarounds. Each candidate gets a confidence rating per the heuristic in step 3.
-
Idea-candidate mine (writes to user-vault
Ideas.md+ deep research in_idea-incubator/): scan for follow-ups, future project ideas, research candidates. Each candidate gets a 2-sentence summary + the agent does deep research. (The actual idea-ledger writing is handled by theidea-ledgerpart — this part just produces the candidates.)
-
Tri-modal routing for 3-category candidates per the locked heuristic:
-
HIGH = explicit user signal during session (user said "always X" / user manually corrected the agent / user locked a design call) → auto-save via
/memory save. -
MEDIUM = pattern-inferred (agent noticed pattern 3+ times) → interactive review prompt with approve / edit / reject / skip / supersede-existing-X options. Controlled by
memory.review_mode: interactive (default) | silent. -
LOW = single-instance inference → write to
MemoryVault/personal-private/_inbox/<slug>.mdfor batch review later.
-
HIGH = explicit user signal during session (user said "always X" / user manually corrected the agent / user locked a design call) → auto-save via
Crash-recovery markers:
-
SessionStart hook writes
.harness/session-id-<session-uuid>.start(one file per session, contents = session start timestamp + transcript path). -
Stop hook (after reflection succeeds) renames
.start→.reflected. On failure, file stays as.startfor retry. -
Idle-time hook scans for
.startfiles older than 1 hour (idle threshold for assuming the session is dead) → runs reflection retroactively → renames to.reflectedon success. -
GC:
.reflectedmarkers older than 30 days get deleted on next idle pass.
All markers live in .harness/ (gitignored, runtime-only).
-
write-primitives— reflection writes via/memory savefor HIGH-confidence candidates and the interactive review supersede path uses/memory evolve. Without write primitives, reflection has nowhere to land its output. -
recall-loop— interactive review's MEDIUM-confidence prompts surface "this candidate contradicts existing entry X — supersede?" decisions, which requires the recall engine to find the contradicting entry. Without recall, the supersede-existing-X option degenerates to plain approve/reject.
- Stop hook fires on session end — install at the 2 host destinations; run a Claude Code session; verify Stop hook fires + reflection runs against the just-ended session's transcript.
- Idle hook fires after N min idle — same; let session sit idle past the threshold; verify the hook fires.
- Reflection mines 3 categories — fixture transcript with seeded "user said 'always X'" / "user corrected the agent" / "hit error Z resolved by W" patterns; verify each pattern surfaces as a candidate in the correct category.
- Reflection mines idea candidates — fixture transcript with "we should also do X later" / "this could be its own project" patterns; verify each surfaces as an idea candidate (raw — idea-ledger part handles the write).
-
Tri-modal routing places entries correctly — HIGH candidates auto-save to canonical paths; MEDIUM candidates trigger interactive review prompts; LOW candidates land in
_inbox/. -
Interactive review prompts work —
memory.review_mode: interactivedefault; verify each MEDIUM candidate produces an approve/edit/reject/skip/supersede prompt; verify each option does the right thing. -
memory.review_mode: silenttoggle — flip the setting; verify MEDIUM candidates auto-save without prompting. -
Crashed-session marker recovery — start a session, kill Claude Code mid-session (so Stop never fires); verify
.startmarker stays in place; trigger idle hook (or wait for it); verify reflection runs retroactively against the orphan session's transcript + marker gets renamed to.reflected. -
Marker GC at 30 days — seed old
.reflectedmarkers; verify idle hook deletes them on next pass. -
Manual
/memory reflect— invoke directly against a specified transcript path; verify reflection runs + produces candidates routed per tri-modal logic. -
Smoke install verifies hooks land —
smoke-install-bash.sh+.ps1extended for both Stop + idle hook scripts at the 2 host destinations. - All 3 OS CI workflows green on the commit that lands this part.
- The idle-time hook is the new crickets primitive. The existing Stop-event hook pattern from plan #4+#5 (
commit-on-stop) is the reference shape; idle is similar but with a different trigger condition. The hook needs to know "how long has the agent been idle?" — Claude Code's hook lifecycle doesn't expose this directly, so the hook likely polls a heartbeat file the SessionStart writes + updates on every UserPromptSubmit. - The 3-category extraction heuristic is the load-bearing-but-uncertain piece. Ship instrumented — every mined candidate logs its category + confidence + rationale so the user can validate via the interactive review + manually inspect via
/memory inspect. - The "user said 'always X'" pattern detection is straightforward (keyword scan + nearby user-turn). The "user manually corrected the agent" pattern is harder — requires looking at agent turn → user reverts/corrects → agent applies correction. Defer fancy detection to a follow-up if the simple heuristic is too noisy.
- Idea-candidate mining lives in this part but the idea-ledger writes (Ideas.md +
_idea-incubator/) are the next part's scope. This part's reflection logic just emits idea candidates as in-memory objects; the idea-ledger part subscribes to those. -
.harness/session-id-*.startmarker contents: just the session start timestamp + transcript path is sufficient. Don't put PII (transcript content) in markers — they're runtime metadata.