Self-correcting memory for LLM agents. A Claude Code plugin that learns from sessions, surfaces relevant memories, measures whether they're actually followed, and fixes the ones that aren't.
Claude Code has several instruction sources — CLAUDE.md, rules, skills — but no way to know if they're working. An instruction that's always loaded but never followed wastes context budget. A great instruction that only matches narrow keywords goes unseen. Without measurement, instruction sets decay: duplicates accumulate, stale guidance persists, and useful patterns stay buried.
Engram hooks into every phase of a Claude Code session to create a closed feedback loop:
Learn ──→ Surface ──→ Maintain
↑ │
└──────────────────────┘
-
Learn — Extracts structured memories from user corrections, instructions, and contextual facts. Each memory is a TOML file with tier-based metadata: title, content, principle, anti-pattern, keywords, concepts, and confidence tier (A/B/C).
-
Surface — At every prompt and tool use, retrieves relevant memories via BM25 keyword scoring and injects them as context. A per-hook token budget caps total injection to avoid overwhelming the model.
-
Maintain — Periodic diagnosis generates proposals for each memory based on its effectiveness quadrant. Proposals are applied with user confirmation: rewrites for stale content, keyword broadening for hidden gems, escalation for persistent violations, deletion for noise.
Engram manages one instruction source — memories — and cross-references against other Claude Code sources for deduplication:
| Source type | Description | Surfacing behavior |
|---|---|---|
memory |
TOML files in ~/.claude/engram/data/memories/ |
Keyword-matched per prompt via BM25 |
Cross-reference sources (used for suppression, not managed by engram): CLAUDE.md, .claude/rules/, plugin skills.
Each memory TOML embeds its own registry data: content hash, surfaced count, last surfaced timestamp, evaluation counters (followed/contradicted/ignored), and enforcement level.
Every time a memory is surfaced, the registry increments its surfaced_count and updates last_surfaced. At session end, the evaluator classifies each surfaced memory's outcome:
- Followed — The model's behavior aligned with the instruction
- Contradicted — The model did the opposite of what the instruction said
- Ignored — The instruction was surfaced but had no observable effect
From these counters, engram computes effectiveness (followed / total evaluations) and frecency (recency-weighted frequency with configurable half-life decay).
The registry classifies every instruction into one of four quadrants:
| High effectiveness | Low effectiveness | |
|---|---|---|
| Often surfaced | Working — Keep as-is | Leech — Rewrite or escalate |
| Rarely surfaced | Hidden Gem — Broaden keywords | Noise — Remove |
All memories are classified uniformly by the same thresholds. An additional "Insufficient" quadrant applies when evaluation data is below the minimum threshold.
Each quadrant has a prescribed action:
- Working — Check for staleness; rewrite if content is outdated
- Leech — Diagnose root cause (content quality, wrong keywords, enforcement gap) and either rewrite content or escalate enforcement
- Hidden Gem — LLM suggests additional keywords/concepts to increase surfacing coverage
- Noise — Delete the memory TOML file
Maintenance proposals are generated by engram maintain and applied interactively via engram maintain --apply --proposals <file>.
For persistent leech memories (instructions that keep getting violated), engram applies a graduated escalation ladder:
- Advisory — Standard surfacing (default)
- Emphasized advisory — Surfaced with emphasis markers
- Reminder — Reminder injected after tool use via PostToolUse hook
Escalation level is tracked per-memory.
Engram builds a directed graph of relationships between memories. When a memory is surfaced, spreading activation propagates to related memories, allowing conceptually linked instructions to surface together even when keyword overlap is low.
Links are created automatically via merge-on-write: when new memories are stored, engram detects duplicates and near-duplicates, merging them and preserving relationship edges. Links are re-computed after each merge to keep the graph consistent.
Engram hooks into 7 Claude Code hook points:
| Phase | Hook | What happens |
|---|---|---|
| Start | SessionStart |
Build binary if stale. Run maintain/triage. Notify user that /recall is available. |
| Prompt | UserPromptSubmit |
Surface prompt-relevant memories (BM25). Detect inline corrections (UC-3). |
| Prompt (async) | UserPromptSubmit |
Incremental learning extraction from transcript delta. |
| Tool use | PreToolUse |
Surface tool-specific memories (e.g., file-path-relevant instructions). |
| After tool | PostToolUse |
Surface memories relevant to the tool call and its output. |
| Tool failure | PostToolUseFailure |
Diagnose errors and surface relevant memories. |
| Compact | PreCompact |
Flush: learning extraction. |
| End | Stop |
Flush: learning extraction. |
Previous session context is loaded on demand via the /recall skill, which summarizes recent transcripts from the same project using Haiku. This keeps session-start lightweight and avoids injecting stale context automatically.
All data lives in ~/.claude/engram/data/:
| File | Purpose |
|---|---|
memories/*.toml |
Structured memory files with embedded registry data (surfaced count, evaluation counters, enforcement level) |
surfacing-log.jsonl |
Running log of which memories were surfaced and when |
learn-offset.json |
Offset tracking for incremental transcript learning |
Each memory is a TOML file with structured fields:
title = "Use targ for builds"
content = "Always use targ build system instead of raw go commands"
observation_type = "workflow_preference"
concepts = ["build-system", "tooling"]
keywords = ["targ", "build", "go test", "go vet"]
principle = "Use targ test, targ check, targ build for all operations"
anti_pattern = "Running go test or go vet directly"
rationale = "targ encodes hard-won lessons about build configuration"
confidence = "A"Confidence tiers: A (explicit instruction — "always/never/remember"), B (teachable correction), C (contextual fact). Anti-patterns are required for tier A, optional for B, empty for C.
Requires Go 1.25+.
# Clone and install as a Claude Code plugin
git clone https://github.com/toejough/engram.git
# Add to Claude Code — the plugin auto-builds on first session
claude plugin add /path/to/engramThe binary auto-builds on first hook invocation and rebuilds when Go source files change.
cmd/engram/ CLI entry point (thin wiring layer)
internal/ Business logic (33 packages, all DI boundaries)
hooks/ Shell scripts for Claude Code hook integration
skills/ Plugin skills (memory triage)
.claude-plugin/ Plugin manifest (plugin.json)
docs/specs/ Specification artifacts (UC, REQ, DES, ARCH, TEST)
- DI everywhere — No function in
internal/callsos.*,http.*, or any I/O directly. All I/O through injected interfaces, wired atcmd/andcli.goedges. - Pure Go, no CGO — TF-IDF/BM25 for retrieval instead of ONNX. External embedding API if vector similarity needed.
- Fire and forget — Registry writes never block the critical path. Write failures are logged but don't fail the operation.
- Measure impact, not frequency — A memory surfaced 1000 times but never followed is a leech, not a success.
23 use cases across 5 specification layers (UC → REQ/DES/ARCH → TEST → IMPL).
| UC | Name |
|---|---|
| UC-1 | Session Learning |
| UC-2 | Hook-Time Surfacing & Enforcement |
| UC-3 | Remember & Correct |
| UC-6 | Memory Effectiveness Review |
| UC-14 | Structured Session Continuity |
| UC-15 | Automatic Outcome Signal |
| UC-16 | Unified Memory Maintenance |
| UC-17 | Context Budget Management |
| UC-18 | PostToolUse Proactive Reminders |
| UC-21 | Enforcement Escalation Ladder |
| UC-23 | Unified Instruction Registry |
| UC-24 | Proposal Application |
| UC-27 | Global Binary Installation |
| UC-28 | Automatic Maintenance and Promotion Triggers |
| UC-32 | Memory Graph with Spreading Activation |
| UC-33 | Merge-on-Write |
| UC-34 | Pre-Classification Duplicate Consolidation |
| UC-P1-1 | Cross-Source Contradiction Detection |
| UC-P5f-1 | Re-compute Links After Merge |
See LICENSE for details.