Episode consolidation: distill episodes into semantic knowledge #38

Description

@ahoward

Context

In cognitive architectures, episodic memory doesn't just accumulate — it consolidates. Repeated patterns across episodes get distilled into semantic knowledge (concepts) and procedural knowledge (rules). This prevents memory bloat and enables generalization.

Why

Without consolidation:

  • Memory grows linearly with every interaction
  • Agents can't generalize from experience ("I've failed at X three times" → "X is unreliable")
  • Stale episodes dilute relevant ones in search results

Design — Manual, User-Approved

Consolidation is manual and requires user approval. No automatic background processing. The LLM distillation step is too brittle and expensive to run unattended.

Workflow

brane consolidate --dry-run    # show proposed merges (diff)
brane consolidate --apply      # apply after user reviews diff

  1. Cluster: Group similar episodes by vector similarity (existing HNSW index)
  2. Propose: LLM generates a concept name + type for each cluster
  3. Diff: Output a human-readable diff showing which episodes → which concepts
  4. Wait for approval: User reviews and confirms
  5. Apply: Create concepts, create DERIVED_FROM edges, soft-archive source episodes
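
A rough sketch of the Cluster step, for discussion only. It uses brute-force cosine similarity and a greedy single pass instead of the existing HNSW index, and the names cosine and cluster_episodes are hypothetical, not part of brane. Python is used here only for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_episodes(
    embeddings: Dict[int, Sequence[float]], threshold: float = 0.85
) -> List[List[int]]:
    """Greedy clustering: an episode joins the first cluster whose seed
    (its first member) is at least `threshold` similar, otherwise it starts
    a new cluster. Singletons are dropped because there is nothing to merge."""
    clusters: List[List[int]] = []
    for episode_id in sorted(embeddings):
        for cluster in clusters:
            if cosine(embeddings[cluster[0]], embeddings[episode_id]) >= threshold:
                cluster.append(episode_id)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([episode_id])
    return [c for c in clusters if len(c) > 1]
```

Comparing against the cluster seed keeps the result deterministic for a given episode order, at the cost of being sensitive to that order. A real implementation would presumably query the HNSW index for neighbors instead.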

New Edge Type

DERIVED_FROM — links a distilled concept back to its source episodes. Preserves the reasoning chain.
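
To make the "reasoning chain" idea concrete, here is a minimal sketch of how a DERIVED_FROM edge might be represented and followed back to its source episodes. The Edge shape and the trace helper are assumptions for illustration, not brane's actual schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Edge:
    source: str    # concept name (hypothetical field)
    relation: str  # edge type, e.g. "DERIVED_FROM"
    target: int    # source episode id (hypothetical field)


def trace(concept: str, edges: List[Edge]) -> List[int]:
    """Return the ids of the episodes a concept was distilled from."""
    return sorted(
        e.target
        for e in edges
        if e.source == concept and e.relation == "DERIVED_FROM"
    )
```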

Output Format (dry-run)

Cluster 1 (3 episodes, similarity 0.92):
  - [#12] "auth middleware caused test timeout" (2026-03-20)
  - [#15] "auth middleware race condition under load" (2026-03-22)
  - [#19] "auth middleware flaky in CI" (2026-03-25)
  → Proposed concept: AuthMiddlewareFragility (Caveat)

Cluster 2 (2 episodes, similarity 0.88):
  ...

Apply? [y/N]
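
For reference, a small sketch of a renderer that would produce the cluster blocks shown above. The Episode and Proposal types and the render_dry_run name are hypothetical; the confirmation prompt is left to the CLI.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Episode:
    id: int
    summary: str
    date: str  # ISO date string


@dataclass
class Proposal:
    episodes: List[Episode]
    similarity: float
    concept: str
    concept_type: str


def render_dry_run(proposals: List[Proposal]) -> str:
    """Format proposed merges as human-readable cluster blocks."""
    blocks = []
    for n, p in enumerate(proposals, start=1):
        lines = [
            f"Cluster {n} ({len(p.episodes)} episodes, "
            f"similarity {p.similarity:.2f}):"
        ]
        lines += [f'  - [#{e.id}] "{e.summary}" ({e.date})' for e in p.episodes]
        lines.append(f"  → Proposed concept: {p.concept} ({p.concept_type})")
        blocks.append("\n".join(lines))
    # Blank line between clusters, as in the example output.
    return "\n\n".join(blocks)
```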

MCP Tool

consolidate — returns the dry-run diff as text. Agent presents to user for approval. No auto-apply via MCP.

Implementation Notes

  • Clustering: group episodes by vector similarity with configurable threshold (default 0.85)
  • LLM call for naming only — small, bounded, one call per cluster
  • DERIVED_FROM edges enable "why does brane think X?" → trace back to experiences
  • Source episodes are soft-archived (not deleted) after consolidation
  • Respects agent_id — consolidate per-agent by default
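
A minimal sketch of what applying one approved cluster could look like, assuming an in-memory store. The Store class and apply_cluster function are hypothetical stand-ins for brane's real storage layer; the point is only that archiving sets a flag and never deletes the episode.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Store:
    """Hypothetical in-memory stand-in for the real storage layer."""
    concepts: Dict[str, str] = field(default_factory=dict)  # name -> type
    edges: Set[Tuple[str, str, int]] = field(default_factory=set)
    archived: Set[int] = field(default_factory=set)  # soft-archived episode ids


def apply_cluster(
    store: Store, name: str, concept_type: str, episode_ids: List[int]
) -> None:
    """Create the concept, link it to its sources, and soft-archive them."""
    store.concepts[name] = concept_type
    for episode_id in episode_ids:
        store.edges.add((name, "DERIVED_FROM", episode_id))
        # Soft-archive: the episode is flagged, not removed.
        store.archived.add(episode_id)
```

Because concepts are keyed by name and edges live in a set, applying the same cluster twice leaves the store unchanged.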

Acceptance Criteria

  • brane consolidate --dry-run outputs proposed merges
  • brane consolidate --apply creates concepts + DERIVED_FROM edges
  • Source episodes soft-archived after apply
  • Per-agent consolidation (default) or cross-agent (explicit flag)
  • MCP tool returns dry-run diff only (no auto-apply)
  • Idempotent — already-consolidated episodes skipped
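
One way the per-agent and idempotency criteria might be checked is by filtering candidates before clustering. This is a sketch under assumptions: the Episode fields and the candidates function are hypothetical, and passing agent_id=None stands in for the explicit cross-agent flag, whose name is not decided here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Episode:
    id: int
    agent_id: str
    archived: bool = False  # set once the episode has been consolidated


def candidates(episodes: List[Episode], agent_id: Optional[str]) -> List[Episode]:
    """Select episodes eligible for consolidation.

    Already-archived episodes are skipped, so a second run proposes nothing
    new for them. agent_id=None means cross-agent consolidation."""
    return [
        e
        for e in episodes
        if not e.archived and (agent_id is None or e.agent_id == agent_id)
    ]
```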

Depends On
