fix(audit): provenance + fallback gaps in recall and consolidation #46

Merged: amitpaz1 merged 1 commit into main from audit-fixes-provenance-fallback on May 8, 2026

amitpaz1 commented on May 8, 2026

Summary

Three gaps surfaced by the May 2026 recall-path audit, all in one PR:

  • Gap 1 — dream subagent destroyed provenance. New atomic consolidate_memories(source_ids, content, type, reason) MCP tool + POST /v1/memories/consolidate route create the new memory and supersede every source in one shot. Dream prompt steps 1 (CONSOLIDATE) and 3 (PROMOTE) now route through it instead of bare remember(...) + forget(...).
  • Gap 2 — silent fallback on empty recall. _hybrid_recall now returns a HybridRetrieveReport carrying pre-min_score best_score and a per-retriever attempted map (vector / fts / graph → ok|empty|error). Route surfaces both fields. MCP recall empty-result branch now lists concrete fallback tools (search, list_memories, recent_activity, topics, graph_query) and echoes active filters so a near-miss-by-filter is visible. Drops the swallowing try/except from _safe_text_recall and the bulk path of _safe_graph_recall so real failures reach gather(return_exceptions=True) and the diagnostic is accurate; degradation behavior unchanged because _normalize still maps BaseException → [].
  • Gap 3 — provenance was untyped JSON with no read surface. New list_supersession_sources(memory_id) Store method on Postgres + SQLite + protocol; new GET /v1/memories/{id}/provenance route returns sources (events superseded by this id), chain (this id's own supersessions), and metadata_sources as fallback for pre-Phase-6F consolidations. New provenance(memory_id) MCP tool formats all three.
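
As a rough illustration of the Gap 2 empty-result branch, the sketch below builds a message that reports the best pre-threshold score, echoes the active filters, and names the fallback tools. Only the tool names come from this description; the function name `format_empty_recall`, its parameters, and the message wording are assumptions made for the example, not the PR's actual code.

```python
from typing import Mapping, Optional

# Tool names are taken from the PR description; everything else is hypothetical.
FALLBACK_TOOLS = ("search", "list_memories", "recent_activity", "topics", "graph_query")


def format_empty_recall(
    query: str,
    filters: Mapping[str, object],
    best_score: Optional[float] = None,
    min_score: Optional[float] = None,
) -> str:
    """Build an empty-result message that exposes near-misses and next steps."""
    lines = [f"No memories matched {query!r}."]
    # Report the best pre-threshold score so a near-miss is
    # distinguishable from a query that matched nothing at all.
    if best_score is not None and min_score is not None:
        lines.append(
            f"Best score before filtering: {best_score:.2f} (min_score={min_score:.2f})."
        )
    # Echo only the filters that were actually set.
    active = {key: value for key, value in filters.items() if value is not None}
    if active:
        rendered = ", ".join(f"{key}={value!r}" for key, value in sorted(active.items()))
        lines.append(f"Active filters: {rendered}.")
    lines.append("Try instead: " + ", ".join(FALLBACK_TOOLS) + ".")
    return "\n".join(lines)
```

Calling `format_empty_recall("deploy steps", {"project": "lore", "type": None}, best_score=0.28, min_score=0.3)` would mention the `project` filter and the 0.28 near-miss while omitting the unset `type` filter.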

Test plan

  • ruff check src/ tests/ — clean
  • pytest tests/test_hybrid_retrieval.py tests/services/test_temporal_supersession.py tests/test_mcp_provenance_and_recall.py tests/test_mcp.py tests/test_dreams.py — 152 passed
  • 13 new tests covering: SQLite + PG persistence for list_supersession_sources; consolidate_memories service helper; HTTP /v1/memories/consolidate happy path + 404 + dedup; /v1/memories/{id}/provenance with sources + chain + metadata_sources; MCP consolidate_memories body shape + failure path; MCP provenance formatter; MCP recall empty-result fallback hints; HybridRetrieveReport.best_score survives min_score filter; attempted map per retriever
  • Existing tests in tests/test_hybrid_retrieval.py and tests/services/test_temporal_supersession.py updated to read .results from the new report return shape
  • CI green

Out of scope

  • Retrofitting the legacy ConsolidationEngine (src/lore/consolidation.py) to write memory_supersessions rows alongside its existing soft-archive behavior. The provenance endpoint reads meta.consolidated_from as a fallback so classic-engine artifacts surface, but the typed audit trail isn't populated for that path. Worth a follow-up PR; not blocking this audit fix.

🤖 Generated with Claude Code

Three gaps identified in the recall path audit (May 2026), all
contributing to the "compiled knowledge layer drift" failure mode
flagged in the Pinecone Nexus / Karpathy LLM-wiki write-up:

Gap 1 — dream subagent destroyed provenance.
  Phase 6E's `claude -p` subagent was instructed to call
  `mcp__lore__remember` + `mcp__lore__forget` directly when promoting
  observations to lessons or merging duplicates. Neither tool wrote a
  `memory_supersessions` row, so the synthesized lesson had no link
  back to its source claims.
  Fix: new atomic `consolidate_memories(source_ids, content, type,
  reason)` MCP tool and `POST /v1/memories/consolidate` route that
  create the new memory and supersede every source in one operation.
  Dream prompt steps 1 (CONSOLIDATE near-duplicates) and 3 (PROMOTE
  observations) now route through it.
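
The atomicity described for Gap 1 can be sketched with a single SQLite transaction. This is a minimal illustration, not the PR's implementation: only the `memory_supersessions` table name and the `consolidate_memories` argument list come from the text above, while the column names, the `LookupError` choice, and the `make_demo_db` helper are assumptions.

```python
import sqlite3
import uuid


def consolidate_memories(conn, source_ids, content, type_, reason):
    """Create one memory and supersede every source in a single transaction."""
    new_id = uuid.uuid4().hex
    # `with conn` commits on success and rolls back if anything raises,
    # so a missing source leaves no half-written consolidation behind.
    with conn:
        conn.execute(
            "INSERT INTO memories (id, content, type) VALUES (?, ?, ?)",
            (new_id, content, type_),
        )
        for source_id in dict.fromkeys(source_ids):  # de-duplicate, keep order
            found = conn.execute(
                "SELECT 1 FROM memories WHERE id = ?", (source_id,)
            ).fetchone()
            if found is None:
                raise LookupError(f"Source memory not found: {source_id}")
            conn.execute(
                "INSERT INTO memory_supersessions (old_id, new_id, reason) "
                "VALUES (?, ?, ?)",
                (source_id, new_id, reason),
            )
    return new_id


def make_demo_db():
    """Hypothetical two-table schema, just enough to run the sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE memories (id TEXT PRIMARY KEY, content TEXT, type TEXT)")
    conn.execute(
        "CREATE TABLE memory_supersessions (old_id TEXT, new_id TEXT, reason TEXT)"
    )
    conn.executemany(
        "INSERT INTO memories VALUES (?, ?, ?)",
        [("a", "obs 1", "observation"), ("b", "obs 2", "observation")],
    )
    conn.commit()
    return conn
```

With the separate `remember` + `forget` calls, a failure between the two steps could leave a lesson with no link to its sources; wrapping both writes in one transaction removes that window.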

Gap 2 — recall fallback was silent.
  `RetrieveResponse` exposed per-hit `score` but no aggregate, so an
  empty result was indistinguishable from a near-miss. The MCP recall
  empty branch returned a flat sentence with no fallback hints.
  Fix: `_hybrid_recall` now returns a `HybridRetrieveReport` carrying
  pre-min_score `best_score` and a per-retriever `attempted` map
  (vector / fts / graph → ok|empty|error). Route surfaces both fields.
  MCP recall empty message now lists concrete fallback tools
  (search / list_memories / recent_activity / topics / graph_query)
  and echoes active filters so a near-miss-by-filter is visible.
  Drops try/except from `_safe_text_recall` and `_safe_graph_recall`'s
  bulk path so real exceptions reach the gather; degradation
  semantics unchanged because `_normalize` still maps
  BaseException → [].
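
A minimal sketch of the Gap 2 reporting idea follows. The field names `results`, `best_score`, and `attempted` and the `_normalize` behavior come from the text above; the field types, the `(memory_id, score)` hit shape, and the `hybrid_recall` signature are assumptions for illustration.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class HybridRetrieveReport:
    results: list
    best_score: Optional[float]  # best score seen before the min_score cut
    attempted: dict = field(default_factory=dict)  # retriever -> ok|empty|error


def _normalize(outcome):
    """Map a raised exception to an empty hit list, preserving degradation."""
    return [] if isinstance(outcome, BaseException) else outcome


async def hybrid_recall(retrievers, query, min_score):
    """Run every retriever concurrently and report what each one did."""
    names = list(retrievers)
    # return_exceptions=True hands failures back as values instead of raising,
    # so one broken retriever cannot hide the others' results.
    outcomes = await asyncio.gather(
        *(retrievers[name](query) for name in names), return_exceptions=True
    )
    attempted, hits = {}, []
    for name, outcome in zip(names, outcomes):
        if isinstance(outcome, BaseException):
            attempted[name] = "error"
        else:
            attempted[name] = "ok" if outcome else "empty"
        hits.extend(_normalize(outcome))
    # Hits are assumed to be (memory_id, score) pairs.
    best = max((score for _, score in hits), default=None)
    kept = [hit for hit in hits if hit[1] >= min_score]
    return HybridRetrieveReport(results=kept, best_score=best, attempted=attempted)
```

Because the retrievers no longer swallow their own exceptions, the `attempted` map can tell an `error` apart from an `empty` result, while the caller still receives a plain (possibly empty) list in `results`.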

Gap 3 — provenance was untyped JSON with no read surface.
  Classic-engine consolidations stored `consolidated_from` in the
  freeform `meta` blob; no API exposed the reverse lookup.
  Fix: new `list_supersession_sources(memory_id)` Store method on
  Postgres + SQLite + protocol; `GET /v1/memories/{id}/provenance`
  returns sources (events superseded by this id), chain (this id's
  own supersessions), and `metadata_sources` as a fallback for
  pre-Phase-6F consolidations. New `provenance(memory_id)` MCP tool
  formats all three.
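
The three-part provenance response for Gap 3 might be assembled roughly as below. The keys `sources`, `chain`, and `metadata_sources` and the `consolidated_from` meta field come from the text above; the row shape (`old_id` / `new_id`), the `build_provenance` name, and the reading of "chain" as the ids that later superseded this memory are assumptions.

```python
def build_provenance(memory_id, supersessions, meta):
    """Assemble sources, chain, and metadata_sources for one memory."""
    rows = list(supersessions)
    # sources: memories this id superseded (it is the "new" side of the row).
    sources = [row["old_id"] for row in rows if row["new_id"] == memory_id]
    # chain: ids that superseded this memory in turn (assumed interpretation).
    chain = [row["new_id"] for row in rows if row["old_id"] == memory_id]
    # Fallback for consolidations that only recorded their sources in the
    # freeform meta blob and never wrote typed supersession rows.
    metadata_sources = list(meta.get("consolidated_from") or [])
    return {
        "memory_id": memory_id,
        "sources": sources,
        "chain": chain,
        "metadata_sources": metadata_sources,
    }
```

Keeping `metadata_sources` separate from `sources` lets a reader see whether a link comes from the typed audit trail or from legacy metadata.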

Tests: 13 new tests across `tests/services/test_temporal_supersession.py`
and a new `tests/test_mcp_provenance_and_recall.py` covering
list_supersession_sources, consolidate_memories service helper, the
HTTP routes' happy + error paths, MCP tool body shape and response
formatting, recall fallback message content, and the report dataclass
on `_hybrid_recall`. Existing hybrid-retrieval and supersession tests
updated to read `.results` from the new report return shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

amitpaz1 merged commit ae1aa93 into main on May 8, 2026 (6 checks passed)

amitpaz1 added a commit that referenced this pull request on May 8, 2026:

HttpStore._request returns the raw response on 4xx responses (including
404) instead of raising, so the MCP temporal wrapper was formatting
error bodies as ``{"status": "ok", "id": null, "superseded_count": null}``.
A smoke test on the freshly merged PR #46 surfaced this: calling
consolidate_memories with bogus source ids reported success despite
the route returning 404 "Source memory not found: nope".

Fix: _temporal_request now inspects status_code and raises
RuntimeError with the server's message / detail / error field on any
non-2xx. The existing ``except Exception as e: return f"Failed to ..."``
clauses on supersede / consolidate_memories / provenance /
supersession_chain / list_at_time turn this into a clear Failed-to
message at the MCP boundary.
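
The status check described above can be sketched as follows. This is an illustration under assumptions, not the code of `_temporal_request`: the `FakeResponse` stand-in, the `raise_for_error_body` name, and the exact error string are hypothetical; only the non-2xx rule, the `RuntimeError`, and the `message` / `detail` / `error` fields come from the commit message.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FakeResponse:
    """Stand-in for an HTTP response; hypothetical, for demonstration only."""

    status_code: int
    body: Any

    def json(self):
        return self.body


def raise_for_error_body(response):
    """Return the parsed body on 2xx, otherwise raise with the server's message."""
    if 200 <= response.status_code < 300:
        return response.json()
    try:
        body = response.json()
    except ValueError:  # body was not valid JSON
        body = None
    message = None
    if isinstance(body, dict):
        # Check the three fields named in the commit message, in that order.
        for key in ("message", "detail", "error"):
            if body.get(key):
                message = body[key]
                break
    raise RuntimeError(f"HTTP {response.status_code}: {message or 'request failed'}")
```

A caller that wraps this in `except Exception as e: return f"Failed to ...: {e}"` then reports the 404 text instead of a misleading `"status": "ok"` body.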

Test plan: 4 new tests in test_mcp_provenance_and_recall.py covering
2xx pass-through, 404 with message, 422 with FastAPI's "detail"
field, and the end-to-end consolidate_memories surface that proved
the bug live.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>