feat(search): Layer 4 — provenance trust-tiering at recall rank time#126
Merged
Auto-captured transcript memories (source_client in HOOK_SOURCE_CLIENTS) could outrank deliberate user assertions in unprompted recall. This is the shape of the 2026-05-18 Desktop incident, where a hook-captured fragment dominated. The existing HOOK_SOURCE_CONFIDENCE_CEILING save cap only moves the 0.25-weighted confidence term (≤0.025 composite delta), which is too weak on its own.

Layer 4 of the 5-layer defense plan (private/mnemon-injection-defense-layers-260518.md): a flat PROVENANCE_DEMOTION_FACTOR (0.85) multiplier on the composite score for hook-sourced results, so a hook capture needs ~18% more relevance+recency to tie an equal-relevance user memory. Rank-only: explicit memory_get(id) bypasses composite scoring, so direct lookups are unaffected (the plan's stated boundary).

Provenance was not previously carried into search results, so this threads source_client through:
- store.SearchResult gains source_client; search_bm25 / search_vector SELECT and populate d.source_client.
- search.rrf_fuse preserves it across fusion; mmr_rerank already spreads all dataclass fields so it carries automatically.
- search.composite_score applies the demotion and carries source_client onto ScoredResult (observability).
- New config.PROVENANCE_DEMOTION_FACTOR, documented as stacking on the save-time confidence cap.

No S3/MCP schema change: server.py builds its output dict explicitly and does not expose source_client (additive-only contract preserved).

+6 tests (exact-factor demotion; non-hook sources untouched; source_client carried into ScoredResult; ranking; survives RRF; store-level provenance threading). Full suite 782 passed.

Pre-existing unrelated ruff error in upgrade.py:711 left untouched (not mine; CI runs pytest). CHANGELOG/version bump deferred to next batched chore: bump ritual PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
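As a rough illustration of the rank-time demotion described in this commit message, the sketch below scales a composite score by the stated 0.85 factor when the result is hook-sourced. Only PROVENANCE_DEMOTION_FACTOR, HOOK_SOURCE_CLIENTS, ScoredResult, and source_client are names from this PR. The helper apply_provenance_demotion, the doc_id and score fields, and the client strings "example-hook" and "user-cli" are assumptions made for the example, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional

# Value stated in the PR description.
PROVENANCE_DEMOTION_FACTOR = 0.85

# Hypothetical membership: the real HOOK_SOURCE_CLIENTS contents are not shown in the PR.
HOOK_SOURCE_CLIENTS = frozenset({"example-hook"})


@dataclass(frozen=True)
class ScoredResult:
    doc_id: int                   # assumed field name
    score: float                  # assumed field name
    source_client: Optional[str]  # carried through for observability


def apply_provenance_demotion(doc_id, composite, source_client):
    """Scale hook-sourced composite scores; leave every other source untouched."""
    if source_client in HOOK_SOURCE_CLIENTS:
        composite *= PROVENANCE_DEMOTION_FACTOR
    return ScoredResult(doc_id, composite, source_client)


# Two memories with the same pre-demotion composite score.
user = apply_provenance_demotion(1, 0.60, "user-cli")
hook = apply_provenance_demotion(2, 0.60, "example-hook")
ranked = sorted([hook, user], key=lambda r: r.score, reverse=True)

# A hook capture must score 1 / 0.85 times higher before demotion to tie,
# which is where the "~18% more" figure comes from.
break_even = 1 / PROVENANCE_DEMOTION_FACTOR
```

With equal inputs the user memory ranks first, and the break-even ratio works out to roughly 1.176. A multiplier on the final score is independent of how the composite is weighted internally, which fits the description of it stacking on the save-time confidence cap.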
This was referenced May 18, 2026
cipher813 added a commit that referenced this pull request May 18, 2026
…njection defense (#128)

Batched release bump for the four post-rc17 security PRs (#124 bare <system> defang, #125 Layer 0 capture-time rejection, #126 Layer 4 provenance trust-tiering, #127 Layer 1 spotlighting envelope), none of which bumped individually per the deferred-to-batched-ritual convention.
- pyproject.toml + src/mnemon/__init__.py: 0.6.0rc17 → 0.6.0rc18
- CHANGELOG.md: new [0.6.0rc18] Security section summarizing the five-layer plan's shipped layers + the deferred items

README PyPI badge is dynamic (shields.io/pypi/v), so no change. Suite 786 passed. Tag v0.6.0rc18 + GitHub Release + Fly redeploy are the post-merge deploy steps (per ROADMAP pre-deploy ritual), not part of this PR.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Context
Layer 4 of the 5-layer stored-injection defense plan (private/mnemon-injection-defense-layers-260518.md). Follows PR #124 (Layer 2, merged) and PR #125 (Layer 0, merged). Driver: memory #2362.

Problem
Auto-captured transcript memories (source_client ∈ HOOK_SOURCE_CLIENTS) could outrank deliberate user assertions in unprompted recall, which is the shape of the 2026-05-18 Desktop incident. The existing HOOK_SOURCE_CONFIDENCE_CEILING save-time cap only moves the 0.25-weighted confidence term (≤0.025 composite delta), which is too weak alone.

Change
A flat PROVENANCE_DEMOTION_FACTOR (0.85) multiplier on the composite score for hook-sourced results: a hook capture needs ~18% more relevance+recency to tie an equal-relevance user memory. Rank-only: explicit memory_get(id) bypasses composite scoring, so direct lookups are unaffected (the plan's stated boundary). Stacks on top of the save-time confidence cap.

Provenance wasn't previously carried into search results, so this threads source_client through the pipeline:
- store.SearchResult gains source_client; search_bm25 / search_vector SELECT + populate d.source_client
- search.rrf_fuse preserves it across fusion; mmr_rerank carries it automatically (spreads all dataclass fields)
- search.composite_score applies the demotion + carries source_client onto ScoredResult for observability
- config.PROVENANCE_DEMOTION_FACTOR, documented

No S3/MCP schema change: server.py builds its output dict explicitly and does not expose source_client (additive-only contract preserved).

Tests
+6: exact-factor demotion; non-hook sources untouched; source_client on ScoredResult; ranking below equal user memory; survives RRF fusion; store-level save→search provenance threading. Full suite: 782 passed.

Note: one pre-existing unrelated ruff error in upgrade.py:711 left untouched (not in scope; CI runs pytest, not ruff). CHANGELOG/version bump deferred to the next batched chore: bump ritual PR.

Remaining plan layers (separate PRs)
🤖 Generated with Claude Code
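
For readers unfamiliar with why provenance has to be threaded explicitly, the sketch below shows one way a reciprocal rank fusion step could keep source_client attached to each result while replacing its score. It is a generic RRF illustration, not the project's search.rrf_fuse: the signature, the doc_id and score fields, the k=60 constant, and the client strings are all assumptions made for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SearchResult:
    doc_id: int                          # assumed field name
    score: float                         # assumed field name
    source_client: Optional[str] = None  # provenance field added by this PR


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists by summing 1 / (k + rank) per document.

    Hypothetical signature; k=60 is a conventional RRF constant, assumed here.
    """
    fused = {}
    first_seen = {}
    for ranking in rankings:
        for rank, result in enumerate(ranking, start=1):
            fused[result.doc_id] = fused.get(result.doc_id, 0.0) + 1.0 / (k + rank)
            # Keep the first copy of each result so its provenance survives fusion.
            first_seen.setdefault(result.doc_id, result)
    # replace() copies every dataclass field and overrides only the score.
    merged = [replace(first_seen[d], score=s) for d, s in fused.items()]
    return sorted(merged, key=lambda r: r.score, reverse=True)


bm25 = [SearchResult(1, 7.2, "example-hook"), SearchResult(2, 5.1, "user-cli")]
vector = [SearchResult(1, 0.91, "example-hook"), SearchResult(3, 0.88)]
fused = rrf_fuse([bm25, vector])
provenance = {r.doc_id: r.source_client for r in fused}
```

Because the fused result is built by copying the original dataclass rather than constructing a new one from id and score alone, any field added later rides along without further changes to the fusion step.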