feat(search): Layer 4 — provenance trust-tiering at recall rank time by cipher813 · Pull Request #126 · cipher813/mnemon

cipher813 · 2026-05-18T18:54:33Z

Context

Layer 4 of the 5-layer stored-injection defense plan (private/mnemon-injection-defense-layers-260518.md). Follows PR #124 (Layer 2, merged) and PR #125 (Layer 0, merged). Driver: memory #2362.

Problem

Auto-captured transcript memories (source_client ∈ HOOK_SOURCE_CLIENTS) could outrank deliberate user assertions in unprompted recall — the shape of the 2026-05-18 Desktop incident. The existing HOOK_SOURCE_CONFIDENCE_CEILING save-time cap only moves the 0.25-weighted confidence term (≤0.025 composite delta) — too weak alone.

Change

A flat PROVENANCE_DEMOTION_FACTOR (0.85) multiplier on the composite score for hook-sourced results — a hook capture needs ~18% more relevance+recency to tie an equal-relevance user memory. Rank-only: explicit memory_get(id) bypasses composite scoring, so direct lookups are unaffected (the plan's stated boundary). Stacks on top of the save-time confidence cap.
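The demotion and the "~18%" figure can be sketched as below. This is a minimal illustration, not the code in search.composite_score: only the constant names and the 0.85 value come from this PR, while the helper names and the membership of HOOK_SOURCE_CLIENTS are hypothetical.

```python
from typing import Optional

PROVENANCE_DEMOTION_FACTOR = 0.85  # value stated in this PR
# Hypothetical membership; the real set lives in mnemon's config.
HOOK_SOURCE_CLIENTS = frozenset({"example-hook"})


def demote_if_hook(composite: float, source_client: Optional[str]) -> float:
    """Scale the composite score for hook-sourced results; leave others unchanged."""
    if source_client in HOOK_SOURCE_CLIENTS:
        return composite * PROVENANCE_DEMOTION_FACTOR
    return composite


def required_uplift() -> float:
    """Fractional extra pre-demotion score a hook capture needs to tie a user memory."""
    return 1.0 / PROVENANCE_DEMOTION_FACTOR - 1.0
```

With a factor of 0.85, a hook capture ties only when its pre-demotion score is 1/0.85 ≈ 1.176 times the user memory's score, which is where the roughly 18% figure comes from.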

Provenance wasn't previously carried into search results, so this threads source_client through the pipeline:

- store.SearchResult gains source_client; search_bm25 / search_vector SELECT + populate d.source_client
- search.rrf_fuse preserves it across fusion; mmr_rerank carries it automatically (spreads all dataclass fields)
- search.composite_score applies the demotion + carries source_client onto ScoredResult for observability
- New config.PROVENANCE_DEMOTION_FACTOR, documented
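One way provenance can survive fusion is sketched below, assuming a simplified SearchResult. The fields doc_id and score, the k=60 constant, and the function body are illustrative assumptions, not mnemon's actual rrf_fuse.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SearchResult:
    doc_id: int  # hypothetical field name
    score: float  # hypothetical field name
    source_client: Optional[str] = None  # the field this PR adds


def rrf_fuse(rankings, k=60):
    """Reciprocal rank fusion over several ranked lists, keeping provenance."""
    fused_scores = {}
    first_seen = {}
    for ranking in rankings:
        for rank, result in enumerate(ranking, start=1):
            fused_scores[result.doc_id] = (
                fused_scores.get(result.doc_id, 0.0) + 1.0 / (k + rank)
            )
            # Keep the first copy of each result so its source_client is retained.
            first_seen.setdefault(result.doc_id, result)
    # replace() copies every field, including source_client, and swaps the score.
    fused = [
        replace(first_seen[doc_id], score=score)
        for doc_id, score in fused_scores.items()
    ]
    return sorted(fused, key=lambda r: r.score, reverse=True)
```

Copying the whole dataclass with replace(), rather than rebuilding the result field by field, means a newly added field such as source_client is carried without further changes to the fusion step.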

No S3/MCP schema change — server.py builds its output dict explicitly and does not expose source_client (additive-only contract preserved).
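The reason an explicitly built output dict keeps the external schema stable can be sketched as follows. The ScoredResult fields other than source_client, the to_output helper, and the output keys are hypothetical; server.py's real shape is not shown in this PR.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ScoredResult:
    memory_id: int  # hypothetical field name
    content: str  # hypothetical field name
    score: float  # hypothetical field name
    source_client: Optional[str] = None  # internal, for observability


def to_output(result: ScoredResult) -> dict:
    """Build the response by naming each key, so new internal fields stay private."""
    return {
        "id": result.memory_id,
        "content": result.content,
        "score": result.score,
    }


result = ScoredResult(7, "prefers tabs", 0.68, source_client="example-hook")
explicit = to_output(result)
# By contrast, serializing the whole dataclass would expose the new field.
leaky = asdict(result)
```

Here explicit has no source_client key while leaky does, which is the difference between an additive-only contract and an accidental schema change.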

Tests

+6: exact-factor demotion; non-hook sources untouched; source_client on ScoredResult; ranking below equal user memory; survives RRF fusion; store-level save→search provenance threading. Full suite: 782 passed.

Note: one pre-existing unrelated ruff error in upgrade.py:711 left untouched (not in scope; CI runs pytest, not ruff). CHANGELOG/version bump deferred to the next batched chore: bump ritual PR.

Remaining plan layers (separate PRs)

- Layer 1: spotlighting/data envelope (Claude Code preamble first, MCP best-effort)
- Layer 3: dual-representation storage (conditional; only if Layer 0 proves insufficient)

🤖 Generated with Claude Code

Auto-captured transcript memories (source_client in HOOK_SOURCE_CLIENTS) could outrank deliberate user assertions in unprompted recall: the shape of the 2026-05-18 Desktop incident, where a hook-captured fragment dominated. The existing HOOK_SOURCE_CONFIDENCE_CEILING save cap only moves the 0.25-weighted confidence term (≤0.025 composite delta), which is too weak on its own.

Layer 4 of the 5-layer defense plan (private/mnemon-injection-defense-layers-260518.md): a flat PROVENANCE_DEMOTION_FACTOR (0.85) multiplier on the composite score for hook-sourced results, so a hook capture needs ~18% more relevance+recency to tie an equal-relevance user memory. Rank-only: explicit memory_get(id) bypasses composite scoring, so direct lookups are unaffected (the plan's stated boundary).

Provenance was not previously carried into search results, so this threads source_client through:

- store.SearchResult gains source_client; search_bm25 / search_vector SELECT and populate d.source_client.
- search.rrf_fuse preserves it across fusion; mmr_rerank already spreads all dataclass fields so it carries automatically.
- search.composite_score applies the demotion and carries source_client onto ScoredResult (observability).
- New config.PROVENANCE_DEMOTION_FACTOR, documented as stacking on the save-time confidence cap.

No S3/MCP schema change: server.py builds its output dict explicitly and does not expose source_client (additive-only contract preserved).

+6 tests (exact-factor demotion; non-hook sources untouched; source_client carried into ScoredResult; ranking; survives RRF; store-level provenance threading). Full suite 782 passed.

Pre-existing unrelated ruff error in upgrade.py:711 left untouched (not mine; CI runs pytest). CHANGELOG/version bump deferred to next batched chore: bump ritual PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…njection defense (#128)

Batched release bump for the four post-rc17 security PRs (#124 bare <system> defang, #125 Layer 0 capture-time rejection, #126 Layer 4 provenance trust-tiering, #127 Layer 1 spotlighting envelope), none of which bumped individually per the deferred-to-batched-ritual convention.

- pyproject.toml + src/mnemon/__init__.py: 0.6.0rc17 → 0.6.0rc18
- CHANGELOG.md: new [0.6.0rc18] Security section summarizing the five-layer plan's shipped layers + the deferred items

README PyPI badge is dynamic (shields.io/pypi/v), so no change. Suite 786 passed.

Tag v0.6.0rc18 + GitHub Release + Fly redeploy are the post-merge deploy steps (per ROADMAP pre-deploy ritual), not part of this PR.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cipher813 merged commit bc676f9 into main May 18, 2026
9 checks passed

cipher813 deleted the feat/layer4-provenance-trust-tiering branch May 18, 2026 19:33

This was referenced May 18, 2026

feat(hooks): Layer 1 — spotlighting envelope for recalled context (Claude Code path) #127

Merged

chore: bump version to 0.6.0rc18 + CHANGELOG for the layered stored-injection defense #128

Merged
