
feat(capture-attention): env-var override + privacy-harden calibration fixture (0.7.0rc4)#162

Merged
cipher813 merged 3 commits into main from feat/capture-attention-activation-infra
May 24, 2026

@cipher813
Owner

Summary

Activation infrastructure for the Phase A capture-attention soak, ready for operator flag-flip on Fly.

  • MNEMON_CAPTURE_ATTENTION_ENABLED env-var override mirrors the standing-tier pattern (MNEMON_STANDING_TIER_ENABLED) — operator can flip activation on Fly via flyctl secrets set without a code change + redeploy. New store._capture_attention_enabled() helper called at request time.
  • mnemon attention-status reports the effective flag value (env-var applied), not the static config default.
  • Calibration script fixes unblocking the operator pre-soak workflow:
    • VecStore.get(vec_id) -> np.ndarray | None added — scripts/calibrate_capture_threshold.py:79 called the method but it didn't exist.
    • Near-neighbor pair sampling replaces uniform-random. Random pairs across a 2510-memory vault cluster at cosine 0.1–0.4 (uninformative); new sampler keeps every pair in the calibration decision region.
  • Privacy hardening: tests/fixtures/capture_attention_pairs.json is now gitignored — every operator run overwrites it with real vault content. Schema template at capture_attention_pairs.example.json.
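The env-var override described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name, the alias sets, and the fallback for unrecognized values are assumptions. Only the variable name MNEMON_CAPTURE_ATTENTION_ENABLED and the resolve-at-request-time behavior come from the description.

```python
import os

# Assumed alias sets; the PR only states that truthy/falsy aliases and
# surrounding whitespace are handled, not which spellings are accepted.
_TRUTHY = {"1", "true", "yes", "on"}
_FALSY = {"0", "false", "no", "off"}

# Stand-in for the static config default (config.CAPTURE_ATTENTION_ENABLED).
CAPTURE_ATTENTION_ENABLED = False


def capture_attention_enabled(default: bool = CAPTURE_ATTENTION_ENABLED) -> bool:
    """Resolve the effective flag at call time, letting the env var win."""
    raw = os.environ.get("MNEMON_CAPTURE_ATTENTION_ENABLED")
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUTHY:
        return True
    if value in _FALSY:
        return False
    # Assumption: unrecognized or empty values fall back to the config default.
    return default
```

Because the environment is read on every call rather than once at import, a caller such as a status command would report the effective value instead of the static default.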

Calibration result (2026-05-24, vault snapshot 2510 live memories)

| threshold | precision | recall |
|-----------|-----------|--------|
| 0.70 | 0.550 | 1.000 |
| 0.75 | 0.550 | 1.000 |
| 0.80 | 0.588 | 0.909 |
| 0.85 | 0.750 | 0.818 ← recommended |
| 0.90 | 0.750 | 0.545 |

Matches the existing CAPTURE_ATTENTION_THRESHOLD = 0.85 default, so no config change is needed. Precision is flat at 0.750 between 0.85 and 0.90 while recall falls from 0.818 to 0.545, so raising the threshold past 0.85 costs recall without improving precision.

Calibration-sample precision of 0.75 is naturally below the soak target of ≥0.80 because the sampler biases to near-neighbor pairs (every pair has cos ≥ 0.55, lots of edge negatives). Production gate (N≥2 distinct sessions) is much stricter than the single-pair test; soak target is measured against real save events.
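For readers who want to see how a precision/recall row is derived from labeled pairs, here is a small sketch. It assumes each calibration pair reduces to a cosine score plus an operator verdict (duplicate or not). The `sweep` function and the toy data are hypothetical and are not taken from scripts/calibrate_capture_threshold.py or from the real vault.

```python
def sweep(pairs, thresholds):
    """Return (threshold, precision, recall) rows for labeled pairs.

    pairs: iterable of (cosine, is_duplicate) tuples, where is_duplicate
    is the operator's verdict. A pair is predicted duplicate when its
    cosine is at or above the threshold.
    """
    pairs = list(pairs)
    rows = []
    for t in thresholds:
        tp = sum(1 for cos, dup in pairs if cos >= t and dup)
        fp = sum(1 for cos, dup in pairs if cos >= t and not dup)
        fn = sum(1 for cos, dup in pairs if cos < t and dup)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        rows.append((t, precision, recall))
    return rows


# Toy, made-up verdicts purely for illustration.
toy_pairs = [(0.95, True), (0.90, True), (0.86, False), (0.80, True), (0.60, False)]
rows = sweep(toy_pairs, [0.70, 0.85])
```

In this toy set, the 0.85 cut keeps two true duplicates and one false positive and misses one duplicate, giving precision and recall of 2/3 each.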

Post-merge operator workflow

  1. mnemon upgrade web --app-name mnemon-memory --mnemon-version 0.7.0rc4
  2. flyctl secrets set MNEMON_CAPTURE_ATTENTION_ENABLED=true -a mnemon-memory
  3. mnemon attention-status against live remote — should show Flag enabled : True
  4. ≥1 week soak observation. Acceptance: boost_rate ≤ 0.25 (auto) + ≥80% precision on 20-canonical manual review (operator).

Test plan

  • pytest tests/test_vecstore.py tests/test_capture_attention.py -v — 41 passed
  • Full suite: pytest --cov — 873 passed, coverage 86.46% (gate ≥80%)
  • Calibration smoke-tested against live /tmp/mnemon-prod-snap.sqlite
  • No private vault content in tree (git status shows D tests/fixtures/capture_attention_pairs.json; local file gitignored)
  • Operator: mnemon upgrade web after merge + PyPI publish
  • Operator: flyctl secrets set MNEMON_CAPTURE_ATTENTION_ENABLED=true
  • Operator: mnemon attention-status against live remote confirms flag flip

cipher813 added 3 commits May 24, 2026 07:15
scripts/calibrate_capture_threshold.py:79 calls vs.get(vec_id) but
VecStore had no such method, blocking the Phase A soak activation
workflow (calibration is the gate before flag-flip). Script was added
in #153 (capture-attention Phase A) but never run end-to-end.

Add get(vec_id) -> np.ndarray | None mirroring the has/delete
single-id shape; returns a defensive copy matching export_all's
mutation-safety contract. 3 new tests (returns vector, missing→None,
defensive-copy invariant). Suite 855 → 858 passing.
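The defensive-copy contract can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: `TinyVecStore` and its `add` method are hypothetical stand-ins, and the real VecStore storage layer is not shown in this PR. Only the `get(vec_id) -> np.ndarray | None` shape and the copy-on-return behavior come from the commit message.

```python
from __future__ import annotations

import numpy as np


class TinyVecStore:
    """Hypothetical in-memory stand-in, not mnemon's VecStore."""

    def __init__(self) -> None:
        self._vectors: dict[str, np.ndarray] = {}

    def add(self, vec_id: str, vector) -> None:
        # Copy on the way in so later caller mutations cannot leak into the store.
        self._vectors[vec_id] = np.asarray(vector, dtype=np.float32).copy()

    def get(self, vec_id: str) -> np.ndarray | None:
        """Return a copy of the stored vector, or None if the id is unknown."""
        vector = self._vectors.get(vec_id)
        return None if vector is None else vector.copy()
```

Returning a copy means a caller that normalizes or otherwise mutates the result in place leaves the stored vector untouched.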

Random sampling across a 2510-memory vault produces pair cosines
clustered at 0.1-0.4 (clearly-different topics) — operator verdicts on
those pairs carry no information about whether
CAPTURE_ATTENTION_THRESHOLD should be 0.80 or 0.85. The decision region
lives at cosine 0.70-0.95.

Replace with near-neighbor sampling: pick a random anchor, take its
top non-self neighbor via vs.search(), accept if cosine ≥ 0.55 (well
below the lowest calibration threshold so edge-negatives survive).
Pairs sorted by cosine descending so the operator sees high-confidence
near-dupes first.

Verified against /tmp/mnemon-prod-snap.sqlite (2510 live memories):
20-pair sample now spans cosine 0.751-0.999, entirely in the
calibration-relevant range, versus 4 of 4 obviously different pairs in
the prior uniform-random run.
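A rough sketch of the near-neighbor sampling idea follows. It swaps in a brute-force NumPy similarity matrix in place of vs.search(), and the function name, the deduplication of mirrored pairs, and the seed parameter are assumptions made for illustration. The 0.55 floor and the descending sort come from the commit message.

```python
import numpy as np


def sample_near_neighbor_pairs(vectors, n_pairs, min_cosine=0.55, seed=0):
    """Pick random anchors and pair each with its top non-self neighbor.

    vectors: 2-D array of non-zero row vectors (at least two rows).
    Returns (i, j, cosine) tuples sorted by cosine, highest first.
    """
    rng = np.random.default_rng(seed)
    vectors = np.asarray(vectors, dtype=np.float64)
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-matches

    seen = set()
    pairs = []
    for anchor in rng.permutation(len(unit)):
        anchor = int(anchor)
        neighbor = int(np.argmax(sims[anchor]))
        cosine = float(sims[anchor, neighbor])
        key = (min(anchor, neighbor), max(anchor, neighbor))
        # Assumption: skip pairs below the floor and mirrored duplicates.
        if cosine < min_cosine or key in seen:
            continue
        seen.add(key)
        pairs.append((key[0], key[1], cosine))
        if len(pairs) == n_pairs:
            break

    pairs.sort(key=lambda p: p[2], reverse=True)
    return pairs
```

The full similarity matrix is quadratic in the number of vectors, so this form only suits small illustrations. An index-backed search per anchor, as the commit describes, avoids materializing it.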
…n fixture (0.7.0rc4)

Activation infrastructure for the Phase A capture-attention soak.

env-var override:
- New MNEMON_CAPTURE_ATTENTION_ENABLED env var takes precedence over
  config.CAPTURE_ATTENTION_ENABLED. Mirrors the standing-tier pattern
  (MNEMON_STANDING_TIER_ENABLED) so operators can flip activation on
  Fly via `flyctl secrets set` without a code change + redeploy, and
  the next save picks it up without restarting the server. New
  store._capture_attention_enabled() helper called at request time
  from Store.save and cli attention-status.
- cli attention-status now reports the EFFECTIVE flag value (env-var
  override applied), not the static config default — Fly secret flips
  show up here immediately.
- 15 new tests (5 helper resolution cases x parametrize over
  truthy/falsy aliases + whitespace).

Privacy hardening of calibration output:
- tests/fixtures/capture_attention_pairs.json is now gitignored —
  every operator run overwrites it with real vault titles + snippets
  (personal context) that must not land in a public-repo commit. PR
  #153 originally shipped the path tracked with a placeholder schema;
  the placeholder moves to capture_attention_pairs.example.json so
  the format is still discoverable.

Suite 858 → 873 passing; coverage 86.46% (gate ≥80%).
Version 0.7.0rc3 → 0.7.0rc4.
@cipher813 cipher813 merged commit dfb18d2 into main May 24, 2026
10 checks passed
@cipher813 cipher813 deleted the feat/capture-attention-activation-infra branch May 24, 2026 14:49