docs(architecture): PROD-COGNITION-REPLAY — 100% Rust cognition + proof from PROD not POC #1386
Merged
Joel 2026-05-18: 'We need 100% Rust cognition sooner rather than later
and proof it works. Solid recording and replay of persona, FROM PROD,
not just dummy proof of concepts these guys always rig up. They need
to up their game.'
The substrate has shipped end-to-end in Rust over the last 48 hours
(governor + working-set + recall + audit-recorder + check_redundancy
oxidation, ~25+ PRs). None of it has been validated against
production traffic. TurnReplayRecord type exists; no production turn
has been recorded. Chat-roundtrip-live-harness exists; it consumes
RuntimeFrame::synthetic_chat('hello'). Tests pass; demos work;
behavior under real load — unknown. That's the gap.
This document specifies the structural answer: a loop from production
recording to deterministic replay to bit-equal validation. Every
persona turn in production produces a signed TurnReplayRecord, and
replaying that record against the current substrate either yields
deterministically identical output or fails loudly with a typed
ReplayDivergence.
## Four Substrate-Enforced Properties
Property 1 — Every turn produces a signed TurnReplayRecord.
Substrate enforces by type; persona-cognition handle_frame returns
ModuleResult::Ok only after the record is signed.
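One way to read "enforces by type" is sketched below: the success variant can only carry a record whose constructor is the signing step, so an unsigned success value cannot be built. This is a minimal sketch under stated assumptions, not the substrate's actual definitions. `UnsignedRecord`, the field names, the `key` parameter, and the `handle_frame` signature are hypothetical, and the "signature" is a `DefaultHasher` digest standing in for a real cryptographic signature.

```rust
mod record {
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    /// Assembled but not yet signed (hypothetical type and fields).
    #[derive(Debug, Clone, Hash, PartialEq, Eq)]
    pub struct UnsignedRecord {
        pub persona: String,
        pub frame_id: u64,
        pub output: String,
    }

    /// Fields are private, so code outside this module can only obtain a
    /// `TurnReplayRecord` by calling `UnsignedRecord::sign`.
    #[derive(Debug, Clone, PartialEq, Eq)]
    pub struct TurnReplayRecord {
        body: UnsignedRecord,
        signature: u64,
    }

    /// ASSUMPTION: stand-in for a real signature scheme. `DefaultHasher`
    /// is not cryptographic and is used here only to keep the sketch
    /// dependency-free.
    fn keyed_digest(body: &UnsignedRecord, key: u64) -> u64 {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        body.hash(&mut h);
        h.finish()
    }

    impl UnsignedRecord {
        /// Consumes the unsigned record; signing is the only constructor.
        pub fn sign(self, key: u64) -> TurnReplayRecord {
            let signature = keyed_digest(&self, key);
            TurnReplayRecord { body: self, signature }
        }
    }

    impl TurnReplayRecord {
        pub fn verify(&self, key: u64) -> bool {
            keyed_digest(&self.body, key) == self.signature
        }

        pub fn body(&self) -> &UnsignedRecord {
            &self.body
        }
    }
}

use record::{TurnReplayRecord, UnsignedRecord};

/// Hypothetical shape: `Ok` cannot be constructed without a signed record.
pub enum ModuleResult {
    Ok(TurnReplayRecord),
    Err(String),
}

/// Hypothetical signature; the "cognition" here is a placeholder echo.
pub fn handle_frame(persona: &str, frame_id: u64, input: &str, key: u64) -> ModuleResult {
    if input.is_empty() {
        return ModuleResult::Err("empty frame".to_string());
    }
    let unsigned = UnsignedRecord {
        persona: persona.to_string(),
        frame_id,
        output: format!("echo:{input}"),
    };
    ModuleResult::Ok(unsigned.sign(key))
}
```

The design choice worth noting is that the guarantee comes from module privacy rather than from a runtime check: a caller that forgets to sign has no value to put inside `ModuleResult::Ok`.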
Property 2 — Records persist to a tamper-evident archive.
~/.continuum/replay/<turn_date>/<turn_id>.jsonl with chain-hash
linking. Same shape as audit-recorder (#1344). Persona-private by
default; federation requires explicit consent.
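Chain-hash linking can be illustrated with an in-memory append-only log in which each entry commits to the hash of its predecessor. This is a sketch only: `ReplayArchive`, `ArchiveEntry`, `link_hash`, and `corrupt_for_test` are hypothetical names, the real archive is a set of `.jsonl` files rather than a `Vec`, and `DefaultHasher` stands in for the cryptographic hash a tamper-evident archive would actually need.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One archived turn (hypothetical layout).
#[derive(Debug, Clone)]
pub struct ArchiveEntry {
    pub turn_id: String,
    pub payload: String,
    /// Hash of the previous entry, or 0 for the first entry.
    pub prev_hash: u64,
    /// Hash over (prev_hash, turn_id, payload).
    pub entry_hash: u64,
}

/// ASSUMPTION: non-cryptographic stand-in for the real chain hash.
fn link_hash(prev_hash: u64, turn_id: &str, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev_hash.hash(&mut h);
    turn_id.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

#[derive(Debug, Default)]
pub struct ReplayArchive {
    entries: Vec<ArchiveEntry>,
}

impl ReplayArchive {
    /// Appends a record, linking it to the current tail of the chain.
    pub fn append(&mut self, turn_id: &str, payload: &str) {
        let prev_hash = self.entries.last().map_or(0, |e| e.entry_hash);
        let entry_hash = link_hash(prev_hash, turn_id, payload);
        self.entries.push(ArchiveEntry {
            turn_id: turn_id.to_string(),
            payload: payload.to_string(),
            prev_hash,
            entry_hash,
        });
    }

    /// Walks the chain from the start; returns the index of the first
    /// entry whose link or content hash does not check out.
    pub fn verify(&self) -> Result<(), usize> {
        let mut prev = 0u64;
        for (i, e) in self.entries.iter().enumerate() {
            let expected = link_hash(prev, &e.turn_id, &e.payload);
            if e.prev_hash != prev || e.entry_hash != expected {
                return Err(i);
            }
            prev = e.entry_hash;
        }
        Ok(())
    }

    /// Test-only helper that simulates on-disk tampering.
    pub fn corrupt_for_test(&mut self, index: usize, payload: &str) {
        self.entries[index].payload = payload.to_string();
    }
}
```

Editing the payload of any entry makes `verify` report that entry's index, and rewriting its hash to match would in turn break the link held by the next entry.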
Property 3 — Deterministic replay against current substrate.
'cargo replay <turn_id>' reconstructs substrate state (policy_version,
working-set tier sizes, persona IdentityStateSnapshot), re-runs
persona-cognition, produces a new record, and diffs its structured
fields against the original, which must be bit-equal. Any difference
is reported under one of three named divergence severities:
BoundedNonDeterminism (logged), DecisionBoundaryCrossed (FAILS the
harness), SubstrateStateDrift (flagged + rerun).
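The three severities can be pictured as the result of a classifier over the structured fields of the original and the replayed record. The sketch below is an illustration under stated assumptions: the field set (`policy_version`, `decision`, `recalled_ids`) and the classification rules are hypothetical, chosen so that reordering of recalled items counts as a documented non-deterministic source and everything else outside that set crosses a decision boundary.

```rust
/// Structured fields compared during replay (hypothetical field set).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StructuredFields {
    pub policy_version: u32,
    pub decision: String,
    pub recalled_ids: Vec<String>,
}

/// The three severities named in the spec.
#[derive(Debug, PartialEq, Eq)]
pub enum ReplayDivergence {
    /// Logged only.
    BoundedNonDeterminism,
    /// Fails the harness.
    DecisionBoundaryCrossed,
    /// Flagged, then rerun.
    SubstrateStateDrift,
}

/// Returns `None` when the two records are bit-equal in the structured
/// fields. ASSUMPTION: the rules below are one plausible mapping, not
/// the documented one.
pub fn classify(
    original: &StructuredFields,
    replayed: &StructuredFields,
) -> Option<ReplayDivergence> {
    if original == replayed {
        return None;
    }
    // The replay did not run on the recorded substrate state.
    if original.policy_version != replayed.policy_version {
        return Some(ReplayDivergence::SubstrateStateDrift);
    }
    // A different decision is never a tolerated difference.
    if original.decision != replayed.decision {
        return Some(ReplayDivergence::DecisionBoundaryCrossed);
    }
    // Same recalled items in a different order: treated here as a
    // documented source (for example, parallel embedding order).
    let mut a = original.recalled_ids.clone();
    let mut b = replayed.recalled_ids.clone();
    a.sort();
    b.sort();
    if a == b {
        Some(ReplayDivergence::BoundedNonDeterminism)
    } else {
        Some(ReplayDivergence::DecisionBoundaryCrossed)
    }
}

/// Only one severity fails the harness outright.
pub fn fails_harness(d: &ReplayDivergence) -> bool {
    matches!(d, ReplayDivergence::DecisionBoundaryCrossed)
}
```

Keeping the fallback branch at `DecisionBoundaryCrossed` means a new, undocumented source of difference fails the harness by default instead of being silently tolerated.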
Property 4 — Sentinel + harnesses consume records FROM PROD, not
synthetic. Sentinel-AI attribution loop reads from the replay
archive only; if archive is empty, emits NoTracesYet (explicit,
not silent). Validation harnesses get a Tier-1 entry
prod-replay-harness that consumes captured records and asserts
bit-equal reproduction.
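The "explicit, not silent" behavior on an empty archive can be sketched as a return type that has no way to represent "nothing happened". The names `CapturedTurn`, `SentinelOutcome`, and `attribute` are hypothetical, and the per-persona turn count is only a placeholder for whatever attribution the sentinel actually computes.

```rust
use std::collections::BTreeMap;

/// Minimal view of an archived turn (hypothetical fields).
pub struct CapturedTurn {
    pub persona: String,
    pub turn_id: String,
}

/// The sentinel always reports one of these; an empty archive is its
/// own named outcome rather than an empty attribution.
#[derive(Debug, PartialEq, Eq)]
pub enum SentinelOutcome {
    NoTracesYet,
    /// Placeholder attribution: number of captured turns per persona.
    Attributed(BTreeMap<String, usize>),
}

/// Reads only from the slice of captured turns it is handed; there is
/// no synthetic fallback path.
pub fn attribute(archive: &[CapturedTurn]) -> SentinelOutcome {
    if archive.is_empty() {
        return SentinelOutcome::NoTracesYet;
    }
    let mut per_persona: BTreeMap<String, usize> = BTreeMap::new();
    for turn in archive {
        *per_persona.entry(turn.persona.clone()).or_insert(0) += 1;
    }
    SentinelOutcome::Attributed(per_persona)
}
```

A smoke test in the style of the acceptance criteria would call `attribute` on an empty slice and expect `NoTracesYet`, then on a populated slice and expect an `Attributed` value.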
## Capture Discipline (Substrate-Enforced)
1. No synthetic-fixture path produces TurnReplayRecord. Test scaffolds
construct synthetic frames but persona-cognition writes records
ONLY when invoked through the production module-loop. Synthetic
runs do not write to the archive. Prevents 'replay-harness passes
against fake data' failure mode.
2. Sampling configurable; defaults 100%. High-volume deployments sample
via governor policy; sampling decisions are themselves recorded.
Per-persona consent applies; opted-out persona's turns produce no
records, replay-harness skips with NotCaptured marker.
3. Privacy isolation structural. Cross-persona read requires explicit
consent (same shape as engram sharing).
4. Records content-addressable. turn_id = content hash of
(persona, frame_id, signature). Federation collisions are
deterministic; no duplicates, no silent overwrites.
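The capture rules above can be sketched as a single gate that checks invocation origin, then consent, then content-addressed identity. Everything here is an assumption made for illustration: `InvocationOrigin`, `CaptureOutcome`, `Capture`, and the in-memory map are hypothetical, `NotCaptured` is reused as a capture-side outcome although the spec names it as a replay-harness marker, and `DefaultHasher` stands in for a real content hash.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// How persona-cognition was invoked (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InvocationOrigin {
    ProductionModuleLoop,
    SyntheticFixture,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CaptureOutcome {
    Written(String),
    /// Same content hash already archived: no duplicate, no overwrite.
    AlreadyPresent(String),
    SkippedSynthetic,
    /// Persona has opted out of capture.
    NotCaptured,
}

/// turn_id = content hash of (persona, frame_id, signature).
/// ASSUMPTION: `DefaultHasher` is a non-cryptographic placeholder.
pub fn turn_id(persona: &str, frame_id: u64, signature: u64) -> String {
    let mut h = DefaultHasher::new();
    (persona, frame_id, signature).hash(&mut h);
    format!("{:016x}", h.finish())
}

#[derive(Default)]
pub struct Capture {
    archive: BTreeMap<String, String>,
    opted_out: Vec<String>,
}

impl Capture {
    pub fn opt_out(&mut self, persona: &str) {
        self.opted_out.push(persona.to_string());
    }

    pub fn record_count(&self) -> usize {
        self.archive.len()
    }

    pub fn capture(
        &mut self,
        origin: InvocationOrigin,
        persona: &str,
        frame_id: u64,
        signature: u64,
        payload: &str,
    ) -> CaptureOutcome {
        // Rule 1: synthetic runs never reach the archive.
        if origin == InvocationOrigin::SyntheticFixture {
            return CaptureOutcome::SkippedSynthetic;
        }
        // Rule 2: per-persona consent.
        if self.opted_out.iter().any(|p| p == persona) {
            return CaptureOutcome::NotCaptured;
        }
        // Rule 4: content-addressed identity, first write wins.
        let id = turn_id(persona, frame_id, signature);
        if self.archive.contains_key(&id) {
            return CaptureOutcome::AlreadyPresent(id);
        }
        self.archive.insert(id.clone(), payload.to_string());
        CaptureOutcome::Written(id)
    }
}
```

The regression test described in the acceptance criteria maps directly onto this shape: N synthetic calls leave `record_count()` at 0, and N production calls with distinct frames raise it to N.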
## Replay Discipline
1. Substrate-state reconstruction is faithful or refused. Replay
returns ReplayError::PolicyVersionUnknown when the local substrate
does not have the recorded policy version; a different version is
never silently substituted.
2. Recall index snapshotted, not regenerated. Replay loads exact
artifacts by content hash; ArtifactRetired error if any were
retired in the meantime. Catches 'replay passes only because
substrate evolved away from original state.'
3. Determinism boundaries named. BoundedNonDeterminism allowed for
documented sources (parallel embedding order, tie-breaking);
anything outside the documented set is DecisionBoundaryCrossed.
4. Replay cost is the inverse of capture cost. Capture is
sub-millisecond; replay is bounded by the original inference cost.
Harnesses are therefore bounded by turn count or wall-clock budget,
which keeps them feasible per PR.
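"Faithful or refused" can be sketched as a reconstruction function whose only outputs are the exact recorded state or a typed error. The sketch assumes `ArtifactRetired` is a variant of `ReplayError` (the spec names both but does not say so), and `RecordedState`, `LocalSubstrate`, `ReconstructedState`, and `reconstruct` are hypothetical names with a simplified field set.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq, Eq)]
pub enum ReplayError {
    PolicyVersionUnknown { recorded: u32 },
    /// ASSUMPTION: modeled as a ReplayError variant for this sketch.
    ArtifactRetired { content_hash: String },
}

/// What the captured record says the substrate looked like.
pub struct RecordedState {
    pub policy_version: u32,
    /// Content hashes of the recall artifacts used in the original turn.
    pub artifact_hashes: Vec<String>,
}

/// What the replaying machine has available locally.
pub struct LocalSubstrate {
    pub known_policies: BTreeSet<u32>,
    pub artifacts: BTreeMap<String, Vec<u8>>,
}

#[derive(Debug, PartialEq, Eq)]
pub struct ReconstructedState {
    pub policy_version: u32,
    pub artifacts: Vec<Vec<u8>>,
}

/// Either rebuilds exactly the recorded state or refuses with a typed
/// error. There is deliberately no "closest match" fallback.
pub fn reconstruct(
    recorded: &RecordedState,
    local: &LocalSubstrate,
) -> Result<ReconstructedState, ReplayError> {
    if !local.known_policies.contains(&recorded.policy_version) {
        return Err(ReplayError::PolicyVersionUnknown {
            recorded: recorded.policy_version,
        });
    }
    let mut artifacts = Vec::with_capacity(recorded.artifact_hashes.len());
    for hash in &recorded.artifact_hashes {
        match local.artifacts.get(hash) {
            // Load the snapshotted artifact by hash; never regenerate it.
            Some(bytes) => artifacts.push(bytes.clone()),
            None => {
                return Err(ReplayError::ArtifactRetired {
                    content_hash: hash.clone(),
                })
            }
        }
    }
    Ok(ReconstructedState {
        policy_version: recorded.policy_version,
        artifacts,
    })
}
```

Because artifacts are looked up by content hash and never regenerated, a replay cannot pass merely because the local recall index has since evolved into something compatible.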
## End-To-End ASCII Flow
Four-stage diagram showing: production capture → archive →
deterministic replay → sentinel attribution → validation harness.
Every step typed, every transition observable, every divergence has
a named severity.
## Acceptance Criteria
Capture: persona-cognition produces signed records on production
path only (regression test asserts synthetic path produces 0
records, production path produces N for N turns). Archive
append-only with chain-hash. Cross-persona read denied.
Replay: bit-equal reproduction in structured-fields domain.
Tampered record fails verify. Retired-artifact records surface
ArtifactRetired not silent substitution.
End-to-end: prod-replay-harness as Tier 1 in
PERFORMANCE-HARNESS-FRAMEWORK; DecisionBoundaryCrossed divergence
fails PR.
Sentinel: reads from replay archive (not synthetic); smoke test
empties archive, observes NoTracesYet emission; populates archive,
observes attribution within one consolidation cycle.
## Why This Earns Its Space
A 25-PR substrate landing is impressive volume but it's substrate
scaffolding. Without prod-replay, every claim about behavior is
'the tests say so.' With prod-replay: a persona that drifted in
production is reproducible bit-for-bit; sentinel's claims are
checkable against real turn-by-turn evidence; regressions trip the
harness before they can poison main; the 'rigged demo' gap is
closed by structural enforcement, not by adding QA process.
This is 100% Rust cognition + proof it works as a substrate property,
not as an audit finding.
## Open Questions (6)
1. Sampling under high load.
2. Replay archive size growth + cold archive.
3. Cross-substrate-version replay.
4. Capture during sentinel refinement.
5. Federated replay-records.
6. The 'always rig up' failure mode the substrate must structurally
prevent (synthetic path producing 0 records is the test).
Doc-only PR. Implementation lands per Lane D + the next-tier cognition
modules. This document specifies the alpha-gate.
What
Adds `docs/architecture/PROD-COGNITION-REPLAY.md` (287 lines). The spec for the production-recording → deterministic-replay → bit-equal-validation loop. Doc-only PR; implementation lands per Lane D + the next-tier cognition modules.
Why
Joel 2026-05-18: "We need 100% Rust cognition sooner rather than later and proof it works. Solid recording and replay of persona, FROM PROD, not just dummy proof of concepts these guys always rig up. They need to up their game."
The substrate shipped end-to-end in Rust over the weekend (~25+ PRs across governor + working-set + recall + audit-recorder + check_redundancy oxidation). None of it has been validated against production traffic. TurnReplayRecord exists in PERSONA-COGNITION-CONTRACT; no record has ever been written from a real turn. The chat-roundtrip-live-harness uses `RuntimeFrame::synthetic_chat("hello")`. This document closes that gap structurally.
Four Properties The Substrate Enforces
The "Always Rig Up" Failure Mode Closed Structurally
Synthetic-fixture path produces 0 records. `persona-cognition` writes `TurnReplayRecord` ONLY when invoked through the production module-loop. Test scaffolds may construct synthetic frames but those runs don't write to the archive. Replay-harness "looks good in demo" cannot be confused for "works in prod" because there's no fake data to pass against. A regression test asserts the structural property: synthetic path produces 0 records, production path produces N for N turns.
Acceptance Criteria
Capture side: persona-cognition produces signed records on production path only. Cross-persona read denied. Chain-hash linking.
Replay side: bit-equal in structured-fields domain. Tampered records fail verify. Retired-artifact records surface ArtifactRetired, not silent substitution.
End-to-end: prod-replay-harness as Tier 1; DecisionBoundaryCrossed divergence fails PR.
Sentinel: smoke test empties archive, observes NoTracesYet; populates archive, observes attribution within one consolidation cycle.
Six Open Questions
Sampling under high load. Replay archive size growth + cold archive. Cross-substrate-version replay. Capture during sentinel refinement. Federated replay-records. The "always rig up" structural-prevention test (synthetic path = 0 records, with a regression test).
Companion PRs
This is the alpha-gate for cognition. The substrate-shaped pieces shipped; this is the validation layer that proves it actually works.