docs(architecture): PROD-COGNITION-REPLAY — 100% Rust cognition + proof from PROD not POC#1386

Merged: joelteply merged 1 commit into canary from joel/docs-prod-cognition-replay on May 18, 2026
@joelteply
Contributor

What

Adds `docs/architecture/PROD-COGNITION-REPLAY.md` (287 lines). The spec for the production-recording → deterministic-replay → bit-equal-validation loop. Doc-only PR; implementation lands per Lane D + the next-tier cognition modules.

Why

Joel 2026-05-18: "We need 100% Rust cognition sooner rather than later and proof it works. Solid recording and replay of persona, FROM PROD, not just dummy proof of concepts these guys always rig up. They need to up their game."

The substrate shipped end-to-end in Rust over the weekend (~25+ PRs across governor + working-set + recall + audit-recorder + check_redundancy oxidation). None of it has been validated against production traffic. TurnReplayRecord exists in PERSONA-COGNITION-CONTRACT; no record has ever been written from a real turn. The chat-roundtrip-live-harness uses `RuntimeFrame::synthetic_chat("hello")`. This document closes that gap structurally.

Four Properties The Substrate Enforces

  1. Every turn produces a signed TurnReplayRecord. Substrate enforces by type: persona-cognition's handle_frame returns ModuleResult::Ok only after the record is signed.
  2. Records persist to a tamper-evident archive. `~/.continuum/replay/<turn_date>/<turn_id>.jsonl` with chain-hash linking; persona-private by default; same shape as audit-recorder.
  3. Deterministic replay against current substrate. `cargo replay <turn_id>` reconstructs substrate state and re-runs persona-cognition; bit-equal in structured fields, or fails loud with typed ReplayComparison (BoundedNonDeterminism / DecisionBoundaryCrossed / SubstrateStateDrift).
  4. Sentinel + harnesses consume records FROM PROD, not synthetic. Sentinel-AI's attribution loop reads from the replay archive only; if empty, emits NoTracesYet (explicit, not silent). `prod-replay-harness` is added as Tier 1 in PERFORMANCE-HARNESS-FRAMEWORK.
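
Property 1 ("enforced by type") can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `record` module, the field layout, the `sign` helper, and the FNV-1a placeholder "signature" are all assumptions made for the example. The point is only that `ModuleResult::Ok` carries a `SignedRecord`, and a `SignedRecord` cannot be constructed outside the module except through `sign`.

```rust
// Hypothetical sketch: every name and signature here is an assumption for illustration.
pub mod record {
    /// A turn record that has been built but not yet signed.
    #[derive(Debug, Clone)]
    pub struct UnsignedRecord {
        pub turn_id: String,
        pub payload: String,
    }

    /// A signed record. All fields are private, so code outside this module
    /// can only obtain one by calling `sign`.
    #[derive(Debug, Clone)]
    pub struct SignedRecord {
        turn_id: String,
        payload: String,
        signature: u64,
    }

    impl SignedRecord {
        pub fn turn_id(&self) -> &str { &self.turn_id }
        pub fn payload(&self) -> &str { &self.payload }
        pub fn signature(&self) -> u64 { self.signature }
    }

    /// Placeholder signer: FNV-1a over key + contents. NOT a real signature scheme.
    pub fn sign(record: UnsignedRecord, key: &[u8]) -> SignedRecord {
        let mut h: u64 = 0xcbf29ce484222325;
        let bytes = key
            .iter()
            .chain(record.turn_id.as_bytes())
            .chain(record.payload.as_bytes());
        for b in bytes {
            h ^= u64::from(*b);
            h = h.wrapping_mul(0x100000001b3);
        }
        SignedRecord { turn_id: record.turn_id, payload: record.payload, signature: h }
    }
}

use record::{sign, SignedRecord, UnsignedRecord};

/// The success variant can only be built from a `SignedRecord`.
#[derive(Debug)]
pub enum ModuleResult {
    Ok(SignedRecord),
    Err(String),
}

/// Stand-in for the cognition step: build the record, sign it, then return Ok.
pub fn handle_frame(turn_id: &str, frame_text: &str, key: &[u8]) -> ModuleResult {
    if frame_text.is_empty() {
        return ModuleResult::Err("empty frame".to_string());
    }
    let unsigned = UnsignedRecord {
        turn_id: turn_id.to_string(),
        payload: format!("reply to: {frame_text}"),
    };
    ModuleResult::Ok(sign(unsigned, key))
}
```

With this shape, "forgot to sign" is a compile error rather than a runtime check, because there is no way to write `ModuleResult::Ok(...)` without first producing a `SignedRecord`.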

The "Always Rig Up" Failure Mode Closed Structurally

"these guys always rig up" — Joel naming the failure: a working demo that doesn't survive production.

The synthetic-fixture path produces 0 records. `persona-cognition` writes a `TurnReplayRecord` ONLY when invoked through the production module-loop. Test scaffolds may construct synthetic frames, but those runs don't write to the archive. A replay-harness run that "looks good in demo" cannot be mistaken for "works in prod", because the archive holds no fake data for it to pass against. A regression test asserts the structural property: the synthetic path produces 0 records, and the production path produces N records for N turns.
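
The regression property described above (synthetic path writes 0 records, production path writes N for N turns) might look roughly like this. It is a sketch under stated assumptions: `InvocationPath`, `ReplayArchive`, and `run_turn` are hypothetical names, and the real substrate may distinguish the two paths differently (for example by construction rather than by a flag).

```rust
// Hypothetical sketch: these types are illustrative, not the project's API.

/// How the cognition module was invoked.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InvocationPath {
    ProductionModuleLoop,
    SyntheticFixture,
}

/// In-memory stand-in for the replay archive.
#[derive(Debug, Default)]
pub struct ReplayArchive {
    records: Vec<String>,
}

impl ReplayArchive {
    pub fn len(&self) -> usize {
        self.records.len()
    }

    pub fn is_empty(&self) -> bool {
        self.records.is_empty()
    }
}

/// Runs one turn. The reply is always produced, but a record is written
/// only when the turn came through the production module-loop.
pub fn run_turn(path: InvocationPath, frame_text: &str, archive: &mut ReplayArchive) -> String {
    let reply = format!("reply to: {frame_text}");
    if path == InvocationPath::ProductionModuleLoop {
        archive.records.push(reply.clone());
    }
    reply
}
```

A regression test over this sketch runs N turns through each path and asserts that the archive length is 0 for `SyntheticFixture` and N for `ProductionModuleLoop`.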

Acceptance Criteria

Capture side: persona-cognition produces signed records on production path only. Cross-persona read denied. Chain-hash linking.

Replay side: bit-equal in structured-fields domain. Tampered records fail verify. Retired-artifact records surface ArtifactRetired, not silent substitution.

End-to-end: prod-replay-harness as Tier 1; DecisionBoundaryCrossed divergence fails PR.

Sentinel: smoke test empties archive, observes NoTracesYet; populates archive, observes attribution within one consolidation cycle.

Six Open Questions

  1. Sampling under high load.
  2. Replay archive size growth + cold archive.
  3. Cross-substrate-version replay.
  4. Capture during sentinel refinement.
  5. Federated replay-records.
  6. The "always rig up" structural-prevention test (synthetic path = 0 records, with a regression test).

Companion PRs

This is the alpha-gate for cognition. The substrate-shaped pieces shipped; this is the validation layer that proves it actually works.

Joel 2026-05-18: 'We need 100% Rust cognition sooner rather than later
and proof it works. Solid recording and replay of persona, FROM PROD,
not just dummy proof of concepts these guys always rig up. They need
to up their game.'

The substrate has shipped end-to-end in Rust over the last 48 hours
(governor + working-set + recall + audit-recorder + check_redundancy
oxidation, ~25+ PRs). None of it has been validated against
production traffic. TurnReplayRecord type exists; no production turn
has been recorded. Chat-roundtrip-live-harness exists; it consumes
RuntimeFrame::synthetic_chat('hello'). Tests pass; demos work;
behavior under real load — unknown. That's the gap.

This document specifies the structural answer: a production-recording
to deterministic-replay to bit-equal-validation loop where every
persona turn in production produces a signed TurnReplayRecord that
can be replayed against current substrate with deterministic-identical
output, or fails loud with a typed ReplayDivergence.

## Four Substrate-Enforced Properties

Property 1 — Every turn produces a signed TurnReplayRecord.
Substrate enforces by type; persona-cognition handle_frame returns
ModuleResult::Ok only after the record is signed.

Property 2 — Records persist to a tamper-evident archive.
~/.continuum/replay/<turn_date>/<turn_id>.jsonl with chain-hash
linking. Same shape as audit-recorder (#1344). Persona-private by
default; federation requires explicit consent.
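
To make "chain-hash linking" concrete, here is a minimal sketch of an append-only chain where each entry commits to the hash of the previous one. Everything here is assumed for illustration: the `ChainedEntry` layout, the `append` and `verify` helpers, and especially the hash. FNV-1a is used only so the example has no dependencies; it is not cryptographic, and a real tamper-evident archive would presumably use a cryptographic digest.

```rust
// Hypothetical sketch: layout and helper names are assumptions, and FNV-1a is a
// non-cryptographic stand-in for whatever digest the real archive uses.

fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

#[derive(Debug, Clone)]
pub struct ChainedEntry {
    pub payload: String,
    /// Hash of the previous entry (0 for the first entry).
    pub prev_hash: u64,
    /// Hash over `prev_hash` followed by `payload`.
    pub hash: u64,
}

fn entry_hash(prev_hash: u64, payload: &str) -> u64 {
    let mut buf = prev_hash.to_le_bytes().to_vec();
    buf.extend_from_slice(payload.as_bytes());
    fnv1a(&buf)
}

/// Appends a new entry linked to the current tail of the chain.
pub fn append(chain: &mut Vec<ChainedEntry>, payload: &str) {
    let prev_hash = chain.last().map_or(0, |e| e.hash);
    chain.push(ChainedEntry {
        payload: payload.to_string(),
        prev_hash,
        hash: entry_hash(prev_hash, payload),
    });
}

/// Walks the chain and returns the index of the first entry that fails to verify.
pub fn verify(chain: &[ChainedEntry]) -> Result<(), usize> {
    let mut expected_prev = 0u64;
    for (i, entry) in chain.iter().enumerate() {
        let link_ok = entry.prev_hash == expected_prev;
        let hash_ok = entry.hash == entry_hash(entry.prev_hash, &entry.payload);
        if !(link_ok && hash_ok) {
            return Err(i);
        }
        expected_prev = entry.hash;
    }
    Ok(())
}
```

Editing any stored payload changes what `entry_hash` recomputes for that entry, so `verify` reports its index; rewriting the entry's own hash to match would instead break the link to the next entry.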

Property 3 — Deterministic replay against current substrate.
'cargo replay <turn_id>' reconstructs substrate state (policy_version,
working-set tier sizes, persona IdentityStateSnapshot), re-runs
persona-cognition, produces a new record, diffs structured fields
bit-equal. Three named divergence severities:
BoundedNonDeterminism (logged), DecisionBoundaryCrossed (FAILS the
harness), SubstrateStateDrift (flagged + rerun).
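
One way to picture the three divergence severities and what the harness does with each is sketched below. The variant names come from this description; the payload fields, the `Snapshot` type, the `compare` logic (policy-version mismatch treated as drift, a documented set of non-deterministic fields), and `HarnessAction` are assumptions made for the example.

```rust
// Hypothetical sketch: variant names are from the text above; everything else is assumed.
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ReplayComparison {
    BoundedNonDeterminism { field: String },
    DecisionBoundaryCrossed { field: String },
    SubstrateStateDrift { recorded: String, replayed: String },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HarnessAction {
    Pass,
    LogOnly,
    Fail,
    FlagAndRerun,
}

/// Structured fields of one turn plus the policy version it ran under.
pub struct Snapshot {
    pub policy_version: String,
    pub fields: BTreeMap<String, String>,
}

impl Snapshot {
    pub fn new(policy_version: &str, fields: &[(&str, &str)]) -> Self {
        Snapshot {
            policy_version: policy_version.to_string(),
            fields: fields.iter().map(|(k, v)| (k.to_string(), v.to_string())).collect(),
        }
    }
}

/// Returns `None` when structured fields are equal, otherwise the most severe divergence.
pub fn compare(
    recorded: &Snapshot,
    replayed: &Snapshot,
    documented_nondeterminism: &BTreeSet<&str>,
) -> Option<ReplayComparison> {
    if recorded.policy_version != replayed.policy_version {
        return Some(ReplayComparison::SubstrateStateDrift {
            recorded: recorded.policy_version.clone(),
            replayed: replayed.policy_version.clone(),
        });
    }
    let keys: BTreeSet<&String> = recorded.fields.keys().chain(replayed.fields.keys()).collect();
    let mut bounded = None;
    for key in keys {
        if recorded.fields.get(key) == replayed.fields.get(key) {
            continue;
        }
        if !documented_nondeterminism.contains(key.as_str()) {
            // Any difference outside the documented set is a decision-boundary crossing.
            return Some(ReplayComparison::DecisionBoundaryCrossed { field: key.clone() });
        }
        if bounded.is_none() {
            bounded = Some(ReplayComparison::BoundedNonDeterminism { field: key.clone() });
        }
    }
    bounded
}

/// Maps a comparison result to the action named for it in the text above.
pub fn action_for(outcome: Option<&ReplayComparison>) -> HarnessAction {
    match outcome {
        None => HarnessAction::Pass,
        Some(ReplayComparison::BoundedNonDeterminism { .. }) => HarnessAction::LogOnly,
        Some(ReplayComparison::DecisionBoundaryCrossed { .. }) => HarnessAction::Fail,
        Some(ReplayComparison::SubstrateStateDrift { .. }) => HarnessAction::FlagAndRerun,
    }
}
```

The `match` in `action_for` is exhaustive, so adding a fourth severity later would force a decision about its harness behaviour at compile time.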

Property 4 — Sentinel + harnesses consume records FROM PROD, not
synthetic. Sentinel-AI attribution loop reads from the replay
archive only; if archive is empty, emits NoTracesYet (explicit,
not silent). Validation harnesses get a Tier-1 entry
prod-replay-harness that consumes captured records and asserts
bit-equal reproduction.

## Capture Discipline (Substrate-Enforced)

1. No synthetic-fixture path produces TurnReplayRecord. Test scaffolds
   construct synthetic frames but persona-cognition writes records
   ONLY when invoked through the production module-loop. Synthetic
   runs do not write to the archive. Prevents 'replay-harness passes
   against fake data' failure mode.

2. Sampling configurable; defaults 100%. High-volume deployments sample
   via governor policy; sampling decisions are themselves recorded.
   Per-persona consent applies; opted-out persona's turns produce no
   records, replay-harness skips with NotCaptured marker.

3. Privacy isolation structural. Cross-persona read requires explicit
   consent (same shape as engram sharing).

4. Records content-addressable. turn_id = content hash of
   (persona, frame_id, signature). Federation collisions are
   deterministic; no duplicates, no silent overwrites.
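
The content-addressing rule in item 4 can be illustrated with a small sketch. The `turn_id` inputs (persona, frame_id, signature) come from the text; the length-prefixed encoding, the `InsertOutcome` enum, the in-memory `Archive`, and the FNV-1a hash (a non-cryptographic stand-in) are all assumptions for the example.

```rust
// Hypothetical sketch: encoding, types, and hash choice are assumptions for illustration.
use std::collections::BTreeMap;

/// FNV-1a 64-bit; a non-cryptographic stand-in for a real content hash.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Derives a turn id from (persona, frame_id, signature).
/// Each part is length-prefixed so that field boundaries are part of the hashed input.
pub fn turn_id(persona: &str, frame_id: &str, signature: &[u8]) -> String {
    let mut buf = Vec::new();
    for part in [persona.as_bytes(), frame_id.as_bytes(), signature] {
        buf.extend_from_slice(&(part.len() as u64).to_le_bytes());
        buf.extend_from_slice(part);
    }
    format!("{:016x}", fnv1a(&buf))
}

#[derive(Debug, PartialEq, Eq)]
pub enum InsertOutcome {
    /// New id: the record was stored.
    Stored,
    /// Same id and same body: nothing to do, no duplicate created.
    AlreadyPresent,
    /// Same id but different body: refused rather than silently overwritten.
    Conflict,
}

#[derive(Debug, Default)]
pub struct Archive {
    records: BTreeMap<String, String>,
}

impl Archive {
    pub fn len(&self) -> usize {
        self.records.len()
    }

    pub fn insert(&mut self, id: String, body: &str) -> InsertOutcome {
        if let Some(existing) = self.records.get(&id) {
            return if existing == body {
                InsertOutcome::AlreadyPresent
            } else {
                InsertOutcome::Conflict
            };
        }
        self.records.insert(id, body.to_string());
        InsertOutcome::Stored
    }
}
```

Because the id is a pure function of its inputs, two federated nodes that captured the same turn compute the same id, and the second insert resolves to `AlreadyPresent` instead of producing a duplicate.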

## Replay Discipline

1. Substrate-state reconstruction is faithful or refused.
   ReplayError::PolicyVersionUnknown when local doesn't have the
   recorded policy version. Never silently substituted.

2. Recall index snapshotted, not regenerated. Replay loads exact
   artifacts by content hash; ArtifactRetired error if any were
   retired in the meantime. Catches 'replay passes only because
   substrate evolved away from original state.'

3. Determinism boundaries named. BoundedNonDeterminism allowed for
   documented sources (parallel embedding order, tie-breaking);
   anything outside the documented set is DecisionBoundaryCrossed.

4. Replay cost is the inverse of capture cost. Capture is sub-ms;
   replay is bounded by the original inference cost. Harnesses are
   bounded by turn count or wall-clock budget, so they stay feasible
   per-PR.
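
The "faithful or refused" rule in items 1 and 2 can be sketched as a reconstruction step that either returns the exact recorded state or a typed error. `ReplayError::PolicyVersionUnknown` is named in the text; placing `ArtifactRetired` in the same enum, and the `RecordedState`, `LocalSubstrate`, and `Reconstructed` types, are assumptions for illustration.

```rust
// Hypothetical sketch: struct shapes and the enum layout are assumptions.
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq, Eq)]
pub enum ReplayError {
    PolicyVersionUnknown { recorded: String },
    ArtifactRetired { content_hash: String },
}

/// What the record says the substrate looked like at capture time.
pub struct RecordedState {
    pub policy_version: String,
    pub artifact_hashes: Vec<String>,
}

/// What the local substrate can still provide.
#[derive(Default)]
pub struct LocalSubstrate {
    pub known_policies: BTreeSet<String>,
    /// Recall artifacts keyed by content hash.
    pub artifacts: BTreeMap<String, String>,
}

#[derive(Debug, PartialEq, Eq)]
pub struct Reconstructed {
    pub policy_version: String,
    pub artifacts: Vec<String>,
}

/// Rebuilds the recorded state exactly, or refuses with a typed error.
/// Nothing is substituted: no "closest" policy, no regenerated artifacts.
pub fn reconstruct(
    local: &LocalSubstrate,
    recorded: &RecordedState,
) -> Result<Reconstructed, ReplayError> {
    if !local.known_policies.contains(&recorded.policy_version) {
        return Err(ReplayError::PolicyVersionUnknown {
            recorded: recorded.policy_version.clone(),
        });
    }
    let mut artifacts = Vec::with_capacity(recorded.artifact_hashes.len());
    for hash in &recorded.artifact_hashes {
        match local.artifacts.get(hash) {
            Some(body) => artifacts.push(body.clone()),
            None => {
                return Err(ReplayError::ArtifactRetired { content_hash: hash.clone() });
            }
        }
    }
    Ok(Reconstructed { policy_version: recorded.policy_version.clone(), artifacts })
}
```

Returning `Result` rather than falling back keeps a replay from passing against a state that differs from the one recorded.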

## End-To-End ASCII Flow

Four-stage diagram showing: production capture → archive →
deterministic replay → sentinel attribution → validation harness.
Every step typed, every transition observable, every divergence has
a named severity.

## Acceptance Criteria

Capture: persona-cognition produces signed records on production
path only (regression test asserts synthetic path produces 0
records, production path produces N for N turns). Archive
append-only with chain-hash. Cross-persona read denied.

Replay: bit-equal reproduction in structured-fields domain.
Tampered record fails verify. Retired-artifact records surface
ArtifactRetired not silent substitution.

End-to-end: prod-replay-harness as Tier 1 in
PERFORMANCE-HARNESS-FRAMEWORK; DecisionBoundaryCrossed divergence
fails PR.

Sentinel: reads from replay archive (not synthetic); smoke test
empties archive, observes NoTracesYet emission; populates archive,
observes attribution within one consolidation cycle.
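
The sentinel smoke test above could be exercised against something like the following. This is a sketch only: `NoTracesYet` is named in the text, while `AttributionOutcome`, the `Attributed` variant, the in-memory `ReplayArchive`, and `consolidation_cycle` are hypothetical stand-ins for the real sentinel loop.

```rust
// Hypothetical sketch: only `NoTracesYet` comes from the text; the rest is assumed.

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AttributionOutcome {
    /// The archive is empty; reported explicitly rather than silently skipped.
    NoTracesYet,
    /// Attribution ran over this many recorded turns.
    Attributed { turns: usize },
}

/// In-memory stand-in for the replay archive the sentinel reads from.
#[derive(Debug, Default)]
pub struct ReplayArchive {
    records: Vec<String>,
}

impl ReplayArchive {
    pub fn push(&mut self, record: &str) {
        self.records.push(record.to_string());
    }

    pub fn clear(&mut self) {
        self.records.clear();
    }
}

/// One consolidation cycle: reads only from the archive, never from synthetic input.
pub fn consolidation_cycle(archive: &ReplayArchive) -> AttributionOutcome {
    if archive.records.is_empty() {
        AttributionOutcome::NoTracesYet
    } else {
        AttributionOutcome::Attributed { turns: archive.records.len() }
    }
}
```

The smoke test then has two halves: clear the archive and expect `NoTracesYet`, then populate it and expect an `Attributed` outcome from the next cycle.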

## Why This Earns Its Space

A 25-PR substrate landing is impressive volume but it's substrate
scaffolding. Without prod-replay, every claim about behavior is
'the tests say so.' With prod-replay: a persona that drifted in
production is reproducible bit-for-bit; sentinel's claims are
checkable against real turn-by-turn evidence; regressions trip the
harness before they can poison main; the 'rigged demo' gap is
closed by structural enforcement, not by adding QA process.

This is 100% Rust cognition + proof it works, delivered as a substrate
property rather than as audit findings.

## Open Questions (6)

Sampling under high load. Replay archive size growth + cold archive.
Cross-substrate-version replay. Capture during sentinel refinement.
Federated replay-records. The 'always rig up' failure mode the
substrate must structurally prevent (synthetic path producing 0
records is the test).

Doc-only PR. Implementation lands per Lane D + the next-tier cognition
modules. This document specifies the alpha-gate.
@joelteply joelteply merged commit 29bf1ce into canary May 18, 2026
2 checks passed
@joelteply joelteply deleted the joel/docs-prod-cognition-replay branch May 18, 2026 16:08