feat(agent): add sliding replay window and configurable self-verification by simonovic86 · Pull Request #2 · simonovic86/igor

simonovic86 · 2026-03-01T22:44:05Z

Summary

- Replace single-tick replay state with a sliding window of TickSnapshots (default: 16), enabling verification of any recent tick rather than only the most recent one
- Add --replay-window and --verify-interval CLI flags to configure replay retention size and verification frequency
- Decouple self-verification from the checkpoint timer: verification now runs every N ticks (default: 5) and sweeps the window oldest-first
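
A minimal sketch of the sliding-window idea, for illustration only. The `ReplayWindow` and `TickSnapshot` names come from this PR, but the fields and the `Push`, `Latest`, and `Ticks` methods below are assumptions and may not match the actual code in the repository.

```go
package main

import "fmt"

// TickSnapshot is a hypothetical stand-in; the real struct's fields
// are not shown in this PR description.
type TickSnapshot struct {
	TickNumber uint64
	PreState   []byte
}

// ReplayWindow retains at most `size` snapshots in FIFO order.
type ReplayWindow struct {
	size  int
	snaps []TickSnapshot
}

func NewReplayWindow(size int) *ReplayWindow {
	if size < 1 {
		size = 1 // assumption: a window always holds at least one tick
	}
	return &ReplayWindow{size: size}
}

// Push appends a snapshot, evicting the oldest one when the window is full.
func (w *ReplayWindow) Push(s TickSnapshot) {
	if len(w.snaps) == w.size {
		copy(w.snaps, w.snaps[1:])
		w.snaps = w.snaps[:w.size-1]
	}
	w.snaps = append(w.snaps, s)
}

// Latest returns the newest snapshot, or false for an empty window.
func (w *ReplayWindow) Latest() (TickSnapshot, bool) {
	if len(w.snaps) == 0 {
		return TickSnapshot{}, false
	}
	return w.snaps[len(w.snaps)-1], true
}

// Ticks lists the retained tick numbers, oldest first.
func (w *ReplayWindow) Ticks() []uint64 {
	out := make([]uint64, 0, len(w.snaps))
	for _, s := range w.snaps {
		out = append(out, s.TickNumber)
	}
	return out
}

func main() {
	w := NewReplayWindow(3)
	for t := uint64(1); t <= 5; t++ {
		w.Push(TickSnapshot{TickNumber: t})
	}
	fmt.Println(w.Ticks()) // prints [3 4 5]: ticks 1 and 2 were evicted
}
```

A slice with a shift on eviction is enough for small windows such as the default of 16; a ring buffer would avoid the copy if the window were made much larger.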

Test plan

- make check passes (fmt, vet, lint, 52 tests)
- TestReplayWindow_Eviction: verifies FIFO eviction with window size 3 over 5 ticks
- TestLatestSnapshot: verifies accessor for empty and populated windows
- TestTick_RecordObservations: verifies snapshot storage in ReplayWindow
- Migration replay tests updated to use new ReplayWindow API
- Multi-node integration test passes with replay data verification
- Config defaults test covers new ReplayWindowSize and VerifyInterval fields

🤖 Generated with Claude Code

Implement deterministic single-tick replay verification (CM-4) with a new internal/replay package. The replay engine creates an isolated wazero sandbox with replay-mode hostcalls that return recorded observation values, resumes from a checkpoint, executes one tick, and compares the resulting state against the expected post-tick checkpoint. Wire replay into the live tick loop: each tick captures pre/post state and the sealed event log, and the periodic checkpoint (every 5s) triggers replay verification of the last tick. Divergences are logged but do not halt execution (EI-6: Safety Over Liveness). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
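
The replay check described above can be sketched without a WASM runtime. In this illustration a plain Go function stands in for the guest's agent_tick, and a slice of `uint64` stands in for the sealed event log; `Observer`, `tick`, and `replayTick` are hypothetical names, not the internal/replay API.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// Observer stands in for an observation hostcall (clock, random bytes, ...).
type Observer func() uint64

// tick is a stand-in deterministic agent: the next state depends only on
// the previous state and on the values returned by the hostcall.
func tick(state []byte, observe Observer) []byte {
	counter := binary.LittleEndian.Uint64(state)
	next := make([]byte, 8)
	binary.LittleEndian.PutUint64(next, counter+(observe()%10))
	return next
}

// replayTick re-runs one tick from the pre-tick state, answering hostcalls
// from the recorded log, and reports whether the result matches wantPost.
func replayTick(pre []byte, log []uint64, wantPost []byte) (bool, error) {
	i, overrun := 0, false
	replayObserve := func() uint64 {
		if i >= len(log) {
			overrun = true
			return 0
		}
		v := log[i]
		i++
		return v
	}
	post := tick(pre, replayObserve)
	if overrun || i != len(log) {
		return false, errors.New("replay: event log length mismatch")
	}
	return bytes.Equal(post, wantPost), nil
}

func main() {
	pre := make([]byte, 8)
	var log []uint64
	// Live mode: the hostcall returns a real observation and records it.
	live := func() uint64 {
		v := uint64(42)
		log = append(log, v)
		return v
	}
	post := tick(pre, live)

	ok, err := replayTick(pre, log, post)
	fmt.Println("matches:", ok, "err:", err)

	// A tampered log should produce a divergence, which the PR logs
	// rather than treating as fatal.
	ok, err = replayTick(pre, []uint64{43}, post)
	fmt.Println("matches:", ok, "err:", err)
}
```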

Source node includes replay data (pre-tick state + event log) in the migration package. Target node replays the last tick in an isolated sandbox before accepting the agent — if the replayed state diverges from the checkpoint, the migration is rejected. Adds ReplayData/ReplayEntry protocol types, staleness guard to ensure replay data matches the stored checkpoint, and ExtractAgentState helper for v1 checkpoint parsing. Backward compatible: nil replay data means verification is skipped. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Spin up two in-process libp2p nodes, migrate an agent between them, and verify the full survival chain: tick → checkpoint → migrate → replay-verify → resume → continue ticking. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Richer example agent that tracks tick count, birth time, uptime, and a running luck value from random bytes. Demonstrates all three observation hostcalls with CM-4 replay-compatible lifecycle functions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tion Replace single-tick replay state with a sliding window of TickSnapshots, enabling verification of any recent tick. Add --replay-window and --verify-interval CLI flags to configure retention size and verification frequency. Decouple verification from checkpoint timer to run on a tick counter instead. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
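
As a rough illustration of the two flags and the tick-counter trigger, using only the standard `flag` package. The flag names, defaults, and the `ReplayWindowSize` and `VerifyInterval` field names come from this PR; `parseFlags`, `shouldVerify`, the help strings, and the choice to treat a non-positive interval as "never verify" are assumptions.

```go
package main

import (
	"flag"
	"fmt"
)

type Config struct {
	ReplayWindowSize int
	VerifyInterval   int
}

// parseFlags reads the two replay-related flags with the defaults from the PR.
func parseFlags(args []string) (Config, error) {
	var c Config
	fs := flag.NewFlagSet("igord", flag.ContinueOnError)
	fs.IntVar(&c.ReplayWindowSize, "replay-window", 16, "number of tick snapshots to retain")
	fs.IntVar(&c.VerifyInterval, "verify-interval", 5, "run self-verification every N ticks")
	err := fs.Parse(args)
	return c, err
}

// shouldVerify reports whether verification is due after the given tick.
// It depends on the tick counter only, not on the checkpoint timer.
func shouldVerify(tick uint64, interval int) bool {
	return interval > 0 && tick%uint64(interval) == 0
}

func main() {
	cfg, err := parseFlags([]string{"--verify-interval", "3"})
	if err != nil {
		panic(err)
	}
	for tick := uint64(1); tick <= 7; tick++ {
		if shouldVerify(tick, cfg.VerifyInterval) {
			fmt.Printf("tick %d: sweep window (size %d) oldest-first\n", tick, cfg.ReplayWindowSize)
		}
	}
}
```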

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 92bc522ad7

chatgpt-codex-connector · 2026-03-01T22:48:59Z

internal/replay/engine.go

+	if _, err := tickFn.Call(ctx); err != nil {
+		result.Error = fmt.Errorf("replay: agent_tick: %w", err)


Bound replay tick execution with a timeout

ReplayTick invokes agent_tick with tickFn.Call(ctx) but never creates a deadline-bound context, so replay duration is fully unbounded by default. In the migration flow, this is called from handleIncomingMigration using context.Background(), which means a transferred agent whose agent_tick hangs can block replay verification indefinitely and tie up the migration handler. Use a bounded timeout context for replay ticks (matching live tick policy) to prevent migration DoS from malformed or malicious packages.


chatgpt-codex-connector · 2026-03-01T22:48:59Z

internal/agent/instance.go

+	// Capture pre-tick state for replay verification
+	preState, err := i.captureState(ctx)
+	if err != nil {
+		return fmt.Errorf("pre-tick checkpoint failed: %w", err)
+	}


Enforce timeout and metering for replay snapshot captures

Instance.Tick now captures pre/post replay states via captureState outside the timed agent_tick section, and these calls run on the parent context rather than the 100ms tick timeout. Since agent_checkpoint is guest-provided code, an agent can do expensive work (or loop) during checkpoint capture and bypass both timeout enforcement and budget charging, undermining execution safety and billing invariants. Snapshot capture should be deadline-bound and included in per-tick cost accounting.


…eplay+metering improvements

- Add -race flag to test pipeline and CI job timeout (15min)
- Fix MigrateAgent EI-1 race: compare-and-delete prevents removing a concurrently registered instance for the same agent ID
- Add 2-minute context timeout and 30s stream read deadline to incoming migration handler to prevent resource exhaustion
- Cache WASM compilation in replay engine via wazero.CompilationCache shared across ReplayTick invocations (IMPROVEMENTS #1)
- Store PostStateHash [32]byte instead of full post-state copy in TickSnapshot, halving snapshot memory (IMPROVEMENTS #2)
- Use nanosecond precision for tick metering so sub-microsecond ticks are no longer free (IMPROVEMENTS #10)
- Replace custom bytesEqual with bytes.Equal

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
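
The compare-and-delete fix mentioned in this commit can be illustrated with `sync.Map.CompareAndDelete` (Go 1.20+). This is a sketch of the general technique only: `Registry`, `Instance`, and `RemoveIfSame` are hypothetical names, and the actual MigrateAgent code may use a mutex-guarded map instead.

```go
package main

import (
	"fmt"
	"sync"
)

type Instance struct{ AgentID string }

// Registry maps agent IDs to their live instance.
type Registry struct{ m sync.Map }

func (r *Registry) Register(inst *Instance) { r.m.Store(inst.AgentID, inst) }

// RemoveIfSame deletes the entry only if it still points at inst, so an
// instance registered concurrently under the same ID is left in place.
func (r *Registry) RemoveIfSame(inst *Instance) bool {
	return r.m.CompareAndDelete(inst.AgentID, inst)
}

func (r *Registry) Get(id string) (*Instance, bool) {
	v, ok := r.m.Load(id)
	if !ok {
		return nil, false
	}
	return v.(*Instance), true
}

func main() {
	var reg Registry
	old := &Instance{AgentID: "agent-1"}
	reg.Register(old)

	// While `old` is being migrated away, a new instance for the same
	// agent ID is registered (for example, the agent migrated back).
	fresh := &Instance{AgentID: "agent-1"}
	reg.Register(fresh)

	// The outgoing migration finishes and cleans up after itself.
	fmt.Println("removed:", reg.RemoveIfSame(old)) // prints removed: false

	cur, ok := reg.Get("agent-1")
	fmt.Println("fresh instance kept:", ok && cur == fresh) // prints fresh instance kept: true
}
```

An unconditional delete by agent ID would have removed `fresh` here, which is the race the commit describes.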

…eplay+metering improvements (#4)

- Add -race flag to test pipeline and CI job timeout (15min)
- Fix MigrateAgent EI-1 race: compare-and-delete prevents removing a concurrently registered instance for the same agent ID
- Add 2-minute context timeout and 30s stream read deadline to incoming migration handler to prevent resource exhaustion
- Cache WASM compilation in replay engine via wazero.CompilationCache shared across ReplayTick invocations (IMPROVEMENTS #1)
- Store PostStateHash [32]byte instead of full post-state copy in TickSnapshot, halving snapshot memory (IMPROVEMENTS #2)
- Use nanosecond precision for tick metering so sub-microsecond ticks are no longer free (IMPROVEMENTS #10)
- Replace custom bytesEqual with bytes.Equal

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

Replace MustInstantiate with error-returning Instantiate across all WASM init sites. Extract shared tick timeout constant to config.TickTimeout. Unify manifest sidecar loading into pkg/manifest.LoadSidecarData. Fix CI to install TinyGo before tests so WASM integration tests run. Add test coverage reporting. Add validateIncomingManifest and LoadSidecarData tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: code structure, docs accuracy, and test coverage

  Extract tick loop logic from cmd/igord/main.go into internal/runner for testability. Deduplicate captureState/replayResume (3 copies) into internal/wasmutil. Update HOSTCALL_ABI.md from "Design Draft" to reflect implemented state. Add WASM hash mismatch and receipt corruption tests. Flag unimplemented spec items (CM-5, OA-2, EI-11) in status doc.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: code quality, security hardening, and test coverage (#2)

  Replace MustInstantiate with error-returning Instantiate across all WASM init sites. Extract shared tick timeout constant to config.TickTimeout. Unify manifest sidecar loading into pkg/manifest.LoadSidecarData. Fix CI to install TinyGo before tests so WASM integration tests run. Add test coverage reporting. Add validateIncomingManifest and LoadSidecarData tests.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

simonovic86 and others added 8 commits March 1, 2026 22:38

Merge branch 'claude/stoic-faraday'

8004273

chore(dev): add igord to .gitignore

3855fdf

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'claude/stoic-faraday'

fd7cf7a

simonovic86 changed the title ~~Claude/loving mcclintock~~ feat(agent): add sliding replay window and configurable self-verification Mar 1, 2026

simonovic86 merged commit 0990ad1 into main Mar 1, 2026
1 check passed

simonovic86 deleted the claude/loving-mcclintock branch March 1, 2026 22:45
