Return replay retry state from orchestration recovery by juliusmarminge · Pull Request #1728 · pingdotgg/t3code

juliusmarminge · 2026-04-04T02:59:45Z

Summary

Change completeReplayRecovery() to return structured replay outcome data instead of a bare boolean.
Preserve the replay progress signal while separately surfacing whether another replay should be attempted.
Add coverage for the new retry behavior when replay makes no progress, both with and without newer observed sequences.

Testing

Updated apps/web/src/orchestrationRecovery.test.ts to assert the new completion shape and retry cases.
Not run: bun fmt
Not run: bun lint
Not run: bun typecheck
Not run: bun run test

Note

Medium Risk
Changes orchestration replay recovery control flow and retry behavior in the app root event router, which can impact client/server state synchronization during sequence gaps. Risk is mitigated by added unit tests but could still affect recovery edge cases (e.g., no-progress replays, disposal timing).

Overview
Replay recovery now returns structured completion data instead of a boolean. completeReplayRecovery() returns { replayMadeProgress, shouldReplay }, separating “did the replay advance the sequence” from “do we still need another replay.”
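The split between the two signals can be sketched as follows. This is a minimal illustration, not the code from orchestrationRecovery.ts: the ReplayRecoveryCompletion shape and the field names come from this PR, while SketchRecoveryState, completeReplaySketch, and the rule used for shouldReplay (a newer sequence has been observed than applied) are assumptions made for the example.

```typescript
// Shape named in the PR description.
interface ReplayRecoveryCompletion {
  replayMadeProgress: boolean;
  shouldReplay: boolean;
}

// Hypothetical, simplified state: the real coordinator tracks more than this.
interface SketchRecoveryState {
  latestSequence: number;
  highestObservedSequence: number;
}

function completeReplaySketch(
  state: SketchRecoveryState,
  replayStartSequence: number | null,
): ReplayRecoveryCompletion {
  // Progress: the applied sequence moved past the point where the replay started.
  const replayMadeProgress =
    replayStartSequence !== null && state.latestSequence > replayStartSequence;
  // Assumption: another replay is wanted while a newer sequence was observed than applied.
  const shouldReplay = state.highestObservedSequence > state.latestSequence;
  return { replayMadeProgress, shouldReplay };
}
```

With this shape a caller can tell "nothing advanced, but more events are known to exist" apart from "nothing advanced and nothing is outstanding", which a single boolean cannot express.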

Adds bounded, backoff-based retries for no-progress replays. The new deriveReplayRetryDecision tracks consecutive no-progress attempts for the same replay frontier, retries with exponential delays (100ms base) up to a maximum of 3 attempts, resets the budget when the frontier changes, and logs a warning when stopping early.
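One way such a decision function could look is sketched below. Only the 100ms base, the limit of 3, the reset on frontier change, and the immediate retry on progress are taken from this PR. The function name deriveRetryDecisionSketch, the RetryTracker and RetryDecision types, the string frontierKey, and the exact doubling schedule (100, 200, 400) are assumptions for illustration and may differ from the real deriveReplayRetryDecision.

```typescript
const BASE_DELAY_MS = 100; // base delay stated in the PR
const MAX_NO_PROGRESS_RETRIES = 3; // retry limit stated in the PR

// Hypothetical tracker: which frontier stalled and how many attempts it has used.
interface RetryTracker {
  frontierKey: string;
  attempts: number;
}

interface RetryDecision {
  shouldRetry: boolean;
  delayMs: number;
  tracker: RetryTracker | null;
}

function deriveRetryDecisionSketch(input: {
  previous: RetryTracker | null;
  completion: { replayMadeProgress: boolean; shouldReplay: boolean };
  frontierKey: string;
}): RetryDecision {
  const { previous, completion, frontierKey } = input;
  // Nothing left to replay: stop and clear the tracker.
  if (!completion.shouldReplay) {
    return { shouldRetry: false, delayMs: 0, tracker: null };
  }
  // Progress was made: retry immediately with a fresh budget.
  if (completion.replayMadeProgress) {
    return { shouldRetry: true, delayMs: 0, tracker: null };
  }
  // No progress: continue the count only if the frontier is unchanged.
  const priorAttempts =
    previous !== null && previous.frontierKey === frontierKey ? previous.attempts : 0;
  const attempts = priorAttempts + 1;
  const tracker: RetryTracker = { frontierKey, attempts };
  if (attempts > MAX_NO_PROGRESS_RETRIES) {
    // Budget exhausted for this frontier; the caller would log a warning here.
    return { shouldRetry: false, delayMs: 0, tracker };
  }
  // Assumed schedule: the delay doubles on each consecutive stalled attempt.
  return { shouldRetry: true, delayMs: BASE_DELAY_MS * 2 ** (attempts - 1), tracker };
}
```

Keeping the tracker after exhaustion means repeated stalled completions on the same frontier stay stopped, while a different frontierKey starts again from the base delay.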

Updates callers and tests. EventRouter uses the new completion shape and retry decision (including clearing the tracker on replay failure), and orchestrationRecovery.test.ts is expanded to assert the new return type and retry/stop behavior.

Reviewed by Cursor Bugbot for commit cb1880f.

Note

Return structured replay retry state with exponential backoff from orchestration recovery

completeReplayRecovery now returns a ReplayRecoveryCompletion object ({ replayMadeProgress, shouldReplay }) instead of a boolean, and no longer unconditionally clears pendingReplay on no-progress completion.
Adds deriveReplayRetryDecision in orchestrationRecovery.ts to compute whether to retry a replay and with what delay: immediate retry on progress, capped exponential backoff (base 100ms, up to 3 attempts) when there is no progress on the same frontier, and budget reset when the frontier changes.
routes/__root.tsx uses the new return value to schedule retries, logging a warning when no-progress retries are exhausted.
Behavioral Change: previously a truthy return from completeReplayRecovery triggered an immediate retry unconditionally; now retries are budgeted and backoff-gated when no progress is observed.
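The caller side of the behavioral change can be sketched like this. It is not the code in routes/__root.tsx: applyRetryDecision, RetryDeps, the exhausted flag, and the warning text are hypothetical names introduced for the example, and the scheduler is injected so the sketch does not depend on real timers.

```typescript
// Hypothetical decision shape for this sketch only.
interface RetryDecisionLike {
  shouldRetry: boolean;
  delayMs: number;
  exhausted: boolean; // assumed flag: the no-progress budget ran out
}

// Hypothetical dependencies, injected so the flow can be exercised without timers.
interface RetryDeps {
  runReplay: () => void;
  schedule: (fn: () => void, delayMs: number) => void;
  warn: (message: string) => void;
}

function applyRetryDecision(decision: RetryDecisionLike, deps: RetryDeps): void {
  if (decision.shouldRetry) {
    if (decision.delayMs === 0) {
      // Progress was made: replay again right away.
      deps.runReplay();
    } else {
      // No progress: wait for the backoff delay before replaying.
      deps.schedule(deps.runReplay, decision.delayMs);
    }
    return;
  }
  if (decision.exhausted) {
    // Retries stopped early; surface it instead of looping silently.
    deps.warn("replay recovery stopped after exhausting no-progress retries");
  }
}
```

In a browser the schedule dependency could be a thin wrapper such as (fn, ms) => { setTimeout(fn, ms); }, with the caller still responsible for skipping the replay if the component was disposed in the meantime.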

Macroscope summarized cb1880f.

coderabbitai · 2026-04-04T02:59:53Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Caller treats object return value as boolean
- Changed the truthiness check on completeReplayRecovery() to access .shouldReplay on the returned ReplayRecoveryCompletion object, so replay recovery is only triggered when shouldReplay is true.

Or push these changes by commenting:

@cursor push 4d69809910

Preview (4d69809910)

diff --git a/apps/web/src/routes/__root.tsx b/apps/web/src/routes/__root.tsx
--- a/apps/web/src/routes/__root.tsx
+++ b/apps/web/src/routes/__root.tsx
@@ -440,7 +440,7 @@
         return;
       }
 
-      if (!disposed && recovery.completeReplayRecovery()) {
+      if (!disposed && recovery.completeReplayRecovery().shouldReplay) {
         void recoverFromSequenceGap();
       }
     };
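Why the pre-fix condition misbehaves can be shown in isolation. In the sketch below the ReplayRecoveryCompletion shape comes from this PR, while the two helper functions are hypothetical stand-ins for the condition in the diff above.

```typescript
interface ReplayRecoveryCompletion {
  replayMadeProgress: boolean;
  shouldReplay: boolean;
}

// Pre-fix condition: any object is truthy, so the shouldReplay field is ignored.
function shouldTriggerReplayBuggy(
  disposed: boolean,
  completion: ReplayRecoveryCompletion,
): boolean {
  return !disposed && Boolean(completion);
}

// Post-fix condition from the preview diff: read the shouldReplay field explicitly.
function shouldTriggerReplayFixed(
  disposed: boolean,
  completion: ReplayRecoveryCompletion,
): boolean {
  return !disposed && completion.shouldReplay;
}
```

For a completion of { replayMadeProgress: false, shouldReplay: false }, the first helper still returns true while the second returns false, which is the difference the Autofix targets.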

apps/web/src/orchestrationRecovery.ts

macroscopeapp · 2026-04-04T03:01:50Z

Approvability

Verdict: Needs human review

This PR introduces new runtime behavior for orchestration recovery: exponential backoff delays and capped retry attempts. While well-tested and authored by the module's owner, the changes affect when and how often the app retries during sequence gap recovery, warranting human review.

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

✅ Fixed: Unbounded retry loop when replay consistently makes no progress
- Reset highestObservedSequence to latestSequence when replay makes no progress, so stale observations no longer permanently satisfy the observedAhead condition and cause infinite retries.

Or push these changes by commenting:

@cursor push 89821febb3

Preview (89821febb3)

diff --git a/apps/web/src/orchestrationRecovery.test.ts b/apps/web/src/orchestrationRecovery.test.ts
--- a/apps/web/src/orchestrationRecovery.test.ts
+++ b/apps/web/src/orchestrationRecovery.test.ts
@@ -65,7 +65,7 @@
     });
   });
 
-  it("retries replay when no progress was made but higher live sequences were observed", () => {
+  it("does not retry replay when no progress was made even if higher live sequences were previously observed", () => {
     const coordinator = createOrchestrationRecoveryCoordinator();
 
     coordinator.beginSnapshotRecovery("bootstrap");
@@ -75,11 +75,11 @@
 
     expect(coordinator.completeReplayRecovery()).toEqual({
       replayMadeProgress: false,
-      shouldReplay: true,
+      shouldReplay: false,
     });
     expect(coordinator.getState()).toMatchObject({
       latestSequence: 3,
-      highestObservedSequence: 5,
+      highestObservedSequence: 3,
       pendingReplay: false,
       inFlight: null,
     });

diff --git a/apps/web/src/orchestrationRecovery.ts b/apps/web/src/orchestrationRecovery.ts
--- a/apps/web/src/orchestrationRecovery.ts
+++ b/apps/web/src/orchestrationRecovery.ts
@@ -130,6 +130,9 @@
         replayStartSequence !== null && state.latestSequence > replayStartSequence;
       replayStartSequence = null;
       state.inFlight = null;
+      if (!replayMadeProgress) {
+        state.highestObservedSequence = state.latestSequence;
+      }
       const replayResolution = resolveReplayNeedAfterRecovery();
       return {
         replayMadeProgress,
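The loop this Autofix preview addresses can be reproduced with a small simulation. The sequence values 3 and 5 are taken from the test in the diff above. The helper names and the simplified observedAhead rule (highestObservedSequence greater than latestSequence) are assumptions, and the sketch illustrates only the preview's approach, not necessarily what was merged.

```typescript
// Hypothetical, simplified state for this sketch.
interface SketchState {
  latestSequence: number;
  highestObservedSequence: number;
}

// One no-progress completion; returns whether another replay would be requested.
function completeNoProgressReplay(state: SketchState, resetStaleObservation: boolean): boolean {
  if (resetStaleObservation) {
    // The Autofix idea: forget observations the replay could not reach.
    state.highestObservedSequence = state.latestSequence;
  }
  // Assumed observedAhead rule: something newer was seen than has been applied.
  return state.highestObservedSequence > state.latestSequence;
}

// Counts back-to-back retries, stopping at maxIterations so the sketch terminates.
function countRetries(resetStaleObservation: boolean, maxIterations: number): number {
  const state: SketchState = { latestSequence: 3, highestObservedSequence: 5 };
  let retries = 0;
  while (retries < maxIterations && completeNoProgressReplay(state, resetStaleObservation)) {
    retries += 1;
  }
  return retries;
}
```

Without the reset, countRetries(false, 10) runs until the artificial cap of 10 because the stale observation never clears. With the reset, countRetries(true, 10) returns 0.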

Reviewed by Cursor Bugbot for commit 5c18bad.

apps/web/src/routes/__root.tsx

Return replay retry state from recovery coordinator

a41968d

- Distinguish replay progress from retry eligibility
- Cover retry and no-op replay cases in tests

github-actions bot added the size:S (10-29 changed lines, additions + deletions) and vouch:trusted (PR author is trusted by repo permissions or the VOUCHED list) labels Apr 4, 2026

cursor bot reviewed Apr 4, 2026

apps/web/src/orchestrationRecovery.ts (resolved)

Delay replay recovery retries until progress resumes

5c18bad

- wait briefly when replay recovery makes no progress
- retry sequence-gap recovery after replay completion

github-actions bot added the size:M (30-99 changed lines, additions + deletions) label and removed the size:S (10-29 changed lines, additions + deletions) label Apr 4, 2026

cursor bot reviewed Apr 4, 2026

apps/web/src/routes/__root.tsx (resolved)

Bound replay recovery retries after no progress

cb1880f

- Add retry tracking for stalled orchestration replay
- Reset retry budget when the replay frontier changes
- Log when replay recovery stops after exhausting retries

Co-authored-by: codex <codex@users.noreply.github.com>

github-actions bot added the size:L (100-499 changed lines, additions + deletions) label and removed the size:M (30-99 changed lines, additions + deletions) label Apr 4, 2026

juliusmarminge merged commit 6de4b47 into main Apr 4, 2026
12 checks passed

juliusmarminge deleted the t3code/project-open-flag branch April 4, 2026 05:05

aaditagrawal pushed a commit to aaditagrawal/t3code that referenced this pull request Apr 5, 2026

Return replay retry state from orchestration recovery (pingdotgg#1728)

d342960

aaditagrawal mentioned this pull request Apr 5, 2026

Merge upstream: Return replay retry state from orchestration recovery (#1728) aaditagrawal/t3code#47

Merged

aaditagrawal added a commit to aaditagrawal/t3code that referenced this pull request Apr 5, 2026

Merge pull request #47 from aaditagrawal/merge/upstream-1728-recovery…

1afeec4

…-retry-state Merge upstream: Return replay retry state from orchestration recovery (pingdotgg#1728)

gigq pushed a commit to gigq/t3code that referenced this pull request Apr 6, 2026

Return replay retry state from orchestration recovery (pingdotgg#1728)

10b9b45

Chrono-byte pushed a commit to Chrono-byte/t3code that referenced this pull request Apr 7, 2026

Return replay retry state from orchestration recovery (pingdotgg#1728)

d6f3239

juliusmarminge merged 3 commits into main from t3code/project-open-flag
