feat(e2e-cli): add flush-retry loop to simulate real flush policy #1173

Open
abueide wants to merge 6 commits into master from tapi/cli-flush-retry-loop
Conversation

abueide commented Mar 18, 2026

Summary

  • Replaces the single flush() + 500ms wait with a proper flush-retry loop that simulates the flush policy cadence a real app uses
  • The CLI now drives repeated flush cycles — flush → check pending → wait for backoff → repeat — until the queue is empty or maxRetries is exceeded
  • Reports accurate success (false when events remain in queue or were permanently dropped) and sentBatches (computed from delivered event count)
  • Forward-compatible with the tapi RetryManager: reads backoff state (READY / BACKING_OFF / RATE_LIMITED) and waitUntilTime when available; falls back to a fixed 100ms delay on master where RetryManager doesn't exist
  • Enables the retry test suite in e2e-config.json
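The flush → check pending → wait → repeat cycle described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `FlushableClient` interface, the `flushWithRetries` name, and the result shape are hypothetical, and the fixed 100ms delay stands in for the fallback used on master where no RetryManager state is available.

```typescript
// Hypothetical minimal view of the client; the real SDK surface may differ.
interface FlushableClient {
  flush(): Promise<void>;
  pendingEvents(): Promise<number>;
}

interface RetryLoopResult {
  drained: boolean; // true when the queue emptied within the retry budget
  flushCycles: number; // how many times flush() was called
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function flushWithRetries(
  client: FlushableClient,
  maxRetries: number,
  delayMs: number = 100 // assumed fixed fallback delay between cycles
): Promise<RetryLoopResult> {
  let flushCycles = 0;
  // One initial attempt plus up to maxRetries retries.
  while (flushCycles <= maxRetries) {
    await client.flush();
    flushCycles++;
    if ((await client.pendingEvents()) === 0) {
      return { drained: true, flushCycles };
    }
    // Only wait if another cycle will actually run.
    if (flushCycles <= maxRetries) {
      await sleep(delayMs);
    }
  }
  return { drained: false, flushCycles };
}
```

With a RetryManager present, the fixed delay would presumably be replaced by a wait derived from its backoff state and `waitUntilTime`; that part is left out here because its API is not shown in this PR description.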

Context

The SDK uses a deferred retry model: failed events stay in a persistent queue and are retried on the next flush policy trigger (timer every 30s, count threshold at 20). The CLI had no flush policies running — it called flush() once and exited. Events that received retryable errors (5xx, 429) were left in the queue with no retry, causing ~20 retry e2e tests to fail.

This PR is independent of the tapi PR stack (#1156#1160). On master, the SDK doesn't differentiate retryable from non-retryable errors (all non-2xx leave events in queue for retry), so the retry loop enables basic retry behavior. Full error classification, backoff timing, and X-Retry-Count headers require the tapi features.

What this fixes (on master)

  • Retry tests that only need basic retry → success pattern (500 → retry → 200)
  • success field reporting (was hardcoded to true)
  • sentBatches tracking (was hardcoded to 0)
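As a rough illustration of how the two output fields could be derived once they are no longer hardcoded, the sketch below computes them from event counts. The `summarize` function and its parameters are hypothetical, and dividing delivered events by a batch size is an assumption about what "computed from delivered event count" means; the PR's real calculation may differ.

```typescript
interface CliOutput {
  success: boolean;
  sentBatches: number;
}

// Hypothetical helper: all parameter names are illustrative.
function summarize(
  totalEvents: number, // events read from the test input
  pending: number, // events still in the queue after the loop ends
  dropped: number, // events permanently dropped
  batchSize: number // assumed maximum events per uploaded batch
): CliOutput {
  const delivered = Math.max(0, totalEvents - pending - dropped);
  return {
    // success is false when anything remains queued or was dropped
    success: pending === 0 && dropped === 0,
    // assumption: batches are estimated from the delivered event count
    sentBatches: Math.ceil(delivered / batchSize),
  };
}
```

Under these assumptions, empty input yields `{ success: true, sentBatches: 0 }`, which matches the smoke test listed in the test plan.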

What still needs the tapi branch

  • Error classification (permanent drops on 4xx vs retry on 5xx)
  • RetryManager backoff timing (exponential backoff, Retry-After)
  • X-Retry-Count header incrementing
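For readers unfamiliar with the distinction, here is a generic sketch of status-code classification and backoff timing. It is not the tapi branch's implementation: `classifyStatus`, `backoffDelayMs`, the 500ms base, and the 30s cap are all assumed for illustration, and Retry-After is handled only in its delay-in-seconds form.

```typescript
type Disposition = "delivered" | "retry" | "drop";

// Assumed rule set: 2xx delivered, 4xx (except 429) dropped, everything else retried.
function classifyStatus(status: number): Disposition {
  if (status >= 200 && status < 300) return "delivered";
  if (status >= 400 && status < 500 && status !== 429) return "drop";
  return "retry";
}

// Exponential backoff with a cap; a server-provided Retry-After value wins when present.
function backoffDelayMs(
  attempt: number, // 0 for the first retry
  retryAfterSeconds?: number, // parsed Retry-After header, if any
  baseMs: number = 500, // assumed base delay
  capMs: number = 30000 // assumed upper bound
): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds >= 0) {
    return retryAfterSeconds * 1000;
  }
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

On master, by contrast, every non-2xx response leaves events in the queue, so the equivalent of `classifyStatus` would never return `"drop"`.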

Test plan

  • esbuild build succeeds
  • CLI smoke test with empty input returns {"success":true,"sentBatches":0}
  • All 66 existing unit test suites pass (362 tests)
  • Run e2e tests against mock server to verify retry loop behavior

🤖 Generated with Claude Code

abueide and others added 6 commits March 18, 2026 13:27
The CLI previously called flush() once and exited, so events that
received retryable errors (5xx, 429) stayed in the queue with no
retry. This implements the retry loop that flush policies drive in a
real app: flush → check pending → wait for backoff → repeat.

- Flush-retry loop respects maxRetries from test config
- Forward-compatible with tapi RetryManager (reads backoff state when
  available, falls back to fixed delay on master)
- Tracks permanently dropped events via logger interception
- Reports success=false when events remain or are dropped
- Computes sentBatches from delivered event count
- Enables retry test suite in e2e-config.json

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of manually calling flush() in a loop and reading private
RetryManager state, let the SDK's built-in flush policies drive
retries. TimerFlushPolicy fires every flushInterval (100ms default
for e2e), and the RetryManager gates actual uploads during backoff.

The CLI just triggers the initial flush, then polls pendingEvents()
until the queue drains or 30s timeout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
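
The polling approach described in the commit above might look something like this sketch. The `waitForDrain` name, the callback shape, and the 50ms poll interval are assumptions made for illustration; only the `pendingEvents()` check and the 30s timeout come from the commit message.

```typescript
// Polls a pending-count callback until it reports zero or the timeout elapses.
// Returns true if the queue drained, false on timeout.
async function waitForDrain(
  pendingEvents: () => Promise<number>, // assumed shape of the SDK call
  timeoutMs: number = 30000, // 30s timeout from the commit message
  pollMs: number = 50 // assumed poll interval
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if ((await pendingEvents()) === 0) return true;
    if (Date.now() >= deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
}
```

In this design the CLI never calls flush() again after the initial trigger; retries happen only because the SDK's own timer policy keeps firing while the loop waits.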
- Pass maxRetries from test config into httpConfig overrides so the SDK
  enforces retry limits during e2e tests
- Set output.error when permanentDropCount > 0 so failure reporting tests
  get a truthy error field
- Add BROWSER_BATCHING=true to e2e-config.json to skip tests that assume
  ephemeral per-request batching (RN uses persistent queue re-chunking)
- Add jsx: react to tsconfig.json for .tsx transitive imports

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>