
release: rebuild open main PR wave with review fixes #387

Merged
ndycode merged 42 commits into main from rebuild/main-open-prs
Apr 6, 2026

Conversation

Owner

@ndycode ndycode commented Apr 6, 2026

Summary

  • rebuild the full open PR stack onto one integration branch from main
  • fold in the remaining review fixes so the rebuilt branch is mergeable on its own
  • replace the split review queue with one superseding PR

Includes

  • retry, failover, and quota-handling work from #357, #358, #359, #360, #361, #362, #363, #364, #365, #366, #370, #374, #375, and #386
  • runtime, storage, sync, and reporting work from #367, #368, #369, #371, #372, #373, #376, #377, #378, #379, #380, #381, #382, #383, #384, and #385
  • the review follow-ups that were still blocking the split stack, including request-budget accounting fixes, stable rate-limit identity tracking, safer 404 remapping, atomic runtime snapshot persistence, snapshot versioning, and runtime observability test isolation

Validation

  • npm run lint
  • npm run typecheck
  • npm test
  • npm run build

Notes

  • supersedes #357 through #386; those PRs have been closed in favor of this branch
  • this branch was rebuilt from main in dependency order and validated after the stacked request-reliability/runtime-observability chain was folded in
  • rollback path is to revert rebuild/main-open-prs or back out the grouped rebuild commits if the combined wave needs to be unwound

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

this is a large integration wave folding ~30 PRs into one branch. it ships: configurable rate-limit backoff with jitter, per-request finite outbound attempt budgets, pool-exhaustion and server-burst fast-fail cooldowns, adaptive stream failover candidate selection (filters cooldown + rate-limit state), runtime observability snapshots with atomic temp-rename persistence, storage health inspection with silent wal probing, config-key-driven live-sync restarts, and session-affinity write-version guards against stale async overwrites. test coverage across the new modules is substantial and the overall architecture is sound.

two issues need attention before merge:

  • lib/storage/account-clear.ts — the reset marker is now written after the primary storage files are deleted. a crash or persistent EBUSY/EPERM on walPath during the clear leaves no marker, and the next startup performs stale wal recovery — silently undoing the wipe. this is a windows filesystem safety regression introduced by the new sequencing.
  • lib/request/response-metadata.ts — MAX_RETRY_HINT_MS was raised from 5 minutes to 24 hours. a single misbehaving Retry-After header can now lock an account for a full day because delayMs = Math.max(baseDelay, jitteredDelay) surfaces the full baseDelay past the per-backoff maxBackoffMs cap.
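to see why the cap fails to bound the hint, here is a minimal sketch of the interaction the review describes (computeDelayMs and its parameter names are illustrative, not the project's actual signature):

```typescript
// Hypothetical reconstruction of the delay computation described above.
// Only the jittered exponential component is clamped at maxBackoffMs;
// Math.max then lets an oversized Retry-After hint bypass that clamp.
function computeDelayMs(
	serverRetryAfterMs: number,
	rawJitteredDelayMs: number,
	maxBackoffMs: number,
): number {
	const baseDelay = serverRetryAfterMs; // parsed Retry-After header, uncapped
	const jitteredDelay = Math.min(rawJitteredDelayMs, maxBackoffMs); // cap applies here only
	return Math.max(baseDelay, jitteredDelay); // an oversized baseDelay wins
}
```

with the 5-minute ceiling restored (or an explicit opt-in knob for honoring large server hints), baseDelay would be clamped before the Math.max and the 24 h lockout could not occur.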

other notable changes:

  • retryAllAccountsRateLimited default flipped to false, retryAllAccountsMaxRetries default changed to 0 — this is a breaking behavioral change for users relying on the previous defaults; worth a changelog note
  • withStreamingFailover now aborts replay when emittedBytes > 0, preventing duplicate stream content and side-effectful tool replay — correct
  • CooldownReason union in migrations.ts gains server-error to align with the failure-policy.ts rename — consistent

Confidence Score: 4/5

safe to merge after addressing the account-clear ordering regression and the 24 h retry-after cap

two P1 findings remain: the account-clear reset marker ordering creates a windows crash-safety regression that can silently restore stale data after a failed wipe, and the 24 h MAX_RETRY_HINT_MS can lock accounts for a full day on a single bad Retry-After header. both are in changed lines in this PR and both have straightforward fixes. the rest of the work is well-structured and the test coverage is thorough.

lib/storage/account-clear.ts (reset marker ordering), lib/request/response-metadata.ts (24 h retry-after cap)

Important Files Changed

Filename Overview
lib/storage/account-clear.ts reset marker ordering changed to post-deletion — crash or persistent EBUSY on walPath leaves no marker and enables stale WAL recovery on Windows
lib/request/response-metadata.ts MAX_RETRY_HINT_MS raised from 5 min to 24 h, allowing server Retry-After headers to lock accounts for a full day
lib/request/request-resilience.ts new pool-exhaustion and server-burst cooldown module; clearServerBurstCooldown called on success may prematurely reset burst state
lib/request/request-attempt-budget.ts new finite outbound request attempt budget with stream failover candidate ordering — clean
lib/request/rate-limit-backoff.ts backoff now per-plugin-instance configurable with jitter and stable account key for dedup identity — solid
lib/request/stream-failover.ts emittedBytes > 0 guard stops replay after partial emission — correct safety fix
lib/runtime/runtime-observability.ts new snapshot persistence with atomic temp-rename writes and VITEST env guard for test isolation
lib/runtime/live-sync.ts config-key-driven sync restart on debounce/poll changes — correct
lib/storage/health.ts new storage health summary type and factory — clean
lib/storage.ts new inspectStorageHealth with silent WAL probe; loadAccountsFromJournal gains silent option
lib/session-affinity.ts write-version guard prevents stale async affinity overwrites — correct concurrency fix
index.ts major request pipeline additions with mixed indentation artifacts from stacked merge
lib/config.ts rate-limit backoff params exposed as config; retryAllAccounts defaults changed to false/0
lib/request/failure-policy.ts cooldownReason renamed network-error to server-error — aligns with migrations.ts union
lib/storage/migrations.ts CooldownReason union gains server-error to match failure-policy rename

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Incoming Request] --> B{Pool exhaustion\ncooldown active?}
    B -- yes --> C[Return 429\nfast-fail]
    B -- no --> D{Server burst\ncooldown active?}
    D -- yes --> E[Return 503\nfast-fail]
    D -- no --> F[Compute outbound\nattempt budget]
    F --> G[Account selection\nfrom pool]
    G --> H{Budget\nremaining?}
    H -- no --> I[Return 503\nbudget exhausted]
    H -- yes --> J[Fetch to upstream]
    J --> K{Response status}
    K -- 2xx --> L[Clear pool +\nburst cooldowns]
    L --> M[Sync observability\nsnapshot]
    M --> N[Return success]
    K -- 429 --> O{Short cooldown\nAND short retry\ncount ok?}
    O -- yes --> P[Sleep + retry\nsame account]
    P --> H
    O -- no --> Q[Account cooldown\n→ rotate]
    Q --> G
    K -- 5xx --> R[recordServerBurstFailure]
    R --> S{Burst threshold\nreached?}
    S -- yes --> T[Arm burst cooldown\n→ early 503 next req]
    S -- no --> U[Account cooldown\n→ rotate]
    U --> G
    K -- stream ok --> V[withStreamingFailover]
    V --> W{emittedBytes > 0\nor failovers exhausted?}
    W -- yes --> X[Stop failover]
    W -- no --> Y[Adaptive candidate\nselection]
    Y --> H


Prompt To Fix All With AI
This is a comment left during a code review.
Path: lib/storage/account-clear.ts
Line: 58-64

Comment:
**reset marker written after file deletion**

the old code wrote the reset marker *first* to fence off stale re-use, then deleted the files. the new order is: delete `path` → delete `walPath` → write marker. if `clearPath(params.walPath, true)` throws after all 5 retries — which happens on windows when `EBUSY`/`EPERM` persists — the function propagates the error and the marker is never written. on the next startup, `path` is gone but `walPath` still exists with no reset marker, so `inspectStorageHealth` detects a recoverable wal and restores stale data the caller intended to wipe. the original first-marker ordering was deliberate crash-safety for this exact scenario.

```suggestion
	await fs.writeFile(
		params.resetMarkerPath,
		JSON.stringify({ version: 1, createdAt: Date.now() }),
		{ encoding: "utf-8", mode: 0o600 },
	);
	await clearPath(params.path, true);
	await clearPath(params.walPath, true);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: lib/request/response-metadata.ts
Line: 1

Comment:
**24 h retry-after cap can lock accounts for a full day**

`MAX_RETRY_HINT_MS` was raised from 5 minutes to 24 hours. in `getRateLimitBackoff`, the final `delayMs` is `Math.max(baseDelay, jitteredDelay)`. `baseDelay` comes directly from `serverRetryAfterMs` (the parsed `Retry-After` header value), while `jitteredDelay` is capped at `maxBackoffMs` (default 60 s). so `Math.max(86_400_000, 60_000)` = 86,400,000 ms — the full `baseDelay` wins and is passed straight to `accountManager.setAccountCooldown`, locking the account for 24 hours. one misbehaving upstream (or a misconfigured header) silently makes every matching account unusable for a full day with no operator warning. the 5-minute cap was an effective upper bound. if intentionally large server hints need to be honored, add an explicit opt-in config knob rather than raising the shared ceiling.

```suggestion
const MAX_RETRY_HINT_MS = 5 * 60 * 1000;
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: index.ts
Line: 332-340

Comment:
**indentation lost on closure-local variables**

`liveAccountSync`, `liveAccountSyncPath`, `liveAccountSyncConfigKey`, `sessionAffinityStore`, and `sessionAffinityWriteVersion` lost their leading tab inside the `OpenAIOAuthPlugin` closure. the variables are still correctly in-scope (verified), but the missing indentation makes them visually appear to be module-level, which is misleading for readers and static-analysis tools. the `RuntimeMetrics` type body (lines ~362–388) has the same regression. looks like a merge-rebase artifact — a quick `prettier` pass before landing would clean this up.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: lib/request/request-resilience.ts
Line: 75-82

Comment:
**success clears full server-burst window across all accounts**

`clearServerBurstCooldown()` is called unconditionally on every successful response. it wipes the entire `serverBurstState` including `windowStartedAt` and the `accountIndices` set. a single in-flight request that lands successfully after three accounts have already triggered the burst threshold resets the detector's memory completely. the burst window is there to protect against a concurrent storm across multiple accounts; one success from any concurrent request can pre-empt a valid cooldown. consider only clearing `cooldownUntil` on success and letting `windowStartedAt` expire naturally, so the burst history is preserved for the dedup window:

```typescript
export function clearServerBurstCooldownOnSuccess(): void {
    if (serverBurstState.cooldownUntil !== null) {
        serverBurstState.cooldownUntil = null;
    }
}
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (7): Last reviewed commit: "test: normalize fetch helpers review nit"

Greptile also left 2 inline comments on this PR.

ndycode and others added 30 commits April 6, 2026 11:30
lib/runtime/metrics.ts is no longer imported anywhere in the runtime and
its dedicated test only exercised that isolated dead code. Remove the
unused module and orphan test to shrink maintenance surface without
changing behavior.
The forecast and report commands duplicated the same refreshed-account
persistence flow and forecast-result serialization logic. That increases
maintenance cost and makes future fixes easy to miss in one path.

Extract the shared pieces into a single helper module and keep both
commands delegating to the same implementation.
parallel-probe used forced casts to distinguish overload inputs and to
normalize probe failures. Replace those with explicit type guards and
error normalization so the module keeps the same behavior without
leaning on unsafe assertions.
Overlapping requests that share a prompt cache session can finish out of
order. Older responses were allowed to overwrite newer session-affinity
state, which lets stale previous_response_id and account affinity win.

Version session-affinity writes per request so the newest in-flight
request for a session keeps control of the stored response id and
preferred account.
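The write-versioning scheme described above can be sketched roughly as follows (VersionedAffinityStore, nextVersion, and rememberWithVersion are illustrative names, not the project's API):

```typescript
// Hypothetical sketch of per-request write versioning: each in-flight
// request captures a monotonically increasing version, and the store
// rejects writes carrying a version older than the last accepted one,
// so a slow older response cannot overwrite newer session state.
type AffinityEntry = { responseId: string; version: number };

class VersionedAffinityStore {
	private entries = new Map<string, AffinityEntry>();
	private writeVersion = 0;

	nextVersion(): number {
		this.writeVersion += 1;
		return this.writeVersion;
	}

	rememberWithVersion(key: string, responseId: string, version: number): boolean {
		const current = this.entries.get(key);
		if (current && current.version > version) return false; // stale write loses
		this.entries.set(key, { responseId, version });
		return true;
	}

	get(key: string): string | undefined {
		return this.entries.get(key)?.responseId;
	}
}
```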
Quota cache saves and update-check cache writes could overlap within the
same process. That risks write races, and the update checker also wrote
directly to the final cache file instead of staging through a temp file.

Serialize in-process writes for both caches and make update-check cache
writes atomic via temp-file rename so concurrent runtime writers do not
corrupt the on-disk state.
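The temp-file staging pattern referenced here looks roughly like this (writeFileAtomic is an illustrative helper, not the project's actual code):

```typescript
import { promises as fs } from "node:fs";
import { dirname, join } from "node:path";
import { randomBytes } from "node:crypto";

// Hypothetical sketch of an atomic cache write: stage the payload in a
// temp file in the same directory, then rename it over the final path,
// so readers never observe a partially written file.
async function writeFileAtomic(path: string, data: string): Promise<void> {
	const tmpPath = join(dirname(path), `.${randomBytes(6).toString("hex")}.tmp`);
	await fs.writeFile(tmpPath, data, { encoding: "utf-8" });
	try {
		// rename is atomic on POSIX; on Windows it requires same-volume paths,
		// which staging in the same directory guarantees
		await fs.rename(tmpPath, path);
	} catch (err) {
		await fs.rm(tmpPath, { force: true }); // clean up the staging file on failure
		throw err;
	}
}
```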
Live account sync reused the same controller once created, even when its
effective debounce/poll configuration changed. That leaves stale runtime
state in place and keeps watching with outdated timing parameters.

Track a live-sync config key and recreate the controller when the
configuration changes, matching the refresh-guardian pattern.
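The config-key pattern described above amounts to something like this (ensureLiveSync and syncConfigKey are illustrative names; the real controller does real watching rather than holding a plain object):

```typescript
// Hypothetical sketch: serialize the effective timing parameters into a
// key and recreate the controller only when the key changes, so stale
// debounce/poll settings do not linger in a reused controller.
type SyncConfig = { debounceMs: number; pollMs: number };

let controller: { config: SyncConfig } | null = null;
let controllerKey: string | null = null;

function syncConfigKey(config: SyncConfig): string {
	return `${config.debounceMs}:${config.pollMs}`;
}

function ensureLiveSync(config: SyncConfig): { config: SyncConfig } {
	const key = syncConfigKey(config);
	if (controller && controllerKey === key) return controller; // unchanged: reuse
	controller = { config }; // changed: tear down and recreate with fresh timing
	controllerKey = key;
	return controller;
}
```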
…ation

When OpenAI returns 404 with usage_limit_reached body, fetch-helpers.ts
remaps it to a 429 response. But the main pipeline was checking the
original response.status (404) instead of errorResponse.status (429),
so the rate-limit handling branch never ran. This caused exhausted
accounts to appear healthy and sessions to be killed instead of
rotating to available accounts.

Change index.ts:1861 from:
  if (response.status === 429)
to:
  if (errorResponse.status === 429 && rateLimit)

The added rateLimit guard also prevents entitlement-style 429s (where
rateLimit is explicitly set to undefined) from incorrectly marking
accounts as rate-limited.

Tests: 2 new tests covering both scenarios, full suite passes (3314 tests).
The 404 error remapper was converting ambiguous responses into 429s
based on free-form body text and generic rate_limit_exceeded codes.
That makes not-found errors look like cooldown signals and feeds weak
rate-limit reasons downstream.

Only remap structured 404 responses that carry explicit usage_limit
codes, while preserving entitlement detection. This keeps quota-limit
signals intact without manufacturing ambiguous 429s from plain text or
generic rate-limit payloads.
The preemptive quota scheduler calculated deferrals from the nearest
reset window. When a later primary or secondary reset was still active,
that shortened the effective cooldown and let requests resume too early.

Use the longest relevant reset window for 429 deferrals and for
near-exhaustion quota deferrals so cooldowns stay aligned with the
latest active quota window.
The 5xx handler rotates accounts and applies failure policy but does
not inform the preemptiveQuotaScheduler about upstream capacity
pressure. This means the scheduler keeps sending requests into an
overloaded backend while the 429 handler already calls
markRateLimited().

Add scheduler notification for overload-specific status codes
(502 Bad Gateway, 503 Service Unavailable, 529 Overloaded) inside
the 5xx handler, mirroring the 429 handler's awareness. Generic 500
errors are excluded since they do not indicate capacity pressure.
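The status-code distinction this commit draws can be captured with a small predicate (OVERLOAD_STATUS_CODES and isCapacityPressure are illustrative names):

```typescript
// Illustrative predicate for the overload-specific codes listed above;
// generic 500s are deliberately excluded because they do not indicate
// upstream capacity pressure and should not throttle the scheduler.
const OVERLOAD_STATUS_CODES = new Set([502, 503, 529]);

function isCapacityPressure(status: number): boolean {
	return OVERLOAD_STATUS_CODES.has(status);
}
```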
Accounts with tracked workspaces could remain eligible for family
selection even after every workspace had been disabled. That leaves a
workspace-exhausted account in the rotation pool and can block fallback
selection unnecessarily.

Treat accounts whose tracked workspaces are all disabled as unavailable
for selection, while keeping legacy workspace-less accounts eligible.
When an upstream perpetually returns short Retry-After values (<= 5s), the short-retry path would loop indefinitely on the same account. Add MAX_SHORT_RETRY_ATTEMPTS (3) so that after 3 consecutive short-cooldown 429s the request falls through to the existing long-cooldown rotation path, which marks the account rate-limited and rotates to the next one.

Changes:

- lib/request/rate-limit-backoff.ts: add MAX_SHORT_RETRY_ATTEMPTS = 3 constant

- index.ts: import MAX_SHORT_RETRY_ATTEMPTS and add attempt < MAX_SHORT_RETRY_ATTEMPTS guard to short-retry condition

- test/index.test.ts: add 2 test cases covering the bound (at-limit rotates, below-limit retries)
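The guard described in the bullets above reduces to roughly this (shouldShortRetry is an illustrative wrapper; the constant values are taken from the commit message):

```typescript
// Hypothetical sketch of the short-retry bound: short Retry-After values
// are retried on the same account, but only up to MAX_SHORT_RETRY_ATTEMPTS
// consecutive times; after that the request falls through to the existing
// long-cooldown rotation path.
const MAX_SHORT_RETRY_ATTEMPTS = 3;
const SHORT_RETRY_THRESHOLD_MS = 5_000;

function shouldShortRetry(retryAfterMs: number, attempt: number): boolean {
	return (
		retryAfterMs <= SHORT_RETRY_THRESHOLD_MS &&
		attempt < MAX_SHORT_RETRY_ATTEMPTS
	);
}
```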
The runtime 429 backoff was purely exponential, so multiple clients or
parallel workers could retry in lockstep and create avoidable bursts.

Add bounded jitter when a new backoff window is created, while keeping
duplicate 429s within the dedup window stable so the same retry window is
reused instead of re-randomized.
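The jitter-plus-dedup behavior described above can be sketched like this (getJitteredBackoff and its defaults are illustrative, not the module's actual API):

```typescript
// Hypothetical sketch: a fresh backoff window gets a randomized delay
// within +/- jitterRatio of the base, but a duplicate 429 arriving inside
// the dedup window reuses the stored window instead of re-randomizing.
type BackoffWindow = { delayMs: number; createdAt: number };

const windows = new Map<string, BackoffWindow>();

function getJitteredBackoff(
	key: string,
	baseDelayMs: number,
	now: number,
	dedupWindowMs = 10_000,
	jitterRatio = 0.2,
): number {
	const existing = windows.get(key);
	if (existing && now - existing.createdAt < dedupWindowMs) {
		return existing.delayMs; // duplicate 429: keep the same retry window
	}
	const jitter = (Math.random() * 2 - 1) * jitterRatio * baseDelayMs;
	const delayMs = Math.max(0, Math.round(baseDelayMs + jitter));
	windows.set(key, { delayMs, createdAt: now });
	return delayMs;
}
```

Keeping duplicates stable matters because re-randomizing on every duplicate 429 would defeat the dedup window and let clients drift back into lockstep.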
Several 429 retry and backoff thresholds were still hardcoded in the
rate-limit backoff module, unlike other runtime retry controls that
already respect plugin config and environment overrides.

Expose those constants through plugin config/env getters and apply them
at runtime so operators can tune dedup windows, reset windows, max
backoff, and short-retry behavior without patching code.
The update checker silently swallowed invalid package metadata and cached
JSON parse failures. Keep the same fallback behavior, but log those cases
at debug level so malformed local state is observable during diagnosis.
Persisted account rate-limit state and persisted quota-cache views were
tracked separately, so the CLI menu could lose rate-limit context between
live probes. Merge stored account rate-limit resets into the display-side
quota view so persisted 429 pressure stays visible even when the cache is
otherwise empty.
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
lib/proactive-refresh.ts (1)

95-137: ⚠️ Potential issue | 🟠 Major

stop logging raw account emails from the background refresher.

lib/proactive-refresh.ts:111-136 emits account.email on refresh start, success, and failure. these logs are persistent operational output for a background task, so this adds avoidable pii retention; log account.index only, or at least sanitizeEmail(account.email).

suggested fix
 	const timeUntilExpiry = getTimeUntilExpiry(account);
+	const safeEmail = sanitizeEmail(account.email);
 	log.info("Proactively refreshing token", {
 		accountIndex: account.index,
-		email: account.email,
+		email: safeEmail,
 		expiresInMs: timeUntilExpiry,
 		expiresInMinutes: Math.round(timeUntilExpiry / 60000),
 	});
@@
 	if (result.type === "success") {
 		log.info("Proactive refresh succeeded", {
 			accountIndex: account.index,
-			email: account.email,
+			email: safeEmail,
 		});
@@
 	log.warn("Proactive refresh failed", {
 		accountIndex: account.index,
-		email: account.email,
+		email: safeEmail,
 		failureReason: result.reason,
 	});
As per coding guidelines, `lib/**`: check for logging that leaks tokens or emails.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/proactive-refresh.ts` around lines 95 - 137, In proactiveRefreshAccount,
remove raw account.email from persistent logs (the log.info at the start, the
log.info on success, and the log.warn on failure) and instead log only
account.index or a sanitized email; update the three logging calls inside
proactiveRefreshAccount to either omit the email field or replace it with
sanitizeEmail(account.email) if that helper exists/import it (or add one) so no
raw PII is written to logs while retaining accountIndex for observability.
lib/codex-manager/commands/report.ts (1)

354-365: ⚠️ Potential issue | 🟠 Major

email/accountid-only refresh updates never persist here.

lib/codex-manager/commands/report.ts:375 compares previousEmail to account.email and previousAccountId to account.accountId before applyRefreshedAccountPatch() runs. those fields are still the old values, so a refresh that only updates email or account id skips both persistence and the in-memory patch. compare against refreshPatch.email / refreshPatch.accountId instead, and add a vitest in test/codex-manager-report-command.test.ts for unchanged tokens with a new account id.

suggested fix
-				if (
-					previousRefreshToken !== refreshPatch.refreshToken ||
-					previousAccessToken !== refreshPatch.accessToken ||
-					previousExpiresAt !== refreshPatch.expiresAt ||
-					previousEmail !== account.email ||
-					previousAccountId !== account.accountId
-				) {
+				if (
+					previousRefreshToken !== refreshPatch.refreshToken ||
+					previousAccessToken !== refreshPatch.accessToken ||
+					previousExpiresAt !== refreshPatch.expiresAt ||
+					(refreshPatch.email !== undefined &&
+						previousEmail !== refreshPatch.email) ||
+					(refreshPatch.accountId !== undefined &&
+						previousAccountId !== refreshPatch.accountId)
+				) {
As per coding guidelines, `lib/**`: focus on auth rotation, windows filesystem IO, and concurrency.

Also applies to: 394-398

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager/commands/report.ts` around lines 354 - 365, The comparisons
that gate persistence use the old in-memory account values (previousEmail vs
account.email and previousAccountId vs account.accountId) so email- or
accountId-only refreshes never persist; change the checks around where
refreshPatch is built to compare previousEmail to refreshPatch.email and
previousAccountId to refreshPatch.accountId (also update the similar checks at
the other occurrence around lines ~394-398) before calling
applyRefreshedAccountPatch(), ensuring you still set refreshPatch fields as
currently done; then add a vitest in test/codex-manager-report-command.test.ts
that simulates an unchanged token but a new accountId to assert the patch
persists.
index.ts (1)

2056-2076: ⚠️ Potential issue | 🟠 Major

stream-failover 429s still bypass the new identity-based backoff key.

the primary 429 path at index.ts:2060 passes a stable account identity into getRateLimitBackoff(), but the stream-failover 429 branch at index.ts:2397 still omits that fourth argument. failover probes will maintain a separate backoff bucket and can re-hit 429s the primary path would have deduped. pass fallbackAccountId or the resolved email there too, and cover it in test/index-retry.test.ts.

suggested fix
-																		const { delayMs } = getRateLimitBackoff(
-																			fallbackAccount.index,
-																			fallbackQuotaKey,
-																			retryAfterMs,
-																		);
+																		const { delayMs } = getRateLimitBackoff(
+																			fallbackAccount.index,
+																			fallbackQuotaKey,
+																			retryAfterMs,
+																			fallbackAccountId ??
+																				fallbackResolvedEmail ??
+																				null,
+																		);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@index.ts` around lines 2056 - 2076, The stream-failover 429 branch calls
getRateLimitBackoff without the stable account identity, so its backoff bucket
is separate; update the stream-failover 429 handling (the branch near the
current secondary failover logic) to pass the same fourth argument used in the
primary path—account.accountId ?? account.email ?? fallbackAccountId (or the
resolved email)—into getRateLimitBackoff, ensure
preemptiveQuotaScheduler.markRateLimited uses the resulting cooldown
consistently, and add/adjust test/index-retry.test.ts to assert that failover
probes increment the same backoff bucket (i.e., that a follow-up 429 is deduped
by the identity-based backoff).
♻️ Duplicate comments (3)
lib/codex-manager.ts (1)

650-683: ⚠️ Potential issue | 🟠 Major

the menu now displays persisted 429s, but the auto-probe gate still ignores them.

lib/codex-manager.ts:1102-1108 switched the dashboard view over to getPersistedQuotaViewForAccount(), but resolveMenuQuotaProbeInput() still gates probes off getQuotaCacheEntryForAccount() at lib/codex-manager.ts:783-795. with a stale or empty cache entry and a future persisted rateLimitResetTimes.codex, the ui shows rate-limited while the background menu refresh keeps probing the same account and generating avoidable 429s. please reuse the persisted-reset predicate in both paths and add the stale-cache + persisted-reset vitest.

As per coding guidelines, lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

Also applies to: 1102-1108

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager.ts` around lines 650 - 683, The menu probe path still uses
getQuotaCacheEntryForAccount to decide whether to skip probes, which ignores
persisted rateLimitResetTimes.codex added by getPersistedQuotaViewForAccount;
update resolveMenuQuotaProbeInput to reuse the same persisted-reset predicate
(calling getRateLimitResetTimeForFamily and comparing against cached
primary/secondary resetAtMs or using getPersistedQuotaViewForAccount) so probes
are suppressed when a persisted 429 exists even if the in-memory cache is stale,
and add a vitest that simulates a stale/empty cache with a future persisted
rateLimitResetTimes.codex to assert the UI shows rate-limited and the probe is
skipped (include tests covering EBUSY/429 retry behavior per guidelines).
index.ts (1)

937-938: 🧹 Nitpick | 🔵 Trivial

split the session-affinity version bump out of the const binding.

index.ts:937 still trips biome's noAssignInExpressions, and the side effect is harder to scan in a hot path. increment first, then bind the value.

suggested fix
-								const sessionAffinityVersion =
-									(sessionAffinityWriteVersion += 1);
+								sessionAffinityWriteVersion += 1;
+								const sessionAffinityVersion =
+									sessionAffinityWriteVersion;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@index.ts` around lines 937 - 938, The current expression const
sessionAffinityVersion = (sessionAffinityWriteVersion += 1); performs an
assignment inside a const binding and trips lint rules; change it to increment
the counter first and then bind the value: perform sessionAffinityWriteVersion
+= 1 (or ++sessionAffinityWriteVersion) on its own line, then set const
sessionAffinityVersion = sessionAffinityWriteVersion; this removes the
assignment-in-expression and makes the side effect in
sessionAffinityWriteVersion explicit.
lib/codex-manager/commands/report.ts (1)

326-329: ⚠️ Potential issue | 🟠 Major

probe notes still leak account emails.

formatAccountLabel() is still used when pushing probeErrors at lib/codex-manager/commands/report.ts:328, and the same pattern remains at lib/codex-manager/commands/report.ts:391, lib/codex-manager/commands/report.ts:403, and lib/codex-manager/commands/report.ts:428. those strings are emitted in text mode and in the json report, so full email addresses still leave the process. switch these paths to a redacted label helper and assert the redaction in test/codex-manager-report-command.test.ts.

As per coding guidelines, lib/**: check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager/commands/report.ts` around lines 326 - 329, Replace any use
of formatAccountLabel when building probeErrors entries with the redacted
account-label helper (e.g., formatAccountLabelRedacted or
getRedactedAccountLabel) so that emitted probeErrors strings do not contain full
email addresses; update the probeErrors.push sites (the ones currently calling
formatAccountLabel) to call the redacting helper instead and ensure the message
text remains the same aside from using the redacted label; finally, update
test/codex-manager-report-command.test.ts to assert that the produced text and
JSON report values contain the redacted label (no raw email) for those probe
error cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@index.ts`:
- Around line 2510-2522: The current code still calls
sessionAffinityStore.remember(...) in two places (around where remember() is
used and at the blocks that currently call remember()/updateLastResponseId),
which bypasses the versioned ordering in lib/session-affinity.ts; replace those
remember(...) calls with
sessionAffinityStore.rememberWithVersion(sessionAffinityKey, ...,
sessionAffinityVersion) and ensure updateLastResponseId continues to be called
with sessionAffinityVersion (use the existing
updateLastResponseId(sessionAffinityKey, responseId, Date.now(),
sessionAffinityVersion) pattern), and add a vitest in
test/session-affinity.test.ts that races two non-continuation responses to
assert the mapping respects the newer write (i.e., the versioned write wins) to
prevent older completion from overwriting newer mappings.
- Around line 1983-2007: The server-burst recorder is never invoked for real
5xxs; add a call to recordServerBurstFailure(...) in the same branch that
handles overload-type server errors (where response.status is 502/503/529 and
policy.cooldownMs is a number) before any retry/rotation decision so the burst
circuit can open (this is the place alongside evaluateFailurePolicy(...) and
preemptiveQuotaScheduler.markRateLimited(...)); then add a Vitest in
test/index-retry.test.ts that drives three cross-account 5xx responses and
asserts that the next request fast-fails with a 503 (verifying
getServerBurstCooldownRemaining() and the burst fast-fail behavior).

In `@lib/codex-manager/commands/report.ts`:
- Around line 163-199: The flag-parsing code for --max-accounts and --max-probes
(the blocks that set options.maxAccounts and options.maxProbes) currently uses
Number.parseInt which silently truncates malformed numeric strings; change the
validation to explicitly reject non-integer strings before parsing (e.g. require
the raw token or the part after "--max-accounts=" / "--max-probes=" to match a
whole-digit regex like /^\d+$/ or otherwise verify Number(value) is an integer
and that the string does not include exponent/decimal/alpha chars), then parse
and assign only when the check passes; also add unit tests in
test/codex-manager-report-command.test.ts covering "1.9", "1e3", "2foo" and
similar cases to assert the command returns an error for those inputs.

In `@lib/request/request-resilience.ts`:
- Around line 48-60: The reset logic for serverBurstState currently clears the
entire state based only on windowStartedAt which can prematurely clear an active
cooldown; change the reset in request-resilience.ts so you only reinitialize
serverBurstState if there is no armed cooldown or the cooldown has expired
(i.e., check serverBurstState.cooldownUntil is null or <= now before resetting),
preserve cooldownUntil when it is still in the future, and continue to add
accountIndex and arm cooldown as before (symbols: serverBurstState,
SERVER_BURST_COOLDOWN_MS, SERVER_BURST_THRESHOLD, accountIndex). Also add a
vitest in test/request-resilience.test.ts that simulates failures at t=0,5,9,12s
to assert the cooldown remains armed until expiry and is not cleared by later
window resets.
- Around line 27-30: The armPoolExhaustionCooldown function currently overwrites
poolExhaustionCooldownUntil with now + bounded, which can shorten an active
cooldown; change armPoolExhaustionCooldown to compute the newExpiry = now +
bounded and set poolExhaustionCooldownUntil =
Math.max(poolExhaustionCooldownUntil, newExpiry) (refer to function
armPoolExhaustionCooldown and the variable poolExhaustionCooldownUntil), and add
a vitest in test/request-resilience.test.ts that sets an initial longer
cooldown, then calls armPoolExhaustionCooldown with a shorter wait while the
cooldown is active and asserts the expiry remains the original (monotonic)
value.
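The cooldown-preserving window reset from the first bullet can be sketched with a simplified state shape; the constants and field names here are assumptions modeled on the symbols named above, not the actual request-resilience.ts implementation:

```typescript
// Assumed window/cooldown bounds for the sketch.
const SERVER_BURST_WINDOW_MS = 10_000;
const SERVER_BURST_COOLDOWN_MS = 30_000;
const SERVER_BURST_THRESHOLD = 3;

interface ServerBurstState {
	windowStartedAt: number;
	accountIndexes: Set<number>;
	cooldownUntil: number | null;
}

const serverBurstState: ServerBurstState = {
	windowStartedAt: 0,
	accountIndexes: new Set(),
	cooldownUntil: null,
};

function recordServerFailure(accountIndex: number, now: number): void {
	const windowExpired =
		now - serverBurstState.windowStartedAt > SERVER_BURST_WINDOW_MS;
	const cooldownActive =
		serverBurstState.cooldownUntil !== null &&
		serverBurstState.cooldownUntil > now;
	// Only reinitialize when no cooldown is armed or it has expired;
	// an in-flight cooldown must survive later window rollovers.
	if (windowExpired && !cooldownActive) {
		serverBurstState.windowStartedAt = now;
		serverBurstState.accountIndexes.clear();
		serverBurstState.cooldownUntil = null;
	}
	serverBurstState.accountIndexes.add(accountIndex);
	if (serverBurstState.accountIndexes.size >= SERVER_BURST_THRESHOLD) {
		serverBurstState.cooldownUntil = now + SERVER_BURST_COOLDOWN_MS;
	}
}
```

With failures at t=0, 5s, and 9s across three accounts, the cooldown arms at 9s; a fourth failure at 12s rolls the window but no longer clears the armed cooldown.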
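The monotonic-expiry fix from the second bullet reduces to a `Math.max` over the stored deadline. Constant names and the clamp bound below are assumptions for the sketch:

```typescript
// Hypothetical upper bound on how long a pool-exhaustion cooldown may run.
const MAX_POOL_COOLDOWN_MS = 60_000;
let poolExhaustionCooldownUntil = 0;

function armPoolExhaustionCooldown(waitMs: number, now: number): number {
	const bounded = Math.min(Math.max(waitMs, 0), MAX_POOL_COOLDOWN_MS);
	const newExpiry = now + bounded;
	// Math.max keeps an already-armed longer cooldown from being
	// shortened by a later call with a smaller wait.
	poolExhaustionCooldownUntil = Math.max(poolExhaustionCooldownUntil, newExpiry);
	return poolExhaustionCooldownUntil;
}
```

Returning the effective expiry also lets callers report the real cooldown instead of the requested one.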

In `@README.md`:
- Around line 22-23: The README storage table is missing the new on-disk
artifact created by lib/runtime/runtime-observability.ts which persists
runtime-observability.json; update the storage table entries (including the
other referenced section near the earlier block) to list
runtime-observability.json, its approximate path, and a short description (e.g.,
"runtime observability metadata for request budget/cooldown/traffic") so
auditors and cleanup scripts know to handle this file; search for the
runtime-observability.json symbol in lib/runtime/runtime-observability.ts and
add a corresponding row to the README storage table around the two
referenced sections.

---

Outside diff comments:
In `@index.ts`:
- Around line 2056-2076: The stream-failover 429 branch calls
getRateLimitBackoff without the stable account identity, so its backoff bucket
is separate; update the stream-failover 429 handling (the branch near the
current secondary failover logic) to pass the same fourth argument used in the
primary path—account.accountId ?? account.email ?? fallbackAccountId (or the
resolved email)—into getRateLimitBackoff, ensure
preemptiveQuotaScheduler.markRateLimited uses the resulting cooldown
consistently, and add/adjust test/index-retry.test.ts to assert that failover
probes increment the same backoff bucket (i.e., that a follow-up 429 is deduped
by the identity-based backoff).

In `@lib/codex-manager/commands/report.ts`:
- Around line 354-365: The comparisons that gate persistence use the old
in-memory account values (previousEmail vs account.email and previousAccountId
vs account.accountId) so email- or accountId-only refreshes never persist;
change the checks around where refreshPatch is built to compare previousEmail to
refreshPatch.email and previousAccountId to refreshPatch.accountId (also update
the similar checks at the other occurrence around lines ~394-398) before calling
applyRefreshedAccountPatch(), ensuring you still set refreshPatch fields as
currently done; then add a vitest in test/codex-manager-report-command.test.ts
that simulates an unchanged token but a new accountId to assert the patch
persists.

In `@lib/proactive-refresh.ts`:
- Around line 95-137: In proactiveRefreshAccount, remove raw account.email from
persistent logs (the log.info at the start, the log.info on success, and the
log.warn on failure) and instead log only account.index or a sanitized email;
update the three logging calls inside proactiveRefreshAccount to either omit the
email field or replace it with sanitizeEmail(account.email) if that helper
exists/import it (or add one) so no raw PII is written to logs while retaining
accountIndex for observability.
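If the `sanitizeEmail` helper does not exist yet, one possible shape keeps the first character of the local part and the domain while masking the rest; this is a sketch, not the plugin's actual redaction policy:

```typescript
// Mask the local part of an email for log output, e.g.
// "alice@example.com" -> "a***@example.com".
function sanitizeEmail(email: string | undefined): string | undefined {
	if (!email) return email;
	const at = email.indexOf("@");
	// Degenerate values ("@example.com", "no-at-sign") are fully masked.
	if (at <= 0) return "***";
	return `${email[0]}***@${email.slice(at + 1)}`;
}
```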

---

Duplicate comments:
In `@index.ts`:
- Around line 937-938: The current expression const sessionAffinityVersion =
(sessionAffinityWriteVersion += 1); performs an assignment inside a const
binding and trips lint rules; change it to increment the counter first and then
bind the value: perform sessionAffinityWriteVersion += 1 (or
++sessionAffinityWriteVersion) on its own line, then set const
sessionAffinityVersion = sessionAffinityWriteVersion; this removes the
assignment-in-expression and makes the side effect in
sessionAffinityWriteVersion explicit.
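The fix is mechanical: perform the increment as its own statement, then bind the value.

```typescript
// Before (trips lint/suspicious/noAssignInExpressions):
// const sessionAffinityVersion = (sessionAffinityWriteVersion += 1);

// After: the side effect is an explicit statement, then a plain binding.
let sessionAffinityWriteVersion = 0;
sessionAffinityWriteVersion += 1;
const sessionAffinityVersion = sessionAffinityWriteVersion;
```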

In `@lib/codex-manager.ts`:
- Around line 650-683: The menu probe path still uses
getQuotaCacheEntryForAccount to decide whether to skip probes, which ignores
persisted rateLimitResetTimes.codex added by getPersistedQuotaViewForAccount;
update resolveMenuQuotaProbeInput to reuse the same persisted-reset predicate
(calling getRateLimitResetTimeForFamily and comparing against cached
primary/secondary resetAtMs or using getPersistedQuotaViewForAccount) so probes
are suppressed when a persisted 429 exists even if the in-memory cache is stale,
and add a vitest that simulates a stale/empty cache with a future persisted
rateLimitResetTimes.codex to assert the UI shows rate-limited and the probe is
skipped (include tests covering EBUSY/429 retry behavior per guidelines).

In `@lib/codex-manager/commands/report.ts`:
- Around line 326-329: Replace any use of formatAccountLabel when building
probeErrors entries with the redacted account-label helper (e.g.,
formatAccountLabelRedacted or getRedactedAccountLabel) so that emitted
probeErrors strings do not contain full email addresses; update the
probeErrors.push sites (the ones currently calling formatAccountLabel) to call
the redacting helper instead and ensure the message text remains the same aside
from using the redacted label; finally, update
test/codex-manager-report-command.test.ts to assert that the produced text and
JSON report values contain the redacted label (no raw email) for those probe
error cases.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 104f094b-7f8f-4957-84d9-2bbabf7a198e

📥 Commits

Reviewing files that changed from the base of the PR and between 5cf1472 and 372da70.

📒 Files selected for processing (12)
  • README.md
  • index.ts
  • lib/codex-manager.ts
  • lib/codex-manager/commands/report.ts
  • lib/codex-manager/commands/status.ts
  • lib/proactive-refresh.ts
  • lib/quota-probe.ts
  • lib/request/request-resilience.ts
  • lib/runtime/runtime-observability.ts
  • test/index-retry.test.ts
  • test/request-resilience.test.ts
  • test/runtime-observability.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/proactive-refresh.ts
  • lib/quota-probe.ts
  • lib/codex-manager/commands/status.ts
  • lib/codex-manager.ts
  • lib/request/request-resilience.ts
  • lib/runtime/runtime-observability.ts
  • lib/codex-manager/commands/report.ts
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/runtime-observability.test.ts
  • test/index-retry.test.ts
  • test/request-resilience.test.ts
🪛 Biome (2.4.10)
index.ts

[error] 938-938: The assignment should not be in an expression.

(lint/suspicious/noAssignInExpressions)

🔇 Additional comments (2)
test/request-resilience.test.ts (1)

13-55: good deterministic coverage for the new cooldown helpers.

test/request-resilience.test.ts:13-55 pins timestamps explicitly and resets shared state, so these cases should stay stable without real timers while exercising the pool cooldown, server burst cooldown, and candidate ordering paths.

As per coding guidelines, test/**: tests must stay deterministic and use vitest.

test/index-retry.test.ts (1)

444-661: good regression coverage for the bounded retry budget.

test/index-retry.test.ts:444-661 covers the no-replay default and the bounded-attempt paths with fixed inputs, and it asserts the operator-facing log context instead of only the final status code.

As per coding guidelines, test/**: tests must stay deterministic and use vitest.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (4)
test/auto-update-checker.test.ts (2)

502-526: ⚠️ Potential issue | 🟠 Major

prove serialized order and final cache integrity.

test/auto-update-checker.test.ts:502-526 still passes if the two writes interleave or the last rename writes stale json. capture the writeFileSync/renameSync event order and parse the payload from the last renamed temp file so this test actually proves the queue in lib/auto-update-checker.ts:33-77 serialized the writes and left "5.0.1" in the cache.

proposed test shape
 		it("serializes concurrent cache writes through temp-file renames", async () => {
 			vi.mocked(fs.writeFileSync).mockClear();
 			vi.mocked(fs.renameSync).mockClear();
+			const events: string[] = [];
+			const payloadByTempPath = new Map<string, string>();
+			vi.mocked(fs.writeFileSync).mockImplementation((path, data) => {
+				const tempPath = String(path);
+				events.push(`write:${tempPath}`);
+				payloadByTempPath.set(tempPath, String(data));
+			});
+			vi.mocked(fs.renameSync).mockImplementation((from, to) => {
+				events.push(`rename:${String(from)}->${String(to)}`);
+			});
 			vi.mocked(globalThis.fetch)
 				.mockResolvedValueOnce({
 					ok: true,
@@ -515,11 +525,17 @@
 			await Promise.all([checkForUpdates(true), checkForUpdates(true)]);

 			expect(fs.renameSync).toHaveBeenCalledTimes(2);
+			expect(events.map((event) => event.split(":")[0])).toEqual([
+				"write",
+				"rename",
+				"write",
+				"rename",
+			]);
+			const finalTempPath = String(vi.mocked(fs.renameSync).mock.calls.at(-1)?.[0]);
+			expect(JSON.parse(payloadByTempPath.get(finalTempPath) ?? "null")).toMatchObject({
+				latestVersion: "5.0.1",
+			});
 			const writeTargets = vi
 				.mocked(fs.writeFileSync)
 				.mock.calls.map((call) => String(call[0]));
 			expect(
 				writeTargets.every((target) =>
 					target.includes("update-check-cache.json."),

As per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/auto-update-checker.test.ts` around lines 502 - 526, The test currently
only checks calls but not ordering or final cache contents; update the test
around checkForUpdates to record the sequence of fs.writeFileSync and
fs.renameSync calls (e.g., by inspecting vi.mocked(fs.writeFileSync).mock.calls
and vi.mocked(fs.renameSync).mock.calls), determine which temp file name was the
last one passed to renameSync, read/parse the JSON payload that was written to
that temp filename (from the corresponding writeFileSync call) and assert it
contains version "5.0.1"; ensure you still assert two renames occurred and that
the last rename's payload is the authoritative final cache content to prove
lib/auto-update-checker.ts (checkForUpdates) serialized writes correctly.

477-500: ⚠️ Potential issue | 🟠 Major

cover the exhausted rename path.

test/auto-update-checker.test.ts:477-500 only proves lib/auto-update-checker.ts:51-64 recovers from transient EBUSY/EPERM. it never drives the permanent-failure cleanup at lib/auto-update-checker.ts:68-75, so temp-file leaks on windows can regress unnoticed. add a vitest case that keeps renameSync failing and asserts unlinkSync and logger.warn.

proposed test shape
+		it.each(["EBUSY", "EPERM"] as const)(
+			"cleans up the temp file when rename exhausts retries (%s)",
+			async (code) => {
+				vi.mocked(fs.renameSync).mockImplementation(() => {
+					const error = new Error("busy") as NodeJS.ErrnoException;
+					error.code = code;
+					throw error;
+				});
+				vi.mocked(globalThis.fetch).mockResolvedValue({
+					ok: true,
+					json: async () => ({ version: "5.0.0" }),
+				} as Response);
+
+				await checkForUpdates(true);
+
+				expect(fs.unlinkSync).toHaveBeenCalledWith(
+					expect.stringContaining("update-check-cache.json."),
+				);
+				expect(logger.warn).toHaveBeenCalledWith(
+					"Failed to save update cache",
+					expect.objectContaining({ error: expect.any(String) }),
+				);
+			},
+		);

As per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/auto-update-checker.test.ts` around lines 477 - 500, Add a deterministic
Vitest case in test/auto-update-checker.test.ts that simulates persistent rename
failures to exercise the permanent-failure branch in checkForUpdates: mock
fs.renameSync to always throw a NodeJS.ErrnoException with code "EBUSY" or
"EPERM", ensure globalThis.fetch returns an ok response, call
checkForUpdates(true), and then assert that fs.unlinkSync was called to clean up
the temp file and that the module logger.warn (or the logger used in
lib/auto-update-checker.ts) was invoked with a warning about the failure; keep
the test isolated and use vi.mocked to clear/restore mocks.
lib/auto-update-checker.ts (1)

289-299: ⚠️ Potential issue | 🟠 Major

make clearUpdateCache() awaitable.

lib/auto-update-checker.ts:289 returns before the queued clear runs. a caller that does clearUpdateCache(); await checkForUpdates(); can still read the stale cache at lib/auto-update-checker.ts:236 in the same tick, and test/auto-update-checker.test.ts:630-647 is already compensating with manual microtask flushing. return the queued promise here and add a vitest regression for the immediate clear-then-check flow.

proposed fix
-export function clearUpdateCache(): void {
-	void enqueueUpdateCacheWrite(() => {
+export function clearUpdateCache(): Promise<void> {
+	return enqueueUpdateCacheWrite(() => {
 		try {
 			if (existsSync(CACHE_FILE)) {
 				writeCacheContents("{}");
 			}
 		} catch {
 			// Ignore errors
 		}
 	});
 }

As per coding guidelines, lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/auto-update-checker.ts` around lines 289 - 299, clearUpdateCache
currently returns before the queued clear runs because it fires-and-forgets
enqueueUpdateCacheWrite; change clearUpdateCache to return the Promise from
enqueueUpdateCacheWrite (i.e., make it async or return Promise<void>) so callers
can await the cache-clear completing, update any callers/tests accordingly, and
add a Vitest regression that does await clearUpdateCache(); await
checkForUpdates(); to assert the cache is empty immediately. Locate the call
site using the function name clearUpdateCache and the queue function
enqueueUpdateCacheWrite, ensure the enqueue implementation still handles
EBUSY/429 retry semantics per existing queue behavior, and update tests
referenced in test/auto-update-checker.test.ts to cover the immediate
clear-then-check flow.
lib/codex-manager.ts (1)

650-683: ⚠️ Potential issue | 🟠 Major

the synthesized 429 view still treats stale state as live.

lib/codex-manager.ts:650-683 returns cached 429 entries even after both cached reset times have passed, and when it synthesizes a persisted 429 it keeps the stale updatedAt. that leaks into lib/codex-manager.ts:959 as a false rate-limited status and into lib/codex-manager.ts:783-795 as repeat probes every ttl window while rateLimitResetTimes.codex is still in the future. gate the synthetic view on a future reset timestamp, not just status === 429, and add vitest regressions for expired cached 429 plus stale-cache + future-persisted reset.

As per coding guidelines, lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/config-explain.test.ts`:
- Around line 131-146: Update the vitest in test/config-explain.test.ts that
currently checks only "retryAllAccountsMaxRetries": call
getPluginConfigExplainReport() (using the existing helper expectEntry) and
assert that the four new explain entries added in lib/config.ts (the same keys
you added) all exist and have the expected value and defaultValue; then
JSON.stringify/JSON.parse the whole report and assert those same keys and their
value/defaultValue survive the round-trip (use serialized.entries.find by key);
finally add a sub-test that sets the corresponding env vars in process.env,
re-imports or re-calls getPluginConfigExplainReport(), and asserts each of the
four entries shows source: "env" (clean up env after); keep using vitest APIs
(no mocking of real secrets) and deterministic setup/teardown.

In `@test/fetch-helpers.test.ts`:
- Around line 497-509: Add a new unit test that verifies mixed entitlement vs
rate-limit signals are handled with entitlement precedence: create a response
body where error.code is 'usage_not_included' and error.type is
'usage_limit_reached', wrap it in a Response with status 404, call
handleErrorResponse, and assert the mapped response status is 403 and that
rateLimit is undefined; place the test alongside the existing tests in
fetch-helpers.test.ts near the other error-mapping cases so it documents that
entitlement detection in handleErrorResponse wins over rate-limit detection.
- Around line 1030-1054: Add a regression test for handleErrorResponse that
verifies a 404 response with malformed JSON body does not get remapped or throw;
create a new test similar to the existing cases (referencing handleErrorResponse
in the test file) that constructs a Response('{ invalid json', { status: 404 }),
calls await handleErrorResponse(response), and asserts result.status is 404 and
rateLimit is undefined so the try/catch path that falls through to null is
exercised.
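The fall-through pattern these tests exercise can be sketched in isolation. `parseErrorSignals` and `mapStatus` are hypothetical stand-ins for the body parsing inside `handleErrorResponse`, written under the assumption that parse failures yield null signals so a malformed body keeps its original status:

```typescript
interface ErrorSignals {
	code: string | null;
	type: string | null;
}

function parseErrorSignals(body: string): ErrorSignals | null {
	try {
		const parsed = JSON.parse(body) as {
			error?: { code?: unknown; type?: unknown };
		};
		return {
			code: typeof parsed.error?.code === "string" ? parsed.error.code : null,
			type: typeof parsed.error?.type === "string" ? parsed.error.type : null,
		};
	} catch {
		// Malformed JSON falls through to null: no remap, no throw.
		return null;
	}
}

function mapStatus(status: number, body: string): number {
	const signals = parseErrorSignals(body);
	const hasUsageLimitSignal =
		signals?.code?.includes("usage_limit") === true ||
		signals?.type?.includes("usage_limit") === true;
	if (status === 404 && hasUsageLimitSignal) {
		return 429; // quota signal disguised as a 404
	}
	return status;
}
```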

---

Duplicate comments:
In `@lib/auto-update-checker.ts`:
- Around line 289-299: clearUpdateCache currently returns before the queued
clear runs because it fires-and-forgets enqueueUpdateCacheWrite; change
clearUpdateCache to return the Promise from enqueueUpdateCacheWrite (i.e., make
it async or return Promise<void>) so callers can await the cache-clear
completing, update any callers/tests accordingly, and add a Vitest regression
that does await clearUpdateCache(); await checkForUpdates(); to assert the cache
is empty immediately. Locate the call site using the function name
clearUpdateCache and the queue function enqueueUpdateCacheWrite, ensure the
enqueue implementation still handles EBUSY/429 retry semantics per existing
queue behavior, and update tests referenced in test/auto-update-checker.test.ts
to cover the immediate clear-then-check flow.

In `@test/auto-update-checker.test.ts`:
- Around line 502-526: The test currently only checks calls but not ordering or
final cache contents; update the test around checkForUpdates to record the
sequence of fs.writeFileSync and fs.renameSync calls (e.g., by inspecting
vi.mocked(fs.writeFileSync).mock.calls and vi.mocked(fs.renameSync).mock.calls),
determine which temp file name was the last one passed to renameSync, read/parse
the JSON payload that was written to that temp filename (from the corresponding
writeFileSync call) and assert it contains version "5.0.1"; ensure you still
assert two renames occurred and that the last rename's payload is the
authoritative final cache content to prove lib/auto-update-checker.ts
(checkForUpdates) serialized writes correctly.
- Around line 477-500: Add a deterministic Vitest case in
test/auto-update-checker.test.ts that simulates persistent rename failures to
exercise the permanent-failure branch in checkForUpdates: mock fs.renameSync to
always throw a NodeJS.ErrnoException with code "EBUSY" or "EPERM", ensure
globalThis.fetch returns an ok response, call checkForUpdates(true), and then
assert that fs.unlinkSync was called to clean up the temp file and that the
module logger.warn (or the logger used in lib/auto-update-checker.ts) was
invoked with a warning about the failure; keep the test isolated and use
vi.mocked to clear/restore mocks.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 34fec1e4-2d2d-4ddc-826c-41ab1d49a025

📥 Commits

Reviewing files that changed from the base of the PR and between 372da70 and 9a3c5b2.

📒 Files selected for processing (8)
  • lib/auto-update-checker.ts
  • lib/codex-manager.ts
  • lib/config.ts
  • lib/request/fetch-helpers.ts
  • test/auto-update-checker.test.ts
  • test/codex-manager-cli.test.ts
  • test/config-explain.test.ts
  • test/fetch-helpers.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/config-explain.test.ts
  • test/fetch-helpers.test.ts
  • test/auto-update-checker.test.ts
  • test/codex-manager-cli.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/request/fetch-helpers.ts
  • lib/config.ts
  • lib/auto-update-checker.ts
  • lib/codex-manager.ts
🔇 Additional comments (4)
lib/request/fetch-helpers.ts (1)

980-1024: lgtm, the signal-based 404 remapping logic is sound.

the tightening from free-text regex to structured error.code/error.type signals is a good hardening change. the .includes("usage_limit") at lib/request/fetch-helpers.ts:1012 correctly matches both "usage_limit" and "usage_limit_reached" variants.

one minor observation: the function now silently drops numeric error.code values by coercing them via .toString() at line 990-991 — this is fine since the upstream api uses string codes, but worth noting if that ever changes.

test/fetch-helpers.test.ts (2)

511-526: good coverage for entitlement detection via error.type.

this test at test/fetch-helpers.test.ts:511-526 correctly verifies that usage_not_included in error.type triggers the 403 entitlement path and suppresses rate-limit info. this is a key regression case for the signal-based parsing.


467-495: lgtm, the signal extraction tests are well-structured.

tests at test/fetch-helpers.test.ts:467-495 properly cover:

  • error.code carrying the usage_limit signal (line 467)
  • error.type carrying the usage_limit signal (line 482)

both verify the 404→429 remap and retryAfterMs > 0. deterministic and no mocking of real secrets.

test/codex-manager-cli.test.ts (1)

7155-7200: good regression coverage for persisted-only fallback

this looks correct now. test/codex-manager-cli.test.ts:7199 explicitly proves no live probe happens, so the fallback to persisted rate-limit state stays enforced.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/runtime-services.test.ts (1)

72-110: ⚠️ Potential issue | 🟡 Minor

add eperm variant test to cover windows filesystem lock edge case.

lib/runtime/runtime-services.ts:79 explicitly handles both EBUSY and EPERM with identical retry logic, but test/runtime-services.test.ts only covers EBUSY exhaustion at lines 72-110. add a parallel test case with error.code = "EPERM" to ensure antivirus/indexing file locks on windows are handled the same way—this is a required regression case for windows behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/runtime-services.test.ts` around lines 72 - 110, Add a parallel test to
the existing "warns and keeps the previous path when busy retries are exhausted"
spec that uses error.code = "EPERM" instead of "EBUSY" to exercise the same
retry logic in ensureLiveAccountSyncState; copy the existing test structure
(vi.useFakeTimers(), create NodeJS.ErrnoException with code "EPERM", mock
syncToPath to throw, currentSync, logWarn, call ensureLiveAccountSyncState with
enabled:true and targetPath "/tmp/new", run timers, await the result, then
assert syncToPath was called 3 times, logWarn was called with the same "[plugin]
Live account sync path switch failed due to transient filesystem locks; keeping
previous watcher." message, and the returned object preserves
currentSync/currentPath/currentConfigKey), and ensure you restore timers in a
finally block with vi.useRealTimers().
index.ts (1)

2403-2412: ⚠️ Potential issue | 🟠 Major

stream failover 429s still bypass the stable backoff key.

index.ts:2071-2076 was updated to pass a stable identity into getRateLimitBackoff(), but index.ts:2408-2412 drops back to the mutable fallbackAccount.index inside the stream-failover 429 branch. that means a streamed fallback that gets rate-limited can still split the same account across different backoff buckets after reindexing or account removal, which undoes the stable-account-key fix on one of the new 429 paths. please pass the resolved fallback identity here too, and add a vitest regression in test/index-retry.test.ts for a streamed fallback 429.

suggested fix
 																		const { delayMs } = getRateLimitBackoff(
 																			fallbackAccount.index,
 																			fallbackQuotaKey,
 																			retryAfterMs,
+																			fallbackAccountId ?? fallbackResolvedEmail ?? null,
 																		);

As per coding guidelines, lib/**: focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@index.ts` around lines 2403 - 2412, The stream-failover 429 branch uses the
mutable fallbackAccount.index when calling getRateLimitBackoff, which breaks the
stable backoff key fix; change the call to use the previously resolved stable
fallback identity (the same key used at index.ts:2071-2076) instead of
fallbackAccount.index — construct fallbackQuotaKey the same way (use modelFamily
and model) and pass the resolved fallback identity into getRateLimitBackoff
rather than fallbackAccount.index; then add a vitest regression in
test/index-retry.test.ts that simulates a streamed fallback response returning
429 and asserts that requests for the same resolved fallback identity map to the
same backoff bucket (i.e., consistent backoff key) after reindexing/removal.
♻️ Duplicate comments (1)
test/session-affinity.test.ts (1)

161-173: ⚠️ Potential issue | 🟠 Major

this regression still passes with the original overlap bug.

test/session-affinity.test.ts:166-169 only proves that the second sequential write wins. if lib/session-affinity.ts:233-240 regressed back to a Date.now()-derived default, this test would still pass because both writes would share the same version and the later call would overwrite the earlier one anyway. please make this a true same-timestamp overlap: let the newer request write first, let the older request complete later at the same mocked timestamp, and assert that both getPreferredAccountIndex() and getLastResponseId() stay on the newer write.

As per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/session-affinity.test.ts` around lines 161 - 173, The test currently
writes two versions sequentially and can miss same-timestamp overlap bugs;
change it to simulate a real same-timestamp race by keeping the mocked clock
fixed (vi.useFakeTimers + vi.setSystemTime) then invoke the newer request first
(call SessionAffinityStore.rememberWithVersion for the newer version), schedule
the older request to complete asynchronously at the same mocked timestamp
(enqueue a setTimeout(fn, 0) or queueMicrotask that calls rememberWithVersion
for the older version while not advancing time), then advance timers to run the
older completion and assert that SessionAffinityStore.getPreferredAccountIndex
and getLastResponseId still reflect the newer write; ensure you reference the
SessionAffinityStore class and the rememberWithVersion,
getPreferredAccountIndex, and getLastResponseId methods when updating the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@index.ts`:
- Around line 2726-2740: The user-facing cooldown message is built from
waitMs/waitLabel in index.ts but armPoolExhaustionCooldown (from
lib/request/request-resilience.ts) may clamp/extend the actual cooldown to a
longer minimum; update the branch that calls armPoolExhaustionCooldown so it
captures and uses the armed cooldown value (use the return value or expose the
effective cooldown from armPoolExhaustionCooldown) instead of the original
waitMs when composing message/runtimeMetrics.lastError and before calling
syncRuntimeObservability; ensure armPoolExhaustionCooldown’s API is adjusted to
return the effective cooldown if needed, update the test at
test/index-retry.test.ts:444-482 to expect the longer/clamped cooldown, and
verify no sensitive data (tokens/emails) is logged.

In `@lib/runtime/runtime-observability.ts`:
- Around line 199-202: pendingWrite chains into writeSnapshot(nextSnapshot)
without a terminal catch, so permanent write failures propagate as unhandled
rejections; add a terminal .catch(() => undefined) to the promise returned by
writeSnapshot in the pendingWrite assignment (the chain that references
pendingWrite and writeSnapshot) to swallow errors and keep pendingWrite settled,
and add a vitest that mocks fs.rename or fs.writeFile to always throw (simulate
non-retryable error / Windows contention) and asserts no unhandledRejection is
emitted during the test run.
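A minimal sketch of the queued-write chain with a terminal catch, using the names from the comment above; this is a simplified model, not the actual runtime-observability.ts code:

```typescript
let pendingWrite: Promise<void> = Promise.resolve();

function enqueueSnapshotWrite(writeSnapshot: () => Promise<void>): Promise<void> {
	// Chain behind the previous write. The terminal catch keeps a
	// permanent write failure from surfacing as an unhandledRejection
	// and leaves pendingWrite settled for the next caller.
	pendingWrite = pendingWrite
		.then(() => writeSnapshot())
		.catch(() => undefined);
	return pendingWrite;
}
```

Without the terminal catch, every enqueue after a non-retryable failure would chain onto a rejected promise.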

In `@lib/runtime/runtime-services.ts`:
- Around line 49-59: The guard in runtime-services.ts uses nextConfigKey !==
null so a passed configKey of undefined does not reset liveAccountSyncConfigKey;
add a regression test in test/runtime-services.test.ts that sets
liveAccountSyncConfigKey = "25:250", invokes the update path with configKey:
undefined (so nextConfigKey becomes undefined) while liveAccountSync is
non-null, and assert the expected behavior (either that the sync is reset or
that the previous key/path is kept) to document the contract; alternatively, if
the intended behavior is to treat undefined like null, change the conditional in
the liveAccountSync reset block (referencing variables nextConfigKey,
liveAccountSync, liveAccountSyncConfigKey) to use a nullish check (e.g.,
nextConfigKey == null) so undefined triggers the reset, and update/add the test
accordingly.
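The two guard styles being contrasted can be reduced to a small sketch. Both functions are hypothetical; they only model how an undefined configKey flows through each conditional:

```typescript
// Strict `=== null` lets `undefined` keep the previous key; loose `== null`
// treats null and undefined alike and resets for both.
function resetOnStrictNull(
  next: string | null | undefined,
  current: string | null,
): string | null {
  return next === null ? null : next ?? current; // undefined keeps `current`
}

function resetOnNullish(
  next: string | null | undefined,
  current: string | null,
): string | null {
  return next == null ? null : next; // undefined also triggers the reset
}
```

Whichever contract is intended, a regression test pinning the behavior for configKey: undefined documents it either way.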

In `@lib/storage.ts`:
- Around line 1272-1369: Tests are missing for the corrupt branches in
inspectStorageHealth; add regression tests that simulate (1) malformed JSON in
the primary storage file and (2) JSON that fails schema normalization so the
catch/invalid-normalize branches run, then assert inspectStorageHealth returns
"corrupt" when WAL recovery is not present and "recoverable" when
loadAccountsFromJournal (called inside inspectStorageHealth) finds entries; use
the inspectStorageHealth, loadAccountsFromJournal and loadAccountsFromPath
behavior to create test fixtures (write bad content to the storage path returned
by getStoragePath or mock it) and verify the details/error message path and
recoverySource fields match expectations.

In `@test/runtime-services.test.ts`:
- Around line 86-89: Indentation of the object literal properties currentSync,
currentPath, currentConfigKey, and createSync in the test
runtime-services.test.ts is inconsistent with surrounding lines; update their
leading whitespace to match the file's existing indentation style (use the same
tabs/spaces and alignment as surrounding object entries) so the block aligns
with neighboring lines and removes the lint warning.

In `@test/storage-health-inspection.test.ts`:
- Around line 11-69: Add new test cases alongside the existing
inspectStorageHealth spec to cover the missing regressions: (1) corrupt primary
storage — write malformed JSON to the path set via
storageModule.setStoragePathDirect(...) and ensure inspectStorageHealth()
returns summary.state === "corrupt" when the WAL also fails; (2) healthy storage
— write a valid accounts.json and assert summary.state === "healthy"; (3)
intentional-reset — create the reset marker file (same directory/name convention
used by storageModule) and assert summary.state === "intentional-reset"; (4)
empty storage — write valid JSON with accounts: [] and assert summary.state ===
"empty"; and (5) Windows EBUSY simulation — mock fs/promises (or vi.mock the fs
module used by storageModule) to throw an EBUSY error on read/open and assert
inspectStorageHealth() handles it gracefully (expect the appropriate state or
error handling per existing logic). Use the same helpers in the test
(setStoragePathDirect, inspectStorageHealth) and clean up workDir/reset marker
between tests.

---

Outside diff comments:
In `@index.ts`:
- Around line 2403-2412: The stream-failover 429 branch uses the mutable
fallbackAccount.index when calling getRateLimitBackoff, which breaks the stable
backoff key fix; change the call to use the previously resolved stable fallback
identity (the same key used at index.ts:2071-2076) instead of
fallbackAccount.index — construct fallbackQuotaKey the same way (use modelFamily
and model) and pass the resolved fallback identity into getRateLimitBackoff
rather than fallbackAccount.index; then add a vitest regression in
test/index-retry.test.ts that simulates a streamed fallback response returning
429 and asserts that requests for the same resolved fallback identity map to the
same backoff bucket (i.e., consistent backoff key) after reindexing/removal.

In `@test/runtime-services.test.ts`:
- Around line 72-110: Add a parallel test to the existing "warns and keeps the
previous path when busy retries are exhausted" spec that uses error.code =
"EPERM" instead of "EBUSY" to exercise the same retry logic in
ensureLiveAccountSyncState; copy the existing test structure
(vi.useFakeTimers(), create NodeJS.ErrnoException with code "EPERM", mock
syncToPath to throw, currentSync, logWarn, call ensureLiveAccountSyncState with
enabled:true and targetPath "/tmp/new", run timers, await the result, then
assert syncToPath was called 3 times, logWarn was called with the same "[plugin]
Live account sync path switch failed due to transient filesystem locks; keeping
previous watcher." message, and the returned object preserves
currentSync/currentPath/currentConfigKey), and ensure you restore timers in a
finally block with vi.useRealTimers().

---

Duplicate comments:
In `@test/session-affinity.test.ts`:
- Around line 161-173: The test currently writes two versions sequentially and
can miss same-timestamp overlap bugs; change it to simulate a real
same-timestamp race by keeping the mocked clock fixed (vi.useFakeTimers +
vi.setSystemTime) then invoke the newer request first (call
SessionAffinityStore.rememberWithVersion for the newer version), schedule the
older request to complete asynchronously at the same mocked timestamp (enqueue a
setTimeout(fn, 0) or queueMicrotask that calls rememberWithVersion for the older
version while not advancing time), then advance timers to run the older
completion and assert that SessionAffinityStore.getPreferredAccountIndex and
getLastResponseId still reflect the newer write; ensure you reference the
SessionAffinityStore class and the rememberWithVersion,
getPreferredAccountIndex, and getLastResponseId methods when updating the test.
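Independent of the timer choreography, the write-version guard the test is meant to exercise can be sketched on its own. The class and method names mirror the review text, but this is a simplified stand-in for the real store:

```typescript
// Sketch of a version-guarded affinity store: a newer versioned write must
// survive an older write that completes later, even at the same timestamp.
class VersionedAffinityStore {
  private version = -1;
  private preferredAccountIndex: number | null = null;
  private lastResponseId: string | null = null;

  rememberWithVersion(version: number, accountIndex: number, responseId: string): void {
    // stale async completions (version <= current) are dropped silently
    if (version <= this.version) return;
    this.version = version;
    this.preferredAccountIndex = accountIndex;
    this.lastResponseId = responseId;
  }

  getPreferredAccountIndex(): number | null {
    return this.preferredAccountIndex;
  }

  getLastResponseId(): string | null {
    return this.lastResponseId;
  }
}
```

Because the guard compares versions rather than timestamps, the same-timestamp race the test simulates cannot overwrite the newer state.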

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0484a041-dea5-4bff-aefc-7ab2a0f02521

📥 Commits

Reviewing files that changed from the base of the PR and between 9a3c5b2 and 91d310f.

📒 Files selected for processing (12)
  • index.ts
  • lib/request/request-resilience.ts
  • lib/runtime/runtime-observability.ts
  • lib/runtime/runtime-services.ts
  • lib/session-affinity.ts
  • lib/storage.ts
  • test/index-retry.test.ts
  • test/request-resilience.test.ts
  • test/runtime-observability.test.ts
  • test/runtime-services.test.ts
  • test/session-affinity.test.ts
  • test/storage-health-inspection.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/session-affinity.test.ts
  • test/storage-health-inspection.test.ts
  • test/runtime-services.test.ts
  • test/runtime-observability.test.ts
  • test/request-resilience.test.ts
  • test/index-retry.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/runtime/runtime-services.ts
  • lib/storage.ts
  • lib/session-affinity.ts
  • lib/request/request-resilience.ts
  • lib/runtime/runtime-observability.ts
🪛 Biome (2.4.10)
index.ts

[error] 941-941: The assignment should not be in an expression.

(lint/suspicious/noAssignInExpressions)

🔇 Additional comments (10)
lib/storage.ts (3)

54-57: lgtm on import additions.

the new health summary imports are properly scoped and used by inspectStorageHealth.


134-143: export addition is correct.

StorageHealthSummary is properly re-exported for consumers.


1371-1419: silent option implementation is clean.

the { silent?: boolean } parameter correctly guards all three logging points:

  • lib/storage.ts:1393-1395 (checksum mismatch)
  • lib/storage.ts:1405-1407 (recovery success)
  • lib/storage.ts:1411 (non-ENOENT errors)

backward-compatible default {} preserves existing caller behavior.

test/storage-health-inspection.test.ts (1)

28-50: temp directory cleanup is safe but consider using vi.fn() for stricter mock verification.

at test/storage-health-inspection.test.ts:17-26, the logger mock captures calls but doesn't verify debug and error aren't called with sensitive data. consider asserting logWarn.mock.calls contents don't contain tokens or emails if you want to guard against future regressions in log statements.

minor: the workDir at line 28 uses tmpdir() which is /tmp on linux but %TEMP% on windows. the test should work cross-platform but if ci runs windows, ensure path handling is correct (it should be, since join() is used).

lib/runtime/runtime-services.ts (3)

39-46: past issue addressed: disabled branch now unconditionally clears all state.

the fix at lib/runtime/runtime-services.ts:40-45 uses optional chaining to stop the watcher and returns null for all three fields regardless of whether currentSync existed. this correctly handles the stale path/configKey scenario flagged previously.

test coverage confirmed at test/runtime-services.test.ts:31-48 ("clears stale live sync path and config key even when no watcher exists").


48-59: past issue addressed: null config key now triggers recreation.

the condition at lib/runtime/runtime-services.ts:52-53 now treats liveAccountSyncConfigKey === null as a recreate case when nextConfigKey !== null. this prevents silently reusing a watcher with unknown debounce/poll settings after upgrade.

test coverage confirmed at test/runtime-services.test.ts:137-159 ("recreates live sync when the current config key is unknown").


71-84: ebusy/eperm retry logic looks solid for windows filesystem edge cases.

the retry loop at lib/runtime/runtime-services.ts:71-84 correctly handles transient filesystem locks with exponential backoff (25ms, 50ms, 100ms). three attempts is reasonable for brief windows file-system contention.

minor note: the delay formula 25 * 2 ** attempt yields 25, 50, and 100ms for attempts 0, 1, and 2; confirm this timing is acceptable for your windows CI environment.
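That retry shape, under the stated assumption that only EBUSY and EPERM are treated as transient, can be sketched as:

```typescript
// Illustrative retry helper with the same 25 * 2 ** attempt backoff
// (25, 50, 100ms across three attempts). `op` stands in for syncToPath.
const TRANSIENT_CODES = new Set(["EBUSY", "EPERM"]);

async function retryTransient<T>(
  op: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 25,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt += 1) {
    try {
      return await op();
    } catch (error) {
      const code = (error as { code?: string }).code;
      if (!code || !TRANSIENT_CODES.has(code)) throw error; // non-transient: rethrow immediately
      lastError = error;
      if (attempt < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

Only the final failure surfaces to the caller, which matches the "warn and keep the previous watcher" behavior the tests assert after three exhausted attempts.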

test/runtime-services.test.ts (3)

31-48: good regression test for the stale state bug.

test/runtime-services.test.ts:31-48 covers the edge case where currentSync: null but currentPath and currentConfigKey have stale values. this ensures the disabled branch clears all state unconditionally, preventing the skip-syncToPath bug flagged in the prior review.


112-135: config key change test is well-structured.

test/runtime-services.test.ts:112-135 verifies that when currentConfigKey differs from configKey, the old sync is stopped, a new one is created, and syncToPath is called. assertions are complete.


137-159: unknown config key (null → non-null) transition test is solid.

test/runtime-services.test.ts:137-159 covers the scenario where currentConfigKey: null and a valid configKey is provided, ensuring recreation occurs. this directly addresses the past review concern about legacy/partially committed state.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/codex-manager-report-command.test.ts (1)

29-69: 🧹 Nitpick | 🔵 Trivial

missing test coverage for loadRuntimeObservabilitySnapshot.

test/codex-manager-report-command.test.ts:29-69 defines createDeps but doesn't include loadRuntimeObservabilitySnapshot. no test exercises the report.runtime field population added in lib/codex-manager/commands/report.ts:524. consider adding a test case that mocks loadRuntimeObservabilitySnapshot to return a snapshot and asserts it appears in the JSON output.

#!/bin/bash
# verify no existing test covers loadRuntimeObservabilitySnapshot
rg -n "loadRuntimeObservabilitySnapshot" test/
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/codex-manager-report-command.test.ts` around lines 29 - 69, The tests
are missing coverage for loadRuntimeObservabilitySnapshot so update the test
helper createDeps to include a mock implementation for
loadRuntimeObservabilitySnapshot (on the ReportCommandDeps object) that returns
a representative snapshot, then add or modify a test that runs the report
command and asserts the produced JSON includes the expected snapshot under the
report.runtime field; reference the mock by name
loadRuntimeObservabilitySnapshot and assert the output contains the mocked
snapshot values.
lib/codex-manager/commands/report.ts (1)

258-285: ⚠️ Potential issue | 🟡 Minor

defaultWriteFile lacks vitest test coverage for windows EBUSY/EPERM retry scenarios.

lib/codex-manager/commands/report.ts:264-275 retries rename 5 times with 10ms exponential backoff (total ~310ms). this matches lib/storage.ts backup copy logic but falls short of lib/storage/import-export.ts which uses 25ms base (375ms total) for similar operations. windows antivirus or file indexing can hold locks beyond 310ms, but the bigger issue is that defaultWriteFile has no direct vitest tests for EBUSY/EPERM scenarios—the test file only mocks writeFile at the dependency level. per coding guidelines for lib/**, add a regression test simulating EBUSY/EPERM errors and verify the retry delay is sufficient for windows file locking edge cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager/commands/report.ts` around lines 258 - 285, Add a vitest
regression test for defaultWriteFile (in lib/codex-manager/commands/report.ts)
that mocks fs.rename to throw transient EBUSY/EPERM errors for the first few
attempts and then succeed, asserting that defaultWriteFile eventually writes the
file, cleans up the temp file on failure paths, and respects the retry/backoff
behavior (simulate delays so test proves the retry loop retries and succeeds
within the configured attempts); ensure the test covers both EBUSY and EPERM
cases and fails if rename is not retried or temp files are left behind.
♻️ Duplicate comments (4)
test/storage.test.ts (2)

2967-2981: ⚠️ Potential issue | 🟠 Major

restore fs.unlink spies in finally to prevent test leakage.

both tests restore the spy only on the happy path, so a failed assertion can leak the mock into later cases. this impacts deterministic behavior in test/storage.test.ts:2967-2981 and test/storage.test.ts:4149-4158.

proposed fix
  const unlinkSpy = vi
   .spyOn(fs, "unlink")
   .mockImplementation(async (targetPath) => {
     if (String(targetPath) === testStoragePath) {
       const error = Object.assign(new Error("locked"), { code: "EPERM" });
       throw error;
     }
     return Promise.resolve();
   });

- await expect(clearAccounts()).rejects.toMatchObject({ code: "EPERM" });
- expect(existsSync(testStoragePath)).toBe(true);
- expect(existsSync(resetMarkerPath)).toBe(false);
-
- unlinkSpy.mockRestore();
+ try {
+   await expect(clearAccounts()).rejects.toMatchObject({ code: "EPERM" });
+   expect(existsSync(testStoragePath)).toBe(true);
+   expect(existsSync(resetMarkerPath)).toBe(false);
+ } finally {
+   unlinkSpy.mockRestore();
+ }

as per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

Also applies to: 4149-4158

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/storage.test.ts` around lines 2967 - 2981, The fs.unlink spy in the
tests around clearAccounts is only restored on the success path, which can leak
mocks; wrap the spy restoration in a finally block so unlinkSpy.mockRestore()
always runs (for the tests referencing unlinkSpy, clearAccounts,
testStoragePath, resetMarkerPath and existsSync), and apply the same
finally-based restore fix to the similar test at the other location that uses
unlinkSpy (the block spanning lines testing EPERM behavior) to ensure
deterministic Vitest behavior across failures.

2984-3001: ⚠️ Potential issue | 🟠 Major

add the missing branch: primary already absent + wal unlink ebusy.

this test covers wal failure, but it does not exercise the specific branch where the primary path is already gone before wal removal fails. that branch was the windows-regression target tied to lib/storage/account-clear.ts:35-41. current setup in test/storage.test.ts:2984-3001 still starts with a present primary file.

as per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/storage.test.ts` around lines 2984 - 3001, Modify the test to exercise
the branch where the primary file is already absent before the WAL unlink fails:
create the WAL file (`${testStoragePath}.wal`) but ensure the primary
(`testStoragePath`) is removed (e.g., await fs.unlink(testStoragePath) or do not
create it) before calling clearAccounts(); keep the fs.unlink mock (vi.spyOn(fs,
"unlink")) that throws an error with code "EBUSY" only when targetPath matches
`${testStoragePath}.wal`; call await
expect(clearAccounts()).rejects.toMatchObject({ code: "EBUSY" }) and then assert
existsSync(getIntentionalResetMarkerPath(testStoragePath)) is false; finally
restore the unlinkSpy—this will exercise the primary-absent + wal EBUSY branch
referenced by getIntentionalResetMarkerPath and clearAccounts.
test/config-explain.test.ts (1)

162-170: ⚠️ Potential issue | 🟡 Minor

assert exact per-key values, not just “defined” and source.

test/config-explain.test.ts:162-170 and test/config-explain.test.ts:179-181 can miss key-to-env mixups in lib/config.ts because every var is set and only defined/source is checked. assert exact numeric values per key (especially in the env phase).

suggested tightening
 		const envReport = envMod.getPluginConfigExplainReport();
+		const expectedEnvValues: Record<(typeof keys)[number], number> = {
+			rateLimitDedupWindowMs: 3210,
+			rateLimitStateResetMs: 654321,
+			rateLimitMaxBackoffMs: 12345,
+			rateLimitShortRetryThresholdMs: 250,
+		};
 		for (const key of keys) {
 			const entry = expectEntry(envReport, key);
 			expect(entry?.source).toBe("env");
+			expect(entry?.value).toBe(expectedEnvValues[key]);
 		}

Also applies to: 179-181

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/config-explain.test.ts` around lines 162 - 170, The test currently only
checks that each serialized entry for keys (variable serialized, report, keys)
is defined and has a defined source, which allows key-to-env mismatches to pass;
update the assertions in test/config-explain.test.ts to assert exact expected
values per key (compare entry.value and entry.defaultValue to the known expected
numbers/strings for each key) instead of using only toBeDefined, and include
explicit checks for the expected source when relevant (e.g., ensure env-derived
keys have the env values), referencing the serialized.entries lookup and the
keys array to map expected values to each entry and fail fast on mismatches.
lib/codex-manager/commands/report.ts (1)

340-345: ⚠️ Potential issue | 🟠 Major

email addresses still leak via formatAccountLabel in probeErrors.

lib/codex-manager/commands/report.ts:342,405,417,442 push messages containing formatAccountLabel(account, i) into probeErrors, which is serialized to JSON output and stdout. per past review, formatAccountLabel includes full email addresses. this violates the coding guideline to "check for logging that leaks tokens or emails". use a sanitized label or redact the email portion before including in error messages.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/codex-manager/commands/report.ts` around lines 340 - 345,
formatAccountLabel(account, i) currently includes full email addresses and is
being passed into probeErrors.push (e.g., the push calls around the
options.cachedOnly check and other probeErrors pushes), causing emails to be
serialized to JSON/stdout; fix by replacing use of formatAccountLabel(account,
i) in all probeErrors.push calls with a sanitized label that redacts the email
portion (e.g., mask local-part or replace with "<redacted>@domain" or use an
existing sanitized helper), or implement a small helper
sanitizeAccountLabel(account) and call that before pushing to probeErrors so no
full email is emitted.
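One possible shape for the suggested sanitizeAccountLabel-style helper, where the masking format (first character plus domain) is an assumption rather than the project's convention:

```typescript
// Illustrative redaction helper: mask the local part of any email embedded in
// a label so probeErrors messages serialized to JSON/stdout never carry a
// full address, while keeping enough to distinguish accounts.
const EMAIL_PATTERN = /([A-Za-z0-9._%+-])[A-Za-z0-9._%+-]*@([A-Za-z0-9.-]+)/g;

function redactEmails(label: string): string {
  return label.replace(
    EMAIL_PATTERN,
    (_match: string, first: string, domain: string) => `${first}***@${domain}`,
  );
}
```

Each probeErrors.push call site would route its label through a helper like this before the message reaches the report output.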
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Around line 278-279: Add a Windows-specific troubleshooting bullet to
README.md under the "Common symptoms" section (around lines 274-283) that reads:
"Storage cleanup fails with `EBUSY` / `EPERM` (Windows): run `codex auth doctor
--fix` to retry, or manually remove `~/.codex/multi-auth/<project-key>/` and
re-login". In lib/storage/account-clear.ts, update the unlink/remove logic
(where unlink is called around line 12) to handle transient Windows file-lock
errors by implementing exponential-backoff retries for EBUSY and EPERM: catch
the unlink/fs.rm errors in the function that performs account cleanup, loop with
a small base delay doubling each attempt, stop after a bounded max attempts, and
only rethrow non-transient errors; mirror the same retry/backoff parameters and
error checks used in lib/quota-cache.ts (the pattern at lines 154-171) so
behavior is consistent with quota-cache test expectations.
- Around line 160-167: Add regression tests to verify the JSON contract emitted
by the report command: update or extend
test/codex-manager-report-command.test.ts to call the report JSON path (the code
that reads lib/runtime/runtime-observability.ts via report.ts snapshot logic)
and assert that when a runtime snapshot is present the output includes
runtime.poolExhaustionCooldownUntil, runtime.serverBurstCooldownUntil, and
runtime.runtimeMetrics fields; mock or seed the runtime-observability snapshot
as needed to make the fields present. Also update the README note to clarify
that codex auth status is text-only (no --json) to avoid claiming JSON
availability. Reference symbols: report.ts snapshot emission,
lib/runtime/runtime-observability.ts snapshot fields, and the test file
test/codex-manager-report-command.test.ts.

In `@test/codex-manager-report-command.test.ts`:
- Around line 287-289: The test currently uses a loose assertion for
jsonOutput.forecast.recommendation.selectedReason (expect.any(String)); replace
this with a deterministic assertion: determine the exact expected reason string
for the test scenario and assert equality (e.g.,
expect(jsonOutput.forecast.recommendation.selectedReason).toEqual('EXPECTED_REASON')),
or if multiple valid reasons exist, assert membership against the allowed list
or use a more specific matcher (e.g., regex or expect.arrayContaining) so the
test verifies the actual expected value for selectedReason rather than just its
type.
- Around line 460-512: Add an assertion that saveAccounts was retried the
expected number of times when EPERM persistence fails: in the test that calls
runReportCommand(["--live","--json"]) with deps.saveAccounts throwing EPERM,
assert the mock saveAccounts call count (e.g.,
expect(deps.saveAccounts).toHaveBeenCalledTimes(...)) to confirm the command
attempted the configured retry attempts before giving up; locate the mock named
saveAccounts in the test and add the call-count expectation alongside the
existing checks (after the runReportCommand and before final storage
assertions).

In `@test/config-explain.test.ts`:
- Around line 173-176: The test mutates several environment keys
(CODEX_AUTH_RATE_LIMIT_DEDUP_WINDOW_MS, CODEX_AUTH_RATE_LIMIT_STATE_RESET_MS,
CODEX_AUTH_RATE_LIMIT_MAX_BACKOFF_MS,
CODEX_AUTH_RATE_LIMIT_SHORT_RETRY_THRESHOLD_MS) but the existing cleanup does
not remove them, causing test bleed; update test/config-explain.test.ts to
either save and restore the original values or explicitly delete these four
process.env keys in the test teardown (afterEach or afterAll) so other tests
remain deterministic and order-independent.

In `@test/fetch-helpers.test.ts`:
- Around line 472-476: Normalize the indentation in the test blocks that create
the mock response so they match the project's 2-tab (8-space) style: locate the
test code around the Response creation and assertions (references: the Response
constructor usage, the const resp = new Response(...), and the await
handleErrorResponse(...) call in the test) and change the overly deep 16-space
indentation on the object literal and subsequent lines to the standard
2-tab/8-space indentation used by neighboring tests (also apply the same fix to
the similar block around the other occurrence with the same pattern).

---

Outside diff comments:
In `@lib/codex-manager/commands/report.ts`:
- Around line 258-285: Add a vitest regression test for defaultWriteFile (in
lib/codex-manager/commands/report.ts) that mocks fs.rename to throw transient
EBUSY/EPERM errors for the first few attempts and then succeed, asserting that
defaultWriteFile eventually writes the file, cleans up the temp file on failure
paths, and respects the retry/backoff behavior (simulate delays so test proves
the retry loop retries and succeeds within the configured attempts); ensure the
test covers both EBUSY and EPERM cases and fails if rename is not retried or
temp files are left behind.

In `@test/codex-manager-report-command.test.ts`:
- Around line 29-69: The tests are missing coverage for
loadRuntimeObservabilitySnapshot so update the test helper createDeps to include
a mock implementation for loadRuntimeObservabilitySnapshot (on the
ReportCommandDeps object) that returns a representative snapshot, then add or
modify a test that runs the report command and asserts the produced JSON
includes the expected snapshot under the report.runtime field; reference the
mock by name loadRuntimeObservabilitySnapshot and assert the output contains the
mocked snapshot values.

---

Duplicate comments:
In `@lib/codex-manager/commands/report.ts`:
- Around line 340-345: formatAccountLabel(account, i) currently includes full
email addresses and is being passed into probeErrors.push (e.g., the push calls
around the options.cachedOnly check and other probeErrors pushes), causing
emails to be serialized to JSON/stdout; fix by replacing use of
formatAccountLabel(account, i) in all probeErrors.push calls with a sanitized
label that redacts the email portion (e.g., mask local-part or replace with
"<redacted>@domain" or use an existing sanitized helper), or implement a small
helper sanitizeAccountLabel(account) and call that before pushing to probeErrors
so no full email is emitted.

In `@test/config-explain.test.ts`:
- Around line 162-170: The test currently only checks that each serialized entry
for keys (variable serialized, report, keys) is defined and has a defined
source, which allows key-to-env mismatches to pass; update the assertions in
test/config-explain.test.ts to assert exact expected values per key (compare
entry.value and entry.defaultValue to the known expected numbers/strings for
each key) instead of using only toBeDefined, and include explicit checks for the
expected source when relevant (e.g., ensure env-derived keys have the env
values), referencing the serialized.entries lookup and the keys array to map
expected values to each entry and fail fast on mismatches.

In `@test/storage.test.ts`:
- Around line 2967-2981: The fs.unlink spy in the tests around clearAccounts is
only restored on the success path, which can leak mocks; wrap the spy
restoration in a finally block so unlinkSpy.mockRestore() always runs (for the
tests referencing unlinkSpy, clearAccounts, testStoragePath, resetMarkerPath and
existsSync), and apply the same finally-based restore fix to the similar test at
the other location that uses unlinkSpy (the block spanning lines testing EPERM
behavior) to ensure deterministic Vitest behavior across failures.
- Around line 2984-3001: Modify the test to exercise the branch where the
primary file is already absent before the WAL unlink fails: create the WAL file
(`${testStoragePath}.wal`) but ensure the primary (`testStoragePath`) is removed
(e.g., await fs.unlink(testStoragePath) or do not create it) before calling
clearAccounts(); keep the fs.unlink mock (vi.spyOn(fs, "unlink")) that throws an
error with code "EBUSY" only when targetPath matches `${testStoragePath}.wal`;
call await expect(clearAccounts()).rejects.toMatchObject({ code: "EBUSY" }) and
then assert existsSync(getIntentionalResetMarkerPath(testStoragePath)) is false;
finally restore the unlinkSpy—this will exercise the primary-absent + wal EBUSY
branch referenced by getIntentionalResetMarkerPath and clearAccounts.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b58ef145-73bb-4756-a33f-6800c3a0bf73

📥 Commits

Reviewing files that changed from the base of the PR and between 91d310f and f035e7b.

📒 Files selected for processing (6)
  • README.md
  • lib/codex-manager/commands/report.ts
  • test/codex-manager-report-command.test.ts
  • test/config-explain.test.ts
  • test/fetch-helpers.test.ts
  • test/storage.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (2)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/config-explain.test.ts
  • test/storage.test.ts
  • test/codex-manager-report-command.test.ts
  • test/fetch-helpers.test.ts
lib/**

⚙️ CodeRabbit configuration file

focus on auth rotation, windows filesystem IO, and concurrency. verify every change cites affected tests (vitest) and that new queues handle EBUSY/429 scenarios. check for logging that leaks tokens or emails.

Files:

  • lib/codex-manager/commands/report.ts
🔇 Additional comments (16)
test/fetch-helpers.test.ts (5)

528-540: lgtm - this addresses the past review request for mixed entitlement + rate-limit signal test.

the test at lines 528-540 verifies that usage_not_included (entitlement) in error.code takes precedence over usage_limit_reached (rate-limit) in error.type, resulting in 403 with no rateLimit. this documents the priority behavior in the implementation.


1044-1077: lgtm - edge cases for free-text, structured rate-limit codes, and malformed json 404s are now covered.

  • lines 1044-1051: free-text "usage limit exceeded" body stays 404 (no structured signal)
  • lines 1053-1068: structured rate_limit_exceeded code remaps to 429
  • lines 1070-1077: malformed json body safely falls through to 404 without throwing

the malformed json test at lines 1070-1077 addresses the past review request for regression coverage on the catch block in lib/request/fetch-helpers.ts:992-994.


467-480: lgtm - test covers error.code signal path for usage-limit 404 → 429 remapping.

the test properly verifies that a 404 with error.code: 'usage_limit_reached' gets remapped to 429 with a positive retryAfterMs. assertions are complete.


482-509: lgtm - covers error.type signal and code/type disagreement scenarios.

lines 482-495 test when only error.type carries the rate-limit signal, and lines 497-509 verify behavior when code and type disagree (both still indicate rate-limit). good coverage for the signal normalization logic in lib/request/fetch-helpers.ts.


511-526: lgtm - entitlement-style 404 → 403 and non-usage 404 unchanged tests are solid.

lines 511-526 verify that error.type: 'usage_not_included' remaps to 403 with the expected payload. lines 542-550 confirm that generic not_found codes stay as 404.

Also applies to: 542-550
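the 404-remapping priority these tests pin down can be sketched as follows. this is an illustrative reimplementation of the documented behavior, not the actual lib/request/fetch-helpers.ts API; the function and type names are invented here:

```typescript
// Hypothetical sketch of the 404 remapping priority the tests above verify.
// remapNotFound/RemapResult are illustrative names, not the real API.
type RemapResult = { status: number; rateLimit?: { retryAfterMs: number } };

function remapNotFound(body: string, retryAfterMs = 1_000): RemapResult {
  let error: { code?: string; type?: string };
  try {
    error = JSON.parse(body)?.error ?? {};
  } catch {
    return { status: 404 }; // malformed JSON falls through without throwing
  }
  const signals = [error.code, error.type];
  // entitlement signal takes precedence over rate-limit signal
  if (signals.includes("usage_not_included")) {
    return { status: 403 };
  }
  if (signals.some((s) => s === "usage_limit_reached" || s === "rate_limit_exceeded")) {
    return { status: 429, rateLimit: { retryAfterMs } };
  }
  return { status: 404 }; // generic not_found stays untouched
}
```

checking both `error.code` and `error.type` in one signal list is what makes the code/type-disagreement cases collapse to the same result.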

test/config-explain.test.ts (1)

131-146: good coverage for json-safe retry max-retries output.

this test in test/config-explain.test.ts:131-146 correctly locks in numeric 0 behavior and verifies round-trip serialization.

README.md (1)

202-203: good fix: storage table now includes the runtime observability artifact.

line 202 now matches the persisted snapshot path in lib/runtime/runtime-observability.ts:48-57. this closes the local-state audit gap cleanly.

lib/codex-manager/commands/report.ts (5)

63-75: parsePositiveIntegerOption correctly rejects malformed inputs.

this addresses the past review concern about Number.parseInt truncating partial numeric input. the /^\d+$/ regex properly rejects "1.9", "1e3", "2foo" before parsing. the Number.isSafeInteger check and < 1 guard are correct.
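a minimal sketch of the parser behavior described above, assuming the checks are applied in the order the review states (exact signature and return convention may differ in the real helper):

```typescript
// Sketch of the strict positive-integer option parser described above;
// null-on-failure is an assumption, not necessarily the real contract.
function parsePositiveIntegerOption(raw: string): number | null {
  if (!/^\d+$/.test(raw)) return null; // rejects "1.9", "1e3", "2foo"
  const value = Number.parseInt(raw, 10);
  if (!Number.isSafeInteger(value) || value < 1) return null; // rejects 0 and overflow
  return value;
}
```

the regex gate runs before `Number.parseInt`, which is what prevents the partial-numeric truncation the earlier review flagged.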


160-213: cli flag parsing for budget limits looks correct.

--cached-only, --max-accounts, and --max-probes parsing follows the existing pattern and uses the new parsePositiveIntegerOption helper. error messages are consistent.


406-412: good: in-memory patch applied only after persistence succeeds.

lib/codex-manager/commands/report.ts:408 calls applyRefreshedAccountPatch only after persistRefreshedAccountPatch at line 393 succeeds. this ensures the in-memory snapshot isn't mutated if persistence fails, matching the new regression test in the test file.


509-515: selectedReason derivation looks correct.

the fallback chain forecastResults[...]?.reasons[0] ?? recommendation.reason handles missing forecast entries gracefully.


592-596: runtime traffic logging looks fine.

no sensitive data exposed - just request counters.

test/codex-manager-report-command.test.ts (4)

53-60: inspectStorageHealth mock is deterministic.

returns a hardcoded StorageHealthSummary with no randomness or timing dependencies.


103-117: good regression coverage for malformed numeric flags.

the parameterized tests for "1.9", "1e3", "2foo" directly address the past review concern. both --max-accounts and --max-probes variants are covered.


143-173: budget enforcement test is well-structured.

verifies both account consideration limit and probe execution limit with appropriate assertions on the JSON output.


175-193: cached-only mode test correctly verifies refresh is skipped.

assertions confirm no refresh attempt and appropriate error message in probeErrors.

Comment thread README.md

Comment on lines +278 to +279
- Requests fail fast with a pool cooldown message: wait for the cooldown window or inspect `codex auth status`
- Requests fail fast after repeated upstream 5xx errors: inspect `codex auth report --json` for runtime traffic and cooldown details

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# verify whether windows-specific EBUSY/EPERM handling and messaging exists in storage clear paths/docs.
# expected: code has explicit windows-safe retries/handling and docs mention user action.

rg -n -C3 --type=ts 'EBUSY|EPERM|rename|unlink|rm|account-clear|clearAccountStorageArtifacts'
rg -n -C3 --type=md 'EBUSY|EPERM|windows|storage clear|account clear|cleanup'

Repository: ndycode/codex-multi-auth

Length of output: 50382


🏁 Script executed:

# Find the storage/account-clear file
find lib/ -type f \( -name "*account*clear*" -o -name "*clear*account*" \)

# Check README.md around the fail-fast bullets
sed -n '270,290p' README.md

# Search specifically for storage-related error handling docs
rg -n "windows|cleanup|storage.*clear|account.*clear" README.md -A2 -B2

Repository: ndycode/codex-multi-auth

Length of output: 1505


🏁 Script executed:

# Look for the actual account-clear implementation
find lib/ -type f -name "*.ts" | xargs grep -l "EBUSY\|EPERM\|clearAccountStorage" | head -10

Repository: ndycode/codex-multi-auth

Length of output: 372


🏁 Script executed:

# Check what the documentation test expects for storage/windows
rg -B5 -A5 "eperm.*ebusy|ebusy.*eperm" test/documentation.test.ts

Repository: ndycode/codex-multi-auth

Length of output: 50


🏁 Script executed:

# Search for account-clear related functions more broadly
rg -n "clearAccountStorageArtifacts|account.*clear|storage.*clear|AccountClear" lib/ --type ts -A3 -B1 | head -80

Repository: ndycode/codex-multi-auth

Length of output: 1781


🏁 Script executed:

# Find and view the documentation test that checks for eperm/ebusy
sed -n '370,380p' test/documentation.test.ts

Repository: ndycode/codex-multi-auth

Length of output: 553


🏁 Script executed:

# Search README for any windows-specific troubleshooting
rg -n -i "windows|ebusy|eperm|file.*lock|access.*denied" README.md

Repository: ndycode/codex-multi-auth

Length of output: 191


🏁 Script executed:

# Read the account-clear.ts file to see EBUSY/EPERM handling
cat lib/storage/account-clear.ts

Repository: ndycode/codex-multi-auth

Length of output: 1293


🏁 Script executed:

# Find where fieldInventory is built in documentation.test.ts
rg -B20 'expect.*fieldInventory.*eperm' test/documentation.test.ts | head -60

Repository: ndycode/codex-multi-auth

Length of output: 957


🏁 Script executed:

# Check if README mentions account storage or storage clearing
rg -n -i "account storage|storage.*clear|clean.*storage" README.md -B2 -A2

Repository: ndycode/codex-multi-auth

Length of output: 477


🏁 Script executed:

# Check docs/development/CONFIG_FIELDS.md to see what Windows notes exist
rg -B5 -A5 "eperm|ebusy|windows.*note|concurrency.*windows" docs/development/CONFIG_FIELDS.md | head -80

Repository: ndycode/codex-multi-auth

Length of output: 50


🏁 Script executed:

# Check if CONFIG_FIELDS.md exists and has content
cat docs/development/CONFIG_FIELDS.md | head -100

Repository: ndycode/codex-multi-auth

Length of output: 3139


🏁 Script executed:

# Search for where eperm and ebusy are mentioned in docs
fd --type f . docs/ | xargs rg -l "eperm|ebusy" 2>/dev/null

Repository: ndycode/codex-multi-auth

Length of output: 311


🏁 Script executed:

# View the troubleshooting section of README more broadly
rg -n "troubleshooting|common symptoms|diagnostics" README.md -i -A 30 | head -100

Repository: ndycode/codex-multi-auth

Length of output: 3603


🏁 Script executed:

# View more of CONFIG_FIELDS.md to find eperm/ebusy sections
cat docs/development/CONFIG_FIELDS.md | grep -B5 -A5 -i "eperm\|ebusy\|windows"

Repository: ndycode/codex-multi-auth

Length of output: 709


🏁 Script executed:

# Check if there are any docs about storage cleanup or Windows issues
fd --type f . docs/ | xargs rg -l "storage.*clear|account.*clear|ebusy|eperm" 2>/dev/null

Repository: ndycode/codex-multi-auth

Length of output: 240


🏁 Script executed:

# Check docs/troubleshooting.md to see if Windows storage is documented there
cat docs/troubleshooting.md 2>/dev/null | head -150

Repository: ndycode/codex-multi-auth

Length of output: 3692


Add Windows file-locking symptom and recovery path to README troubleshooting.

The lib/storage/account-clear.ts:12 unlink operations can fail with EBUSY or EPERM on Windows during concurrent file access or antivirus scanning. Storage cleanup failures leave stale state and block subsequent codex auth operations. Add one explicit bullet to README.md lines 274-283 under "Common symptoms" so Windows users have a direct recovery path:

- Storage cleanup fails with `EBUSY` / `EPERM` (Windows): run `codex auth doctor --fix` to retry, or manually remove `~/.codex/multi-auth/<project-key>/` and re-login

Also add retry logic with exponential backoff to lib/storage/account-clear.ts for transient EBUSY/EPERM errors during file removal, mirroring the pattern already used in lib/quota-cache.ts:154-171 and test/quota-cache.test.ts:136-176.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 278 - 279, Add a Windows-specific troubleshooting
bullet to README.md under the "Common symptoms" section (around lines 274-283)
that reads: "Storage cleanup fails with `EBUSY` / `EPERM` (Windows): run `codex
auth doctor --fix` to retry, or manually remove
`~/.codex/multi-auth/<project-key>/` and re-login". In
lib/storage/account-clear.ts, update the unlink/remove logic (where unlink is
called around line 12) to handle transient Windows file-lock errors by
implementing exponential-backoff retries for EBUSY and EPERM: catch the
unlink/fs.rm errors in the function that performs account cleanup, loop with a
small base delay doubling each attempt, stop after a bounded max attempts, and
only rethrow non-transient errors; mirror the same retry/backoff parameters and
error checks used in lib/quota-cache.ts (the pattern at lines 154-171) so
behavior is consistent with quota-cache test expectations.
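the retry/backoff pattern the reviewer asks to mirror can be sketched like this. the attempt count, base delay, and `retryTransient` name are placeholders here, not the parameters actually used in lib/quota-cache.ts:

```typescript
import { setTimeout as sleep } from "node:timers/promises";

// Illustrative exponential-backoff wrapper for transient Windows
// file-lock errors; attempt count and delays are assumed values.
const TRANSIENT_CODES = new Set(["EBUSY", "EPERM"]);

async function retryTransient<T>(
  op: () => Promise<T>,
  attempts = 5,
  baseDelayMs = 50,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await op();
    } catch (err) {
      const code = (err as { code?: string }).code ?? "";
      // rethrow immediately on non-transient errors or once attempts are exhausted
      if (!TRANSIENT_CODES.has(code) || attempt >= attempts - 1) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 50ms, 100ms, 200ms, ...
    }
  }
}
```

wrapping each `unlink`/`fs.rm` call in `retryTransient(...)` would bound the retry window while still propagating persistent failures, which is the behavior the quota-cache tests lock in.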

Comment thread test/codex-manager-report-command.test.ts
Comment thread test/config-explain.test.ts
Comment thread test/fetch-helpers.test.ts Outdated
Comment on lines +58 to +64
await clearPath(params.path, true);
await clearPath(params.walPath, true);
await fs.writeFile(
params.resetMarkerPath,
JSON.stringify({ version: 1, createdAt: Date.now() }),
{ encoding: "utf-8", mode: 0o600 },
);

P1 reset marker written after file deletion

the old code wrote the reset marker first to fence off stale re-use, then deleted the files. the new order is: delete path → delete walPath → write marker. if clearPath(params.walPath, true) throws after all 5 retries — which happens on windows when EBUSY/EPERM persists — the function propagates the error and the marker is never written. on the next startup, path is gone but walPath still exists with no reset marker, so inspectStorageHealth detects a recoverable wal and restores stale data the caller intended to wipe. the original first-marker ordering was deliberate crash-safety for this exact scenario.

Suggested change

```diff
-await clearPath(params.path, true);
-await clearPath(params.walPath, true);
-await fs.writeFile(
-	params.resetMarkerPath,
-	JSON.stringify({ version: 1, createdAt: Date.now() }),
-	{ encoding: "utf-8", mode: 0o600 },
-);
+await fs.writeFile(
+	params.resetMarkerPath,
+	JSON.stringify({ version: 1, createdAt: Date.now() }),
+	{ encoding: "utf-8", mode: 0o600 },
+);
+await clearPath(params.path, true);
+await clearPath(params.walPath, true);
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/storage/account-clear.ts
Line: 58-64

Comment:
**reset marker written after file deletion**

the old code wrote the reset marker *first* to fence off stale re-use, then deleted the files. the new order is: delete `path` → delete `walPath` → write marker. if `clearPath(params.walPath, true)` throws after all 5 retries — which happens on windows when `EBUSY`/`EPERM` persists — the function propagates the error and the marker is never written. on the next startup, `path` is gone but `walPath` still exists with no reset marker, so `inspectStorageHealth` detects a recoverable wal and restores stale data the caller intended to wipe. the original first-marker ordering was deliberate crash-safety for this exact scenario.

```suggestion
	await fs.writeFile(
		params.resetMarkerPath,
		JSON.stringify({ version: 1, createdAt: Date.now() }),
		{ encoding: "utf-8", mode: 0o600 },
	);
	await clearPath(params.path, true);
	await clearPath(params.walPath, true);
```

How can I resolve this? If you propose a fix, please make it concise.


```diff
@@ -1,4 +1,4 @@
-const MAX_RETRY_HINT_MS = 5 * 60 * 1000;
+const MAX_RETRY_HINT_MS = 24 * 60 * 60 * 1000;
```

P1 24 h retry-after cap can lock accounts for a full day

MAX_RETRY_HINT_MS was raised from 5 minutes to 24 hours. in getRateLimitBackoff, the final delayMs is Math.max(baseDelay, jitteredDelay). baseDelay comes directly from serverRetryAfterMs (the parsed Retry-After header value), while jitteredDelay is capped at maxBackoffMs (default 60 s). so Math.max(86_400_000, 60_000) = 86,400,000 ms — the full baseDelay wins and is passed straight to accountManager.setAccountCooldown, locking the account for 24 hours. one misbehaving upstream (or a misconfigured header) silently makes every matching account unusable for a full day with no operator warning. the 5-minute cap was an effective upper bound. if intentionally large server hints need to be honored, add an explicit opt-in config knob rather than raising the shared ceiling.

Suggested change

```diff
-const MAX_RETRY_HINT_MS = 24 * 60 * 60 * 1000;
+const MAX_RETRY_HINT_MS = 5 * 60 * 1000;
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: lib/request/response-metadata.ts
Line: 1

Comment:
**24 h retry-after cap can lock accounts for a full day**

`MAX_RETRY_HINT_MS` was raised from 5 minutes to 24 hours. in `getRateLimitBackoff`, the final `delayMs` is `Math.max(baseDelay, jitteredDelay)`. `baseDelay` comes directly from `serverRetryAfterMs` (the parsed `Retry-After` header value), while `jitteredDelay` is capped at `maxBackoffMs` (default 60 s). so `Math.max(86_400_000, 60_000)` = 86,400,000 ms — the full `baseDelay` wins and is passed straight to `accountManager.setAccountCooldown`, locking the account for 24 hours. one misbehaving upstream (or a misconfigured header) silently makes every matching account unusable for a full day with no operator warning. the 5-minute cap was an effective upper bound. if intentionally large server hints need to be honored, add an explicit opt-in config knob rather than raising the shared ceiling.

```suggestion
const MAX_RETRY_HINT_MS = 5 * 60 * 1000;
```

How can I resolve this? If you propose a fix, please make it concise.
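the failure mode the comment describes can be demonstrated with a stripped-down model of the backoff computation. names and numbers follow the review comment, not necessarily the actual implementation, and the jitter formula here is simplified:

```typescript
// Simplified model of the capping issue described in the comment above;
// delayMs and the jitter formula are illustrative, not the real code.
const MAX_RETRY_HINT_MS = 24 * 60 * 60 * 1000; // the raised 24h ceiling

function delayMs(serverRetryAfterMs: number, maxBackoffMs = 60_000): number {
  const baseDelay = Math.min(serverRetryAfterMs, MAX_RETRY_HINT_MS);
  // jitter is capped at maxBackoffMs, but that cap never applies to baseDelay
  const jitteredDelay = Math.min(maxBackoffMs, maxBackoffMs * Math.random());
  return Math.max(baseDelay, jitteredDelay); // server hint wins whenever larger
}
```

with the 24h ceiling, a `Retry-After` of 86,400 seconds flows straight through `Math.max` into the cooldown; restoring the 5-minute `MAX_RETRY_HINT_MS` (or clamping `baseDelay` behind an explicit opt-in knob) re-establishes the upper bound.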


@ndycode ndycode merged commit 519e9df into main Apr 6, 2026
2 checks passed
ndycode added a commit that referenced this pull request Apr 6, 2026
release: rebuild open main PR wave with review fixes
@ndycode ndycode deleted the rebuild/main-open-prs branch April 12, 2026 06:00