fix(codex-limits): harden usage fetch and dedup refresh#72
Conversation
Address the PR 70 review findings around codex-limits by sanitizing upstream error output, adding a request timeout, preserving active-account display across deduped credentials, and propagating refreshed credentials across duplicate records. Expand tests to cover missing refresh tokens, duplicate refresh propagation, and redacted error rendering. Co-authored-by: Codex <noreply@openai.com>
📝 WalkthroughWalkthroughEnhanced credential refresh and persistence with atomic updates; added usage-fetch timeouts and sanitized error messages; replaced disk reloads with in-memory cache invalidation; refined deduplication and active-account display for shared credentials; tests moved to clone-based storage and expanded to cover refresh, dedupe, timeout, and redaction behaviors. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Plugin
participant Upstream
participant Storage
participant AccountManager
Client->>Plugin: perform operation / fetch usage
Plugin->>Plugin: create AbortController (usageFetchTimeoutMs)
Plugin->>Upstream: fetchUsage (with abort signal)
alt Upstream 200 OK
Upstream-->>Plugin: usage payload
Plugin->>Client: return payload
else Upstream non-OK or error
Upstream-->>Plugin: error body / status
Plugin->>Plugin: sanitizeUsageErrorMessage (redact sensitive fields)
Plugin->>Client: throw sanitized error
end
alt Token refresh required
Plugin->>Upstream: refresh token request
Upstream-->>Plugin: new access/refresh tokens + expiresAt
Plugin->>Plugin: applyRefreshedCredentials (update in-memory accounts)
Plugin->>Storage: persistRefreshedCredentials (within transaction)
Storage-->>Plugin: commit
Plugin->>AccountManager: invalidateAccountManagerCache()
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@index.ts`:
- Around line 4360-4371: The label is taken from the first deduped record even
when isActive is set by matching activeRefreshToken, causing the wrong row to be
shown as active; modify the loop to determine a single canonical active index
(e.g., compute activeAccountIndex = typeof activeIndex === "number" &&
activeIndex >= 0 ? activeIndex : storage.accounts.findIndex(a => a.refreshToken
=== activeRefreshToken)) and then use that canonical index when formatting
labels and computing isActive (use formatCommandAccountLabel(account, i) but
compare i to activeAccountIndex or use activeAccountIndex to pick the label to
mark), ensuring formatCommandAccountLabel and isActive both reference the same
resolved active index rather than inferring active via refresh token per-row.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3bf15bd1-8f0e-4a0a-97f2-7d6ffe8a5c12
📒 Files selected for processing (2)
index.tstest/index.test.ts
Hoist the per-run usage timeout, align the active badge with the active account label, harden timeout handling during response body reads, and add the missing timeout and refresh-fallback regression coverage. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
index.ts (1)
4360-4371:⚠️ Potential issue | 🟡 MinorKeep the active badge and label aligned for deduped credentials.
Line 4369 still formats the first deduped record, while Line 4370 can mark that row active by matching
activeRefreshTokenfrom a different stored entry. When duplicate records carry different workspace labels or account IDs,codex-limitscan still show the wrong label as active.💡 One way to keep the label and active marker aligned
- const label = formatCommandAccountLabel(account, i); - const isActive = i === activeIndex || (!!activeRefreshToken && account.refreshToken === activeRefreshToken); + const sharesActiveCredential = + !!activeRefreshToken && account.refreshToken === activeRefreshToken; + const displayIndex = sharesActiveCredential ? activeIndex : i; + const displayAccount = sharesActiveCredential + ? storage.accounts[activeIndex] ?? account + : account; + const label = formatCommandAccountLabel(displayAccount, displayIndex); + const isActive = i === activeIndex || sharesActiveCredential; const activeSuffix = isActive ? (ui.v2Enabled ? ` ${formatUiBadge(ui, "active", "accent")}` : " [active]") : "";🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@index.ts` around lines 4360 - 4371, The active badge can be applied to a different deduped row than the one whose label was formatted: when isActive is true because account.refreshToken === activeRefreshToken, recompute the label from the actual matching stored account rather than using the current `account`/`label`; inside the loop over `uniqueIndices` (where `account`, `label`, `isActive`, `activeSuffix` are computed) detect if the activation came from `activeRefreshToken` and call `formatCommandAccountLabel` with the matched stored account (and its index) so the displayed label and active marker align.
🧹 Nitpick comments (2)
test/index.test.ts (2)
929-952: Please cover the timeout branch too.The PR also hardens
codex-limitswith a usage-fetch timeout, but this block only exercises an HTTP 401 response. A timed-out fetch rejection can still regress unnoticed, so I'd add a companion test next to this redaction case.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/index.test.ts` around lines 929 - 952, Add a new test case alongside "redacts upstream auth material from usage fetch errors" that simulates a fetch timeout/rejection and asserts the same redaction behavior: mock globalThis.fetch to return a rejected Promise (e.g. vi.fn().mockRejectedValue(new Error("timeout")) or a timeout-like error), call plugin.tool["codex-limits"].execute(), and verify the result contains the generic error text and redacted tokens (expect result toContain "Bearer [redacted]" and not contain the raw secrets/jwt), mirroring the existing expectations in the 401 test to cover the timeout branch.
880-927: Also assert the refresh input and propagated expiry.This still passes if
codex-limitscallsqueuedRefreshwith the wrong token and only fans outaccess/refreshwhile leavingexpiresAtstale. Inlib/refresh-queue.ts, Lines 345-347 show the queue is keyed solely by the refresh token, so I'd pin both behaviors here.🧪 Tighten the regression
await plugin.tool["codex-limits"].execute(); expect(vi.mocked(queuedRefresh)).toHaveBeenCalledTimes(1); + expect(vi.mocked(queuedRefresh)).toHaveBeenCalledWith("stale-refresh"); expect(mockStorage.accounts.map((account) => account.refreshToken)).toEqual([ "rotated-refresh", "rotated-refresh", ]); expect(mockStorage.accounts.map((account) => account.accessToken)).toEqual([ "rotated-access", "rotated-access", ]); + expect( + mockStorage.accounts.every((account) => (account.expiresAt ?? 0) > Date.now()), + ).toBe(true); });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/index.test.ts` around lines 880 - 927, Assert that queuedRefresh was invoked with the stale refresh token and that the refreshed expiry is propagated to all duplicate accounts: add an expectation that vi.mocked(queuedRefresh) was called with the original "stale-refresh" (or the exact refreshToken value from mockStorage.accounts) and assert mockStorage.accounts.map(a => a.expiresAt) equals the mocked resolved expires value (or its numeric representation) returned by queuedRefresh so both token input and expiresAt are validated after plugin.tool["codex-limits"].execute() runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@index.ts`:
- Around line 4360-4371: The active badge can be applied to a different deduped
row than the one whose label was formatted: when isActive is true because
account.refreshToken === activeRefreshToken, recompute the label from the actual
matching stored account rather than using the current `account`/`label`; inside
the loop over `uniqueIndices` (where `account`, `label`, `isActive`,
`activeSuffix` are computed) detect if the activation came from
`activeRefreshToken` and call `formatCommandAccountLabel` with the matched
stored account (and its index) so the displayed label and active marker align.
---
Nitpick comments:
In `@test/index.test.ts`:
- Around line 929-952: Add a new test case alongside "redacts upstream auth
material from usage fetch errors" that simulates a fetch timeout/rejection and
asserts the same redaction behavior: mock globalThis.fetch to return a rejected
Promise (e.g. vi.fn().mockRejectedValue(new Error("timeout")) or a timeout-like
error), call plugin.tool["codex-limits"].execute(), and verify the result
contains the generic error text and redacted tokens (expect result toContain
"Bearer [redacted]" and not contain the raw secrets/jwt), mirroring the existing
expectations in the 401 test to cover the timeout branch.
- Around line 880-927: Assert that queuedRefresh was invoked with the stale
refresh token and that the refreshed expiry is propagated to all duplicate
accounts: add an expectation that vi.mocked(queuedRefresh) was called with the
original "stale-refresh" (or the exact refreshToken value from
mockStorage.accounts) and assert mockStorage.accounts.map(a => a.expiresAt)
equals the mocked resolved expires value (or its numeric representation)
returned by queuedRefresh so both token input and expiresAt are validated after
plugin.tool["codex-limits"].execute() runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f618852a-39b7-4d2e-8d41-684b5246f3c3
📒 Files selected for processing (2)
index.tstest/index.test.ts
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/index.test.ts`:
- Around line 967-990: Extend the "redacts upstream auth material from usage
fetch errors" test in test/index.test.ts to also cover the other sanitizer
patterns by mocking the upstream response body to include an sk- token and a
long hex token in addition to the Bearer/JWT strings, call
plugin.tool["codex-limits"].execute() as before, and add expectations that the
result contains the redacted placeholders (e.g., "sk-[redacted]" or "[redacted]"
as your helper uses) and does not contain the raw sk- token or the long hex
token; ensure the mocked fetch response string includes representative examples
of sk- prefixed keys and a long hex string so those branches in the sanitizer
helper are exercised.
- Around line 881-928: The test is relying on the live mockStorage object and
directly asserting against saveAccounts, which masks whether account changes
were actually persisted; update the spec to assert persisted state by calling
the public loadAccounts/read accessor to obtain a fresh clone before and after
plugin.tool["codex-limits"].execute() (or via withAccountStorageTransaction read
path), and compare the reloaded accounts' tokens (refreshToken/accessToken)
rather than inspecting the original mockStorage object or directly using
saveAccounts; ensure both the fan-out case and the fallback case use the same
cloned load + verify reload pattern and avoid asserting against the concrete
mockStorage reference.
- Around line 992-1024: The test "surfaces usage fetch timeouts without leaking
raw abort errors" currently only rejects the fetch() call on abort, missing the
case where abort happens while reading the response body; change the mock so
fetch resolves to a Response-like object whose text() (and/or json()) returns a
Promise that rejects with an AbortError when the signal aborts (so body-read
abort path is exercised), keep the call that triggers
plugin.tool["codex-limits"].execute() and the timer-driven abort behavior, and
move vi.useRealTimers() into a finally block to ensure real timers are restored
even if assertions fail.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c28628b2-01dc-45ce-b8f4-4ed683d247c5
📒 Files selected for processing (2)
index.tstest/index.test.ts
Make storage mocks clone-on-read so refresh regressions assert persisted state, extend sanitizer coverage to sk-* and long-hex tokens, and exercise the body-read abort timeout path with timer cleanup in finally. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@index.ts`:
- Around line 4301-4309: sanitizeUsageErrorMessage's current sk- regex only
matches bare alphanumerics and misses secrets containing separators; update the
sk- redaction pattern in sanitizeUsageErrorMessage to match any typical secret
characters after the prefix (e.g., non-whitespace or a wider allowed charset and
a reasonable minimum length) so tokens like sk-foo.bar_baz:123 are caught;
locate the replace call that uses /\bsk-[A-Za-z0-9]{20,}\b/g and broaden it
(keep global/ignore-case flags) to cover separators and shorter realistic token
lengths while still avoiding false positives.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 4a415250-9999-4552-84cf-fcdf66fea58e
📒 Files selected for processing (2)
index.tstest/index.test.ts
Catch sk-prefixed secrets that contain separators in sanitized upstream usage errors and extend the regression test to cover the widened pattern. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
Re-throw non-abort response body read errors instead of silently converting them into empty HTTP bodies so timeout handling stays precise and unexpected upstream failures are not masked. Co-authored-by: Codex <noreply@openai.com>
Use the displayed active account metadata when deduped credentials share a refresh token, reset active selection state in the empty-token regression, and add coverage for aborts during successful response body reads. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
♻️ Duplicate comments (1)
test/index.test.ts (1)
218-222:⚠️ Potential issue | 🟠 MajorDeep-clone nested account metadata in these storage mocks.
These helpers still only do
{ ...account }, so nested values likerateLimitResetTimesremain shared withmockStorage. A code path that mutates a loaded snapshot without callingsaveAccounts/persistcan still make the new reload-based assertions pass, which weakens the persistence-boundary fix this PR is adding.Suggested fix
+const cloneAccount = (account: (typeof mockStorage.accounts)[number]) => ({ + ...account, + rateLimitResetTimes: account.rateLimitResetTimes + ? { ...account.rateLimitResetTimes } + : undefined, +}); + const cloneMockStorage = () => ({ ...mockStorage, - accounts: mockStorage.accounts.map((account) => ({ ...account })), + accounts: mockStorage.accounts.map(cloneAccount), activeIndexByFamily: { ...mockStorage.activeIndexByFamily }, }); // ... saveAccounts: vi.fn(async (nextStorage: typeof mockStorage) => { mockStorage.version = nextStorage.version; - mockStorage.accounts = nextStorage.accounts.map((account) => ({ ...account })); + mockStorage.accounts = nextStorage.accounts.map(cloneAccount); mockStorage.activeIndex = nextStorage.activeIndex; mockStorage.activeIndexByFamily = { ...nextStorage.activeIndexByFamily }; }), // ... const persist = async (nextStorage: typeof mockStorage) => { mockStorage.version = nextStorage.version; - mockStorage.accounts = nextStorage.accounts.map((account) => ({ ...account })); + mockStorage.accounts = nextStorage.accounts.map(cloneAccount); mockStorage.activeIndex = nextStorage.activeIndex; mockStorage.activeIndexByFamily = { ...nextStorage.activeIndexByFamily }; };Also applies to: 227-232, 240-245
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/index.test.ts` around lines 218 - 222, The storage mock helpers currently shallow-copy accounts using ({ ...account }) which leaves nested fields (e.g., rateLimitResetTimes) shared; update cloneMockStorage and the other helpers that map over mockStorage.accounts to produce deep copies of each account (e.g., use structuredClone(account) or JSON.parse(JSON.stringify(account))) so nested metadata is fully cloned, and ensure activeIndexByFamily is also deeply copied if it may contain nested objects.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@test/index.test.ts`:
- Around line 218-222: The storage mock helpers currently shallow-copy accounts
using ({ ...account }) which leaves nested fields (e.g., rateLimitResetTimes)
shared; update cloneMockStorage and the other helpers that map over
mockStorage.accounts to produce deep copies of each account (e.g., use
structuredClone(account) or JSON.parse(JSON.stringify(account))) so nested
metadata is fully cloned, and ensure activeIndexByFamily is also deeply copied
if it may contain nested objects.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cbe589f2-f653-4ada-aae3-981dd5f53c53
📒 Files selected for processing (2)
index.tstest/index.test.ts
Clone nested rate-limit metadata in the storage test harness so reload-based persistence assertions cannot pass via shared nested references. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
index.ts (1)
4409-4431: Consider hoistingapplyRefreshedCredentialsto reduce per-iteration closure allocation.The helper is currently defined inside the loop iteration (within the try block starting at Line 4395). Each refresh creates a new closure capturing
refreshResult. While unlikely to cause issues with typical account counts, hoisting it outside the loop and passingrefreshResultas a parameter would reduce GC pressure.♻️ Proposed refactor to hoist the helper
+const applyRefreshedCredentials = ( + target: { + refreshToken: string; + accessToken?: string; + expiresAt?: number; + }, + result: { refresh: string; access: string; expires?: number }, +): void => { + target.refreshToken = result.refresh; + target.accessToken = result.access; + target.expiresAt = result.expires; +}; + for (const i of uniqueIndices) { // ... existing code ... try { // ... refresh logic ... const refreshResult = await queuedRefresh(account.refreshToken); // ... - const applyRefreshedCredentials = ( - target: { - refreshToken: string; - accessToken?: string; - expiresAt?: number; - }, - ): void => { - target.refreshToken = refreshResult.refresh; - target.accessToken = refreshResult.access; - target.expiresAt = refreshResult.expires; - }; let refreshedCount = 0; for (const storedAccount of storage.accounts) { if (!storedAccount) continue; if (previousRefreshToken && storedAccount.refreshToken === previousRefreshToken) { - applyRefreshedCredentials(storedAccount); + applyRefreshedCredentials(storedAccount, refreshResult); refreshedCount += 1; } } if (refreshedCount === 0) { - applyRefreshedCredentials(account); + applyRefreshedCredentials(account, refreshResult); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@index.ts` around lines 4409 - 4431, The helper applyRefreshedCredentials is created inside the refresh try block and closes over refreshResult each iteration; hoist it outside the loop and change its signature to accept refreshResult (e.g., applyRefreshedCredentials(target, refreshResult)) so it no longer allocates a new closure per refresh; update all call sites (where it is invoked for storedAccount or account) to pass the refreshResult argument and keep refreshedCount/logic with storage.accounts, previousRefreshToken, account unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@index.ts`:
- Around line 4386-4393: When sharesActiveCredential is true and displayIndex is
used to pick from storage.accounts, validate that storage.accounts[displayIndex]
is defined instead of silently falling back to account; in the block computing
displayIndex/displayAccount (variables sharesActiveCredential, displayIndex,
displayAccount) add an explicit check when typeof displayIndex === "number" &&
sharesActiveCredential to ensure storage.accounts[displayIndex] exists and if
not either log a warning/error (via the existing logger) or throw so corruption
isn’t masked, then proceed to call formatCommandAccountLabel/display the
activeSuffix as before using the validated displayAccount and
activeIndex/i/ui.v2Enabled/formatUiBadge.
In `@test/index.test.ts`:
- Around line 801-905: Reset shared test state for codex-limits by initializing
mockStorage.activeIndex = 0 and mockStorage.activeIndexByFamily = {} in the
codex-limits describe block's beforeEach (the same beforeEach that currently
restores originalFetch) so tests that call plugin.tool["codex-limits"].execute
don't leak activeIndex between cases; ensure this setup runs before each test so
tests that later set mockStorage.activeIndex or mockStorage.activeIndexByFamily
start from a known clean state.
---
Nitpick comments:
In `@index.ts`:
- Around line 4409-4431: The helper applyRefreshedCredentials is created inside
the refresh try block and closes over refreshResult each iteration; hoist it
outside the loop and change its signature to accept refreshResult (e.g.,
applyRefreshedCredentials(target, refreshResult)) so it no longer allocates a
new closure per refresh; update all call sites (where it is invoked for
storedAccount or account) to pass the refreshResult argument and keep
refreshedCount/logic with storage.accounts, previousRefreshToken, account
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7a68bb9a-6cf4-4181-b628-9eef0471643b
📒 Files selected for processing (2)
index.tstest/index.test.ts
Throw when an active dedup display index points at a missing stored account, reset codex-limits test state in describe setup, and hoist the refreshed-credential helper out of the per-iteration refresh path. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
Warn when refreshed credentials cannot be matched back to persisted storage, cap upstream error-body text before sanitization, and add a regression for the silent persist-miss path. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/index.test.ts (1)
240-255:⚠️ Potential issue | 🟠 MajorReturn the callback result from
withAccountStorageTransaction.The default mock currently swallows the handler’s return value. Since Line 1064 already has to patch this behavior to exercise the persist-miss path, the base mock is no longer matching the real transaction boundary closely enough. That can let success-path tests drift whenever the caller branches on the transaction result.
Suggested fix
withAccountStorageTransaction: vi.fn( - async ( + async <T>( callback: ( loadedStorage: typeof mockStorage, persist: (nextStorage: typeof mockStorage) => Promise<void>, - ) => Promise<void>, + ) => Promise<T>, ) => { const loadedStorage = cloneMockStorage(); const persist = async (nextStorage: typeof mockStorage) => { mockStorage.version = nextStorage.version; mockStorage.accounts = nextStorage.accounts.map(cloneAccount); mockStorage.activeIndex = nextStorage.activeIndex; mockStorage.activeIndexByFamily = { ...nextStorage.activeIndexByFamily }; }; - await callback(loadedStorage, persist); + return await callback(loadedStorage, persist); }, ),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/index.test.ts` around lines 240 - 255, The mock for withAccountStorageTransaction currently calls the provided callback but discards its return value; update the async mock implementation of withAccountStorageTransaction to return the callback's result (e.g., capture and return await callback(loadedStorage, persist)) so callers receive the same transaction return behavior as the real implementation; ensure this change is made in the withAccountStorageTransaction mock where loadedStorage is created via cloneMockStorage and persist updates mockStorage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/index.test.ts`:
- Around line 218-239: The cloneAccount helper currently only shallow-copies
most fields (only deep-cloning rateLimitResetTimes), leaving mutable properties
like accountTags aliased to mockStorage; update cloneAccount to deep-clone the
entire account object (all nested fields) so any mutation on a loaded account
cannot leak back to mockStorage, and ensure cloneMockStorage and the mocked
loadAccounts/saveAccounts use this deep-cloning (refer to cloneAccount,
cloneMockStorage, loadAccounts, saveAccounts and mockStorage) so persistence
tests exercise real copy semantics rather than accidental shared references.
- Around line 1148-1228: Add tests that cover non-abort body-read failures so we
don't regress into rewriting all read errors as timeouts: in the existing test
cases that mock globalThis.fetch for plugin.tool["codex-limits"].execute(), add
parallel tests where the mocked response.text() or response.json() rejects with
a non-AbortError (e.g., new Error("body read failed")) and assert the final
result preserves that original error message (expect(result).toContain("body
read failed")) rather than "Usage request timed out"; ensure these new tests
mirror the abort-vs-non-abort structure used in the current text/json tests and
use vi.useFakeTimers()/vi.runAllTimersAsync() similarly.
---
Outside diff comments:
In `@test/index.test.ts`:
- Around line 240-255: The mock for withAccountStorageTransaction currently
calls the provided callback but discards its return value; update the async mock
implementation of withAccountStorageTransaction to return the callback's result
(e.g., capture and return await callback(loadedStorage, persist)) so callers
receive the same transaction return behavior as the real implementation; ensure
this change is made in the withAccountStorageTransaction mock where
loadedStorage is created via cloneMockStorage and persist updates mockStorage.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b588dae9-e006-4f03-bee3-71b4205de74a
📒 Files selected for processing (2)
index.tstest/index.test.ts
Make the transaction mock return handler results, deep-clone loaded account objects, and add non-abort body-read regressions so codex-limits tests match real storage and fetch behavior more closely. Co-authored-by: Codex <noreply@openai.com>
Replace the obsolete blank-token fallback regression with the real zero-token-match transactional fallback case so the refreshed credential persistence path is exercised under current behavior. Co-authored-by: Codex <noreply@openai.com>
Prevent blank-token refresh attempts, remove email from persistence warnings, degrade gracefully when the active dedup slot is missing, and only invalidate caches when refreshed credentials were actually persisted. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
Use the validated previous refresh token consistently during refresh, remove the redundant fan-out guard, and add pre-response fetch abort coverage for codex-limits. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
|
@greptile |
Verify transactional refresh fallback does not overwrite a different account that happens to share the same email when accountId differs. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
Restrict transactional email fallback so it cannot overwrite a different account with the same email but a different accountId, and pin that behavior with a regression test. Co-authored-by: Codex <noreply@openai.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
index.ts (1)
4430-4445: The error-body cap still buffers the whole response.
response.text()reads the full body beforeslice(0, usageErrorBodyMaxChars)runs, so a very large upstream error page can still blow past the intended cap. If the limit is meant to be hard, read the body stream incrementally and stop once the preview buffer is full.♻️ Example of a bounded preview read
+ const readErrorPreview = async ( + response: Response, + maxChars: number, + signal: AbortSignal, + ): Promise<string> => { + const reader = response.body?.getReader(); + if (!reader) return ""; + const decoder = new TextDecoder(); + let text = ""; + + while (text.length < maxChars) { + if (signal.aborted) { + throw new Error("Usage request timed out"); + } + const { done, value } = await reader.read(); + if (done || !value) break; + text += decoder.decode(value, { stream: true }); + } + + return text.slice(0, maxChars); + }; + if (!response.ok) { let bodyText = ""; try { - bodyText = (await response.text()).slice(0, usageErrorBodyMaxChars); + bodyText = await readErrorPreview( + response, + usageErrorBodyMaxChars, + controller.signal, + ); } catch (error) { if (isAbortError(error) || controller.signal.aborted) { throw new Error("Usage request timed out"); } throw error; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@index.ts` around lines 4430 - 4445, The current error handling calls response.text() which fully buffers the body before slicing, so replace that with a bounded stream read: use response.body (or response.body.getReader()) to read chunks incrementally into a UTF-8 TextDecoder and append until you reach usageErrorBodyMaxChars or the stream ends, then stop reading and close the reader; ensure you still handle aborts via controller.signal and isAbortError, and pass the truncated bodyText into sanitizeUsageErrorMessage as before (references: response, usageErrorBodyMaxChars, sanitizeUsageErrorMessage, isAbortError, controller).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@index.ts`:
- Around line 4364-4385: The current fallback in the if (!updated) block builds
fallbackTarget by picking the first matching account (by
accountId+organizationId, then accountId, then email) which can incorrectly
overwrite another workspace's newer token when previousRefreshToken is missing;
instead, gather candidates from latestStorage.accounts using the same matching
rules (use normalizedOrganizationId and normalizedEmail), deduplicate to a
single list, and only persist the rotated token if exactly one candidate is
found; if candidates.length !== 1, call processLogger.warn (or existing logger)
with context and skip the disk write—do this logic around the fallbackTarget
selection and before any persistence that uses fallbackTarget so you never write
when the identity is ambiguous.
- Around line 4462-4475: The dedupe logic treats whitespace-only refresh tokens
as valid, collapsing distinct blank-token accounts; trim and normalize refresh
tokens before checking/adding to seenTokens and when computing
activeRefreshToken. Specifically, when reading acct.refreshToken in the loop
that updates seenTokens and uniqueIndices, normalize via trim() (or treat
non-string/trimmed-empty as ""), only dedupe non-empty trimmed tokens, and add
only non-empty trimmed tokens to seenTokens; likewise, when resolving
activeRefreshToken (after resolveActiveIndex(storage, "codex")), use the trimmed
value or undefined for empty/whitespace-only tokens so comparisons are
consistent.
In `@test/index.test.ts`:
- Around line 1078-1166: The test is missing the ambiguous same-email fixture:
modify the transactional fixture used in the test (the
transactionStorage.accounts array) to include two stored accounts that share the
same email but have different accountId/organizationId (e.g., add a second
account object with email "user@example.com" and accountId "acc-other2" /
refreshToken "different-refresh-2"), so the code path in
persistRefreshedCredentials is exercised where identity is ambiguous and it must
not match by email; keep the rest of the setup (queuedRefresh mock,
withAccountStorageTransaction stub, loggerModule.logWarn check, and assertions
on transactionStorage and loadAccounts) unchanged so the test verifies the
warning is emitted and that no email-based match is chosen.
---
Nitpick comments:
In `@index.ts`:
- Around line 4430-4445: The current error handling calls response.text() which
fully buffers the body before slicing, so replace that with a bounded stream
read: use response.body (or response.body.getReader()) to read chunks
incrementally into a UTF-8 TextDecoder and append until you reach
usageErrorBodyMaxChars or the stream ends, then stop reading and close the
reader; ensure you still handle aborts via controller.signal and isAbortError,
and pass the truncated bodyText into sanitizeUsageErrorMessage as before
(references: response, usageErrorBodyMaxChars, sanitizeUsageErrorMessage,
isAbortError, controller).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f59a8d27-9187-448d-9fbc-1a7621ac6f52
📒 Files selected for processing (2)
index.tstest/index.test.ts
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
Require unique fallback matches when persisting refreshed credentials and normalize refresh tokens before dedup so ambiguous same-email accounts cannot be overwritten during transactional refresh fallback. Co-authored-by: Codex <noreply@openai.com>
|
@coderabbitai review |
|
✅ Actions performedReview triggered.
|
Summary
codex-limitsusage fetching with timeout handling, capped/redacted upstream errors, and safer refresh persistenceWhy
PR #70 and the follow-up review cycles surfaced correctness, token-safety, and Windows concurrency gaps around deduplication, refresh propagation, stale-storage writes, blank-token refresh attempts, timeout handling, and ambiguous identity fallback. This PR carries the full remediation set on a clean branch from
main.Verification
npm installnpx vitest run test/index.test.tsnpm run typechecknpm run buildnpm run lintnote: greptile review for oc-chatgpt-multi-auth. cite files like
lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.Greptile Summary
this PR hardens
codex-limitsacross three axes: token safety (error-body redaction, blank-token refresh guard), timeout correctness (layeredAbortControllerhandling covering pre-header, mid-body, and post-body abort paths), and refresh propagation correctness (transactional fan-out to all duplicate stored records, active-label preservation across deduped credentials, empty-refreshToken bypass). storage writes are now fully transactional viapersistRefreshedCredentialsrather than directsaveAccounts, addressing the windows concurrency gap from PR #70 where stale in-memory state could overwrite a concurrent on-disk write.key changes:
usageFetchTimeoutMshoisted outsidefetchUsage(addresses prior review finding)isAbortErrorcovers bothErrorandDOMExceptionAbortErrorvariants for cross-platform safetyrefreshTokenrather than collapsing distinct no-token accountspersistRefreshedCredentialsopens a fresh storage transaction per refresh, avoiding stale-write races on windowsinvalidateAccountManagerCache()replaces the explicitsaveAccounts+AccountManager.loadFromDisk()sequencevi.useFakeTimers()timeout scenariostwo findings worth addressing:
if (refreshedCount === 0)fallback in the main loop is unreachable dead code (the non-emptypreviousRefreshTokenguard above guaranteesaccountitself always matches the in-memory loop, sorefreshedCount >= 1always) — safe to remove, its intent is covered bypersistRefreshedCredentialsif (controller.signal.aborted)check after a successfulresponse.text()read can replace a real HTTP error (e.g. 401) with"Usage request timed out"in a tight-timer race; on windows VPN connections where reads complete slowly, this discards useful error contextConfidence Score: 4/5
refreshedCount === 0branch, misleading post-read abort check) are low-severity and don't affect the primary code paths, but the dead-code branch could mislead future refactors and the abort check can produce confusing output on windows VPN setupsrefreshedCount === 0dead branch inindex.tsaround the in-memory refresh propagation loop warrants a second look before mergeImportant Files Changed
sanitizeUsageErrorMessage,isAbortError,applyRefreshedCredentials,persistRefreshedCredentialshelpers; hoistsusageFetchTimeoutMsoutsidefetchUsage; addsAbortControllertimeout to usage fetch with layered abort handling; fixes dedup to preserve accounts with emptyrefreshToken; propagates refreshed credentials across duplicate stored records via a fresh transaction; preserves active-account label across deduped entries; replaces directsaveAccountswithinvalidateAccountManagerCache()— one dead-code path (refreshedCount === 0) is unreachable, and a post-successful-readcontroller.signal.abortedcheck can mask real HTTP errors with a misleading "timed out" messagecloneAccount/cloneMockStoragehelpers for isolated test state; upgradeswithAccountStorageTransactionmock to return genericT; addssaveAccountsmock that actually mutatesmockStorage; expands codex-limits coverage with 9 new tests covering dedup-with-active-label, empty-token dedup bypass, credential fan-out, transactional identity fallback, blank-token guard, error redaction, and three timeout scenarios; the post-successful-read abort branch (text()succeeds but signal already aborted) has no dedicated testSequence Diagram
sequenceDiagram participant Iter as "uniqueIndices iteration" participant Mem as "in-memory storage" participant Q as "queuedRefresh" participant Tx as "withAccountStorageTransaction" participant Disk as "disk storage" Iter->>Mem: "read account.refreshToken (previousRefreshToken)" Iter->>Q: "queuedRefresh(previousRefreshToken)" Q-->>Iter: "refreshResult {access, refresh, expires}" Iter->>Mem: "applyRefreshedCredentials to all in-memory matches" Iter->>Tx: "persistRefreshedCredentials(previousRefreshToken, ...)" Tx->>Disk: "read latest storage (fresh transaction)" alt "previousRefreshToken matched" Tx->>Disk: "update all matching accounts, persist" else "fallback: match by accountId or email" Tx->>Disk: "update fallback account, persist" else "no match found" Tx-->>Iter: "logWarn, return false" end Tx-->>Iter: "persistedRefresh boolean" Iter->>Iter: "fetchUsage with AbortController" alt "abort fires before headers" Iter-->>Iter: "Usage request timed out" else "abort fires during body read" Iter-->>Iter: "Usage request timed out" else "non-OK response" Iter-->>Iter: "sanitizeUsageErrorMessage" endComments Outside Diff (1)
index.ts, line 4524-4527 (link)refreshedCount === 0fallback is unreachable dead codethe guard directly above ensures
previousRefreshTokenis always a non-empty string before reaching this block. sinceaccountisstorage.accounts[i]andpreviousRefreshTokenisaccount.refreshToken, the in-memory loop always findsaccountitself as a matching entry — sorefreshedCountis guaranteed ≥ 1 and the fallback branch never executes.this is not a correctness bug today, but the dead code creates false confidence for future maintainers who may assume this path is exercised in tests, and then restructure surrounding guard logic in ways that introduce regressions (concurrency or token-rotation edge cases are the most likely failure mode on windows). the
persistRefreshedCredentialstransactional fallback already handles the disk-side case where the stored token rotated between the in-memory load and the write, so the intent is fully covered without the dead branch.Last reviewed commit: 404961b
Context used: