fix: gracefully handle deactivated_workspace — remove only the dead workspace, preserve siblings#85
Conversation
…d workspace entry - Add isDeactivatedWorkspaceError() detector in fetch-helpers.ts that matches HTTP 402 + detail.code/error.code === 'deactivated_workspace' - On deactivated_workspace response: flag the specific workspace entry (keyed by organizationId > accountId > refreshToken), remove it from rotation and failover to the next healthy workspace/account - Preserve sibling workspaces sharing the same refreshToken — only the dead workspace-entry is removed, not the entire user - Extend codex-health deep probe to flag deactivated workspaces using the same identity key (not refreshToken) so multi-workspace users aren't penalized for a single bad workspace - Add test: handleErrorResponse normalizes 402 deactivated_workspace - Add test: plugin removes only the dead workspace and succeeds on the live sibling in the same request
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthroughImplements workspace-deactivation detection and identity-keyed flagged-account handling. Deactivated-workspace responses (402 with Changes
Sequence DiagramsequenceDiagram
participant Client
participant RotationManager as Rotation Manager
participant FetchHelpers as Fetch Helpers
participant CodexAPI as Codex API
participant Storage as Flagged Storage
Client->>RotationManager: sendRequest(req)
RotationManager->>CodexAPI: forward request with account token
CodexAPI-->>RotationManager: 402 { detail: { code: "deactivated_workspace" } }
RotationManager->>FetchHelpers: isDeactivatedWorkspaceError(response)
FetchHelpers-->>RotationManager: true
RotationManager->>RotationManager: refund token, record failure, update metrics
RotationManager->>Storage: withFlaggedAccountStorageTransaction(persist flagged record by workspace identity)
Storage-->>RotationManager: persisted
RotationManager->>RotationManager: remove account from active rotation (or mark cooldown)
RotationManager->>CodexAPI: retry request with next account
CodexAPI-->>RotationManager: 200 OK
RotationManager-->>Client: return success response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
test/index.test.ts (1)
189-191: Mock only checks one of four error code locations.The mock implementation only checks
error.code, but the realgetStructuredErrorCodefunction (context snippet 1, lines 284-303) extracts codes from four locations in priority order:
- Top-level
errorBody.codeerrorBody.detail.codeerrorBody.error.codeorerrorBody.error.typeThis test works because the test data in line 2563 uses
{ error: { code: "deactivated_workspace" } }, which matches location 3. However, if future tests use different payload shapes (e.g.,{ detail: { code: "deactivated_workspace" } }as the fetch-helpers test does), this mock would incorrectly returnfalse.Consider aligning the mock more closely with the real implementation for better test fidelity:
♻️ Suggested mock improvement
- isDeactivatedWorkspaceError: vi.fn((errorBody: unknown) => - (errorBody as { error?: { code?: string } })?.error?.code === "deactivated_workspace", - ), + isDeactivatedWorkspaceError: vi.fn((errorBody: unknown) => { + const body = errorBody as Record<string, unknown> | undefined; + if (!body || typeof body !== "object") return false; + const code = + (typeof body.code === "string" && body.code) || + (typeof (body.detail as Record<string, unknown>)?.code === "string" && (body.detail as Record<string, unknown>).code) || + (typeof (body.error as Record<string, unknown>)?.code === "string" && (body.error as Record<string, unknown>).code); + return code === "deactivated_workspace"; + }),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/index.test.ts` around lines 189 - 191, The test mock is too narrow: update the isDeactivatedWorkspaceError mock in index.test.ts to mirror getStructuredErrorCode’s behavior by checking the four locations in priority (top-level errorBody.code, errorBody.detail.code, then errorBody.error.code, and errorBody.error.type) and return true if any equals "deactivated_workspace"; reference the mock name isDeactivatedWorkspaceError and the real extractor getStructuredErrorCode so the mock returns the same boolean for all payload shapes the real function would handle.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@index.ts`:
- Line 3199: The pruning logic only adds the refresh-token key
(removeFromActive.add(`refreshToken:${account.refreshToken}`)) but later only
filters by the workspace identity (getWorkspaceIdentityKey(account)), so
accounts with workspace metadata stay active; fix by adding the workspace key to
removals as well (call removeFromActive.add(getWorkspaceIdentityKey(account))
wherever you add the refreshToken removal) and update any matching/filtering
logic to consider both keys (i.e., check membership for either
`refreshToken:${account.refreshToken}` or getWorkspaceIdentityKey(account) when
deciding to prune).
- Around line 2395-2411: The code is adding "workspace-deactivated" records into
the generic flagged_accounts pool which verifyFlaggedAccounts() later may
restore; adjust the flow so deactivated workspaces are not placed into the
generic flagged pool or are marked as non-restorable. Specifically, in the block
that builds FlaggedAccountMetadataV1 and calls
loadFlaggedAccounts()/saveFlaggedAccounts(), skip pushing or updating entries
whose flaggedReason === "workspace-deactivated" (or alternatively add a
nonRestorable=true flag to the record) and update verifyFlaggedAccounts() to
ignore records with that marker or reason (referencing loadFlaggedAccounts,
saveFlaggedAccounts, FlaggedAccountMetadataV1, matchesWorkspaceIdentity, and
verifyFlaggedAccounts to locate the related logic).
- Around line 3045-3047: When isDeactivatedWorkspaceError(errorBody,
response.status) is true you must short-circuit the quota-probe loop instead of
letting later errors overwrite it; locate the catch that currently sets
lastError and change the behavior so that upon detecting
isDeactivatedWorkspaceError(...) you immediately rethrow the sentinel
Error("deactivated_workspace") (or set lastError to that sentinel and
break/return from the outer probe loop) so the deactivated-workspace condition
is preserved for the caller; references: isDeactivatedWorkspaceError,
Error("deactivated_workspace"), lastError.
---
Nitpick comments:
In `@test/index.test.ts`:
- Around line 189-191: The test mock is too narrow: update the
isDeactivatedWorkspaceError mock in index.test.ts to mirror
getStructuredErrorCode’s behavior by checking the four locations in priority
(top-level errorBody.code, errorBody.detail.code, then errorBody.error.code, and
errorBody.error.type) and return true if any equals "deactivated_workspace";
reference the mock name isDeactivatedWorkspaceError and the real extractor
getStructuredErrorCode so the mock returns the same boolean for all payload
shapes the real function would handle.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 10a3e0cf-7e92-451e-b7c6-87c39b8b1326
📒 Files selected for processing (5)
index.tslib/request/fetch-helpers.tstest/fetch-helpers.test.tstest/index-retry.test.tstest/index.test.ts
There was a problem hiding this comment.
🧹 Nitpick comments (1)
lib/storage.ts (1)
924-943: Identity key computation differs intentionally from normal accounts.The flagged account identity key uses a compound format (
organizationId:${org}|accountId:${acc}) when both fields are present, whereas normal accounts (toAccountIdentityKey) use onlyorganizationId:${org}. This difference is intentional—flagged storage needs to track individual workspaces that have been deactivated, rather than collapsing all workspaces under the same organization.Consider adding a brief inline comment explaining this design decision to prevent future confusion during maintenance.
📝 Suggested documentation
+ // Unlike normal account deduplication which uses only organizationId as the top-level key, + // flagged accounts use a compound key to preserve distinct workspace entries that share + // an organization but have different accountIds (sibling workspaces). const getFlaggedIdentityKey = (account: { organizationId?: string; accountId?: string; refreshToken: string; }): string => {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/storage.ts` around lines 924 - 943, getFlaggedIdentityKey intentionally uses a compound key (`organizationId:${org}|accountId:${acc}`) when both org and account are present to track individual flagged workspaces rather than collapsing them by organization; add a short inline comment inside or immediately above the getFlaggedIdentityKey function (and/or near normalizeFlaggedIdentityPart) stating this design decision and why it differs from toAccountIdentityKey so future maintainers understand the rationale.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@lib/storage.ts`:
- Around line 924-943: getFlaggedIdentityKey intentionally uses a compound key
(`organizationId:${org}|accountId:${acc}`) when both org and account are present
to track individual flagged workspaces rather than collapsing them by
organization; add a short inline comment inside or immediately above the
getFlaggedIdentityKey function (and/or near normalizeFlaggedIdentityPart)
stating this design decision and why it differs from toAccountIdentityKey so
future maintainers understand the rationale.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fefabee6-49c9-482a-b8f7-b6a522b96b54
📒 Files selected for processing (5)
index.tslib/storage.tstest/fetch-helpers.test.tstest/index.test.tstest/storage.test.ts
✅ Files skipped from review due to trivial changes (1)
- index.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- test/fetch-helpers.test.ts
- test/index.test.ts
fix: gracefully handle deactivated_workspace — remove only the dead workspace, preserve siblings
fix: gracefully handle deactivated_workspace — remove only the dead workspace, preserve siblings
Problem
When one workspace in the pool returns
402 deactivated_workspace, the plugin needs to remove only that dead workspace and keep healthy siblings alive.The first pass fixed the basic failover, but review found three follow-up bugs:
organizationIdappeared under differentaccountIdvaluescodex-doctor/codex-healthstill removedtoken-invalidentries by a stalerefreshToken:key formatFix
lib/request/fetch-helpers.tsisDeactivatedWorkspaceError(errorBody, status)detection forHTTP 402deactivated workspacesworkspace_deactivatederror mapping for clearer user-facing handlingindex.tsdeactivated_workspacebefore the unsupported-model / rate-limit branches and fail over to the next healthy account/workspaceorganizationIdis shared butaccountIddifferstoken-invalidcleanup path so org-scoped accounts are actually removed from active storagelib/storage.tswithFlaggedAccountStorageTransaction(...)so flagged-account updates run under the shared storage lockrefreshToken, so sibling workspaces are preserved in flagged storage tooTests
test/fetch-helpers.test.tstest/index.test.tsisDeactivatedWorkspaceErrormock, deactivated-workspace sibling failover, and deep-check removal of only the dead org-scoped workspacetest/storage.test.tstest/index-retry.test.tsValidation
npm run typechecknpx eslint index.ts lib/storage.ts test/fetch-helpers.test.ts test/index.test.ts test/storage.test.tsnpx vitest run test/storage.test.ts test/index.test.ts test/fetch-helpers.test.ts test/index-retry.test.tsnote: greptile review for oc-chatgpt-multi-auth. cite files like
lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.Greptile Summary
this PR surgically fixes three follow-up bugs from the first
deactivated_workspacepass: workspace identity is now keyed byorganizationId|accountIdso siblings sharing arefreshTokenare never collapsed, flagged storage updates go throughwithFlaggedAccountStorageTransactionto eliminate the read-modify-write race under concurrent deactivated-workspace failures, andremoveFromActivenow uses the workspace identity key so org-scoped accounts are actually filtered out. the fetch-helpersisDeactivatedWorkspaceErrorexport and thenormalizeErrorPayloadearly-return are clean and well-isolated.key observations:
getWorkspaceIdentityKeyinindex.tsandgetFlaggedIdentityKeyinlib/storage.tsimplement identical logic independently — if key format drifts between them, active filtering and flagged dedup will silently disagree on which account is whichdeactivated_workspacebranch in the deep-check probe loop (checkAccountHealth, ~line 3300) has no vitest coverage; the test suite covers request-failover andtoken-invaliddeep-check but not this third pathvi.mocked(storageModule.saveFlaggedAccounts)rather thanwithFlaggedAccountStorageTransaction, coupling it to mock wiring rather than the production APIrenameWithWindowsRetryexhausts its retries insidewithFlaggedAccountStorageTransaction, the workspace is removed from active storage but never written to flagged storage — a known inconsistency window with no regression test or code comment acknowledging itConfidence Score: 3/5
Important Files Changed
deactivated_workspacefailover before unsupported-model branch, useswithFlaggedAccountStorageTransactionfor atomic flagged writes, and fixesremoveFromActiveto use workspace identity keys — but duplicatesgetWorkspaceIdentityKeylogic fromlib/storage.tsand the deep-checkdeactivated_workspaceprobe path has no test coveragewithFlaggedAccountStorageTransactionunderwithStorageLock, splitssaveFlaggedAccountsUnlockedout of the public API, and fixes flagged dedup to use workspace identity key —renameWithWindowsRetryis used correctly but no test covers the EBUSY-exhaustion inconsistency window for the new transaction pathisDeactivatedWorkspaceErrorexport with multi-path code extraction (top-level,detail,errornesting) and normalizes 402 responses to aworkspace_deactivatedpayload — clean, well-scoped, no issuesdeactivated_workspaceprobe branch is missing; test assertion onsaveFlaggedAccountsspy is coupled to mock internals rather than thewithFlaggedAccountStorageTransactionproduction APIwithFlaggedAccountStorageTransaction— well structured and directly targets the new transaction APIisDeactivatedWorkspaceError: () => falseto keep retry mock compile-compatible with the new import — no issuesSequence Diagram
sequenceDiagram participant FH as fetch handler participant AM as AccountManager participant FW as withFlaggedAccountStorageTransaction participant FS as flagged-accounts.json participant AS as accounts.json FH->>AM: getCurrentOrNextForFamilyHybrid() AM-->>FH: account (dead workspace) FH->>FH: isDeactivatedWorkspaceError(errorBody, 402) Note over FH: workspaceDeactivated = true FH->>FW: withFlaggedAccountStorageTransaction(handler) FW->>FS: loadFlaggedAccountsUnlocked() [under lock] FS-->>FW: current flagged storage FW->>FW: upsertFlaggedAccountRecord(nextStorage, flaggedRecord) FW->>FS: saveFlaggedAccountsUnlocked(nextStorage) [renameWithWindowsRetry] FW-->>FH: done FH->>AM: removeAccount(deadWorkspace) AM->>AS: saveToDiskDebounced() AM-->>FH: removed=true FH->>FH: attempted.clear(), break Note over FH: retry loop → live sibling workspace FH->>AM: getCurrentOrNextForFamilyHybrid() AM-->>FH: account (live workspace) FH-->>FH: 200 OKPrompt To Fix All With AI
Last reviewed commit: "fix: keep deactivate..."
Context used: