Skip to content

fix: harden wrapper exits and matrix regressions#30

Closed
ndycode wants to merge 9 commits intodevfrom
fix/wrapper-matrix-review-followups
Closed

fix: harden wrapper exits and matrix regressions#30
ndycode wants to merge 9 commits intodevfrom
fix/wrapper-matrix-review-followups

Conversation

@ndycode
Copy link
Owner

@ndycode ndycode commented Mar 3, 2026

Summary - update scripts/codex.js to use structured main() flow and process.exitCode normalization for multi-auth paths - update scripts/codex-multi-auth.js to normalize non-integer exit codes to 1 while preserving integer exit propagation - remove stale matrix runner plumbing and align matrix execution with current CLI args in scripts/test-model-matrix.js - expand regression coverage for wrapper exit behavior, Windows shim guard stability, matrix command args, and .cmd fallback paths ## Risk Notes - low runtime risk: wrapper change is limited to exit handling and startup guard behavior - matrix harness changes are scoped to script utilities and tests ## Validation Evidence - npm test from D:\backup\projects\projects-nobuild\repo-projects\codex-multi-auth passed (100 files, 2325 tests) - npx vitest run test/test-model-matrix-script.test.ts test/codex-bin-wrapper.test.ts test/codex-multi-auth-bin-wrapper.test.ts passed (3 files, 20 tests) ## Worktree - worktree path: D:\backup\projects\projects-nobuild\repo-projects\codex-multi-auth-wt-wrapper-matrix-review - base: origin/main at 46510fed4d25945f43c61226976ac59698d24a2a - branch: fix/wrapper-matrix-review-followups ## PR Resolution Update (2026-03-05, follow-up) ### What changed - Deduplicated wrapper test process-spawn helpers in test/codex-bin-wrapper.test.ts by introducing shared runNodeScript(...) and routing runWrapper(...) and runWrapperScript(...) through it. ### How to test - npm test -- test/codex-bin-wrapper.test.ts - npm run typecheck - npm run lint ### Risk / rollout notes - Test-only refactor; no runtime behavior change. ## PR Resolution Update (2026-03-05, follow-up 2) ### What changed - Fixed PowerShell profile guard idempotency in scripts/codex.js by avoiding unconditional leading/trailing separators when replacing an existing guard block. - Added deterministic retry-path coverage in test/codex-bin-wrapper.test.ts for transient shim write failure (EBUSY) using a preload hook + syncBuiltinESMExports(). - Added repeated-startup idempotency regression in test/codex-bin-wrapper.test.ts to ensure profile content is stable on subsequent runs. ### How to test - npm test -- test/codex-bin-wrapper.test.ts - npm run typecheck - npm run lint ### Risk / rollout notes - Runtime change is scoped to Windows PowerShell profile guard rewrite formatting/idempotency. - Main risk is profile-content normalization edge cases; covered by repeated-startup regression.

PR Resolution Update (2026-03-05, follow-up 3)

What changed

  • Hardened scripts/codex-multi-auth.js exit normalization to fail closed for null / empty-string returns from runCodexMultiAuthCli.
  • Preserved integer and numeric-string propagation behavior.
  • Added regressions in test/codex-multi-auth-bin-wrapper.test.ts for null/empty-string exit paths.

How to test

  • npm test -- test/codex-multi-auth-bin-wrapper.test.ts test/codex-bin-wrapper.test.ts
  • npm run typecheck
  • npm run lint

Risk / rollout notes

  • Behavior change is intentional fail-closed normalization for invalid wrapper return values.

Follow-up Review Resolution (2026-03-05)

  • Consolidated duplicated Windows bat/cmd shim content generation behind a shared helper in scripts/codex.js.
  • Expanded transient Windows shim-write retry regression coverage to EBUSY, EPERM, and EACCES in test/codex-bin-wrapper.test.ts.
  • Added clean-exit regression for numeric-string 0 return values from runCodexMultiAuthCli.

How to test

  • npm test -- test/codex-bin-wrapper.test.ts
  • npm run typecheck
  • npm run lint

Risk / rollout notes

  • No external behavior change; hardening and regression coverage only.

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

this pr hardens wrapper exit normalization, adds windows shim/powershell profile guard installation with retry-backed writes, and expands regression coverage across three test files. the runtime changes are low-risk and well-scoped; the main gaps are in test/codex-bin-wrapper.test.ts cleanup hygiene.

  • scripts/codex-multi-auth.js: parseCliExitCode replaces the old single-line guard; regex /^\d+$/ correctly rejects signed, hex, and octal strings — all prior review iterations fully addressed
  • scripts/codex.js: windows shim install logic uses runWithWindowsFsRetry (EBUSY/EPERM/EACCES, exponential backoff) for all file writes; upsertPowerShellProfileGuard idempotency fix prevents redundant profile rewrites on second startup; buildWindowsBatchShimContent / buildWindowsCmdShimContent consolidated behind shared helper
  • scripts/test-model-matrix.js: buildModelCaseArgs extraction and portStart removal are clean; no behavioral change
  • test/codex-multi-auth-bin-wrapper.test.ts: new file with correct per-dir try-catch in afterEach and EACCES-aware retry — the right pattern
  • test/codex-bin-wrapper.test.ts: two cleanup issues relative to the companion file: afterEach lacks per-dir try-catch (first failure leaks remaining dirs) and isRetriableFsError omits EACCES, inconsistent with scripts/codex.js and the companion test file added in this same PR

Confidence Score: 4/5

  • safe to merge; two test-only cleanup inconsistencies in test/codex-bin-wrapper.test.ts are the only remaining gaps
  • runtime changes are well-guarded and regression-covered; the two issues (afterEach missing try-catch and isRetriableFsError missing EACCES) are test-only and can cause leaked temp dirs / flaky cleanup on windows, but do not affect production behavior
  • test/codex-bin-wrapper.test.ts — afterEach cleanup and isRetriableFsError need alignment with test/codex-multi-auth-bin-wrapper.test.ts

Important Files Changed

Filename Overview
scripts/codex-multi-auth.js adds version env hydration and replaces single-line exit guard with parseCliExitCode; regex /^\d+$/ correctly rejects signed, hex, and octal strings, addressing all prior review iterations
scripts/codex.js adds windows shim install/retry logic, powershell profile guard with idempotency fix, and shared buildWindowsCmdLikeShimContent helper; async chain is correct and all writes are routed through retry-backed helpers
scripts/test-model-matrix.js moves variable declarations before runQuiet, extracts buildModelCaseArgs / __buildModelCaseArgsForTests, and removes unused portStart; clean refactor with no behavioral change
test/codex-bin-wrapper.test.ts good addition of shim, retry, and idempotency tests; two cleanup issues: afterEach lacks per-dir try-catch (leaks dirs on first failure) and isRetriableFsError omits EACCES, inconsistent with the companion file added in the same PR
test/codex-multi-auth-bin-wrapper.test.ts new test file; robust afterEach with per-dir try-catch and EACCES-aware retry; comprehensive exit-code coverage including null, empty, signed, hex/octal, and whitespace-only cases
test/test-model-matrix-script.test.ts adds .cmd fallback test and matrix arg shape test against the new __buildModelCaseArgsForTests export; straightforward regression coverage

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[codex.js main] --> B[hydrateCliVersionEnv]
    B --> C{win32?}
    C -->|no| E[normalizeAuthAlias / route args]
    C -->|yes| D[ensureWindowsShellShimGuards]
    D --> D1{CODEX_MULTI_AUTH_WINDOWS_BATCH_SHIM_GUARD != 0?}
    D1 -->|no| E
    D1 -->|yes| D2[resolveWindowsShimDirectoryFromPath]
    D2 --> D3[ensureWindowsShellShim codex.bat / .cmd / .ps1]
    D3 --> D4[runWithWindowsFsRetry\nEBUSY/EPERM/EACCES\nexponential backoff]
    D4 --> D5[ensurePowerShellProfileGuard]
    D5 --> D6[upsertPowerShellProfileGuard\nidempotent - skips if content unchanged]
    D6 --> E
    E --> F{shouldHandleMultiAuthAuth?}
    F -->|yes| G[runCodexMultiAuthCli]
    G --> H[parseCliExitCode\nregex /^\\d+$/ - rejects signed/hex/octal]
    H --> I[process.exitCode]
    F -->|no| J[spawn real codex bin]
    J --> I
Loading

Fix All in Codex

Last reviewed commit: cee89c7

Comments Outside Diff (2)

  1. test/codex-bin-wrapper.test.ts, line 154-158 (link)

    afterEach leaks dirs if first removal throws

    the loop calls removeDirectoryWithRetry without a try-catch, so a single failure (after exhausting retries) throws, exits the loop, and abandons every remaining dir in createdDirs. on windows this leaves temp dirs with open handles that poison subsequent test runs.

    codex-multi-auth-bin-wrapper.test.ts (the companion file added in this same PR) uses the correct pattern: wrap each removal in try-catch, accumulate the first error, rethrow after all dirs are attempted.

  2. test/codex-bin-wrapper.test.ts, line undefined (link)

    isRetriableFsError missing EACCES

    this helper only retries on EBUSY and EPERM, but scripts/codex.js retries on all three codes (EBUSY, EPERM, EACCES) and the companion codex-multi-auth-bin-wrapper.test.ts (added in this PR) also handles EACCES. on windows, antivirus or acl-based locks emit EACCES on directory deletion — missing the case means cleanup silently fails after one attempt instead of backing off.

Additional Follow-up Resolution (2026-03-05)

  • Hardened codex-multi-auth wrapper exit parsing to accept only unsigned decimal string exit codes.
  • Added regressions for signed values (+0, -0), whitespace-only strings, and explicit zero-string success behavior.
  • Improved test cleanup resiliency to continue cleanup attempts across all fixture directories even if one removal fails.

How to test

  • npm test -- test/codex-bin-wrapper.test.ts test/codex-multi-auth-bin-wrapper.test.ts
  • npm run typecheck
  • npm run lint

Risk / rollout notes

  • No intended behavior change beyond stricter fail-closed exit parsing for ambiguous string formats.

Co-authored-by: Codex <noreply@openai.com>
@ndycode ndycode closed this Mar 3, 2026
@ndycode ndycode deleted the fix/wrapper-matrix-review-followups branch March 3, 2026 12:12
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

6 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

@ndycode ndycode restored the fix/wrapper-matrix-review-followups branch March 3, 2026 12:18
@ndycode ndycode reopened this Mar 3, 2026
Co-authored-by: Codex <noreply@openai.com>
@coderabbitai
Copy link

coderabbitai bot commented Mar 3, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

adds windows shim guards and cli-version hydration to the wrapper, normalizes propagated exit codes, refactors model-matrix arg construction, and centralizes codex server cleanup state. (50 words)

Changes

Cohort / File(s) Summary
cli version hydration
lib/scripts/codex-multi-auth.js:1, lib/scripts/codex.js:1
reads package.json via createRequire inside try/catch and sets CODEX_MULTI_AUTH_CLI_VERSION; adds parseCliExitCode to normalize integer, numeric-string, and invalid exit values (defaults to 1).
windows shell guard infrastructure
lib/scripts/codex.js:1
adds retryable fs helpers, write/mkdir retry logic, batch/cmd/powershell shim guards, powershell profile insertion/updating, shim detection/installation, and invokes guard installation before normal CLI processing.
model-matrix arg extraction & cleanup state
lib/scripts/test-model-matrix.js:1
extracts buildModelCaseArgs() and exposes __buildModelCaseArgsForTests; adds --skip-git-repo-check flag; hoists stopCodexServersQueue and spawnedCodexPids to module top and serializes cleanup.
wrapper integration & exit-code tests
test/codex-bin-wrapper.test.ts:1, test/codex-multi-auth-bin-wrapper.test.ts:1
centralizes node script runner helpers, adds fixture creation and retryable FS cleanup, and adds tests asserting propagation of integer and numeric-string exit codes and normalization of non-integer values.
model-matrix script tests
test/test-model-matrix-script.test.ts:1
adds tests for .cmd fallback to shell mode and for __buildModelCaseArgsForTests covering json output, git-check skip, and variant -c flags.

Sequence Diagram(s)

sequenceDiagram
  participant user as User
  participant wrapper as Codex Multi-Auth Wrapper
  participant shim as Windows Shim Installer
  participant fs as Filesystem
  participant codex as Real Codex Binary

  user->>wrapper: invoke cli
  wrapper->>wrapper: hydrate cli version (read package.json)
  wrapper->>shim: ensureWindowsShellShimGuards()
  shim->>fs: write shims / update powershell profile
  fs-->>shim: ack
  shim-->>wrapper: shim ready
  wrapper->>codex: resolve real codex executable
  wrapper->>codex: forward args
  codex-->>wrapper: exitCode (number|string)
  wrapper->>wrapper: normalize exit code -> process.exitCode
  wrapper-->>user: process exits with propagated code
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

flagged concerns:

  • missing regression tests for package.json-read failure modes. see lib/scripts/codex-multi-auth.js:1.
  • windows edge cases not fully covered (transient fs errors, antivirus locks, permission races). see lib/scripts/codex.js:1, test/codex-multi-auth-bin-wrapper.test.ts:1.
  • concurrency risks around powershell profile mutation and shim installation (parallel invocations may corrupt profiles). see lib/scripts/codex.js:1.
  • verify removal of portStart call sites does not break scenario assumptions. see lib/scripts/test-model-matrix.js:1.

Suggested labels

bug

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed Title follows conventional commits format (fix: lowercase summary, 48 chars ≤ 72 limit) and accurately describes the main changes: hardening exit-code propagation and adding Windows shim guard regressions.
Description check ✅ Passed PR description provides clear summary of changes, detailed follow-up resolutions with file citations, validation evidence, and risk assessment across all modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/wrapper-matrix-review-followups

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/codex-multi-auth.js`:
- Around line 17-18: The wrapper currently treats only integer-typed values as
valid (using Number.isInteger on exitCode), so numeric strings like "7" are
downgraded to 1; fix this in the codex-multi-auth wrapper by coercing the
returned exitCode from runCodexMultiAuthCli into a Number first (e.g., const
numeric = Number(exitCode)) and then setting process.exitCode to numeric if
Number.isInteger(numeric) otherwise fallback to 1; update the logic around the
exitCode variable and the process.exitCode assignment to mirror the
normalization used in codex.js while referencing runCodexMultiAuthCli and the
exitCode variable.

In `@test/codex-bin-wrapper.test.ts`:
- Around line 188-289: These two Windows-only tests ("installs Windows codex
shell guards to survive shim takeover" and "prefers invocation-derived shim
directory over PATH-decoy shim entries") currently return early on non-win32 and
can pass with zero assertions; add a guard to force a failure (or explicit skip)
when not running on Windows by inserting expect.hasAssertions() at the top of
each test (before the current if (process.platform !== "win32") return;) so a
non-Windows run cannot silently succeed, or alternatively replace the
early-return with an explicit test skip (vitest.test.skip for the whole case) —
update the test bodies that call runWrapper and runWrapperScript accordingly so
the unique test names and calls to runWrapper/runWrapperScript remain covered.

In `@test/codex-multi-auth-bin-wrapper.test.ts`:
- Around line 14-20: The isRetriableFsError function currently returns true only
for "EBUSY" and "EPERM" and should also treat "EACCES" as retriable to handle
transient Windows FS locks; update the function (isRetriableFsError) to include
code === "EACCES" in the return condition (or add it to the list of retriable
codes) so cleanup tests become deterministic on Windows while keeping the same
type-guard logic and error shape check.
- Around line 91-107: Add a regression test that covers the numeric-string exit
path: create a new test in test/codex-multi-auth-bin-wrapper.test.ts that
mirrors the existing non-integer test but writes codex-manager.js to export
runCodexMultiAuthCli returning the string "7" (instead of a number or
non-integer), then call runWrapper(fixtureRoot, ["auth","status"]) and assert
result.status is 7; this ensures the behavior referenced at
lib/scripts/codex-multi-auth.js:18 correctly propagates numeric-string exits.
Ensure you use createWrapperFixture and runWrapper just like the existing tests
so it remains deterministic under vitest.

ℹ️ Review info

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between d36b04f and 8ae183f.

📒 Files selected for processing (6)
  • scripts/codex-multi-auth.js
  • scripts/codex.js
  • scripts/test-model-matrix.js
  • test/codex-bin-wrapper.test.ts
  • test/codex-multi-auth-bin-wrapper.test.ts
  • test/test-model-matrix-script.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/test-model-matrix-script.test.ts
  • test/codex-multi-auth-bin-wrapper.test.ts
  • test/codex-bin-wrapper.test.ts
🧬 Code graph analysis (4)
scripts/codex-multi-auth.js (1)
scripts/codex.js (4)
  • require (13-13)
  • require (78-78)
  • pkg (14-14)
  • version (15-15)
test/codex-multi-auth-bin-wrapper.test.ts (2)
scripts/codex.js (3)
  • code (209-209)
  • code (222-222)
  • attempt (218-218)
lib/utils.ts (1)
  • sleep (65-67)
scripts/codex.js (2)
scripts/codex-multi-auth.js (3)
  • require (7-7)
  • pkg (8-8)
  • version (9-9)
lib/utils.ts (1)
  • sleep (65-67)
test/codex-bin-wrapper.test.ts (1)
scripts/codex.js (4)
  • shimDir (271-271)
  • codexBatchPath (446-446)
  • scriptDir (86-86)
  • scriptDir (265-265)
🔇 Additional comments (7)
scripts/codex-multi-auth.js (1)

6-15: good best-effort version hydration.

the startup env hydration is safe and non-blocking in lib/scripts/codex-multi-auth.js:6.

scripts/test-model-matrix.js (2)

133-182: cleanup serialization looks solid.

the queueing in lib/scripts/test-model-matrix.js:176 prevents overlapping pid cleanup and reduces race risk for concurrent runs.


247-269: good extraction for model-case arg construction.

lib/scripts/test-model-matrix.js:247 cleanly centralizes args/token building and supports deterministic verification through the test hook.

test/test-model-matrix-script.test.ts (1)

66-119: nice regression coverage expansion here.

the new assertions in test/test-model-matrix-script.test.ts:66 and test/test-model-matrix-script.test.ts:91 validate fallback behavior and command-arg construction directly.

test/codex-bin-wrapper.test.ts (1)

348-367: good regression for numeric-string exit propagation.

the assertion in test/codex-bin-wrapper.test.ts:348 closes an important wrapper-exit edge case.

scripts/codex.js (2)

194-243: windows fs retry hardening is well done.

the retry wrapper in lib/scripts/codex.js:213 covers transient lock codes and uses bounded backoff, which is the right guardrail for shim/profile writes.


516-553: startup sequencing for guard install is strong.

running guard installation before command routing in lib/scripts/codex.js:550 reduces post-update shim drift and keeps wrapper behavior stable.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/codex-bin-wrapper.test.ts`:
- Around line 82-94: Two identical spawn helper functions are duplicated in this
test file; consolidate them by creating a single shared helper (e.g.,
runWrapperScript) and replace the duplicated implementation (the other spawn
helper at lines 64-80) with calls to that single function. Ensure the single
helper accepts (scriptPath: string, args: string[], extraEnv?:
NodeJS.ProcessEnv) and returns SpawnSyncReturns<string>, preserves the same
spawnSync options (encoding: "utf8", env merged from process.env and extraEnv),
and update all tests that referenced the duplicated helper to import or call
this unified runWrapperScript function so behavior stays aligned as tests
evolve.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ff894183-7127-41a6-827d-c7589c129a89

📥 Commits

Reviewing files that changed from the base of the PR and between 8ae183f and 0231eab.

📒 Files selected for processing (3)
  • scripts/codex-multi-auth.js
  • test/codex-bin-wrapper.test.ts
  • test/codex-multi-auth-bin-wrapper.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/codex-bin-wrapper.test.ts
  • test/codex-multi-auth-bin-wrapper.test.ts
🧬 Code graph analysis (3)
test/codex-bin-wrapper.test.ts (1)
scripts/codex.js (4)
  • shimDir (271-271)
  • codexBatchPath (446-446)
  • scriptDir (86-86)
  • scriptDir (265-265)
test/codex-multi-auth-bin-wrapper.test.ts (3)
lib/logger.ts (1)
  • error (389-393)
scripts/codex.js (3)
  • code (209-209)
  • code (222-222)
  • attempt (218-218)
lib/utils.ts (1)
  • sleep (65-67)
scripts/codex-multi-auth.js (1)
scripts/codex.js (4)
  • require (13-13)
  • require (78-78)
  • pkg (14-14)
  • version (15-15)
🔇 Additional comments (4)
scripts/codex-multi-auth.js (1)

6-15: good hardening of wrapper bootstrap and exit normalization.

the best-effort version hydration and numeric-string propagation are both in place and consistent with expected wrapper behavior in lib/scripts/codex-multi-auth.js:6-20.

Also applies to: 18-20

test/codex-bin-wrapper.test.ts (2)

188-281: solid windows shim regression coverage.

the assertions on .bat/.cmd/.ps1 guard content and invocation-path precedence give good protection for windows-specific shim behavior in test/codex-bin-wrapper.test.ts:188-281.

as per coding guidelines "test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions."


340-359: good regression for numeric-string exit propagation.

this closes the exit-code gap by explicitly asserting "7" propagates as status 7 in test/codex-bin-wrapper.test.ts:340-359.

test/codex-multi-auth-bin-wrapper.test.ts (1)

71-125: exit-behavior regression set is complete for this wrapper path.

the integer, non-integer, and numeric-string cases in test/codex-multi-auth-bin-wrapper.test.ts:71-125 are targeted and deterministic.

as per coding guidelines "test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions."

- extract shared runNodeScript helper for wrapper test process spawning
- route runWrapper and runWrapperScript through the shared helper to avoid drift

Co-authored-by: Codex <noreply@openai.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/codex-bin-wrapper.test.ts`:
- Around line 186-280: Add a deterministic regression test in
test/codex-bin-wrapper.test.ts that simulates a transient Windows fs write
failure (EBUSY/EPERM/EACCES) on the first attempted shim write and verifies the
wrapper's retry logic eventually creates the guard; to do this, use vitest's
vi.spyOn or vi.stub to intercept fs.writeFileSync (or the wrapper-internal write
helper) used by runWrapper/runWrapperScript and make it throw a single error
with code "EBUSY" for the first matching path (e.g., the
codex.bat/codex.cmd/codex.ps1 target) then behave normally, invoke runWrapper or
runWrapperScript with the same environment variables as the existing Windows
tests (CODEX_MULTI_AUTH_WINDOWS_BATCH_SHIM_GUARD etc.), and assert result.status
=== 0 plus the same expect(readFileSync(...)).toContain("codex-multi-auth
windows shim guardian v1") checks; ensure the spy is restored/cleared after the
test so other tests remain deterministic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 774186a4-31a2-42f5-8d9e-85e08b896af5

📥 Commits

Reviewing files that changed from the base of the PR and between 0231eab and a6ccb6c.

📒 Files selected for processing (1)
  • test/codex-bin-wrapper.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/codex-bin-wrapper.test.ts
🧬 Code graph analysis (1)
test/codex-bin-wrapper.test.ts (1)
scripts/codex.js (4)
  • shimDir (271-271)
  • codexBatchPath (446-446)
  • scriptDir (86-86)
  • scriptDir (265-265)
🔇 Additional comments (2)
test/codex-bin-wrapper.test.ts (2)

69-92: good helper consolidation for spawn behavior.

test/codex-bin-wrapper.test.ts:69-92 cleanly centralizes process spawning, so env merge and encoding stay consistent across wrapper tests.


338-357: good regression for numeric-string exit propagation.

test/codex-bin-wrapper.test.ts:338-357 directly validates "7" exit propagation through the wrapper and closes the matrix exit normalization gap.

- prevent extra leading separators when rewriting existing PowerShell guard blocks
- add deterministic Windows retry regression for transient shim write failure
- add repeated-startup idempotency regression for PowerShell profile guard content

Co-authored-by: Codex <noreply@openai.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/codex.js`:
- Around line 292-336: The two functions buildWindowsBatchShimContent and
buildWindowsCmdShimContent are identical; remove the duplicate by creating a
single shared builder (e.g., buildWindowsShimContent) that uses
WINDOWS_SHIM_MARKER and returns the joined CRLF string, then update any callers
or exports that referenced buildWindowsBatchShimContent and
buildWindowsCmdShimContent to call the single buildWindowsShimContent (or
re-export it under both names if backward compatibility is needed) so the
bootstrap logic is maintained in one place.

In `@test/codex-bin-wrapper.test.ts`:
- Around line 460-479: Add a second deterministic test alongside the existing
numeric-string case that verifies clean-exit propagation for the explicit
numeric-string "0": in test/codex-bin-wrapper.test.ts create a fixture similar
to the existing it("propagates numeric-string multi-auth exit codes") block but
have runCodexMultiAuthCli return "0" and then call runWrapper(fixtureRoot,
["auth","status"], { CODEX_MULTI_AUTH_BYPASS: "",
CODEX_MULTI_AUTH_REAL_CODEX_BIN: "" }) and assert expect(result.status).toBe(0);
this pins the zero-exit path and prevents normalization regressions while using
the same helper symbols (createWrapperFixture, runCodexMultiAuthCli,
runWrapper).
- Around line 247-316: The test injects only EBUSY but the retry logic in
scripts/codex.js also treats EPERM and EACCES as retriable; update the test
"retries transient shim write failures and still installs guards" to
parameterize over the three error codes (EBUSY, EPERM, EACCES). For each code,
create the preload hook (hookPath) that sets error.code to the current code, set
NODE_OPTIONS to require that hook, call runWrapper(fixtureRoot, ["--version"],
{...}) and assert status 0 and that codex.bat/cmd/ps1 contain the guardian
string; reuse existing helpers createWrapperFixture, createFakeCodexBin and
shimDir creation, and use test.each or a small loop so the same expectations are
executed for each error code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 3e36ce10-f5d3-4483-84ca-1d515b1a7e1c

📥 Commits

Reviewing files that changed from the base of the PR and between a6ccb6c and 0a7d9ea.

📒 Files selected for processing (2)
  • scripts/codex.js
  • test/codex-bin-wrapper.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/codex-bin-wrapper.test.ts
🧬 Code graph analysis (2)
scripts/codex.js (2)
scripts/codex-multi-auth.js (1)
  • require (7-7)
lib/utils.ts (1)
  • sleep (65-67)
test/codex-bin-wrapper.test.ts (1)
scripts/codex.js (4)
  • shimDir (271-271)
  • codexBatchPath (446-446)
  • scriptDir (86-86)
  • scriptDir (265-265)

ndycode and others added 2 commits March 5, 2026 11:21
- treat only integer numbers or non-empty numeric strings as valid exit codes
- prevent null/empty-string coercion to success via Number()
- add regressions for null and empty-string exit code normalization

Co-authored-by: Codex <noreply@openai.com>
Deduplicate identical Windows cmd/batch shim content builders, expand transient shim-write retry regression coverage to EBUSY/EPERM/EACCES, and add numeric-string "0" clean-exit coverage.

Co-authored-by: Codex <noreply@openai.com>
Prevent codex-multi-auth wrapper from accepting hex/octal-like string exit codes via Number coercion. Parse only signed decimal digit strings and keep non-decimal strings fail-closed. Add regressions for 0xff and 0o7 inputs.

Co-authored-by: Codex <noreply@openai.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/codex.js (1)

183-192: ⚠️ Potential issue | 🟡 Minor

scripts/codex.js:183 should restrict exit code parsing to decimal integers, matching scripts/codex-multi-auth.js:17.

normalizeExitCode accepts Number("0xff") = 255, but parseCliExitCode rejects "0xff" via regex /^[+-]?\d+$/. this violates the stated intent to "restrict wrapper string exit codes to decimal integers". test/codex-bin-wrapper.test.ts also lacks the regression test case present in test/codex-multi-auth-bin-wrapper.test.ts:127 that validates hex/octal strings are treated as failures.

alignment required
 function normalizeExitCode(value) {
 	if (typeof value === "number" && Number.isInteger(value)) {
 		return value;
 	}
-	const parsed = Number(value);
-	if (Number.isInteger(parsed)) {
-		return parsed;
+	if (typeof value === "string") {
+		const trimmed = value.trim();
+		if (/^[+-]?\d+$/.test(trimmed)) {
+			const parsed = Number.parseInt(trimmed, 10);
+			if (Number.isInteger(parsed)) {
+				return parsed;
+			}
+		}
 	}
 	return 1;
 }

add test case to test/codex-bin-wrapper.test.ts matching test/codex-multi-auth-bin-wrapper.test.ts:127 for "0xff" and "0o7" inputs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/codex.js` around lines 183 - 192, normalizeExitCode currently coerces
non-decimal numeric strings (e.g., "0xff") via Number(...) which allows
hex/octal; change normalizeExitCode to only accept decimal integer strings by
first checking the string against the same regex used by parseCliExitCode
(/^[+-]?\d+$/) or by using a decimal-only parse path, and only then converting
to Number; update logic in function normalizeExitCode to mirror parseCliExitCode
behavior so "0xff" and "0o7" are rejected and fall through to returning 1. Also
add a regression test to test/codex-bin-wrapper.test.ts (matching the case in
test/codex-multi-auth-bin-wrapper.test.ts:127) verifying that "0xff" and "0o7"
inputs are treated as failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/codex.js`:
- Around line 213-231: The function runWithWindowsFsRetry contains an
unreachable return (return undefined) after the for-loop because any failure on
the final attempt throws from the catch, so remove the trailing "return
undefined" to clean up dead code; locate the runWithWindowsFsRetry function and
delete the final return statement (no other logic changes needed).

In `@test/codex-multi-auth-bin-wrapper.test.ts`:
- Around line 71-181: Add two regression tests to the codex-multi-auth bin
wrapper suite: one where runCodexMultiAuthCli returns a whitespace-only string
("   ") and assert runWrapper(...).status === 1, and one where it returns the
numeric-string "0" and assert runWrapper(...).status === 0; mirror the existing
test pattern (use createWrapperFixture(), mkdirSync(join(fixtureRoot, "dist",
"lib"), { recursive: true }), writeFileSync(join(distLibDir,
"codex-manager.js"), ...) exporting async function runCodexMultiAuthCli() that
returns the desired value, then call runWrapper(fixtureRoot, ["auth","status"])
and add the two expects to the describe block.

---

Outside diff comments:
In `@scripts/codex.js`:
- Around line 183-192: normalizeExitCode currently coerces non-decimal numeric
strings (e.g., "0xff") via Number(...) which allows hex/octal; change
normalizeExitCode to only accept decimal integer strings by first checking the
string against the same regex used by parseCliExitCode (/^[+-]?\d+$/) or by
using a decimal-only parse path, and only then converting to Number; update
logic in function normalizeExitCode to mirror parseCliExitCode behavior so
"0xff" and "0o7" are rejected and fall through to returning 1. Also add a
regression test to test/codex-bin-wrapper.test.ts (matching the case in
test/codex-multi-auth-bin-wrapper.test.ts:127) verifying that "0xff" and "0o7"
inputs are treated as failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 065f8f65-0243-4167-9083-852d83007e81

📥 Commits

Reviewing files that changed from the base of the PR and between 0a7d9ea and fe6ff87.

📒 Files selected for processing (4)
  • scripts/codex-multi-auth.js
  • scripts/codex.js
  • test/codex-bin-wrapper.test.ts
  • test/codex-multi-auth-bin-wrapper.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/codex-multi-auth-bin-wrapper.test.ts
  • test/codex-bin-wrapper.test.ts
🧬 Code graph analysis (4)
scripts/codex.js (2)
scripts/codex-multi-auth.js (3)
  • require (7-7)
  • pkg (8-8)
  • version (9-9)
lib/utils.ts (1)
  • sleep (65-67)
scripts/codex-multi-auth.js (1)
scripts/codex.js (8)
  • require (13-13)
  • require (78-78)
  • pkg (14-14)
  • version (15-15)
  • parsed (187-187)
  • exitCode (551-551)
  • exitCode (578-578)
  • runCodexMultiAuthCli (547-547)
test/codex-multi-auth-bin-wrapper.test.ts (3)
lib/logger.ts (1)
  • error (389-393)
scripts/codex.js (2)
  • code (209-209)
  • code (222-222)
lib/utils.ts (1)
  • sleep (65-67)
test/codex-bin-wrapper.test.ts (1)
scripts/codex.js (4)
  • shimDir (271-271)
  • codexBatchPath (431-431)
  • scriptDir (86-86)
  • scriptDir (265-265)
🔇 Additional comments (19)
scripts/codex-multi-auth.js (3)

6-15: version hydration looks good.

mirrors the pattern in scripts/codex.js:11-22 with best-effort error suppression. the defensive try/catch ensures startup isn't blocked if package.json is unreadable.


17-31: parseCliExitCode correctly restricts to decimal integers.

the /^[+-]?\d+$/ regex rejects hex/octal-like strings ("0xff", "0o7") per commit message, and Number.parseInt(trimmed, 10) ensures decimal parsing. the fallback to 1 on invalid input implements fail-closed semantics.


33-34: exit code propagation correctly normalized.

the past review comment about "7" being downgraded to 1 is addressed - parseCliExitCode now handles numeric strings properly.

test/codex-bin-wrapper.test.ts (6)

64-92: spawn helper consolidation looks correct.

runNodeScript is now the single source for spawn/env wiring, with both runWrapper and runWrapperScript delegating cleanly. this addresses the past nitpick about avoiding drift.


186-244: windows shim guard test covers the critical paths.

it.skipIf prevents silent no-assert passes on non-win32. assertions verify all three shim files (codex.bat, codex.cmd, codex.ps1) and the powershell profile guard. deterministic fixture isolation with cleanup in afterEach.


246-318: retry regression test correctly parameterizes all retriable codes.

the loop over ["EBUSY", "EPERM", "EACCES"] exercises each code path in scripts/codex.js:197 (RETRYABLE_WINDOWS_FS_CODES). the preload hook with syncBuiltinESMExports() ensures the esm binding sees the injected fault.


320-368: profile guard idempotency test is solid.

the double invocation with content comparison at test/codex-bin-wrapper.test.ts:362 locks the idempotency behavior. the startsWith("\r\n") === false check at line 363 catches the separator regression noted in pr objectives.


370-403: shim directory precedence test correctly validates invocation-derived path.

directly exercises resolveWindowsShimDirectoryFromInvocation in scripts/codex.js:261-276. the assertion that decoy dir has no codex.bat (line 402) confirms precedence logic.


462-502: numeric-string exit code regressions are complete.

covers both "7" (non-zero propagation) and "0" (clean-exit propagation). these lock the behavior of parseCliExitCode in scripts/codex-multi-auth.js:17-31.

scripts/codex.js (6)

11-22: version hydration pattern is consistent with codex-multi-auth.js.

both wrappers now propagate CODEX_MULTI_AUTH_CLI_VERSION from package.json at startup. the try/catch ensures startup isn't blocked on read failures.


199-203: local sleep helper duplicates lib/utils.ts:64-66.

minor duplication, but acceptable here to avoid adding an import dependency in the bin wrapper script. no action needed.


292-321: shim builder consolidation correctly deduplicates batch/cmd content.

buildWindowsCmdLikeShimContent is the shared base, and both buildWindowsBatchShimContent and buildWindowsCmdShimContent delegate to it. this addresses the past review nitpick.


355-410: ensureWindowsShellShim handles idempotency and stock shim detection correctly.

the marker check at line 368 prevents unnecessary rewrites. stock openai shim detection at lines 382-395 allows takeover without overwriteCustomShim.


444-485: profile guard upsert handles separator idempotency correctly.

the whitespace normalization at lines 460-464 and the trimEnd() prevent accumulation of extra separators on repeated runs. this addresses the pr note about "corrected separator handling."


537-539: startup order is correct: hydration then guards before cli processing.

hydrateCliVersionEnv() and ensureWindowsShellShimGuards() run before argument processing, ensuring version metadata and shim protection are established early.

test/codex-multi-auth-bin-wrapper.test.ts (4)

14-38: retriable fs error handling includes EACCES.

line 19 now includes code === "EACCES" alongside EBUSY and EPERM, addressing the past review about cleanup flakiness on windows.


71-125: exit code propagation tests cover the main paths.

integer (line 88), non-integer (line 106), and decimal numeric-string (line 124) all have assertions. these lock the behavior of scripts/codex-multi-auth.js:17-31.


127-145: hex/octal regression test locks decimal-only restriction.

"0xff" and "0o7" both normalize to exit 1, as expected per the /^[+-]?\d+$/ regex in scripts/codex-multi-auth.js:23. addresses the greptile flag in pr objectives.


147-181: null and empty-string regressions ensure fail-closed behavior.

both null (line 162) and "" (line 180) normalize to exit 1, implementing the fail-closed semantics from the pr commit message.

Restrict codex-multi-auth string exit parsing to unsigned decimal values, add regressions for signed/hex/octal/whitespace edge cases, and keep zero-string success coverage. Also ensure test fixture cleanup attempts all directories even when one removal fails, and remove unreachable retry helper dead code.

Co-authored-by: Codex <noreply@openai.com>
@ndycode ndycode changed the base branch from main to dev March 6, 2026 01:10
ndycode added a commit that referenced this pull request Mar 6, 2026
Co-authored-by: Codex <noreply@openai.com>
@ndycode
Copy link
Owner Author

ndycode commented Mar 6, 2026

Merged via integration PR to safely combine multiple approved changes.

@ndycode ndycode closed this Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant