Skip to content

fix(windows): resolve .cmd shim path for Node spawn#21

Closed
ElRaxy wants to merge 1 commit into
chorus-codes:mainfrom
ElRaxy:fix/windows-cmd-shim-spawn
Closed

fix(windows): resolve .cmd shim path for Node spawn#21
ElRaxy wants to merge 1 commit into
chorus-codes:mainfrom
ElRaxy:fix/windows-cmd-shim-spawn

Conversation

@ElRaxy
Copy link
Copy Markdown

@ElRaxy ElRaxy commented May 9, 2026

PR #1 — fix(windows): resolve .cmd shim path for Node spawn

Branch suggested: fix/windows-cmd-shim-spawn
File: src/daemon/headless.ts
Severity: Critical — every CLI reviewer/doer fails on Windows
Affects: Windows + Node 18.20+ / 20.x / 22+ / 24+


Summary

spawnHeadless invokes CLI shims by their bare name ('claude', 'codex', 'gemini'). On Windows these are .cmd files in the npm global prefix (claude.cmd, codex.cmd, gemini.cmd). Node's child_process.spawn cannot resolve those without shell: true — and Node 18.20+ added CVE-2024-27980 mitigations that block spawn of .cmd/.bat outright (returning EINVAL). The result: the daemon emits [cli_not_in_path] ENOENT spawn claude for every CLI on Windows, even though the binaries are present and discoverable via where claude.

Reproduction

Fresh Windows 10/11 + Node 24, with all three official CLIs installed globally:

npm i -g @anthropic-ai/claude-code @openai/codex @google/gemini-cli
where claude   # C:\Users\...\npm\claude
               # C:\Users\...\npm\claude.cmd  ← actual binary

chorus init
chorus start
# Submit any chat → daemon log:
# [cli_not_in_path] claude not found on the daemon's PATH
# (ENOENT: spawn claude ENOENT)

The daemon's PATH includes C:\Users\...\npm\ (verified by inspecting getSpawnPath() output), but child_process.spawn('claude', ...) resolves the bash shim sibling (no extension, not executable by Node) before the .cmd and crashes.

On Node 24 specifically, the same code path produces:

Error: spawn EINVAL
  at ChildProcess.spawn (node:internal/child_process:421:11)

That's the CVE-2024-27980 hard-block on .cmd files outside of shell:true.

Root cause

Two stacked Windows-specific quirks:

  1. where claude returns multiple paths, in PATHEXT-resolution order. npm globals on Windows ship a Bash shim (claude — POSIX shell, no extension) alongside the Windows shim (claude.cmd). Node spawn picks the first one.
  2. Node's spawn refuses .cmd files without shell: true. The legacy work-around (always shell:true on Windows) trips DEP0190 in newer Node versions and has shell-escaping caveats; the modern work-around requires the explicit .cmd/.bat/.exe extension on the executable path.

The shim files themselves don't know about this; every runHeadless implementation just calls spawnHeadless({ command: 'claude' | 'codex' | 'gemini' | ... }) and assumes spawn will resolve PATH.

Fix

Resolve the binary's full path with extension via where on Windows before calling spawn. Cache per command name (one shell-out per daemon boot per CLI). Set shell: true only on Windows. Args are static CLI flags (not user-controlled) so the DEP0190 risk class doesn't apply.

On Unix this is a no-op — process.platform !== 'win32' short-circuits both helpers.

Diff

diff --git a/src/daemon/headless.ts b/src/daemon/headless.ts
index c0c36f0..2a1ea39 100644
--- a/src/daemon/headless.ts
+++ b/src/daemon/headless.ts
@@ -20,6 +20,7 @@
 
 import {
   spawn as spawnChild,
+  spawnSync,
   type ChildProcessWithoutNullStreams,
 } from 'child_process';
 import * as fs from 'fs';
@@ -28,6 +29,36 @@ import * as path from 'path';
 import type { AgentEvent } from './agents/types.js';
 import { cliPaths } from '../lib/cli-paths.js';
 
+// Cache binary→path resolutions. `where` shells out — don't repeat per spawn.
+const binaryPathCache = new Map<string, string>();
+
+/**
+ * Resolve a binary name to a full path with extension. Critical on Windows
+ * where Node's spawn won't resolve `.cmd` shims (npm globals like
+ * claude.cmd, codex.cmd, gemini.cmd) without shell:true (DEP0190 in
+ * Node 22+). On Unix returns the name unchanged — spawn handles PATH
+ * resolution natively for ELF/script files with shebangs.
+ */
+function resolveBinaryPath(command: string): string {
+  if (process.platform !== 'win32') return command;
+  if (path.isAbsolute(command)) return command;
+  const cached = binaryPathCache.get(command);
+  if (cached) return cached;
+  const r = spawnSync('where', [command], { encoding: 'utf-8', timeout: 3000 });
+  if (r.status !== 0 || !r.stdout) {
+    binaryPathCache.set(command, command);
+    return command; // fallback — daemon will surface ENOENT cleanly.
+  }
+  // npm globals on Windows ship two siblings: `claude` (Bash shim, not
+  // executable by Node spawn) and `claude.cmd` (Windows shim). `where`
+  // returns both; we must pick the .cmd/.bat/.exe variant for Node.
+  const lines = r.stdout.split(/\r?\n/).map((s) => s.trim()).filter(Boolean);
+  const preferred = lines.find((l) => /\.(cmd|bat|exe)$/i.test(l));
+  const resolved = preferred ?? lines[0] ?? command;
+  binaryPathCache.set(command, resolved);
+  return resolved;
+}
+
@@ -253,10 +284,23 @@ export function spawnHeadless(opts: SpawnHeadlessOptions): HeadlessRun {
   const timeoutMs = opts.timeoutMs ?? DEFAULT_TIMEOUT_MS;
   const startedAt = Date.now();
 
-  const child: ChildProcessWithoutNullStreams = spawnChild(opts.command, [...opts.args], {
+  // Windows CLIs ship as .cmd shims (npm globals: claude.cmd, codex.cmd,
+  // gemini.cmd). Node 18.20+/20.x/22+ added CVE-2024-27980 mitigations that
+  // *block* spawn of .cmd/.bat without shell:true (EINVAL). We:
+  //   1. Resolve to the full .cmd path via `where` (cached) so PATH lookups
+  //      always pick the .cmd shim, never the Bash sibling.
+  //   2. Set shell:true ONLY on Windows. DEP0190 fires informationally; our
+  //      args are fixed CLI flags (not user-controlled) so the shell-escaping
+  //      class of risks doesn't apply here.
+  // On Unix this is a no-op: spawn handles ELF/shebang scripts natively.
+  const isWindows = process.platform === 'win32';
+  const resolvedCommand = resolveBinaryPath(opts.command);
+
+  const child: ChildProcessWithoutNullStreams = spawnChild(resolvedCommand, [...opts.args], {
     cwd: opts.cwd,
     env: spawnEnv(opts.env),
     stdio: ['pipe', 'pipe', 'pipe'],
+    shell: isWindows,
   });

Testing

Manual repro on Windows 10 + Node 24.11 + npm globals @anthropic-ai/claude-code 2.1.138, @openai/codex 0.130.0, @google/gemini-cli 0.41.2:

Before:

[cli_not_in_path] claude not found on the daemon's PATH
(ENOENT: spawn claude ENOENT)

After:

  • Doer spawns successfully, writes ~/.chorus/chats/<id>/round-1/doer-claude-code/answer.md.
  • Reviewer spawns successfully (codex tested), writes answer.md with verdict text.
  • Verdict computed correctly (approved / request_changes).

Linux/Mac unaffected (verified: process.platform !== 'win32' returns the command unchanged, no shell:true).

Considerations for the maintainer

  • shell:true triggers DEP0190 informational warning on Node 22+. Since args here are static CLI flags (--print, --output-format stream-json, etc.) the shell-escaping risk doesn't apply. If the project wants to silence the warning, an alternative is wrapping with cmd.exe /c <bin> <args> — but that introduces a level of indirection and is harder to debug.
  • where may not be on PATH in some minimal Windows containers; consider falling back to process.env.PATHEXT-based search if where returns non-zero.
  • Tests: a Vitest suite that mocks spawnSync and spawnChild could verify the resolution logic + Windows-only branching. We didn't add it in the fork to keep the diff minimal — happy to do that as a follow-up.

Node 18.20+/20.x/22+ added CVE-2024-27980 mitigations that block spawn
of .cmd/.bat without shell:true (EINVAL). Combined with Windows npm
globals shipping a Bash shim sibling (no extension) alongside the .cmd
shim, every reviewer/doer subprocess fails on Windows with:

  [cli_not_in_path] claude not found on the daemon's PATH
  (ENOENT: spawn claude ENOENT)

Fix:
1. resolveBinaryPath() — Windows-only helper that runs 'where <cmd>'
   and prefers the .cmd/.bat/.exe variant over the no-extension Bash
   sibling. Cached per command name (one shell-out per daemon boot).
2. shell:true on Windows only. Args here are static CLI flags, not
   user-controlled, so the DEP0190 escaping risk class doesn't apply.

On Unix this is a no-op: process.platform !== 'win32' short-circuits
both helpers and the spawn options match pre-fix behaviour.

Adds tests/headless-spawn-windows.test.ts (7 tests) covering:
- Unix no-op
- Cache hits (single 'where' invocation per binary)
- Windows .cmd preference over Bash sibling
- Fallback when 'where' fails
- shell:true conditional on platform

Discovered while developing a vendored fork on Windows + Node 24 with
@anthropic-ai/claude-code, @openai/codex, @google/gemini-cli all
installed via 'npm i -g'.
chorus-codes added a commit that referenced this pull request May 10, 2026
…#27)

* fix(windows): resolve .cmd shim path for Node spawn

Node 18.20+/20.x/22+ added CVE-2024-27980 mitigations that block spawn
of .cmd/.bat without shell:true (EINVAL). Combined with Windows npm
globals shipping a Bash shim sibling (no extension) alongside the .cmd
shim, every reviewer/doer subprocess fails on Windows with:

  [cli_not_in_path] claude not found on the daemon's PATH
  (ENOENT: spawn claude ENOENT)

Fix:
1. resolveBinaryPath() — Windows-only helper that runs 'where <cmd>'
   and prefers the .cmd/.bat/.exe variant over the no-extension Bash
   sibling. Cached per command name (one shell-out per daemon boot).
2. shell:true on Windows only. Args here are static CLI flags, not
   user-controlled, so the DEP0190 escaping risk class doesn't apply.

On Unix this is a no-op: process.platform !== 'win32' short-circuits
both helpers and the spawn options match pre-fix behaviour.

Adds tests/headless-spawn-windows.test.ts (7 tests) covering:
- Unix no-op
- Cache hits (single 'where' invocation per binary)
- Windows .cmd preference over Bash sibling
- Fallback when 'where' fails
- shell:true conditional on platform

Discovered while developing a vendored fork on Windows + Node 24 with
@anthropic-ai/claude-code, @openai/codex, @google/gemini-cli all
installed via 'npm i -g'.

* fix(codex): normalize Windows paths in config.toml trust block

preTrustCodexWorkspace writes [projects."<cwd>"] blocks to
~/.codex/config.toml. Windows paths contain backslashes (\Users\...)
which TOML basic strings interpret as Unicode escapes — \U is a
4/8-hex-digit escape, \Users trips "too few unicode value digits"
and codex CLI exits 1 with that parse error on every subsequent
invocation.

Effectively breaks the user's codex install for any path containing
\U or \u (very common: \Users\, \username\). Discovered after
chorus wrote the malformed block during a chat run on Windows;
running 'codex' manually afterwards also failed until the user
removed the bad block by hand.

Fix: normalize backslashes to forward slashes before writing the
TOML marker. Both Codex CLI and Node accept forward slashes on
Windows for path lookups, and TOML basic strings don't interpret
'/' as anything special.

Adds tests/preflight-codex-windows-paths.test.ts (8 tests):
- Windows \Users\ path normalized to /Users/
- Windows \users\ (lowercase \u) also normalized
- Unix paths unchanged
- Idempotent (no double block on second call)
- Append preserves prior content

Linux/Mac unaffected: cwd.replace(/\/g, '/') is a no-op when
there are no backslashes.

* fix(gemini): non-whitespace -p placeholder survives shell concat

The gemini headless shim passes ['-p', ' ', ...] to flip gemini-cli
into non-interactive mode. The actual prompt arrives via stdin; the
-p value is a sentinel.

On Windows with shell:true (required for .cmd shim spawn — see
companion PR fixing src/daemon/headless.ts), Node concatenates argv
with spaces and the shell collapses runs of whitespace. The
single-space placeholder vanishes:

  gemini -p   --output-format stream-json --skip-trust ...
                ^^^ space placeholder eaten ^^^

yargs interprets -p as taking a value, picks --output-format as
that value, then sees stream-json as unknown → exits 1 with help
output. Daemon log shows:

  [cli_failed] gemini exited 1: ARNING: This can be a security risk
  if the model output is untrusted.  [boolean]
  --accept-raw-output-risk    Suppress the security warning...

That's verbatim gemini's --help.

Fix: replace ' ' placeholder with '_' (any non-whitespace works).
The trailing '_' on the prompt is benign — gemini appends it after
the stdin payload, which reviewers ignore as content noise.

Adds tests/gemini-shell-collapse.test.ts (5 tests) including a
join-collapse round-trip assertion that simulates the Windows
shell:true scenario.

Linux/Mac unaffected: argv passes through as separate strings, the
placeholder character doesn't matter.

* fix(daemon): persist reviewer phase_events for successful reviewers

runReviewers.runOne never emitted phase_start / phase_done events
for reviewer participants. The persister in runner-multiplex.ts
only writes to SQLite phase_events when it sees those event types
plus cli_error / cli_warning. Net effect: a chat where a reviewer
runs successfully shows zero reviewer rows in /chats/:id.events,
even though answer.md exists on disk and the verdict is computed
correctly.

Reproduction: any chat with a successful reviewer (codex against
a simple prompt is fastest):

  curl -s http://127.0.0.1:7707/api/v1/chats/<id> | jq '.data.events'

Pre-fix: only doer events visible.
Post-fix: reviewer/codex-cli-0 drafting + submitted rows appear.

Fix: bracket runOne with phase_start (always, before runReviewer)
and phase_done (only when verdict is non-null — failed reviewers
still self-report through cli_error / cli_warning so we don't
double-count). Mirrors the pattern used by the doer in runner.ts.
Reuses existing event types and states ('drafting', 'submitted');
no schema change.

Validated E2E with codex-only template: chat events now include
reviewer/codex-cli-0 drafting + submitted in /chats/:id response,
matching the reality the runner already wrote to disk.

Independent of the 3 Windows PRs in this batch — applies to every
platform where chorus runs reviewers.

* fix: tighten Windows test types + use path.win32 for cross-host isAbsolute

Two follow-up fixes layered on the cherry-picked PR #21 from @ElRaxy so
the suite passes on the Linux CI runner used by chorus-codes/chorus:

1. **`path.win32.isAbsolute`** instead of the platform-aware
   `path.isAbsolute`. The test stubs `process.platform = 'win32'` but
   the top-level `path` module is still POSIX on a Linux runner, so
   `path.isAbsolute('C:\\Users\\...')` returned false and the resolver
   spawned the (mocked) `where` command anyway — failing the test.
   `path.win32.isAbsolute` gives the same result on real Windows AND
   on Linux CI.

2. **Tighten the spawn-spy signature** in `tests/headless-spawn-windows.test.ts`
   from `vi.fn(() => child)` to a typed signature matching `spawn`.
   Without this, TypeScript inferred `mock.calls[i]` as `[]` (length 0)
   and assertions on `[0]?.[2]` (the spawn options arg) failed
   typecheck with TS2493.

The test logic and source behaviour are unchanged on real Windows.

---------

Co-authored-by: ElRaxy <alexpedritot7@gmail.com>
Co-authored-by: chorus-codes <280607145+chorus-codes@users.noreply.github.com>
chorus-codes added a commit that referenced this pull request May 11, 2026
…robes (#32) (#33)

Reporter (@oguzkir) pasted C:\Users\...\gemini.cmd into the manual-path
validator on /onboarding and got "Validation failed: gemini.cmd
--version exited null." Root cause: Node spawn cannot execute Windows
batch shims directly (DEP0190 / CVE-2024-27980) — calling spawn with a
.cmd target returns status=null on win32.

ElRaxy's PR #21 fixed this for the headless reviewer spawn path, but
two other call sites used the same bare `spawn(binPath, ['--version'])`
pattern:
  - verifyRunnable in src/lib/cli-detect.ts:261 (manual-path validator)
  - smokeOneCli in src/cli/commands/diagnose.ts:281 (the per-CLI smoke
    I added in PR #19)

Fix introduces a shared `buildVersionSpawn(binPath, isWin)` helper that
returns the right spawn shape for the platform. On Windows .cmd/.bat,
it produces a quoted shell-invocation that handles paths with spaces
(e.g. "C:\Program Files\..."). Both call sites now use it.

Self-review (8 reviewers) caught two real blockers, both addressed:
  - Path-with-spaces: 5 reviewers flagged the original
    cmd: 'cmd.exe', args: ['/c', binPath, '--version']
    shape broke on "C:\Program Files\..." because cmd.exe's /c parser
    doesn't reliably preserve the inner quotes Node's libuv adds. Fixed
    by switching to shell:true with the bin path wrapped in `"..."`.
  - .ps1 doesn't work through cmd.exe /c: 3 reviewers flagged that
    PowerShell scripts need powershell.exe -File, not cmd.exe — silent
    fail via ExecutionPolicy or missing file-type association. Dropped
    .ps1 from the regex. Future PR can add real .ps1 support.

Shell-injection guard: shell:true only fires when binPath matches a
strict regex (drive letter + safe chars). Metacharacters fall back to
direct exec which fails cleanly rather than risk command injection
from a malicious paste in the onboarding UI.

Tests: +10 covering both branches (Linux pass-through, Windows .cmd /
.bat / .exe / .ps1 / extension-less, paths with spaces, paths with
metacharacters). isWin parameter lets every branch test on Linux CI.

Co-authored-by: chorus-codes <280607145+chorus-codes@users.noreply.github.com>
crypticpy added a commit to crypticpy/chorus that referenced this pull request May 18, 2026
…ing contradiction

- README: Status/Node badges had empty `()` link targets (MD042) and the
  Status badge still said v0.7. Point Status -> ROADMAP.md (now v0.8) and
  Node -> nodejs.org so both have real targets.
- ROADMAP: "Runner decoupling from SSE" row was still marked TODO for v0.8
  even though lines 48 and 267 record it as shipped in PR chorus-codes#21 (2026-05-02).
  Align row to the v0.7 DONE status used elsewhere.

CI URL badge and `npm i -g chorus-codes` install line still intentionally
untouched — those wait for the rebrand PR.

Caught by CodeRabbit on PR #7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
crypticpy added a commit to crypticpy/chorus that referenced this pull request May 18, 2026
…babysit) + fork retarget (#7)

* docs: README + ROADMAP for v0.8 shipped (audit, orchestrate, verify, babysit) + fork retarget

Updates the docs to reflect what shipped in PRs #1 + #6 and decouples the
URLs from the upstream chorus-codes/chorus slot now that this fork lives
on its own at crypticpy/chorus.

README:
- Templates section: new rows for audit-* and pr-babysit + three
  explainer subsections (audit + orchestrate, verify + TDD loop,
  PR-babysit)
- Commands section: chorus quickstart + chorus babysit subcommand group
- Roadmap mini-summary: v0.8 marked done
- Repo/issue/clone/discussion URLs: chorus-codes/chorus → crypticpy/chorus
  (badges + npm install left alone — those wait for the rebrand PR)

ROADMAP:
- Status tracker: 4 new shipped rows (audit phase, orchestrate phase,
  verify phase + TDD loop, PR-babysit)
- chorus-codes/chorus migration items marked MOOT — fork is its own
  project, no contribution-back planned
- Section 7 (migration) collapsed to a short MOOT note
- "Where we are" footnote updated to 2026-05-17

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: clarify babysit daemon does not auto-merge

The in-app `chorus babysit` daemon (src/daemon/babysit/state-machine.ts) sits
in `quiet_check` after two quiet ticks and waits for an external merge — it
does not call `gh pr merge` itself and does not gate on CI. The README and
ROADMAP both overstated this, implying the daemon squash-merges on its own.
Update both to point to the `/babysit-pr` Claude Code skill (this loop) as
the surface that actually performs the merge.

Caught by ChatGPT Codex review on PR #7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: fix empty badge links + stale v0.7 status + ROADMAP SSE-decoupling contradiction

- README: Status/Node badges had empty `()` link targets (MD042) and the
  Status badge still said v0.7. Point Status -> ROADMAP.md (now v0.8) and
  Node -> nodejs.org so both have real targets.
- ROADMAP: "Runner decoupling from SSE" row was still marked TODO for v0.8
  even though lines 48 and 267 record it as shipped in PR chorus-codes#21 (2026-05-02).
  Align row to the v0.7 DONE status used elsewhere.

CI URL badge and `npm i -g chorus-codes` install line still intentionally
untouched — those wait for the rebrand PR.

Caught by CodeRabbit on PR #7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant