fix: react-router CVEs (10 Dependabot alerts) + repair pre-existing database/schema test#53

Merged
ersinkoc merged 38 commits into main from fix/react-router-cves
Jun 4, 2026

@ersinkoc ersinkoc commented Jun 4, 2026

Two repo-health fixes. Bundled because CI is red on every open PR until both land (audit blocks on react-router; the test step blocks on the database/schema collection error), so this PR must merge first to unblock #52 and #54.

1. react-router-dom ^7.13.1 → ^7.16.0

Resolves all 10 open Dependabot alerts (6 high, 4 moderate) — DoS, turbo-stream RCE, RSC XSS, open redirect, stored XSS. Lockfile diff limited to react-router/react-router-dom. Fixes the audit:prod CI step.

2. database/schema.test.ts child_process mock

Pre-existing failure on main: vi.mock('child_process', () => ({ spawn })) replaced the whole module, dropping exec (imported elsewhere in the chain), which caused vitest's No "exec" export collection error. Switched to the importOriginal pattern, which keeps the module's real exports and overrides only spawn. Fixes the CI test step (18/18 pass).

Verification

UI typecheck + 347/347 tests + build; database/schema 18/18; lockfile audit clean.

🤖 Generated with Claude Code

ersinkoc and others added 30 commits June 3, 2026 00:12
executeCycle consumed the inbox after each cycle by slicing
`min(snapshotLength, currentLength)` off the head. New inbox messages are
appended to the tail, but trimInbox evicts from the head once the inbox
hits its cap (MAX_INBOX_MESSAGES / MAX_INBOX_BYTES). When an operator
sendMessage/steer landed while a cycle was running and the inbox was at
the cap, each push evicted a snapshot message from the head, so the
snapshot-length slice over-removed and silently dropped the brand-new
operator directives. The prior `Math.min` "guard" was a no-op — JS slice
never overruns, so it changed nothing in the failing case.

Count head evictions instead: trimInbox now takes the ManagedClaw and
accumulates `inboxEvictedDuringCycle` (reset at cycle start); the
consume-slice is `consumed = max(0, snapshotLength - evicted)`, which
discards exactly the consumed messages while preserving every mid-cycle
arrival. Robust to direct session.inbox.push (no per-push instrumentation).
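
A minimal sketch of the eviction-counting consume step described above. Only `trimInbox`, `inboxEvictedDuringCycle`, and the `max(0, snapshotLength - evicted)` formula come from this message; the `Session` shape, the cap of 5, and the `send` / `runCycle` helpers are assumptions made for illustration.

```typescript
// Illustrative cap; the real limits are MAX_INBOX_MESSAGES / MAX_INBOX_BYTES.
const MAX_INBOX_MESSAGES = 5;

// Hypothetical stand-in for the ManagedClaw session state.
interface Session {
  inbox: string[];
  inboxEvictedDuringCycle: number;
}

// Evict from the head once over the cap, counting every eviction.
function trimInbox(session: Session): void {
  while (session.inbox.length > MAX_INBOX_MESSAGES) {
    session.inbox.shift();
    session.inboxEvictedDuringCycle += 1;
  }
}

// Hypothetical sendMessage: append to the tail, then enforce the cap.
function send(session: Session, message: string): void {
  session.inbox.push(message);
  trimInbox(session);
}

// One cycle: snapshot the inbox, let messages arrive mid-cycle, then consume.
function runCycle(session: Session, duringCycle: () => void): void {
  session.inboxEvictedDuringCycle = 0;
  const snapshotLength = session.inbox.length;
  duringCycle();
  // Snapshot messages already evicted from the head must not be sliced off again.
  const consumed = Math.max(0, snapshotLength - session.inboxEvictedDuringCycle);
  session.inbox = session.inbox.slice(consumed);
}

const session: Session = {
  inbox: ["m1", "m2", "m3", "m4", "m5"], // already at the cap
  inboxEvictedDuringCycle: 0,
};
runCycle(session, () => {
  send(session, "n1");
  send(session, "n2");
  send(session, "n3");
});
console.log(session.inbox); // → [ 'n1', 'n2', 'n3' ]
```

With the old snapshot-length slice the same scenario would remove 5 messages and leave the inbox empty.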

Regression test: fill inbox to the 100-message cap, push 3 via sendMessage
inside the cycle, assert all 3 survive.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
EdgeMqttClient.doConnect() runs on every reconnect and on a re-entrant
connect(). It overwrote `this.client` without ending the previous client.
With reconnectPeriod:0, mqtt.js never reuses or closes the old client, so
each broker drop->reconnect leaked its socket/fd + keepalive timer, and
the old client's event listeners stayed live.

Fix: capture `previous`, wire up the new client, then `previous.end(true)`.
Every handler is now identity-guarded (`if (this.client !== client) return`)
so a replaced/torn-down client's events no-op — this is also what makes
ending the previous client safe (its end()-emitted 'close' can't schedule a
reconnect against the live client). disconnect() now nulls `this.client`
before end() so the guard neutralizes the intentional-disconnect 'close'.

Regression test: reconnect calls end(true) on the old client; a stale
'message'/'close' from the replaced client does nothing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
terminateSession() set state='terminated' and disposed the PTY but never
called fireCompletionCallbacks(). Disposing the PTY detaches its onExit
listener (the normal completion path), so any waitForCompletion() waiter
was never resolved — it blocked until its own timeout (default 30 min).

This broke the orchestrator's documented fast-cancel: cancelOrchestration
terminates the in-flight session specifically so the loop sitting in
waitForCompletion returns promptly "instead of waiting for waitForCompletion
to time out". The CLI was killed (budget stopped) but the loop still hung
the full step timeout per cancel.

Fix: fire completion callbacks in terminateSession after the state event.
Safe against double-fire (the callback array is spliced empty, and dispose
detaches onExit so it cannot fire later).

Regression test: a pending waitForCompletion resolves immediately on
terminate (would otherwise hang to the test timeout).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
close() sent SIGTERM then scheduled a 5s SIGKILL fallback that could never
fire, for two compounding reasons: (1) the timer closure read the instance
field `this.process`, which close() nulls synchronously right after, so the
closure saw null; (2) the guard `!this.process.killed` is always false here
because Node sets `child.killed = true` the instant kill() *sends* a signal,
not when the process dies. A coding-agent subprocess that ignores/delays
SIGTERM was therefore never force-killed.

Fix: capture `proc` in a local before nulling the field, and guard liveness
with `proc.exitCode === null && proc.signalCode === null` (genuinely still
running — also avoids SIGKILL-ing a reused PID after a clean exit). Matches
the correct pattern in CodingAgentSessionManager.stop().
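
The capture-then-check pattern can be sketched roughly as below. `ProcLike`, `terminateWithFallback`, and the injectable `schedule` parameter are hypothetical names used so the sketch runs without a real subprocess; only the `exitCode === null && signalCode === null` liveness guard is taken from this message.

```typescript
// Minimal structural view of a child process (field names match Node's ChildProcess).
interface ProcLike {
  exitCode: number | null;
  signalCode: string | null;
  kill(signal: string): boolean;
}

// Hypothetical helper: SIGTERM now, SIGKILL after graceMs if still running.
// `schedule` is injectable so the sketch can be exercised without real timers.
function terminateWithFallback(
  proc: ProcLike,
  graceMs: number,
  schedule: (fn: () => void, ms: number) => void = (fn, ms) => { setTimeout(fn, ms); },
): void {
  proc.kill("SIGTERM");
  schedule(() => {
    // `proc` is a captured local, so nulling an instance field elsewhere cannot
    // turn this into a null read. Both codes stay null only while the process
    // is genuinely still running.
    if (proc.exitCode === null && proc.signalCode === null) {
      proc.kill("SIGKILL");
    }
  }, graceMs);
}

// Fake process that records the signals it receives.
function fakeProc(): ProcLike & { signals: string[] } {
  const signals: string[] = [];
  return {
    exitCode: null,
    signalCode: null,
    signals,
    kill(signal: string): boolean { signals.push(signal); return true; },
  };
}

// Manual scheduler: collect timer callbacks and run them on demand.
const pending: Array<() => void> = [];
const manual = (fn: () => void): void => { pending.push(fn); };

const stubborn = fakeProc(); // ignores SIGTERM
terminateWithFallback(stubborn, 5000, manual);

const polite = fakeProc(); // exits on SIGTERM
terminateWithFallback(polite, 5000, manual);
polite.signalCode = "SIGTERM";

pending.forEach((fn) => fn()); // simulate the grace period elapsing
console.log(stubborn.signals, polite.signals);
// → [ 'SIGTERM', 'SIGKILL' ] [ 'SIGTERM' ]
```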

Regression tests: a live process gets SIGKILL after the timeout; an exited
process does not.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
TerminalManager.kill/release/dispose sent SIGTERM only. These terminals are
children of the GATEWAY (spawned for ACP agent command execution), so they
do NOT die when the agent process exits — a command that ignores SIGTERM
(e.g. a dev server that traps the signal) leaked as a still-running gateway child after the
session closed.

Fix: shared hardKill(t) helper that SIGTERMs then SIGKILLs after a grace
period if the process is still alive (exitCode/signalCode null), mirroring
AcpClient.close() and CodingAgentSessionManager.stop().

Regression test: a killed terminal that never exits gets SIGKILL after the
grace period.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cing SET

PostgresAdapter applied statement_timeout and
idle_in_transaction_session_timeout in a `pool.on('connect', client =>
client.query('SET ...'))` handler. pg-pool does NOT await connect listeners,
so it handed the fresh client to a consumer whose first query ran
concurrently with the not-yet-finished SET. That both (a) emitted pg's
"client is already executing a query is deprecated" warning, and (b) let the
first query on every new pooled connection run before statement_timeout took
effect — a real protection gap.

Set the GUCs via the libpq `options` startup parameter instead
(`-c statement_timeout=<ms> -c idle_in_transaction_session_timeout=<ms>`).
These apply server-side at session start, before any query: no race, no
warning, first query protected. Values stay in ms (Postgres GUC default unit),
matching the prior `SET statement_timeout = <ms>` semantics; the option is
omitted when the constant is 0. The post-connect SET handler is removed.
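
A sketch of how the `options` string might be assembled. `buildStartupOptions` and the connection string are hypothetical; whether the real code drops each `-c` flag individually or the whole option when a constant is 0 is not stated here, so this version handles both cases.

```typescript
// Hypothetical helper: build the libpq `options` startup string from two timeouts in ms.
// A value of 0 means "not configured" and is left out.
function buildStartupOptions(
  statementTimeoutMs: number,
  idleInTxTimeoutMs: number,
): string | undefined {
  const parts: string[] = [];
  if (statementTimeoutMs > 0) {
    parts.push(`-c statement_timeout=${statementTimeoutMs}`);
  }
  if (idleInTxTimeoutMs > 0) {
    parts.push(`-c idle_in_transaction_session_timeout=${idleInTxTimeoutMs}`);
  }
  // Omit the option entirely when nothing is configured.
  return parts.length > 0 ? parts.join(" ") : undefined;
}

// Shape of a config object that would be passed to `new Pool(...)` from `pg`;
// the connection string is a placeholder.
const poolConfig = {
  connectionString: "postgres://user:pass@localhost:5432/app",
  options: buildStartupOptions(30000, 60000),
};
console.log(poolConfig.options);
// → -c statement_timeout=30000 -c idle_in_transaction_session_timeout=60000
```

Because the server applies these settings at session start, no `'connect'` listener is needed.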

Added an isolated-module test (resetModules + doMock with timeouts > 0, which
the shared timeouts=0 defaults mock can't reach) asserting the options string
is passed to Pool and no 'connect' handler is registered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
verifyToken() did findValidToken (reads is_used = FALSE) -> findOrCreate
channel user -> markVerified -> consumeToken. The find-then-consume was not
atomic, and consumeToken's UPDATE had no is_used guard, so two concurrent
`/connect <token>` messages (different platform users, same single-use token)
both passed the read and both got verified — each linked to the token owner's
OwnPilot account off ONE single-use token.

consumeToken now does `UPDATE ... SET is_used = TRUE ... WHERE id = $2 AND
is_used = FALSE` and returns whether a row was claimed (changes > 0) — the
authoritative atomic claim, so only the first concurrent caller wins.
verifyToken performs the claim BEFORE markVerified and returns the
invalid-token error if the claim is lost, so the race loser is never linked.
Mirrors the FOR-UPDATE atomic-claim pattern already used in jobs.ts fail().
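
The claim-before-link ordering, modelled in memory. This is an illustration only: the `Map` stands in for the tokens table, and in the real repository the atomicity comes from the database executing the guarded `UPDATE`, not from this single-threaded code. `linked` is a hypothetical stand-in for `markVerified`.

```typescript
interface TokenRow {
  id: string;
  isUsed: boolean;
}

// In-memory stand-in for the tokens table (hypothetical).
const tokens = new Map<string, TokenRow>([["t1", { id: "t1", isUsed: false }]]);

// Models: UPDATE ... SET is_used = TRUE ... WHERE id = $id AND is_used = FALSE
// and reports whether a row was actually claimed.
function consumeToken(id: string): boolean {
  const row = tokens.get(id);
  if (row === undefined || row.isUsed) return false; // zero rows matched
  row.isUsed = true;
  return true;
}

// Records which platform users were linked (stands in for markVerified).
const linked: string[] = [];

function verifyToken(tokenId: string, platformUser: string): string {
  // Claim first: a caller that loses the claim is never linked.
  if (!consumeToken(tokenId)) return "invalid token";
  linked.push(platformUser);
  return "verified";
}

const first = verifyToken("t1", "alice");
const second = verifyToken("t1", "mallory");
console.log(first, second, linked); // → verified invalid token [ 'alice' ]
```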

Tests: repo returns false when no row updated; service rejects and skips
markVerified when the claim is lost.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
routes/channels/auth.ts guarded ownership on GET /status but not on the
management endpoints: GET /users listed channel users across ALL owners, and
POST /users/:id/{approve,block,unblock,unverify}, DELETE /users/:id, and
POST /{block,unblock}/:platform/:platformUserId acted on any id with no owner
check. In a multi-user deployment that let one user enumerate and
approve/block/delete another owner's linked channel accounts (OWASP A01).

- channelUsersRepo.list() gains an optional ownpilotUserId filter; GET /users
  scopes to getUserId(c).
- New getOwnedChannelUserById(id, owner) helper (getById + ownership check,
  returns null -> 404 so cross-owner ids are indistinguishable from missing).
  Applied to approve/block/unblock/unverify/delete.
- Platform-based block/unblock resolve via findByPlatform then ownership-check.

All cross-owner accesses return 404, matching the existing /status guard (no
existence leak). Tests assert cross-owner -> 404 with the underlying mutator
never called, for each endpoint, plus list() scoped by ownpilotUserId.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
routes/cli/providers.ts scoped GET / and POST / by userId, but PUT /:id,
DELETE /:id, and POST /:id/test acted on the id alone via the repo's
update/delete/getById (which, unlike list/getByName/count, are not
user-scoped). cli_providers is an owner-scoped table (user_id +
UNIQUE(user_id, name)), so in a multi-user deployment any user could, by
guessing an id, read another user's provider config, modify it (e.g. repoint
`binary` to an attacker-controlled path -> RCE against the victim's next
coding run), delete it, or /test it (which runs `which <binary>` and
`<binary> --version`).

Added a route-level getOwnedProvider(id, userId) guard (getById + ownership
check; cross-owner/missing both -> 404, no existence leak) on PUT, DELETE, and
/test. Repo signatures unchanged (the only other caller, cli/tools.ts, already
passes an owned provider). Mirrors the channel-users IDOR fix.

Tests: cross-owner requests return 404 with update/delete/execFileSync never
called.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…fetch

buildSandboxContext injected `fetch: globalThis.fetch.bind(globalThis)`
whenever the `network` permission was granted — raw, with no SSRF guard (the
harden() membrane only blocks prototype-walk escapes, not internal addresses).
A network-permitted sandboxed tool could reach http://169.254.169.254/...
(cloud metadata -> credential theft), localhost, or internal services.

The dynamic-tool executor overrides this with createSafeFetch(tool.name) via
customGlobals, so the main custom-tool path was already safe. But the DEFAULT
was a live footgun for every other consumer: the worker-sandbox passes no
fetch override (and function globals can't cross to a worker thread), and
extensions / future callers that grant `network` would silently get raw fetch.

Make the default safe by construction: inject createSafeFetch('sandbox')
(manual redirect following + per-hop DNS-aware isPrivateUrlAsync, which catches
literal internal IPs synchronously and hostname->private-IP DNS rebinding). The
import is cycle-free (dynamic-tool-sandbox -> dynamic-tool-permissions pulls
only node:dns + types) and membrane-equivalent to the prior host fetch. The
dynamic-tool executor still injects its per-tool createSafeFetch for better
error labelling; this only closes the default.

Tests: the default sandbox fetch rejects 169.254.169.254 and 127.0.0.1 with no
override. Full core suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…DOR)

The notification routes trusted a request-supplied identity for data
access instead of the authenticated one:

- POST /send used `body.userId` (sendNotificationSchema permits it), so
  any authenticated user could push a notification onto ANOTHER user's
  connected channels — spam / phishing via the victim's Telegram,
  Discord, etc.
- GET/PUT /preferences/:userId used the `:userId` path param, allowing
  cross-user READ of notification preferences (channel priorities,
  quiet hours = info disclosure) and WRITE (e.g. raising minPriority to
  silence a victim's alerts = tampering / notification-DoS).

Scope all three handlers to `getUserId(c)` (set only from verified auth
— JWT sub / API key / session). The `:userId` param is kept for route
shape but is no longer trusted; operations always act on the
authenticated user. Non-breaking for single-user ('default') and
multi-user (JWT sub) deployments.

Tests now install an auth-simulating middleware and assert each handler
acts on the authenticated user ('alice') and never on the attacker-
supplied 'victim'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ejects

The agent tool-execution loop runs approved tool calls through
Promise.allSettled. executeToolCall is built to RETURN error results
(it catches JSON-parse, validation, and execute() err-Results
internally), so the rejected branch is a defensive net for an
unexpected throw — e.g. the dynamic getValidateToolCall() import
failing, or an unhandled rejection inside execute().

That net was buggy: it pushed the tool result with
`toolCallId: 'unknown'`. Since the assistant message already recorded
every response.toolCalls entry, the real tool_use id was then left
without a matching tool_result. Providers that require one for every
tool_use (Anthropic) reject the NEXT request, so a single unexpected
executor throw broke the entire conversation instead of degrading just
that one call.

Promise.allSettled preserves input order, so settled[i] corresponds to
approvedToolCalls[i]. Iterate by index and use approvedToolCalls[i].id
in the rejected branch.
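
The index-based mapping can be sketched like this. `collectResults`, `ToolCall`, and `ToolResult` are hypothetical names, and `Settled<T>` is a local copy of the shape `Promise.allSettled` produces, built by hand here so the example stays synchronous.

```typescript
interface ToolCall {
  id: string;
  name: string;
}

interface ToolResult {
  toolCallId: string;
  content: string;
  isError: boolean;
}

// Local structural copy of the built-in PromiseSettledResult<T>.
type Settled<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

// settled[i] belongs to calls[i], so a rejection can be keyed to the real id.
function collectResults(calls: ToolCall[], settled: Settled<ToolResult>[]): ToolResult[] {
  if (calls.length !== settled.length) {
    throw new Error("calls and settled results must have the same length");
  }
  return settled.map((outcome, i) => {
    if (outcome.status === "fulfilled") return outcome.value;
    const call = calls[i] as ToolCall; // safe: lengths were checked above
    return {
      toolCallId: call.id,
      content: `Tool execution failed: ${String(outcome.reason)}`,
      isError: true,
    };
  });
}

const calls: ToolCall[] = [
  { id: "call_a", name: "search" },
  { id: "call_b", name: "write_file" },
];
const settled: Settled<ToolResult>[] = [
  { status: "fulfilled", value: { toolCallId: "call_a", content: "ok", isError: false } },
  { status: "rejected", reason: new Error("boom") },
];
const results = collectResults(calls, settled);
console.log(results.map((r) => r.toolCallId)); // → [ 'call_a', 'call_b' ]
```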

Adds the first test for this branch: spies executeToolCall to reject,
then asserts the second provider call carries a tool result keyed to
the original id and never 'unknown' (fails without the fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ency

pollWorker is invoked concurrently for the SAME worker from three
sources: the 1 Hz pollAll tick, the immediate start poll, and each
job's finally re-poll. Each invocation read worker.activeJobs.size,
computed availableSlots = concurrency - size, then awaited claimJob and
only added to activeJobs AFTER the await. Two overlapping polls both
observed the stale pre-claim size and could collectively claim up to
2×concurrency jobs before either updated the set.

The workflow_nodes worker runs concurrency: 4 and its handler dispatches
arbitrary workflow nodes (tools, claws, LLM calls), so over-subscription
spawns more concurrent expensive work than the cap intends — the same
class of bounded-concurrency guarantee the codebase enforces elsewhere
(TOOL_CALL_CONCURRENCY, MAX_CONCURRENT_CLAWS).

Add a per-worker `polling` re-entrancy flag (set in try/finally) so only
one claim-loop runs per worker at a time; activeJobs.size is then read
consistently across the claimJob await. No deadlock — polling is always
reset in finally, and a slot freed mid-poll is backfilled by the 1 Hz
poll / the next finally re-poll.

Adds the first test file for this service: invoking pollWorker three
times concurrently keeps max in-flight ≤ cap (proven to reach 12 against
a cap of 4 without the guard).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
executeWorkflow and resumeFromApproval acquire the per-workflow
activeExecutions lock, then do DB work (createLog / updateLog) plus an
onProgress('started') callback BEFORE entering the main try whose
finally is the ONLY place the lock is released.

If createLog/updateLog throws (DB error) or a consumer's onProgress
callback throws, the lock leaked: the workflow stayed in
activeExecutions forever, so every later run threw "Workflow is already
running" until the gateway restarted (activeExecutions is in-memory).
For resumeFromApproval it was worse — the paused workflow became
permanently un-resumable.

Wrap the post-lock setup (log write + onProgress) in its own
try/catch that releases the lock and rethrows, so the lock is freed on
that path before the main try/finally takes over.

Adds regression tests for both methods: make createLog / updateLog
reject and assert isRunning(id) is false afterward (both fail without
the fix — the lock stays held).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
PluginIsolatedNetwork.fetch has a maxResponseSize (10MB) cap and checks
the content-length header against it — but then did
`await response.text()`, which buffers the ENTIRE body into memory, and
only afterward checked body.length. A response that omits or lies about
content-length sailed past the header check, and text() buffered the
whole payload (bounded only by the 30s timeout) before the post-read
check could fire. The limit existed but never actually bounded memory.

Read the body incrementally via response.body.getReader(), tracking
bytes received, and cancel the stream + return response_too_large the
moment the accumulated size exceeds the cap. The content-length
fast-path is kept (rejects honestly-oversized responses before reading),
with a response.text() fallback for responses that expose no readable
stream. Extracted as readBodyWithCap().

Adds the first test file for network.ts: a ReadableStream that counts
pulled chunks proves the reader stops after ~11 of 20 one-MB chunks with
the fix, versus draining all 20 without it. The DNS-resolving SSRF guard
(per-redirect-hop re-validation, fail-closed) was audited and is sound.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
UCPBridgeManager.bridgeMessage compiles a bridge's filterPattern and
runs RegExp.test against inbound message text synchronously on the main
thread. The pattern is owner-configured, but the text is
attacker-influenceable — anyone who can message a bridged channel
controls it. A well-meant pattern prone to catastrophic backtracking, such as
`(a+)+$`, run against crafted input would block the event loop for time
exponential in the input length, hanging the gateway (ReDoS). The try/catch around the RegExp only
caught compile errors, never runtime backtracking (which doesn't throw).

Add isSafeRegexPattern() to core: a compile check plus a dependency-free
"star height" heuristic that rejects an unbounded quantifier (*, +,
{n,}) nested inside an unbounded-quantified group. Wire it in two places:

- Gateway bridge create/update schema (validRegex superRefine) — a
  ReDoS-prone filterPattern is now a 400, never persisted.
- UCPBridgeManager.loadBridges re-validates every stored pattern and
  drops any unsafe one (the bridge keeps forwarding, just unfiltered),
  so a pattern persisted before this guard can't hang bridgeMessage.
  bridgeMessage itself is unchanged.

The heuristic accepts ordinary keyword/anchored/single-quantifier
filters and rejects nested unbounded quantifiers; it can over-reject an
exotic-but-safe pattern, but bridge filters are simple matchers in
practice so the trade-off favours never hanging the process.
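
A simplified sketch of a star-height style check along these lines. It is not the code added to core: the scanning logic below is an assumption about how such a heuristic could work, and it is deliberately conservative (it can reject safe patterns).

```typescript
// Returns false if the pattern does not compile, or if an unbounded quantifier
// (*, +, {n,}) is applied to a group that itself contains an unbounded quantifier.
function isSafeRegexPattern(pattern: string): boolean {
  try {
    new RegExp(pattern);
  } catch {
    return false;
  }

  // True when an unbounded quantifier starts at position `pos`.
  const unboundedAt = (pos: number): boolean => {
    const ch = pattern[pos];
    if (ch === "*" || ch === "+") return true;
    if (ch === "{") return /^\{\d+,\}/.test(pattern.slice(pos));
    return false;
  };

  // One entry per open group: does it contain an unbounded quantifier?
  const stack: boolean[] = [];
  let inClass = false;
  let i = 0;
  while (i < pattern.length) {
    const ch = pattern[i];
    if (ch === "\\") { i += 2; continue; } // skip escaped character
    if (inClass) { if (ch === "]") inClass = false; i += 1; continue; }
    if (ch === "[") { inClass = true; i += 1; continue; }
    if (ch === "(") { stack.push(false); i += 1; continue; }
    if (ch === ")") {
      const inner = stack.pop() ?? false;
      const quantified = unboundedAt(i + 1);
      if (inner && quantified) return false; // nested unbounded repetition
      // Propagate to the enclosing group so deeper nesting is still caught.
      if ((inner || quantified) && stack.length > 0) stack[stack.length - 1] = true;
      i += 1;
      continue;
    }
    if (unboundedAt(i) && stack.length > 0) stack[stack.length - 1] = true;
    i += 1;
  }
  return true;
}

console.log(isSafeRegexPattern("(a+)+$")); // → false
console.log(isSafeRegexPattern("^deploy (start|stop)$")); // → true
```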

Tests: core covers the heuristic cases and loadBridges dropping an
unsafe stored pattern; gateway asserts POST `(a+)+$` returns 400 and is
not saved (proven to 201 without the fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…'completed'

runOrchestrationLoop only checked ctrl.abort at the top of the step
loop. cancelOrchestration sets abort=true, terminates the in-flight CLI
session, and persists status 'cancelled'. But terminating the session
makes the loop's `await waitForCompletion(...)` RESOLVE (the documented
terminate -> fireCompletionCallbacks behavior), so the loop ran on past
the cancel: it marked the step 'completed', ran a paid analyzeOutput LLM
call, and reached finishRun(..., 'completed') — overwriting the
'cancelled' status and broadcasting orchestration:finished as completed.
Cancelling a run left it reported as completed and burned an extra
analysis call.

A single abort check at the loop top can't catch a cancel that lands
while the loop is parked in the very await the cancel unblocks. Check
ctrl.abort after each interruptible await instead:
- top of loop now returns (with activeRuns cleanup) rather than breaking
  (break fell through to the post-loop finishRun('completed'));
- right after waitForCompletion resolves;
- right after analyzeOutput resolves (cancel during the analysis call).

Each bails with activeRuns.delete + return, leaving cancelOrchestration's
'cancelled' status intact.

Test gates waitForCompletion on a deferred, cancels mid-step, and
asserts the persisted statuses include 'cancelled' but never 'completed'
(without the fix: ['running','cancelled','completed']). The mock repo
create must echo the generated runId since activeRuns is keyed by it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
BudgetTracker.checkBudget — the gate the heartbeat runner calls before
every cycle — only checked maxCostPerDay. getMonthlySpend() existed but
was used solely for crew reporting, never for enforcement. So
maxCostPerMonth (default $100 from soul deploy / crew templates) was
configured and shown in the UI but completely toothless: a soul spending
up to maxCostPerDay each day sails straight past it (e.g. $5/day × 30 =
$150 against a $100/month cap).

checkBudget now also blocks when monthly spend reaches maxCostPerMonth.
The daily check keeps its exact prior semantics (strict-less-than, same
zero behaviour); the monthly check is additive and only runs when
maxCostPerMonth > 0, so a cap of 0 means "no monthly limit" and skips
the extra query (it also short-circuits when the daily cap already
blocks).
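
A pure-function sketch of the combined gate. The signature is hypothetical: spend lookups are passed in as callbacks to show where the monthly query is skipped. How a daily cap of 0 is treated is not spelled out here, so this sketch does not try to reproduce that behaviour.

```typescript
interface BudgetLimits {
  maxCostPerDay: number;
  maxCostPerMonth: number;
}

// Returns true when another cycle is allowed to run.
function checkBudget(
  limits: BudgetLimits,
  getDailySpend: () => number,
  getMonthlySpend: () => number,
): boolean {
  // Daily check: allowed only while spend is strictly below the cap.
  if (!(getDailySpend() < limits.maxCostPerDay)) return false;
  // A monthly cap of 0 means "no monthly limit": skip the extra lookup.
  if (limits.maxCostPerMonth <= 0) return true;
  // Block once monthly spend reaches the cap.
  return getMonthlySpend() < limits.maxCostPerMonth;
}

let monthlyLookups = 0;

// $5/day cap, $100/month cap, $2 spent today, $100 already spent this month.
const blocked = checkBudget(
  { maxCostPerDay: 5, maxCostPerMonth: 100 },
  () => 2,
  () => { monthlyLookups += 1; return 100; },
);

// Same day, but no monthly cap configured: the monthly lookup never runs.
const unlimited = checkBudget(
  { maxCostPerDay: 5, maxCostPerMonth: 0 },
  () => 2,
  () => { monthlyLookups += 1; return 150; },
);

console.log(blocked, unlimited, monthlyLookups); // → false true 1
```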

Reworded the now-inaccurate "daily budget exceeded" log / error / pause
notification in heartbeat-runner to "budget exceeded (daily / monthly)"
since either cap can trigger the pause.

Tests: monthly-exceeded-while-daily-under returns false (returns true
without the fix); a monthly cap of 0 returns true and issues only the
daily query.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nnect leaks

ChannelService.connect() calls api.connect() with no already-connected
dedupe. The Telegram plugin self-guards, but Discord, Slack, and Matrix
did not — so a repeat connect() on an already-connected channel (REST
POST /channels/:id/connect, WS channel:connect, or a config-change
reconnect) leaked the previous handle:

- Discord: a second discord.js Client was built over this.client,
  leaking the old client's gateway WebSocket, heartbeat timers, and
  listeners.
- Slack: connectSocketMode() opened a second SocketModeClient, leaking
  the first Socket Mode WebSocket.
- Matrix: startSync() started a SECOND long-poll loop and orphaned the
  first syncAbortController — now unreachable, so that loop polls /sync
  forever.

Add the Telegram-style idempotency guard as the first line of each
connect(): skip when status is already 'connected' or 'connecting'.
Lowest-risk fix, matches the established pattern, no change to shared
ChannelService behavior.

Matrix is the representative regression test (its connect() is
fetch-based, so no SDK mock is needed): with status pre-set to
'connected', connect() must make no /whoami call (without the guard it
issues one). Discord and Slack get the identical one-line guard; their
real connect() paths require discord.js / @slack mocks that the existing
plugin tests don't set up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rsion

fetchJson and downloadFile followed 301/302 by recursively calling
themselves with res.headers.location and no depth limit. A redirect
loop (registry/CDN misconfig, or a custom/private registry) recurses
synchronously inside the httpsGet callback until the call stack blows
(RangeError: Maximum call stack size exceeded) — a hard crash, not just
a hang. The Location header is also unvalidated.

Thread a redirectsLeft counter (MAX_DOWNLOAD_REDIRECTS = 5) through both
functions and reject with "Too many redirects" once it is exhausted,
matching the redirect cap the plugin network-isolation layer already
uses.

Test: a persistent mock that always 301-redirects to itself makes
getPackageInfo reject with "too many redirects" instead of overflowing
the stack (RangeError without the fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
setupHandlers() wired bot.on('message:text') FIRST, then
bot.command('start'|'help'|'reset'). grammy runs middleware in
registration order and does not auto-call next() for these single-arg
handlers, so the message:text handler consumed every text update —
including the slash commands, which are text messages too — and the
command handlers never fired. /start, /help, /reset were silently
routed to the AI message handler instead of replying with their canned
text.

This is the same gotcha already fixed in the gateway Telegram plugin
(commands must precede bot.on('message')); the CLI's separate bot
implementation never got it. Reorder so all command() handlers register
before the message:text catch-all (bot.catch stays last).

Test mocks grammy's Bot to record handler registration order and asserts
each command registers before the message:text handler (fails on the old
ordering: command:start at index 1, on:message:text at 0).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
detect() built each pattern's regex from pattern.pattern.flags verbatim
and iterated matches with `while ((match = regex.exec(text)) !== null)`.
RegExp.prototype.exec only advances lastIndex when the regex has the global (or sticky) flag, so:

- a pattern supplied WITHOUT the 'g' flag re-matches the first
  occurrence forever, and
- a global pattern that can match zero-width (e.g. /\d*/g) never
  advances on the empty match.

Either case is a SYNCHRONOUS infinite loop that blocks the event loop
entirely — it can't even be interrupted by a timer — so it hard-freezes
the process (a test run hung 119s vs 0.2s with the fix). All 24
built-in patterns are global and non-zero-width, but DetectorConfig
.customPatterns is a public, documented config surface: a consumer
adding a custom PII pattern without 'g' freezes the process on the next
detect()/hasPII() over any text.

Force the global flag when building the per-call regex and skip
zero-width matches by advancing lastIndex. Tests cover a non-global
custom pattern (still finds every occurrence) and a zero-width-capable
pattern (terminates); both hang the run without the fix.
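
The two guards can be sketched as a standalone match loop. `findAll` and `Found` are hypothetical names, not the detector's API; the sketch advances by one code unit on an empty match, which is a simplification for patterns using the `u` flag.

```typescript
interface Found {
  start: number;
  end: number;
  value: string;
}

// Collect every non-empty match, regardless of the flags the caller supplied.
function findAll(source: RegExp, text: string): Found[] {
  // Force the global flag so exec() advances lastIndex between calls.
  const flags = source.flags.includes("g") ? source.flags : source.flags + "g";
  const regex = new RegExp(source.source, flags);
  const out: Found[] = [];
  let match: RegExpExecArray | null;
  while ((match = regex.exec(text)) !== null) {
    if (match[0].length === 0) {
      // Zero-width match: step forward manually, otherwise exec() would
      // return the same empty match at the same index forever.
      regex.lastIndex += 1;
      continue;
    }
    out.push({ start: match.index, end: match.index + match[0].length, value: match[0] });
  }
  return out;
}

// Non-global custom pattern: still finds every occurrence.
console.log(findAll(/\d+/, "a1 b22 c333").map((m) => m.value)); // → [ '1', '22', '333' ]
// Zero-width-capable pattern: terminates and reports only the real match.
console.log(findAll(/\d*/g, "ab12").map((m) => m.value)); // → [ '12' ]
```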

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
redact() sorted matches descending by start and spliced right-to-left.
That is only correct for DISJOINT spans. The detector dedups exact
ranges only, so a generic and a specific pattern can both match
overlapping/contained spans of the same value (e.g. "api_key=sk-…"
matched as both the generic api_key span and the specific sk- token).

For the length-preserving `mask` mode the offsets stayed aligned, but
for length-CHANGING modes (category, hash, remove) the second splice
read stale offsets into the already-modified string and LEFT ORIGINAL
PII CHARACTERS in the output — e.g. outer /\d{10}/ + inner /23/ on
"0123456789" in category mode produced "[CUSTOM]456789" instead of
"[CUSTOM]", leaking 456789.

Coalesce overlapping/contained matches into non-overlapping spans before
replacing: sort ascending, merge when a match starts before the previous
span ends (keeping the highest-severity match's category for the label),
then replace the merged spans right-to-left. Disjoint matches (including
adjacent ones) are unaffected, so the common case is unchanged.
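
A sketch of the coalesce-then-replace step for category mode. `mergeSpans`, `redactByCategory`, and the `Match` shape (with a numeric `severity`) are assumptions for illustration; the real detector's types may differ.

```typescript
interface Match {
  start: number; // inclusive
  end: number; // exclusive
  category: string;
  severity: number; // higher wins when spans are merged
}

// Merge overlapping or contained matches into non-overlapping spans.
function mergeSpans(matches: Match[]): Match[] {
  const sorted = [...matches].sort((a, b) => a.start - b.start || b.end - a.end);
  const merged: Match[] = [];
  for (const m of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && m.start < last.end) {
      // Overlap: extend the span and keep the higher-severity label.
      const keep = m.severity > last.severity ? m : last;
      merged[merged.length - 1] = {
        start: last.start,
        end: Math.max(last.end, m.end),
        category: keep.category,
        severity: keep.severity,
      };
    } else {
      // Disjoint (or merely adjacent) span: keep as-is.
      merged.push({ ...m });
    }
  }
  return merged;
}

// Replace merged spans right-to-left so earlier offsets stay valid.
function redactByCategory(text: string, matches: Match[]): string {
  let out = text;
  for (const span of mergeSpans(matches).reverse()) {
    out = out.slice(0, span.start) + `[${span.category.toUpperCase()}]` + out.slice(span.end);
  }
  return out;
}

// Outer 10-digit match plus the inner "23" match from the example above.
const redacted = redactByCategory("0123456789", [
  { start: 0, end: 10, category: "custom", severity: 1 },
  { start: 2, end: 4, category: "custom", severity: 1 },
]);
console.log(redacted); // → [CUSTOM]
```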

Test redacts text where a 10-digit pattern and an inner pair pattern
overlap and asserts no original digit survives in category/remove modes
(without the fix: "[CUSTOM]456789").

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
matchesTopic() handled the '+' wildcard with a bare `continue`, without
checking that the topic actually has a level at that position. Combined
with '#' (which returns true immediately), a pattern like `a/+/#`
wrongly matched the shorter topic `a`: the '+' "matched" a non-existent
level and then '#' returned true. Per the MQTT spec, '+' occupies
exactly one present level, so `a/+/#` must not match `a`.

This is not reachable with OwnPilot's own edge topics (their patterns
end in a concrete suffix and never place '+' before '#'), but
EdgeMqttClient.subscribe() is a general-purpose API; this is defensive
hardening of the matcher.

Add `if (i >= topicParts.length) return false;` to the '+' branch.
Behavior is unchanged for the real edge patterns (their '+' levels are
always present when the rest matches).
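
For reference, a matcher with the added bounds check might look like the following. This is a reconstruction from the description, not the actual `matchesTopic` source; everything other than the `if (i >= topicParts.length) return false;` line is an assumption.

```typescript
// Match an MQTT topic against a subscription pattern with '+' and '#' wildcards.
function matchesTopic(pattern: string, topic: string): boolean {
  const patternParts = pattern.split("/");
  const topicParts = topic.split("/");
  for (let i = 0; i < patternParts.length; i++) {
    const part = patternParts[i];
    if (part === "#") return true; // multi-level wildcard: matches the rest
    if (part === "+") {
      // '+' must consume exactly one level that is actually present.
      if (i >= topicParts.length) return false;
      continue;
    }
    if (part !== topicParts[i]) return false;
  }
  // No '#' seen: the topic must not have extra levels.
  return patternParts.length === topicParts.length;
}

console.log(matchesTopic("a/+/#", "a")); // → false
console.log(matchesTopic("a/+/#", "a/b/c/d")); // → true
```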

Tests: `a/+/#` does not dispatch to topic `a` (handler wrongly called
without the fix) and still dispatches to `a/b/c/d`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ty check

PluginVerifier.verify(manifest, contentHash?) treated the content-hash
integrity check as optional and silent: when a caller verified a signed
manifest WITHOUT passing the downloaded files' hash, the signature
checked out (signatureValid true) but the files were never bound to it —
a valid signature over a claimed hash says nothing about what was
actually downloaded. integrityValid stayed false and trust capped at
'community', but nothing surfaced the gap, so a caller gating on
`valid`/`signatureValid` alone could install unverified files.

Add an explicit warning in that path so the integrity gap is visible
rather than silent. Behavior is otherwise unchanged: `valid`,
`signatureValid`, `integrityValid`, and `trustLevel` are untouched (so
the legitimate "verify the manifest signature before downloading" use
and the community-trust path still work). Callers gating installs must
require integrityValid, not valid/signatureValid — now documented in the
code.

This is the marketplace distribution path, which is not yet wired into a
production install flow, so this hardens the public @ownpilot/core
verifier API for SDK consumers rather than fixing a reachable bug.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
callEmbeddingAPI returned the provider's embedding array without checking
its length against the input count. A partial or truncated response (or an
OpenAI-compatible provider that drops items) yields fewer embeddings than
inputs, leaving undefined holes in generateBatchEmbeddings' pre-sized
results array. The embedding queue dereferences those holes outside its
per-item guard, throwing an uncaught error mid-batch that permanently leaks
the affected memories' dedup keys (queuedIds) — so they can never be
re-queued and never get embeddings until restart.

Enforce the contract at the API boundary: throw when the embedding count
differs from the input count. Both callers already handle a throw safely
(the queue re-enqueues the whole batch; semantic-search falls back to
cached), turning a silent partial-poison into a clean full-batch retry.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
checkAndRunTasks runs on a fixed setInterval (checkInterval, 60s in
production) but a task may run far longer (up to its timeout, 300s
default), and nextRun is advanced only after executeTask resolves. With no
in-flight guard, every overlapping tick during a long task's execution saw
the same task still 'due' and launched it again — running it concurrently
with itself up to 5x and duplicating side effects and LLM cost.

Add a per-task runningTasks Set: skip a due task that is already executing,
claim it synchronously before the await, and release it in a finally. The
check-and-claim happens before any await, so overlapping ticks cannot both
launch the same task. Distinct due tasks still run; nextRun advance
semantics are unchanged.
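
The check-and-claim ordering can be sketched as follows. This is an illustration under assumptions, not the scheduler's code: the `ScheduledTask` shape, the injected `executeTask`, and the returned list of launched ids are hypothetical, and the nextRun advance is omitted.

```typescript
interface ScheduledTask {
  id: string;
  nextRun: number; // epoch milliseconds
}

const runningTasks = new Set<string>();

async function runClaimed(
  task: ScheduledTask,
  executeTask: (task: ScheduledTask) => Promise<void>
): Promise<void> {
  try {
    await executeTask(task);
  } catch {
    // A real scheduler would log the failure here.
  } finally {
    runningTasks.delete(task.id); // release even when the task throws
  }
}

// Returns the ids launched by this tick (for illustration and testing).
function checkAndRunTasks(
  tasks: ScheduledTask[],
  now: number,
  executeTask: (task: ScheduledTask) => Promise<void>
): string[] {
  const launched: string[] = [];
  for (const task of tasks) {
    if (task.nextRun > now) continue;        // not due yet
    if (runningTasks.has(task.id)) continue; // still executing from an earlier tick
    runningTasks.add(task.id);               // claim synchronously, before any await
    launched.push(task.id);
    void runClaimed(task, executeTask);
  }
  return launched;
}
```

Because the `has`/`add` pair runs with no await in between, two overlapping ticks cannot both pass the check for the same task.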

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…uard

isPathAllowedAsync resolves symlinks with fs.realpath and range-checks the
result against the allowed roots. But realpath only resolves symlinks for
paths that already exist; for a target that doesn't exist yet (creating a
new file, delete/move destination) it threw, and the fallback used
path.normalize — which does not follow symlinks. A symlinked directory
inside the workspace (e.g. a 'cache -> /outside' link, or pnpm's symlinked
node_modules) could therefore be used to escape the sandbox: write_file
through the symlink passed the guard (normalize kept it under the workspace
prefix) and then fs.writeFile followed the link outside the workspace.

Add realpathNearestExistingAncestor: walk up to the nearest existing
ancestor, resolve its symlinks, and re-attach the missing trailing
segments, then range-check that real location. Existing-file operations and
the deny-traversal behavior are unchanged.
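
A sketch of the nearest-existing-ancestor walk, under stated assumptions: paths are POSIX-style, absolute, and already normalized (no `..` segments), and `realpath` is injected as a function that returns the symlink-resolved path or throws when the path does not exist (the way fs.realpathSync behaves). `isInside` is a hypothetical containment helper.

```typescript
function realpathNearestExistingAncestor(
  target: string,
  realpath: (p: string) => string
): string {
  const segments = target.split('/').filter((s) => s.length > 0);
  const missing: string[] = [];
  while (true) {
    const candidate = '/' + segments.join('/');
    try {
      const real = realpath(candidate);
      // Re-attach the trailing segments that do not exist yet.
      return [real.replace(/\/+$/, ''), ...missing].join('/') || '/';
    } catch {
      if (segments.length === 0) {
        throw new Error(`No existing ancestor for ${target}`);
      }
      // Move one level up and remember the segment we dropped.
      missing.unshift(segments.pop() as string);
    }
  }
}

// Range check against an allowed root, on a segment boundary.
function isInside(root: string, candidate: string): boolean {
  const prefix = root.endsWith('/') ? root : root + '/';
  return candidate === root || candidate.startsWith(prefix);
}
```

With a `/ws/cache -> /outside` link, a not-yet-existing `/ws/cache/new.txt` resolves to `/outside/new.txt`, which the range check then rejects.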

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
isPrivateIp only special-cased ::ffff:127.0.0.1, so every other IPv4-mapped
IPv6 form (::ffff:169.254.169.254, ::ffff:10.x, ::ffff:192.168.x) fell
through to the dotted-quad regex, which a colon form fails, and was
classified as public. Since isPrivateUrlAsync runs each DNS-resolved
address through isPrivateIp, a hostname publishing an AAAA record of
::ffff:<private-ipv4> bypassed the SSRF guard — and the dual-stack OS
connects the mapped address to the embedded IPv4, i.e. the cloud-metadata
or private endpoint.

Normalize ::ffff:a.b.c.d (and the legacy ::a.b.c.d) to the embedded IPv4
before applying the IPv4 private-range rules, so all mapped private
addresses are caught. ::1 still falls through to the loopback check.
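
The normalization step might look like the sketch below. It only handles the dotted forms named above; the hex spelling of a mapped address (for example `::ffff:a9fe:a9fe`) and IPv6-only private ranges are out of scope here, and the IPv4 range list is an illustrative subset rather than the project's actual rules.

```typescript
// Matches ::ffff:a.b.c.d and the legacy ::a.b.c.d form.
const MAPPED_IPV4 = /^::(?:ffff:)?(\d{1,3}(?:\.\d{1,3}){3})$/i;
const DOTTED_QUAD = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;

function normalizeMappedIpv4(ip: string): string {
  const match = MAPPED_IPV4.exec(ip);
  return match ? match[1] : ip;
}

function isPrivateIpv4(ip: string): boolean {
  const match = DOTTED_QUAD.exec(ip);
  if (!match) return false;
  const octets = match.slice(1).map(Number);
  if (octets.some((n) => n > 255)) return false;
  const [a, b] = octets;
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // link-local, incl. cloud metadata
  );
}

function isPrivateIp(ip: string): boolean {
  const normalized = normalizeMappedIpv4(ip);
  if (normalized === '::1') return true; // IPv6 loopback is not a mapped form
  return isPrivateIpv4(normalized);
}
```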

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The local JS/Python/shell executors accumulated child stdout/stderr into
strings with no cap while the process ran; maxOutputSize only truncated the
returned result. A runaway writer (e.g. while-true console.log / print)
could therefore exhaust host memory before the timeout SIGKILLs it.

Add a shared collectCappedOutput helper: append each chunk, and once the
combined output crosses the cap, SIGKILL the child and drop further data so
memory stays bounded to ~cap plus one in-flight chunk. success now requires
!exceeded and a distinct 'Output exceeded N bytes' error is reported
(separate from the timeout path). Mirrors the streaming-cap pattern already
used for plugin isolated-network responses.
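
A rough sketch of the capped collector. The interface is hypothetical: `kill` stands in for sending SIGKILL to the child, and size is counted in string length rather than true bytes to keep the example dependency-free.

```typescript
interface CappedOutput {
  push(chunk: string): void;
  text(): string;
  exceeded(): boolean;
}

function collectCappedOutput(maxSize: number, kill: () => void): CappedOutput {
  let buffer = '';
  let over = false;
  return {
    push(chunk: string): void {
      if (over) return; // drop everything after the cap was crossed
      buffer += chunk;  // memory is bounded to ~cap plus this one chunk
      if (buffer.length > maxSize) {
        over = true;
        kill();         // stop the runaway writer immediately
      }
    },
    text: () => buffer,
    exceeded: () => over,
  };
}
```

A caller would wire `push` to the child's stdout and stderr data events and treat `exceeded()` as a failure distinct from the timeout path.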

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ersinkoc and others added 8 commits June 3, 2026 22:35
read_file/write_file enforce isPathAllowedAsync against the workspace (and
write_file is permission-blocked), but pdf-tools (read/create/info) and
image-tools (analyze/resize) did fs.readFile/writeFile/mkdir/sharp().toFile
on agent-supplied absolute paths with no sandbox check. pdf_create was
therefore an unblocked, unsandboxed arbitrary-file-write primitive, and the
read tools could leak any file's contents — defeating the file sandbox.

Export resolveFilePath from file-system.ts and, in every pdf/image I/O
executor, resolve the path against context.workspaceDir and reject with
'Access denied' when isPathAllowedAsync fails, before any fs op. URL/base64
image sources and the stub executors (generate_image, all of email-tools)
are unaffected. git-tools' read-only repoPath is intentionally left as-is.

Tests mock ./file-system.js (the sandbox is covered by file-system.test.ts)
so the pdf/image logic tests stay decoupled, plus new rejection tests that
assert Access denied and that no fs call is made.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
createScopedFs gives custom tools (local+filesystem permission) a
workspace-jailed filesystem API. resolveSafePath blocked ../ traversal with
path.resolve + startsWith but performed no realpath, and path.resolve does
not follow symlinks. A symlinked entry inside the workspace (pnpm
node_modules links, a checked-out repo, etc.) therefore passed the lexical
prefix check while the real target was outside, letting readFile/writeFile/
stat/unlink/exists escape the jail.

Make resolveSafePath async: after the cheap lexical check, realpath both the
target and the workspace (resolving the nearest existing ancestor for
not-yet-existing write/mkdir targets), re-check containment against the real
workspace, and operate on the real path. All callsites await it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
An ACP session has two completion paths: runAcpPrompt (on prompt resolution)
and the onStateChange('closed'/'error') handler. AcpClient.close() calls
setState('closed'), which synchronously invokes onStateChange. With no
re-entry guard this caused two bugs:

- terminateSession called acpClient.close() before marking the session
  terminated, so the close-triggered onStateChange ran the full completion
  path (state -> completed, fire callbacks, persistResult) and recorded a
  user-terminated run as completed with exit code 0.
- a late teardown 'error' after a clean prompt resolution flipped an already
  'completed' session to 'failed'.

Add isTerminalState() and guard both onStateChange and onError against
re-entering a terminal session, and set 'terminated' before close() in
terminateSession so the close-triggered event no-ops.
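
The guard and the reordering can be illustrated with a reduced model. The state names other than 'completed', 'failed', and 'terminated', the `Session` shape, and the `completions` counter are assumptions made for this sketch.

```typescript
type SessionState = 'running' | 'completed' | 'failed' | 'terminated';

interface Session {
  state: SessionState;
  completions: number; // how many times the completion path ran
}

function isTerminalState(state: SessionState): boolean {
  return state === 'completed' || state === 'failed' || state === 'terminated';
}

// Invoked synchronously by the client when it closes or errors.
function onStateChange(session: Session, clientState: 'closed' | 'error'): void {
  if (isTerminalState(session.state)) return; // late event: already finished
  session.state = clientState === 'error' ? 'failed' : 'completed';
  session.completions += 1;
}

function terminateSession(session: Session, close: () => void): void {
  session.state = 'terminated'; // mark first...
  close();                      // ...so the close-triggered event no-ops
}
```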

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
createSanitizedEnv builds the environment for every spawned coding-agent
CLI (Codex, Claude Code, Gemini, custom). It spread all of process.env and
stripped only five OwnPilot-specific patterns, so the CLI inherited every
other secret: OPENAI_API_KEY and other providers' keys, AWS/cloud
credentials, SMTP_PASS, *_TOKEN, ENCRYPTION_KEY, REDIS_URL, etc. These CLIs
run arbitrary shell commands and are steered by the model, so a
prompt-injected or malicious task can exfiltrate them with a single `env`.

Replace the narrow list with a broad secret-fragment denylist (matching the
local-executor sanitizer): API_KEY/SECRET/TOKEN/PASSWORD/CREDENTIAL/
ACCESS_KEY, the provider prefixes, and SMTP_/DB_/REDIS/ENCRYPTION/etc. The
strip runs before the targeted key injection, so the provider's own key is
still set even though the ambient one is removed; PATH/HOME/NODE_ENV and
other non-secret vars are preserved.
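
A sketch of the strip-then-inject ordering. `sanitizeEnv` and its signature are hypothetical (the real function is createSanitizedEnv), and the fragment list is an illustrative subset of the denylist described above.

```typescript
type Env = Record<string, string | undefined>;

// Illustrative subset; the actual denylist is broader.
const SECRET_FRAGMENTS = [
  'API_KEY', 'SECRET', 'TOKEN', 'PASSWORD', 'CREDENTIAL',
  'ACCESS_KEY', 'SMTP_', 'DB_', 'REDIS', 'ENCRYPTION',
];

function sanitizeEnv(source: Env, inject: Env = {}): Env {
  const clean: Env = {};
  for (const [name, value] of Object.entries(source)) {
    const upper = name.toUpperCase();
    // Drop any variable whose name contains a secret-looking fragment.
    if (SECRET_FRAGMENTS.some((fragment) => upper.includes(fragment))) continue;
    clean[name] = value;
  }
  // Injection runs after the strip, so the provider's own key survives.
  return { ...clean, ...inject };
}
```

A substring denylist over-matches by design: losing a harmless variable such as a hypothetical `TOKENIZER_PATH` is cheaper than leaking one real credential.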

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
runClaudeCode (the SDK execution path) built sdkEnv from { ...process.env,
ANTHROPIC_API_KEY } and passed it to the Claude Code SDK's query() env
option, bypassing createSanitizedEnv. The sibling runCodex/runGeminiCli
paths use the sanitizer, so the SDK path was the one place still leaking the
gateway's ambient secrets (other providers' keys, cloud creds, SMTP
password) into Claude Code, which runs arbitrary Bash.

Route the SDK env through createSanitizedEnv('claude-code', apiKey), which
strips secrets and injects ANTHROPIC_API_KEY. Completes the env-leak fix
across all coding-agent execution paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On Windows the CLI binaries are npm .cmd shims, which Node only runs with
shell:true (CVE-2024-27980), and passing an args array with shell:true emits
DEP0190 — so spawnAndCollect joined command+args into a single cmd string.
It wrapped each token as `"${a}"` without escaping embedded quotes, so a
double quote in the binary path or an argument (config-controlled) could
break out of the quotes and inject commands via cmd.exe (the prompt itself
is passed safely via stdin).

Add escapeWindowsArg implementing the qntm.org/cmd algorithm used by
cross-spawn: escape embedded quotes for the program's argv parser, wrap in
quotes, then caret-escape every cmd.exe metacharacter. spawnAndCollect now
maps each token through it. The non-Windows path (no shell, args array) is
unchanged.
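
For reference, the three escaping steps can be sketched like this. It follows the commonly published form of the algorithm and is not a copy of the project's escapeWindowsArg; the metacharacter set is the one cross-spawn is understood to use. cross-spawn is also understood to repeat the caret pass for some .cmd shims, which this single-pass sketch leaves out.

```typescript
// cmd.exe metacharacters that get a caret prefix (includes the quote itself).
const CMD_META_CHARS = /([()\][%!^"`<>&|;, *?])/g;

function escapeWindowsArg(arg: string): string {
  // 1. For the program's argv parser: double any backslashes that precede
  //    a quote, then escape the quote itself.
  let out = arg.replace(/(\\*)"/g, '$1$1\\"');
  // 2. Double trailing backslashes so they cannot escape the closing quote.
  out = out.replace(/(\\*)$/, '$1$1');
  // 3. Wrap in quotes, then caret-escape every cmd.exe metacharacter.
  out = `"${out}"`;
  return out.replace(CMD_META_CHARS, '^$1');
}

// Joining tokens the way a shell:true spawn needs a single command string.
function buildCommandLine(tokens: string[]): string {
  return tokens.map(escapeWindowsArg).join(' ');
}
```

The order matters: the argv-level escaping must happen before the caret pass, because cmd.exe strips the carets first and the program's parser only sees what remains.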

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes all 10 open Dependabot alerts (6 high, 4 moderate), all for
react-router 7.13.1:
- DoS via unbounded path expansion in __manifest (high)
- arbitrary constructor invocation in vendored turbo-stream v2 (high)
- XSS in unstable RSC redirect handling (high)
- open redirect via same-origin redirect with // path (moderate)
- stored XSS via unescaped Location header in prerender (moderate)

All ranges are closed by >= 7.15.0; bumped to ^7.16.0 (latest 7.x).
Lockfile diff is limited to react-router/react-router-dom 7.13.1 -> 7.16.0.

Verified: UI typecheck clean, 347/347 UI tests pass, UI build succeeds.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
vi.mock('child_process', () => ({ spawn })) replaced the entire module,
dropping other named exports (exec, etc.) that another module in the
import chain imports — failing vitest collection with
'No "exec" export is defined on the "child_process" mock'. This was a
pre-existing failure on main (blocks the CI test step for every PR).

Use the importOriginal pattern to keep the real module and override only
spawn. database/schema.test.ts: 18/18 pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@ersinkoc ersinkoc changed the title fix(ui): bump react-router-dom to ^7.16.0 (resolves 10 Dependabot alerts) fix: react-router CVEs (10 Dependabot alerts) + repair pre-existing database/schema test Jun 4, 2026
@ersinkoc ersinkoc merged commit 5385201 into main Jun 4, 2026
2 checks passed
@ersinkoc ersinkoc deleted the fix/react-router-cves branch June 4, 2026 08:50