Skip to content

feat: coding agent and update PR Trigger.dev tasks#41

Merged
sweetmantech merged 16 commits intomainfrom
sweetmantech/myc-4371-recoup-coding-sandbox-task-tasks-for-coding-and-pr-review
Mar 6, 2026
Merged

feat: coding agent and update PR Trigger.dev tasks#41
sweetmantech merged 16 commits intomainfrom
sweetmantech/myc-4371-recoup-coding-sandbox-task-tasks-for-coding-and-pr-review

Conversation

@sweetmantech
Copy link
Copy Markdown
Contributor

@sweetmantech sweetmantech commented Feb 28, 2026

Summary

  • Adds coding-agent task: creates Vercel Sandbox, clones monorepo with submodules, runs OpenClaw AI agent, detects changed submodules, creates PRs per submodule (respecting base branch rules), notifies API via callback
  • Adds update-pr task: resumes sandbox from snapshot, applies user feedback via OpenClaw, pushes updates to existing PR branches, notifies API
  • Sandbox helpers: cloneRecoupMonorepo, detectChangedSubmodules, createSubmodulePR, notifyCodingAgentCallback
  • Zod schemas for both task payloads
  • 21 unit tests across 6 test files (TDD red-green-refactor)

New Environment Variables

Variable Purpose
CODING_AGENT_CALLBACK_URL URL of the API callback endpoint
CODING_AGENT_CALLBACK_SECRET Shared secret for callback auth

Test plan

  • Verify all 21 unit tests pass (pnpm test)
  • Set env vars in Trigger.dev
  • Deploy tasks to Trigger.dev (pnpm run deploy:trigger-prod)
  • Trigger coding-agent task manually → verify sandbox creation, monorepo clone, PR creation
  • Trigger update-pr task with snapshot ID → verify sandbox resume and PR update
  • Verify callback notifications reach the API endpoint

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added coding agent background task with monorepo support and automated pull request creation
    • Added pull request update task for applying feedback and updating branches
    • Implemented callback notification system for coding agent operation completion
    • Refactored GitHub authentication handling with improved token management
  • Tests

    • Added comprehensive test coverage for coding agent operations and pull request updates

sweetmantech and others added 2 commits February 25, 2026 18:54
Add `tags.add(\`account:${accountId}\`)` to runSandboxCommandTask and
customerPromptTask so runs can be queried by account via the Trigger.dev
runs.list API.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements the task-side infrastructure for the Slack-driven coding agent:
- coding-agent task: creates sandbox, clones monorepo, runs OpenClaw, detects
  changed submodules, creates PRs per submodule, and notifies via callback
- update-pr task: resumes sandbox from snapshot, applies feedback, pushes
  updates to existing PRs
- Sandbox helpers: cloneRecoupMonorepo, detectChangedSubmodules,
  createSubmodulePR, notifyCodingAgentCallback
- Zod schemas for both task payloads
- 21 unit tests covering all helpers and both tasks

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 28, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduces a complete coding agent workflow system including new background tasks (codingAgentTask, updatePRTask), Zod schemas for payload validation, a notification callback mechanism, and refactored sandbox utilities (getSandboxHomeDir, getGitHubAuthPrefix, cloneMonorepoViaAgent, parsePRUrls, pushAndCreatePRsViaAgent). Includes comprehensive test coverage for all new components.

Changes

Cohort / File(s) Summary
Payload Schemas
src/schemas/codingAgentSchema.ts, src/schemas/updatePRSchema.ts
Adds Zod validation schemas for coding agent payloads (prompt, callbackThreadId) and PR update payloads (feedback, branch, snapshotId, prs array, callbackThreadId). Exports inferred TypeScript types and interfaces.
Notification Callback System
src/sandboxes/notifyCodingAgentCallback.ts, src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts
Implements callback notification mechanism with environment-based configuration. Sends POST requests with secret header validation, handles missing URL errors, and includes comprehensive test coverage for success and failure paths.
Background Tasks
src/tasks/codingAgentTask.ts, src/tasks/updatePRTask.ts
Introduces two schemaTask exports orchestrating complex workflows: codingAgentTask spins up sandbox, installs OpenClaw, clones monorepo, runs agent, detects changes, creates PRs, and notifies callback; updatePRTask resumes snapshot, applies feedback via agent, pushes changes across PR branches, and notifies callback with new snapshot.
Sandbox Home Directory Utility
src/sandboxes/getSandboxHomeDir.ts, src/sandboxes/__tests__/getSandboxHomeDir.test.ts
Adds utility to resolve sandbox home directory via shell command with "/root" fallback. Includes tests for standard paths, empty/null outputs, and whitespace trimming.
GitHub Authentication Prefix
src/sandboxes/getGitHubAuthPrefix.ts, src/sandboxes/__tests__/getGitHubAuthPrefix.test.ts
Introduces function extracting GITHUB_TOKEN and returning authenticated URL prefix for git operations, or null if token missing. Centralizes GitHub auth handling for consistent URL rewriting.
Monorepo Clone via Agent
src/sandboxes/cloneMonorepoViaAgent.ts, src/sandboxes/__tests__/cloneMonorepoViaAgent.test.ts
Delegates monorepo cloning to OpenClaw agent with labeled instructions for clone and submodule initialization. Tests verify agent invocation with correct message structure.
PR Creation & Parsing
src/sandboxes/parsePRUrls.ts, src/sandboxes/pushAndCreatePRsViaAgent.ts, src/sandboxes/__tests__/parsePRUrls.test.ts, src/sandboxes/__tests__/pushAndCreatePRsViaAgent.test.ts
Adds parser extracting PR URLs from agent stdout (PR_CREATED: lines) with baseBranch lookup from config. Implements push/PR creation orchestration via agent with per-submodule instructions and sentinel line detection.
Submodule Configuration
src/sandboxes/submoduleConfig.ts
Exports static configuration mapping submodule names to repos and base branches (api, chat, tasks, docs, database, remotion, bash, skills, cli).
Refactored Git Utilities
src/sandboxes/ensureGithubRepo.ts, src/sandboxes/git/addOrgSubmodules.ts, src/sandboxes/git/pushOrgRepos.ts
Updates GitHub repo handling to use getGitHubAuthPrefix instead of direct GITHUB_TOKEN embedding; replaces manual home directory resolution with getSandboxHomeDir utility in addOrgSubmodules and pushOrgRepos.
Test Infrastructure Updates
src/sandboxes/__tests__/addOrgSubmodules.test.ts, src/sandboxes/__tests__/pushOrgRepos.test.ts, src/tasks/__tests__/codingAgentTask.test.ts, src/tasks/__tests__/updatePRTask.test.ts, src/sandboxes/__tests__/logStep.test.ts
Introduces getSandboxHomeDir mocks in test suites, removes inline home directory resolution mocks. Adds comprehensive test coverage for codingAgentTask (full flow, no-changes, failure paths) and updatePRTask (sandbox resume, agent invocation, push delegation, callback notification).
Enhanced Logging
src/sandboxes/logStep.ts
Extends logStep signature to accept optional extra parameter (Record<string, unknown>) passed to logger.log without including in metadata.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant API
    participant Sandbox
    participant OpenClaw as OpenClaw Agent
    participant GitHub
    participant Callback
    
    User->>API: Trigger codingAgentTask
    API->>Sandbox: Create & setup (install OpenClaw)
    Sandbox-->>API: Ready
    API->>Sandbox: Clone monorepo via agent
    Sandbox->>OpenClaw: Clone with instructions
    OpenClaw->>GitHub: Fetch monorepo
    GitHub-->>OpenClaw: Monorepo
    OpenClaw-->>Sandbox: Clone complete
    API->>Sandbox: Run AI agent with prompt
    Sandbox->>OpenClaw: Execute with prompt
    OpenClaw->>Sandbox: Generate changes
    Sandbox-->>API: Changes detected
    API->>Sandbox: Detect changed submodules
    alt Changes Found
        API->>Sandbox: Create PRs via agent
        Sandbox->>OpenClaw: Push & PR instructions
        OpenClaw->>GitHub: Create branches & PRs
        GitHub-->>OpenClaw: PR URLs
        OpenClaw-->>Sandbox: PR_CREATED lines
        API->>Sandbox: Snapshot sandbox
        Sandbox-->>API: Snapshot ID
        API->>Callback: Notify pr_created status
    else No Changes
        API->>Callback: Notify no_changes status
    end
    API->>Sandbox: Stop & cleanup
Loading
sequenceDiagram
    actor User
    participant API
    participant Sandbox
    participant OpenClaw as OpenClaw Agent
    participant GitHub
    participant Callback
    
    User->>API: Trigger updatePRTask
    API->>Sandbox: Resume from snapshot
    Sandbox-->>API: Ready
    API->>Sandbox: Setup OpenClaw agent
    Sandbox-->>API: Agent ready
    API->>Sandbox: Run agent with feedback
    Sandbox->>OpenClaw: Apply feedback changes
    OpenClaw-->>Sandbox: Changes applied
    API->>Sandbox: Get PR list & push updates
    Sandbox->>OpenClaw: Stage, commit, push per PR
    OpenClaw->>GitHub: Push to PR branches
    GitHub-->>OpenClaw: Updates accepted
    OpenClaw-->>Sandbox: Push complete
    API->>Sandbox: Snapshot updated state
    Sandbox-->>API: New snapshot ID
    API->>Callback: Notify updated status
    Callback-->>API: Acknowledged
    API->>Sandbox: Stop & cleanup
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 A coding agent hops through the repo,
Cloning and crafting with OpenClaw's flow,
PRs spring forth from the sandbox's might,
Updates cascade through branches at night,
Callbacks confirm—the work's burning bright! 🌟

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 59.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'feat: coding agent and update PR Trigger.dev tasks' accurately summarizes the primary changes: two new Trigger.dev tasks (coding-agent and update-pr) for coding agent functionality and PR updates.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sweetmantech/myc-4371-recoup-coding-sandbox-task-tasks-for-coding-and-pr-review

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Update clone directory from /root/monorepo to /home/user/monorepo to
match actual Vercel Sandbox environment, and fix repo name from
recoupable/Recoup-Monorepo to recoupable/mono.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 17

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sandboxes/__tests__/createSubmodulePR.test.ts`:
- Around line 68-73: The test reveals runGitCommand is called with a fourth
argument (subDir) but the function is defined with only three parameters; update
the runGitCommand function signature to accept a fourth parameter (e.g.,
subDir?: string) and propagate it through its implementation and type
definitions, then ensure all callers (including createSubmodulePR's calls) pass
that subDir arg correctly; alternatively, if you prefer removing subDir, remove
the extra argument from all runGitCommand call sites (including the calls in
createSubmodulePR) and update the test to match three-argument calls—prefer the
first option (add subDir) to keep existing callers working.

In `@src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts`:
- Around line 1-16: The test mutates process.env and stubs global.fetch without
restoring them; save the originals at top-level (e.g., const ORIGINAL_ENV = {
...process.env } and const originalFetch = global.fetch) and add an afterEach
that restores process.env = { ...ORIGINAL_ENV }, clears the specific env keys
(CODING_AGENT_CALLBACK_URL, CODING_AGENT_CALLBACK_SECRET) if needed, and
restores global.fetch to originalFetch (or uses vi.unstubGlobal("fetch") /
delete global.fetch then reassign originalFetch), and also call
vi.clearAllMocks()/vi.resetAllMocks() to ensure notifyCodingAgentCallback tests
are isolated. Reference symbols: mockFetch, notifyCodingAgentCallback,
CODING_AGENT_CALLBACK_URL, CODING_AGENT_CALLBACK_SECRET.

In `@src/sandboxes/cloneRecoupMonorepo.ts`:
- Around line 23-29: Replace embedding githubToken in authUrl and passing it on
the git command; instead construct the repo URL without credentials (use
MONOREPO_REPO to form "https://github.com/… .git") and supply the token via a
secure runtime mechanism to sandbox.runCommand (e.g., set a temporary
GIT_ASKPASS script in the command's env that echoes githubToken, or invoke git
with a runtime config like -c http.extraheader="AUTHORIZATION: bearer <token>"
built from the in-memory githubToken) so the token is never part of the command
args or URL or persisted in git config; update the code that builds authUrl and
the sandbox.runCommand call (references: authUrl, githubToken,
sandbox.runCommand, cloneResult, CLONE_DIR) to use the repo-only URL and pass
credentials via env or git -c instead.
- Around line 39-60: The git config commands call runGitCommand (the three calls
configuring user.email, user.name, and the url.insteadOf rewrite using
CLONE_DIR) but their return values are ignored; update the surrounding function
(the clone/checkout routine) to capture each runGitCommand result and if any
call returns a falsy/error value, log or propagate the error and return false
(or throw) so the function does not report success; specifically check the
results of the runGitCommand calls that set user.email, user.name, and the URL
rewrite and bail out immediately on failure.

In `@src/sandboxes/createSubmodulePR.ts`:
- Around line 92-97: The parsed PR number from prResult.stdout() (prUrl ->
prNumber) must be validated before returning; inside createSubmodulePR, after
computing prUrl and prNumber, check that prNumber is a valid positive integer
(e.g., Number.isFinite(prNumber) && prNumber > 0) and if not, log an error via
logger.error including the raw prUrl and throw or return a failure (do not
return success with number 0/NaN). Ensure this validation is applied to the
variables prResult, prUrl and prNumber so callers never receive an invalid PR
number.
- Around line 56-70: The sequence of git operations in createSubmodulePR.ts
ignores runGitCommand's false return and lets execution continue; update the
code around the runGitCommand calls (the calls for checkout branch, add -A,
commit, and push -u origin) to check each awaited result and if any call returns
false immediately stop further work by logging the failure (using the existing
logger) and returning or throwing so the subsequent gh pr create step never
runs; ensure checks are done for the specific calls to runGitCommand that create
the branch, stage, commit, and push.

In `@src/sandboxes/detectChangedSubmodules.ts`:
- Around line 17-23: In detectChangedSubmodules.ts, don't treat failures from
sandbox.runCommand as implicit "no changes": check the runCommand results
(lsResult and the git status result) for success and handle errors explicitly by
surfacing or returning an error instead of returning a false/no_changes outcome;
specifically inspect lsResult.success and the git command result, propagate or
throw a descriptive error including stderr/stdout when either command fails, and
only proceed to parse output when the runCommand call succeeded (references:
lsResult, gitResult/variable used for git status, and the enclosing function
detectChangedSubmodules).

In `@src/sandboxes/notifyCodingAgentCallback.ts`:
- Around line 21-27: The code currently only aborts when
CODING_AGENT_CALLBACK_URL is missing but still sends callbacks with an empty
x-callback-secret; update the startup/dispatch guard to require
CODING_AGENT_CALLBACK_SECRET as well by checking the
process.env.CODING_AGENT_CALLBACK_SECRET (variable secret) wherever you validate
CODING_AGENT_CALLBACK_URL (variable url) — if secret is falsy, call logger.error
with a clear message and return/abort before any request is dispatched (same
pattern used for url), and ensure the HTTP dispatch path that sets the
x-callback-secret header (in notifyCodingAgentCallback) will never send an empty
header when secret is absent.

In `@src/tasks/codingAgentTask.ts`:
- Around line 43-120: The try/finally block in runOpenClawAgent flow lacks a
catch to notify the callback on unexpected errors, so add a catch between the
try and finally that catches any exception, calls notifyCodingAgentCallback(...)
with threadId: callbackThreadId, status: "failed", message set to the error
message (and include error.stack or serialized error), and include any available
stdout/stderr, then rethrow the error; update the block around
installOpenClaw/cloneRecoupMonorepo/runOpenClawAgent/detectChangedSubmodules/createSubmodulePR/sandbox.snapshot
to ensure the catch uses the same callback payload shape as other notifications
and still allows the finally to stop the sandbox.
- Around line 48-50: The gh installation call in codingAgentTask.ts uses
sandbox.runCommand inside logStep but ignores the result; update the code around
logStep("Installing gh CLI") to capture the return value from sandbox.runCommand
(the call that runs "which gh || (apt-get update -qq && apt-get install -y -qq
gh)"), check its exitCode and/or stderr, and if it indicates failure, log a
clear error (using the existing logger) and throw or return an error so
subsequent PR creation steps do not proceed on a failed install; ensure you
reference the sandbox.runCommand result variable and handle both non-zero
exitCode and any error output.
- Around line 62-67: The code calls runOpenClawAgent and ignores its exit
status; update the flow to fail fast when agentResult.exitCode is non-zero by
checking agentResult.exitCode after the await (where agentResult is assigned
from runOpenClawAgent), calling processLogger.error or logStep with the
agentResult output/error details, and then throw an Error or return a failure
result to stop further processing so the task does not proceed to produce
misleading no_changes/partial-success outcomes.
- Around line 87-107: The code currently treats a situation where
changedSubmodules.length > 0 but every createSubmodulePR(...) returned null as
"no_changes"; instead detect this condition and report an explicit failure
status: after building prs from createSubmodulePR calls (variables
changedSubmodules, prs, createSubmodulePR), if changedSubmodules.length > 0 &&
prs.length === 0 set the notifyCodingAgentCallback status to something like
"pr_failed" (or another failure enum used by notifyCodingAgentCallback) rather
than "no_changes", and ensure any downstream logic (snapshot/notifications) uses
that status so operational failures aren’t hidden.

In `@src/tasks/customerPromptTask.ts`:
- Line 60: Make tagging best-effort by preventing failures in
tags.add(`account:${accountId}`) from aborting the task: wrap the call to
tags.add (the line using tags.add and accountId in customerPromptTask.ts) in a
try/catch (or use .catch()) and on error log the failure with context (include
accountId and the error) but do not rethrow—allow prompt generation to continue.

In `@src/tasks/runSandboxCommandTask.ts`:
- Line 32: The call to tags.add(account:${accountId}) inside
runSandboxCommandTask is in the critical path and must not fail the main
operation; wrap the tagging call so failures are non-blocking (either await it
inside a try/catch or call tags.add(...).catch(...)) and on error log a warning
including the accountId and error (use the existing logger/processLogger), but
do not rethrow or return the error—allow the sandbox command to continue even if
tags.add fails.

In `@src/tasks/updatePRTask.ts`:
- Around line 42-91: Wrap the existing try block in updatePRTask with an
explicit catch that calls notifyCodingAgentCallback({ threadId:
callbackThreadId, status: "failed", error: <stringified error>, snapshotId:
maybe undefined }) so the caller is informed on unexpected failures; inside the
catch set metadata.set("currentStep","Failed") and include a concise error
message (e.g. error.message or JSON.stringify(error)) when calling
notifyCodingAgentCallback, then rethrow the error so existing finally (which
stops the sandbox via sandbox.stop()) still runs; reference the existing symbols
runOpenClawAgent, notifyCodingAgentCallback, metadata, sandbox, and
callbackThreadId when implementing this change.
- Around line 47-51: The AI-agent feedback application step (runOpenClawAgent
call in updatePRTask) can fail but the task continues; change the code to detect
a non-success result or thrown error from runOpenClawAgent and abort before any
branch push or marking the run updated. Specifically, call runOpenClawAgent
inside a try/catch (or inspect its returned exit/status value), and if it throws
or returns a non-zero/failure status, log the error via logStep/process logger
and rethrow or return an error so subsequent steps (branch pushes/marking
updated) do not run; ensure the logic references runOpenClawAgent and the
surrounding updatePRTask flow so the early exit prevents pushes.
- Around line 53-73: The add/commit/push steps in the for loop over prs are
currently fire-and-forget; update the loop in updatePRTask.ts so each
runGitCommand call (the calls named "stage changes", "commit feedback", and
"push updates") is awaited and its result checked and on failure either throw or
mark that PR as not updated and break/continue accordingly; use the existing
runGitCommand return/throw behavior to gate progression (so only run commit if
add succeeded and only push if commit succeeded), referencing runGitCommand,
prs, MONOREPO_DIR, feedback and branch to locate the code to change.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 172d971 and 2cefaaf.

📒 Files selected for processing (16)
  • src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts
  • src/sandboxes/__tests__/createSubmodulePR.test.ts
  • src/sandboxes/__tests__/detectChangedSubmodules.test.ts
  • src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts
  • src/sandboxes/cloneRecoupMonorepo.ts
  • src/sandboxes/createSubmodulePR.ts
  • src/sandboxes/detectChangedSubmodules.ts
  • src/sandboxes/notifyCodingAgentCallback.ts
  • src/schemas/codingAgentSchema.ts
  • src/schemas/updatePRSchema.ts
  • src/tasks/__tests__/codingAgentTask.test.ts
  • src/tasks/__tests__/updatePRTask.test.ts
  • src/tasks/codingAgentTask.ts
  • src/tasks/customerPromptTask.ts
  • src/tasks/runSandboxCommandTask.ts
  • src/tasks/updatePRTask.ts

Comment on lines +68 to +73
expect(runGitCommand).toHaveBeenCalledWith(
sandbox,
expect.arrayContaining(["checkout", "-b", "agent/fix-123"]),
expect.any(String),
expect.any(String),
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== runGitCommand definition =="
fd 'runGitCommand.ts' --exec sed -n '1,120p' {}

echo
echo "== createSubmodulePR runGitCommand call sites =="
fd 'createSubmodulePR.ts' --exec rg -nP -C2 '\brunGitCommand\s*\(' {}

echo
echo "== test expectations touching runGitCommand =="
fd 'createSubmodulePR.test.ts' --exec rg -nP -C2 'toHaveBeenCalledWith|runGitCommand'

Repository: recoupable/tasks

Length of output: 1910


This assertion locks in a runGitCommand call-arity mismatch

The function is defined with 3 parameters (sandbox, args, description) but all production calls at src/sandboxes/createSubmodulePR.ts:57-70 pass 4 arguments, including subDir as the 4th argument. The test assertion at lines 68-73 validates this incorrect 4-argument pattern (with two expect.any(String) calls) instead of catching the error.

Fix the function signature to accept the subDir parameter or remove the extra argument from all call sites.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/createSubmodulePR.test.ts` around lines 68 - 73, The
test reveals runGitCommand is called with a fourth argument (subDir) but the
function is defined with only three parameters; update the runGitCommand
function signature to accept a fourth parameter (e.g., subDir?: string) and
propagate it through its implementation and type definitions, then ensure all
callers (including createSubmodulePR's calls) pass that subDir arg correctly;
alternatively, if you prefer removing subDir, remove the extra argument from all
runGitCommand call sites (including the calls in createSubmodulePR) and update
the test to match three-argument calls—prefer the first option (add subDir) to
keep existing callers working.

Comment on lines +1 to +16
import { describe, it, expect, vi, beforeEach } from "vitest";

vi.mock("@trigger.dev/sdk/v3", () => ({
logger: { log: vi.fn(), error: vi.fn() },
}));

const mockFetch = vi.fn();
vi.stubGlobal("fetch", mockFetch);

const { notifyCodingAgentCallback } = await import("../notifyCodingAgentCallback");

beforeEach(() => {
vi.clearAllMocks();
process.env.CODING_AGENT_CALLBACK_URL = "https://recoup-api.vercel.app/api/coding-agent/callback";
process.env.CODING_AGENT_CALLBACK_SECRET = "test-secret";
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Restore env/global state after each test

This suite mutates process.env and stubs fetch but doesn’t restore them, which can bleed state into other tests and create flakes.

Suggested test isolation fix
-import { describe, it, expect, vi, beforeEach } from "vitest";
+import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
@@
 const mockFetch = vi.fn();
-vi.stubGlobal("fetch", mockFetch);
+let previousCallbackUrl: string | undefined;
+let previousCallbackSecret: string | undefined;
@@
 beforeEach(() => {
   vi.clearAllMocks();
+  vi.stubGlobal("fetch", mockFetch);
+  previousCallbackUrl = process.env.CODING_AGENT_CALLBACK_URL;
+  previousCallbackSecret = process.env.CODING_AGENT_CALLBACK_SECRET;
   process.env.CODING_AGENT_CALLBACK_URL = "https://recoup-api.vercel.app/api/coding-agent/callback";
   process.env.CODING_AGENT_CALLBACK_SECRET = "test-secret";
 });
+
+afterEach(() => {
+  vi.unstubAllGlobals();
+  if (previousCallbackUrl === undefined) delete process.env.CODING_AGENT_CALLBACK_URL;
+  else process.env.CODING_AGENT_CALLBACK_URL = previousCallbackUrl;
+  if (previousCallbackSecret === undefined) delete process.env.CODING_AGENT_CALLBACK_SECRET;
+  else process.env.CODING_AGENT_CALLBACK_SECRET = previousCallbackSecret;
+});
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import { describe, it, expect, vi, beforeEach } from "vitest";
vi.mock("@trigger.dev/sdk/v3", () => ({
logger: { log: vi.fn(), error: vi.fn() },
}));
const mockFetch = vi.fn();
vi.stubGlobal("fetch", mockFetch);
const { notifyCodingAgentCallback } = await import("../notifyCodingAgentCallback");
beforeEach(() => {
vi.clearAllMocks();
process.env.CODING_AGENT_CALLBACK_URL = "https://recoup-api.vercel.app/api/coding-agent/callback";
process.env.CODING_AGENT_CALLBACK_SECRET = "test-secret";
});
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
vi.mock("@trigger.dev/sdk/v3", () => ({
logger: { log: vi.fn(), error: vi.fn() },
}));
const mockFetch = vi.fn();
let previousCallbackUrl: string | undefined;
let previousCallbackSecret: string | undefined;
const { notifyCodingAgentCallback } = await import("../notifyCodingAgentCallback");
beforeEach(() => {
vi.clearAllMocks();
vi.stubGlobal("fetch", mockFetch);
previousCallbackUrl = process.env.CODING_AGENT_CALLBACK_URL;
previousCallbackSecret = process.env.CODING_AGENT_CALLBACK_SECRET;
process.env.CODING_AGENT_CALLBACK_URL = "https://recoup-api.vercel.app/api/coding-agent/callback";
process.env.CODING_AGENT_CALLBACK_SECRET = "test-secret";
});
afterEach(() => {
vi.unstubAllGlobals();
if (previousCallbackUrl === undefined) delete process.env.CODING_AGENT_CALLBACK_URL;
else process.env.CODING_AGENT_CALLBACK_URL = previousCallbackUrl;
if (previousCallbackSecret === undefined) delete process.env.CODING_AGENT_CALLBACK_SECRET;
else process.env.CODING_AGENT_CALLBACK_SECRET = previousCallbackSecret;
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts` around lines 1 -
16, The test mutates process.env and stubs global.fetch without restoring them;
save the originals at top-level (e.g., const ORIGINAL_ENV = { ...process.env }
and const originalFetch = global.fetch) and add an afterEach that restores
process.env = { ...ORIGINAL_ENV }, clears the specific env keys
(CODING_AGENT_CALLBACK_URL, CODING_AGENT_CALLBACK_SECRET) if needed, and
restores global.fetch to originalFetch (or uses vi.unstubGlobal("fetch") /
delete global.fetch then reassign originalFetch), and also call
vi.clearAllMocks()/vi.resetAllMocks() to ensure notifyCodingAgentCallback tests
are isolated. Reference symbols: mockFetch, notifyCodingAgentCallback,
CODING_AGENT_CALLBACK_URL, CODING_AGENT_CALLBACK_SECRET.

Comment on lines +23 to +29
const authUrl = `https://x-access-token:${githubToken}@github.com/${MONOREPO_REPO}.git`;

logger.log("Cloning Recoup monorepo with submodules", { repo: MONOREPO_REPO });

const cloneResult = await sandbox.runCommand({
cmd: "git",
args: ["clone", "--recurse-submodules", authUrl, CLONE_DIR],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid embedding GITHUB_TOKEN directly in git URLs/config

Placing the token in clone/config URLs increases the chance of secret leakage via command tracing, process inspection, or persisted git config in snapshots.

Also applies to: 55-57

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/cloneRecoupMonorepo.ts` around lines 23 - 29, Replace embedding
githubToken in authUrl and passing it on the git command; instead construct the
repo URL without credentials (use MONOREPO_REPO to form "https://github.com/…
.git") and supply the token via a secure runtime mechanism to sandbox.runCommand
(e.g., set a temporary GIT_ASKPASS script in the command's env that echoes
githubToken, or invoke git with a runtime config like -c
http.extraheader="AUTHORIZATION: bearer <token>" built from the in-memory
githubToken) so the token is never part of the command args or URL or persisted
in git config; update the code that builds authUrl and the sandbox.runCommand
call (references: authUrl, githubToken, sandbox.runCommand, cloneResult,
CLONE_DIR) to use the repo-only URL and pass credentials via env or git -c
instead.

Comment on lines +56 to +70
// Create branch, stage, commit, push
await runGitCommand(sandbox, ["-C", subDir, "checkout", "-b", branch], "create branch", subDir);
await runGitCommand(sandbox, ["-C", subDir, "add", "-A"], "stage changes", subDir);
await runGitCommand(
sandbox,
["-C", subDir, "commit", "-m", `agent: ${prompt.slice(0, 72)}`],
"commit changes",
subDir,
);
await runGitCommand(
sandbox,
["-C", subDir, "push", "-u", "origin", branch],
"push branch",
subDir,
);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Stop immediately when a git step fails.

runGitCommand returns false on failure, but the current flow ignores it and continues to gh pr create, which can cascade errors and produce misleading outcomes.

✅ Proposed fix
   // Create branch, stage, commit, push
-  await runGitCommand(sandbox, ["-C", subDir, "checkout", "-b", branch], "create branch", subDir);
-  await runGitCommand(sandbox, ["-C", subDir, "add", "-A"], "stage changes", subDir);
-  await runGitCommand(
+  if (!(await runGitCommand(sandbox, ["-C", subDir, "checkout", "-b", branch], "create branch"))) {
+    return null;
+  }
+  if (!(await runGitCommand(sandbox, ["-C", subDir, "add", "-A"], "stage changes"))) {
+    return null;
+  }
+  if (!(await runGitCommand(
     sandbox,
     ["-C", subDir, "commit", "-m", `agent: ${prompt.slice(0, 72)}`],
-    "commit changes",
-    subDir,
-  );
-  await runGitCommand(
+    "commit changes",
+  ))) {
+    return null;
+  }
+  if (!(await runGitCommand(
     sandbox,
     ["-C", subDir, "push", "-u", "origin", branch],
-    "push branch",
-    subDir,
-  );
+    "push branch",
+  ))) {
+    return null;
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/createSubmodulePR.ts` around lines 56 - 70, The sequence of git
operations in createSubmodulePR.ts ignores runGitCommand's false return and lets
execution continue; update the code around the runGitCommand calls (the calls
for checkout branch, add -A, commit, and push -u origin) to check each awaited
result and if any call returns false immediately stop further work by logging
the failure (using the existing logger) and returning or throwing so the
subsequent gh pr create step never runs; ensure checks are done for the specific
calls to runGitCommand that create the branch, stage, commit, and push.

return;
}

await tags.add(`account:${accountId}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make account tagging best-effort, not task-blocking

A failure in tags.add() will currently abort customer prompt execution. Tagging should be best-effort so prompt generation can still proceed.

Suggested hardening
-    await tags.add(`account:${accountId}`);
+    try {
+      await tags.add(`account:${accountId}`);
+    } catch (error) {
+      logger.warn("Failed to attach account tag", {
+        accountId,
+        error: error instanceof Error ? error.message : String(error),
+      });
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await tags.add(`account:${accountId}`);
try {
await tags.add(`account:${accountId}`);
} catch (error) {
logger.warn("Failed to attach account tag", {
accountId,
error: error instanceof Error ? error.message : String(error),
});
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/customerPromptTask.ts` at line 60, Make tagging best-effort by
preventing failures in tags.add(`account:${accountId}`) from aborting the task:
wrap the call to tags.add (the line using tags.add and accountId in
customerPromptTask.ts) in a try/catch (or use .catch()) and on error log the
failure with context (include accountId and the error) but do not rethrow—allow
prompt generation to continue.

},
run: async (payload): Promise<SandboxResult> => {
const { command, args, cwd, sandboxId, accountId } = payload;
await tags.add(`account:${accountId}`);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle tag failures as non-blocking metadata

tags.add() is in the critical path. If tagging fails transiently, the entire sandbox command fails even though the main operation could continue.

Suggested hardening
-    await tags.add(`account:${accountId}`);
+    try {
+      await tags.add(`account:${accountId}`);
+    } catch (error) {
+      logger.warn("Failed to attach account tag", {
+        accountId,
+        error: error instanceof Error ? error.message : String(error),
+      });
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await tags.add(`account:${accountId}`);
try {
await tags.add(`account:${accountId}`);
} catch (error) {
logger.warn("Failed to attach account tag", {
accountId,
error: error instanceof Error ? error.message : String(error),
});
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/runSandboxCommandTask.ts` at line 32, The call to
tags.add(account:${accountId}) inside runSandboxCommandTask is in the critical
path and must not fail the main operation; wrap the tagging call so failures are
non-blocking (either await it inside a try/catch or call
tags.add(...).catch(...)) and on error log a warning including the accountId and
error (use the existing logger/processLogger), but do not rethrow or return the
error—allow the sandbox command to continue even if tags.add fails.

Comment on lines +42 to +91
try {
logStep("Ensuring OpenClaw is running");
await installOpenClaw(sandbox);
await setupOpenClaw(sandbox, CODING_AGENT_ACCOUNT_ID);

logStep("Running AI agent with feedback");
await runOpenClawAgent(sandbox, {
label: "Apply feedback",
message: `The following feedback was given on the existing changes on branch "${branch}":\n\n${feedback}\n\nPlease make the requested changes.`,
});

logStep("Pushing updates to PR branches");
for (const pr of prs) {
const submodule = pr.repo.split("/").pop()!;
// Remap repo name to submodule dir name
const subDir = submodule === "recoup-api" ? "api" : submodule;
const fullDir = `${MONOREPO_DIR}/${subDir}`;

await runGitCommand(sandbox, ["-C", fullDir, "add", "-A"], "stage changes", fullDir);
await runGitCommand(
sandbox,
["-C", fullDir, "commit", "-m", `agent: address feedback\n\n${feedback.slice(0, 200)}`],
"commit feedback",
fullDir,
);
await runGitCommand(
sandbox,
["-C", fullDir, "push", "origin", branch],
"push updates",
fullDir,
);
}

logStep("Taking new snapshot");
const newSnapshot = await sandbox.snapshot();

logStep("Notifying bot");
await notifyCodingAgentCallback({
threadId: callbackThreadId,
status: "updated",
snapshotId: newSnapshot.snapshotId,
});

metadata.set("currentStep", "Complete");

return { snapshotId: newSnapshot.snapshotId };
} finally {
logger.log("Stopping sandbox", { sandboxId: sandbox.sandboxId });
await sandbox.stop();
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Add a catch branch to notify callback on unexpected failures.

Without a catch, thrown errors skip callback notification and the caller may never receive a terminal status update.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 42 - 91, Wrap the existing try block
in updatePRTask with an explicit catch that calls notifyCodingAgentCallback({
threadId: callbackThreadId, status: "failed", error: <stringified error>,
snapshotId: maybe undefined }) so the caller is informed on unexpected failures;
inside the catch set metadata.set("currentStep","Failed") and include a concise
error message (e.g. error.message or JSON.stringify(error)) when calling
notifyCodingAgentCallback, then rethrow the error so existing finally (which
stops the sandbox via sandbox.stop()) still runs; reference the existing symbols
runOpenClawAgent, notifyCodingAgentCallback, metadata, sandbox, and
callbackThreadId when implementing this change.

Comment on lines +47 to +51
logStep("Running AI agent with feedback");
await runOpenClawAgent(sandbox, {
label: "Apply feedback",
message: `The following feedback was given on the existing changes on branch "${branch}":\n\n${feedback}\n\nPlease make the requested changes.`,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle non-zero AI agent exit before pushing branches.

The task currently proceeds even if the feedback application step fails, which can mark runs as updated without valid changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 47 - 51, The AI-agent feedback
application step (runOpenClawAgent call in updatePRTask) can fail but the task
continues; change the code to detect a non-success result or thrown error from
runOpenClawAgent and abort before any branch push or marking the run updated.
Specifically, call runOpenClawAgent inside a try/catch (or inspect its returned
exit/status value), and if it throws or returns a non-zero/failure status, log
the error via logStep/process logger and rethrow or return an error so
subsequent steps (branch pushes/marking updated) do not run; ensure the logic
references runOpenClawAgent and the surrounding updatePRTask flow so the early
exit prevents pushes.

Comment on lines +53 to +73
logStep("Pushing updates to PR branches");
for (const pr of prs) {
const submodule = pr.repo.split("/").pop()!;
// Remap repo name to submodule dir name
const subDir = submodule === "recoup-api" ? "api" : submodule;
const fullDir = `${MONOREPO_DIR}/${subDir}`;

await runGitCommand(sandbox, ["-C", fullDir, "add", "-A"], "stage changes", fullDir);
await runGitCommand(
sandbox,
["-C", fullDir, "commit", "-m", `agent: address feedback\n\n${feedback.slice(0, 200)}`],
"commit feedback",
fullDir,
);
await runGitCommand(
sandbox,
["-C", fullDir, "push", "origin", branch],
"push updates",
fullDir,
);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Gate commit/push flow on runGitCommand success.

Each git step is fire-and-forget right now. If add/commit/push fails for any repo, the task can still continue and later report updated.

🧩 Proposed fix
       logStep("Pushing updates to PR branches");
       for (const pr of prs) {
@@
-        await runGitCommand(sandbox, ["-C", fullDir, "add", "-A"], "stage changes", fullDir);
-        await runGitCommand(
+        if (!(await runGitCommand(sandbox, ["-C", fullDir, "add", "-A"], "stage changes"))) {
+          throw new Error(`Failed to stage changes for ${pr.repo}`);
+        }
+        if (!(await runGitCommand(
           sandbox,
           ["-C", fullDir, "commit", "-m", `agent: address feedback\n\n${feedback.slice(0, 200)}`],
-          "commit feedback",
-          fullDir,
-        );
-        await runGitCommand(
+          "commit feedback",
+        ))) {
+          throw new Error(`Failed to commit feedback for ${pr.repo}`);
+        }
+        if (!(await runGitCommand(
           sandbox,
           ["-C", fullDir, "push", "origin", branch],
-          "push updates",
-          fullDir,
-        );
+          "push updates",
+        ))) {
+          throw new Error(`Failed to push updates for ${pr.repo}`);
+        }
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 53 - 73, The add/commit/push steps in
the for loop over prs are currently fire-and-forget; update the loop in
updatePRTask.ts so each runGitCommand call (the calls named "stage changes",
"commit feedback", and "push updates") is awaited and its result checked and on
failure either throw or mark that PR as not updated and break/continue
accordingly; use the existing runGitCommand return/throw behavior to gate
progression (so only run commit if add succeeded and only push if commit
succeeded), referencing runGitCommand, prs, MONOREPO_DIR, feedback and branch to
locate the code to change.

Extract getSandboxHomeDir utility from duplicated `echo ~` pattern.
Change cloneRecoupMonorepo to return clone path (string|null) instead
of boolean, removing hardcoded /home/user/monorepo that fails in
Vercel Sandboxes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (18)
src/sandboxes/detectChangedSubmodules.ts (1)

17-23: ⚠️ Potential issue | 🟠 Major

Handle runCommand failures explicitly instead of defaulting to “unchanged.”

Lines 17-23 and 33-39 ignore non-zero exit codes. Failed ls or git status commands can silently produce false no_changes results.

Suggested fix
   const lsResult = await sandbox.runCommand({
     cmd: "sh",
     args: [
       "-c",
       `ls -d ${monorepoDir}/*/ 2>/dev/null | xargs -I{} basename {} | grep -v node_modules`,
     ],
   });
+  if (lsResult.exitCode !== 0) {
+    const stderr = (await lsResult.stderr()) || "";
+    logger.error("Failed to list monorepo directories", { monorepoDir, stderr });
+    throw new Error("Failed to list submodule directories");
+  }

   const allDirs = ((await lsResult.stdout()) || "")
     .trim()
     .split("\n")
     .filter(Boolean);

   const changed: string[] = [];

   for (const dir of allDirs) {
     const statusResult = await sandbox.runCommand({
       cmd: "git",
       args: ["-C", `${monorepoDir}/${dir}`, "status", "--porcelain"],
     });
+    if (statusResult.exitCode !== 0) {
+      const stderr = (await statusResult.stderr()) || "";
+      logger.error("Failed to check submodule status", { dir, stderr });
+      throw new Error(`Failed to check status for submodule: ${dir}`);
+    }

     const stdout = (await statusResult.stdout()) || "";

Also applies to: 33-39

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/detectChangedSubmodules.ts` around lines 17 - 23, The current
calls to sandbox.runCommand in detectChangedSubmodules.ts (e.g., the lsResult
and gitStatusResult invocations) ignore non-zero exit codes and treat failures
as "no_changes"; update the handling to explicitly check the runCommand result
status (or thrown error) and propagate/return an error instead of defaulting to
unchanged: detect the non-zero exit or missing stdout for both the ls command
(lsResult) and the git status command (gitStatusResult), log or attach the
underlying error/message via processLogger or the function's error path, and
throw or return a failure result so callers can distinguish a command failure
from an actual "no_changes" result.
src/sandboxes/notifyCodingAgentCallback.ts (1)

21-27: ⚠️ Potential issue | 🟠 Major

Require callback secret before dispatching request.

On Lines 21-27, only URL is validated. On Line 36, an empty secret is still sent, which weakens callback auth and hides env misconfiguration.

🔐 Suggested fix
   if (!url) {
     logger.error("Missing CODING_AGENT_CALLBACK_URL environment variable");
     return;
   }
+  if (!secret) {
+    logger.error("Missing CODING_AGENT_CALLBACK_SECRET environment variable");
+    return;
+  }

   logger.log("Sending coding agent callback", { threadId: payload.threadId, status: payload.status });

   try {
     const response = await fetch(url, {
       method: "POST",
       headers: {
         "Content-Type": "application/json",
-        "x-callback-secret": secret || "",
+        "x-callback-secret": secret,
       },
       body: JSON.stringify(payload),
     });

Also applies to: 34-37

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/notifyCodingAgentCallback.ts` around lines 21 - 27, The code
currently only validates CODING_AGENT_CALLBACK_URL but still reads and may send
an empty CODING_AGENT_CALLBACK_SECRET; update the startup/dispatch logic in
notifyCodingAgentCallback (or the function that reads const url and const
secret) to require both url and secret: if secret is falsy, log an error like
"Missing CODING_AGENT_CALLBACK_SECRET environment variable" and return without
dispatching the request so you never send an empty secret; ensure the secret
variable used when building the callback request/payload/headers is always
present before making the HTTP call.
src/sandboxes/cloneRecoupMonorepo.ts (2)

42-63: ⚠️ Potential issue | 🟠 Major

Check runGitCommand return values before reporting success.

On Lines 42-63, failures in git config steps are ignored, so the function can return a successful clone path with broken git setup.

Suggested fix
-  await runGitCommand(
+  if (!(await runGitCommand(
     sandbox,
     ["-C", cloneDir, "config", "user.email", "agent@recoupable.com"],
     "set git user email",
-  );
+  ))) return null;

-  await runGitCommand(
+  if (!(await runGitCommand(
     sandbox,
     ["-C", cloneDir, "config", "user.name", "Recoup Agent"],
     "set git user name",
-  );
+  ))) return null;

-  await runGitCommand(
+  if (!(await runGitCommand(
     sandbox,
     [
       "-C", cloneDir, "config",
       `url.https://x-access-token:${githubToken}@github.com/.insteadOf`,
       "https://github.com/",
     ],
     "set URL rewrite for auth",
-  );
+  ))) return null;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/cloneRecoupMonorepo.ts` around lines 42 - 63, The git config
commands run via runGitCommand in cloneRecoupMonorepo (the calls that set
user.email, user.name and the URL rewrite using cloneDir, sandbox and
githubToken) currently ignore failures; update each call to check the returned
result (or await and validate success/exit code/throw) and if any command fails,
log the error/context and propagate the failure (throw or return an error) so
the function does not report a successful clone path with broken git setup;
ensure you reference runGitCommand's return shape and handle its rejection or
non-zero result consistently for all three invocations.

26-33: ⚠️ Potential issue | 🟠 Major

Avoid embedding GITHUB_TOKEN in git URLs/config values.

On Line 26 and Line 59, the token is placed directly into clone/config arguments. This can leak via command tracing, process inspection, or persisted git config in snapshots.

Also applies to: 55-60

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/cloneRecoupMonorepo.ts` around lines 26 - 33, The code
currently embeds the GitHub token in authUrl and passes it to
sandbox.runCommand, which can leak secrets; instead build the repo URL without
the token and pass the token via git's http.extraheader so it is not stored in
args or git config. Replace usage of authUrl and the git clone call in
cloneRecoupMonorepo (the authUrl variable and the sandbox.runCommand call that
invokes "git clone --recurse-submodules") to use the plain
https://github.com/... URL and add a git config flag like -c
http.extraheader="AUTHORIZATION: bearer <githubToken>" (or set GIT_ASKPASS to a
helper) in the sandbox.runCommand invocation so the token is provided at request
time without being embedded in the URL or persisted; do the same change for the
later git commands referenced around lines 55-60 that currently inject the token
into args/config.
src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts (1)

92-103: ⚠️ Potential issue | 🟠 Major

Test currently enshrines insecure token-in-URL behavior.

On Lines 99-103, the test expects GITHUB_TOKEN to appear in the clone URL. This locks in a secret-leak-prone implementation and will fight secure fixes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts` around lines 92 - 103,
The test asserts that GITHUB_TOKEN appears directly in the git clone URL (in
cloneRecoupMonorepo via sandbox.runCommand), which enshrines insecure
token-in-URL behavior; update the test to assert that an authenticated clone is
attempted without checking for the raw token in the URL—e.g., verify the
runCommand invocation uses a credential helper or HTTPS auth mechanism (check
that the clone command contains an authentication indicator such as a configured
credential helper or header flag) or assert that the URL is sanitized/obscured
rather than containing "ghp_test123", and update references to
createMockSandbox/sandbox.runCommand accordingly to look for the secure behavior
instead of the literal token.
src/tasks/runSandboxCommandTask.ts (1)

32-32: ⚠️ Potential issue | 🟠 Major

Don’t let tag attachment failure block sandbox command execution.

On Line 32, a tags.add() failure currently aborts the task before any sandbox work starts.

Suggested hardening
-    await tags.add(`account:${accountId}`);
+    try {
+      await tags.add(`account:${accountId}`);
+    } catch (error) {
+      logger.warn("Failed to attach account tag", {
+        accountId,
+        error: error instanceof Error ? error.message : String(error),
+      });
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/runSandboxCommandTask.ts` at line 32, The call to
tags.add(`account:${accountId}`) in runSandboxCommandTask can throw and
currently aborts execution; wrap that tags.add call in a try/catch inside
runSandboxCommandTask, log the error (using the existing logger or
processLogger) and continue without rethrowing so sandbox command execution
proceeds even if tag attachment fails; keep the rest of runSandboxCommandTask
unchanged and ensure the catch is narrow (only around tags.add) to avoid
swallowing unrelated errors.
src/sandboxes/__tests__/createSubmodulePR.test.ts (1)

68-73: ⚠️ Potential issue | 🔴 Critical

This assertion locks in a runGitCommand signature mismatch.

Line 68 expects a 4-argument call, but src/sandboxes/runGitCommand.ts defines 3 parameters. This test currently validates the wrong contract instead of catching it.

#!/bin/bash
set -euo pipefail

echo "== runGitCommand signature =="
fd 'runGitCommand.ts' --exec sed -n '1,80p' {}

echo
echo "== createSubmodulePR call sites =="
fd 'createSubmodulePR.ts' --exec rg -nP -C2 '\brunGitCommand\s*\(' {}

echo
echo "== test assertions =="
fd 'createSubmodulePR.test.ts' --exec rg -nP -C2 'toHaveBeenCalledWith|runGitCommand'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/createSubmodulePR.test.ts` around lines 68 - 73, The
test's assertion for runGitCommand uses a 4-argument call but the actual
function runGitCommand accepts 3 parameters, so update the test in
createSubmodulePR.test.ts to match the real contract: change the
expect(runGitCommand).toHaveBeenCalledWith(...) to expect three arguments (e.g.,
expect(runGitCommand).toHaveBeenCalledWith(sandbox,
expect.arrayContaining(["checkout", "-b", "agent/fix-123"]),
expect.any(String))) or otherwise use a looser matcher like expect.anything()
for the third arg; ensure the assertion references the runGitCommand symbol and
the same checkout branch array used now so the test validates the real
signature.
src/tasks/customerPromptTask.ts (1)

60-61: ⚠️ Potential issue | 🟠 Major

Make account tagging best-effort to avoid aborting customer prompt flow.

On Line 60, tags.add() is in the critical path. If tagging fails transiently, the task exits before chat generation even though tagging is non-essential metadata.

Suggested hardening
-    await tags.add(`account:${accountId}`);
+    try {
+      await tags.add(`account:${accountId}`);
+    } catch (error) {
+      logger.warn("Failed to attach account tag", {
+        accountId,
+        error: error instanceof Error ? error.message : String(error),
+      });
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/customerPromptTask.ts` around lines 60 - 61, The await
tags.add(`account:${accountId}`) call in customerPromptTask.ts is currently on
the critical path and can abort the customer prompt flow if it fails; make this
tagging best-effort by moving it out of the critical path—either wrap the
tags.add(...) call in a try/catch inside the customerPromptTask and log any
error without rethrowing, or fire-and-forget the Promise
(tags.add(...).catch(err => processLogger.warn(...))) so failures don’t block
subsequent chat generation; reference the tags.add call and the accountId
interpolation when applying the change.
src/tasks/codingAgentTask.ts (4)

62-66: ⚠️ Potential issue | 🟠 Major

Fail fast when the coding agent exits non-zero.

The workflow currently continues even if the core edit step failed.

🧯 Suggested guard
       const agentResult = await runOpenClawAgent(sandbox, {
         label: "Coding agent",
         message: prompt,
       });
+      if (agentResult.exitCode !== 0) {
+        await notifyCodingAgentCallback({
+          threadId: callbackThreadId,
+          status: "failed",
+          message: "Coding agent execution failed",
+          stdout: agentResult.stdout,
+          stderr: agentResult.stderr,
+        });
+        return;
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/codingAgentTask.ts` around lines 62 - 66, The call to
runOpenClawAgent currently ignores failures so the workflow continues even if
the coding agent fails; after awaiting runOpenClawAgent, check the returned
agentResult (e.g., agentResult.exitCode or
agentResult.success/agentResult.error) and if it indicates a non-zero exit or
failure, immediately fail-fast by throwing an Error (or returning a rejected
Promise) with a clear message that includes the returned error/exit details;
update the code around runOpenClawAgent in codingAgentTask (and any surrounding
handler) to perform this guard and stop further steps when the agent failed.

47-48: ⚠️ Potential issue | 🟠 Major

Check gh installation command outcome before proceeding.

If this step fails, downstream PR creation errors become noisy and misleading.

🛠️ Suggested guard
       logStep("Installing gh CLI");
-      await sandbox.runCommand({ cmd: "sh", args: ["-c", "which gh || (apt-get update -qq && apt-get install -y -qq gh)"], sudo: true });
+      const ghInstall = await sandbox.runCommand({
+        cmd: "sh",
+        args: ["-c", "which gh || (apt-get update -qq && apt-get install -y -qq gh)"],
+        sudo: true,
+      });
+      if (ghInstall.exitCode !== 0) {
+        const stdout = (await ghInstall.stdout()) || "";
+        const stderr = (await ghInstall.stderr()) || "";
+        logger.error("Failed to install gh CLI", { stdout, stderr, exitCode: ghInstall.exitCode });
+        throw new Error("Failed to install gh CLI");
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/codingAgentTask.ts` around lines 47 - 48, The gh install step using
logStep("Installing gh CLI") and sandbox.runCommand(...) must check the command
outcome and fail fast; wrap the sandbox.runCommand call for installing gh in a
try/catch or inspect its result/exit code, and if the install fails use
processLogger or the existing logger to emit a clear error (including
stderr/output) and throw or return a failing error so downstream PR creation
does not run; update the code around logStep("Installing gh CLI") and the
sandbox.runCommand invocation to perform this validation and fail fast with a
descriptive message on error.

86-111: ⚠️ Potential issue | 🟠 Major

Do not map PR-creation failures to no_changes.

When submodules changed but PR creation failed, reporting no_changes hides operational failure.

📌 Suggested status handling
-      const prs = [];
+      const prs = [];
+      const failedSubmodules: string[] = [];
       for (const submodule of changedSubmodules) {
         const pr = await createSubmodulePR(sandbox, {
           submodule,
           branch,
           prompt,
           monorepoDir,
         });
         if (pr) {
           prs.push(pr);
+        } else {
+          failedSubmodules.push(submodule);
         }
       }
@@
       await notifyCodingAgentCallback({
         threadId: callbackThreadId,
-        status: prs.length > 0 ? "pr_created" : "no_changes",
+        status: failedSubmodules.length > 0 ? "failed" : "pr_created",
         branch,
         snapshotId,
         prs,
+        message: failedSubmodules.length > 0
+          ? `Failed to create PRs for: ${failedSubmodules.join(", ")}`
+          : undefined,
         stdout: agentResult.stdout,
         stderr: agentResult.stderr,
       });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/codingAgentTask.ts` around lines 86 - 111, The current logic maps a
zero-length prs array to "no_changes", which hides failures when
changedSubmodules existed; update the status computation before calling
notifyCodingAgentCallback to distinguish three cases: if prs.length > 0 set
status to "pr_created", else if changedSubmodules.length > 0 set status to
"pr_failed" (or similar failure status), else set status to "no_changes". Locate
the loop that calls createSubmodulePR and builds prs, then adjust the status
field in the notifyCodingAgentCallback call (referencing changedSubmodules, prs,
createSubmodulePR, and notifyCodingAgentCallback) so operational failures are
reported separately from no-change situations.

42-119: ⚠️ Potential issue | 🟠 Major

Add a catch path to always emit a failed callback on unexpected errors.

Without catch, exceptions skip explicit failure notification and the caller can be left without terminal status.

🧩 Suggested structure
-    try {
+    try {
       // existing flow
+    } catch (error) {
+      const message =
+        error instanceof Error ? error.message : JSON.stringify(error);
+      metadata.set("currentStep", "Failed");
+      await notifyCodingAgentCallback({
+        threadId: callbackThreadId,
+        status: "failed",
+        message,
+      });
+      throw error;
     } finally {
       logger.log("Stopping sandbox", { sandboxId: sandbox.sandboxId });
       await sandbox.stop();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/codingAgentTask.ts` around lines 42 - 119, Wrap the existing
try/finally in a try/catch/finally so any unexpected exception sends a failed
callback via notifyCodingAgentCallback before rethrowing; in the catch block
capture the thrown error (error.message and error.stack) and include threadId:
callbackThreadId, status: "failed", and a descriptive message plus any available
agent output (use agentResult?.stdout and agentResult?.stderr if defined) so
callers always receive a terminal status; keep the existing finally that logs
and calls sandbox.stop().
src/tasks/updatePRTask.ts (3)

42-94: ⚠️ Potential issue | 🟠 Major

Add a catch path that notifies callback on unexpected failures.

Unhandled exceptions currently skip terminal callback notification, so callers may never receive a final failed state.

🧩 Suggested structure
-    try {
+    try {
       // existing flow
+    } catch (error) {
+      const message =
+        error instanceof Error ? error.message : JSON.stringify(error);
+      metadata.set("currentStep", "Failed");
+      await notifyCodingAgentCallback({
+        threadId: callbackThreadId,
+        status: "failed",
+        message,
+      });
+      throw error;
     } finally {
       logger.log("Stopping sandbox", { sandboxId: sandbox.sandboxId });
       await sandbox.stop();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 42 - 94, The try/finally in
updatePRTask.ts ensures sandbox.stop but lacks a catch to notify the callback on
errors; add a catch block around the main try body (or convert to
try/catch/finally) that catches any exception, sets
metadata.set("currentStep","Failed"), calls notifyCodingAgentCallback({
threadId: callbackThreadId, status: "failed", error: String(err) }) (include any
available snapshotId if created) to report the failure, then rethrow the error
so existing finally still stops the sandbox; reference the existing
notifyCodingAgentCallback, metadata.set, and sandbox.snapshot/sandbox.stop
locations when adding the handler.

50-55: ⚠️ Potential issue | 🟠 Major

Abort update flow when OpenClaw exits non-zero.

runOpenClawAgent returns an exit code, but this flow continues to git operations even if feedback application failed.

🧯 Suggested guard
       logStep("Running AI agent with feedback");
-      await runOpenClawAgent(sandbox, {
+      const agentResult = await runOpenClawAgent(sandbox, {
         label: "Apply feedback",
         message: `The following feedback was given on the existing changes on branch "${branch}":\n\n${feedback}\n\nPlease make the requested changes.`,
       });
+      if (agentResult.exitCode !== 0) {
+        throw new Error("OpenClaw failed while applying feedback");
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 50 - 55, runOpenClawAgent may return
a non-zero exit code but the current flow proceeds to git operations regardless;
capture its return value (e.g., const exitCode = await runOpenClawAgent(...))
and guard the rest of updatePRTask: if exitCode !== 0, log a clear error via
logStep/processLogger (include branch and exitCode) and abort the update flow by
throwing or returning early so subsequent git operations do not run; reference
the runOpenClawAgent call and the surrounding logStep("Running AI agent with
feedback") to locate where to add the check.

63-75: ⚠️ Potential issue | 🔴 Critical

Gate each git step on success and fix runGitCommand call signature.

These calls pass an extra argument and ignore the returned boolean, allowing the loop to continue after failed stage/commit/push operations.

🔒 Suggested fix
-        await runGitCommand(sandbox, ["-C", fullDir, "add", "-A"], "stage changes", fullDir);
-        await runGitCommand(
+        if (!(await runGitCommand(sandbox, ["-C", fullDir, "add", "-A"], "stage changes"))) {
+          throw new Error(`Failed to stage changes for ${pr.repo}`);
+        }
+        if (!(await runGitCommand(
           sandbox,
           ["-C", fullDir, "commit", "-m", `agent: address feedback\n\n${feedback.slice(0, 200)}`],
-          "commit feedback",
-          fullDir,
-        );
-        await runGitCommand(
+          "commit feedback",
+        ))) {
+          throw new Error(`Failed to commit changes for ${pr.repo}`);
+        }
+        if (!(await runGitCommand(
           sandbox,
           ["-C", fullDir, "push", "origin", branch],
-          "push updates",
-          fullDir,
-        );
+          "push updates",
+        ))) {
+          throw new Error(`Failed to push updates for ${pr.repo}`);
+        }
#!/bin/bash
# Verify helper signature and current call arity
rg -nP 'export\s+async\s+function\s+runGitCommand\s*\(' src/sandboxes/runGitCommand.ts
rg -nP '\brunGitCommand\s*\(' src/tasks/updatePRTask.ts
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 63 - 75, run each git step in
updatePRTask synchronously by checking runGitCommand's actual signature in
src/sandboxes/runGitCommand.ts and then 1) call runGitCommand with the correct
arity (use only the parameters the helper expects) and 2) capture its
boolean/throwing result (e.g., const ok = await runGitCommand(...)); if it
returns false or throws, stop the flow (return or throw) instead of continuing
to the next step. Apply this to the stage/commit/push calls in updatePRTask
(referencing runGitCommand, fullDir, branch, feedback) so you don't ignore
failures and you don't pass the extra fullDir argument if the helper doesn't
accept it.
src/sandboxes/createSubmodulePR.ts (2)

57-70: ⚠️ Potential issue | 🔴 Critical

Fail fast on git failures and remove invalid runGitCommand argument.

This block ignores runGitCommand’s false return and continues into PR creation. It also passes an extra argument not present in the helper signature.

🛠️ Suggested fix
-  await runGitCommand(sandbox, ["-C", subDir, "checkout", "-b", branch], "create branch", subDir);
-  await runGitCommand(sandbox, ["-C", subDir, "add", "-A"], "stage changes", subDir);
-  await runGitCommand(
+  if (!(await runGitCommand(sandbox, ["-C", subDir, "checkout", "-b", branch], "create branch"))) {
+    return null;
+  }
+  if (!(await runGitCommand(sandbox, ["-C", subDir, "add", "-A"], "stage changes"))) {
+    return null;
+  }
+  if (!(await runGitCommand(
     sandbox,
     ["-C", subDir, "commit", "-m", `agent: ${prompt.slice(0, 72)}`],
-    "commit changes",
-    subDir,
-  );
-  await runGitCommand(
+    "commit changes",
+  ))) {
+    return null;
+  }
+  if (!(await runGitCommand(
     sandbox,
     ["-C", subDir, "push", "-u", "origin", branch],
-    "push branch",
-    subDir,
-  );
+    "push branch",
+  ))) {
+    return null;
+  }
#!/bin/bash
# Verify runGitCommand signature and call arity in this file
rg -nP 'export\s+async\s+function\s+runGitCommand\s*\(' src/sandboxes/runGitCommand.ts
rg -nP '\brunGitCommand\s*\(' src/sandboxes/createSubmodulePR.ts
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/createSubmodulePR.ts` around lines 57 - 70, The git sequence in
createSubmodulePR calls runGitCommand (used in checkout, add, commit, push) with
an extra trailing subDir argument and ignores its boolean return; update those
calls to match the runGitCommand signature (remove the extra subDir parameter)
and immediately fail-fast when runGitCommand returns false (e.g., throw or
return early) for the checkout, add, commit, and push steps so the PR creation
does not proceed on git failures; key symbols to edit: runGitCommand calls in
createSubmodulePR (the checkout branch, add -A, commit, and push -u origin
branch invocations) and ensure the function signature in runGitCommand is
respected.

92-97: ⚠️ Potential issue | 🟡 Minor

Validate parsed PR number before returning success.

If gh pr create output shape changes, prNumber can be invalid and still be returned as a successful PR.

🛡️ Suggested guard
   const prUrl = ((await prResult.stdout()) || "").trim();
   const prNumber = parseInt(prUrl.split("/").pop() || "0", 10);
+  if (!Number.isFinite(prNumber) || prNumber <= 0) {
+    logger.error(`Failed to parse PR number for ${submodule}`, { prUrl });
+    return null;
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/createSubmodulePR.ts` around lines 92 - 97, The parsed PR
number from prResult.stdout() (prUrl → prNumber) must be validated before
returning; update the createSubmodulePR logic to check that prNumber is a finite
positive integer (e.g., Number.isInteger(prNumber) && prNumber > 0) and that
prUrl is non-empty and looks like a PR URL, and if validation fails log the full
prResult.stdout() and throw an error (or return a failed result) instead of
returning { repo, number: prNumber, url: prUrl, baseBranch }; use the symbols
prResult, prUrl, prNumber and the return site in createSubmodulePR to locate and
implement this guard.
src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts (1)

1-16: ⚠️ Potential issue | 🟡 Minor

Restore global and env state after each test.

This suite mutates process.env and stubs fetch without cleanup, which can leak state into other suites.

🧪 Suggested isolation fix
-import { describe, it, expect, vi, beforeEach } from "vitest";
+import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
@@
 const mockFetch = vi.fn();
 vi.stubGlobal("fetch", mockFetch);
+let previousCallbackUrl: string | undefined;
+let previousCallbackSecret: string | undefined;
@@
 beforeEach(() => {
   vi.clearAllMocks();
+  previousCallbackUrl = process.env.CODING_AGENT_CALLBACK_URL;
+  previousCallbackSecret = process.env.CODING_AGENT_CALLBACK_SECRET;
   process.env.CODING_AGENT_CALLBACK_URL = "https://recoup-api.vercel.app/api/coding-agent/callback";
   process.env.CODING_AGENT_CALLBACK_SECRET = "test-secret";
 });
+
+afterEach(() => {
+  vi.unstubAllGlobals();
+  if (previousCallbackUrl === undefined) delete process.env.CODING_AGENT_CALLBACK_URL;
+  else process.env.CODING_AGENT_CALLBACK_URL = previousCallbackUrl;
+  if (previousCallbackSecret === undefined) delete process.env.CODING_AGENT_CALLBACK_SECRET;
+  else process.env.CODING_AGENT_CALLBACK_SECRET = previousCallbackSecret;
+});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts` around lines 1 -
16, Save the original global.fetch and process.env before stubbing and then
restore them in an afterEach to avoid leaking state: capture const
ORIGINAL_FETCH = global.fetch and const ORIGINAL_ENV = { ...process.env } before
calling vi.stubGlobal("fetch", mockFetch) and mutating process.env, then add an
afterEach that does global.fetch = ORIGINAL_FETCH, process.env = {
...ORIGINAL_ENV }, and vi.clearAllMocks()/vi.resetAllMocks() to clean mocks;
reference the existing mockFetch, vi.stubGlobal, process.env assignments and the
notifyCodingAgentCallback test setup when making these changes.
🧹 Nitpick comments (3)
src/sandboxes/__tests__/getSandboxHomeDir.test.ts (1)

12-48: Add tests for rejected command execution paths.

Please add coverage for runCommand rejection and stdout() rejection so fallback behavior is enforced under real failure conditions.

Suggested test additions
 describe("getSandboxHomeDir", () => {
+  it("falls back to /root when runCommand throws", async () => {
+    const sandbox = {
+      runCommand: vi.fn().mockRejectedValue(new Error("command failed")),
+    } as any;
+
+    await expect(getSandboxHomeDir(sandbox)).resolves.toBe("/root");
+  });
+
+  it("falls back to /root when stdout throws", async () => {
+    const sandbox = {
+      runCommand: vi.fn().mockResolvedValue({
+        stdout: async () => {
+          throw new Error("stdout failed");
+        },
+      }),
+    } as any;
+
+    await expect(getSandboxHomeDir(sandbox)).resolves.toBe("/root");
+  });
+
   it("returns the home directory from the sandbox", async () => {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/getSandboxHomeDir.test.ts` around lines 12 - 48, Add
two tests for getSandboxHomeDir to cover rejected execution paths: one where
sandbox.runCommand rejects (simulate createMockSandbox to return a runCommand
that rejects) and assert getSandboxHomeDir returns "/root" and that runCommand
was invoked with { cmd: "sh", args: ["-c", "echo ~"] }; and another where
runCommand resolves but the returned object's stdout() method rejects (simulate
stdout throwing) and assert getSandboxHomeDir returns "/root" in that case as
well. Use the same test structure and helpers (createMockSandbox,
getSandboxHomeDir, sandbox.runCommand, stdout) so these failure scenarios are
explicitly exercised.
src/sandboxes/__tests__/detectChangedSubmodules.test.ts (1)

18-79: Add failure-path tests for command errors.

Current coverage only validates happy paths. Add cases where ls or git status returns non-zero so regressions don’t silently reintroduce false no_changes behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/detectChangedSubmodules.test.ts` around lines 18 -
79, Add tests for detectChangedSubmodules that simulate failing commands by
making the mock sandbox.runCommand return non-zero exitCode for the initial "ls"
call and for a per-submodule "git status" call; use createMockSandbox and
mockResolvedValueOnce to return exitCode !== 0 and appropriate stderr text, then
assert detectChangedSubmodules(sandbox, ...) rejects/throws (or returns an error
indicator your implementation uses) instead of silently returning []—do this for
both the ls failure path and a git status failure path so regressions that treat
command errors as "no changes" are caught.
src/tasks/__tests__/codingAgentTask.test.ts (1)

82-136: Add regression tests for failed-agent and failed-PR branches.

Please add cases for:

  1. runOpenClawAgent returning non-zero exit code (expect status: "failed"),
  2. detectChangedSubmodules non-empty + all createSubmodulePR returning null (expect failure status, not no_changes).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/__tests__/codingAgentTask.test.ts` around lines 82 - 136, Add two
new tests in codingAgentTask.test.ts: one that mocks runOpenClawAgent to return
a non-zero exit (e.g., mockResolvedValueOnce({ exitCode: 1 }) or whatever the
real return shape is) and asserts notifyCodingAgentCallback is called with
expect.objectContaining({ status: "failed" }); and another that mocks
detectChangedSubmodules to return a non-empty array and mocks createSubmodulePR
to return null for each submodule, then run mockRun(basePayload) and assert
notifyCodingAgentCallback was called with expect.objectContaining({ status:
"failed" }) instead of "no_changes". Use the same import/mocking pattern as the
other tests (vi.mocked(...).mockResolvedValueOnce(...)) and reference
runOpenClawAgent, detectChangedSubmodules, createSubmodulePR, and
notifyCodingAgentCallback to locate the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sandboxes/getSandboxHomeDir.ts`:
- Around line 9-11: The getSandboxHomeDir function currently calls
sandbox.runCommand(...).stdout() which can throw and cause the function to
reject instead of returning the intended fallback; wrap the command invocation
and stdout access in a try/catch inside getSandboxHomeDir, call
sandbox.runCommand({ cmd: "sh", args: ["-c", "echo ~"] }) and await
result.stdout() inside the try, trim and return the stdout if present, and in
the catch (or on empty result) return the fallback string "/root" so any error
during runCommand or stdout() yields the fallback as promised.

In `@src/schemas/updatePRSchema.ts`:
- Around line 3-8: The prSchema is too permissive and must validate inputs
before they hit git path logic: tighten prSchema (the exported const prSchema)
so repo is constrained to a GitHub "owner/repo" pattern (e.g., regex enforcing
no path traversal chars and exactly two segments), number is a positive integer
(z.number().int().positive()), and url is validated as a proper GitHub PR URL
(or at minimum a well-formed URL starting with "https://github.com/" and
matching /pull/ or /pulls/); also ensure baseBranch is a non-empty string.
Implement these checks using Zod refinements or regex checks on prSchema so
malformed values are rejected at parse time before updatePRTask.ts uses them.

In `@src/tasks/__tests__/updatePRTask.test.ts`:
- Around line 114-119: The test asserts runGitCommand was called with four
arguments but the helper uses a 3-argument contract; update the assertion in
updatePRTask.test.ts to call
expect(runGitCommand).toHaveBeenCalledWith(expect.anything(),
expect.arrayContaining(["push"]), expect.any(String)) — remove the fourth
expect.any(String) argument so the assertion matches the runGitCommand
3-argument signature used by the code under test.

---

Duplicate comments:
In `@src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts`:
- Around line 92-103: The test asserts that GITHUB_TOKEN appears directly in the
git clone URL (in cloneRecoupMonorepo via sandbox.runCommand), which enshrines
insecure token-in-URL behavior; update the test to assert that an authenticated
clone is attempted without checking for the raw token in the URL—e.g., verify
the runCommand invocation uses a credential helper or HTTPS auth mechanism
(check that the clone command contains an authentication indicator such as a
configured credential helper or header flag) or assert that the URL is
sanitized/obscured rather than containing "ghp_test123", and update references
to createMockSandbox/sandbox.runCommand accordingly to look for the secure
behavior instead of the literal token.

In `@src/sandboxes/__tests__/createSubmodulePR.test.ts`:
- Around line 68-73: The test's assertion for runGitCommand uses a 4-argument
call but the actual function runGitCommand accepts 3 parameters, so update the
test in createSubmodulePR.test.ts to match the real contract: change the
expect(runGitCommand).toHaveBeenCalledWith(...) to expect three arguments (e.g.,
expect(runGitCommand).toHaveBeenCalledWith(sandbox,
expect.arrayContaining(["checkout", "-b", "agent/fix-123"]),
expect.any(String))) or otherwise use a looser matcher like expect.anything()
for the third arg; ensure the assertion references the runGitCommand symbol and
the same checkout branch array used now so the test validates the real
signature.

In `@src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts`:
- Around line 1-16: Save the original global.fetch and process.env before
stubbing and then restore them in an afterEach to avoid leaking state: capture
const ORIGINAL_FETCH = global.fetch and const ORIGINAL_ENV = { ...process.env }
before calling vi.stubGlobal("fetch", mockFetch) and mutating process.env, then
add an afterEach that does global.fetch = ORIGINAL_FETCH, process.env = {
...ORIGINAL_ENV }, and vi.clearAllMocks()/vi.resetAllMocks() to clean mocks;
reference the existing mockFetch, vi.stubGlobal, process.env assignments and the
notifyCodingAgentCallback test setup when making these changes.

In `@src/sandboxes/cloneRecoupMonorepo.ts`:
- Around line 42-63: The git config commands run via runGitCommand in
cloneRecoupMonorepo (the calls that set user.email, user.name and the URL
rewrite using cloneDir, sandbox and githubToken) currently ignore failures;
update each call to check the returned result (or await and validate
success/exit code/throw) and if any command fails, log the error/context and
propagate the failure (throw or return an error) so the function does not report
a successful clone path with broken git setup; ensure you reference
runGitCommand's return shape and handle its rejection or non-zero result
consistently for all three invocations.
- Around line 26-33: The code currently embeds the GitHub token in authUrl and
passes it to sandbox.runCommand, which can leak secrets; instead build the repo
URL without the token and pass the token via git's http.extraheader so it is not
stored in args or git config. Replace usage of authUrl and the git clone call in
cloneRecoupMonorepo (the authUrl variable and the sandbox.runCommand call that
invokes "git clone --recurse-submodules") to use the plain
https://github.com/... URL and add a git config flag like -c
http.extraheader="AUTHORIZATION: bearer <githubToken>" (or set GIT_ASKPASS to a
helper) in the sandbox.runCommand invocation so the token is provided at request
time without being embedded in the URL or persisted; do the same change for the
later git commands referenced around lines 55-60 that currently inject the token
into args/config.

In `@src/sandboxes/createSubmodulePR.ts`:
- Around line 57-70: The git sequence in createSubmodulePR calls runGitCommand
(used in checkout, add, commit, push) with an extra trailing subDir argument and
ignores its boolean return; update those calls to match the runGitCommand
signature (remove the extra subDir parameter) and immediately fail-fast when
runGitCommand returns false (e.g., throw or return early) for the checkout, add,
commit, and push steps so the PR creation does not proceed on git failures; key
symbols to edit: runGitCommand calls in createSubmodulePR (the checkout branch,
add -A, commit, and push -u origin branch invocations) and ensure the function
signature in runGitCommand is respected.
- Around line 92-97: The parsed PR number from prResult.stdout() (prUrl →
prNumber) must be validated before returning; update the createSubmodulePR logic
to check that prNumber is a finite positive integer (e.g.,
Number.isInteger(prNumber) && prNumber > 0) and that prUrl is non-empty and
looks like a PR URL, and if validation fails log the full prResult.stdout() and
throw an error (or return a failed result) instead of returning { repo, number:
prNumber, url: prUrl, baseBranch }; use the symbols prResult, prUrl, prNumber
and the return site in createSubmodulePR to locate and implement this guard.

In `@src/sandboxes/detectChangedSubmodules.ts`:
- Around line 17-23: The current calls to sandbox.runCommand in
detectChangedSubmodules.ts (e.g., the lsResult and gitStatusResult invocations)
ignore non-zero exit codes and treat failures as "no_changes"; update the
handling to explicitly check the runCommand result status (or thrown error) and
propagate/return an error instead of defaulting to unchanged: detect the
non-zero exit or missing stdout for both the ls command (lsResult) and the git
status command (gitStatusResult), log or attach the underlying error/message via
processLogger or the function's error path, and throw or return a failure result
so callers can distinguish a command failure from an actual "no_changes" result.

In `@src/sandboxes/notifyCodingAgentCallback.ts`:
- Around line 21-27: The code currently only validates CODING_AGENT_CALLBACK_URL
but still reads and may send an empty CODING_AGENT_CALLBACK_SECRET; update the
startup/dispatch logic in notifyCodingAgentCallback (or the function that reads
const url and const secret) to require both url and secret: if secret is falsy,
log an error like "Missing CODING_AGENT_CALLBACK_SECRET environment variable"
and return without dispatching the request so you never send an empty secret;
ensure the secret variable used when building the callback
request/payload/headers is always present before making the HTTP call.

In `@src/tasks/codingAgentTask.ts`:
- Around line 62-66: The call to runOpenClawAgent currently ignores failures so
the workflow continues even if the coding agent fails; after awaiting
runOpenClawAgent, check the returned agentResult (e.g., agentResult.exitCode or
agentResult.success/agentResult.error) and if it indicates a non-zero exit or
failure, immediately fail-fast by throwing an Error (or returning a rejected
Promise) with a clear message that includes the returned error/exit details;
update the code around runOpenClawAgent in codingAgentTask (and any surrounding
handler) to perform this guard and stop further steps when the agent failed.
- Around line 47-48: The gh install step using logStep("Installing gh CLI") and
sandbox.runCommand(...) must check the command outcome and fail fast; wrap the
sandbox.runCommand call for installing gh in a try/catch or inspect its
result/exit code, and if the install fails use processLogger or the existing
logger to emit a clear error (including stderr/output) and throw or return a
failing error so downstream PR creation does not run; update the code around
logStep("Installing gh CLI") and the sandbox.runCommand invocation to perform
this validation and fail fast with a descriptive message on error.
- Around line 86-111: The current logic maps a zero-length prs array to
"no_changes", which hides failures when changedSubmodules existed; update the
status computation before calling notifyCodingAgentCallback to distinguish three
cases: if prs.length > 0 set status to "pr_created", else if
changedSubmodules.length > 0 set status to "pr_failed" (or similar failure
status), else set status to "no_changes". Locate the loop that calls
createSubmodulePR and builds prs, then adjust the status field in the
notifyCodingAgentCallback call (referencing changedSubmodules, prs,
createSubmodulePR, and notifyCodingAgentCallback) so operational failures are
reported separately from no-change situations.
- Around line 42-119: Wrap the existing try/finally in a try/catch/finally so
any unexpected exception sends a failed callback via notifyCodingAgentCallback
before rethrowing; in the catch block capture the thrown error (error.message
and error.stack) and include threadId: callbackThreadId, status: "failed", and a
descriptive message plus any available agent output (use agentResult?.stdout and
agentResult?.stderr if defined) so callers always receive a terminal status;
keep the existing finally that logs and calls sandbox.stop().

In `@src/tasks/customerPromptTask.ts`:
- Around line 60-61: The await tags.add(`account:${accountId}`) call in
customerPromptTask.ts is currently on the critical path and can abort the
customer prompt flow if it fails; make this tagging best-effort by moving it out
of the critical path—either wrap the tags.add(...) call in a try/catch inside
the customerPromptTask and log any error without rethrowing, or fire-and-forget
the Promise (tags.add(...).catch(err => processLogger.warn(...))) so failures
don’t block subsequent chat generation; reference the tags.add call and the
accountId interpolation when applying the change.

In `@src/tasks/runSandboxCommandTask.ts`:
- Line 32: The call to tags.add(`account:${accountId}`) in runSandboxCommandTask
can throw and currently aborts execution; wrap that tags.add call in a try/catch
inside runSandboxCommandTask, log the error (using the existing logger or
processLogger) and continue without rethrowing so sandbox command execution
proceeds even if tag attachment fails; keep the rest of runSandboxCommandTask
unchanged and ensure the catch is narrow (only around tags.add) to avoid
swallowing unrelated errors.

In `@src/tasks/updatePRTask.ts`:
- Around line 42-94: The try/finally in updatePRTask.ts ensures sandbox.stop but
lacks a catch to notify the callback on errors; add a catch block around the
main try body (or convert to try/catch/finally) that catches any exception, sets
metadata.set("currentStep","Failed"), calls notifyCodingAgentCallback({
threadId: callbackThreadId, status: "failed", error: String(err) }) (include any
available snapshotId if created) to report the failure, then rethrow the error
so existing finally still stops the sandbox; reference the existing
notifyCodingAgentCallback, metadata.set, and sandbox.snapshot/sandbox.stop
locations when adding the handler.
- Around line 50-55: runOpenClawAgent may return a non-zero exit code but the
current flow proceeds to git operations regardless; capture its return value
(e.g., const exitCode = await runOpenClawAgent(...)) and guard the rest of
updatePRTask: if exitCode !== 0, log a clear error via logStep/processLogger
(include branch and exitCode) and abort the update flow by throwing or returning
early so subsequent git operations do not run; reference the runOpenClawAgent
call and the surrounding logStep("Running AI agent with feedback") to locate
where to add the check.
- Around line 63-75: run each git step in updatePRTask synchronously by checking
runGitCommand's actual signature in src/sandboxes/runGitCommand.ts and then 1)
call runGitCommand with the correct arity (use only the parameters the helper
expects) and 2) capture its boolean/throwing result (e.g., const ok = await
runGitCommand(...)); if it returns false or throws, stop the flow (return or
throw) instead of continuing to the next step. Apply this to the
stage/commit/push calls in updatePRTask (referencing runGitCommand, fullDir,
branch, feedback) so you don't ignore failures and you don't pass the extra
fullDir argument if the helper doesn't accept it.

---

Nitpick comments:
In `@src/sandboxes/__tests__/detectChangedSubmodules.test.ts`:
- Around line 18-79: Add tests for detectChangedSubmodules that simulate failing
commands by making the mock sandbox.runCommand return non-zero exitCode for the
initial "ls" call and for a per-submodule "git status" call; use
createMockSandbox and mockResolvedValueOnce to return exitCode !== 0 and
appropriate stderr text, then assert detectChangedSubmodules(sandbox, ...)
rejects/throws (or returns an error indicator your implementation uses) instead
of silently returning []—do this for both the ls failure path and a git status
failure path so regressions that treat command errors as "no changes" are
caught.

In `@src/sandboxes/__tests__/getSandboxHomeDir.test.ts`:
- Around line 12-48: Add two tests for getSandboxHomeDir to cover rejected
execution paths: one where sandbox.runCommand rejects (simulate
createMockSandbox to return a runCommand that rejects) and assert
getSandboxHomeDir returns "/root" and that runCommand was invoked with { cmd:
"sh", args: ["-c", "echo ~"] }; and another where runCommand resolves but the
returned object's stdout() method rejects (simulate stdout throwing) and assert
getSandboxHomeDir returns "/root" in that case as well. Use the same test
structure and helpers (createMockSandbox, getSandboxHomeDir, sandbox.runCommand,
stdout) so these failure scenarios are explicitly exercised.

In `@src/tasks/__tests__/codingAgentTask.test.ts`:
- Around line 82-136: Add two new tests in codingAgentTask.test.ts: one that
mocks runOpenClawAgent to return a non-zero exit (e.g., mockResolvedValueOnce({
exitCode: 1 }) or whatever the real return shape is) and asserts
notifyCodingAgentCallback is called with expect.objectContaining({ status:
"failed" }); and another that mocks detectChangedSubmodules to return a
non-empty array and mocks createSubmodulePR to return null for each submodule,
then run mockRun(basePayload) and assert notifyCodingAgentCallback was called
with expect.objectContaining({ status: "failed" }) instead of "no_changes". Use
the same import/mocking pattern as the other tests
(vi.mocked(...).mockResolvedValueOnce(...)) and reference runOpenClawAgent,
detectChangedSubmodules, createSubmodulePR, and notifyCodingAgentCallback to
locate the code.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 172d971 and 625b593.

📒 Files selected for processing (18)
  • src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts
  • src/sandboxes/__tests__/createSubmodulePR.test.ts
  • src/sandboxes/__tests__/detectChangedSubmodules.test.ts
  • src/sandboxes/__tests__/getSandboxHomeDir.test.ts
  • src/sandboxes/__tests__/notifyCodingAgentCallback.test.ts
  • src/sandboxes/cloneRecoupMonorepo.ts
  • src/sandboxes/createSubmodulePR.ts
  • src/sandboxes/detectChangedSubmodules.ts
  • src/sandboxes/getSandboxHomeDir.ts
  • src/sandboxes/notifyCodingAgentCallback.ts
  • src/schemas/codingAgentSchema.ts
  • src/schemas/updatePRSchema.ts
  • src/tasks/__tests__/codingAgentTask.test.ts
  • src/tasks/__tests__/updatePRTask.test.ts
  • src/tasks/codingAgentTask.ts
  • src/tasks/customerPromptTask.ts
  • src/tasks/runSandboxCommandTask.ts
  • src/tasks/updatePRTask.ts

Comment on lines +9 to +11
export async function getSandboxHomeDir(sandbox: Sandbox): Promise<string> {
const result = await sandbox.runCommand({ cmd: "sh", args: ["-c", "echo ~"] });
return ((await result.stdout()) || "").trim() || "/root";
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle command failures to honor the fallback contract.

If runCommand (or stdout()) throws, this function rejects instead of returning "/root", despite the stated fallback behavior. Please guard the command path with try/catch.

Proposed fix
 export async function getSandboxHomeDir(sandbox: Sandbox): Promise<string> {
-  const result = await sandbox.runCommand({ cmd: "sh", args: ["-c", "echo ~"] });
-  return ((await result.stdout()) || "").trim() || "/root";
+  try {
+    const result = await sandbox.runCommand({ cmd: "sh", args: ["-c", "echo ~"] });
+    return ((await result.stdout()) || "").trim() || "/root";
+  } catch {
+    return "/root";
+  }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export async function getSandboxHomeDir(sandbox: Sandbox): Promise<string> {
const result = await sandbox.runCommand({ cmd: "sh", args: ["-c", "echo ~"] });
return ((await result.stdout()) || "").trim() || "/root";
export async function getSandboxHomeDir(sandbox: Sandbox): Promise<string> {
try {
const result = await sandbox.runCommand({ cmd: "sh", args: ["-c", "echo ~"] });
return ((await result.stdout()) || "").trim() || "/root";
} catch {
return "/root";
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/getSandboxHomeDir.ts` around lines 9 - 11, The
getSandboxHomeDir function currently calls sandbox.runCommand(...).stdout()
which can throw and cause the function to reject instead of returning the
intended fallback; wrap the command invocation and stdout access in a try/catch
inside getSandboxHomeDir, call sandbox.runCommand({ cmd: "sh", args: ["-c",
"echo ~"] }) and await result.stdout() inside the try, trim and return the
stdout if present, and in the catch (or on empty result) return the fallback
string "/root" so any error during runCommand or stdout() yields the fallback as
promised.

Comment on lines +3 to +8
const prSchema = z.object({
repo: z.string(),
number: z.number(),
url: z.string(),
baseBranch: z.string(),
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Harden prSchema validation before values flow into git path logic.

repo, number, and url are too permissive. In src/tasks/updatePRTask.ts, Line [58]-Line [61], repo is transformed into a local directory target; malformed values should be rejected at schema boundary.

✅ Suggested schema tightening
 const prSchema = z.object({
-  repo: z.string(),
-  number: z.number(),
-  url: z.string(),
-  baseBranch: z.string(),
-});
+  repo: z.enum([
+    "recoupable/recoup-api",
+    "recoupable/chat",
+    "recoupable/tasks",
+    "recoupable/docs",
+    "recoupable/database",
+    "recoupable/remotion",
+    "recoupable/bash",
+    "recoupable/skills",
+    "recoupable/cli",
+  ]),
+  number: z.number().int().positive(),
+  url: z.string().url("url must be a valid URL"),
+  baseBranch: z.string().min(1, "baseBranch is required"),
+}).strict();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const prSchema = z.object({
repo: z.string(),
number: z.number(),
url: z.string(),
baseBranch: z.string(),
});
const prSchema = z.object({
repo: z.enum([
"recoupable/recoup-api",
"recoupable/chat",
"recoupable/tasks",
"recoupable/docs",
"recoupable/database",
"recoupable/remotion",
"recoupable/bash",
"recoupable/skills",
"recoupable/cli",
]),
number: z.number().int().positive(),
url: z.string().url("url must be a valid URL"),
baseBranch: z.string().min(1, "baseBranch is required"),
}).strict();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/schemas/updatePRSchema.ts` around lines 3 - 8, The prSchema is too
permissive and must validate inputs before they hit git path logic: tighten
prSchema (the exported const prSchema) so repo is constrained to a GitHub
"owner/repo" pattern (e.g., regex enforcing no path traversal chars and exactly
two segments), number is a positive integer (z.number().int().positive()), and
url is validated as a proper GitHub PR URL (or at minimum a well-formed URL
starting with "https://github.com/" and matching /pull/ or /pulls/); also ensure
baseBranch is a non-empty string. Implement these checks using Zod refinements
or regex checks on prSchema so malformed values are rejected at parse time
before updatePRTask.ts uses them.

sweetmantech and others added 2 commits March 3, 2026 16:50
Move global URL rewrite config (HTTPS + SSH) before `git clone
--recurse-submodules` so nested submodules like skills/songwriting
can authenticate. Previously the rewrite was set after clone, causing
"could not read Username" errors on nested submodule fetches.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…utils

Extract getGitHubAuthPrefix() utility to eliminate duplicated
`https://x-access-token:${token}@github.com/` construction in
cloneRecoupMonorepo, addOrgSubmodules, and ensureGithubRepo.

Replace inline `echo ~` calls in addOrgSubmodules and pushOrgRepos
with the shared getSandboxHomeDir utility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sandboxes/ensureGithubRepo.ts (1)

77-123: ⚠️ Potential issue | 🟠 Major

Add SSH URL rewrite for account repo submodules.

Account repositories can contain submodules with SSH URLs (e.g., git@github.com:org/repo). The current code only rewrites HTTPS URLs, causing submodule initialization to fail when SSH URLs are present in .gitmodules. Add an SSH URL rewrite rule alongside the existing HTTPS one, matching the pattern used in cloneRecoupMonorepo.ts.

🔧 Proposed fix
       // Set up URL rewriting so submodule clones use auth
       await sandbox.runCommand({
         cmd: "git",
         args: [
           "config",
           `url.${authPrefix}.insteadOf`,
           "https://github.com/",
         ],
       });
+
+      await sandbox.runCommand({
+        cmd: "git",
+        args: [
+          "config",
+          `url.${authPrefix}.insteadOf`,
+          "git@github.com:",
+        ],
+      });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/ensureGithubRepo.ts` around lines 77 - 123, In
ensureGithubRepo.ts, after setting the HTTPS URL rewrite with sandbox.runCommand
(the git config call that uses `url.${authPrefix}.insteadOf
"https://github.com/"`), also add the SSH rewrite rule used in
cloneRecoupMonorepo.ts so submodules using `git@github.com:` are rewritten to
the same authPrefix; i.e., run an additional git config command to map
`git@github.com:` (or the equivalent SSH pattern used in cloneRecoupMonorepo.ts)
to `authPrefix` so submodule clones work; update the
runGitCommand/sandbox.runCommand call sequence near the checkout/main logic to
include this second rewrite immediately after the HTTPS rewrite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sandboxes/git/addOrgSubmodules.ts`:
- Line 74: The code builds authedUrl using
remoteUrl.replace("https://github.com/", authPrefix) which fails for SSH-style
remotes (git@github.com:owner/repo.git); update the logic that produces
authedUrl (use the same code block handling remoteUrl/authPrefix in
addOrgSubmodules) to detect both HTTPS and SSH GitHub origins and normalize SSH
form to an HTTPS URL with the embedded authPrefix before replacing—i.e., if
remoteUrl startsWith "git@github.com:" convert "git@github.com:owner/repo.git"
to "https://github.com/owner/repo.git" (then apply authPrefix replacement), and
add a unit test case for an SSH-style remote to ensure the tokenized URL is
produced correctly.

---

Outside diff comments:
In `@src/sandboxes/ensureGithubRepo.ts`:
- Around line 77-123: In ensureGithubRepo.ts, after setting the HTTPS URL
rewrite with sandbox.runCommand (the git config call that uses
`url.${authPrefix}.insteadOf "https://github.com/"`), also add the SSH rewrite
rule used in cloneRecoupMonorepo.ts so submodules using `git@github.com:` are
rewritten to the same authPrefix; i.e., run an additional git config command to
map `git@github.com:` (or the equivalent SSH pattern used in
cloneRecoupMonorepo.ts) to `authPrefix` so submodule clones work; update the
runGitCommand/sandbox.runCommand call sequence near the checkout/main logic to
include this second rewrite immediately after the HTTPS rewrite.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 625b593 and 2f10b5a.

📒 Files selected for processing (9)
  • src/sandboxes/__tests__/addOrgSubmodules.test.ts
  • src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts
  • src/sandboxes/__tests__/getGitHubAuthPrefix.test.ts
  • src/sandboxes/__tests__/pushOrgRepos.test.ts
  • src/sandboxes/cloneRecoupMonorepo.ts
  • src/sandboxes/ensureGithubRepo.ts
  • src/sandboxes/getGitHubAuthPrefix.ts
  • src/sandboxes/git/addOrgSubmodules.ts
  • src/sandboxes/git/pushOrgRepos.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/sandboxes/cloneRecoupMonorepo.ts

sweetmantech and others added 3 commits March 3, 2026 17:02
Replace `git clone --recurse-submodules` with a two-step approach:
1. Clone without submodules
2. Set URL rewriting in the repo's LOCAL config (not --global)
3. Run `git submodule update --init --recursive` separately

The --global git config was not propagating to submodule child
processes in Vercel Sandboxes, causing "could not read Username"
errors on nested submodules like skills/songwriting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Only init top-level submodules (api, chat, tasks, etc.) — nested
submodules inside skills/ are left to the agent if needed, following
the same delegation pattern used for org submodules. This avoids
git config propagation issues in Vercel Sandboxes where the parent
repo's insteadOf config doesn't reach nested submodule processes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- cloneRecoupMonorepo now returns null if URL rewriting or submodule
  init fails (previously silently continued with a broken clone)
- detectChangedSubmodules now logs "No changed submodules detected"
  when the list is empty, instead of the misleading
  "Changed submodules detected" regardless of result

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts (3)

111-122: Align assertion coverage with the test name intent.

This test says “Recoup Agent” but currently only asserts user.email. Add a user.name assertion too.

🛠️ Suggested patch
   const emailCall = sandbox.runCommand.mock.calls.find((call: any[]) =>
     call[0]?.args?.includes("user.email"),
   );
+  const nameCall = sandbox.runCommand.mock.calls.find((call: any[]) =>
+    call[0]?.args?.includes("user.name"),
+  );
   expect(emailCall).toBeDefined();
   expect(emailCall![0].args).toContain("agent@recoupable.com");
+  expect(nameCall).toBeDefined();
+  expect(nameCall![0].args).toContain("Recoup Agent");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts` around lines 111 - 122,
The test "configures git user as Recoup Agent" only asserts user.email; update
the test that calls cloneRecoupMonorepo to also assert git user.name is set to
"Recoup Agent": after locating the runCommand mock call for "user.email" (using
sandbox.runCommand.mock.calls and call[0].args.includes("user.email")), add a
similar lookup/assert that a call includes "user.name" and that its args contain
"Recoup Agent" (reference the test function and cloneRecoupMonorepo invocation
to find where to add this assertion).

84-100: Strengthen this test to avoid empty-match false positives.

If rewriteCalls is empty, the loop never runs and this test can pass incorrectly. Assert expected rewrite call presence first.

✅ Suggested patch
   const rewriteCalls = calls.filter(
     (c: any) =>
       c.cmd === "git" &&
       c.args?.some((a: string) => a.includes("insteadOf")),
   );
+  expect(rewriteCalls.length).toBeGreaterThanOrEqual(2);
   for (const call of rewriteCalls) {
     expect(call.args).not.toContain("--global");
     expect(call.args).toContain("-C");
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts` around lines 84 - 100,
The test currently can pass if no rewrite calls are found because the loop is
skipped; before iterating over rewriteCalls in the "uses repo-local config..."
test, assert that rewriteCalls is non-empty (e.g.,
expect(rewriteCalls.length).toBeGreaterThan(0) or
expect(rewriteCalls).not.toHaveLength(0)) so the test fails when
cloneRecoupMonorepo or createMockSandbox produces no git insteadOf rewrite
invocation; keep the rest of the checks (no "--global" and presence of "-C")
unchanged.

1-1: Restore process.env.GITHUB_TOKEN after each test for isolation.

beforeEach overrides env state but it is never restored, which can leak into other suites running in the same worker.

🧪 Suggested patch
-import { describe, it, expect, vi, beforeEach } from "vitest";
+import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
+
+const originalGitHubToken = process.env.GITHUB_TOKEN;
@@
 beforeEach(() => {
   vi.clearAllMocks();
   process.env.GITHUB_TOKEN = "ghp_test123";
 });
+
+afterEach(() => {
+  if (originalGitHubToken === undefined) {
+    delete process.env.GITHUB_TOKEN;
+  } else {
+    process.env.GITHUB_TOKEN = originalGitHubToken;
+  }
+});

Also applies to: 34-37

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts` at line 1, The test
suite's beforeEach mutates process.env.GITHUB_TOKEN without restoring it, so
capture the original value before tests run and restore it in an afterEach to
avoid leaking into other suites; update the test file (referencing beforeEach
and process.env.GITHUB_TOKEN) to save the original token into a local variable
and reset process.env.GITHUB_TOKEN (or delete it if originally undefined) in an
afterEach cleanup block (also apply the same fix where similar setup occurs
around lines 34-37).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts`:
- Around line 111-122: The test "configures git user as Recoup Agent" only
asserts user.email; update the test that calls cloneRecoupMonorepo to also
assert git user.name is set to "Recoup Agent": after locating the runCommand
mock call for "user.email" (using sandbox.runCommand.mock.calls and
call[0].args.includes("user.email")), add a similar lookup/assert that a call
includes "user.name" and that its args contain "Recoup Agent" (reference the
test function and cloneRecoupMonorepo invocation to find where to add this
assertion).
- Around line 84-100: The test currently can pass if no rewrite calls are found
because the loop is skipped; before iterating over rewriteCalls in the "uses
repo-local config..." test, assert that rewriteCalls is non-empty (e.g.,
expect(rewriteCalls.length).toBeGreaterThan(0) or
expect(rewriteCalls).not.toHaveLength(0)) so the test fails when
cloneRecoupMonorepo or createMockSandbox produces no git insteadOf rewrite
invocation; keep the rest of the checks (no "--global" and presence of "-C")
unchanged.
- Line 1: The test suite's beforeEach mutates process.env.GITHUB_TOKEN without
restoring it, so capture the original value before tests run and restore it in
an afterEach to avoid leaking into other suites; update the test file
(referencing beforeEach and process.env.GITHUB_TOKEN) to save the original token
into a local variable and reset process.env.GITHUB_TOKEN (or delete it if
originally undefined) in an afterEach cleanup block (also apply the same fix
where similar setup occurs around lines 34-37).

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f10b5a and 582c587.

📒 Files selected for processing (4)
  • src/sandboxes/__tests__/cloneRecoupMonorepo.test.ts
  • src/sandboxes/__tests__/detectChangedSubmodules.test.ts
  • src/sandboxes/cloneRecoupMonorepo.ts
  • src/sandboxes/detectChangedSubmodules.ts
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/sandboxes/detectChangedSubmodules.ts
  • src/sandboxes/cloneRecoupMonorepo.ts
  • src/sandboxes/tests/detectChangedSubmodules.test.ts

sweetmantech and others added 2 commits March 3, 2026 17:39
- logStep now accepts optional details param (included in logger.log
  but not in metadata) for debugging without polluting the UI
- codingAgentTask logs agent stdout/stderr (last 2000 chars) after
  agent completes
- detectChangedSubmodules logs git status output per submodule and
  lists scanned dirs when no changes found

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace deterministic git operations (clone, push, PR creation) with
OpenClaw agent prompts. The agent operates in its own workspace, solving
the path mismatch that caused failures in production.

New files:
- submoduleConfig.ts: shared SUBMODULE_CONFIG constant
- parsePRUrls.ts: parses PR_CREATED sentinel lines from agent stdout
- cloneMonorepoViaAgent.ts: delegates clone to agent
- pushAndCreatePRsViaAgent.ts: delegates push + PR creation to agent

Deleted obsolete files:
- cloneRecoupMonorepo.ts, detectChangedSubmodules.ts, createSubmodulePR.ts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
src/tasks/updatePRTask.ts (2)

40-90: ⚠️ Potential issue | 🟠 Major

Add a catch block to notify callback on unexpected failures.

The try-finally block lacks a catch clause. If an error is thrown, the callback is never notified with a "failed" status, leaving the caller without a terminal status update.

🛡️ Proposed fix
     try {
       // ... existing code ...
       return { snapshotId: newSnapshot.snapshotId };
+    } catch (error) {
+      logger.error("Update PR task failed", { error });
+      metadata.set("currentStep", "Failed");
+      await notifyCodingAgentCallback({
+        threadId: callbackThreadId,
+        status: "failed",
+        message: error instanceof Error ? error.message : String(error),
+      });
+      throw error;
     } finally {
       logger.log("Stopping sandbox", { sandboxId: sandbox.sandboxId });
       await sandbox.stop();
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 40 - 90, Wrap the existing
try-finally with a try-catch-finally: add a catch(err) that calls
notifyCodingAgentCallback({ threadId: callbackThreadId, status: "failed", error:
String(err) }) and sets metadata.set("currentStep", "Failed") (use the same
callbackThreadId and metadata referenced in the code), then rethrow the error so
the caller sees the failure; keep the existing finally block that logs and
awaits sandbox.stop() so cleanup always runs.

45-49: ⚠️ Potential issue | 🟠 Major

Handle non-zero AI agent exit code before proceeding.

The agent's result is not checked for success. If runOpenClawAgent returns a non-zero exit code, the task continues and may incorrectly report "updated" status.

🛡️ Proposed fix
       logStep("Running AI agent with feedback");
-      await runOpenClawAgent(sandbox, {
+      const agentResult = await runOpenClawAgent(sandbox, {
         label: "Apply feedback",
         message: `The following feedback was given on the existing changes on branch "${branch}":\n\n${feedback}\n\nPlease make the requested changes.`,
       });
+
+      if (agentResult.exitCode !== 0) {
+        throw new Error(`Agent failed to apply feedback (exit code ${agentResult.exitCode}): ${agentResult.stderr || agentResult.stdout}`);
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 45 - 49, Capture and inspect the
result returned by runOpenClawAgent (called in updatePRTask after
logStep("Running AI agent with feedback")); if the agent returns a non-zero exit
code or indicates failure, log an error via logStep/process logger and
abort/throw or return a failure so the task does not continue and incorrectly
mark the PR as "updated"; ensure you reference the agent result variable (e.g.,
const result = await runOpenClawAgent(...)) and check result.exitCode or
result.success before proceeding.
🧹 Nitpick comments (1)
src/sandboxes/parsePRUrls.ts (1)

29-31: Use a normalized repo→baseBranch lookup map instead of per-line scans.

The current Object.values(...).find(...) runs for every parsed PR line and is case-sensitive. A precomputed normalized map avoids repeated scans and prevents accidental fallback to "main" when casing differs.

♻️ Proposed refactor
 import { SUBMODULE_CONFIG } from "./submoduleConfig";
 
+const BASE_BRANCH_BY_REPO = new Map(
+  Object.values(SUBMODULE_CONFIG).map(({ repo, baseBranch }) => [repo.toLowerCase(), baseBranch]),
+);
+
 export interface ParsedPR {
   repo: string;
   number: number;
   url: string;
   baseBranch: string;
@@
-    const repo = match[2];
+    const repo = match[2];
     const number = parseInt(match[3], 10);
@@
-    const configEntry = Object.values(SUBMODULE_CONFIG).find((c) => c.repo === repo);
-    const baseBranch = configEntry?.baseBranch ?? "main";
+    const baseBranch = BASE_BRANCH_BY_REPO.get(repo.toLowerCase()) ?? "main";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/parsePRUrls.ts` around lines 29 - 31, Precompute a normalized
lookup map from SUBMODULE_CONFIG to avoid per-line scans and ensure
case-insensitive matching: build a map (e.g., SUBMODULE_BASE_BRANCH_MAP) where
keys are repo names lowercased and values are baseBranch, then in parsePRUrls.ts
replace the repeated Object.values(SUBMODULE_CONFIG).find((c) => c.repo ===
repo) and the configEntry/baseBranch logic with a direct lookup using
repo.toLowerCase() and fall back only if the lookup is missing; update
references to SUBMODULE_CONFIG, configEntry and baseBranch accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/sandboxes/cloneMonorepoViaAgent.ts`:
- Around line 12-21: runOpenClawAgent is called but its returned exitCode is
ignored, so a failed git clone can go unnoticed; update the call in
cloneMonorepoViaAgent to capture the result (e.g., const result = await
runOpenClawAgent(...)) and immediately check result.exitCode (or result.status)
and fail fast by throwing an Error or returning a rejected Promise / calling
process.exit(1) with a clear message that includes the label "Clone monorepo via
agent" and the exit code so downstream steps do not run on a failed clone.

In `@src/sandboxes/pushAndCreatePRsViaAgent.ts`:
- Around line 30-57: The code currently calls parsePRUrls(result.stdout) without
verifying runOpenClawAgent succeeded; update the logic around runOpenClawAgent
and result to check result.exitCode (or equivalent status) after calling
runOpenClawAgent(sandbox, ...) and before calling parsePRUrls; if exitCode is
non-zero, surface a clear error (throw or return an error) including
result.stderr and exitCode so failures aren’t silently ignored, otherwise call
parsePRUrls(result.stdout) as today; reference runOpenClawAgent, the result
object (result.exitCode, result.stderr, result.stdout), and parsePRUrls to
locate where to add the check and error handling.

In `@src/tasks/updatePRTask.ts`:
- Around line 56-72: The runOpenClawAgent call in updatePRTask
(runOpenClawAgent(..., { label: "Push feedback changes", ... })) does not check
the agent result, so a failed push can still lead to snapshot/notify as
"updated"; change the code to capture the return value (e.g., const pushResult =
await runOpenClawAgent(...)), inspect its success/exit code or error field
(e.g., pushResult.exitCode !== 0 or !pushResult.success), and on failure throw
or return a rejected error (or log and set status to failed) to prevent
subsequent snapshot/notifyUpdated logic from running; update any downstream
logic that assumes success (snapshot, notifyUpdated) to only run when the
pushResult indicates success.

---

Duplicate comments:
In `@src/tasks/updatePRTask.ts`:
- Around line 40-90: Wrap the existing try-finally with a try-catch-finally: add
a catch(err) that calls notifyCodingAgentCallback({ threadId: callbackThreadId,
status: "failed", error: String(err) }) and sets metadata.set("currentStep",
"Failed") (use the same callbackThreadId and metadata referenced in the code),
then rethrow the error so the caller sees the failure; keep the existing finally
block that logs and awaits sandbox.stop() so cleanup always runs.
- Around line 45-49: Capture and inspect the result returned by runOpenClawAgent
(called in updatePRTask after logStep("Running AI agent with feedback")); if the
agent returns a non-zero exit code or indicates failure, log an error via
logStep/process logger and abort/throw or return a failure so the task does not
continue and incorrectly mark the PR as "updated"; ensure you reference the
agent result variable (e.g., const result = await runOpenClawAgent(...)) and
check result.exitCode or result.success before proceeding.

---

Nitpick comments:
In `@src/sandboxes/parsePRUrls.ts`:
- Around line 29-31: Precompute a normalized lookup map from SUBMODULE_CONFIG to
avoid per-line scans and ensure case-insensitive matching: build a map (e.g.,
SUBMODULE_BASE_BRANCH_MAP) where keys are repo names lowercased and values are
baseBranch, then in parsePRUrls.ts replace the repeated
Object.values(SUBMODULE_CONFIG).find((c) => c.repo === repo) and the
configEntry/baseBranch logic with a direct lookup using repo.toLowerCase() and
fall back only if the lookup is missing; update references to SUBMODULE_CONFIG,
configEntry and baseBranch accordingly.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 582c587 and 8c1a14c.

📒 Files selected for processing (13)
  • src/sandboxes/__tests__/cloneMonorepoViaAgent.test.ts
  • src/sandboxes/__tests__/logStep.test.ts
  • src/sandboxes/__tests__/parsePRUrls.test.ts
  • src/sandboxes/__tests__/pushAndCreatePRsViaAgent.test.ts
  • src/sandboxes/cloneMonorepoViaAgent.ts
  • src/sandboxes/logStep.ts
  • src/sandboxes/parsePRUrls.ts
  • src/sandboxes/pushAndCreatePRsViaAgent.ts
  • src/sandboxes/submoduleConfig.ts
  • src/tasks/__tests__/codingAgentTask.test.ts
  • src/tasks/__tests__/updatePRTask.test.ts
  • src/tasks/codingAgentTask.ts
  • src/tasks/updatePRTask.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/tasks/codingAgentTask.ts

Comment on lines +12 to +21
await runOpenClawAgent(sandbox, {
label: "Clone monorepo via agent",
message: [
"Clone the Recoup monorepo and init its submodules:",
"1. Run: git clone https://github.com/recoupable/Recoup-Monorepo.git",
"2. cd into the cloned directory",
"3. Run: git submodule update --init (do NOT use --recursive)",
"4. Configure git user: git config user.email agent@recoupable.com && git config user.name 'Recoup Agent'",
].join("\n"),
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fail fast when agent clone command fails.

runOpenClawAgent returns exitCode, but this function ignores it. A failed clone can silently continue and break downstream PR/update steps.

💡 Proposed fix
 export async function cloneMonorepoViaAgent(sandbox: Sandbox): Promise<void> {
-  await runOpenClawAgent(sandbox, {
+  const result = await runOpenClawAgent(sandbox, {
     label: "Clone monorepo via agent",
     message: [
       "Clone the Recoup monorepo and init its submodules:",
       "1. Run: git clone https://github.com/recoupable/Recoup-Monorepo.git",
       "2. cd into the cloned directory",
       "3. Run: git submodule update --init (do NOT use --recursive)",
       "4. Configure git user: git config user.email agent@recoupable.com && git config user.name 'Recoup Agent'",
     ].join("\n"),
   });
+
+  if (result.exitCode !== 0) {
+    throw new Error(`Clone monorepo via agent failed: ${result.stderr || "unknown error"}`);
+  }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
await runOpenClawAgent(sandbox, {
label: "Clone monorepo via agent",
message: [
"Clone the Recoup monorepo and init its submodules:",
"1. Run: git clone https://github.com/recoupable/Recoup-Monorepo.git",
"2. cd into the cloned directory",
"3. Run: git submodule update --init (do NOT use --recursive)",
"4. Configure git user: git config user.email agent@recoupable.com && git config user.name 'Recoup Agent'",
].join("\n"),
});
export async function cloneMonorepoViaAgent(sandbox: Sandbox): Promise<void> {
const result = await runOpenClawAgent(sandbox, {
label: "Clone monorepo via agent",
message: [
"Clone the Recoup monorepo and init its submodules:",
"1. Run: git clone https://github.com/recoupable/Recoup-Monorepo.git",
"2. cd into the cloned directory",
"3. Run: git submodule update --init (do NOT use --recursive)",
"4. Configure git user: git config user.email agent@recoupable.com && git config user.name 'Recoup Agent'",
].join("\n"),
});
if (result.exitCode !== 0) {
throw new Error(`Clone monorepo via agent failed: ${result.stderr || "unknown error"}`);
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/cloneMonorepoViaAgent.ts` around lines 12 - 21,
runOpenClawAgent is called but its returned exitCode is ignored, so a failed git
clone can go unnoticed; update the call in cloneMonorepoViaAgent to capture the
result (e.g., const result = await runOpenClawAgent(...)) and immediately check
result.exitCode (or result.status) and fail fast by throwing an Error or
returning a rejected Promise / calling process.exit(1) with a clear message that
includes the label "Clone monorepo via agent" and the exit code so downstream
steps do not run on a failed clone.

Comment on lines +30 to +57
const result = await runOpenClawAgent(sandbox, {
label: "Push and create PRs via agent",
message: [
`For each submodule that has uncommitted changes, create a branch, commit, push, and open a PR.`,
``,
`Branch name: ${branch}`,
`PR title: agent: ${prompt.slice(0, 72)}`,
`PR body: Automated PR from coding agent.\n\nPrompt: ${prompt}`,
``,
`Submodule config (name: repo, base branch to target):`,
submoduleInstructions,
``,
`Steps for each changed submodule:`,
`1. cd into the submodule directory`,
`2. git checkout -b ${branch}`,
`3. git add -A`,
`4. git commit -m "agent: ${prompt.slice(0, 72)}"`,
`5. git push -u origin ${branch}`,
`6. gh pr create --repo <repo> --base <baseBranch> --head ${branch} --title "agent: ${prompt.slice(0, 72)}" --body "Automated PR from coding agent."`,
``,
`IMPORTANT: After each PR is created, output exactly this line:`,
`PR_CREATED: <the full PR URL>`,
``,
`If a submodule has no changes, skip it.`,
].join("\n"),
});

return parsePRUrls(result.stdout);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Check agent exit code before parsing output.

The function proceeds to parse stdout regardless of whether the agent succeeded. If runOpenClawAgent returns a non-zero exitCode, the stdout may be incomplete or contain error messages, leading to silent failures (empty PR array returned instead of an error).

🛡️ Proposed fix
   const result = await runOpenClawAgent(sandbox, {
     label: "Push and create PRs via agent",
     message: [
       // ... message content
     ].join("\n"),
   });

+  if (result.exitCode !== 0) {
+    throw new Error(`Agent failed to push and create PRs (exit code ${result.exitCode}): ${result.stderr || result.stdout}`);
+  }
+
   return parsePRUrls(result.stdout);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const result = await runOpenClawAgent(sandbox, {
label: "Push and create PRs via agent",
message: [
`For each submodule that has uncommitted changes, create a branch, commit, push, and open a PR.`,
``,
`Branch name: ${branch}`,
`PR title: agent: ${prompt.slice(0, 72)}`,
`PR body: Automated PR from coding agent.\n\nPrompt: ${prompt}`,
``,
`Submodule config (name: repo, base branch to target):`,
submoduleInstructions,
``,
`Steps for each changed submodule:`,
`1. cd into the submodule directory`,
`2. git checkout -b ${branch}`,
`3. git add -A`,
`4. git commit -m "agent: ${prompt.slice(0, 72)}"`,
`5. git push -u origin ${branch}`,
`6. gh pr create --repo <repo> --base <baseBranch> --head ${branch} --title "agent: ${prompt.slice(0, 72)}" --body "Automated PR from coding agent."`,
``,
`IMPORTANT: After each PR is created, output exactly this line:`,
`PR_CREATED: <the full PR URL>`,
``,
`If a submodule has no changes, skip it.`,
].join("\n"),
});
return parsePRUrls(result.stdout);
const result = await runOpenClawAgent(sandbox, {
label: "Push and create PRs via agent",
message: [
`For each submodule that has uncommitted changes, create a branch, commit, push, and open a PR.`,
``,
`Branch name: ${branch}`,
`PR title: agent: ${prompt.slice(0, 72)}`,
`PR body: Automated PR from coding agent.\n\nPrompt: ${prompt}`,
``,
`Submodule config (name: repo, base branch to target):`,
submoduleInstructions,
``,
`Steps for each changed submodule:`,
`1. cd into the submodule directory`,
`2. git checkout -b ${branch}`,
`3. git add -A`,
`4. git commit -m "agent: ${prompt.slice(0, 72)}"`,
`5. git push -u origin ${branch}`,
`6. gh pr create --repo <repo> --base <baseBranch> --head ${branch} --title "agent: ${prompt.slice(0, 72)}" --body "Automated PR from coding agent."`,
``,
`IMPORTANT: After each PR is created, output exactly this line:`,
`PR_CREATED: <the full PR URL>`,
``,
`If a submodule has no changes, skip it.`,
].join("\n"),
});
if (result.exitCode !== 0) {
throw new Error(`Agent failed to push and create PRs (exit code ${result.exitCode}): ${result.stderr || result.stdout}`);
}
return parsePRUrls(result.stdout);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/sandboxes/pushAndCreatePRsViaAgent.ts` around lines 30 - 57, The code
currently calls parsePRUrls(result.stdout) without verifying runOpenClawAgent
succeeded; update the logic around runOpenClawAgent and result to check
result.exitCode (or equivalent status) after calling runOpenClawAgent(sandbox,
...) and before calling parsePRUrls; if exitCode is non-zero, surface a clear
error (throw or return an error) including result.stderr and exitCode so
failures aren’t silently ignored, otherwise call parsePRUrls(result.stdout) as
today; reference runOpenClawAgent, the result object (result.exitCode,
result.stderr, result.stdout), and parsePRUrls to locate where to add the check
and error handling.

Comment on lines +56 to +72
await runOpenClawAgent(sandbox, {
label: "Push feedback changes",
message: [
`Stage, commit, and push the feedback changes to the existing PR branches.`,
``,
`For each of these PRs, find the submodule directory, stage all changes, commit, and push:`,
prList,
``,
`Commit message: agent: address feedback`,
``,
`Steps for each:`,
`1. cd into the submodule directory`,
`2. git add -A`,
`3. git commit -m "agent: address feedback"`,
`4. git push origin ${branch}`,
].join("\n"),
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Check push agent result for success.

Similar to the feedback step, the push operation's exit code is not validated. A failed push would silently continue to snapshot and notify as "updated".

🛡️ Proposed fix
-      await runOpenClawAgent(sandbox, {
+      const pushResult = await runOpenClawAgent(sandbox, {
         label: "Push feedback changes",
         message: [
           // ... message content
         ].join("\n"),
       });
+
+      if (pushResult.exitCode !== 0) {
+        throw new Error(`Agent failed to push changes (exit code ${pushResult.exitCode}): ${pushResult.stderr || pushResult.stdout}`);
+      }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tasks/updatePRTask.ts` around lines 56 - 72, The runOpenClawAgent call in
updatePRTask (runOpenClawAgent(..., { label: "Push feedback changes", ... }))
does not check the agent result, so a failed push can still lead to
snapshot/notify as "updated"; change the code to capture the return value (e.g.,
const pushResult = await runOpenClawAgent(...)), inspect its success/exit code
or error field (e.g., pushResult.exitCode !== 0 or !pushResult.success), and on
failure throw or return a rejected error (or log and set status to failed) to
prevent subsequent snapshot/notifyUpdated logic from running; update any
downstream logic that assumes success (snapshot, notifyUpdated) to only run when
the pushResult indicates success.

sweetmantech and others added 2 commits March 6, 2026 07:36
Keep both getSandboxHomeDir and logStep mocks in pushOrgRepos test,
adopt main's cleaner logStep implementation with extra param.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The coding agent couldn't push to GitHub because git had no credentials
configured. The GITHUB_TOKEN was injected into openclaw.json but git
push over HTTPS needs url.insteadOf rewriting (same pattern used by
ensureGithubRepo in runSandboxCommandTask).

- Add configureGitAuth: sets global url.insteadOf + git user identity
- Add getSandboxEnv: shared env builder (DRY across both tasks)
- Pass env vars to all runOpenClawAgent calls in codingAgentTask
- Update runSandboxCommandTask to use getSandboxEnv (adds GITHUB_TOKEN)
- Remove redundant git config instruction from cloneMonorepoViaAgent

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sweetmantech and others added 3 commits March 6, 2026 08:38
Moved updatePRTask, updatePRSchema, and tests to their own PR (#49)
to keep this PR focused on the coding-agent task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
setupOpenClaw already injects RECOUP_API_KEY, RECOUP_ACCOUNT_ID, and
GITHUB_TOKEN into openclaw.json. Passing them again via
sandbox.runCommand({ env }) was redundant. configureGitAuth handles
the git-level auth via url.insteadOf.

getSandboxEnv remains used by runSandboxCommandTask where the user's
command runs directly (not via OpenClaw) and needs env vars explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The submodule config had recoupable/recoup-api but the actual
submodule URL is recoupable/api per .gitmodules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sweetmantech sweetmantech merged commit 7ba7544 into main Mar 6, 2026
2 checks passed
@sweetmantech sweetmantech deleted the sweetmantech/myc-4371-recoup-coding-sandbox-task-tasks-for-coding-and-pr-review branch March 6, 2026 16:19
sweetmantech added a commit that referenced this pull request Mar 6, 2026
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant