feat(claude): add retry harness for transient Anthropic API failures#29581
… API errors
- Create actions/setup/js/claude_harness.cjs wrapping Claude Code CLI with retry logic for overload (529) and rate-limit (429) errors, using --continue on partial-execution retries with exponential backoff (5s->10s->20s, max 3)
- Add GetHarnessScriptName() to ClaudeEngine returning "claude_harness.cjs"
- Update GetExecutionSteps() to use harness prefix and --prompt-file instead of inline shell expansion "$(cat ...)" when harness is available
- Update tests to reflect harness-based --prompt-file prompt passing
- Add new tests for GetHarnessScriptName(), harness command format, and custom harness override behaviour
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/607f6264-584f-4644-9ac7-ce0eb8aa15ca
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…-agent-driver Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
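For readers following the thread, the backoff schedule named in the commit message (5s->10s->20s, max 3) can be sketched as below. This is a minimal illustration and not the code in claude_harness.cjs: the constant names, the `transient` result field, and the injectable `sleep` parameter are all hypothetical, and it assumes "max 3" means three retries after the initial attempt.

```javascript
// Hypothetical constants chosen to reproduce the 5s -> 10s -> 20s schedule.
const INITIAL_DELAY_MS = 5000;
const BACKOFF_MULTIPLIER = 2;
const MAX_RETRIES = 3; // assumption: three retries after the first attempt

/** Delay before retry number `retryIndex` (0-based). */
function backoffDelayMs(retryIndex) {
  return INITIAL_DELAY_MS * Math.pow(BACKOFF_MULTIPLIER, retryIndex);
}

/**
 * Run `attemptFn` once, then retry while it reports a transient failure,
 * up to MAX_RETRIES times. `attemptFn` receives the attempt number and must
 * resolve to an object with a boolean `transient` field (hypothetical shape).
 * `sleep` is injectable so the loop can be exercised without real waiting.
 */
async function runWithRetry(attemptFn, sleep = ms => new Promise(resolve => setTimeout(resolve, ms))) {
  let result = await attemptFn(0);
  for (let retry = 0; retry < MAX_RETRIES && result.transient; retry++) {
    await sleep(backoffDelayMs(retry));
    result = await attemptFn(retry + 1);
  }
  return result;
}
```

Keeping the delay computation in a pure function makes the schedule easy to assert on in a unit test, independent of process spawning.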
| * @param {(msg: string) => void} logger | ||
| * @returns {Promise<string[]|null>} | ||
| */ | ||
| async function fetchModelsFromUrl(modelsUrl, timeoutMs, logger) { |
@copilot refactor /reflect and models parsing into separate file shared by all harnesses
Done in f824127. Created actions/setup/js/awf_reflect.cjs with the shared constants and functions (extractModelIds, fetchModelsFromUrl, enrichReflectModels, fetchAWFReflect). Both copilot_harness.cjs and claude_harness.cjs now require from it and pass { logger: log } to fetchAWFReflect() so each harness keeps its own log prefix. Added awf_reflect.test.cjs with 23 dedicated tests for the shared module.
Me agree! Retry logic very important. Transient errors bad. Retry harness fix. Good code!
📰 BREAKING: Report filed by Smoke Copilot · ● 1.8M
….cjs
- Create actions/setup/js/awf_reflect.cjs with AWF API proxy /reflect
helpers: constants (AWF_API_PROXY_REFLECT_URL, AWF_REFLECT_OUTPUT_PATH,
AWF_REFLECT_TIMEOUT_MS, AWF_MODELS_URL_TIMEOUT_MS, GEMINI_MODEL_NAME_PREFIX)
and functions (extractModelIds, fetchModelsFromUrl, enrichReflectModels,
fetchAWFReflect) previously duplicated between the two harnesses
- Update copilot_harness.cjs to require from awf_reflect.cjs; pass {logger:log}
to fetchAWFReflect() to preserve [copilot-harness] log prefix
- Update claude_harness.cjs to require from awf_reflect.cjs; pass {logger:log}
to fetchAWFReflect() to preserve [claude-harness] log prefix
- Add actions/setup/js/awf_reflect.test.cjs with 23 tests covering all exported
constants and functions in the shared module
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/5451ae39-f59c-42cd-a6c0-f664fbe1a822
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.
Warning: Firewall rules blocked me from connecting to one or more addresses (expand for details). I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
Pull request overview
Adds a Node.js “harness” wrapper for Claude Code CLI to retry and resume after transient Anthropic API failures, and extracts shared AWF api-proxy /reflect helpers for reuse across harnesses.
Changes:
- Introduces actions/setup/js/claude_harness.cjs with retry + --continue resume behavior and --prompt-file support.
- Extracts AWF /reflect fetch/enrichment logic into actions/setup/js/awf_reflect.cjs and refactors copilot_harness.cjs to use it.
- Updates Claude engine command generation and tests to pass prompts via --prompt-file (instead of $(cat ...)) and updates locked workflow fixtures accordingly.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/runtime_detection.go | Updates Node runtime requirement commentary/logic for engine harness usage. |
| pkg/workflow/engine_agent_import_test.go | Updates assertions to expect --prompt-file for Claude execution. |
| pkg/workflow/claude_engine_test.go | Adds/updates tests for Claude harness script name and --prompt-file usage. |
| pkg/workflow/claude_engine.go | Wraps Claude execution with Node harness + switches prompt passing to --prompt-file. |
| actions/setup/js/copilot_harness.cjs | Refactors AWF reflect logic to shared module; passes logger into reflect fetch. |
| actions/setup/js/awf_reflect.cjs | New shared module for AWF api-proxy /reflect fetching + model enrichment. |
| actions/setup/js/awf_reflect.test.cjs | New vitest coverage for the shared AWF reflect module. |
| .github/workflows/test-create-pr-error-handling.lock.yml | Updates locked Claude invocation to use node …/claude_harness.cjs … --prompt-file …. |
| .github/workflows/smoke-agent-scoped-approved.lock.yml | Same locked command update for smoke workflow. |
| .github/workflows/smoke-agent-public-none.lock.yml | Same locked command update for smoke workflow. |
| .github/workflows/smoke-agent-public-approved.lock.yml | Same locked command update for smoke workflow. |
| .github/workflows/smoke-agent-all-none.lock.yml | Same locked command update for smoke workflow. |
| .github/workflows/smoke-agent-all-merged.lock.yml | Same locked command update for smoke workflow. |
| .github/workflows/schema-consistency-checker.lock.yml | Same locked command update for checker workflow. |
| .github/workflows/lockfile-stats.lock.yml | Same locked command update for stats workflow. |
| .github/workflows/go-pattern-detector.lock.yml | Same locked command update for detector workflow. |
| .github/workflows/github-mcp-tools-report.lock.yml | Same locked command update for report workflow. |
| .github/workflows/example-workflow-analyzer.lock.yml | Same locked command update for analyzer workflow. |
| .github/workflows/daily-team-evolution-insights.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-security-red-team.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-safe-outputs-conformance.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-otel-instrumentation-advisor.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-function-namer.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-code-metrics.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-choice-test.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/daily-astrostylelite-markdown-spellcheck.lock.yml | Same locked command update for daily workflow. |
| .github/workflows/copilot-agent-analysis.lock.yml | Same locked command update for analysis workflow. |
| .github/workflows/commit-changes-analyzer.lock.yml | Same locked command update for analyzer workflow. |
| .github/workflows/ci-doctor.lock.yml | Same locked command update for CI workflow. |
| .github/workflows/aw-failure-investigator.lock.yml | Same locked command update for investigator workflow. |
Copilot's findings
- Files reviewed: 66/66 changed files
- Comments generated: 3
| function runProcess(command, args, attempt) { | ||
| return new Promise(resolve => { | ||
| const startTime = Date.now(); | ||
| log(`attempt ${attempt + 1}: spawning: ${command} ${args.join(" ").substring(0, 200)}`); | ||
The harness logs the spawned command including args.join(" "), which will include the full prompt text (appended as the last arg by resolveClaudePromptFileArgs). This can leak prompt contents into stderr/agent logs and artifacts. Please redact the prompt argument (and ideally any sensitive args) from logs—e.g., log only the executable + non-prompt flags, or replace the last arg with a placeholder like <prompt omitted> when it came from --prompt-file.
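The redaction suggested in this comment could look roughly like the sketch below. It is an illustration only: `redactPromptForLog`, `formatSpawnLog`, and the `promptCameFromFile` flag are hypothetical names, and the sketch assumes the prompt is always the last argument when it was resolved from --prompt-file, as the comment describes.

```javascript
const PROMPT_PLACEHOLDER = "<prompt omitted>";

/**
 * Return a copy of `args` that is safe to log. When the prompt was resolved
 * from --prompt-file it sits in the last position, so only that element is
 * replaced. The array handed to the subprocess is never modified.
 */
function redactPromptForLog(args, promptCameFromFile) {
  if (!promptCameFromFile || args.length === 0) return [...args];
  return [...args.slice(0, -1), PROMPT_PLACEHOLDER];
}

/** Build the spawn log line from the redacted copy, truncated to 200 chars. */
function formatSpawnLog(command, args, promptCameFromFile, attempt) {
  const safeArgs = redactPromptForLog(args, promptCameFromFile).join(" ").substring(0, 200);
  return `attempt ${attempt + 1}: spawning: ${command} ${safeArgs}`;
}
```

Logging from a redacted copy, rather than mutating the real argument list, keeps the log path and the execution path from interfering with each other.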
| // Copilot consumes engine.harness in execution command generation. | ||
| // Claude is excluded here because its installation steps already include Node.js setup | ||
| // (GenerateNpmInstallSteps with includeNodeSetup=true), so no extra requirement is needed. | ||
| return strings.EqualFold(engineID, string(constants.CopilotEngine)) |
This comment is now a bit misleading: Claude also supports engine.harness in execution command generation (via the new Node harness wrapper), but is excluded here because Node.js setup is already included in Claude’s installation steps. Consider rewording to clarify that Claude does consume engine.harness, but does not need an additional runtime requirement.
| const promptFile = args[i + 1]; | ||
| try { | ||
| const stat = fs.statSync(promptFile); | ||
| log(`resolved --prompt-file: path=${promptFile} size=${stat.size}B`); | ||
| promptContent = fs.readFileSync(promptFile, "utf8"); | ||
| } catch (error) { | ||
| const err = /** @type {Error} */ error; | ||
| log(`warning: failed to read --prompt-file ${promptFile}: ${err.message}; leaving arguments unchanged`); | ||
| filteredArgs.push(args[i], promptFile); | ||
| } |
If reading the --prompt-file fails, the harness currently leaves --prompt-file <path> in the args list and proceeds. Since --prompt-file is a harness-only flag (not a Claude CLI flag), this will likely cause the underlying claude process to fail with an “unknown option” style error. Consider treating an unreadable prompt file as a hard error (exit with a clear message) or at least stripping the flag so it’s never forwarded to the Claude subprocess.
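A minimal sketch of the hard-error option proposed here, assuming the harness-only flag takes exactly one path argument. The function name `resolvePromptFile` and the error wording are hypothetical and do not reproduce resolveClaudePromptFileArgs from the PR.

```javascript
const fs = require("fs");

/**
 * Strip the harness-only `--prompt-file <path>` pair from `args` and append
 * the file's contents as the last positional argument. An unreadable file
 * throws, so the flag can never be forwarded to the wrapped CLI.
 */
function resolvePromptFile(args) {
  const filtered = [];
  let promptContent = null;
  for (let i = 0; i < args.length; i++) {
    if (args[i] !== "--prompt-file") {
      filtered.push(args[i]);
      continue;
    }
    const promptFile = args[i + 1];
    if (promptFile === undefined) {
      throw new Error("--prompt-file requires a path argument");
    }
    try {
      promptContent = fs.readFileSync(promptFile, "utf8");
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      throw new Error(`failed to read --prompt-file ${promptFile}: ${message}`);
    }
    i++; // skip the path that followed the flag
  }
  if (promptContent !== null) filtered.push(promptContent);
  return filtered;
}
```

A caller would wrap this in a try/catch at startup, log the message, and exit non-zero before spawning anything.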
🧪 Test Quality Sentinel Report
Test Quality Score: 80/100
✅ Excellent test quality
Test Classification Details
View all 30 classified tests
JavaScript —
| Test | Classification | Issues |
|---|---|---|
| exports expected default values | ✅ Design | — |
| returns null for null input | ✅ Design (edge case) | — |
| returns null for empty object | ✅ Design (edge case) | — |
| returns null for empty data array | ✅ Design (edge case) | — |
| extracts ids from OpenAI format | ✅ Design | — |
| falls back to name when id is absent in OpenAI format | ✅ Design | — |
| extracts ids from Gemini format, stripping prefix | ✅ Design | — |
| handles Gemini entries without the prefix | ✅ Design (edge case) | — |
| returns sorted results | ✅ Design | — |
| returns null for empty models array | ✅ Design (edge case) | — |
| does nothing when all configured endpoints already have models | ✅ Design | — |
| does nothing for unconfigured endpoints with null models | ✅ Design (edge case) | — |
| does nothing when models_url is null | ✅ Design (edge case) | — |
| fetches models from models_url for configured endpoints with null models | ✅ Design | fetch mocked (legitimate external I/O) |
| leaves models null when models_url fetch fails | ✅ Design (error path) | — |
| handles empty endpoints array | ✅ Design (edge case) | — |
| returns model IDs on successful fetch | ✅ Design | — |
| returns null on non-ok HTTP status | ✅ Design (error path) | — |
| returns null on network error | ✅ Design (error path) | — |
| saves enriched reflect data when api-proxy returns null models | ✅ Design (integration-style) | — |
| does not throw when the reflect endpoint is unreachable | ✅ Design (error path) | — |
| does not throw when the reflect endpoint returns non-ok status | ✅ Design (error path) | — |
| uses the caller-supplied logger for all messages | ✅ Design | — |
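The extractModelIds test names above describe a small behavioural contract, which the following sketch tries to satisfy. It is a hypothetical reimplementation written from those names alone, not the code in awf_reflect.cjs. The prefix value "models/" is an assumption based on how the Gemini API names its models; the PR thread does not state the value of GEMINI_MODEL_NAME_PREFIX.

```javascript
// Assumed value; the thread names this constant but not its contents.
const GEMINI_MODEL_NAME_PREFIX = "models/";

/**
 * Extract a sorted list of model IDs from a models-list response body.
 * Handles an OpenAI-style `{ data: [{ id }] }` shape and a Gemini-style
 * `{ models: [{ name }] }` shape. Returns null when nothing usable is found.
 */
function extractModelIds(body) {
  if (!body || typeof body !== "object") return null;
  let ids = [];
  if (Array.isArray(body.data)) {
    // OpenAI format: prefer `id`, fall back to `name` when `id` is absent.
    ids = body.data.map(model => model && (model.id || model.name));
  } else if (Array.isArray(body.models)) {
    // Gemini format: strip the prefix when present, keep the name otherwise.
    ids = body.models.map(model => {
      const name = model && model.name;
      return typeof name === "string" && name.startsWith(GEMINI_MODEL_NAME_PREFIX)
        ? name.slice(GEMINI_MODEL_NAME_PREFIX.length)
        : name;
    });
  }
  const cleaned = ids.filter(id => typeof id === "string" && id.length > 0);
  return cleaned.length > 0 ? cleaned.sort() : null;
}
```

Returning null instead of an empty array lets a caller distinguish "no model list available" from a populated list with a single check.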
Go — claude_engine_test.go (3 new, 4 modified)
| Test | Classification | Issues |
|---|---|---|
| TestClaudeEngineGetHarnessScriptName (new) | ✅ Design | — |
| TestClaudeEngineHarnessUsesPromptFile (new) | ✅ Design (error path) | — |
| TestClaudeEngineCustomHarnessOverridesBuiltIn (new) | ✅ Design | — |
| TestClaudeEngineNoDoubleEscapePrompt (modified) | ✅ Design | — |
| TestClaudeEngineAWFWithAgentFileReadsPromptTxt (modified) | ✅ Design | — |
Go — engine_agent_import_test.go (2 modified)
| Test | Classification | Issues |
|---|---|---|
| TestClaudeEngineWithAgentFromImports (modified) | ✅ Design | — |
| TestClaudeEngineWithoutAgentFile (modified) | ✅ Design | — |
Flagged Tests — Requires Review
⚠️ Test Inflation — claude_engine_test.go
Issue: 78 lines added to the test file vs 30 lines in the production file (claude_engine.go) — a 2.6:1 ratio, exceeding the 2:1 threshold.
Context: This is likely benign. Three new behavioral tests were added (TestClaudeEngineGetHarnessScriptName, TestClaudeEngineHarnessUsesPromptFile, TestClaudeEngineCustomHarnessOverridesBuiltIn), and each requires setup code and multiple assertions. The extra lines appear to be structural rather than padding. No action is strictly required, but be mindful that test verbosity doesn't erode future signal.
Language Support
Tests analyzed:
- 🐹 Go (*_test.go): 7 tests — unit (//go:build !integration) ✅ build tags present
- 🟨 JavaScript (*.test.cjs): 23 tests (vitest) — mocking limited to fetch (external I/O, legitimate)
Verdict
✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). All 30 new/modified tests verify observable behavioral contracts. No coding-guideline violations detected.
📖 Understanding Test Classifications
Design Tests (High Value) verify what the system does:
- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring
Implementation Tests (Low Value) verify how the system does it:
- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.
References: §25228247290
🧪 Test quality analysis by Test Quality Sentinel · ● 651.3K · ◷
Documents the decision to wrap Claude Code CLI invocation in a Node.js harness (claude_harness.cjs) with exponential-backoff retry logic for transient Anthropic API errors (HTTP 529/429) and session-resume via --continue. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Commit pushed:
|
|
@copilot review all comments |
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…ction.go
- Redact prompt content in spawn log: add logArgs param to runProcess(), build safeInitialArgs (last arg replaced with '<prompt omitted>') only when original args contained --prompt-file, pass it for logging so task instructions are never written to stderr or agent log artifacts
- Treat unreadable --prompt-file as a fatal startup error: resolveClaudePromptFileArgs() now throws instead of silently re-inserting the harness-only flag; main() catches, logs the error and exits before the flag could reach the claude subprocess
- Clarify runtime_detection.go comment: reword to note that both Copilot and Claude consume engine.harness, but Claude needs no extra Node.js requirement because its install steps already provision Node via GenerateNpmInstallSteps
- Add claude_harness.test.cjs with 10 tests covering resolveClaudePromptFileArgs (including the new throw-on-error path) and stripPromptFileArgs
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/91ac6eae-c232-4f42-9bac-1fd6ff799cee
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Addressed all three reviewer comments in f427c1e (and the runtime_detection.go comment was already in c306471):
Also added Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
🚀 Smoke OpenCode MISSION COMPLETE! OpenCode delivered. 🔥 |
|
🎬 THE END — Smoke Claude MISSION: ACCOMPLISHED! The hero saves the day! ✨ |
|
📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing... |
|
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟 |
Agent Container Tool Check
Result: 12/12 tools available ✅ Overall Status: PASS
|
|
Codex smoke: FAIL
Warning: Firewall blocked 2 domains
The following domains were blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "ab.chatgpt.com"
    - "chatgpt.com"
See Network Configuration for more information.
|
Comment Memory
Note: This comment is managed by comment memory. It stores persistent context for this thread in the code block at the top of this comment.
|
|
✅ Smoke test PASS for PR #29581
Overall: PASS · Author: app/copilot-swe-agent · Assignees: pelikhan, Copilot
|
|
🦴 UGH! Me smoke test agent! Me was here! Fire burn bright! Tests run good! Cave painting confirmed: ALL GREEN! 🎉🐒
|
Comment Memory
Note: This comment is managed by comment memory. It stores persistent context for this thread in the code block at the top of this comment.
|
Me copilot smoke test agent. PR add retry harness for Claude. Smart idea. Retry help when API grumpy. Code look solid.
📰 BREAKING: Report filed by Smoke Copilot · ● 1.8M
| } | ||
|
|
||
| // GetHarnessScriptName returns the filename of the JavaScript harness script that wraps | ||
| // the Claude Code CLI with retry logic for transient Anthropic API errors (overload, rate limit). |
UGH! Big brain retry logic here. Me like how retry wrap Claude call. Very smart.
| @@ -0,0 +1,383 @@ | |||
| // @ts-check | |||
Me see new harness file! JS wrap Claude CLI. Handle retry outside Go. Good separation of concerns. Me approve.
|
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤 |
💥 Smoke Test Run 25230880894
Overall: PARTIAL (test #19 skipped; #4 used bash fallback)
|
💥 Automated smoke test review - all systems nominal! Run 25230880894
💥 [THE END] — Illustrated by Smoke Claude · ● 368.9K
| # shellcheck disable=SC1003 | ||
| sudo -E awf --config "${RUNNER_TEMP}/gh-aw/awf-config.json" --container-workdir "${GITHUB_WORKSPACE}" --mount "${RUNNER_TEMP}/gh-aw:${RUNNER_TEMP}/gh-aw:ro" --mount "${RUNNER_TEMP}/gh-aw:/host${RUNNER_TEMP}/gh-aw:ro" --tty --env-all --exclude-env ANTHROPIC_API_KEY --exclude-env GITHUB_MCP_SERVER_TOKEN --exclude-env MCP_GATEWAY_API_KEY --log-level info --proxy-logs-dir /tmp/gh-aw/sandbox/firewall/logs --audit-dir /tmp/gh-aw/sandbox/firewall/audit --enable-host-access --allow-host-ports 80,443,8080 --skip-pull \ | ||
| -- /bin/bash -c 'export PATH="${RUNNER_TEMP}/gh-aw/mcp-cli/bin:$PATH" && export PATH="$(find /opt/hostedtoolcache /home/runner/work/_tool -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && claude --print --no-chrome --mcp-config "${{ runner.temp }}/gh-aw/mcp-config/mcp-servers.json" --allowed-tools '\''Bash,BashOutput,Edit,Edit(/tmp/gh-aw/cache-memory/*),ExitPlanMode,Glob,Grep,KillBash,LS,MultiEdit,MultiEdit(/tmp/gh-aw/cache-memory/*),NotebookEdit,NotebookRead,Read,Read(/tmp/gh-aw/cache-memory/*),Task,TodoWrite,Write,Write(/tmp/gh-aw/cache-memory/*),mcp__agenticworkflows,mcp__github__download_workflow_run_artifact,mcp__github__get_code_scanning_alert,mcp__github__get_commit,mcp__github__get_dependabot_alert,mcp__github__get_discussion,mcp__github__get_discussion_comments,mcp__github__get_file_contents,mcp__github__get_job_logs,mcp__github__get_label,mcp__github__get_latest_release,mcp__github__get_me,mcp__github__get_notification_details,mcp__github__get_pull_request,mcp__github__get_pull_request_comments,mcp__github__get_pull_request_diff,mcp__github__get_pull_request_files,mcp__github__get_pull_request_review_comments,mcp__github__get_pull_request_reviews,mcp__github__get_pull_request_status,mcp__github__get_release_by_tag,mcp__github__get_secret_scanning_alert,mcp__github__get_tag,mcp__github__get_workflow_run,mcp__github__get_workflow_run_logs,mcp__github__get_workflow_run_usage,mcp__github__issue_read,mcp__github__list_branches,mcp__github__list_code_scanning_alerts,mcp__github__list_commits,mcp__github__list_dependabot_alerts,mcp__github__list_discussion_categories,mcp__github__list_discussions,mcp__github__list_issue_types,mcp__github__list_issues,mcp__github__list_label,mcp__github__list_notifications,mcp__github__list_pull_requests,mcp__github__list_releases,mcp__github__list_secret_scanning_alerts,mcp__github__list_starred_repositories,mcp__github__list_tags,mcp__gith
ub__list_workflow_jobs,mcp__github__list_workflow_run_artifacts,mcp__github__list_workflow_runs,mcp__github__list_workflows,mcp__github__pull_request_read,mcp__github__search_code,mcp__github__search_issues,mcp__github__search_orgs,mcp__github__search_pull_requests,mcp__github__search_repositories,mcp__github__search_users,mcp__safeoutputs'\'' --debug-file /tmp/gh-aw/agent-stdio.log --verbose --permission-mode bypassPermissions --output-format stream-json "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/agent-stdio.log | ||
| -- /bin/bash -c 'export PATH="${RUNNER_TEMP}/gh-aw/mcp-cli/bin:$PATH" && export PATH="$(find /opt/hostedtoolcache /home/runner/work/_tool -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && GH_AW_NODE_EXEC="${GH_AW_NODE_BIN:-}"; if [ -z "$GH_AW_NODE_EXEC" ] || [ ! -x "$GH_AW_NODE_EXEC" ]; then GH_AW_NODE_EXEC="$(command -v node 2>/dev/null || echo node)"; fi; "$GH_AW_NODE_EXEC" ${RUNNER_TEMP}/gh-aw/actions/claude_harness.cjs claude --print --no-chrome --mcp-config "${{ runner.temp }}/gh-aw/mcp-config/mcp-servers.json" --allowed-tools '\''Bash,BashOutput,Edit,Edit(/tmp/gh-aw/cache-memory/*),ExitPlanMode,Glob,Grep,KillBash,LS,MultiEdit,MultiEdit(/tmp/gh-aw/cache-memory/*),NotebookEdit,NotebookRead,Read,Read(/tmp/gh-aw/cache-memory/*),Task,TodoWrite,Write,Write(/tmp/gh-aw/cache-memory/*),mcp__agenticworkflows,mcp__github__download_workflow_run_artifact,mcp__github__get_code_scanning_alert,mcp__github__get_commit,mcp__github__get_dependabot_alert,mcp__github__get_discussion,mcp__github__get_discussion_comments,mcp__github__get_file_contents,mcp__github__get_job_logs,mcp__github__get_label,mcp__github__get_latest_release,mcp__github__get_me,mcp__github__get_notification_details,mcp__github__get_pull_request,mcp__github__get_pull_request_comments,mcp__github__get_pull_request_diff,mcp__github__get_pull_request_files,mcp__github__get_pull_request_review_comments,mcp__github__get_pull_request_reviews,mcp__github__get_pull_request_status,mcp__github__get_release_by_tag,mcp__github__get_secret_scanning_alert,mcp__github__get_tag,mcp__github__get_workflow_run,mcp__github__get_workflow_run_logs,mcp__github__get_workflow_run_usage,mcp__github__issue_read,mcp__github__list_branches,mcp__github__list_code_scanning_alerts,mcp__github__list_commits,mcp__github__list_dependabot_alerts,mcp__github__list_discussion_categories,mcp__github__list_discussions,mcp__github__list_issue_types,mcp__github__
list_issues,mcp__github__list_label,mcp__github__list_notifications,mcp__github__list_pull_requests,mcp__github__list_releases,mcp__github__list_secret_scanning_alerts,mcp__github__list_starred_repositories,mcp__github__list_tags,mcp__github__list_workflow_jobs,mcp__github__list_workflow_run_artifacts,mcp__github__list_workflow_runs,mcp__github__list_workflows,mcp__github__pull_request_read,mcp__github__search_code,mcp__github__search_issues,mcp__github__search_orgs,mcp__github__search_pull_requests,mcp__github__search_repositories,mcp__github__search_users,mcp__safeoutputs'\'' --debug-file /tmp/gh-aw/agent-stdio.log --verbose --permission-mode bypassPermissions --output-format stream-json --prompt-file /tmp/gh-aw/aw-prompts/prompt.txt${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/agent-stdio.log |
🤖 Smoke test review comment #1 — The claude_harness.cjs wrapper is now correctly invoked via node with --prompt-file. This is a good improvement over inline shell expansion which could fail with large prompts.
| # shellcheck disable=SC1003 | ||
| sudo -E awf --config "${RUNNER_TEMP}/gh-aw/awf-config.json" --container-workdir "${GITHUB_WORKSPACE}" --mount "${RUNNER_TEMP}/gh-aw:${RUNNER_TEMP}/gh-aw:ro" --mount "${RUNNER_TEMP}/gh-aw:/host${RUNNER_TEMP}/gh-aw:ro" --mount "${RUNNER_TEMP}/gh-aw/safeoutputs/upload-artifacts:${RUNNER_TEMP}/gh-aw/safeoutputs/upload-artifacts:rw" --tty --env-all --exclude-env ANTHROPIC_API_KEY --exclude-env GH_TOKEN --exclude-env GITHUB_MCP_SERVER_TOKEN --exclude-env MCP_GATEWAY_API_KEY --log-level info --proxy-logs-dir /tmp/gh-aw/sandbox/firewall/logs --audit-dir /tmp/gh-aw/sandbox/firewall/audit --enable-host-access --allow-host-ports 80,443,8080 --skip-pull --difc-proxy-host host.docker.internal:18443 --difc-proxy-ca-cert /tmp/gh-aw/difc-proxy-tls/ca.crt \ | ||
| -- /bin/bash -c 'export PATH="${RUNNER_TEMP}/gh-aw/mcp-cli/bin:$PATH" && export PATH="$(find /opt/hostedtoolcache /home/runner/work/_tool -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && claude --print --no-chrome --mcp-config "${{ runner.temp }}/gh-aw/mcp-config/mcp-servers.json" --allowed-tools '\''Bash(cat),Bash(cat:*),Bash(date),Bash(date:*),Bash(echo),Bash(echo:*),Bash(grep),Bash(head),Bash(ls),Bash(mkdir:*),Bash(pwd),Bash(safeoutputs:*),Bash(sort),Bash(tail),Bash(tee:*),Bash(uniq),Bash(wc),Bash(yq),BashOutput,Edit,ExitPlanMode,Glob,Grep,KillBash,LS,MultiEdit,NotebookEdit,NotebookRead,Read,Task,TodoWrite,Write,mcp__github__download_workflow_run_artifact,mcp__github__get_code_scanning_alert,mcp__github__get_commit,mcp__github__get_dependabot_alert,mcp__github__get_discussion,mcp__github__get_discussion_comments,mcp__github__get_file_contents,mcp__github__get_job_logs,mcp__github__get_label,mcp__github__get_latest_release,mcp__github__get_me,mcp__github__get_notification_details,mcp__github__get_pull_request,mcp__github__get_pull_request_comments,mcp__github__get_pull_request_diff,mcp__github__get_pull_request_files,mcp__github__get_pull_request_review_comments,mcp__github__get_pull_request_reviews,mcp__github__get_pull_request_status,mcp__github__get_release_by_tag,mcp__github__get_secret_scanning_alert,mcp__github__get_tag,mcp__github__get_workflow_run,mcp__github__get_workflow_run_logs,mcp__github__get_workflow_run_usage,mcp__github__issue_read,mcp__github__list_branches,mcp__github__list_code_scanning_alerts,mcp__github__list_commits,mcp__github__list_dependabot_alerts,mcp__github__list_discussion_categories,mcp__github__list_discussions,mcp__github__list_issue_types,mcp__github__list_issues,mcp__github__list_label,mcp__github__list_notifications,mcp__github__list_pull_requests,mcp__github__list_releases,mcp__github__list_secret_scanning_alerts,mcp__github__list_starre
d_repositories,mcp__github__list_tags,mcp__github__list_workflow_jobs,mcp__github__list_workflow_run_artifacts,mcp__github__list_workflow_runs,mcp__github__list_workflows,mcp__github__pull_request_read,mcp__github__search_code,mcp__github__search_issues,mcp__github__search_orgs,mcp__github__search_pull_requests,mcp__github__search_repositories,mcp__github__search_users,mcp__safeoutputs'\'' --debug-file /tmp/gh-aw/agent-stdio.log --verbose --permission-mode acceptEdits --output-format stream-json "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/agent-stdio.log | ||
| -- /bin/bash -c 'export PATH="${RUNNER_TEMP}/gh-aw/mcp-cli/bin:$PATH" && export PATH="$(find /opt/hostedtoolcache /home/runner/work/_tool -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && GH_AW_NODE_EXEC="${GH_AW_NODE_BIN:-}"; if [ -z "$GH_AW_NODE_EXEC" ] || [ ! -x "$GH_AW_NODE_EXEC" ]; then GH_AW_NODE_EXEC="$(command -v node 2>/dev/null || echo node)"; fi; "$GH_AW_NODE_EXEC" ${RUNNER_TEMP}/gh-aw/actions/claude_harness.cjs claude --print --no-chrome --mcp-config "${{ runner.temp }}/gh-aw/mcp-config/mcp-servers.json" --allowed-tools '\''Bash(cat),Bash(cat:*),Bash(date),Bash(date:*),Bash(echo),Bash(echo:*),Bash(grep),Bash(head),Bash(ls),Bash(mkdir:*),Bash(pwd),Bash(safeoutputs:*),Bash(sort),Bash(tail),Bash(tee:*),Bash(uniq),Bash(wc),Bash(yq),BashOutput,Edit,ExitPlanMode,Glob,Grep,KillBash,LS,MultiEdit,NotebookEdit,NotebookRead,Read,Task,TodoWrite,Write,mcp__github__download_workflow_run_artifact,mcp__github__get_code_scanning_alert,mcp__github__get_commit,mcp__github__get_dependabot_alert,mcp__github__get_discussion,mcp__github__get_discussion_comments,mcp__github__get_file_contents,mcp__github__get_job_logs,mcp__github__get_label,mcp__github__get_latest_release,mcp__github__get_me,mcp__github__get_notification_details,mcp__github__get_pull_request,mcp__github__get_pull_request_comments,mcp__github__get_pull_request_diff,mcp__github__get_pull_request_files,mcp__github__get_pull_request_review_comments,mcp__github__get_pull_request_reviews,mcp__github__get_pull_request_status,mcp__github__get_release_by_tag,mcp__github__get_secret_scanning_alert,mcp__github__get_tag,mcp__github__get_workflow_run,mcp__github__get_workflow_run_logs,mcp__github__get_workflow_run_usage,mcp__github__issue_read,mcp__github__list_branches,mcp__github__list_code_scanning_alerts,mcp__github__list_commits,mcp__github__list_dependabot_alerts,mcp__github__list_discussion_categories,mcp__github__list_discussi
ons,mcp__github__list_issue_types,mcp__github__list_issues,mcp__github__list_label,mcp__github__list_notifications,mcp__github__list_pull_requests,mcp__github__list_releases,mcp__github__list_secret_scanning_alerts,mcp__github__list_starred_repositories,mcp__github__list_tags,mcp__github__list_workflow_jobs,mcp__github__list_workflow_run_artifacts,mcp__github__list_workflow_runs,mcp__github__list_workflows,mcp__github__pull_request_read,mcp__github__search_code,mcp__github__search_issues,mcp__github__search_orgs,mcp__github__search_pull_requests,mcp__github__search_repositories,mcp__github__search_users,mcp__safeoutputs'\'' --debug-file /tmp/gh-aw/agent-stdio.log --verbose --permission-mode acceptEdits --output-format stream-json --prompt-file /tmp/gh-aw/aw-prompts/prompt.txt${GH_AW_MODEL_AGENT_CLAUDE:+ --model "$GH_AW_MODEL_AGENT_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/agent-stdio.log |
🤖 Smoke test review comment #2 — Consistent harness wrapping across workflows. The claude_harness.cjs node invocation properly handles retry logic for 429/529 errors. Verified in Run 25230880894.
The Claude agent driver had no retry logic: a transient overload (529) or rate-limit (429) from the Anthropic API caused immediate job failure, with no attempt to resume.
Changes
actions/setup/js/awf_reflect.cjs (new, shared)
AWF API proxy /reflect helpers extracted into a shared module consumed by all harnesses:
- Constants: AWF_API_PROXY_REFLECT_URL, AWF_REFLECT_OUTPUT_PATH, AWF_REFLECT_TIMEOUT_MS, AWF_MODELS_URL_TIMEOUT_MS, GEMINI_MODEL_NAME_PREFIX
- Functions: extractModelIds, fetchModelsFromUrl, enrichReflectModels, fetchAWFReflect
actions/setup/js/claude_harness.cjs (new)
Node.js harness wrapping Claude Code CLI, mirroring copilot_harness.cjs:
- Detects overloaded_error (HTTP 529) and rate_limit_error (HTTP 429) in output
- Retries with --continue so Claude resumes from on-disk session state, with no prompt re-injection; the prompt is omitted on --continue runs since Claude Code doesn't re-accept it
- Uses awf_reflect.cjs for AWF API proxy reflection
actions/setup/js/copilot_harness.cjs
- Now requires from awf_reflect.cjs (no behaviour change)
pkg/workflow/claude_engine.go
- Implements HarnessProvider via GetHarnessScriptName() string { return "claude_harness.cjs" }
- GetExecutionSteps() now wraps the command as node claude_harness.cjs claude … and passes --prompt-file /tmp/gh-aw/aw-prompts/prompt.txt (harness resolves file → last positional arg) instead of inline "$(cat …)" shell expansion
Custom engine.harness in workflow frontmatter overrides the built-in harness, consistent with how Copilot handles it.
Tests
- Existing assertions on "$(cat …)" now assert --prompt-file
- New Go tests: TestClaudeEngineGetHarnessScriptName, TestClaudeEngineHarnessUsesPromptFile, TestClaudeEngineCustomHarnessOverridesBuiltIn
- New awf_reflect.test.cjs with 23 tests covering all exported constants and functions in the shared module
✨ PR Review Safe Output Test - Run 25230880894
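As a rough illustration of the detect-and-resume behaviour summarised in this description, the sketch below classifies output as transient and builds the argument list for a --continue retry. The helper names are hypothetical, the sketch assumes the prompt is the last positional argument, and it leaves out how the harness decides that an attempt made enough progress to be resumed.

```javascript
// Error type strings named in the PR description.
const TRANSIENT_ERROR_MARKERS = ["overloaded_error", "rate_limit_error"];

/** True when captured CLI output mentions a retryable Anthropic API error. */
function isTransientApiError(output) {
  return TRANSIENT_ERROR_MARKERS.some(marker => output.includes(marker));
}

/**
 * Build the argv for a resume attempt. The trailing prompt is dropped because
 * the on-disk session already contains it, and --continue is added once.
 * `hasPrompt` states whether the last element of `initialArgs` is the prompt.
 */
function buildContinueArgs(initialArgs, hasPrompt) {
  const base = hasPrompt ? initialArgs.slice(0, -1) : [...initialArgs];
  return base.includes("--continue") ? base : [...base, "--continue"];
}

// Example: a first attempt with a prompt, followed by a resume attempt.
const firstAttempt = ["--print", "--output-format", "stream-json", "summarise the repo"];
console.log(buildContinueArgs(firstAttempt, true).join(" "));
```

Matching on the error type strings rather than on exit codes is a simplification in this sketch; a real harness may need both signals.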