fix(nighthawk): prevent pkill self-termination via variable indirection #82
Root cause: the full prompt text is passed as a CLI argument to the claude process via $(cat prompt.txt). The literal string 'next start' appeared in pkill examples, so `pkill -f "next start"` matched the agent's own /proc/PID/cmdline and killed it (exit code 143).

Fix: replace all pkill patterns with variable indirection (_SRV=start; pkill -f "next $_SRV") so the literal match string never appears verbatim in the prompt, preventing self-termination.

Also: fix timeout-minutes reference in Step 10 from 60 to 45 to match the actual workflow frontmatter value.

Agent-Logs-Url: https://github.com/unic/ai-developer-hub/sessions/323869b7-9634-4d0a-a182-3a928a304974
Co-authored-by: studert <1264796+studert@users.noreply.github.com>
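For context on the exit code in the commit message: a shell reports a child killed by a signal as 128 plus the signal number, and SIGTERM is signal 15, which gives 143. The following is a minimal sketch of that arithmetic, using a throwaway `sleep` process as a stand-in for the agent; it is an illustration only and not taken from the workflow.

```shell
# Start a throwaway background process to stand in for the agent.
sleep 30 &
pid=$!

# Send SIGTERM (signal 15), the default signal for kill and pkill.
kill -TERM "$pid"

# wait returns 128 + signal number for a child terminated by a signal.
status=0
wait "$pid" || status=$?

echo "exit status: $status"
```

Seeing 143 in a job log is therefore a hint that something sent SIGTERM to the process, rather than the process failing on its own.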
Pull request overview
This PR updates the Nighthawk workflow prompt documentation to prevent the agent from self-terminating during cleanup when stopping the local Next.js server. It addresses a real failure mode where pkill -f could match the agent process itself because the workflow prompt is passed as part of the agent process command line.
Changes:
- Replaces literal `pkill -f "next start"` guidance with a variable-indirection pattern (`_SRV=start`, `pkill ... "next $_SRV"`) to avoid self-matching.
- Expands the Step 8 bash gateway callout to explicitly document the self-termination mechanism and the required mitigation pattern.
- Corrects the iteration timeout reference from `timeout-minutes: 60` to `timeout-minutes: 45` to match the workflow frontmatter.
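The self-matching described above can be sketched without killing anything. The two command-line strings below are hypothetical stand-ins for what the agent's `/proc/PID/cmdline` might contain, and `matches` is a helper introduced only for this illustration. It uses a fixed-string `grep` as an approximation of the `pkill -f` match, which should behave the same for a pattern with no regex metacharacters.

```shell
# Simulated command lines (hypothetical strings, not captured from a real run).
# The agent's cmdline includes the prompt text, so whatever the prompt says
# about stopping the server is visible to pkill -f.
literal_cmdline='claude Stop the server with: pkill -f "next start"'
indirect_cmdline='claude Stop the server with: _SRV=start; pkill -f "next $_SRV"'

# The pattern pkill would actually receive once bash expands the variable.
_SRV=start
pattern="next $_SRV"

# Fixed-string search standing in for the pkill -f match (an approximation).
matches() {
  printf '%s' "$1" | grep -qF -- "$2"
}

if matches "$literal_cmdline" "$pattern"; then literal_hit=yes; else literal_hit=no; fi
if matches "$indirect_cmdline" "$pattern"; then indirect_hit=yes; else indirect_hit=no; fi

echo "literal prompt matched:  $literal_hit"
echo "indirect prompt matched: $indirect_hit"
```

Only the first string contains the expanded pattern, which is why the literal guidance could hit the agent process while the indirected form does not.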
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The Nightly Issue Implementer was killing itself with exit code 143 (SIGTERM) on every run that reached the cleanup step. Root cause: `gh-aw` passes the full prompt as a CLI argument to the `claude` process, so `pkill -f "next start"` matched the agent's own `/proc/PID/cmdline` (which contained the literal string `next start` from the prompt instructions).

Changes

Shell variable indirection for all server-stop commands. Replaces the literal pattern with a variable (`_SRV=start; pkill -f "next $_SRV"`) so it never appears verbatim in the prompt text.
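A minimal sketch of how the pattern might look inside a cleanup step. The `_SRV=start` assignment and the `pkill -f "next $_SRV"` call come from the commit message; the `stop_server` wrapper and the `|| true` guard are assumptions added for illustration. The function is only defined here, not called.

```shell
# Keep the two halves of the match string apart so the joined phrase
# never appears verbatim in this text.
_SRV=start

# Hypothetical wrapper; the PR itself uses the two commands inline.
stop_server() {
  # pkill exits non-zero when nothing matched; "|| true" (an assumption,
  # not from the PR) keeps a cleanup step from failing in that case.
  pkill -f "next $_SRV" || true
}

# Show the pattern that pkill would receive after expansion.
printf 'pattern: %s\n' "next $_SRV"
```

Note that the comments in the sketch also avoid spelling out the joined phrase, since any verbatim occurrence in the prompt would reintroduce the self-match.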
At runtime bash expands `$_SRV` → `start`, but the string `next start` is absent from the prompt, so `pkill` no longer matches the agent process.

Callout updated (Step 8 bash gateway constraints): explains the self-match mechanism explicitly and mandates the variable indirection pattern going forward.

Timeout reference corrected: Step 10 referenced `timeout-minutes: 60`; changed to `timeout-minutes: 45` to match the actual workflow frontmatter.

No lock.yml changes needed: the prompt is loaded at runtime via `{{#runtime-import .github/workflows/nighthawk.md}}`.

Warning
Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

- `https://api.github.com/repos/unic/ai-developer-hub/actions/runs/25317528503`: `/usr/bin/gh gh api repos/{owner}/{repo}/actions/runs/25317528503 --jq {databaseId: .id, number: .run_number, url: .html_url, status: .status, conclusion: .conclusion, workflowName: .name, workflowPath: .path, createdAt: .created_at, startedAt: .run_started_at, updatedAt: .updated_at, event: .event, headBranch: .head_branch, ) if 'safeoutputs' in str(data).lower() or 'create_pull_request' in str(data).lower() or 'add_comment' in str(data).lower(): print(json.dumps(data)[:300]) -j DROP` (http block)
- `https://api.github.com/repos/unic/ai-developer-hub/actions/workflows`: `/usr/bin/gh gh workflow list --all --json id,name,path,state` (http block)

If you need me to access, download, or install something from one of these locations, you can either: