
fix(nighthawk): prevent pkill self-termination via variable indirection #82

Merged
studert merged 4 commits into main from copilot/aw-fix-nightly-issue-implementer
May 5, 2026

Copilot AI commented May 5, 2026

The Nightly Issue Implementer was killing itself with exit code 143 (SIGTERM) on every run that reached the cleanup step. Root cause: gh-aw passes the full prompt as a CLI argument to the claude process — so pkill -f "next start" matched the agent's own /proc/PID/cmdline (which contained the literal string next start from the prompt instructions).
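
The self-match described above can be sketched without killing anything. pkill -f tests its pattern, an extended regular expression, against each process's full command line, so the sketch below models that check with grep -E on plain strings. Both command lines and the agent-cli name are hypothetical stand-ins, not the real gh-aw or claude invocation.

```shell
# cmdline_matches PATTERN CMDLINE
# Models the test pkill -f applies: PATTERN is an extended regular
# expression checked against a process's full command line.
cmdline_matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Hypothetical command lines, invented for illustration only.
server_cmdline='node /app/node_modules/.bin/next start -p 3000'
agent_cmdline='agent-cli --prompt Step 8: stop the server with pkill -f "next start"'

for cmd in "$server_cmdline" "$agent_cmdline"; do
  if cmdline_matches 'next start' "$cmd"; then
    echo "MATCH:    $cmd"
  else
    echo "no match: $cmd"
  fi
done
```

Both lines are reported as matches: the server because it really is running next start, and the agent only because its prompt argument quotes the same words.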

Changes

  • Shell variable indirection for all server-stop commands — replaces the literal pattern with a variable so it never appears verbatim in the prompt text:

    _SRV=start
    pkill -TERM -f "next $_SRV" 2>/dev/null || true
    sleep 2
    pkill -KILL -f "next $_SRV" 2>/dev/null || true

    At runtime bash expands $_SRV to start, so pkill still receives the pattern next start. The literal string next start, however, never appears in the prompt, so pkill no longer matches the agent process.

  • Callout updated (Step 8 bash gateway constraints) — explains the self-match mechanism explicitly and mandates the variable indirection pattern going forward.

  • Timeout reference corrected — Step 10 referenced timeout-minutes: 60; changed to timeout-minutes: 45 to match the actual workflow frontmatter.
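
A minimal sketch of why the indirection in the first bullet works, assuming the prompt text ends up verbatim on the agent's command line. The agent-cli name and its --prompt flag are hypothetical; grep -E stands in for the extended-regex match that pkill -f performs.

```shell
_SRV=start

# Single quotes: the text exactly as it appears in the prompt, unexpanded.
prompt_text='pkill -TERM -f "next $_SRV"'

# Double quotes: what the shell hands to pkill after expanding $_SRV.
runtime_pattern="next $_SRV"

# Hypothetical agent command line that embeds the prompt text verbatim.
agent_cmdline="agent-cli --prompt '$prompt_text'"

echo "pattern passed to pkill: $runtime_pattern"
if printf '%s\n' "$agent_cmdline" | grep -Eq -- "$runtime_pattern"; then
  echo "agent would be matched"
else
  echo "agent is not matched"
fi
```

The agent's command line contains next $_SRV rather than next start, so the expanded pattern finds nothing to match there, while a real Next.js server process would still be found.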

No lock.yml changes needed — the prompt is loaded at runtime via {{#runtime-import .github/workflows/nighthawk.md}}.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/unic/ai-developer-hub/actions/runs/25317528503
    • Triggering command: /usr/bin/gh gh api repos/{owner}/{repo}/actions/runs/25317528503 --jq {databaseId: .id, number: .run_number, url: .html_url, status: .status, conclusion: .conclusion, workflowName: .name, workflowPath: .path, createdAt: .created_at, startedAt: .run_started_at, updatedAt: .updated_at, event: .event, headBranch: .head_branch, ) if 'safeoutputs' in str(data).lower() or 'create_pull_request' in str(data).lower() or 'add_comment' in str(data).lower(): print(json.dumps(data)[:300]) -j DROP (http block)
  • https://api.github.com/repos/unic/ai-developer-hub/actions/workflows
    • Triggering command: /usr/bin/gh gh workflow list --all --json id,name,path,state (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Copilot AI linked an issue May 5, 2026 that may be closed by this pull request

vercel Bot commented May 5, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project          | Deployment | Actions          | Updated (UTC)
ai-developer-hub | Ready      | Preview, Comment | May 5, 2026 10:45am

CLAassistant commented May 5, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ studert
❌ Copilot

Root cause: the full prompt text is passed as a CLI argument to the
claude process via $(cat prompt.txt). The literal string 'next start'
appeared in pkill examples, so `pkill -f "next start"` matched the
agent's own /proc/PID/cmdline and killed it (exit code 143).

Fix: replace all pkill patterns with variable indirection (_SRV=start;
pkill -f "next $_SRV") so the literal match string never appears
verbatim in the prompt, preventing self-termination.

Also: fix timeout-minutes reference in Step 10 from 60 to 45 to match
the actual workflow frontmatter value.

Agent-Logs-Url: https://github.com/unic/ai-developer-hub/sessions/323869b7-9634-4d0a-a182-3a928a304974

Co-authored-by: studert <1264796+studert@users.noreply.github.com>
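
The exit code 143 cited in the commit message follows the common shell convention of 128 plus the signal number, and SIGTERM is signal 15. A small sketch that reproduces the status with a throwaway sleep child (nothing here touches the real workflow):

```shell
# Start a disposable child process and terminate it with SIGTERM.
sleep 30 &
child=$!
kill -TERM "$child"

# Capture the child's status without tripping `set -e` if it is enabled.
status=0
wait "$child" || status=$?

echo "exit status:   $status"
echo "signal number: $((status - 128))"
echo "signal name:   $(kill -l "$status")"
```

Seeing 143 in a CI log is therefore a hint that something sent SIGTERM to the process, which is what pointed at pkill in this case.
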
Copilot AI changed the title from "[WIP] Fix nightly issue implementer engine failure" to "fix(nighthawk): prevent pkill self-termination via variable indirection" May 5, 2026
Copilot finished work on behalf of studert May 5, 2026 10:14
Copilot AI requested a review from studert May 5, 2026 10:14
@studert studert marked this pull request as ready for review May 5, 2026 10:28
Copilot AI review requested due to automatic review settings May 5, 2026 10:28
Copilot finished work on behalf of studert May 5, 2026 10:30

Copilot AI left a comment

Pull request overview

This PR updates the Nighthawk workflow prompt documentation to prevent the agent from self-terminating during cleanup when stopping the local Next.js server. It addresses a real failure mode where pkill -f could match the agent process itself because the workflow prompt is passed as part of the agent process command line.

Changes:

  • Replaces literal pkill -f "next start" guidance with a variable-indirection pattern (_SRV=start, pkill ... "next $_SRV") to avoid self-matching.
  • Expands the Step 8 bash gateway callout to explicitly document the self-termination mechanism and the required mitigation pattern.
  • Corrects the iteration timeout reference from timeout-minutes: 60 to timeout-minutes: 45 to match the workflow frontmatter.

Comment thread .github/workflows/nighthawk.md Outdated
Comment thread .github/workflows/nighthawk.md Outdated
studert and others added 2 commits May 5, 2026 12:42
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@studert studert merged commit 6f96368 into main May 5, 2026
5 of 7 checks passed

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[aw] Nightly Issue Implementer failed

4 participants