[aw-failures] [aw] Auto-Triage Issues: agent recurrently aborts with permission-denied on /tmp/gh-aw/agent script writes

### Summary

The `Auto-Triage Issues` workflow has produced two consecutive failed runs in the last hour. In both, the agent attempted to write and execute a Python processing script in `/tmp/gh-aw/agent/`, hit `Permission denied and could not request permission from user`, and fell back to a `noop` safe output. **No labels are applied** on these runs, so the workflow's purpose is silently lost.

Prior tracking issue [#33422](https://github.com/github/gh-aw/issues/33422) was auto-closed 8 minutes after the next failure of the same family occurred. The symptom is not resolved; only the tracking issue rotated.

### Affected runs (last 6h)

| Run ID | Workflow | Duration | Engine output |
| --- | --- | --- | --- |
| [§26165337753](https://github.com/github/gh-aw/actions/runs/26165337753) | Auto-Triage Issues | 8.2m | Agent attempted `mkdir -p /tmp/gh-aw/agent && cat > /tmp/gh-aw/agent/process_issues.py <<'PY' ...` → permission denied → graceful noop |
| [§26135471197](https://github.com/github/gh-aw/actions/runs/26135471197) | Auto-Triage Issues | 33s | Agent attempted `mkdir -p /tmp/gh-aw/agent` → permission denied → engine terminated (tracked at #33422) |

### Root cause

The agent's allow-list for `Auto-Triage Issues` is restricted to `bash: ["jq *", "cat *"]` (see `.github/workflows/auto-triage-issues.md:35-37`), which the harness expands to a small set of read-only shell verbs plus `safeoutputs:*`. **Missing** from that set are `mkdir`, `python3`, and the ability to spawn arbitrary scripts.
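
To make the denial concrete, here is a minimal sketch of allow-list matching. It assumes the harness compares the whole command line against each glob pattern; the real matching rules (including how compound commands and redirections are handled) are not shown in this report, and `is_allowed` is a hypothetical helper, not a gh-aw API.

```python
from fnmatch import fnmatchcase

# Patterns quoted in this report. The matching semantics are an assumption.
ALLOWED_BASH = ["jq *", "cat *"]


def is_allowed(command: str, patterns=ALLOWED_BASH) -> bool:
    """Return True if the whole command line matches at least one glob pattern."""
    return any(fnmatchcase(command.strip(), pattern) for pattern in patterns)


# Commands taken from the affected runs, plus one the allow-list does cover.
for cmd in [
    "jq '.[].number' /tmp/gh-aw/agent/unlabeled-issues.json",
    "mkdir -p /tmp/gh-aw/agent",
    "python3 /tmp/gh-aw/agent/process_issues.py",
]:
    verdict = "allow" if is_allowed(cmd) else "deny"
    print(f"{verdict:5}  {cmd}")
```

Under that assumption, anything starting with `mkdir` or `python3` never matches, which is consistent with both affected runs being blocked at their first scripting attempt.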

The prompt body encourages the agent to "Fetch ALL open issues without any labels ... do not limit to a fixed count" while the pre-step only fetches the first 30 (`auto-triage-issues.md:44`). The agent therefore reaches for a Python script to paginate and process — which the sandbox correctly denies.

### Audit-diff evidence

<details><summary>audit-diff vs prior failure (26135471197 → 26165337753)</summary>

- Firewall: 0 new/removed domains, 0 anomalies — no network drift.
- Same engine (`copilot`, `gpt-5-mini`), same workflow, same failure surface.
- Token usage delta is purely due to how far the agent got before being blocked.

</details>

### Proposed remediation (pick one)

1. **Tighten the prompt (recommended, smallest blast radius):** explicitly instruct the agent to use only `gh`, `jq`, and `safeoutputs add_labels` — and to operate purely on the pre-staged `/tmp/gh-aw/agent/unlabeled-issues.json`. Remove the "fetch ALL" wording or rework it to read pages from the pre-staged file.
2. **Make the pre-step authoritative:** expand the `Fetch unlabeled issues` step to paginate fully (and respect `per_page`), so the agent never needs to reach for scripting to scale up.
3. **(Avoid unless 1+2 fail)** Add `shell(mkdir)` and a constrained Python tool to the allow-list. This widens the sandbox and should be a last resort.
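
For option 2, the pagination loop the pre-step would need can be sketched as follows. This is illustrative only: it would run in the pre-step, outside the agent sandbox, and `fetch_page` is a hypothetical callable standing in for whatever API call the `Fetch unlabeled issues` step actually makes. The fake data source exists only so the sketch runs on its own.

```python
from typing import Callable, Dict, List


def fetch_all_unlabeled(
    fetch_page: Callable[[int, int], List[Dict]],
    per_page: int = 100,
    max_pages: int = 50,
) -> List[Dict]:
    """Collect every page until a short page signals the end of the results."""
    issues: List[Dict] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page, per_page)
        issues.extend(batch)
        if len(batch) < per_page:  # short (or empty) page: nothing left
            break
    return issues


# Stand-in data source (hypothetical): 230 unlabeled issues.
FAKE_ISSUES = [{"number": n, "labels": []} for n in range(1, 231)]


def fake_fetch_page(page: int, per_page: int) -> List[Dict]:
    start = (page - 1) * per_page
    return FAKE_ISSUES[start:start + per_page]


all_issues = fetch_all_unlabeled(fake_fetch_page)
print(f"fetched {len(all_issues)} issues")
```

The `max_pages` cap is a design choice to keep the pre-step bounded; the result would then be written to the pre-staged `/tmp/gh-aw/agent/unlabeled-issues.json` so the agent only needs `jq` and `cat`.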

### Success criteria

- Two consecutive scheduled runs of `Auto-Triage Issues` complete with at least one `add_labels` safe-output call (or a justified `noop` where no unlabeled issues exist).
- No `Permission denied and could not request permission from user` in `agent-stdio.log`.
- Auto-tracker issues for `Auto-Triage Issues` stop being re-opened within 6h of being closed.
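
The first two criteria could be checked mechanically along these lines. This is a sketch under stated assumptions: the run record shape (`stdio_log`, `safe_outputs`, `unlabeled_count`) is hypothetical and does not reflect the actual layout of `agent-stdio.log` or `agent_output.json`. The third criterion (tracker re-opens) is not covered.

```python
from typing import Dict, List

DENIED_MARKER = "Permission denied and could not request permission from user"


def run_meets_criteria(stdio_log: str, safe_outputs: List[str], unlabeled_count: int) -> bool:
    """One run passes if it never hit the denial and either labeled something
    or emitted a noop while there was nothing to label."""
    if DENIED_MARKER in stdio_log:
        return False
    if "add_labels" in safe_outputs:
        return True
    return "noop" in safe_outputs and unlabeled_count == 0


def last_two_pass(runs: List[Dict]) -> bool:
    """True only if the two most recent runs both meet the criteria."""
    recent = runs[-2:]
    return len(recent) == 2 and all(run_meets_criteria(**run) for run in recent)


# Hypothetical records, oldest first.
runs = [
    {"stdio_log": DENIED_MARKER, "safe_outputs": ["noop"], "unlabeled_count": 12},
    {"stdio_log": "", "safe_outputs": ["add_labels"], "unlabeled_count": 3},
    {"stdio_log": "", "safe_outputs": ["noop"], "unlabeled_count": 0},
]
print(last_two_pass(runs))
```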

### Cross-references

- Parent tracker: #32523
- Prior tracking issue (same root cause, closed): #33422
- Workflow file: `.github/workflows/auto-triage-issues.md`

**References:**
- [§26165337753](https://github.com/github/gh-aw/actions/runs/26165337753)
- [§26135471197](https://github.com/github/gh-aw/actions/runs/26135471197)

Related to #32523

> Generated by [🔍 [aw] Failure Investigator (6h)](https://github.com/github/gh-aw/actions/runs/26166254790) · ● 12.2M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Faw-failure-investigator%22&type=issues)
> - [x] expires on May 27, 2026, 1:49 PM UTC

---

### Closing as stale — root cause no longer reproduces

A 6-hour failure investigation (window ending 2026-05-20T19:35Z) found that the symptom this issue tracks — `Auto-Triage Issues` agent aborting with **`Permission denied`** on `/tmp/gh-aw/agent` script writes — is **no longer present** in current failed runs.

#### Evidence from recent Auto-Triage Issues failures

| Run ID | Agent outcome | Permission-denied seen? | Where the job actually fails |
|---|---|---|---|
| [§26184375104](https://github.com/github/gh-aw/actions/runs/26184375104) | Graceful `noop` after `gh search` returned HTTP 403 | No | Post-agent step *Parse MCP Gateway logs for step summary* — `ERR_SYSTEM: rpc-messages.jsonl is present but zero bytes` |
| [§26181990934](https://github.com/github/gh-aw/actions/runs/26181990934) | `add_labels` succeeded (`["enhancement","workflows"]` applied to #33609) | No | Same MCP telemetry post-step |

<details><summary>Why the original symptom is gone</summary>

- No `mkdir`, `python3`, or `process_issues.py` attempts appear in either `agent-stdio.log`.
- `agent_output.json` for run 26181990934 shows a real `add_labels` action (not a permission-denied fallback).
- The remediation noted in the original report ("tighten the prompt to use only `gh`, `jq`, and `safeoutputs add_labels` on the pre-staged file") appears to have landed.

</details>

#### A different failure now blocks Auto-Triage Issues

The MCP telemetry capture failure that now fails the `agent` job is a **new, separate** regression affecting multiple workflows (Auto-Triage Issues, Contribution Check). It is being tracked in a fresh investigation report — see the parent issue created in this run.

**References:**
- [§26184375104](https://github.com/github/gh-aw/actions/runs/26184375104)
- [§26181990934](https://github.com/github/gh-aw/actions/runs/26181990934)

> Generated by [🔍 [aw] Failure Investigator (6h)](https://github.com/github/gh-aw/actions/runs/26185286524) · ● 27.2M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Faw-failure-investigator%22&type=issues)
