Executive Summary
Two failure clusters identified in the 6-hour lookback window (2026-05-07 ~07:22–13:18 UTC). One is a recurring P0 infrastructure failure already tracked in #30150 (still unresolved). The other is a new timeout root-cause finding for the NLP Analysis workflow.
| Cluster | Workflow | Failure Mode | Run ID | Severity | Existing Tracking |
|---|---|---|---|---|---|
| A | Daily News | Node.js not found in AWF chroot; agent never starts | §25487352199 | P0 | #30150 (open, 3d) |
| B | Copilot PR Conversation NLP Analysis | 20-min agent job timeout; pip install exhausts budget | §25491428718 | P1 | #30815 (auto-alert only) |
Failure Clusters
Cluster A — Daily News: Node.js Not Found in AWF Chroot (P0, Recurring)
Pattern: 5/5 runs in the last 30 days fail with an identical pre-agent error. The agent never starts and records zero turns.
Root cause: The Copilot CLI requires Node.js at runtime. The AWF chroot environment does not have node on its PATH for this workflow. No setup-node step (or equivalent) is executed before the agent, and the runner's Node.js installation path is not bind-mounted into the chroot.
Error log (identical across all 5 failing runs)
[entrypoint][ERROR] Copilot CLI requires Node.js, but 'node' is not available inside AWF chroot.
[entrypoint][ERROR] Ensure Node.js is installed on the runner and reachable from PATH inside the chroot.
[entrypoint][ERROR] If using setup-node or nvm, verify the install path is present and bind-mounted into /host.
[entrypoint][ERROR] Example locations include /opt/hostedtoolcache/... and /home/runner/.nvm/...
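The check the error log describes (is `node` reachable from PATH?) can be reproduced in a pre-agent step. The following is a minimal sketch, not the AWF entrypoint's actual logic: the function names `find_node` and `preflight` are hypothetical, and the sketch only tests whether a `node` executable is reachable on a given PATH value using the standard library's `shutil.which`.

```python
import shutil


def find_node(path_env=None):
    """Return the resolved path of a `node` executable, or None if absent.

    path_env is a PATH-style string; when None, shutil.which falls back to
    the PATH of the current process. (Hypothetical helper for illustration.)
    """
    return shutil.which("node", path=path_env)


def preflight(path_env=None):
    """Return (ok, message) describing whether node is reachable."""
    node = find_node(path_env)
    if node is None:
        return False, "node is not reachable on PATH"
    return True, f"node found at {node}"
```

Running `preflight()` inside the same environment the agent will use (rather than on the host) would surface the missing bind mount before the agent job starts, which is the point at which all five runs currently fail.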
Run history (last 30 days)
| Run ID | Date | Conclusion | Turns | Error |
|---|---|---|---|---|
| §25487352199 | 2026-05-07 09:22Z | failure | 0 | Node.js not found |
| §25367975009 | 2026-05-05 09:16Z | failure | 0 | Node.js not found |
| §25311165057 | 2026-05-04 09:20Z | failure | 0 | Node.js not found |
| §25297165746 | 2026-05-04 01:49Z | failure | 0 | Node.js not found |
| §25100801300 | 2026-04-29 09:18Z | failure | 0 | Node.js not found |
Existing tracking: Issue #30150 (open since 2026-05-04) — this failure is fully covered there.
Cluster B — Copilot PR Conversation NLP Analysis: 20-Minute Timeout (P1)
Pattern: The agent job repeatedly hits the 20-minute GitHub Actions job timeout because the agent runs pip install pandas matplotlib seaborn textblob wordcloud scikit-learn at runtime. Installation time varies between runs and regularly exceeds the budget.
Root cause: Heavy Python data-science dependencies are not pre-installed on the runner or cached. The agent must install them from scratch on each run, consuming 10–18 minutes just on pip operations and leaving insufficient time for the actual NLP analysis.
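To make the budget pressure concrete, the arithmetic can be written out. This is an illustrative calculation using only the figures stated above (a 20-minute timeout and 10–18 minutes of pip time); the constant and function names are hypothetical.

```python
JOB_TIMEOUT_MIN = 20       # agent job timeout stated in this report
PIP_MIN, PIP_MAX = 10, 18  # range of pip install time stated in this report


def remaining_budget(timeout_min, spent_min):
    """Minutes left for analysis after installation time is subtracted."""
    return timeout_min - spent_min


best_case = remaining_budget(JOB_TIMEOUT_MIN, PIP_MIN)
worst_case = remaining_budget(JOB_TIMEOUT_MIN, PIP_MAX)
print(f"Time left for analysis: {worst_case}-{best_case} of {JOB_TIMEOUT_MIN} minutes")
```

In the slowest case the agent has about 2 minutes left for the actual NLP analysis, and even in the fastest case half the budget is spent before any analysis begins.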
Evidence from run §25491428718 (2026-05-07):
- 33 turns, 33m02s before timeout kill
- Job concluded: ##[error]The action 'Execute GitHub Copilot CLI' has timed out after 20 minutes.
- Agent reached pip install stage, attempted multiple fallback approaches
- Same timeout pattern seen in run §25104405737 (2026-04-29)
Agent pip install sequence (run 25491428718)
● Install Python libraries (shell)
pip install pandas matplotlib seaborn scikit-learn textblob wordcloud numpy -q 2>&1 | tail -5
● Read shell output Waiting up to 60 seconds...
● Read shell output Waiting up to 90 seconds...
● Read shell output Waiting up to 120 seconds...
● Read shell output Waiting up to 120 seconds...
● Check available libraries (shell)
● Install core libraries without deps (shell)
● Install numpy and dependencies (shell)
● Install remaining dependencies (shell)
● Fix packaging and check environment (shell)
Total: ~15–18 minutes on pip operations alone
One successful run (25314812268, 2026-05-04, 13m45s) completed — packages may have been cached from a prior pip install on that specific runner.
Existing tracking: #30815 (auto-alert only, no root-cause analysis)
Proposed Fix Roadmap
| Priority | Issue | Fix |
|---|---|---|
| P0 | Daily News: Node.js chroot | See #30150 for remediation plan |
| P1 | NLP Analysis: pip install timeout | Pre-install Python deps in a pre-agent step; see sub-issue #aw_nlp1 |
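One way a pre-agent step might approach the P1 fix is to detect which dependencies are already importable and install only the rest. The sketch below is an assumption about how such a step could look, not the remediation chosen in #aw_nlp1: the `REQUIRED` mapping, `missing_packages`, and `pip_command` are hypothetical names, and the package list is taken from the agent's pip command in run 25491428718. Note that the pip distribution `scikit-learn` is imported as `sklearn`.

```python
import importlib.util
import sys

# pip distribution name -> top-level import name (list taken from the
# agent's own pip command; the mapping itself is an illustrative assumption)
REQUIRED = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "textblob": "textblob",
    "wordcloud": "wordcloud",
    "numpy": "numpy",
}


def missing_packages(required=REQUIRED):
    """Return distribution names whose import module cannot be found."""
    return [
        dist
        for dist, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


def pip_command(packages):
    """Build the pip invocation for the given packages (empty list if none)."""
    if not packages:
        return []
    return [sys.executable, "-m", "pip", "install", "--quiet", *packages]
```

A pre-agent step could pass `pip_command(missing_packages())` to `subprocess.run` when the list is non-empty, so that installation time is spent outside the agent's 20-minute budget and skipped entirely on runners where the packages are already present.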
Sub-Issues Created
- #aw_nlp1 — NLP Analysis: Pre-install Python dependencies in pre-agent step to fix 20-min timeout
Generated by [aw] Failure Investigator (6h) · ● 459.2K