
[aw-failures] Failure Report 2026-05-07 (6h window ~07:22–13:18 UTC): Daily News Node.js Chroot Recurrence + NLP Analysis Pip In [Content truncated due to length] #30830

@github-actions

Executive Summary

Two failure clusters were identified in the 6-hour lookback window (2026-05-07 ~07:22–13:18 UTC). Cluster A is a recurring P0 infrastructure failure already tracked in #30150 and still unresolved. Cluster B is a new root-cause finding for the timeout in the NLP Analysis workflow.

| Cluster | Workflow | Failure Mode | Run ID | Severity | Existing Tracking |
| --- | --- | --- | --- | --- | --- |
| A | Daily News | Node.js not found in AWF chroot; agent never starts | §25487352199 | P0 | #30150 (open, 3d) |
| B | Copilot PR Conversation NLP Analysis | 20-min agent job timeout; pip install exhausts budget | §25491428718 | P1 | #30815 (auto-alert only) |

Failure Clusters

Cluster A — Daily News: Node.js Not Found in AWF Chroot (P0, Recurring)

Pattern: All 5 runs in the last 30 days failed with an identical pre-agent error. The agent never starts and records zero turns.

Root cause: The Copilot CLI requires Node.js at runtime. The AWF chroot environment does not have node on its PATH for this workflow. No setup-node step (or equivalent) is executed before the agent, and the runner's Node.js installation path is not bind-mounted into the chroot.

Error log (identical across all 5 failing runs)
[entrypoint][ERROR] Copilot CLI requires Node.js, but 'node' is not available inside AWF chroot.
[entrypoint][ERROR] Ensure Node.js is installed on the runner and reachable from PATH inside the chroot.
[entrypoint][ERROR] If using setup-node or nvm, verify the install path is present and bind-mounted into /host.
[entrypoint][ERROR] Example locations include /opt/hostedtoolcache/... and /home/runner/.nvm/...
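The check that the entrypoint error describes can be approximated with a short diagnostic. This is a minimal sketch, not the entrypoint's actual logic: the function name find_node is hypothetical, and it assumes a POSIX runner where the chroot root (for example /host, as named in the log) is visible from the calling process. It does not follow symlinks whose absolute targets would resolve differently inside the chroot.

```python
import os
from pathlib import Path
from typing import Optional


def find_node(chroot_root: str, path_env: str) -> Optional[str]:
    """Return the first executable 'node' on path_env, looked up under chroot_root.

    Hypothetical helper for illustration: each PATH entry is treated as a
    path inside the chroot, so "/usr/bin" is checked at "<chroot_root>/usr/bin".
    """
    for entry in path_env.split(os.pathsep):
        if not entry:
            continue  # skip empty PATH entries
        candidate = Path(chroot_root) / entry.lstrip("/") / "node"
        if candidate.is_file() and os.access(candidate, os.X_OK):
            return str(candidate)
    return None
```

Calling find_node("/host", os.environ.get("PATH", "")) in a pre-agent step and failing when it returns None would surface the problem before the agent is launched.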
Run history (last 30 days)
| Run ID | Date | Conclusion | Turns | Error |
| --- | --- | --- | --- | --- |
| §25487352199 | 2026-05-07 09:22Z | failure | 0 | Node.js not found |
| §25367975009 | 2026-05-05 09:16Z | failure | 0 | Node.js not found |
| §25311165057 | 2026-05-04 09:20Z | failure | 0 | Node.js not found |
| §25297165746 | 2026-05-04 01:49Z | failure | 0 | Node.js not found |
| §25100801300 | 2026-04-29 09:18Z | failure | 0 | Node.js not found |

Existing tracking: Issue #30150 (open since 2026-05-04) — this failure is fully covered there.


Cluster B — Copilot PR Conversation NLP Analysis: 20-Minute Timeout (P1)

Pattern: The agent job repeatedly hits the 20-minute timeout because the agent runs pip install pandas matplotlib seaborn textblob wordcloud scikit-learn at runtime. Installation time varies from run to run and regularly exceeds the budget.

Root cause: Heavy Python data-science dependencies are not pre-installed on the runner or cached. The agent must install them from scratch on each run, consuming 10–18 minutes just on pip operations and leaving insufficient time for the actual NLP analysis.
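Whether a given runner already has these dependencies can be checked without invoking pip at all. The following is a minimal sketch under stated assumptions: the helper name missing_packages is hypothetical, and the mapping from pip distribution names to import names is written by hand (scikit-learn is imported as sklearn).

```python
from importlib.util import find_spec

# pip distribution name -> top-level import name.
# The package list comes from the pip command quoted in this report;
# the mapping itself is an assumption made for illustration.
PACKAGES = {
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "textblob": "textblob",
    "wordcloud": "wordcloud",
    "scikit-learn": "sklearn",
}


def missing_packages(packages=PACKAGES):
    """Return the distribution names whose import module cannot be found."""
    return [dist for dist, module in packages.items() if find_spec(module) is None]
```

An empty result means the interpreter can already import every package, so a runtime pip install would be unnecessary on that runner.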

Evidence from run §25491428718 (2026-05-07):

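  • 33 turns; 33m02s reported before the timeout kill (this exceeds the 20-minute limit quoted in the error below, so the two figures likely measure different spans)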
  • Job concluded: ##[error]The action 'Execute GitHub Copilot CLI' has timed out after 20 minutes.
  • The agent reached the pip install stage and attempted multiple fallback approaches
  • Same timeout pattern seen in run §25104405737 (2026-04-29)
Agent pip install sequence (run 25491428718)
● Install Python libraries (shell)
  pip install pandas matplotlib seaborn scikit-learn textblob wordcloud numpy -q 2>&1 | tail -5
● Read shell output Waiting up to 60 seconds...
● Read shell output Waiting up to 90 seconds...
● Read shell output Waiting up to 120 seconds...
● Read shell output Waiting up to 120 seconds...
● Check available libraries (shell)
● Install core libraries without deps (shell)
● Install numpy and dependencies (shell)
● Install remaining dependencies (shell)
● Fix packaging and check environment (shell)

Total: ~15–18 minutes on pip operations alone
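The budget arithmetic behind this estimate is simple enough to spell out. This is an illustrative sketch only: the 20-minute limit and the 15 to 18 minute pip range are taken from this report, while the function name remaining_budget is hypothetical.

```python
TIMEOUT_MIN = 20  # limit quoted in the timeout error message


def remaining_budget(pip_minutes, timeout=TIMEOUT_MIN):
    """Minutes left for the actual NLP analysis after pip operations."""
    return max(timeout - pip_minutes, 0)


# Low and high ends of the pip estimate for this run.
for pip_minutes in (15, 18):
    print(f"pip={pip_minutes} min -> {remaining_budget(pip_minutes)} min left")
```

With 15 to 18 minutes spent on pip, between 2 and 5 minutes remain for the analysis itself.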

One run succeeded (25314812268, 2026-05-04, 13m45s), possibly because packages were cached from a prior pip install on that specific runner.

Existing tracking: #30815 (auto-alert only, no root-cause analysis)


Proposed Fix Roadmap

| Priority | Issue | Fix |
| --- | --- | --- |
| P0 | Daily News: Node.js chroot | See #30150 for remediation plan |
| P1 | NLP Analysis: pip install timeout | Pre-install Python deps in a pre-agent step; see sub-issue #aw_nlp1 |

Sub-Issues Created

  • #aw_nlp1 — NLP Analysis: Pre-install Python dependencies in pre-agent step to fix 20-min timeout
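One way the proposed pre-agent step could look is a small script that installs the dependencies with its own bounded timeout, so a slow install fails early instead of consuming the agent's budget. This is a sketch, not the remediation chosen in the sub-issue: the function names, the 600-second default, and the decision to invoke pip through the current interpreter are all assumptions.

```python
import subprocess
import sys

# Package list taken from the pip command quoted in this report.
PACKAGES = ["pandas", "matplotlib", "seaborn", "textblob", "wordcloud", "scikit-learn"]


def build_pip_command(packages):
    """Build a pip install command that targets the running interpreter."""
    return [sys.executable, "-m", "pip", "install", "--quiet", *packages]


def preinstall(packages, timeout_s=600):
    """Install packages, raising if pip fails or exceeds timeout_s seconds.

    subprocess.run raises CalledProcessError on a non-zero exit code and
    TimeoutExpired when the timeout elapses, so the step fails loudly.
    """
    subprocess.run(build_pip_command(packages), check=True, timeout=timeout_s)
```

A pre-agent step would call preinstall(PACKAGES) once before the agent starts. Keeping the command builder separate makes it possible to log or inspect the exact command without running it.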

Generated by [aw] Failure Investigator (6h) · ● 459.2K ·

  • expires on May 14, 2026, 1:29 PM UTC
