[aw-failures] 6h Failure Report: 3 claude-engine runs failed (2× no safe output, 1× cache memory blocked) #30101

@github-actions

Executive Summary

In the 6 hours ending 2026-05-04T07:41Z, 3 of 29 runs failed (10.3% failure rate), all three on the claude engine. Two distinct failure clusters were identified. No firewall-blocked domains were involved in the Step Name Alignment or Schema Consistency Checker failures; Multi-Device Docs Tester had 2 firewall blocks.

Failure Clusters

| # | Workflow | Run | Turns | Cost | Root Cause |
|---|----------|-----|-------|------|------------|
| 1 | Schema Consistency Checker | §25304247413 | 61 | $1.91 | Agent exited without calling any safeoutputs tool |
| 2 | Multi-Device Docs Tester | §25304397726 | 81 | $1.93 | Agent exited without calling any safeoutputs tool; multiple permission blocks (playwright, /tmp writes, settings.json) |
| 3 | Step Name Alignment | §25301975942 | 31 | $1.10 | Cache memory file blocked by security policy (/tmp/gh-aw/cache-memory/ not in allowed dirs); yq glob commands blocked; exhausted max turns |

Evidence

Cluster A: Missing Safe Output (Schema Consistency Checker + Multi-Device Docs Tester)

Both workflows completed their full analysis work but never called noop, create_discussion, or any other safeoutputs tool. This caused:

  1. output_types output → null (empty)
  2. detection job skipped (condition: output_types != '' → false)
  3. safe_outputs job skipped (condition: detection.result == 'success' → false)
  4. Workflow conclusion: failure

Key log lines from the detection job's condition evaluation:

Evaluating: (always() && (needs.agent.result != 'skipped') && (((needs.agent.outputs.output_types != '') || (needs.agent.outputs.has_patch == 'true'))))
Expanded: (true && ('failure' != 'skipped') && ((null != '') || ('false' == 'true')))
Result: false
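
The skip chain above can be reproduced with a small Python sketch. It mirrors only the logged detection condition and the safe_outputs condition quoted in the list; the function names are hypothetical, and treating null as equal to '' is an assumption chosen to match the logged result of (null != '') being false. Mapping a detection job that runs straight to 'success' is also a simplification for illustration.

```python
from typing import Optional


def detection_should_run(agent_result: str,
                         output_types: Optional[str],
                         has_patch: str) -> bool:
    """Mirror the logged `if:` condition of the detection job."""
    # Assumption: null and '' compare as equal, matching the log line
    # where (null != '') contributed to a false result.
    has_output = (output_types or "") != ""
    # always() is true by definition, so it is omitted here.
    return agent_result != "skipped" and (has_output or has_patch == "true")


def safe_outputs_should_run(detection_result: str) -> bool:
    """Mirror the safe_outputs condition: detection.result == 'success'."""
    return detection_result == "success"


# Values from the failed runs: agent result 'failure', no output types, no patch.
runs_detection = detection_should_run("failure", None, "false")
# Simplification: a detection job that does not run reports 'skipped'.
detection_result = "success" if runs_detection else "skipped"
print(runs_detection, safe_outputs_should_run(detection_result))
```

With a single safeoutputs call (for example noop), output_types would be non-empty and the first condition would evaluate to true, so the downstream jobs would no longer be skipped for this reason.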

For Multi-Device Docs Tester, the agent also encountered:

  • "Output redirection to '/tmp/preview.log' was blocked" — dev server log writes
  • "Contains simple_expansion" — bash security policy block
  • "Execute skill: playwright-cli" — playwright skill blocked
  • "Claude requested permissions to write to /home/runner/.claude/settings.json" — settings write blocked
  • "Unhandled node type: string/array" — parser errors in tool results

Despite 81 turns of browser testing attempts (curl checks returned 200 for 8 pages), the agent never called noop before exiting.

Cluster B: Cache Memory Access Blocked (Step Name Alignment)

The agent tried to read its cache from /tmp/gh-aw/cache-memory/step-name-alignment/patterns.json but Claude Code's security policy restricts file access to /home/runner/work/gh-aw/gh-aw:

"cat in '/tmp/gh-aw/cache-memory/step-name-alignment/patterns.json' was blocked.
For security, Claude Code may only concatenate files from the allowed working directories
for this session: '/home/runner/work/gh-aw/gh-aw'."

Additionally, yq commands using glob patterns (.github/workflows/*.lock.yml) were blocked as "multiple operations requiring approval". The agent correctly called missing_data(data_type=cache_memory, reason=cache_memory_miss) but then hit error_max_turns after 31 turns.

The audit reported: "evidence": "plan tool_call tool_call tool_call tool_call tool_call tool_call tool_call tool_call tool_call tool_call tool_call tool_call tool_call error finish" confirming the max-turns exhaustion pattern.
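
The directory restriction behind Cluster B can be illustrated with a minimal path-containment check. This is a sketch of the general technique only, not Claude Code's actual implementation; the function name is hypothetical, and the check is purely lexical (it does not resolve symlinks).

```python
import posixpath
from typing import Iterable


def is_within_allowed(path: str, allowed_dirs: Iterable[str]) -> bool:
    """Return True if `path` lies inside any directory in `allowed_dirs`."""
    target = posixpath.normpath(path)
    for allowed in allowed_dirs:
        root = posixpath.normpath(allowed)
        try:
            # The common prefix equals the root only when target is inside it.
            if posixpath.commonpath([target, root]) == root:
                return True
        except ValueError:
            # Raised when absolute and relative paths are mixed; treat as outside.
            continue
    return False


allowed = ["/home/runner/work/gh-aw/gh-aw"]
cache_file = "/tmp/gh-aw/cache-memory/step-name-alignment/patterns.json"
print(is_within_allowed(cache_file, allowed))
print(is_within_allowed(cache_file, allowed + ["/tmp/gh-aw/cache-memory"]))
```

Under this model the cache file is rejected with the single allowed directory from the error message and accepted once /tmp/gh-aw/cache-memory is added, which corresponds to the first of the two options in the P2 fix.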

Existing Issue Correlation

Note: GitHub issue read API is returning 403 in this context; could not verify existing open issues. Sub-issues should be inspected for duplicates before acting.

Proposed Fix Roadmap

| Priority | Issue | Action |
|----------|-------|--------|
| P1 | Schema Consistency Checker never calls safeoutputs | Workflow definition needs explicit safeoutputs call in all exit paths |
| P1 | Multi-Device Docs Tester never calls safeoutputs + permission failures | Add noop call; expand allowed file paths in settings.json for /tmp writes |
| P2 | Step Name Alignment cache memory blocked | Grant /tmp/gh-aw/cache-memory/ read access or rewrite cache lookup to use allowed path |
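
The P1 idea of an explicit safeoutputs call on every exit path can be sketched as a fallback guard. Everything here is hypothetical: `OutputRecorder` and `run_with_fallback` are illustrative stand-ins, not the gh-aw or safeoutputs API, and in the real workflows the guarantee would more likely live in the workflow prompt or definition than in Python.

```python
from typing import Callable, Dict, List


class OutputRecorder:
    """Hypothetical stand-in for a safe-outputs channel; not the gh-aw API."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, str]] = []

    def emit(self, kind: str, message: str) -> None:
        self.calls.append({"type": kind, "message": message})


def run_with_fallback(task: Callable[[OutputRecorder], None],
                      recorder: OutputRecorder) -> None:
    """Run `task`, guaranteeing at least one recorded output on any exit path."""
    try:
        task(recorder)
    finally:
        # Runs on normal return and on exceptions alike.
        if not recorder.calls:
            recorder.emit("noop", "Analysis finished without a reportable result.")


def silent_task(recorder: OutputRecorder) -> None:
    """Simulates an agent that finishes its analysis but reports nothing."""
    return None


recorder = OutputRecorder()
run_with_fallback(silent_task, recorder)
print(recorder.calls)
```

The guard only adds a noop when nothing else was emitted, so a task that already reported a result is left unchanged.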

Generated by [aw] Failure Investigator (6h) · ● 545.5K

  • expires on May 11, 2026, 7:51 AM UTC
