Description
What happens
When a workflow run leaves /tmp/gh-aw/mcp-logs/ owned by a different UID (e.g., root from a container execution), subsequent runs fail because mkdir -p /tmp/gh-aw/mcp-logs/safeoutputs or file writes into that directory hit EACCES. The failure can manifest in two ways depending on which step hits the bad permissions first:
- Subdirectory creation fails at the mkdir -p step, blocking the workflow
- File logging is silently disabled: mcp_server_core.cjs:75 and :104 swallow write failures, so MCP logs and artifacts are lost without any error
The first case blocks the run. The second case degrades it — the workflow completes but MCP telemetry and artifact collection are silently skipped.
What should happen
The startup scripts should repair ownership/permissions on /tmp/gh-aw/mcp-logs/ for the runtime user before use, the same way install_copilot_cli.sh already does for /home/runner/.copilot. File write failures in the MCP server should be surfaced, not swallowed.
Where in the code
All references are to main at 2d91393f3.
Directory creation (no ownership repair):
- start_mcp_gateway.sh:31-32: mkdir -p /tmp/gh-aw/mcp-logs and mkdir -p /tmp/gh-aw/mcp-config with no chmod/chown
- mcp_setup_generator.go:196: generates mkdir -p /tmp/gh-aw/mcp-logs/safeoutputs in workflow YAML, no ownership fix
- mcp_setup_generator.go:472: same for /tmp/gh-aw/mcp-logs/playwright
File writes that fail silently on bad permissions:
- safe_outputs_mcp_server.cjs:34: safe-outputs server writes logs
- mcp_server_core.cjs:75: swallows file-write failures
- mcp_server_core.cjs:104: same pattern, swallows write failures
Environment wiring:
- compiler_activation_jobs.go:916: sets GH_AW_MCP_LOG_DIR=/tmp/gh-aw/mcp-logs/safeoutputs
The fix pattern that already exists:
- install_copilot_cli.sh:25: repairs ownership before Copilot CLI install (commit ada84f04f, PR #13980, "fix: ensure /home/runner/.copilot directory has correct ownership before Copilot CLI install")
Evidence
Production:
- EACCES on /tmp/gh-aw/mcp-logs/rpc-messages.jsonl blocked workflow run (v0.50.7)
- Run 22497663869: workflow failed at MCP log access
Source-level verification (2026-03-01, main at 2d91393f3):
- Confirmed no chmod, chown, or cleanup logic exists for /tmp/gh-aw/mcp-logs/ in any startup script
- Confirmed mcp_server_core.cjs:75 and :104 swallow write failures
- Confirmed the ownership repair pattern is already applied for Copilot at install_copilot_cli.sh:25
Proposed fix
Repair ownership/permissions for the runtime user on /tmp/gh-aw/mcp-logs/ after mkdir -p in start_mcp_gateway.sh, following the same pattern as install_copilot_cli.sh:25 (which does this for /home/runner/.copilot).
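A minimal sketch of what that repair could look like, for illustration only. The helper name ensure_writable_dir is hypothetical, and the use of passwordless sudo is an assumption about the runner environment; the exact command used at install_copilot_cli.sh:25 is not reproduced here and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the existing scripts): make sure a directory
# exists and is owned and writable by the current runtime user.
ensure_writable_dir() {
  local dir="$1"

  # Create the directory; fall back to sudo if a parent is not writable.
  # Assumption: the runner allows non-interactive sudo.
  mkdir -p "$dir" 2>/dev/null || sudo -n mkdir -p "$dir"

  # A previous run (e.g. a container running as root) may have left the
  # directory owned by another UID. -O tests ownership by the effective UID.
  if [ ! -O "$dir" ] || [ ! -w "$dir" ]; then
    sudo -n chown -R "$(id -u):$(id -g)" "$dir"
  fi

  # Ensure the owner can read, write, and traverse everything underneath.
  chmod -R u+rwX "$dir"
}

# Example call, as it might appear after the existing mkdir -p lines:
#   ensure_writable_dir /tmp/gh-aw/mcp-logs
```

This sketch only checks the top-level directory before deciding to chown, so a tree that is correctly owned at the top but contains root-owned files deeper down would need an unconditional chown -R instead.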
Additionally, mcp_server_core.cjs:75 and :104 should surface file-write failures rather than swallowing them — at minimum log an error so operators can diagnose permission issues.
Impact
Frequency: Intermittent — depends on whether prior runs left stale directories with mismatched ownership. More likely in repos with high workflow concurrency or container-based engines.
Cost: High when it hits — either blocks the run at subdirectory creation, or silently disables MCP logging/artifact collection. Debugging requires manual inspection of Actions runner state because the swallowed write failures produce no error output.
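Until the swallowed failures are surfaced, a diagnostic step along these lines could reduce the manual inspection cost. This is a sketch under assumptions: report_foreign_owned is a hypothetical helper name, and it simply lists entries under a directory that are not owned by the current UID.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper: list entries under a directory that are
# not owned by the current user, with numeric UID/GID.
report_foreign_owned() {
  local dir="$1"

  if [ ! -d "$dir" ]; then
    echo "missing: $dir"
    return 0
  fi

  # Prints nothing when every entry is owned by the current UID.
  find "$dir" ! -uid "$(id -u)" -exec ls -ldn {} + 2>/dev/null
}

# Example call:
#   report_foreign_owned /tmp/gh-aw/mcp-logs
```

Any output from this step on a fresh run would point to stale state left by an earlier run under a different UID.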