fix: ensure /tmp/gh-aw exists before activation job scripts run #19040

Merged
pelikhan merged 2 commits into main from copilot/fix-metrics-collector-error on Mar 1, 2026

Copilot AI commented Mar 1, 2026

The Metrics Collector activation job was crashing with ENOENT: no such file or directory, open '/tmp/gh-aw/aw_info.json' because setup.sh never created /tmp/gh-aw/ — only /opt/gh-aw/* paths were guaranteed. The generate_aw_info.cjs module has a mkdirSync guard, but that's a secondary safeguard; the directory should exist before any script touches it.

Changes

  • actions/setup/setup.sh: Add mkdir -p /tmp/gh-aw immediately after the main destination directory is created, so the path is guaranteed for all jobs using ./actions/setup — including the activation job where create_gh_aw_tmp_dir.sh is never called (it runs only in the agent job).
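
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not code from this PR: it uses a throwaway directory from `mktemp -d` as a stand-in for `/tmp`, and `try_write` is a hypothetical helper introduced only for illustration. It shows that a write into a missing parent directory fails, and that it succeeds once `mkdir -p` has run (which is also safe to repeat).

```shell
#!/usr/bin/env bash
# Reproduces the missing-parent-directory failure in a scratch location.
# Nothing here touches the real /tmp/gh-aw.
set -u

# Hypothetical helper: attempt to write a small JSON file and report the result.
try_write() {
  local target="${1:-}"
  if ( printf '{}\n' > "$target" ) 2>/dev/null; then
    echo "ok"
  else
    echo "failed"
  fi
}

scratch="$(mktemp -d)"                  # stand-in for /tmp
target="${scratch}/gh-aw/aw_info.json"  # parent directory does not exist yet

before="$(try_write "$target")"         # parent missing, so the write fails

mkdir -p "${scratch}/gh-aw"             # the fix: create the directory first
mkdir -p "${scratch}/gh-aw"             # repeating it is harmless

after="$(try_write "$target")"          # parent exists, so the write succeeds

echo "before mkdir: ${before}"
echo "after mkdir:  ${after}"

rm -rf "$scratch"                       # clean up the scratch directory
```

Because `mkdir -p` does not fail when the directory already exists, placing it in the shared setup script should not interfere with jobs that create the directory again later.
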
Original prompt

This section details the original issue you should resolve

<issue_title>[workflow-health] Metrics Collector failing — ENOENT: /tmp/gh-aw/aw_info.json</issue_title>
<issue_description>The Metrics Collector workflow has begun failing in the activation job with an unexpected infrastructure error. This is a regression from its previously stable state (5+ consecutive successes as of 2026-02-25).

Error Details

Workflow: Metrics Collector - Infrastructure Agent
Run: §22526204448 (run #73, 2026-02-28T18:13)
Job: activation
Error:

```
Error: ENOENT: no such file or directory, open '/tmp/gh-aw/aw_info.json'
    at Object.writeFileSync (node:fs:2415:20)
    at eval (eval at callAsyncFunction ...
```

### Root Cause Analysis

The `activation` job fails when `generateWorkflowOverview(core)` tries to write `/tmp/gh-aw/aw_info.json`. This file is created by the `gh-aw setup` action as part of initialization. The error indicates `/tmp/gh-aw/` directory did not exist when the activation job ran.

Possible causes:
1. **Transient runner failure**: The setup step may have failed silently and the directory was never created
2. **Setup action regression**: A recent change to the setup action may have broken directory creation
3. **Race condition**: The setup step may complete but not have created the expected path

### Impact

- Metrics Collector is our infrastructure agent — it provides daily metrics for other meta-orchestrators (Campaign Manager, Agent Performance Analyzer, Workflow Health Manager)
- Failure of this workflow degrades shared memory quality and trend analysis
- Without fresh metrics, health monitoring relies on stale data

### Suggested Fix

1. Check recent changes to `actions/setup` and the `generate_workflow_overview.cjs` script
2. Verify the setup action properly creates `/tmp/gh-aw/` before the activation job runs
3. Add explicit directory creation guard (`mkdir -p /tmp/gh-aw`) in setup if missing

### Debugging

```
/agent agentic-workflows debug the agentic workflow metrics-collector failure in https://github.com/github/gh-aw/actions/runs/22526204448
```

Detected by: Workflow Health Manager, run §22538460268

Generated by Workflow Health Manager - Meta-Orchestrator

  • expires on Mar 2, 2026, 7:28 AM UTC

Comments on the Issue (you are @copilot in this section)


Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix metrics collector failing due to missing aw_info.json" to "fix: ensure /tmp/gh-aw exists before activation job scripts run" on Mar 1, 2026
pelikhan marked this pull request as ready for review March 1, 2026 12:16
Copilot AI review requested due to automatic review settings March 1, 2026 12:16
pelikhan merged commit 85bcb76 into main Mar 1, 2026
1 check passed
pelikhan deleted the copilot/fix-metrics-collector-error branch March 1, 2026 12:17

Copilot AI left a comment

Pull request overview

Ensures the Metrics Collector activation job doesn’t crash with ENOENT by guaranteeing /tmp/gh-aw exists before any activation scripts attempt to write into it.

Changes:

  • Create /tmp/gh-aw during actions/setup/setup.sh initialization.
  • Add log output indicating the temp directory was created.
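
One way to confirm the effect of this change is a sanity check that runs after setup. The sketch below is an assumption for illustration only and is not part of this PR; `check_writable_dir` is a hypothetical name. It verifies that a directory exists and is writable before later scripts try to write into it.

```shell
#!/usr/bin/env bash
# Hypothetical post-setup check; not part of the PR.
set -u

# Succeeds when the given path is an existing, writable directory.
check_writable_dir() {
  local dir="${1:-}"
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "ok: ${dir} exists and is writable"
  else
    echo "error: ${dir} is missing or not writable" >&2
    return 1
  fi
}

# Demonstration against a scratch directory rather than the real /tmp/gh-aw.
demo_dir="$(mktemp -d)"
check_writable_dir "$demo_dir"
```

In a workflow step this could be called as `check_writable_dir /tmp/gh-aw || exit 1`, so that a missing directory fails fast with a clear message rather than surfacing later as ENOENT.
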

echo "Created directory: ${DESTINATION}"

# Create /tmp/gh-aw directory so it exists before any activation scripts run
mkdir -p /tmp/gh-aw

Copilot AI Mar 1, 2026

This script already uses create_dir for directory creation (likely centralizing error handling, permissions, logging, and consistent behavior). For consistency and to avoid bypassing any safeguards in create_dir, consider using create_dir "/tmp/gh-aw" here (and quoting the path) instead of calling mkdir -p directly.

Suggested change
- mkdir -p /tmp/gh-aw
+ create_dir "/tmp/gh-aw"
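
The review comment only guesses at what `create_dir` does, and its definition is not shown in this conversation. The following is a hypothetical sketch of what such a helper might look like, assuming it wraps `mkdir -p` with argument validation, error reporting, and the same "Created directory:" log line seen in the diff. The real helper in `setup.sh` may differ.

```shell
#!/usr/bin/env bash
# Hypothetical create_dir helper; the actual one in actions/setup/setup.sh
# is not shown in this PR and may behave differently.
set -u

create_dir() {
  local dir="${1:-}"
  if [ -z "$dir" ]; then
    echo "create_dir: missing directory argument" >&2
    return 1
  fi
  # mkdir -p creates missing parents and succeeds if the directory exists.
  if ! mkdir -p "$dir"; then
    echo "create_dir: failed to create ${dir}" >&2
    return 1
  fi
  echo "Created directory: ${dir}"
}

# Demonstration against a scratch path rather than the real /tmp/gh-aw.
demo_root="$(mktemp -d)"
create_dir "${demo_root}/gh-aw"
```

If the real helper behaves like this sketch, `create_dir "/tmp/gh-aw"` would give the new directory the same logging and failure handling as the other directories the script creates.
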

Development

Successfully merging this pull request may close these issues.

[workflow-health] Metrics Collector failing — ENOENT: /tmp/gh-aw/aw_info.json

3 participants