[smoke-detector] 🔍 Smoke Test Investigation - Smoke Codex Run #71: Agent Output Artifact Missing (Recurring) #2887

@github-actions

🔍 Smoke Test Investigation - Run #71

Summary

The Smoke Codex workflow failed because the create_issue job could not find the expected agent_output.json file. The agent job completed successfully in 1.7 minutes, but did not create the safe-outputs artifact that downstream jobs expected. This is a recurring pattern that has occurred at least 4 times for Codex, with similar issues affecting GenAIScript and OpenCode engines.

Failure Details

  • Run: 18981163567
  • Run Number: 71
  • Commit: cd115b7
  • Branch: main
  • Trigger: schedule
  • Duration: 3.0 minutes
  • Failed Jobs: create_issue (8s)
  • Workflow: Smoke Codex

Root Cause Analysis

Primary Error

Error reading agent output file: ENOENT: no such file or directory, open '/tmp/gh-aw/safeoutputs/agent_output.json'

Error Chain

1. Agent Job Succeeds (1.7 minutes)

  • ✅ Agent job completed successfully
  • ✅ No errors reported during execution
  • ❌ But no safe-outputs file was created

2. create_issue Job Fails (8 seconds)

Error reading agent output file: ENOENT: no such file or directory, open '/tmp/gh-aw/safeoutputs/agent_output.json'

Why Did This Happen?

This failure suggests that the agent completed its task but never wrote any output through the safe-outputs MCP tools, so the file that create_issue tries to read did not exist. Possible reasons:

  1. Agent didn't use safe-outputs tools: The Codex agent may have completed without calling the safe_outputs_create_issue tool
  2. Safe-outputs file not created: Even if the agent intended to use safe-outputs, the file /tmp/gh-aw/safeoutputs/outputs.jsonl was never written
  3. Staged mode behavior: In staged mode, the agent may behave differently regarding output file creation
  4. MCP server issue: The safe-outputs MCP server may not have been properly initialized or available
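
Checking which files exist at the end of the agent job helps narrow these possibilities down. The following is a minimal shell sketch, assuming the directory layout from the paths quoted above. The function name `classify_safe_outputs` and its three result labels are hypothetical and not part of gh-aw.

```shell
# Hypothetical helper: report which safe-outputs files exist (and are non-empty)
# in a directory. The default directory is the one named in the error message.
classify_safe_outputs() {
  local dir="${1:-/tmp/gh-aw/safeoutputs}"
  if [ -s "$dir/agent_output.json" ]; then
    echo "artifact_present"    # the file create_issue tried to open exists
  elif [ -s "$dir/outputs.jsonl" ]; then
    echo "raw_outputs_only"    # agent wrote output, but agent_output.json is absent
  else
    echo "no_outputs"          # nothing was written (reasons 1, 2 or 4 above)
  fi
}
```

A result of `no_outputs` points at the agent or the MCP server, while `raw_outputs_only` would point at the step that turns the raw output into the artifact.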

Failed Jobs and Errors

Job Sequence

  1. pre_activation - succeeded (6s)
  2. activation - succeeded (3s)
  3. agent - succeeded (1.7m) - No artifact created
  4. detection - succeeded (21s)
  5. create_issue - failed (8s) - ENOENT error
  6. ⏭️ missing_tool - skipped

Key Observations

  • Agent job succeeded with 1.7m runtime
  • Detection job succeeded
  • Only create_issue job failed
  • Workflow ran in staged mode (GH_AW_SAFE_OUTPUTS_STAGED=true)
  • Error count was 1 (from the create_issue failure)

Historical Context

This is a well-documented recurring pattern.

Pattern Classification

  • Pattern ID: CODEX_AGENT_NO_ARTIFACT_STAGED_MODE
  • Category: Workflow Configuration - Staged Mode Issue
  • Severity: Medium
  • First Seen: 2025-10-27
  • Occurrence Count: 4 occurrences for Codex

Previous Codex Occurrences

| Date | Run ID | Issue | Status |
| --- | --- | --- | --- |
| 2025-10-31 | 18981163567 | This investigation | New |
| 2025-10-29 | 18892865991 | Cached | Pattern identified |
| 2025-10-28 | 18890591960 | Cached | Pattern identified |
| 2025-10-27 | 18840299097 | #2604 | Closed |

Related Issues (Other Engines)

This pattern affects multiple AI engines (Codex, GenAIScript, OpenCode), suggesting a systemic issue with safe-outputs MCP tool usage in staged mode.

Investigation Findings

Environment Configuration

GH_AW_SAFE_OUTPUTS_STAGED=true
GH_AW_WORKFLOW_NAME=Smoke Codex
GH_AW_AGENT_OUTPUT=/tmp/gh-aw/safeoutputs/agent_output.json

Staged Mode Expectations

In staged mode, the workflow should:

  • ✅ Run the agent normally
  • ✅ Generate output to preview
  • ✅ Not actually create GitHub issues (only preview)
  • ❌ Still upload artifacts for preview (this failed)

Artifact Details

  • Expected source path: /tmp/gh-aw/safeoutputs/outputs.jsonl
  • Artifact name: agent_output.json
  • Result: File not found, no artifact created

Recommended Actions

🔴 High Priority

  • Make create_issue job conditional on artifact existence

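    create_issue:
      needs: [agent, detection]
      # hashFiles() is not available in a job-level `if`, and /tmp files from the
      # agent job do not exist on this job's runner. Gate on a job output instead;
      # `has_output` is an assumed output that the agent job would have to expose.
      if: needs.agent.outputs.has_output == 'true'
      runs-on: ubuntu-latest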

    Impact: Prevents cascading failures when agent doesn't create output

  • Make artifact upload conditional

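    - name: Upload Safe Outputs
      uses: actions/upload-artifact@v4
      with:
        name: agent_output.json
        path: /tmp/gh-aw/safeoutputs/outputs.jsonl
        # hashFiles() only matches files under GITHUB_WORKSPACE, so it cannot
        # test a /tmp path; tell the action to skip a missing file quietly.
        if-no-files-found: ignore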

    Impact: Eliminates artifact upload warnings

  • Add validation step before artifact operations

    - name: Validate Safe Outputs Created
      id: validate_outputs  # an id is required so later steps can read has_output
      run: |
        if [ -f "/tmp/gh-aw/safeoutputs/outputs.jsonl" ]; then
          echo "✓ outputs.jsonl created"
          echo "has_output=true" >> "$GITHUB_OUTPUT"
        else
          echo "⚠️ outputs.jsonl not found - agent may not have used safe-outputs tools"
          echo "has_output=false" >> "$GITHUB_OUTPUT"
        fi

🟡 Medium Priority

  • Enhance agent prompt for staged mode

    • Make it explicit that safe-outputs tools MUST be used
    • Clarify that output file is required even in preview mode
    • Add examples of proper tool usage
  • Investigate Codex MCP integration

    • Verify safe-outputs MCP server initialization
    • Check if tools are available to the agent
    • Review agent logs for MCP-related warnings
  • Add detection job enhancement

    • Detection job should identify missing output files
    • Warn when agent completes without using required tools
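
The detection enhancement could start as a simple log scan. The sketch below is illustrative only: it assumes the agent log is a plain-text file and that safe-outputs tool names share the `safe_outputs_` prefix seen in `safe_outputs_create_issue`. Both function names are hypothetical.

```shell
# Hypothetical detection check: count log lines that mention a safe-outputs tool.
# The log location and the "safe_outputs_" prefix are assumptions.
count_safe_output_calls() {
  local log_file="$1"
  if [ ! -r "$log_file" ]; then
    echo 0
    return 0
  fi
  # grep -c prints 0 but exits non-zero when nothing matches; ignore that status
  grep -c 'safe_outputs_' "$log_file" || true
}

# Emit a GitHub Actions warning annotation when no tool call is found.
warn_if_no_safe_outputs() {
  local calls
  calls="$(count_safe_output_calls "$1")"
  if [ "$calls" -eq 0 ]; then
    echo "::warning::agent finished without calling any safe-outputs tool"
    return 1
  fi
  echo "agent log mentions safe-outputs tools on $calls line(s)"
}
```

Returning a non-zero status from `warn_if_no_safe_outputs` lets the calling step decide whether a missing tool call should only warn or should fail the job.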

🟢 Low Priority

  • Compare with successful Codex runs

    • Identify differences in agent behavior
    • Check if certain prompts trigger the issue
    • Analyze patterns in successful vs failed runs
  • Document expected behavior

    • Clarify staged mode requirements
    • Define success criteria for artifact creation
    • Add troubleshooting guide

Prevention Strategies

  1. Graceful Degradation

    • Don't fail workflow if artifacts are optional
    • Distinguish between "agent failed" vs "agent succeeded without output"
    • Add conditional logic throughout artifact handling
  2. Explicit Requirements

    • Make prompts very clear about tool usage
    • Add validation that verifies expected outputs
    • Fail fast if required outputs are truly missing (vs optional)
  3. Better Error Handling

    • Provide context-aware error messages
    • Include troubleshooting steps
    • Link to documentation about safe-outputs
  4. Monitoring and Tracking

    • Track artifact creation success rates by engine
    • Alert on patterns of missing artifacts
    • Monitor staged mode workflow health metrics
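
For the tracking item, a per-engine success rate can be computed from very simple records. This sketch assumes a hypothetical record format of one `<engine> <ok|missing>` pair per line; neither the format nor the function name comes from the existing pattern database.

```shell
# Hypothetical tracker: read "<engine> <ok|missing>" records on stdin and print
# "<engine> <ok_count>/<total>" for each engine, sorted by engine name.
artifact_success_by_engine() {
  awk '
    NF >= 2 { total[$1]++; if ($2 == "ok") ok[$1]++ }
    END { for (e in total) printf "%s %d/%d\n", e, ok[e], total[e] }
  ' | sort
}
```

It could be fed from whatever store ends up holding run results, for example `artifact_success_by_engine < records.txt`, where `records.txt` is a placeholder file name.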

Related PR Assessment

PR #2886: "Fix inconsistent formatting of github-workflow.json during build"

  • Related to failure: No
  • Merged at: 2025-10-31T16:53:15Z (before this run)
  • Assessment: The PR fixed schema formatting issues with prettier. This is unrelated to safe-outputs or agent behavior. The failure is a pre-existing recurring pattern.

Analysis: Why This Pattern Persists

After analyzing the 4 recorded Codex occurrences and similar issues across other engines:

  1. No Conditional Logic: Downstream jobs always run regardless of artifact creation
  2. Agent Autonomy: Agents may complete tasks without using safe-outputs tools
  3. Staged Mode Ambiguity: Unclear whether agents MUST create outputs in preview mode
  4. Missing Validation: No checks before artifact upload or download operations
  5. Cross-Engine Recurrence: Each engine hits this independently, which suggests a systemic issue rather than an engine-specific one
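
The missing conditional logic can be stated as a small decision function. This is a sketch under stated assumptions rather than existing gh-aw behavior: the function name, its arguments, and the four outcome labels are all hypothetical.

```shell
# Hypothetical gate for a downstream job. Arguments:
#   $1  agent job result ("success" or anything else)
#   $2  "true" if the safe-outputs file exists, otherwise "false"
#   $3  value of GH_AW_SAFE_OUTPUTS_STAGED ("true" means staged mode)
# Prints one of: fail, skip, preview, create.
decide_create_issue() {
  local agent_result="$1" has_output="$2" staged="${3:-false}"
  if [ "$agent_result" != "success" ]; then
    echo "fail"       # the agent itself failed: surface that failure
  elif [ "$has_output" != "true" ]; then
    echo "skip"       # agent succeeded without output: avoid the ENOENT cascade
  elif [ "$staged" = "true" ]; then
    echo "preview"    # staged mode: show what would be created
  else
    echo "create"
  fi
}
```

With the conditions of this run (agent succeeded, no output file, staged mode), `decide_create_issue success false true` prints `skip` instead of letting create_issue fail.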

Technical Details

Workflow Run Metrics

  • Turns: 1
  • Error Count: 1 (from create_issue failure)
  • Warning Count: 0
  • Total Duration: 3.0 minutes
  • Agent Duration: 1.7 minutes

Environment Variables

GH_AW_SAFE_OUTPUTS_STAGED=true
GH_AW_WORKFLOW_NAME=Smoke Codex
GH_AW_AGENT_OUTPUT=/tmp/gh-aw/safeoutputs/agent_output.json
GH_AW_SAFE_OUTPUTS=/tmp/gh-aw/safeoutputs/outputs.jsonl

Immediate Next Steps

Given this is the 4th recorded occurrence of this pattern for Codex:

  1. Implement conditional checks for create_issue job (high priority fix)
  2. Make artifact operations conditional on file existence
  3. Enhance agent prompts to be more explicit about requirements
  4. Monitor next 5 runs after implementing fixes
  5. Consider systemwide solution since multiple engines are affected

Investigation Metadata:

  • Investigator: Smoke Detector (automated investigator)
  • Investigation Run: 18981242914
  • Investigation Record: /tmp/gh-aw/cache-memory/investigations/2025-10-31-18981163567.json
  • Pattern Database: /tmp/gh-aw/cache-memory/patterns/codex_no_artifact_staged.json
  • Similar Patterns: GENAISCRIPT_NO_SAFE_OUTPUTS, OPENCODE_NO_SAFE_OUTPUTS

Labels: smoke-test, investigation, codex, safe-outputs, staged-mode, artifact-missing, recurring

AI generated by Smoke Detector - Smoke Test Failure Investigator
