🔍 Smoke Test Investigation - Run #71
Summary
The Smoke Codex workflow failed because the create_issue job could not find the expected agent_output.json file. The agent job completed successfully in 1.7 minutes, but did not create the safe-outputs artifact that downstream jobs expected. This is a recurring pattern that has occurred at least 4 times for Codex, with similar issues affecting GenAIScript and OpenCode engines.
Failure Details
- Run: 18981163567
- Run Number: 71
- Commit: cd115b7
- Branch: main
- Trigger: schedule
- Duration: 3.0 minutes
- Failed Jobs: create_issue (8s)
- Workflow: Smoke Codex
Root Cause Analysis
Primary Error
Error reading agent output file: ENOENT: no such file or directory,
open '/tmp/gh-aw/safeoutputs/agent_output.json'
Error Chain
1. Agent Job Succeeds (1.7 minutes)
- ✅ Agent job completed successfully
- ✅ No errors reported during execution
- ❌ But no safe-outputs file was created
2. Create_Issue Job Fails (8 seconds)
Error reading agent output file: ENOENT: no such file or directory,
open '/tmp/gh-aw/safeoutputs/agent_output.json'
Why Did This Happen?
This failure indicates the agent completed its task successfully but did not use the safe-outputs MCP tools to create the expected output file. Possible reasons:
- Agent didn't use safe-outputs tools: The Codex agent may have completed without calling the safe_outputs_create_issue tool
- Safe-outputs file not created: Even if the agent intended to use safe-outputs, the file /tmp/gh-aw/safeoutputs/outputs.jsonl was never written
- Staged mode behavior: In staged mode, the agent may behave differently regarding output file creation
- MCP server issue: The safe-outputs MCP server may not have been properly initialized or available
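The possibilities above can be narrowed down by checking which safe-outputs files actually exist on the agent runner. The following is a minimal sketch, assuming the two paths reported in this run's environment; `diagnose_safe_outputs` is a hypothetical helper for illustration and is not part of gh-aw.

```shell
# Hypothetical helper: report which safe-outputs files exist.
# Both arguments are optional and default to the paths seen in this run.
diagnose_safe_outputs() {
  dso_outputs="${1:-/tmp/gh-aw/safeoutputs/outputs.jsonl}"
  dso_agent_output="${2:-/tmp/gh-aw/safeoutputs/agent_output.json}"

  if [ ! -e "$dso_outputs" ]; then
    # Nothing was ever written: no tool call, or the MCP server was unavailable.
    echo "outputs.jsonl missing"
  elif [ ! -s "$dso_outputs" ]; then
    # The file exists but holds no entries.
    echo "outputs.jsonl empty"
  elif [ ! -e "$dso_agent_output" ]; then
    # Entries were written, but the file the create_issue job reads is absent.
    echo "agent_output.json missing"
  else
    echo "ok"
  fi
}
```

Running `diagnose_safe_outputs` with no arguments at the end of the agent job would show which file is absent before any downstream job tries to read it.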
Failed Jobs and Errors
Job Sequence
- ✅ pre_activation - succeeded (6s)
- ✅ activation - succeeded (3s)
- ✅ agent - succeeded (1.7m) - No artifact created
- ✅ detection - succeeded (21s)
- ❌ create_issue - failed (8s) - ENOENT error
- ⏭️ missing_tool - skipped
Key Observations
- Agent job succeeded with 1.7m runtime
- Detection job succeeded
- Only create_issue job failed
- Workflow ran in staged mode (GH_AW_SAFE_OUTPUTS_STAGED=true)
- Error count was 1 (from the create_issue failure)
Historical Context
This is a well-documented recurring pattern.
Pattern Classification
- Pattern ID: CODEX_AGENT_NO_ARTIFACT_STAGED_MODE
- Category: Workflow Configuration - Staged Mode Issue
- Severity: Medium
- First Seen: 2025-10-27
- Occurrence Count: 4 occurrences for Codex
Previous Codex Occurrences
| Date | Run ID | Issue | Status |
|---|---|---|---|
| 2025-10-31 | 18981163567 | This investigation | New |
| 2025-10-29 | 18892865991 | Cached | Pattern identified |
| 2025-10-28 | 18890591960 | Cached | Pattern identified |
| 2025-10-27 | 18840299097 | #2604 | Closed |
Related Issues (Other Engines)
- [smoke-detector] 🔍 Smoke Test Investigation - Smoke Codex Run #49: Agent Output Artifact Missing in Staged Mode #2604 - Codex agent output artifact missing (closed, 2025-10-27)
- [smoke-detector] 🔍 Smoke Test Investigation - Smoke GenAIScript Run #57: Agent Does Not Use Safe-Outputs MCP Tools #2307 - GenAIScript agent doesn't use safe-outputs (closed)
- [smoke-detector] 🔍 Smoke Test Investigation - Smoke OpenCode Run #18722224746: Agent Does Not Use Safe-Outputs MCP Tools #2143 - OpenCode agent doesn't use safe-outputs (closed)
- [smoke-outpost] 🔍 Smoke Test Investigation - Smoke OpenCode: Missing agent_output.json File #2121 - Missing agent_output.json file (closed)
This pattern affects multiple AI engines (Codex, GenAIScript, OpenCode), suggesting a systemic issue with safe-outputs MCP tool usage in staged mode.
Investigation Findings
Environment Configuration
GH_AW_SAFE_OUTPUTS_STAGED=true
GH_AW_WORKFLOW_NAME=Smoke Codex
GH_AW_AGENT_OUTPUT=/tmp/gh-aw/safeoutputs/agent_output.json
Staged Mode Expectations
In staged mode, the workflow should:
- ✅ Run the agent normally
- ✅ Generate output to preview
- ✅ Not actually create GitHub issues (only preview)
- ❌ Still upload artifacts for preview (this failed)
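The expectations above can be sketched as a small branch on `GH_AW_SAFE_OUTPUTS_STAGED`. This is an illustrative sketch only: `handle_issue_output` and its messages are hypothetical, and the real create_issue job may be structured differently. It assumes a missing output file should be skipped rather than treated as a hard error.

```shell
# Hypothetical sketch of staged-mode handling in a downstream job.
# $1 is the path to the agent output file.
handle_issue_output() {
  hio_file="$1"

  # No output from the agent: skip instead of failing with ENOENT.
  if [ ! -f "$hio_file" ]; then
    echo "skip: no agent output"
    return 0
  fi

  if [ "${GH_AW_SAFE_OUTPUTS_STAGED:-false}" = "true" ]; then
    # Staged mode: show what would be created, but create nothing.
    echo "preview: would create issue from $hio_file"
  else
    echo "create: issue from $hio_file"
  fi
}
```

The key design choice is that the file check comes before the staged-mode check, so a missing file is handled the same way in both modes.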
Artifact Details
- Expected source path: /tmp/gh-aw/safeoutputs/outputs.jsonl
- Artifact name: agent_output.json
- Result: File not found, no artifact created
Recommended Actions
🔴 High Priority
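- Make create_issue job conditional on artifact existence
    create_issue:
      needs: [agent, detection]
      # hashFiles() is not available in a job-level `if`, and /tmp on the agent
      # runner is not visible from this job, so gate on a job output instead.
      # Assumes the agent job exports `has_output` from a validation step.
      if: needs.agent.outputs.has_output == 'true'
      runs-on: ubuntu-latest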
Impact: Prevents cascading failures when agent doesn't create output
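- Make artifact upload conditional
    - name: Upload Safe Outputs
      # hashFiles() only matches files under GITHUB_WORKSPACE, so it cannot test
      # this /tmp path; if-no-files-found: ignore skips a missing file silently.
      uses: actions/upload-artifact@v4
      with:
        name: agent_output.json
        path: /tmp/gh-aw/safeoutputs/outputs.jsonl
        if-no-files-found: ignore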
Impact: Eliminates artifact upload warnings
- Add validation step before artifact operations
    - name: Validate Safe Outputs Created
      run: |
        if [ -f "/tmp/gh-aw/safeoutputs/outputs.jsonl" ]; then
          echo "✓ outputs.jsonl created"
          echo "has_output=true" >> "$GITHUB_OUTPUT"
        else
          echo "⚠️ outputs.jsonl not found - agent may not have used safe-outputs tools"
          echo "has_output=false" >> "$GITHUB_OUTPUT"
        fi
🟡 Medium Priority
- Enhance agent prompt for staged mode
  - Make it explicit that safe-outputs tools MUST be used
  - Clarify that the output file is required even in preview mode
  - Add examples of proper tool usage
- Investigate Codex MCP integration
  - Verify safe-outputs MCP server initialization
  - Check if tools are available to the agent
  - Review agent logs for MCP-related warnings
- Add detection job enhancement
  - Detection job should identify missing output files
  - Warn when agent completes without using required tools
🟢 Low Priority
- Compare with successful Codex runs
  - Identify differences in agent behavior
  - Check if certain prompts trigger the issue
  - Analyze patterns in successful vs failed runs
- Document expected behavior
  - Clarify staged mode requirements
  - Define success criteria for artifact creation
  - Add troubleshooting guide
Prevention Strategies
- Graceful Degradation
  - Don't fail workflow if artifacts are optional
  - Distinguish between "agent failed" vs "agent succeeded without output"
  - Add conditional logic throughout artifact handling
- Explicit Requirements
  - Make prompts very clear about tool usage
  - Add validation that verifies expected outputs
  - Fail fast if required outputs are truly missing (vs optional)
- Better Error Handling
  - Provide context-aware error messages
  - Include troubleshooting steps
  - Link to documentation about safe-outputs
- Monitoring and Tracking
  - Track artifact creation success rates by engine
  - Alert on patterns of missing artifacts
  - Monitor staged mode workflow health metrics
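Tracking artifact creation rates by engine, as suggested in the last strategy above, could start from a simple per-run log. The sketch below assumes a hypothetical input format of one line per run, `<engine> <true|false>`, where the second field records whether the run produced an output file. Neither the format nor `artifact_success_rates` exists in gh-aw.

```shell
# Hypothetical helper: read "<engine> <true|false>" lines on stdin and
# print "<engine> <runs with output>/<total runs>", sorted by engine name.
artifact_success_rates() {
  awk '
    NF >= 2 {
      total[$1]++
      if ($2 == "true") ok[$1]++
    }
    END {
      # ok[e] + 0 turns an unset counter into 0
      for (e in total) printf "%s %d/%d\n", e, ok[e] + 0, total[e]
    }
  ' | sort
}
```

Feeding it a log file, for example `artifact_success_rates < runs.log` (where `runs.log` is a made-up file name), gives one line per engine that an alerting step could compare against a threshold.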
Related PR Assessment
PR #2886: "Fix inconsistent formatting of github-workflow.json during build"
- Related to failure?: No
- Merged at: 2025-10-31T16:53:15Z (before this run)
- Assessment: The PR fixed schema formatting issues with prettier. This is unrelated to safe-outputs or agent behavior. The failure is a pre-existing recurring pattern.
Analysis: Why This Pattern Persists
After analyzing 4+ occurrences for Codex and similar issues across multiple engines:
- No Conditional Logic: Downstream jobs always run regardless of artifact creation
- Agent Autonomy: Agents may complete tasks without using safe-outputs tools
- Staged Mode Ambiguity: Unclear whether agents MUST create outputs in preview mode
- Missing Validation: No checks before artifact upload or download operations
- Pattern Recognition: Each engine hits this independently, suggesting a systemic issue
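Several of the points above come down to a missing gate between the agent job and its downstream jobs. A minimal sketch of such a gate follows, assuming the agent job's result and the outputs path are both available to the checking step. `classify_run` and its three labels are hypothetical names chosen for illustration.

```shell
# Hypothetical gate: separate "agent failed" from "agent succeeded without output".
# $1 is the agent job result (for example "success" or "failure");
# $2 is the path to the safe-outputs file.
classify_run() {
  cr_result="$1"
  cr_outputs="$2"

  if [ "$cr_result" != "success" ]; then
    echo "agent-failed"
  elif [ -s "$cr_outputs" ]; then
    # File exists and is non-empty: downstream jobs can proceed.
    echo "has-output"
  else
    # Agent succeeded but wrote nothing: downstream jobs should be skipped.
    echo "no-output"
  fi
}
```

A downstream job would then run only on `has-output`, while `no-output` could be reported as a warning rather than a failure.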
Technical Details
Workflow Run Metrics
- Turns: 1
- Error Count: 1 (from create_issue failure)
- Warning Count: 0
- Total Duration: 3.0 minutes
- Agent Duration: 1.7 minutes
Environment Variables
GH_AW_SAFE_OUTPUTS_STAGED=true
GH_AW_WORKFLOW_NAME=Smoke Codex
GH_AW_AGENT_OUTPUT=/tmp/gh-aw/safeoutputs/agent_output.json
GH_AW_SAFE_OUTPUTS=/tmp/gh-aw/safeoutputs/outputs.jsonl
Immediate Next Steps
Given this is the 4th occurrence of this Codex-specific pattern:
- Implement conditional checks for create_issue job (high priority fix)
- Make artifact operations conditional on file existence
- Enhance agent prompts to be more explicit about requirements
- Monitor next 5 runs after implementing fixes
- Consider systemwide solution since multiple engines are affected
Investigation Metadata:
- Investigator: Smoke Detector (automated investigator)
- Investigation Run: 18981242914
- Investigation Record: /tmp/gh-aw/cache-memory/investigations/2025-10-31-18981163567.json
- Pattern Database: /tmp/gh-aw/cache-memory/patterns/codex_no_artifact_staged.json
- Similar Patterns: GENAISCRIPT_NO_SAFE_OUTPUTS, OPENCODE_NO_SAFE_OUTPUTS
Labels: smoke-test, investigation, codex, safe-outputs, staged-mode, artifact-missing, recurring
AI generated by Smoke Detector - Smoke Test Failure Investigator