-
Notifications
You must be signed in to change notification settings - Fork 45
Description
🚨 CRITICAL RECURRING FAILURE - 5th Consecutive Occurrence
Summary
The Smoke GenAIScript workflow has FAILED AGAIN after the v0.24.0 release with the EXACT SAME ROOT CAUSE that has been reported in THREE previous issues (#2157, #2204, #2207). This is the 5th consecutive failure of this smoke test since 2025-10-22. Despite multiple investigations and issue reports, the configuration has never been corrected.
Failure Details
- Run: #18757658104
- Commit: 8993988 - "Release v0.24.0"
- Trigger: schedule (automated smoke test)
- Duration: 3.5 minutes
- Failed Job: detection (1.2 minutes)
- Status: ❌ FAILED
Root Cause Analysis
The Problem Persists UNCHANGED
The GenAIScript configuration STILL uses an invalid OpenAI model name:
Location: .github/workflows/shared/genaiscript.md line 6
GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"Problem: gpt-4.1 DOES NOT EXIST in OpenAI's model catalog.
Valid OpenAI models:
gpt-4o✅ (recommended)gpt-4-turbo✅gpt-4✅gpt-3.5-turbo✅
Error Chain (Identical to All Previous Occurrences)
- GenAIScript attempts to resolve and use model
openai:gpt-4.1 - OpenAI API rejects the request (invalid model)
- GenAIScript receives undefined/null response
- GenAIScript crashes:
TypeError: Cannot read properties of undefined (reading 'text') - Detection job fails with exit code 255
- Smoke test marked as failed
Stack Trace
2025-10-23T18:10:09.4293104Z 2025-10-23T18:10:09.429Z genaiscript:error {
2025-10-23T18:10:09.4293428Z name: 'TypeError',
2025-10-23T18:10:09.4293872Z message: "Cannot read properties of undefined (reading 'text')",
2025-10-23T18:10:09.4294339Z stack: "TypeError: Cannot read properties of undefined (reading 'text')\n" +
2025-10-23T18:10:09.4295107Z ' at githubActionSetOutputs ((redacted))\n' +
2025-10-23T18:10:09.4296330Z ' at async Command.runScriptWithExitCode ((redacted))'
2025-10-23T18:10:09.4297303Z }
Failed Jobs and Errors
Job Execution Summary
- ✅ activation - succeeded (2s)
- ✅ agent - succeeded (1.6m) - Agent completed successfully
- ❌ detection - FAILED (1.2m) - Threat detection crashed
- ✅ create_issue - succeeded (5s)
- ⏭️ missing_tool - skipped
Investigation Findings
Complete Failure Timeline
| # | Run ID | Date/Time (UTC) | Trigger | Issue Created | Issue Status |
|---|---|---|---|---|---|
| 1 | 18727962258 | 2025-10-22 19:45:52 | workflow_dispatch | #2157 | Closed as "not_planned" |
| 2 | 18733557489 | 2025-10-23 00:19:22 | schedule | - | Covered by #2157 |
| 3 | 18739169072 | 2025-10-23 06:07:04 | schedule | #2204 | Closed as "completed" |
| 4 | 18747816413 | 2025-10-23 12:08:41 | schedule | #2207 | Closed as "completed" |
| 5 | 18757658104 | 2025-10-23 18:06:57 | schedule | This issue | Open |
Pattern: Failing every ~6 hours on scheduled runs
Duration: Over 22 hours of continuous failures
Failure Rate: 100% since first occurrence
Why This Is Critical NOW
- Post-Release Failure: This failure occurred immediately after the v0.24.0 release, indicating the configuration issue persists across releases
- Multiple Closed Issues: Three separate issues ([smoke-detector] 🔍 Smoke Test Investigation - GenAIScript Invalid Model Name (gpt-4.1) #2157, [smoke-detector] 🚨 CRITICAL RECURRING: GenAIScript Invalid Model (gpt-4.1) - 3rd Occurrence #2204, [smoke-detector] Comment on #2157 #2207) have been created and closed without fixing the root cause
- Wasted Resources: Every scheduled run (every ~6 hours) consumes CI minutes while producing no value
- Security Gap: Threat detection has been non-functional for over 22 hours
- False Confidence: The team may not realize smoke tests are failing continuously
Recommended Actions
🔴 CRITICAL - Immediate Fix (1 minute)
Update .github/workflows/shared/genaiscript.md line 6:
- GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4.1"
+ GH_AW_AGENT_MODEL_VERSION: "openai:gpt-4o"That's it. One line change. Will fix all 5 failures instantly.
🟡 Alternative: Disable Scheduled Workflow
If GenAIScript smoke tests are not being maintained, disable the scheduled trigger to stop generating failed runs and investigation overhead:
# .github/workflows/smoke-genaiscript.md
# Comment out or remove the schedule trigger🟢 Long-Term: Prevent Recurrence
- Add Pre-Flight Model Validation - Validate model names before execution
- Schema Validation - Use JSON schema to validate workflow configurations
- Better Error Handling - Work with GenAIScript team to improve error messages
- Documentation - Document valid model names in configuration files
Historical Context
From investigation database (/tmp/gh-aw/cache-memory/investigations/):
{
"pattern_signature": "GENAISCRIPT_INVALID_MODEL",
"first_occurrence": "2025-10-22T19:45:52Z",
"recurrence_count": 5,
"failure_rate": "100%",
"days_recurring": 1,
"hours_between_occurrences": [5.5, 6.2, 6.6, 6.0],
"is_flaky": false,
"external_dependency": "OpenAI API",
"persistence_across_releases": true
}Impact Assessment
Severity: 🔴 CRITICAL
- All GenAIScript smoke tests failing continuously
- Threat detection non-functional for 22+ hours
- Multiple issues created and closed without resolution
- Post-release failure indicates configuration persists across versions
Urgency: 🔴 IMMEDIATE
- Simple one-line fix available
- Continues to fail every 6 hours indefinitely
- Wasting CI resources and investigation time
Scope:
- Affects: All workflows using
shared/genaiscript.md - Frequency: Every scheduled smoke test run
- Duration: Ongoing since 2025-10-22 19:45 UTC (22+ hours)
Reproduction Steps
- Configure GenAIScript with model:
openai:gpt-4.1 - Set OPENAI_API_KEY (so validation passes)
- Run any GenAIScript workflow
- Observe failure when invalid model is used
- See TypeError accessing undefined result
Related Issues
- [smoke-detector] 🔍 Smoke Test Investigation - GenAIScript Invalid Model Name (gpt-4.1) #2157 - Original investigation (closed as "not_planned")
- [smoke-detector] 🚨 CRITICAL RECURRING: GenAIScript Invalid Model (gpt-4.1) - 3rd Occurrence #2204 - 3rd occurrence (closed as "completed")
- [smoke-detector] Comment on #2157 #2207 - 4th occurrence (closed as "completed")
- [smoke-detector] 🔍 Smoke Test Investigation - GenAIScript OPENAI_API_KEY Missing #2142 - Similar GenAIScript error (different root cause - missing API key)
Request for Action
This issue is being created to request a decision on one of the following:
- Fix the configuration (1-line change) to resolve the issue permanently
- Disable the scheduled workflow if GenAIScript smoke tests are not planned to be maintained
- Explain the strategy if this is expected behavior (so future investigations understand context)
The current situation - where the same failure occurs every 6 hours, generates investigation reports, creates issues that get closed, but nothing gets fixed - is not sustainable.
Investigation Metadata
- Investigator: Smoke Detector (Failure Investigation Agent)
- Investigation Run: #18757754195
- Pattern:
GENAISCRIPT_INVALID_MODEL(5th occurrence) - Investigation Record:
/tmp/gh-aw/cache-memory/investigations/2025-10-23-18757658104.json - Created: 2025-10-23T18:15:00Z
🤖 AI generated by Smoke Detector - Smoke Test Failure Investigator
This is an automated investigation of recurring smoke test failures.
AI generated by Smoke Detector - Smoke Test Failure Investigator