Executive Summary
The Smoke Copilot workflow run §25287782330 failed in the safe_outputs job despite all 15 smoke test cases passing. The agent completed successfully (exit code 0, 3m35s) and the smoke test result file was written. The failure is a workflow definition bug causing a false negative: the smoke test infrastructure is healthy, but the reporting step errors out.
Root Cause Analysis
Mechanism: The safe_outputs job processes items batched from the agent run. Two add_comment items targeting discussion #29991 were submitted but never appear in safe-output-items.jsonl (the executed items manifest), causing the job to exit with failure.
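The mismatch between submitted and executed items can be checked mechanically. The following is a minimal sketch, assuming each JSONL record carries a `type` and a `target` field; those field names and the sample records are hypothetical and are not taken from the real safeoutputs.jsonl or safe-output-items.jsonl files.

```python
import json
from collections import Counter


def load_jsonl(text):
    """Parse JSON Lines text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def item_key(item):
    # Hypothetical identity for an item: its type plus the target it refers to.
    return (item["type"], item.get("target"))


def missing_items(submitted, executed):
    """Return submitted items that have no matching entry among executed items."""
    remaining = Counter(item_key(i) for i in executed)
    missing = []
    for item in submitted:
        key = item_key(item)
        if remaining[key] > 0:
            remaining[key] -= 1  # consume one matching executed entry
        else:
            missing.append(item)
    return missing


# Illustrative records only (not the contents of the real files).
submitted_text = """
{"type": "create_discussion", "target": null}
{"type": "add_comment", "target": 29991}
{"type": "add_comment", "target": 29991}
"""
executed_text = """
{"type": "create_discussion", "target": null}
"""

missing = missing_items(load_jsonl(submitted_text), load_jsonl(executed_text))
for item in missing:
    print("submitted but not executed:", item)
```

Matching by count rather than by set membership matters here, because the two missing comments share the same type and target.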
Why discussion #29991 fails: The workflow's create-discussion safe-output config includes the close-older-discussions option.
Discussion #29991 was the latest existing smoke-copilot discussion. When the safe_outputs job executes create_discussion (creating #29992), it also closes/locks #29991 as part of close-older-discussions. The two add_comment calls that targeted #29991 then fail because the discussion is locked.
Output step 4: Add fun comment to #29991 (now closed → FAILS)
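The ordering problem described above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the actual safe_outputs implementation: the `Discussion` class, the `run_safe_outputs` function, the item fields, and the assumption that deferred items are replayed strictly in list order are all hypothetical.

```python
class Discussion:
    """Hypothetical stand-in for a GitHub discussion."""

    def __init__(self, number):
        self.number = number
        self.locked = False
        self.comments = []


def run_safe_outputs(items, discussions, close_older=True):
    """Replay deferred items in order and return (executed, failed)."""
    executed, failed = [], []
    for item in items:
        if item["type"] == "create_discussion":
            if close_older:
                # Assumed close-older behaviour: lock every existing discussion.
                for existing in discussions.values():
                    existing.locked = True
            number = max(discussions) + 1
            discussions[number] = Discussion(number)
            executed.append(item)
        elif item["type"] == "add_comment":
            target = discussions[item["target"]]
            if target.locked:
                failed.append(item)  # comment rejected: discussion is locked
            else:
                target.comments.append(item["body"])
                executed.append(item)
    return executed, failed


create = {"type": "create_discussion"}
comments = [
    {"type": "add_comment", "target": 29991, "body": "first comment"},
    {"type": "add_comment", "target": 29991, "body": "second comment"},
]

# Order observed in the failing run: create first, then comment on the old discussion.
executed_bad, failed_bad = run_safe_outputs(
    [create] + comments, {29991: Discussion(29991)}
)

# Comments first (the ordering Option B aims for): nothing is rejected.
executed_ok, failed_ok = run_safe_outputs(
    comments + [create], {29991: Discussion(29991)}
)

print(len(failed_bad), len(failed_ok))
```

In this model the agent-side add_comment calls would still report success, since the rejection only happens when the deferred items are replayed.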
Proposed Fix
The workflow should not add output comments to a discussion that will be superseded in the same run. Two options:
Option A (recommended): Change Output step 4 to target the newly created discussion by using its temporary ID, not the fetched #29991. Example:
Use the add_comment tool to add a fun comment to the newly created discussion (use the temporary ID of the create_discussion item)
Option B: Add the fun comment during step 8 (before create_discussion runs), so it executes before the close-older mechanism locks the discussion.
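Option A relies on a comment being able to reference a discussion that does not exist yet. One way such a temporary ID could be resolved is sketched below; the `temporary_id` and `target` field names, the `aw_smoke1` value, and the `resolve_targets` helper are illustrative assumptions, and the real safe-outputs schema may differ.

```python
def resolve_targets(items, first_new_number):
    """Replace temporary IDs with real numbers as discussions are created.

    Items are processed in order, so a comment can only reference a
    temporary ID that an earlier create_discussion item has introduced.
    """
    temp_to_real = {}
    next_number = first_new_number
    resolved = []
    for item in items:
        item = dict(item)  # copy so the caller's items are left untouched
        if item["type"] == "create_discussion":
            temp_to_real[item["temporary_id"]] = next_number
            item["number"] = next_number
            next_number += 1
        elif item["type"] == "add_comment" and isinstance(item["target"], str):
            # Raises KeyError if the temporary ID was never created.
            item["target"] = temp_to_real[item["target"]]
        resolved.append(item)
    return resolved


items = [
    {"type": "create_discussion", "temporary_id": "aw_smoke1"},
    {"type": "add_comment", "target": "aw_smoke1", "body": "fun comment"},
]

resolved = resolve_targets(items, first_new_number=29992)
print(resolved[1]["target"])
```

Because the comment now follows the newly created discussion rather than the fetched one, closing the older discussion in the same run no longer affects it.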
Impact
False negative smoke test results: The actual smoke test passed (all 15 tests ✅), but the workflow reports failure
Broken signal reliability: Smoke Copilot is used to validate the Copilot engine — false failures reduce trust in the smoke test signal
Actor mnkiefer dispatched this run (workflow_dispatch) — the false failure may have caused unnecessary investigation
Confidence & Unknowns
High confidence on root cause (matching evidence: safeoutputs.jsonl vs safe-output-items.jsonl, job timing, workflow config)
Unknown: whether this is a recurring pattern across previous runs (only 1 failure in 6h window; comparison baseline run 25286346709 succeeded but was a pull_request trigger where different output steps apply)
Note: Could not verify existing issue coverage (GitHub API access restricted in this workflow context)
Failure Cluster
Failing job: safe_outputs (44s)
6-hour window stats: 22 runs total, 1 failure (4.5% failure rate), 1 error (Dev workflow queued). All other 20 runs succeeded.
Evidence:
safeoutputs.jsonl: 8 items submitted (includes 2× add_comment to #29991, "[cache-strategy] Cache Strategy Analysis - 2026-05-03")
safe-output-items.jsonl: 6 items executed (both add_comment items missing)
add_comment MCP calls returned success during agent run (deferred execution model)
safe_outputs job: conclusion: failure, duration: 44s
Workflow instructions (from smoke-copilot.md):
[…] #29991
create_discussion → creates #29992 ("copilot was here"), closes #29991
Output step 4: Add fun comment to #29991 (now closed → FAILS)