Workflow Health — 2026-06-04
Executive read: 10 of 23 outcomes are pending, 7 of them in Smoke Claude; all 7 accepted outcomes rest on medium evidence (commented or reviewed). The acceptance rate of 77.8% is solid, but the zero-touch rate of 0% means every accepted output required human engagement. Several pending items have no trackable URL (detail: "no url").
Legend:
- Status: 🟩 accepted · 🟥 rejected · 🟨 pending · ⬜ unknown
- Lifecycle health: 🟢 resolving · 🟡 in flight · 🟠 aging · 🔴 stuck · ⚪ underdefined
- References: one linked item per status emoji, in the same order as the Status column
🔴 Action Items
- Smoke Claude: 7 pending items + 2 unknown. This workflow has the highest pending volume. Review the pending create_pull_request_review_comment, post_slack_message, create_code_scanning_alert, and create_check_run items marked "no url"; they cannot be evaluated without tracking URLs. Verify whether these tools succeeded but did not return URLs, or failed silently.
- Items without trackable URLs: 5 outcomes have detail: "no url" (Smoke Claude: 4 items; untracked). All are in pending state and cannot be verified. Either the safe-output tools are not returning URLs, or the workflow is not capturing them. Needs investigation.
- Zero-touch rate at 0%: all 7 accepted items required human engagement (comments, reviews, edits). This suggests the agents' outputs are always incomplete or require human review by design. Flag: is this expected, or should agents produce more self-contained outputs?
- Acceptance rate at 78%: solid but below ideal. The 2 rejected items (update_pull_request in Changeset Generator, add_reviewer in Smoke Claude) should be reviewed to understand why the outputs were not retained.
Detailed metrics, evidence quality, workflow counts, and trends
Outcome Scorecard — 2026-06-04
| Metric | Value | Status |
|---|---|---|
| Acceptance rate | 77.8% | 🟡 60-80% |
| Zero-touch rate | 0% | 🔴 <25% |
| Waste rate | 8.7% | 🟢 <10% |
| Median time to resolution | 6 minutes 27 seconds | — |
| Accepted | 7 / 23 | — |
| — strong evidence | 0 | (none) |
| — medium evidence | 7 | acted on, state retained/replaced |
| — weak evidence | 0 | (none) |
| Rejected | 2 | — |
| Ignored | 0 | — |
| Zero-touch | 0 / 7 | — |
| Pending | 10 | — |
| Runs checked | 7 | — |
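The headline rates can be reproduced from the raw counts in the scorecard. The sketch below is a minimal illustration, not the Outcome Collector's actual code: the formulas are assumptions inferred from the published numbers (acceptance over decided outcomes, waste over all outcomes, zero-touch over accepted outcomes), and the helper name `rate` is hypothetical.

```python
from typing import Optional


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a percentage rounded to one decimal, or None when undefined."""
    if denominator == 0:
        return None
    return round(100 * numerator / denominator, 1)


# Counts copied from the scorecard.
accepted, rejected, ignored, total = 7, 2, 0, 23
zero_touch = 0

# Assumed definitions, inferred from the reported values.
# Ignored is 0 in this report, so its placement in the formulas cannot be checked.
acceptance_rate = rate(accepted, accepted + rejected + ignored)
waste_rate = rate(rejected + ignored, total)
zero_touch_rate = rate(zero_touch, accepted)

print(acceptance_rate, waste_rate, zero_touch_rate)
```

Under these assumptions the results match the reported 77.8%, 8.7%, and 0%. Pending and unknown outcomes are excluded from the acceptance denominator, which is why 7 accepted out of 23 total still yields an acceptance rate near 78%.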
Per-Workflow Breakdown
| Workflow | Accepted | Rejected | Ignored | Pending | Unknown | Total | Acceptance | Waste |
|---|---|---|---|---|---|---|---|---|
| Changeset Generator | 0 | 1 | 0 | 1 | 0 | 2 | 0% | 50.0% |
| Smoke Claude | 2 | 1 | 0 | 7 | 2 | 12 | 66.7% | 8.3% |
| Smoke Codex | 0 | 0 | 0 | 1 | 1 | 2 | — | 0% |
| Smoke Gemini | 3 | 0 | 0 | 1 | 1 | 5 | 100% | 0% |
| Agent Container Smoke Test | 2 | 0 | 0 | 0 | 0 | 2 | 100% | 0% |
Sort note: Sorted by waste rate descending (Changeset Generator worst first).
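The per-workflow ordering described in the sort note can be sketched as follows. This is an illustrative reconstruction using the counts from the breakdown; the `WorkflowCounts` class and its property names are hypothetical, and the rate formulas are assumptions inferred from the reported percentages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkflowCounts:
    name: str
    accepted: int
    rejected: int
    ignored: int
    pending: int
    unknown: int

    @property
    def total(self) -> int:
        return (self.accepted + self.rejected + self.ignored
                + self.pending + self.unknown)

    @property
    def waste(self) -> float:
        # Assumed: rejected + ignored as a share of all outcomes.
        if self.total == 0:
            return 0.0
        return round(100 * (self.rejected + self.ignored) / self.total, 1)

    @property
    def acceptance(self) -> Optional[float]:
        # Assumed: undefined (None) when nothing has been decided yet.
        decided = self.accepted + self.rejected + self.ignored
        if decided == 0:
            return None
        return round(100 * self.accepted / decided, 1)


rows = [
    WorkflowCounts("Smoke Claude", 2, 1, 0, 7, 2),
    WorkflowCounts("Changeset Generator", 0, 1, 0, 1, 0),
    WorkflowCounts("Smoke Codex", 0, 0, 0, 1, 1),
    WorkflowCounts("Smoke Gemini", 3, 0, 0, 1, 1),
    WorkflowCounts("Agent Container Smoke Test", 2, 0, 0, 0, 0),
]

# Worst waste first; Python's sort is stable, so ties keep their input order.
ranked = sorted(rows, key=lambda w: w.waste, reverse=True)
for w in ranked:
    print(f"{w.name}: waste={w.waste}% acceptance={w.acceptance}")
```

Returning `None` for an undefined acceptance rate keeps a workflow with only pending or unknown outcomes (Smoke Codex here) from being read as 0% accepted.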
Evidence Quality
No items were evaluated with weak, existence-only signals (fallback_exists_only_count = 0). All accepted items have medium evidence (acted on, state retained). There are no strong-evidence outcomes (merged PRs, closed issues, completed tasks with state verification).
Tracking & Data Quality Issues
⚠️ 5 items with missing URLs. These outcomes are in pending state but have url: "" and detail: "no url":
- create_pull_request_review_comment (Smoke Claude, 2x)
- post_slack_message (Smoke Claude)
- create_code_scanning_alert (Smoke Claude)
- create_check_run (Smoke Claude)
- add_labels (Smoke Claude)
These cannot be verified as accepted or rejected. Recommendation: Check if the safe-output tools are designed to return URLs for all output types, or if these tool types should not generate trackable artifacts.
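A check like this one can be sketched as a filter over outcome records. The records below are made-up sample data, not the outcomes from this report; only the `url` and `detail` fields mirror names quoted above, while `workflow`, `type`, `status`, and the function `is_untrackable` are hypothetical.

```python
from collections import Counter

# Made-up sample records for illustration only.
outcomes = [
    {"workflow": "example-a", "type": "create_check_run", "status": "pending",
     "url": "", "detail": "no url"},
    {"workflow": "example-a", "type": "create_check_run", "status": "pending",
     "url": "", "detail": "no url"},
    {"workflow": "example-a", "type": "add_labels", "status": "pending",
     "url": "", "detail": "no url"},
    {"workflow": "example-b", "type": "add_reviewer", "status": "accepted",
     "url": "https://example.com/1", "detail": ""},
    {"workflow": "example-b", "type": "update_pull_request", "status": "pending",
     "url": "https://example.com/2", "detail": ""},
]


def is_untrackable(outcome: dict) -> bool:
    """Pending outcome with no URL, so it cannot be followed up."""
    return outcome.get("status") == "pending" and not outcome.get("url")


untrackable = [o for o in outcomes if is_untrackable(o)]

# Group by (workflow, tool type) so repeated tools show up as "2x".
by_key = Counter((o["workflow"], o["type"]) for o in untrackable)
for (workflow, tool), count in sorted(by_key.items()):
    print(f"{tool} ({workflow}, {count}x)")
```

Counting outcomes and distinct tool types separately makes it easier to tell whether a summary figure refers to individual items or to kinds of output.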
Trend Comparison (vs. 2026-06-01)
Previous (2026-06-01):
- Acceptance rate: 90.9%
- Waste rate: 3.7%
- Zero-touch rate: 0%
Current (2026-06-04):
- Acceptance rate: 77.8% (⬇️ down 13.1pp)
- Waste rate: 8.7% (⬆️ up 5.0pp)
- Zero-touch rate: 0% (➡️ stable)
⚠️ Regressing: Acceptance rate dropped 13.1pp and waste rate rose 5.0pp. This suggests workflows are producing less polished outputs or working on harder items. Investigate: did the prompts change, or did the sample shift to harder problems?
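The percentage-point deltas in the trend comparison can be recomputed with a few lines. This is a minimal sketch using the two snapshots above; the function `trend` and the rule for labeling a change as regressing (acceptance and zero-touch are better when higher, waste is better when lower) are assumptions, not the collector's documented logic.

```python
from typing import Tuple


def trend(previous: float, current: float,
          higher_is_better: bool) -> Tuple[float, str]:
    """Return the change in percentage points and a direction label."""
    delta = round(current - previous, 1)  # rounding hides float noise
    if delta == 0:
        return delta, "stable"
    improved = (delta > 0) == higher_is_better
    label = "improving" if improved else "regressing"
    return delta, label


# (previous, current, higher_is_better) for 2026-06-01 vs. 2026-06-04.
snapshots = {
    "acceptance": (90.9, 77.8, True),
    "waste": (3.7, 8.7, False),
    "zero_touch": (0.0, 0.0, True),
}

for metric, (prev, curr, higher_is_better) in snapshots.items():
    delta, label = trend(prev, curr, higher_is_better)
    print(f"{metric}: {delta:+.1f}pp ({label})")
```

Tracking the "better" direction per metric avoids reading an increase in waste rate as an improvement.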
📊 Measured by Outcome Collector · haiku45 41.3K