[Outcome Report] Workflow Health Report — 2026-06-04 #36783

@github-actions

Workflow Health — 2026-06-04

Executive read: 10 items are pending, 7 of them in Smoke Claude; most outcomes are commented/reviewed (medium evidence). The acceptance rate of 78% is solid, but the zero-touch rate of 0% means every accepted output required human engagement. Several pending items have no trackable URL (detail: "no url").

| Workflow | Status | Lifecycle health | References |
| --- | --- | --- | --- |
| Smoke Claude | 🟨🟨🟥🟩🟨🟨🟨⬜🟩⬜🟨🟨 | 🟡 in flight | 🟨 36748 · 🟥 36748 · 🟩 36748 · ⬜ 36773 · ⬜ 36766 · 🟨 36748 |
| Smoke Gemini | 🟨🟩⬜🟩🟩 | 🟢 resolving | 🟨 36774 · 🟩 36769 · ⬜ 36770 · 🟩 36748 · 🟩 36748 |
| Smoke Codex | ⬜🟨 | 🟡 in flight | ⬜ 36771 · 🟨 36748 |
| Changeset Generator | 🟨🟥 | 🟡 in flight | 🟨 36769 · 🟥 36769 |
| Agent Container Smoke Test | 🟩🟩 | 🟢 resolving | 🟩 36769 · 🟩 36748 |

Legend:

  • Status: 🟩 accepted · 🟥 rejected · 🟨 pending · ⬜ unknown
  • Lifecycle health: 🟢 resolving · 🟡 in flight · 🟠 aging · 🔴 stuck · ⚪ underdefined
  • References: one linked item per status emoji, in the same order as the Status column
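The legend above makes the Status column machine-checkable: each emoji is one outcome. As a minimal sketch (the function name `count_statuses` is hypothetical and not part of the Outcome Collector), a strip can be tallied and compared against the per-workflow counts reported further down:

```python
from collections import Counter

# Emoji-to-status mapping taken from the report's legend.
STATUS_BY_EMOJI = {
    "🟩": "accepted",
    "🟥": "rejected",
    "🟨": "pending",
    "⬜": "unknown",
}


def count_statuses(strip: str) -> Counter:
    """Count outcomes per status in a Status-column emoji strip.

    Characters outside the legend (spaces, variation selectors) are skipped.
    """
    return Counter(STATUS_BY_EMOJI[ch] for ch in strip if ch in STATUS_BY_EMOJI)


# Status strip copied from the Smoke Claude row of this report.
smoke_claude = count_statuses("🟨🟨🟥🟩🟨🟨🟨⬜🟩⬜🟨🟨")
print(dict(smoke_claude))
```

For the Smoke Claude strip this yields 7 pending, 1 rejected, 2 accepted, and 2 unknown, which agrees with that workflow's row in the Per-Workflow Breakdown.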

🔴 Action Items

  1. Smoke Claude: 7 pending items + 2 unknown. This workflow has the highest pending volume. Review the pending create_pull_request_review_comment, post_slack_message, create_code_scanning_alert, and create_check_run items marked "no url"; they cannot be evaluated without tracking URLs. Verify whether these tools succeeded but failed to return URLs, or failed silently.

  2. Items without trackable URLs: 5 outcomes have detail: "no url" (Smoke Claude: 4 items; untracked). All are in pending state and cannot be verified. Either the safe-output tools are not returning URLs, or the workflow is not capturing them. This needs investigation.

  3. Zero-touch rate at 0%: All 7 accepted items required human engagement (comments, reviews, edits). Either the agents' outputs are consistently incomplete, or human review is required by design. Flag: Is this expected, or should agents produce more self-contained outputs?

  4. Acceptance rate at 78%: Solid but below ideal. Review the 2 rejected items (update_pull_request in Changeset Generator, add_reviewer in Smoke Claude) to understand why the outputs were not retained.

Detailed metrics, evidence quality, workflow counts, and trends

Outcome Scorecard — 2026-06-04

| Metric | Value | Status |
| --- | --- | --- |
| Acceptance rate | 77.8% | 🟡 60-80% |
| Zero-touch rate | 0% | 🔴 <25% |
| Waste rate | 8.7% | 🟢 <10% |
| Median time to resolution | 6 minutes 27 seconds | |
| Accepted | 7 / 23 | |
| Accepted, strong evidence | 0 | (none) |
| Accepted, medium evidence | 7 | acted on, state retained/replaced |
| Accepted, weak evidence | 0 | (none) |
| Rejected | 2 | |
| Ignored | 0 | |
| Zero-touch | 0 / 7 | |
| Pending | 10 | |
| Runs checked | 7 | |

Per-Workflow Breakdown

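| Workflow | Accepted | Rejected | Ignored | Pending | Unknown | Total | Acceptance | Waste |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Changeset Generator | 0 | 1 | 0 | 1 | 0 | 2 | 0% | 50.0% |
| Smoke Claude | 2 | 1 | 0 | 7 | 2 | 12 | 66.7% | 8.3% |
| Smoke Codex | 0 | 0 | 0 | 1 | 1 | 2 | | 0% |
| Smoke Gemini | 3 | 0 | 0 | 1 | 1 | 5 | 100% | 0% |
| Agent Container Smoke Test | 2 | 0 | 0 | 0 | 0 | 2 | 100% | 0% |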

Sort note: Sorted by waste rate descending (Changeset Generator worst first).
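The report does not state how its rates are derived. The sketch below uses assumed formulas, chosen only because they reproduce the reported figures: acceptance = accepted / (accepted + rejected + ignored), and waste = (rejected + ignored) / total. The class name `WorkflowCounts` is hypothetical; the counts are copied from the breakdown above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkflowCounts:
    workflow: str
    accepted: int
    rejected: int
    ignored: int
    pending: int
    unknown: int

    @property
    def total(self) -> int:
        return (self.accepted + self.rejected + self.ignored
                + self.pending + self.unknown)

    def acceptance_pct(self) -> Optional[float]:
        # Assumed formula: accepted / decided, where decided excludes
        # pending and unknown items. Undefined when nothing is decided.
        decided = self.accepted + self.rejected + self.ignored
        return None if decided == 0 else round(100 * self.accepted / decided, 1)

    def waste_pct(self) -> float:
        # Assumed formula: (rejected + ignored) / total.
        if self.total == 0:
            return 0.0
        return round(100 * (self.rejected + self.ignored) / self.total, 1)


# Counts copied from the Per-Workflow Breakdown.
rows = [
    WorkflowCounts("Smoke Claude", 2, 1, 0, 7, 2),
    WorkflowCounts("Changeset Generator", 0, 1, 0, 1, 0),
    WorkflowCounts("Smoke Codex", 0, 0, 0, 1, 1),
    WorkflowCounts("Smoke Gemini", 3, 0, 0, 1, 1),
    WorkflowCounts("Agent Container Smoke Test", 2, 0, 0, 0, 0),
]

# Waste rate descending; Python's sort is stable, so ties keep input order.
by_waste = sorted(rows, key=lambda r: r.waste_pct(), reverse=True)

# Scorecard totals are the column sums of the per-workflow rows.
overall = WorkflowCounts(
    "all workflows",
    sum(r.accepted for r in rows),
    sum(r.rejected for r in rows),
    sum(r.ignored for r in rows),
    sum(r.pending for r in rows),
    sum(r.unknown for r in rows),
)

for r in by_waste + [overall]:
    print(r.workflow, r.total, r.acceptance_pct(), r.waste_pct())
```

Under these assumptions the totals come out to 23 items, 77.8% acceptance, and 8.7% waste, matching the scorecard. Smoke Codex has no decided items, so its acceptance rate is returned as `None` rather than 0%.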

Evidence Quality

No items were evaluated with weak existence-only signals (fallback_exists_only_count = 0). All accepted items have medium evidence (acted on, state retained). There are no strong-evidence outcomes (merged PRs, closed issues, completed tasks with state verification).

Tracking & Data Quality Issues

⚠️ 5 items with missing URLs — These outcomes are in pending state but have url: "" and detail: "no url":

  • create_pull_request_review_comment (Smoke Claude, 2x)
  • post_slack_message (Smoke Claude)
  • create_code_scanning_alert (Smoke Claude)
  • create_check_run (Smoke Claude)
  • add_labels (Smoke Claude)

These cannot be verified as accepted or rejected. Recommendation: Check if the safe-output tools are designed to return URLs for all output types, or if these tool types should not generate trackable artifacts.
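One way to start the investigation is to pull the affected records out of the collector's output and group them by tool type. This is a rough sketch only: the `url` and `detail` fields come from the report, while the `type`, `workflow`, and `status` field names, the helper `missing_url`, and the sample records are hypothetical stand-ins for whatever the Outcome Collector actually emits.

```python
from collections import Counter
from typing import Dict, List

Outcome = Dict[str, str]


def missing_url(outcomes: List[Outcome]) -> List[Outcome]:
    """Return pending outcomes whose url is empty or absent."""
    return [
        o for o in outcomes
        if o.get("status") == "pending" and not o.get("url")
    ]


# Illustrative records only; not real collector output.
sample = [
    {"workflow": "Smoke Claude", "type": "create_check_run",
     "status": "pending", "url": "", "detail": "no url"},
    {"workflow": "Smoke Claude", "type": "post_slack_message",
     "status": "pending", "url": "", "detail": "no url"},
    {"workflow": "Smoke Gemini", "type": "add_comment",
     "status": "accepted", "url": "https://example.invalid/1", "detail": ""},
]

untracked = missing_url(sample)
by_type = Counter(o["type"] for o in untracked)
print(by_type)
```

If some tool types never appear with a URL across many runs, that points toward tools that do not return one by design; intermittent gaps would point toward a capture failure instead.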

Trend Comparison (vs. 2026-06-01)

Previous (2026-06-01):

  • Acceptance rate: 90.9%
  • Waste rate: 3.7%
  • Zero-touch rate: 0%

Current (2026-06-04):

  • Acceptance rate: 77.8% (⬇️ down 13.1pp)
  • Waste rate: 8.7% (⬆️ up 5.0pp)
  • Zero-touch rate: 0% (➡️ stable)

⚠️ Regressing: Acceptance rate dropped 13.1pp and waste rate rose 5.0pp. This suggests workflows are producing less polished outputs or working on harder items. Investigate: Did the prompt change, or did the sample shift to harder problems?
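The percentage-point deltas above can be reproduced as follows. This is a sketch under one assumption: a change counts as regressing when it moves against the metric's preferred direction (higher is better for acceptance and zero-touch, lower is better for waste). The function name `trend` and the `HIGHER_IS_BETTER` table are hypothetical.

```python
from typing import Tuple

# Rates in percent, copied from the two reports being compared.
previous = {"acceptance": 90.9, "waste": 3.7, "zero_touch": 0.0}
current = {"acceptance": 77.8, "waste": 8.7, "zero_touch": 0.0}

# Assumed preferred direction per metric.
HIGHER_IS_BETTER = {"acceptance": True, "waste": False, "zero_touch": True}


def trend(metric: str, prev: float, cur: float) -> Tuple[float, str]:
    """Return the change in percentage points and a verdict."""
    delta = round(cur - prev, 1)  # rounding absorbs float noise
    if delta == 0:
        return delta, "stable"
    improved = (delta > 0) == HIGHER_IS_BETTER[metric]
    return delta, "improving" if improved else "regressing"


for metric in previous:
    delta, verdict = trend(metric, previous[metric], current[metric])
    print(f"{metric}: {delta:+.1f}pp ({verdict})")
```

Note that the sign of the delta and the verdict are separate: waste rate moved up by 5.0pp, which is a regression even though the number increased.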

📊 Measured by Outcome Collector · haiku45 41.3K

  • expires on Jun 11, 2026, 1:09 AM UTC
