Outcome Scorecard — 2026-05-29
| Metric | Value | Status |
|---|---|---|
| Acceptance rate | 100% | 🟢 >80% |
| Zero-touch rate | 0% | 🔴 <25% |
| Waste rate | 0% | 🟢 <10% |
| Median time to resolution | — | (no completed items) |
| Accepted | 2 / 16 | — |
| — strong evidence | 1 | merged, completed, approved |
| — medium evidence | 1 | engaged, retained |
| — weak evidence | 0 | existence only |
| Rejected | 0 | — |
| Ignored | 0 | no observable follow-up |
| Zero-touch | 0 / 2 | — |
| Pending | 10 | — |
| Unknown | 2 | unclear terminal state |
| Runs checked | 9 | — |
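The headline rates can be reproduced from the raw counts in the scorecard. The report does not state its formulas, so the sketch below assumes that acceptance rate is accepted / (accepted + rejected + ignored) and that zero-touch rate is zero-touch / accepted. The `rate` helper and the `counts` dictionary are hypothetical names used only for illustration.

```python
from typing import Optional


def rate(numerator: int, denominator: int) -> Optional[float]:
    """Return a percentage, or None when there is nothing to divide by."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


# Counts copied from the scorecard table.
counts = {
    "total": 16,
    "accepted": 2,
    "rejected": 0,
    "ignored": 0,
    "zero_touch": 0,
    "pending": 10,
}

# Assumption: only decided outcomes enter the acceptance denominator.
decided = counts["accepted"] + counts["rejected"] + counts["ignored"]

acceptance_rate = rate(counts["accepted"], decided)
# Assumption: zero-touch is measured against accepted outcomes (0 / 2).
zero_touch_rate = rate(counts["zero_touch"], counts["accepted"])
pending_share = rate(counts["pending"], counts["total"])

print(acceptance_rate, zero_touch_rate, pending_share)
```

Under these assumed definitions the values match the table (100%, 0%) and the 62.5% pending share quoted in the observations. Returning `None` for an empty denominator mirrors the dash shown for metrics with no data.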
🟡 Key Observations
1. Early-stage workflow — mostly pending
- 10 of 16 outcomes (62.5%) are still pending, which is expected for a collection run early in the day.
- 2 items have an unknown terminal state (likely a data quality issue with the evaluator).
- Only 2 outcomes have reached a terminal state so far: 1 accepted (strong), 1 accepted (medium).
2. Zero-touch rate remains 0%
- Both accepted items required human interaction or follow-up, indicating bot outputs need refinement or are appropriately scoped for review.
- This mirrors the 2026-05-28 baseline (0% zero-touch).
3. Data quality flag: 2 fallback evaluations
- 2 of 16 outcomes (12.5%) were evaluated with only generic existence checks (fallback_exists_only_count).
- Below the 20% threshold, but still present. These items contribute weak signal.
Per-Workflow Status
| Workflow | Items | Accepted | Pending | Unknown | Acceptance |
|---|---|---|---|---|---|
| Chaos PR Bundle Fuzzer | 5 | 0 | 5 | 0 | — |
| PR Sous Chef | 2 | 1 | 1 | 0 | 50% |
| Matt Pocock Skills Reviewer | 2 | 0 | 1 | 1 | — |
| Issue Monster | 2 | 0 | 2 | 0 | — |
| PR Description Updater | 1 | 1 | 0 | 0 | 100% |
| PR Code Quality Reviewer | 1 | 0 | 0 | 1 | — |
| Daily Sentrux Report | 1 | 0 | 0 | 1 | — |
| Release | 1 | 0 | 0 | 1 | — |
| Daily Model Inventory Checker | 1 | 0 | 1 | 0 | — |
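A quick way to sanity-check the per-workflow rows is to sum each column and confirm that every row's state counts add up to its item count. The sketch below does this with the rows copied from the table; the tuple layout and variable names are illustrative, not part of the Outcome Collector.

```python
# Rows copied from the per-workflow table:
# (workflow, items, accepted, pending, unknown)
rows = [
    ("Chaos PR Bundle Fuzzer", 5, 0, 5, 0),
    ("PR Sous Chef", 2, 1, 1, 0),
    ("Matt Pocock Skills Reviewer", 2, 0, 1, 1),
    ("Issue Monster", 2, 0, 2, 0),
    ("PR Description Updater", 1, 1, 0, 0),
    ("PR Code Quality Reviewer", 1, 0, 0, 1),
    ("Daily Sentrux Report", 1, 0, 0, 1),
    ("Release", 1, 0, 0, 1),
    ("Daily Model Inventory Checker", 1, 0, 1, 0),
]

# Column totals across all workflows.
totals = {
    "items": sum(r[1] for r in rows),
    "accepted": sum(r[2] for r in rows),
    "pending": sum(r[3] for r in rows),
    "unknown": sum(r[4] for r in rows),
}

# Workflows whose state columns do not add up to their item count.
inconsistent = [
    name
    for name, items, accepted, pending, unknown in rows
    if accepted + pending + unknown != items
]

print(totals)
print(inconsistent)
```

With these rows every workflow is internally consistent, and the totals are 16 items, 2 accepted, 10 pending, and 4 unknown. The unknown total is worth comparing against the Unknown count in the summary scorecard.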
🔵 Trend Analysis — vs. 2026-05-28
| Metric | Yesterday | Today | Change |
|---|---|---|---|
| Acceptance rate | 100% | 100% | ➡️ Stable |
| Zero-touch rate | 0% | 0% | ➡️ Stable |
| Pending % | 38.5% | 62.5% | ⬆️ More pending |
| Runs checked | 7 | 9 | ⬆️ +2 runs |
Interpretation: The pending share rose from 38.5% to 62.5%. This is expected for a collection taken early in the day, and today's collection also captured more workflow runs (9 vs. 7 yesterday). The acceptance rate and zero-touch rate are unchanged, which suggests consistent bot behavior, although today's rates rest on only 2 accepted outcomes.
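The day-over-day changes in the trend table are simple differences. A minimal sketch, assuming each metric is stored as a plain number per day (the dictionary keys and the `describe` helper are hypothetical):

```python
# Values copied from the trend table (rates in percent).
yesterday = {"acceptance_rate": 100.0, "zero_touch_rate": 0.0,
             "pending_pct": 38.5, "runs_checked": 7}
today = {"acceptance_rate": 100.0, "zero_touch_rate": 0.0,
         "pending_pct": 62.5, "runs_checked": 9}


def describe(delta: float) -> str:
    """Label a change as up, down, or stable."""
    if delta > 0:
        return "up"
    if delta < 0:
        return "down"
    return "stable"


# Difference per metric, rounded to one decimal place.
changes = {key: round(today[key] - yesterday[key], 1) for key in today}

for key, delta in changes.items():
    print(f"{key}: {delta:+} ({describe(delta)})")
```

For rate metrics the difference is in percentage points, so the pending share moved by 24.0 points rather than by 24%.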
⚠️ Action Items
- Monitor unknown evaluations: the scorecard reports 2 unknown outcomes, but the per-workflow table lists 4 (one each for Matt Pocock Skills Reviewer, PR Code Quality Reviewer, Daily Sentrux Report, and Release). Reconcile the two counts, then check whether these workflows are producing outputs that the evaluators cannot classify. This may indicate missing outcome types or data schema mismatches.
- Low workflow engagement on Chaos PR Bundle Fuzzer: 5 items pending with no accepted outcomes yet. This is normal if these are multi-day workflows; if they are single-day, the prompts may need refinement.
- Matt Pocock Skills Reviewer (1 unknown, 1 pending): only 2 items tracked. This is expected if the workflow is new; if it is established, monitor for data collection issues.
- Next report target: aim for 15+ outcomes and a <5% data quality fallback rate. Continue capturing runs, and wait for items to reach terminal states before analyzing trends.
Evidence Quality
⚠️ 2 items were evaluated using only a generic existence check (signal: fallback_exists_only_count = 2). These provide only weak evidence and may slightly overstate acceptance; dedicated evaluators provide stronger signals.
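The fallback share can be checked against the two limits mentioned in this report: the 20% flag threshold and the <5% target for the next report. The sketch below is illustrative only; the constant names, the function, and the status labels are assumptions rather than the collector's actual logic.

```python
from typing import Tuple

FLAG_THRESHOLD_PCT = 20.0  # threshold quoted in the key observations
TARGET_PCT = 5.0           # target quoted in the action items


def fallback_status(fallback_count: int, total: int) -> Tuple[float, str]:
    """Return the fallback share in percent and a coarse status label."""
    if total <= 0:
        raise ValueError("total must be positive")
    share = 100.0 * fallback_count / total
    if share >= FLAG_THRESHOLD_PCT:
        return share, "at or above flag threshold"
    if share >= TARGET_PCT:
        return share, "below threshold, above target"
    return share, "within target"


# fallback_exists_only_count = 2 out of 16 outcomes.
share, status = fallback_status(2, 16)
print(f"{share:.1f}% fallback evaluations: {status}")
```

For today's counts this gives 12.5%, which sits under the flag threshold but above the target.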
📊 Measured by Outcome Collector · haiku45 28.9K