[safe-output-health] Safe Output Health Report - 2026-06-07 #37500

2026-06-07T05:59:21Z

github-actions[bot]
Bot Jun 7, 2026

Executive Summary

Seventh consecutive clean day: 0 safe-output job hard failures across 19 executed safe_outputs jobs, in the richest window since 2026-06-01 (49 runs, 02:11Z-05:42Z). This window delivered the broadest PR-reviewer coverage in audit history and finally exercised the long-standing review_path_unresolved_422 gap at scale; the 422 did not reproduce.

Runs in window: 49 (1 self-monitor, 3 reviewer runs in_progress excluded, 1 activation-gated/skipped)
Engines: copilot 37, claude 11
safe_outputs jobs executed / succeeded: 19 / 19 (100%)
Hard failures: 0 · Failed actuation messages: 0
Collection-time validation rejections (by-design): 1
New error clusters: 0

Safe Output Job Statistics

Job / handler	Executions	Success
submit_pull_request_review	~14	100%
create_pull_request_review_comment	~30	100%
create_issue	5	100%
create_discussion	3	100%
create_pull_request	2	100%
update_pull_request	1	100%
push_to_pull_request_branch	1	100%
add_comment	several	100%
upload_asset	6	100%
noop / missing_tool	many / 2	100%

Headline: review_path_unresolved_422 Exercised at Scale — Clean

For 10 consecutive audits the Path-variant fallback fix (pr_review_buffer.cjs:554) sat unexercised because no PR-reviewer workflows ran in the captured early-morning windows. This window broke that streak hard. Across 5 reviewer batches (02:11 / 02:55 / 03:39 / 04:16-17 / 05:12Z), Matt Pocock Skills Reviewer and PR Code Quality Reviewer emitted dozens of line/path-anchored create_pull_request_review_comment messages plus submit_pull_request_review (REQUEST_CHANGES / COMMENT / APPROVE) on PRs #37469, #37474, #37475, #37480, #37481, #37487.

Audited safe_outputs job conclusions (all success, zero 422)

Run	Workflow	Emission	safe_outputs
27080054931	Matt Pocock	8 review comments + COMMENT (#37469)	success, 0/0
27081741698	PR Code Quality	REQUEST_CHANGES + 7 comments (#37475)	success, 0/0
27083500154	PR Code Quality	REQUEST_CHANGES (#37487)	success, 0/0
27082438727	PR Code Quality	batch-4 review	success, 0/0
27082451485	Design Decision Gate	push_to_pull_request_branch ADR 3422B (#37481)	success, 0/0

No "Path could not be resolved" or "Line could not be resolved" 422 appeared in any audited run. The happy path is now confirmed at the highest review-comment volume on record.

Caveat: because GitHub resolved every position and returned no 422, the body-only fallback predicate at pr_review_buffer.cjs:554 was never triggered. The Path-variant recovery code path therefore remains technically unvalidated for the 11th consecutive audit — but the production happy path is now very strongly confirmed.
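A synthetic check of the untriggered fallback could look roughly like the sketch below. It is not the code at pr_review_buffer.cjs:554, which this report does not show. The names isUnresolvedPositionError, toBodyOnly, submitWithFallback and makeFailingSend are hypothetical, the error shape (a status field plus a message) is an assumption modeled on Octokit-style request errors, and send is a synchronous stand-in for what would be an async API call in practice.

```javascript
// Hypothetical predicate; the real check at pr_review_buffer.cjs:554 may differ.
const UNRESOLVED_MARKERS = [
  "Path could not be resolved",
  "Line could not be resolved",
];

function isUnresolvedPositionError(err) {
  if (!err || err.status !== 422) return false;
  const message = String(err.message || "");
  return UNRESOLVED_MARKERS.some((marker) => message.includes(marker));
}

// Fold inline comments into the review body so their text survives
// when the path/line anchors are dropped.
function toBodyOnly(review) {
  const folded = (review.comments || []).map(
    (c) => `- ${c.path}:${c.line} ${c.body}`
  );
  const body = [review.body, ...folded].filter(Boolean).join("\n");
  return { event: review.event, body, comments: [] };
}

// Try the anchored review first; retry body-only on an unresolved-position 422.
// Any other error is rethrown unchanged.
function submitWithFallback(send, review) {
  try {
    return { mode: "anchored", result: send(review) };
  } catch (err) {
    if (!isUnresolvedPositionError(err)) throw err;
    return { mode: "body_only", result: send(toBodyOnly(review)) };
  }
}

// Synthetic stub: rejects any review that still carries anchored comments.
function makeFailingSend() {
  const calls = [];
  const send = (review) => {
    calls.push(review);
    if (review.comments.length > 0) {
      const err = new Error("Validation Failed: Path could not be resolved");
      err.status = 422;
      throw err;
    }
    return { id: 1 };
  };
  return { send, calls };
}

const stub = makeFailingSend();
const outcome = submitWithFallback(stub.send, {
  event: "COMMENT",
  body: "Summary",
  comments: [{ path: "src/a.js", line: 10, body: "nit" }],
});
console.log(outcome.mode, stub.calls.length);
```

Forcing the 422 from a stub exercises the recovery branch deterministically, so validation does not depend on production eventually emitting an unresolvable path.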

Low-Severity Observations

1. add_comment max-count over-emission (by-design enforcement)

Test Quality Sentinel run-27080054894 emitted a placeholder "test body" add_comment (temporary_id aw_fjvg2zVq) alongside its real APPROVE on #37475. Collection-time validation rejected the excess item with "Too many items of type add_comment. Maximum allowed: 1." The safe_outputs job still completed with success and the APPROVE actuated (net Failed: 0), so the by-design max enforcement is working correctly. Minor agent-side output-hygiene note: the agent should not have emitted a literal "test body" placeholder.
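The max-count behavior described above can be illustrated with a minimal sketch. The function enforceMaxCounts and the shape of the limits table are hypothetical; only the rejection message format and the add_comment limit of 1 come from this report. The sketch also assumes the first items in emission order are kept, which the report does not confirm.

```javascript
// Hypothetical collection-time validator: keeps up to `max` items per type
// (in emission order, an assumption) and rejects the rest with a reason.
function enforceMaxCounts(items, limits) {
  const seen = new Map();
  const accepted = [];
  const rejected = [];
  for (const item of items) {
    const count = (seen.get(item.type) || 0) + 1;
    seen.set(item.type, count);
    const max = limits[item.type];
    if (max !== undefined && count > max) {
      rejected.push({
        item,
        reason: `Too many items of type ${item.type}. Maximum allowed: ${max}.`,
      });
    } else {
      accepted.push(item);
    }
  }
  return { accepted, rejected };
}

// Illustrative input only; the actual emitted items are not listed in the report.
const result = enforceMaxCounts(
  [
    { type: "add_comment", body: "real comment" },
    { type: "add_comment", body: "test body" },
    { type: "submit_pull_request_review", event: "APPROVE" },
  ],
  { add_comment: 1 }
);
console.log(result.rejected.map((r) => r.reason));
```

Because rejection happens per item rather than per job, the remaining outputs (here the APPROVE review) still proceed to actuation, matching the "net Failed: 0" outcome noted above.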

2. Out-of-scope agent permission friction (healthy safe-output handoff)

Two workflows emitted missing_tool items reporting numerous permission-denied errors: Copilot CLI Deep Research Agent (27083451001) and Daily Compiler Quality Check (27081699666). These are agent-job permission issues and therefore out of scope here, but they are a positive signal for safe outputs: both runs cleanly emitted and processed the missing_tool fallback (Failed: 0). The same Copilot-CLI permission-denied pattern was seen on prior days.

Recommendations

No safe-output-job action required — 7 consecutive clean days; all handlers healthy at high volume.
Downgrade review_path_unresolved_422 watch priority: the happy path is now confirmed at record scale across 6 PRs. The Path-variant predicate fix is purely defensive — keep it in tree but consider closing active monitoring unless a real 422 recurs. A synthetic/smoke test that forces a Path could not be resolved 422 would validate the fallback directly rather than waiting for production to produce one.
Agent-side (out of scope; refer to agent-health monitor): recurring Copilot-CLI permission-denied friction (Deep Research / Compiler Quality); stray "test body" placeholder add_comment from Test Quality Sentinel.

Metrics

Overall safe-output success rate: 100% (19/19 jobs)
Most exercised handler: create_pull_request_review_comment (~30, all clean)
Trend: stable. 06-01 (84 jobs), then six partial windows, all at 100%; today's window had the largest reviewer coverage and still reached 100%.

References: 27080054931 · 27081741698 · 27083500154

Generated by 🔒 Safe Output Health Monitor · 581.4 AIC · ⌖ 32.6 AIC · ⊞ 10.2K · ◷

expires on Jun 8, 2026, 5:59 AM UTC

2026-06-08T06:14:08Z

github-actions[bot]
Bot Jun 8, 2026
Author

This discussion was automatically closed because it expired on 2026-06-08T05:59:21.534Z.

Closed by Workflow
