[safe-output-health] 🏥 Safe Output Health Report - 2026-05-21 #33705
This discussion has been marked as outdated by Safe Output Health Monitor. A newer discussion is available at Discussion #33948.
Executive Summary
Safe-output health is strong: 47 of 48 dispatched jobs completed all messages successfully. One run (§26206536101 — Matt Pocock Skills Reviewer) hit a single root cause that failed all 5 of its messages because PR review submission is atomic.
Safe Output Job Statistics
Error Clusters
Cluster 1: review_path_unresolved_422 (PR review submission atomic failure)
Count: 5 message failures across 1 run
Affected Jobs: create_pull_request_review_comment (4), submit_pull_request_review (1)
Affected Workflows: Matt Pocock Skills Reviewer
Affected Run: §26206536101 on PR "refactor(parser): break up oversized functions in pkg/parser to satisfy 60-line limit" #33683
Sample Error:
Root Cause: The handler buffers each create_pull_request_review_comment individually (logging "Message N completed successfully" for each), then submits all 4 as a single PR review at the end. GitHub's POST /pulls/{n}/reviews is atomic: if any one comment references a path/line not present in the diff, the entire review is rejected and every buffered comment is reported as failed. Candidate offending paths in this run: pkg/parser/import_bfs.go:35, :97, :118, and docs/adr/33683-extract-helpers-to-satisfy-function-line-limit.md:48. The .md comment likely targeted either a new file the agent invented or a line outside the diff hunks.
Impact: High for this run (all skill-review feedback dropped). Low repo-wide (1/48 jobs).
Root Cause Analysis
API-Related Issues
The only error category in the window is a GitHub REST 422 from POST /pulls/{n}/reviews. No rate-limit hits (the run's pre-check showed 13390/15000 remaining, about 89% headroom), no auth failures, no network timeouts.
Data Validation Issues
The failure is effectively a missing pre-flight diff validation step in the safe-output handler. The handler trusts the agent's (path, line) coordinates until submission time. The error message returned by GitHub does not identify which comment was offending, which forces the whole batch to fail without diagnostic info.
Permission Issues
None observed. The same job had "Resource not accessible by integration" warnings in adjacent Design Decision Gate runs for branch protection reads, but those are warnings, not safe-output failures, and they did not affect output processing.
Other Issues
No parsing, JSON, or handler-manager bugs were observed. All 88 messages were parsed and dispatched to handlers correctly. The 5 failures occurred purely at the GitHub API submit stage.
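Because the failures arise only at the submit stage and the 422 does not name the offending comment, one way to recover that information is to retry the buffered comments one at a time after an atomic rejection. The sketch below is illustrative only: `submitWithFallback`, `submitReview`, and `submitComment` are hypothetical names, the two callbacks stand in for whatever the handler actually uses to call POST /pulls/{n}/reviews and the single-comment endpoint, and it assumes a rejected call throws an error carrying a numeric `status`, as Octokit request errors do.

```javascript
// Hypothetical sketch, not code from safe_output_handler_manager.cjs.
// `submitReview(comments)` and `submitComment(comment)` are injected async
// functions (assumptions) that wrap the real GitHub API calls.
async function submitWithFallback(comments, submitReview, submitComment) {
  try {
    // First try the normal path: one atomic review containing every comment.
    await submitReview(comments);
    return { posted: comments, failed: [] };
  } catch (err) {
    // Only the atomic 422 rejection is handled here; anything else propagates.
    if (err.status !== 422) throw err;
  }

  // Fallback: submit each comment on its own so a bad entry fails alone.
  const posted = [];
  const failed = [];
  for (const comment of comments) {
    try {
      await submitComment(comment);
      posted.push(comment);
    } catch (err) {
      // Record which comment failed and why, instead of failing the batch.
      failed.push({ comment, message: err.message });
    }
  }
  return { posted, failed };
}
```

With this shape, the `failed` list names the specific comments that GitHub rejected, at the cost of extra API calls in the failure case.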
Recommendations
Critical Issues (Immediate Action Required)
None. Failure is isolated to one run and the affected workflow's other invocations succeeded.
Bug Fixes Required
safe_output_handler_manager.cjs: PR review finalization block (=== Finalizing PR Review ===)
A single invalid (path, line) causes POST /pulls/{n}/reviews to 422; the manager records all buffered comments as failed without identifying which one is at fault.
Call GET /repos/{owner}/{repo}/pulls/{n}/files and verify each buffered comment's path exists in the response and its line/position falls inside a diff hunk. Drop invalid ones (logging which) and submit the rest. On 422 from GitHub, fall back to submitting comments individually so the offending entry is isolated.
Affected Jobs: create_pull_request_review_comment, submit_pull_request_review
Configuration Changes
None recommended. The Matt Pocock Skills Reviewer config (max=10, side=RIGHT) is reasonable.
Process Improvements
Improve agent-side path discipline
The agent may reference paths (for example docs/adr/33683-extract-helpers-to-satisfy-function-line-limit.md) that are not present as modified lines in the PR diff.
For create_pull_request_review_comment, the agent prompt for Matt Pocock Skills Reviewer should include the explicit constraint: "Only comment on lines that appear in the PR diff. If you want to suggest a new file, use add_comment (issue comment) instead."
This reduces "Path could not be resolved" 422s without code changes.
Better error surfacing in handler manager
Work Item Plans
Work Item 1: Pre-validate PR review comment paths against PR diff
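As a rough illustration of the pre-validation this work item names, the sketch below builds a set of changed paths and a per-path list of hunk ranges from a pulls/files response, then partitions buffered comments against them. The function names (`parseHunkRanges`, `buildDiffIndex`, `partitionComments`) are hypothetical, and the comment shape `{ path, line }` is an assumption. It relies on the `filename` and `patch` fields that GitHub returns for each file, and it only considers new-side (RIGHT) line numbers.

```javascript
// Hypothetical helpers; none of these names come from the handler itself.

// Extract new-side (RIGHT) line ranges from one file's unified-diff patch.
// Hunk headers look like "@@ -30,7 +30,9 @@"; a missing count means 1 line.
function parseHunkRanges(patch) {
  const ranges = [];
  const header = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/;
  for (const line of (patch || "").split("\n")) {
    const match = header.exec(line);
    if (!match) continue;
    const start = Number(match[1]);
    const count = match[2] === undefined ? 1 : Number(match[2]);
    // A count of 0 means the hunk has no new-side lines (e.g. a deletion).
    if (count > 0) ranges.push({ start, end: start + count - 1 });
  }
  return ranges;
}

// Build a Set of paths and a Map of path -> hunk ranges from the files list.
// Files without a `patch` (binary or very large) get an empty range list.
function buildDiffIndex(files) {
  const paths = new Set();
  const hunks = new Map();
  for (const file of files) {
    paths.add(file.filename);
    hunks.set(file.filename, parseHunkRanges(file.patch));
  }
  return { paths, hunks };
}

// Split buffered comments into those inside a hunk and those to drop.
function partitionComments(comments, index) {
  const valid = [];
  const dropped = [];
  for (const comment of comments) {
    const ranges = index.hunks.get(comment.path) || [];
    const inHunk = ranges.some(
      (r) => comment.line >= r.start && comment.line <= r.end
    );
    (index.paths.has(comment.path) && inHunk ? valid : dropped).push(comment);
  }
  return { valid, dropped };
}
```

The `dropped` list is what the handler could log before submitting only the `valid` comments, so a rejected entry is named rather than silently taking the batch down with it.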
GET /repos/{owner}/{repo}/pulls/{n}/files once per review submission
POST /pulls/{n}/reviews
create_pull_request_review_comment handler buffer flush in safe_output_handler_manager.cjs. Use the existing getOctokit to call pulls/files; build a Set<path> and a Map<path, Array<{start,end}>> of hunk ranges; filter buffered comments through this before the review POST.
Work Item 2: Individual-comment fallback when atomic review POST 422s
When POST /pulls/{n}/reviews returns 422 "Path could not be resolved", fall back to submitting each buffered comment individually so the offending entry is isolated and the rest survive.
pulls/{n}/comments calls.
Work Item 3: Agent prompt constraint for Matt Pocock Skills Reviewer
add_comment instead.
Historical Context
This is the first Safe Output Health audit recorded in cache memory (/tmp/gh-aw/cache-memory/safe-output-health/). No prior baseline to compare. Pattern review_path_unresolved_422 added to error-patterns.json for future trend tracking.
Trends
Metrics and KPIs
add_comment, create_issue, update_issue, add_labels, noop: 100% in this window
create_pull_request_review_comment (atomic submission risk)
Next Steps
review_path_unresolved_422 cluster does not recur in the next 24h audit
safe-output-health/2026-05-21.json
References: