[safe-output-health] Safe Output Health Report - 2026-05-22 (95.8% success, 1 new handler-inconsistency cluster) #33948

2026-05-22T05:55:11Z

github-actions[bot]
Bot May 22, 2026

Overview

Reviewed all 65 agentic workflow runs in github/gh-aw from the last 24 hours (2026-05-21 ~05:30 UTC → 2026-05-22 ~05:30 UTC). Of those, 24 runs reached the Process Safe Outputs step and collectively handled approximately 105 safe-output messages. 103 messages succeeded, 2 failed (both in a single run), and 1 run hit the same 422 review-path pattern from yesterday but recovered cleanly via the new body-only fallback. Overall safe-output run-level success rate: 95.8% (23 of 24 runs with zero failed messages).

The headline of yesterday's audit — review_path_unresolved_422 — appears to be remediated: the handler now catches the 422 and retries as a body-only review. Today surfaces a new, narrower handler inconsistency that explains the single failed run.

Summary

Period: Last 24 hours
Runs analyzed: 65
Active workflows: ~30
Safe-output jobs that processed messages: 24
Safe-output runs failed: 1 (Smoke Claude run §26269897290)
Soft-recovered warnings: 1 (Smoke Copilot run §26269860137)
Total messages processed: ~105 (103 successful / 2 failed)
Error clusters identified: 1 new (target_star_review_comment_no_pr_number_fallback), 1 recurring-but-now-recovering (review_path_unresolved_422)

Safe-Output Job Statistics

Job/Message Type	Executions Observed	Failures	Success Rate
create_issue	~6	0	100%
add_comment	~10	0	100%
add_labels	~6	0	100%
add_reviewer	~3	0	100%
update_pull_request	~3	0	100%
submit_pull_request_review	~5	0 (1 soft-recovered)	100%
create_pull_request_review_comment	~10	2	80%
push_to_pull_request_branch	~4	0	100%
close_pull_request	~2	0	100%
create_code_scanning_alert	~2	0	100%
resolve_pull_request_review_thread	~3	0 (3 skipped, token config)	100%
post_slack_message (custom)	~2	0	100%
comment_memory	~2	0	100%
miscellaneous (set_issue_type, dispatch_workflow, reply_to_review_comment, etc.)	~8	0	100%

Counts are approximate — derived from Total messages summaries plus per-handler grep. The summary in gh-aw MCP logs reports total_safe_items: 75 (which appears to exclude certain types); my walk of Process Safe Outputs.txt summaries totals ~105.
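As an illustration of the per-handler grep mentioned above, here is a minimal sketch that tallies failure lines by handler. It assumes only the failure-line shape quoted in this report (`✗ Message N (handler) failed: reason`). The function name `tallyFailures` is hypothetical, and the Message 6 line in the sample is assumed to mirror the Message 5 line.

```javascript
// Matches failure lines of the form quoted in this report:
//   ##[error]✗ Message 5 (create_pull_request_review_comment) failed: <reason>
const FAILURE_LINE = /✗ Message (\d+) \(([^)]+)\) failed: (.*)$/;

// Hypothetical helper: groups failed messages by handler name.
function tallyFailures(logText) {
  const byHandler = new Map();
  for (const line of logText.split(/\r?\n/)) {
    const match = FAILURE_LINE.exec(line);
    if (!match) continue; // warnings and success lines are ignored
    const [, index, handler, reason] = match;
    const entries = byHandler.get(handler) ?? [];
    entries.push({ messageIndex: Number(index), reason });
    byHandler.set(handler, entries);
  }
  return byHandler;
}

// Sample input; the Message 6 line is an assumption modeled on Message 5.
const sampleLog = [
  '##[warning]Target is "*" but no pull_request_number specified in comment item',
  '##[error]✗ Message 5 (create_pull_request_review_comment) failed: Target is "*" but no pull_request_number specified',
  '##[error]✗ Message 6 (create_pull_request_review_comment) failed: Target is "*" but no pull_request_number specified',
].join("\n");

const failures = tallyFailures(sampleLog);
for (const [handler, entries] of failures) {
  console.log(`${handler}: ${entries.length} failed message(s)`);
}
```

Success lines are not counted here because their per-message format is not quoted in this report.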

Critical Issues

Cluster 1 (NEW): `target_star_review_comment_no_pr_number_fallback`

Count: 2 occurrences in 1 run
Affected handler: create_pull_request_review_comment
Affected workflow: Smoke Claude

Sample error (Process Safe Outputs.txt from §26269897290, step 9_Process Safe Outputs, lines 409-410):

##[warning]Target is "*" but no pull_request_number specified in comment item
##[error]✗ Message 5 (create_pull_request_review_comment) failed: Target is "*" but no pull_request_number specified

Root cause: Inconsistent target: "*" auto-resolution across handlers. In the same run, update_pull_request (msg 4) and submit_pull_request_review (msg 7) both auto-resolved target: "*" to the triggering PR #33886 ("Add Pi inference request diagnostics to provider logging"), logging Resolved target pull request #33886 (target config: *) and Set review context from triggering PR: github/gh-aw#33886 respectively. create_pull_request_review_comment (msgs 5 and 6), however, rejected the items without attempting a fallback, even though that fallback does exist in this handler on another code path: it worked in §26269860246 and §26267820166, where the same handler logged Fetched full pull request details for PR #xxxxx.
Impact: Per-message failure; the other 10 messages in the same run still completed. Job-level status escalated to "failed". Smoke-only today, but it could surface on any workflow whose agent emits target: "*" review comments without an explicit pull_request_number.

Recovered Patterns (no action required — keep monitoring)

Cluster 2 (REMEDIATED from 2026-05-21): `review_path_unresolved_422`

Count: 1 occurrence in 1 run, gracefully recovered
Affected handler: submit_pull_request_review
Affected workflow: Smoke Copilot

Evidence (Process Safe Outputs.txt from §26269860137, step 9_Process Safe Outputs, lines 553-556):

POST /repos/github/gh-aw/pulls/33852/reviews - 422
##[warning]PR review submission failed due to unresolvable comment line(s):
  Unprocessable Entity: "Line could not be resolved". Retrying as body-only review.
Created PR review #4342724195 (body-only fallback)
✓ PR review submitted successfully

Outcome: Final summary reports Failed: 0. Yesterday this same pattern hard-failed the run; today the handler catches the 422 and falls back to a body-only review. Yesterday's primary recommendation is implemented.
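The recovery described above can be sketched roughly as follows. This is not the actual handler code: `postReview` is a hypothetical injected function standing in for the real API call, and it is assumed to throw an error carrying a numeric `status` when the request is rejected.

```javascript
// Hypothetical sketch of a "retry as body-only review" fallback.
async function submitReviewWithFallback(postReview, review) {
  try {
    const created = await postReview(review);
    return { ...created, fallback: false };
  } catch (error) {
    const hasInlineComments =
      Array.isArray(review.comments) && review.comments.length > 0;
    // Only a 422 on a review that carries inline comments is retried;
    // anything else is rethrown unchanged.
    if (error.status !== 422 || !hasInlineComments) throw error;
    // Drop the inline comments (their lines could not be resolved) and keep the body.
    const { comments, ...bodyOnly } = review;
    const created = await postReview(bodyOnly);
    return { ...created, fallback: true };
  }
}

// Fake API for demonstration: rejects any review that has inline comments.
async function fakePostReview(review) {
  if (review.comments && review.comments.length > 0) {
    const error = new Error('Unprocessable Entity: "Line could not be resolved"');
    error.status = 422;
    throw error;
  }
  return { id: 1, body: review.body };
}

const demo = submitReviewWithFallback(fakePostReview, {
  body: "Looks good overall.",
  event: "COMMENT",
  comments: [{ path: "src/example.js", line: 9999, body: "nit" }],
});
demo.then((result) =>
  console.log(result.fallback ? "body-only fallback" : "full review")
);
```

Limiting the retry to status 422 keeps authentication or rate-limit errors visible as hard failures instead of silently dropping inline comments.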

Root Cause Analysis

Handler-level inconsistency (target resolution)

Three handlers all support target: "*", but their treatment of the case "item lacks pull_request_number" diverges:

Handler	Behavior on `target: "*"` + no `pull_request_number` on item
`update_pull_request`	Resolves to triggering PR (works in 26269897290 msg 4)
`submit_pull_request_review`	Sets review context from triggering PR (works in 26269897290 msg 7)
`create_pull_request_review_comment`	Rejects with error when the item itself carries `target: "*"`; resolves to triggering PR when the item omits `target` and inherits the handler-level default. This branch-conditional behavior is the bug.

API/permission warnings (informational — no escalation)

Skipping resolve_pull_request_review_thread for PRRT_*: Resource not accessible by integration — recurring across Smoke Claude runs. Config issue: GITHUB_TOKEN does not have pull_requests: write GraphQL scope for resolving review threads. Surfaces as an informative skip, not a failure.
##[warning][renderMarkdownTemplate] Fence count mismatch: input had N fence marker(s), output has M — cosmetic; occurs whenever conditional blocks containing fences are stripped. No functional impact.

Recommendations

Immediate Actions

None — no production safe-output failures in the last 24 hours. The single failed run is a smoke test exercising a real handler gap.

Bug Fix Required

Unify target: "*" resolution across safe-output handlers
- File: actions/safe-outputs/safe_output_handler_manager.cjs (and individual handler modules)
- Problem: create_pull_request_review_comment has two code paths for target: "*"; only one falls back to the triggering PR.
- Fix: Extract a shared resolveTriggeringPullRequest({ item, handlerConfig, env }) utility used by create_pull_request_review_comment, submit_pull_request_review, update_pull_request, add_reviewer, push_to_pull_request_branch, etc.
- Affected jobs: create_pull_request_review_comment (others already do the right thing)
- Effort: Small
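A minimal sketch of what such a shared utility might look like is shown here. The signature follows the one proposed above, but the field `env.triggeringPullRequestNumber` and the returned object shape are hypothetical and do not reflect the actual gh-aw data model.

```javascript
// Sketch only. Field names on `env` and the return shape are assumptions.
function resolveTriggeringPullRequest({ item, handlerConfig, env }) {
  // An explicit number on the item always wins.
  if (Number.isInteger(item.pull_request_number)) {
    return { ok: true, number: item.pull_request_number, source: "item" };
  }

  // An item-level target overrides the handler-level default, but both
  // ways of arriving at "*" go through the same fallback below.
  const target = item.target ?? handlerConfig.target;
  const triggering = env.triggeringPullRequestNumber;

  if (target === "*" && Number.isInteger(triggering)) {
    return { ok: true, number: triggering, source: "triggering" };
  }

  return {
    ok: false,
    error: `Target is "${target}" but no pull_request_number specified and no triggering PR is available`,
  };
}

const handlerConfig = { target: "*" };
const env = { triggeringPullRequestNumber: 33886 };

// Item carries target "*" itself (the branch that failed in this audit).
const fromItemTarget = resolveTriggeringPullRequest({
  item: { target: "*", path: "src/example.js", line: 12 },
  handlerConfig,
  env,
});

// Item omits target and inherits the handler-level default.
const fromDefault = resolveTriggeringPullRequest({
  item: { path: "src/example.js", line: 12 },
  handlerConfig,
  env,
});

console.log(fromItemTarget.number === fromDefault.number); // both resolve to the same PR
```

Returning a result object instead of throwing lets each handler decide whether an unresolved target is a per-message failure or a skip.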

Process Improvement

Improve error granularity on item rejection
- Current: Target is "*" but no pull_request_number specified. The message carries no item index, no path, and no distinction between "agent emitted bad data" and "handler couldn't auto-resolve".
- Proposed: Include message_index, path, and a hint like Set explicit pull_request_number or remove the target field to use handler default in the error so agents can self-correct on retry and humans can read the log faster.
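One possible shape for the richer message is sketched below. The helper name `formatRejection` and the bracketed context layout are assumptions for illustration; only the hint text and the `✗ Message N (handler) failed:` prefix come from this report.

```javascript
// Hypothetical formatter: appends item-level context and a hint to the
// existing failure-line prefix.
function formatRejection({ messageIndex, handler, item, reason }) {
  const context = [`message_index=${messageIndex}`];
  if (item.path) context.push(`path=${item.path}`);
  if (item.line !== undefined) context.push(`line=${item.line}`);
  const hint =
    "Set explicit pull_request_number or remove the target field to use handler default";
  return `✗ Message ${messageIndex} (${handler}) failed: ${reason} [${context.join(", ")}]. Hint: ${hint}`;
}

const rejection = formatRejection({
  messageIndex: 5,
  handler: "create_pull_request_review_comment",
  item: { target: "*", path: "src/example.js", line: 12 },
  reason: 'Target is "*" but no pull_request_number specified',
});
console.log(rejection);
```

Keeping the original prefix unchanged means existing log greps for `✗ Message N (handler) failed:` would continue to match.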

Work Item Plans

Work Item 1: Fix `target: "*"` fallback inconsistency in `create_pull_request_review_comment`

Type: Bug Fix
Priority: High (single-source-of-truth prevents this entire class of inconsistency)
Description: Handler rejects items with target: "*" and no pull_request_number even when the triggering PR context is available (and sibling handlers in the same run use it). Refactor to share one resolveTriggeringPullRequest utility across handlers.
Acceptance criteria:
- Single utility function used by all PR-targeting handlers
- A regression test that reproduces 26269897290's payload (item with target: "*", no pull_request_number) and asserts the handler resolves to the triggering PR
- Smoke Claude workflow runs cleanly with Failed: 0 on a representative sample
Technical approach: Audit safe_output_handler_manager.cjs, identify per-handler resolution paths, extract shared utility. Add unit tests covering: explicit pull_request_number, target: "*" + no number (should auto-resolve), target: "owner/repo#N" + no number, missing triggering-PR env (should reject with clearer error).
Estimated effort: Small (~half day including tests)
Dependencies: None.

Work Item 2: Add per-item index to safe-output rejection errors

Type: Enhancement
Priority: Medium
Description: When a safe-output item is rejected, the error message does not identify which item index/path failed. Add Message N index and any item-level fields (path, line, pull_request_number) to the error.
Acceptance criteria:
- All ##[error]✗ Message N (handler) failed: lines include enough context for the user (or agent) to identify the bad item without reading earlier lines
- No new noisy logging on success path
Estimated effort: XS

Historical Context

Date	Runs	Safe-output runs	Failed runs	Success rate	Headline cluster
2026-05-21	73	48	1	97.9%	`review_path_unresolved_422` (hard fail)
2026-05-22	65	24	1	95.8%	`target_star_review_comment_no_pr_number_fallback` (NEW)

Trend: Yesterday's headline pattern is now recovering automatically (one occurrence today, fully self-healed via body-only retry). Today's new pattern is a related but narrower handler-inconsistency cluster — easy to fix at the source. Absolute count of failed safe-output runs holds at 1/day across both audits; the affected workflow in both cases is a smoke/test workflow exercising edge cases, not a production data-flow workflow.

Metrics and KPIs

Overall safe-output run success rate: 95.8% (23/24 runs without failures)
Per-message success rate: ~98.1% (103/105)
Most reliable handlers: create_issue, add_comment, add_labels, update_pull_request, add_reviewer, push_to_pull_request_branch, create_code_scanning_alert, comment_memory — 100% across all observed runs
Most problematic handler: create_pull_request_review_comment — 80% (2 failures of ~10 invocations, all in 1 smoke run)
Already-remediated pattern: review_path_unresolved_422 — soft recovery confirmed working
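For reference, the percentages above follow from the raw counts in this report. A small sketch of the arithmetic, with a hypothetical helper name and rounding to one decimal place:

```javascript
// Hypothetical helper: success rate as a percentage, rounded to one decimal.
function successRate(succeeded, total) {
  return Math.round((succeeded / total) * 1000) / 10;
}

const runLevel = successRate(23, 24);       // runs with zero failed messages
const messageLevel = successRate(103, 105); // messages, using the approximate total
const reviewComment = successRate(8, 10);   // create_pull_request_review_comment, ~10 invocations

console.log(`${runLevel}% run-level, ${messageLevel}% per-message, ${reviewComment}% review comments`);
// prints "95.8% run-level, 98.1% per-message, 80% review comments"
```

The per-message and per-handler figures inherit the uncertainty of the approximate message counts.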

Next Steps

Open a bug-fix PR implementing Work Item 1 (unify target: "*" resolution)
Open an enhancement PR for Work Item 2 (item-index error context)
Monitor for a second occurrence of target_star_review_comment_no_pr_number_fallback over the next 1-2 daily audits to confirm whether the smoke test deterministically reproduces it (rather than being intermittent agent-output noise)
Keep monitoring review_path_unresolved_422 recurrences to ensure body-only fallback continues to land cleanly

References:

Failed run: §26269897290 — Smoke Claude, 2 create_pull_request_review_comment failures
Soft-recovered run: §26269860137 — Smoke Copilot, 422 → body-only fallback success
Sibling run with same handler working: §26269860246 — Smoke Claude, same handler succeeded via triggering-PR auto-resolve

Generated by 🔒 Safe Output Health Monitor · ● 49.5M

expires on May 23, 2026, 5:55 AM UTC

2026-05-23T05:42:35Z

github-actions[bot]
Bot May 23, 2026

This discussion has been marked as outdated by Safe Output Health Monitor.

A newer discussion is available at Discussion #34174.
