[safe-output-health] Safe Output Health Report — 2026-05-31: assign_to_agent number-guess failure (96.7% msg success) #36066

2026-05-31T05:58:29Z

github-actions[bot]
Bot May 31, 2026

Executive Summary

Audited the last 24h of agentic workflow activity (window ≈ 01:24Z–05:27Z, 2026-05-31). 41 runs analyzed (39 completed, 2 in-progress incl. this monitor). 23 safe-output jobs processed 61 messages; 2 failed — both in one run, on assign_to_agent.

Runs: 41 (copilot 31 / claude 8) · Safe-output jobs: 23 · Messages: 61
Messages failed: 2 (assign_to_agent, run-26702419759)
Message success rate: 96.7% (59/61) · Run-level job success: 95.7% (22/23)
Soft recoveries: 1 (body-only review fallback)
Clusters: 1 new hard-failure + 1 healthy recurring-cluster validation

Clean day except one new failure cluster in LintMonster (agent guessed issue numbers instead of using temporary-id cross-refs). Separately, the tracked review_path_unresolved_422 Line-variant fallback recovered correctly for the 4th time.

Safe-Output Statistics

Handler	Msgs	Failed	Rate
add_comment	15	0	100%
create_pull_request_review_comment	11	0	100%
create_pull_request	8	0	100%
create_issue	6	0	100%
create_discussion	5	0	100%
submit_pull_request_review	4	0	100% (1 via fallback)
assign_to_agent	6	2	66.7%
update_pull_request / push_to_pr_branch / add_labels	3	0	100%
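
The rates above follow directly from the processed and failed counts. A minimal sketch of that arithmetic, assuming one-decimal rounding; `successRate` is a hypothetical helper, not part of the monitor:

```javascript
// Hypothetical helper: success rate as a percentage, rounded to one decimal.
function successRate(processed, failed) {
  if (processed === 0) return null; // no messages, so the rate is undefined
  return Math.round(((processed - failed) / processed) * 1000) / 10;
}

const assignRate = successRate(6, 2);   // assign_to_agent row: 4 of 6
const messageRate = successRate(61, 2); // all messages: 59 of 61
const jobRate = successRate(23, 1);     // safe-output jobs: 22 of 23

console.log(assignRate, messageRate, jobRate); // 66.7 96.7 95.7
```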

🔴 Critical Cluster (NEW): `assign_to_agent` literal issue-number guess

Run: §26702419759 — LintMonster (copilot) · 2 message failures
Error: ##[error]✗ Message 4 (assign_to_agent) failed: ... Could not resolve to an Issue with the number of 36048 (and 36049). ##[error]2 safe output(s) failed.
Root cause: LintMonster created 3 issues that received the real numbers #36050 ([lint-monster] [Lint] Fix pkg/workflow function length violations (286 issues)), #36051 ([lint-monster] [Lint] Fix pkg/cli function length violations (369 issues)), and #36052 ([lint-monster] [Lint] Fix pkg/linters and pkg/parser function length violations (14 issues)), created from temporary IDs aw_Xa3lqDic/aw_JNSbMXq8/aw_7zxP5sPj per temporary-id-map.json. It then emitted assign_to_agent with the literal issue_number values 36048/36049/36050 (confirmed in agent_output.json): predicted numbers, each off by two. Only #36050 matched a real issue and succeeded; #36048 and #36049 did not resolve to issues, producing the 2 hard failures. The agent guessed numbers instead of using the #aw_<temporaryId> cross-reference form that the processor rewrites.
System-side contributors: the assign_to_agent handler (a) hard-fails the job (##[error]) on an unresolvable issue instead of soft-skipping a best-effort assignment, and (b) does not resolve #aw_ temporary-ids on its issue_number field the way add_comment/create_pull_request do.
Impact: per-message. The 3 create_issue, 1 create_discussion, and 1 valid assign (#36050) in the same run all succeeded; the 2 other new issues lost their intended Copilot assignment and the job is marked failed.
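
The cross-reference form can be illustrated with a small resolver. This is a sketch under assumptions: the shape of temporary-id-map.json (temporary ID to issue number) and the positional pairing of IDs to numbers are inferred from the root-cause text, and `resolveIssueNumber` is a hypothetical name, not the processor's actual function.

```javascript
// Assumed shape of temporary-id-map.json: temporary ID -> real issue number.
// The ID-to-number pairing below is assumed to be positional.
const temporaryIdMap = {
  aw_Xa3lqDic: 36050,
  aw_JNSbMXq8: 36051,
  aw_7zxP5sPj: 36052,
};

// Hypothetical resolver: accepts "#aw_<id>", "aw_<id>", or a literal number.
// Returns the issue number, or null when the reference cannot be resolved.
function resolveIssueNumber(ref, idMap) {
  const text = String(ref).trim();
  const match = /^#?(aw_[A-Za-z0-9]+)$/.exec(text);
  if (match) {
    const resolved = idMap[match[1]];
    return Number.isInteger(resolved) ? resolved : null;
  }
  const literal = Number(text);
  return Number.isInteger(literal) && literal > 0 ? literal : null;
}

const resolved = ["#aw_Xa3lqDic", "#aw_JNSbMXq8", "#aw_7zxP5sPj"].map((ref) =>
  resolveIssueNumber(ref, temporaryIdMap)
);
// resolved is [36050, 36051, 36052], where the run emitted 36048/36049/36050.
```

A literal number passes through unchanged, so a guessed value such as 36048 is only caught later, when the issue lookup fails.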

✅ Positive: `review_path_unresolved_422` Line-variant fallback recovered (4th time)

§26700870340 (Matt Pocock): submit_pull_request_review hit 422 "Line could not be resolved"; body-only fallback fired and retried successfully (Failed: 0). 4th Line-variant soft recovery (after 05-22, 05-26).
3 other reviewer runs submitted line-anchored reviews cleanly with no 422: PR Code Quality (§26700056079, §26700870334), Matt Pocock (§26700056087).
⚠️ Path variant still UNVALIDATED (4th consecutive audit) — no "Path could not be resolved" 422 exercised since the 2026-05-27 regression; the pr_review_buffer.cjs:554 predicate fix remains unconfirmed in production.

Recurring clusters — status today

Cluster	Status
review_path_unresolved_422 (Line)	✅ Exercised — fallback recovered
review_path_unresolved_422 (Path)	⏸️ Not exercised — fix unvalidated (4th audit)
target_star_review_comment_no_pr_number_fallback	⏸️ Not reproduced (11 review-comments had valid PR context)
target_star_add_comment_no_item_number_fallback	⏸️ Not reproduced (15 add_comments had explicit numbers)
cancellation_counter_mislabeled_code_push_failed	⏸️ Not exercised (no WTD3 abort)

Low-severity observation — metrics blind spot

The logs aggregator reported total_safe_items: 0 / "0 write runs, 41 read-only", yet 61 messages were processed incl. dozens of real writes (8 PRs, 6 issues, 5 discussions, 15 comments, 1 push, 1 PR update, 1 label set). Cause: today's workflows emit via the bash_safeoutputs CLI wrapper, which the actuation counter doesn't attribute as a write — under-reporting volume by ~100%. Not a failure; worth aligning the counter. (Note: all ::error:: lines elsewhere were agent/detection-job concerns — out of scope; their safe-output jobs processed Failed: 0.)

Recommendations & Work Items

WI-1 (High) — assign_to_agent temporary-id + soft-skip. Eliminate the "Could not resolve to an Issue" hard failures.

Agent-side (primary): Update LintMonster (and any create-then-assign workflow) to reference newly-created issues via #aw_<temporaryId> in assign_to_agent.issue_number, never guessed numbers.
System-side (defensive): In safe_output_handler_manager.cjs — (a) resolve #aw_ refs on issue_number; (b) on unresolvable target emit ##[warning] soft-skip, not a job-failing ##[error]; (c) list the run's created-issue numbers in the error.
Acceptance: temporary-ids used + resolved; unresolvable assign → warning not error; error lists candidates. Effort: Small–Medium.
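
One possible shape for items (b) and (c), as a sketch only: `planAssignments` and the injected `issueExists` callback are hypothetical and do not reflect the current code in safe_output_handler_manager.cjs. It relies on the GitHub Actions `::warning::` workflow command, which annotates the log without failing the job.

```javascript
// Hypothetical sketch of WI-1 (b) and (c); not the actual handler code.
// issueExists is an injected lookup that returns true when the issue resolves.
function planAssignments(issueNumbers, createdIssueNumbers, issueExists) {
  const assigned = [];
  const annotations = [];
  const candidates = createdIssueNumbers.map((n) => `#${n}`).join(", ");
  for (const number of issueNumbers) {
    if (issueExists(number)) {
      assigned.push(number);
      continue;
    }
    // Soft-skip: emit a warning annotation and list this run's created issues.
    annotations.push(
      `::warning::assign_to_agent skipped: issue #${number} not found; ` +
        `issues created in this run: ${candidates}`
    );
  }
  return { assigned, annotations };
}

// Replaying the failing run: only 36050 resolves to an issue.
const existing = new Set([36050]);
const plan = planAssignments(
  [36048, 36049, 36050],
  [36050, 36051, 36052],
  (n) => existing.has(n)
);
```

Returning the annotations rather than printing them keeps the function easy to unit test; a caller would write each one to stdout.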

WI-2 (Medium) — Validate review_path_unresolved_422 Path-variant fix. Confirm pr_review_buffer.cjs:554 matches both "Line could not be resolved" and "Path could not be resolved"; add a Path-variant unit test mirroring the Line test. Path 422 has not been organically exercised in 4 audits — consider a targeted smoke test. Effort: Small.
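
A predicate covering both variants could look like the sketch below. It assumes the error object carries `status` and `message` fields; `isUnresolvedAnchor422` is a hypothetical name, and the real predicate at pr_review_buffer.cjs:554 may differ.

```javascript
// Hypothetical predicate: true only for a 422 whose message reports an
// unresolvable Line or Path anchor.
function isUnresolvedAnchor422(err) {
  if (!err || err.status !== 422) return false;
  return /\b(Line|Path) could not be resolved\b/.test(String(err.message || ""));
}

// Path-variant checks mirroring the Line-variant ones.
const lineHit = isUnresolvedAnchor422({ status: 422, message: "Line could not be resolved" });
const pathHit = isUnresolvedAnchor422({ status: 422, message: "Path could not be resolved" });
const otherValidation = isUnresolvedAnchor422({ status: 422, message: "Validation Failed" });
const wrongStatus = isUnresolvedAnchor422({ status: 404, message: "Path could not be resolved" });
```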

WI-3 (Low) — Align observability actuation counter with bash_safeoutputs CLI-wrapper writes (metrics accuracy).

Historical Context

Date	Runs	Msgs	Failed	Success	Headline
05-26	64	122	0	100%	cancellation counter mislabel (log-only)
05-27	81	~56	3	94.6%	review_path_422 Path regression
05-30	18	11+	0	100%	clean
05-31	41	61	1 run / 2 msg	96.7%	assign_to_agent number-guess (NEW)

Trend: After a clean 05-30, one new per-message cluster appeared; reliability stays high (96.7%). The cross-reference/target-resolution family remains the dominant theme — today's assign_to_agent number-guess is a sibling of the target_star_* clusters. Review path is healthy (Line fallback validated 4×; 3/4 reviewers clean). Issue Monster also used assign_to_agent today with 3/3 success, confirming the handler works when given valid numbers.

References:

§26702419759 — LintMonster (2 hard failures)
§26700870340 — Matt Pocock (Line-variant fallback recovery)
§26700056079 — PR Code Quality Reviewer (clean review)

Generated by 🔒 Safe Output Health Monitor · opus48 4.3M

expires on Jun 1, 2026, 5:58 AM UTC

2026-06-01T06:08:41Z

github-actions[bot]
Bot Jun 1, 2026
Author

This discussion has been marked as outdated by Safe Output Health Monitor.

A newer discussion is available at Discussion #36189.
