Skip to content

[safeoutputs] #35579 description fixes never reached the runtime tool copy — all 4 stripped-field patterns recurred today (8× su [Content truncated due to length] #36000

@github-actions

Description

@github-actions

Summary

Issue #35579 (“[safeoutputs] Improve tool descriptions: submit_pull_request_review, add_comment, create_check_run, create_discussion silently strip...”) was closed as completed on 2026-05-28. The four recommended description fixes were applied to pkg/workflow/js/safe_outputs_tools.json — but not to the runtime copy actions/setup/js/safe_outputs_tools.json, which is the file the workflow Generate Safe Outputs Tools step actually loads at run time. The improved descriptions are therefore not live.

Consequence: in the last 24h, all four #35579 patterns recurred — 21 silent-strip events across 9 runs, including 8 occurrences of the headline submit_pull_request_review + pull_request_number strip across 4 different workflows. One of these strips (add_comment + issue_number in Contribution Check) is also the direct cause of the hard failure tracked in open issue #35984.

This is not a duplicate of #35579 (closed). The description content is already correct in pkg/; the actionable defect is that it never propagated to the runtime bundle, and the sync test that should have caught it only checks tool names, not descriptions.

Evidence: the two copies disagree

Sentinel phrase (from the #35579 fix) pkg/workflow/js/safe_outputs_tools.json actions/setup/js/safe_outputs_tools.json (runtime)
do NOT pass pull_request_number (submit_pull_request_review) ✅ present ❌ absent
NOT issue_number (add_comment.item_number) ✅ present ❌ absent
NOT accepted as a parameter (create_check_run.name) ✅ present ❌ absent
not categoryId or category_id (create_discussion.category) ✅ present ❌ absent

The live submit_pull_request_review description served to the agent in run §26694574313 (tools/list response) is byte-for-byte the old text in actions/setup/js/safe_outputs_tools.json — it begins “Submit a pull request review with a status decision. REQUIRED: every call...” with no pull_request_number warning.

Root cause: the sync test has a blind spot

pkg/workflow/safe_outputs_tools_test.go:387 (TestSafeOutputsToolsJSONInSync) is meant to keep the two copies aligned, but it compares only tool names and order:

assert.Equal(t, compilerNames, runtimeNames,
    "pkg/workflow/js/safe_outputs_tools.json and actions/setup/js/safe_outputs_tools.json must define the same tools in the same order; update both files together")

Because #35579 changed only description text (no tools added/removed/reordered), the test stayed green, the PR merged, and the issue was closed as completed — while the runtime copy silently kept the old descriptions.

Recurrence evidence (last 24h)

21 Stripped unknown keys events across 9 runs. All four #35579 patterns are present:

Full strip tally by tool × field × workflow
Tool Stripped field Count Workflows (runs) #35579?
submit_pull_request_review pull_request_number 8 Test Quality Sentinel (§26694574313, §26694675959, §26689955757); PR Code Quality Reviewer (§26694574317, §26689955726); Matt Pocock Skills Reviewer (§26694675926, §26689955714); Smoke Claude (§26690105542) ✅ yes
create_check_run name 3 Matt Pocock (§26694675926, §26689955714); PR Code Quality Reviewer (§26694675930) ✅ yes
add_comment issue_number 1 Contribution Check (§26689948645) — also caused #35984 ✅ yes
create_pull_request_review_comment item_number 5 Matt Pocock (§26694574307) new variant
submit_pull_request_review item_number 1 Matt Pocock (§26694574307) new variant
create_pull_request_review_comment pull_number 1 PR Code Quality Reviewer (§26694675930) new alias
submit_pull_request_review pull_number 1 PR Code Quality Reviewer (§26694675930) new alias
create_issue close_older_issues 1 Contribution Check (§26689948645) minor (config-only key)

Server-log proof, e.g. run 26694574313 line 88:

[safeoutputs] Stripped unknown keys for strict schema tool 'submit_pull_request_review': ["pull_request_number"]

The #35984 connection (add_comment + issue_number)

In Contribution Check run §26689948645, the agent emitted add_comment with issue_number (not the schema field item_number). The MCP server's strict schema silently stripped issue_number, leaving the add_comment item with no resolvable target on a schedule trigger → the Process Safe Outputs job hard-failed (open issue #35984). The #35579 add_comment description fix (“this field is named item_number, NOT issue_number”) — already in pkg/ but not deployed — would have steered the agent to the correct field and prevented that failure.

Recommended remediation

  1. (High) Propagate the descriptions to the runtime copy. Update actions/setup/js/safe_outputs_tools.json so its description / field-description text matches pkg/workflow/js/safe_outputs_tools.json for submit_pull_request_review, add_comment (item_number), create_check_run, and create_discussion (category). Better: make one file the single source and generate the other during build so they cannot drift.
  2. (High) Close the test blind spot. Strengthen TestSafeOutputsToolsJSONInSync to compare full tool definitions — descriptions and inputSchema (or a hash of each tool object) — not just names+order. This is the guardrail that should have failed when [safeoutputs] Improve tool descriptions: submit_pull_request_review, add_comment, create_check_run, create_discussion silently a [Content truncated due to length] #35579 edited only one copy.
  3. (Medium) Document the alias variants surfaced today. Agents also tried pull_number (an alias for pull_request_number) and item_number on the review-comment / review tools. Consider accepting the common aliases on these PR-targeting tools (as update_pull_request already does with pr/pr_number), or explicitly listing the rejected spellings in each description.

Verification plan

  • After (1)+(2): a unit test diffing the two JSON files by full content passes only when descriptions match; make build && make recompile && make test green.
  • Re-run a PR-reviewer workflow and confirm the tools/list submit_pull_request_review description now contains “do NOT pass pull_request_number”.
  • Monitor mcp-logs/safeoutputs/server.log for 3–7 days; Stripped unknown keys lines for these tools should fall toward zero.

Implementation checklist

  • Sync 4 descriptions into actions/setup/js/safe_outputs_tools.json (or generate it from pkg/ during build)
  • Strengthen TestSafeOutputsToolsJSONInSync to compare descriptions + inputSchema, not just names
  • (Optional) Accept/aliases or explicit-reject docs for pull_number / item_number on review tools
  • make build && make recompile && make test
  • Monitor server logs via daily-safe-output-optimizer for 7 days

References

Generated by ⚡ Daily Safe Output Tool Optimizer · opus48 4.5M ·

  • expires on Jun 1, 2026, 9:27 PM UTC

Metadata

Metadata

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions