Analyze: update_pull_request with empty args silently drops safe outputs #33134

Merged
pelikhan merged 11 commits into main from copilot/update-pull-request-execution
May 18, 2026

Contributor

Copilot AI commented May 18, 2026

When the agent calls update_pull_request({}) (no body, title, or update_branch), the safe outputs MCP server returns {"result":"success"}, but the entry is rejected downstream by collect_ndjson_output.cjs via requiresOneOf:title,body,update_branch — causing output_types to be empty, which cascades to both detection and safe_outputs jobs being skipped. No actual PR update ever executes.

Root cause

  • update_pull_request has no dedicated handler in attachHandlers — falls through to defaultHandler, which writes {type: "update_pull_request"} (empty) to safeoutputs.jsonl and returns success
  • requiresOneOf validation only runs in collect_ndjson_output.cjs post-hoc; agent receives no error signal and cannot retry
  • The tool's JSON Schema has no required fields, giving the LLM no upfront signal that at least one of title/body/update_branch is needed
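
The failure mode described above can be reproduced in a few lines. This is a minimal sketch, not the repository's code: `makeDefaultHandler`, `validateRequiresOneOf`, and the in-memory `log` array are hypothetical stand-ins for defaultHandler, the requiresOneOf check in collect_ndjson_output.cjs, and safeoutputs.jsonl.

```javascript
// Hypothetical stand-in for safeoutputs.jsonl: one JSON string per entry.
const log = [];

// Sketch of a catch-all handler: it records whatever it is given and reports success.
const makeDefaultHandler = type => args => {
  log.push(JSON.stringify({ type, ...args }));
  return { result: "success" };
};

// Sketch of a post-hoc requiresOneOf check that runs only after the agent has finished.
// A field counts when it is present and not false.
const validateRequiresOneOf = (entry, fields) =>
  fields.some(f => entry[f] !== undefined && entry[f] !== false);

// The agent calls the tool with no arguments.
const reply = makeDefaultHandler("update_pull_request")({});

// Later, the collector re-reads the log and drops entries that fail validation.
const accepted = log
  .map(line => JSON.parse(line))
  .filter(entry => validateRequiresOneOf(entry, ["title", "body", "update_branch"]));

console.log(reply.result);    // what the agent was told
console.log(accepted.length); // how many entries survive validation
```

The agent sees a success reply while the accepted list stays empty, which is why it has no reason to retry.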

What this analysis identifies (no code changes in this session)

  • Server-side gap: safe outputs HTTP server should return an MCP error (not success) when update_pull_request receives no meaningful fields
  • Schema gap: MCP tool schema should reflect the requiresOneOf constraint so models see it before calling
  • Validation asymmetry: server-side and collect_ndjson_output.cjs enforcement are out of sync for this tool type

pr-sous-chef: refreshed branch via https://github.com/github/gh-aw/actions/runs/26056882797

Generated by 👨‍🍳 PR Sous Chef ·

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan
Collaborator

@copilot Do the work suggested in the description

Copilot AI and others added 2 commits May 18, 2026 19:34
…nd update schema description

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…mment

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 18, 2026

@copilot Do the work suggested in the description

Done. Three gaps addressed across two commits:

  1. Server-side validation (safe_outputs_handlers.cjs): Added updatePullRequestHandler that throws a -32602 MCP error immediately when none of title, body, or update_branch are provided — so the agent gets an actionable error and can retry instead of receiving a silent success.

  2. Handler wiring (safe_outputs_tools_loader.cjs): Registered update_pull_request: handlers.updatePullRequestHandler in attachHandlers so the dedicated handler is used rather than falling through to defaultHandler.

  3. Schema gap (safe_outputs_tools.json, both copies): Updated the tool description with IMPORTANT: At least one of 'title', 'body', or 'update_branch' must be provided; calling this tool with no fields will return an error. so LLMs see the constraint before calling.

The update_branch: false exclusion is intentional and consistent with the downstream requiresOneOf validator in safe_output_type_validator.cjs (which also excludes field === false from the count), as false carries no update intent.
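
A minimal sketch of the first iteration of this handler, assembled from the diff hunks quoted in this thread. The `ERR_VALIDATION` value and the `defaultHandler` body are assumptions for illustration; the real ones live in safe_outputs_handlers.cjs. The checks were later tightened to `typeof` tests in commit 8c4c500.

```javascript
// Assumed value; the real ERR_VALIDATION constant is defined in the repository.
const ERR_VALIDATION = "ERR_VALIDATION";

// Hypothetical stand-in for the existing catch-all handler.
const defaultHandler = type => args => ({ result: "success", entry: { type, ...args } });

const updatePullRequestHandler = args => {
  const safeArgs = args ?? {};
  const hasTitle = safeArgs.title !== undefined;
  const hasBody = safeArgs.body !== undefined;
  // update_branch: false carries no update intent, so it does not count.
  const hasUpdateBranch = safeArgs.update_branch !== undefined && safeArgs.update_branch !== false;

  if (!hasTitle && !hasBody && !hasUpdateBranch) {
    // -32602 is the JSON-RPC "Invalid params" error code.
    throw {
      code: -32602,
      message: `${ERR_VALIDATION}: update_pull_request requires at least one of: 'title', 'body', 'update_branch' fields`,
    };
  }

  return defaultHandler("update_pull_request")(safeArgs);
};

// Usage: an empty call now fails fast instead of returning success.
try {
  updatePullRequestHandler({});
} catch (err) {
  console.log(err.code, err.message);
}
console.log(updatePullRequestHandler({ title: "Fix typo" }).result);
```

Throwing at call time gives the agent an error it can act on within the same session, rather than a rejection it never sees.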

Copilot AI requested a review from pelikhan May 18, 2026 19:39
@pelikhan pelikhan marked this pull request as ready for review May 18, 2026 19:49
Copilot AI review requested due to automatic review settings May 18, 2026 19:49
@github-actions github-actions Bot mentioned this pull request May 18, 2026
Contributor

@github-actions github-actions Bot left a comment

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 6M

if (!hasTitle && !hasBody && !hasUpdateBranch) {
throw {
code: -32602,
message: `${ERR_VALIDATION}: update_pull_request requires at least one of: 'title', 'body', 'update_branch' fields`,
Contributor

[/diagnose] The update_branch: false exclusion rule is correctly implemented here and mirrors safe_output_type_validator.cjs, but the shared semantic is captured only in a comment with no shared constant or utility. If the downstream validator's logic ever changes (e.g., treating 0 or null the same as false), this handler will silently diverge.

Consider extracting a tiny shared predicate, or at minimum adding a cross-reference comment pointing to the exact line in safe_output_type_validator.cjs where the parallel logic lives, so future maintainers update both places together.

};
}

return defaultHandler("update_pull_request")(safeArgs);
Contributor

[/diagnose] The draft field is present in the inputSchema and can legitimately represent update intent (converting a PR from draft → ready or vice-versa). A call like update_pull_request({ draft: true }) will pass this handler's validation (none of title/body/update_branch are set), get written to NDJSON, but may then fail downstream if requiresOneOf is the only gating mechanism.

Is draft intentionally excluded from the "at least one of" constraint? If it's a valid standalone update, the description should say so; if not, the handler should gate on it too.


it("error message should mention all required fields", () => {
try {
handlers.updatePullRequestHandler({});
Contributor

[/tdd] This test uses try/catch + expect.fail instead of Jest's toThrow matcher, which is the pattern used everywhere else in this file (e.g., line 1764). Consider:

it('error message should mention all required fields', () => {
  expect(() => handlers.updatePullRequestHandler({})).toThrow(
    expect.objectContaining({
      message: expect.stringMatching(/title.*body.*update_branch/s),
    })
  );
});

Consistent style makes the test suite easier to scan and keeps coverage reporters accurate.

Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a safe-outputs validation gap where update_pull_request({}) previously returned success while silently producing an invalid/empty NDJSON entry downstream, causing safe output types to be dropped and jobs to be skipped.

Changes:

  • Add a dedicated update_pull_request MCP handler that rejects invocations with no meaningful update fields, returning an MCP invalid-params error immediately.
  • Register the new handler in the tool loader and add unit tests covering the new validation behavior.
  • Update tool descriptions to warn that at least one of title/body/update_branch is required, and tweak a Datadog MCP header env var placeholder in a locked workflow.
Show a summary per file

| File | Description |
| --- | --- |
| pkg/workflow/js/safe_outputs_tools.json | Updates update_pull_request tool description to document the at-least-one-field requirement. |
| actions/setup/js/safe_outputs_tools.json | Same description update for the setup action’s tool manifest. |
| actions/setup/js/safe_outputs_tools_loader.cjs | Registers a dedicated handler for update_pull_request instead of falling through to defaultHandler. |
| actions/setup/js/safe_outputs_handlers.cjs | Implements updatePullRequestHandler to enforce requires-one-of validation at invocation time. |
| actions/setup/js/safe_outputs_handlers.test.cjs | Adds test coverage for updatePullRequestHandler error/success paths. |
| .github/workflows/smoke-otel-backends.lock.yml | Changes Datadog MCP header placeholder from ${DD_APPLICATION_KEY} to ${DD_APP_KEY}. |

Copilot's findings

  • Files reviewed: 6/6 changed files
  • Comments generated: 4

Comment on lines 801 to 805
  "name": "update_pull_request",
- "description": "Update an existing GitHub pull request's title or body. Supports replacing, appending to, or prepending content to the body. Title is always replaced. Only the fields you specify will be updated; other fields remain unchanged.",
+ "description": "Update an existing GitHub pull request's title or body. Supports replacing, appending to, or prepending content to the body. Title is always replaced. Only the fields you specify will be updated; other fields remain unchanged. IMPORTANT: At least one of 'title', 'body', or 'update_branch' must be provided; calling this tool with no fields will return an error.",
  "inputSchema": {
  "type": "object",
  "properties": {
Comment on lines 945 to 949
  "name": "update_pull_request",
- "description": "Update an existing GitHub pull request's title or body. Supports replacing, appending to, or prepending content to the body. Title is always replaced. Only the fields you specify will be updated; other fields remain unchanged.",
+ "description": "Update an existing GitHub pull request's title or body. Supports replacing, appending to, or prepending content to the body. Title is always replaced. Only the fields you specify will be updated; other fields remain unchanged. IMPORTANT: At least one of 'title', 'body', or 'update_branch' must be provided; calling this tool with no fields will return an error.",
  "inputSchema": {
  "type": "object",
  "properties": {
Comment on lines +1372 to +1377
const hasTitle = safeArgs.title !== undefined;
const hasBody = safeArgs.body !== undefined;
// update_branch: false is treated as not provided because it carries no update intent
// (it's the default behaviour). This mirrors the downstream requiresOneOf validator in
// safe_output_type_validator.cjs which also excludes field === false from the count.
const hasUpdateBranch = safeArgs.update_branch !== undefined && safeArgs.update_branch !== false;

  "headers": {
  "DD_API_KEY": "\${DD_API_KEY}",
- "DD_APPLICATION_KEY": "\${DD_APPLICATION_KEY}",
+ "DD_APPLICATION_KEY": "\${DD_APP_KEY}",
@github-actions
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 70/100

⚠️ Acceptable — with suggestions

| Metric | Value |
| --- | --- |
| New/modified tests analyzed | 8 |
| ✅ Design tests (behavioral contracts) | 8 (100%) |
| ⚠️ Implementation tests (low value) | 0 (0%) |
| Tests with error/edge cases | 4 (50%) |
| Duplicate test clusters | 1 |
| Test inflation detected | Yes (2.75:1 ratio) |
| 🚨 Coding-guideline violations | None |

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected. Score is 70/100.

🧪 Test quality analysis by Test Quality Sentinel · ● 17.5M ·

Contributor

@github-actions github-actions Bot left a comment

✅ Test Quality Sentinel: 70/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). No coding-guideline violations detected. Minor suggestions: consolidate 4 near-identical happy-path tests into a table-driven it.each pattern to reduce duplication and the 2.75:1 test inflation ratio.

@github-actions
Contributor

@copilot refresh the branch and rerun checks.

Generated by 👨‍🍳 PR Sous Chef ·

@pelikhan
Collaborator

@copilot review all comments

Copilot AI and others added 2 commits May 18, 2026 20:23
…s per review feedback

- Use typeof/=== true checks in handler (rejects null title/body, null/false update_branch)
- Add anyOf JSON Schema constraint to both safe_outputs_tools.json copies so {} is rejected at schema level
- Add tests for null title, null body, null update_branch inputs

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
…moke-otel-backends.lock.yml

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 18, 2026

@copilot review all comments

All four review comments addressed:

  1. Null handling in handler (8c4c500): Changed the !== undefined checks to typeof title === 'string', typeof body === 'string', and update_branch === true, so null and non-string values are now rejected.

  2. anyOf JSON Schema constraint (8c4c500): Added anyOf: [required:title, required:body, required:update_branch+const:true] to both safe_outputs_tools.json copies so clients/LLMs see the constraint before calling.

  3. Tests for null inputs (8c4c500): Added test cases for {title: null}, {body: null}, {update_branch: null} — all confirm MCP -32602 is thrown.

  4. DD_APPLICATION_KEY header consistency (latest commit): Fixed mcp-inspector.lock.yml line 999 to use ${DD_APPLICATION_KEY} — now consistent with smoke-otel-backends.lock.yml.
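
The anyOf constraint from item 2 can be written out as a schema fragment. This is a sketch under assumptions: the property types are guessed, and `branchMatches` / `satisfiesAnyOf` are hypothetical toy helpers that understand only `required` and `const`. They are not a JSON Schema validator and are not part of the repository.

```javascript
// Assumed shape of the relevant part of the update_pull_request inputSchema.
const inputSchemaFragment = {
  type: "object",
  properties: {
    title: { type: "string" },
    body: { type: "string" },
    update_branch: { type: "boolean" },
  },
  anyOf: [
    { required: ["title"] },
    { required: ["body"] },
    // update_branch only counts when it is exactly true.
    { required: ["update_branch"], properties: { update_branch: { const: true } } },
  ],
};

// Toy evaluation of one anyOf branch: every required key must be present,
// and every key with a `const` rule must equal that constant when present.
const branchMatches = (branch, value) =>
  (branch.required ?? []).every(key => Object.hasOwn(value, key)) &&
  Object.entries(branch.properties ?? {}).every(
    ([key, rule]) => !Object.hasOwn(value, key) || !("const" in rule) || value[key] === rule.const
  );

// anyOf passes when at least one branch matches.
const satisfiesAnyOf = value => inputSchemaFragment.anyOf.some(branch => branchMatches(branch, value));

for (const sample of [{}, { title: "New title" }, { update_branch: false }, { update_branch: true }]) {
  console.log(JSON.stringify(sample), satisfiesAnyOf(sample));
}
```

Because not every client enforces anyOf, the handler-side check remains the authoritative gate; the schema only gives callers an earlier signal.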

@pelikhan
Collaborator

@copilot ensure claude/codex support anyOf. Otherwise encode it in the tool description.

… for OpenAI/Codex which ignores anyOf

OpenAI function calling does not enforce anyOf in JSON Schema. The anyOf
remains for clients that do (Claude and others), while the description
now explicitly states the REQUIRED constraint and that update_branch:false
does not count, giving all LLMs an upfront signal regardless of anyOf support.

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 18, 2026

@copilot ensure claude/codex support anyOf. Otherwise encode it in the tool description.

OpenAI/Codex does not enforce anyOf in function/tool schemas (it's silently ignored per their docs). Claude supports anyOf, but only partially. The anyOf is kept for clients that do enforce it, and the constraint is now encoded redundantly in the description for those that don't:

REQUIRED: You must provide at least one of: 'title' (a non-empty string), 'body' (a string), or 'update_branch' set to true (not false). Omitting all three, or passing only 'update_branch: false', will return a -32602 error.

The update_branch property description also now explicitly states that false is treated as omitted, closing the subtle ambiguity.

* Validates that at least one of 'title', 'body', or 'update_branch' is provided,
* matching the server-side requiresOneOf:title,body,update_branch validation.
*/
const updatePullRequestHandler = args => {
Collaborator

@copilot move inline helper to top level function and add tests

Contributor Author

Done in the latest commit. Extracted the validation logic into a top-level hasUpdatePullRequestFields(args) function (placed between buildIntentErrorResponse and createHandlers), exported it from module.exports, and added 12 direct unit tests for it in a new top-level describe("hasUpdatePullRequestFields", ...) block — covering null, undefined, {}, false/null field values, and all valid field combinations.
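
A sketch of what such an extracted predicate might look like, using the typeof / === true checks described earlier in this thread. The function body and the case table are assumptions for illustration; the merged implementation and its 12 tests may differ.

```javascript
// Returns true when the arguments carry at least one meaningful update field.
// Assumed semantics: title and body must be strings; update_branch must be exactly true.
const hasUpdatePullRequestFields = args => {
  if (args === null || typeof args !== "object") return false;
  return typeof args.title === "string" || typeof args.body === "string" || args.update_branch === true;
};

// Table-driven cases: [input, expected result].
const cases = [
  [undefined, false],
  [null, false],
  [{}, false],
  [{ title: null }, false],
  [{ body: null }, false],
  [{ update_branch: null }, false],
  [{ update_branch: false }, false],
  [{ title: "New title" }, true],
  [{ body: "New body" }, true],
  [{ update_branch: true }, true],
  [{ title: "New title", update_branch: false }, true],
];

for (const [input, expected] of cases) {
  const actual = hasUpdatePullRequestFields(input);
  console.log(JSON.stringify(input), actual === expected ? "ok" : "MISMATCH");
}
```

Keeping the predicate at the top level lets the handler and its tests share one definition, and a case table like this keeps the edge cases visible in one place.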

…th tests

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan May 18, 2026 21:14
@pelikhan pelikhan merged commit 596bc69 into main May 18, 2026
29 of 32 checks passed
@pelikhan pelikhan deleted the copilot/update-pull-request-execution branch May 18, 2026 21:21
3 participants