Prevent safe-output PR probing from creating stray downstream pull requests #33044

Merged
pelikhan merged 14 commits into main from copilot/agent-step-probes-runtime
May 18, 2026

Conversation

Contributor

Copilot AI commented May 18, 2026

The create_pull_request safe-output path was easy for agents to treat as a runtime probe, which could burn budget on retries and accidentally publish placeholder PRs to downstream repositories. This change makes exploratory PR emissions fail fast and clarifies that safeoutputs are one-shot declarations of real side effects, not a sandbox.

  • Guard obvious probe payloads in create_pull_request

    • Rejects placeholder PR intents before they are recorded when the payload clearly looks exploratory, e.g. trivial title/body values such as test, probe, temp, wip, or the observed *-test-from-main-* branch pattern.
    • Returns an actionable error directing the agent to use noop or report_incomplete instead of retrying variants.
  • Tighten safe-output tool guidance

    • Updates the static create_pull_request tool description to explicitly say the call is a real write-intent, not something to test with placeholder content or auth experiments.
    • Extends the runtime-enriched description in the tool loader with the same warning, including the cross-repo target-repo case.
  • Steer agent behavior away from CLI probing

    • Updates the shared safe-output prompt to state that safe-output calls are write-once declarations for real downstream effects.
    • Updates the PR-specific safe-output prompt to say:
      • call create_pull_request exactly once when the final payload is ready
      • stop immediately after that call
      • do not fall back to manual git push / remote rewrites / auth probing
    • Removes CLI examples that encourage discovering safeoutputs through safeoutputs --help or per-tool --help.
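
The runtime enrichment of the tool description mentioned above might look roughly like this. It is a minimal sketch under stated assumptions: enrichToolDescription, the targetRepo config key, and the warning wording are hypothetical and are not taken from safe_outputs_tools_loader.cjs.

```javascript
// Hypothetical sketch: function name, config key, and warning text are assumptions.
const WRITE_INTENT_TOOLS = new Set(["create_pull_request"]);

const WRITE_ONCE_WARNING =
  "This call declares a real write intent. Do not use it to test placeholder content or to probe authentication.";

// Returns a copy of the tool definition with the warning appended to its description.
// Tools that are not write-intent tools are returned unchanged.
function enrichToolDescription(tool, config = {}) {
  if (!WRITE_INTENT_TOOLS.has(tool.name)) return tool;

  const parts = [tool.description, WRITE_ONCE_WARNING];
  if (config.targetRepo) {
    // Cross-repo case: spell out where the side effect will land.
    parts.push(`The pull request will be opened in ${config.targetRepo}, not in the current repository.`);
  }
  return { ...tool, description: parts.filter(Boolean).join(" ") };
}
```

Appending to a copy rather than mutating the loaded definition keeps the static description in the JSON file as the single source for the base text.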

Example of the new failure mode for an exploratory PR attempt:

{
  "branch": "docs/pr-17198-test-from-main-1853f10f924372d4",
  "title": "test",
  "body": "test"
}

This now returns an error instead of recording a create_pull_request intent:

{
  "result": "error",
  "error": "Refusing to record an exploratory pull request. create_pull_request is for a real intended PR only ... use noop or report_incomplete instead."
}
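
The guard behind this rejection could be sketched as follows. This is an illustration, not the PR's code: the function names, the rule that title and body must both be trivial, the "success" result value, and the error wording are assumptions. Only the trivial values, the branch pattern, and the result/error shape come from the description above.

```javascript
// Hypothetical sketch: names and exact matching rules are assumptions.
const TRIVIAL_VALUES = new Set(["test", "probe", "temp", "wip"]);
const PROBE_BRANCH_PATTERN = /-test-from-main-/i;

// True when the value is a string that, once trimmed and lowercased,
// is one of the known placeholder words.
function isTrivialValue(value) {
  return typeof value === "string" && TRIVIAL_VALUES.has(value.trim().toLowerCase());
}

// True when the payload clearly looks exploratory: either the branch matches
// the observed probe pattern, or both title and body are placeholder words.
function looksLikeProbePullRequest(payload) {
  const { title, body, branch } = payload ?? {};
  if (typeof branch === "string" && PROBE_BRANCH_PATTERN.test(branch)) return true;
  return isTrivialValue(title) && isTrivialValue(body);
}

// Rejects before anything is recorded; a real handler would record the intent
// where this sketch returns a success marker.
function guardCreatePullRequest(payload) {
  if (looksLikeProbePullRequest(payload)) {
    return {
      result: "error",
      error: "Refusing to record an exploratory pull request. Use noop or report_incomplete instead.",
    };
  }
  return { result: "success" };
}
```

Checking before the intent is recorded matters here: once a safe output is written, the downstream job treats it as a real request.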

Changeset

  • Type: patch
  • Description: Prevent exploratory create_pull_request safe-output calls from recording stray downstream pull requests and clarify that safeoutputs are one-shot declarations of real side effects.

Generated by 📋 Changeset Generator for issue #33044 ·



✨ PR Review Safe Output Test - Run 26047160386

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

  • accounts.google.com
  • android.clients.google.com
  • clients2.google.com
  • contentautofill.googleapis.com
  • safebrowsingohttpgateway.googleapis.com
  • www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

💥 [THE END] — Illustrated by Smoke Claude · ● 5M ·

Copilot AI and others added 2 commits May 18, 2026 14:02
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix agent job to rely on safe-outputs without probing runtime" to "Prevent safe-output PR probing from creating stray downstream pull requests" May 18, 2026
Copilot AI requested a review from pelikhan May 18, 2026 14:05
@pelikhan
Collaborator

@copilot also fix/harden add_comment, create_issue and push_to_pull_request_branch

Copilot AI and others added 3 commits May 18, 2026 14:21
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@pelikhan
Collaborator

@copilot refactor probe helper to top level function, add tests

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 18, 2026

@copilot also fix/harden add_comment, create_issue and push_to_pull_request_branch

Addressed in 9176c73. Added the same exploratory-payload hardening for add_comment, create_issue, and push_to_pull_request_branch, plus matching tool guidance/tests.
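
One way to share the same hardening across these tools is a per-tool field table. The sketch below is an assumption about shape, not the code in 9176c73: the field lists (in particular message for push_to_pull_request_branch), the rule that every listed field must be trivial, and the function names are hypothetical.

```javascript
// Hypothetical sketch: field lists and rules are assumptions.
const TRIVIAL = new Set(["test", "probe", "temp", "wip"]);
const isTrivial = value => typeof value === "string" && TRIVIAL.has(value.trim().toLowerCase());

// Text fields inspected for each write-intent tool.
const PROBE_FIELDS = {
  create_pull_request: ["title", "body"],
  create_issue: ["title", "body"],
  add_comment: ["body"],
  push_to_pull_request_branch: ["message"],
};

// Returns an error result when every inspected field is a placeholder word,
// or null when the payload should be processed normally.
function rejectIfProbe(toolName, payload) {
  const fields = PROBE_FIELDS[toolName];
  if (!fields) return null;
  if (!fields.every(field => isTrivial(payload?.[field]))) return null;
  return {
    result: "error",
    error: `Refusing to record an exploratory ${toolName} call. Use noop or report_incomplete instead.`,
  };
}
```

Each handler can then call rejectIfProbe first and return its result when it is not null.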

@pelikhan pelikhan marked this pull request as ready for review May 18, 2026 14:31
Copilot AI review requested due to automatic review settings May 18, 2026 14:31
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented May 18, 2026

@copilot refactor probe helper to top level function, add tests

Addressed in 02a0aca. The probe validation helpers are now top-level functions and I added focused helper tests alongside the existing handler coverage.

Contributor

Copilot AI left a comment

Pull request overview

This PR tightens safe-output behavior and guidance to prevent agents from using real write-intent tools as probes, especially for exploratory pull request creation.

Changes:

  • Adds probe/placeholder validation for PRs, PR branch pushes, issues, and comments.
  • Updates safe-output tool descriptions and prompts to emphasize one-shot real side effects.
  • Adds tests for the new validation and runtime description enrichment.
Show a summary per file

File | Description
pkg/workflow/js/safe_outputs_tools.json | Updates generated/static safe-output tool descriptions with anti-probing guidance.
actions/setup/md/safe_outputs_prompt.md | Adds general write-once safe-output guidance.
actions/setup/md/safe_outputs_create_pull_request.md | Clarifies PR creation should happen once with final payload and no manual probing.
actions/setup/md/mcp_cli_tools_prompt.md | Removes help/probing-oriented examples and adds safeoutputs CLI guidance.
actions/setup/js/safe_outputs_tools.json | Updates action-side tool descriptions with anti-probing guidance.
actions/setup/js/safe_outputs_tools_loader.test.cjs | Adds/updates tests for enriched safety guidance in registered tools.
actions/setup/js/safe_outputs_tools_loader.cjs | Enriches registered write-intent tool descriptions with runtime safety warnings.
actions/setup/js/safe_outputs_handlers.test.cjs | Adds tests for rejecting obvious exploratory payloads.
actions/setup/js/safe_outputs_handlers.cjs | Implements placeholder/probe detection and error responses before recording safe outputs.

Copilot's findings

  • Files reviewed: 9/9 changed files
  • Comments generated: 1

Comment on lines 388 to 390
* Generates git patch for the changes (unless allow-empty is true)
* Supports multi-repo scenarios via the optional 'repo' parameter
*/
@github-actions
Contributor

@copilot please fix the failing lint jobs, then rerun CI.

Generated by 👨‍🍳 PR Sous Chef ·

@pelikhan pelikhan added the smoke label May 18, 2026
@github-actions
Contributor

github-actions Bot commented May 18, 2026

Smoke OTEL completed successfully!

@github-actions
Contributor

📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing...

@github-actions github-actions Bot removed the smoke label May 18, 2026
@github-actions
Contributor

github-actions Bot commented May 18, 2026

💫 TO BE CONTINUED... Smoke Claude failed to deliver outputs! Our hero faces unexpected challenges...

@github-actions
Contributor

github-actions Bot commented May 18, 2026

🚀 Smoke Gemini MISSION COMPLETE! Gemini has spoken. ✨

Caution

agentic threat detected
Threat detection flagged this output in warn mode. Manual review is REQUIRED before any follow-up automation.

Details

The threat detection results could not be parsed.

Review the workflow run logs for details.

@github-actions
Contributor

github-actions Bot commented May 18, 2026

🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧

@github-actions
Contributor

github-actions Bot commented May 18, 2026

✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟

@github-actions
Contributor

Agent Container Tool Check

Tool Status Version
bash 5.2.21
sh available
git 2.54.0
jq 1.7
yq v4.53.2
curl 8.5.0
gh 2.92.0
node v22.22.2
python3 3.14.5
go 1.24.13
java 10.0.300
dotnet (not found)

Result: 11/12 tools available ⚠️ dotnet is not installed.

Overall Status: FAIL — dotnet is missing from the agent container.

🔧 Tool validation by Agent Container Smoke Test · ● 1.2M ·

@github-actions
Contributor

Commit pushed: bed1662

Generated by Changeset Generator

@github-actions
Contributor

Caution

agentic threat detected
Threat detection flagged this output in warn mode. Manual review is REQUIRED before any follow-up automation.

Details

The threat detection results could not be parsed.

Review the workflow run logs for details.

Smoke Test Results

  • GitHub MCP Testing: ✅
  • Web Fetch Testing: ✅
  • File Writing Testing: ✅
  • Bash Tool Testing: ✅
  • Build gh-aw: ❌

Overall status: FAIL

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • localhost

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "localhost"

See Network Configuration for more information.

Smoke Gemini — Powered by Gemini ·

@github-actions
Contributor

Codex smoke: FAIL
PRs: #33093 Stop futile auth retries in Copilot/Claude/Codex harnesses after first-attempt authentication failure; #33087 Render report_incomplete reasons in agent failure comment and issue templates
✅ GitHub, Serena, file, bash, build, cache
❌ Playwright, web-fetch, comment-memory, set-issue-field
Run: https://github.com/github/gh-aw/actions/runs/26047161375

🔮 The oracle has spoken through Smoke Codex ·

@github-actions
Contributor

💥 Smoke Test: Claude — Run 26047160386

Core (1–12): ✅✅⚠️✅✅✅✅✅✅✅✅✅ (Serena returned 2/3 symbols then MCP dropped)
PR Review (13–19): ✅✅✅✅✅✅⚠️ (#19 skipped — no safe test PR)

Overall: ⚠️ PARTIAL · Run §26047160386

💥 [THE END] — Illustrated by Smoke Claude · ● 5M ·

Contributor

@github-actions github-actions Bot left a comment

💥 Automated smoke test review - all systems nominal!

💥 [THE END] — Illustrated by Smoke Claude · ● 5M

@github-actions
Contributor

👋 The smoke test agent was here! 🤖 Beep boop — just swinging by to say hi and confirm all systems operational. Your tree map looks magnificent! 🌳✨

📰 BREAKING: Report filed by Smoke Copilot · ● 14.4M ·

@github-actions
Contributor

Smoke Test: Copilot — Run §26047160572

@app/copilot-swe-agent (author) | assignees: @pelikhan, @Copilot

Test Status
GitHub MCP (merged PRs)
MCP Scripts GH CLI
Serena CLI
Playwright
Web Fetch ⚠️
File Write + Bash
Discussion Interaction
Build gh-aw
Upload artifact
Discussion Creation
Workflow Dispatch
PR Review
Comment Memory ⚠️
Sub-Agent

Overall: ✅ PASS

📰 BREAKING: Report filed by Smoke Copilot · ● 14.4M ·

Contributor

@github-actions github-actions Bot left a comment

This PR adds intent probe validation to prevent agents from using safe-output tools for exploratory testing. The approach is solid — the trivial value detection in intent_probe.cjs is a good defense layer. Two inline comments left above.

📰 BREAKING: Report filed by Smoke Copilot · ● 14.4M

@github-actions
Contributor

📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤

@github-actions
Contributor

@copilot review all comments and address the unresolved review feedback on this PR.

Generated by 👨‍🍳 PR Sous Chef ·

@github-actions
Contributor

@copilot please post a short blocker summary and next steps for this PR.

Generated by 👨‍🍳 PR Sous Chef ·

Development

Successfully merging this pull request may close these issues.

Agent step probes the runtime instead of relying solely on safe-outputs (leaves stray test PRs behind)

3 participants