Prevent safe-output PR probing from creating stray downstream pull requests #33044
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
@copilot also fix/harden `add_comment`, `create_issue` and `push_to_pull_request_branch`
@copilot refactor probe helper to a top-level function, add tests
Pull request overview
This PR tightens safe-output behavior and guidance to prevent agents from using real write-intent tools as probes, especially for exploratory pull request creation.
Changes:
- Adds probe/placeholder validation for PRs, PR branch pushes, issues, and comments.
- Updates safe-output tool descriptions and prompts to emphasize one-shot real side effects.
- Adds tests for the new validation and runtime description enrichment.
Show a summary per file
| File | Description |
|---|---|
| pkg/workflow/js/safe_outputs_tools.json | Updates generated/static safe-output tool descriptions with anti-probing guidance. |
| actions/setup/md/safe_outputs_prompt.md | Adds general write-once safe-output guidance. |
| actions/setup/md/safe_outputs_create_pull_request.md | Clarifies PR creation should happen once with final payload and no manual probing. |
| actions/setup/md/mcp_cli_tools_prompt.md | Removes help/probing-oriented examples and adds safeoutputs CLI guidance. |
| actions/setup/js/safe_outputs_tools.json | Updates action-side tool descriptions with anti-probing guidance. |
| actions/setup/js/safe_outputs_tools_loader.test.cjs | Adds/updates tests for enriched safety guidance in registered tools. |
| actions/setup/js/safe_outputs_tools_loader.cjs | Enriches registered write-intent tool descriptions with runtime safety warnings. |
| actions/setup/js/safe_outputs_handlers.test.cjs | Adds tests for rejecting obvious exploratory payloads. |
| actions/setup/js/safe_outputs_handlers.cjs | Implements placeholder/probe detection and error responses before recording safe outputs. |
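
To make the "runtime description enrichment" row concrete, here is a minimal sketch of how a loader could append a safety warning to write-intent tool descriptions. The tool names come from this PR's conversation; the function name `enrichToolDescriptions`, the constant names, and the warning text are assumptions for illustration and are not taken from `safe_outputs_tools_loader.cjs`.

```javascript
// Hypothetical sketch: names and warning text are illustrative, not the PR's code.
const WRITE_INTENT_TOOLS = new Set([
  "create_pull_request",
  "push_to_pull_request_branch",
  "create_issue",
  "add_comment",
]);

const SAFETY_WARNING =
  "WARNING: this call records a real write intent. " +
  "Call it once with the final payload; do not probe with placeholder content.";

// Returns a new array in which write-intent tools carry the safety warning.
function enrichToolDescriptions(tools) {
  return tools.map(tool => {
    if (!WRITE_INTENT_TOOLS.has(tool.name)) return tool;
    const description = tool.description || "";
    // Skip tools that already carry the warning so repeated loads stay idempotent.
    if (description.includes(SAFETY_WARNING)) return tool;
    return { ...tool, description: `${description}\n\n${SAFETY_WARNING}`.trim() };
  });
}
```

Appending at load time, rather than editing each static description, keeps the warning in one place; whether the PR takes exactly this approach is not confirmed by the text above.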
Copilot's findings
- Files reviewed: 9/9 changed files
- Comments generated: 1
✅ Smoke OTEL completed successfully!
📰 BREAKING: Smoke Copilot is now investigating this pull request. Sources say the story is developing...
💫 TO BE CONTINUED... Smoke Claude failed to deliver outputs! Our hero faces unexpected challenges...
🚀 Smoke Gemini MISSION COMPLETE! Gemini has spoken. ✨
Caution: agentic threat detected
🚀 Smoke Pi MISSION COMPLETE! Pi delivered. 🥧
✨ The prophecy is fulfilled... Smoke Codex has completed its mystical journey. The stars align. 🌟
Agent Container Tool Check
Result: 11/12 tools available
Overall Status: FAIL. dotnet is missing from the agent container.
Commit pushed:
Caution: agentic threat detected
Smoke Test Results
Overall status: FAIL
Warning: Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "localhost"
See Network Configuration for more information.
Codex smoke: FAIL
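💥 Smoke Test: Claude - Run 26047160386
Core (1–12): ✅✅
Overall:
Warning: Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution: accounts.google.com, android.clients.google.com, clients2.google.com, contentautofill.googleapis.com, safebrowsingohttpgateway.googleapis.com, www.google.com
To allow these domains, add them to the `network.allowed` list in your workflow frontmatter:
network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"
See Network Configuration for more information.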
💥 Automated smoke test review - all systems nominal!
💥 [THE END] — Illustrated by Smoke Claude · ● 5M
👋 The smoke test agent was here! 🤖 Beep boop, just swinging by to say hi and confirm all systems operational. Your tree map looks magnificent! 🌳✨
Smoke Test: Copilot — Run §26047160572
Overall: ✅ PASS
This PR adds intent probe validation to prevent agents from using safe-output tools for exploratory testing. The approach is solid — the trivial value detection in intent_probe.cjs is a good defense layer. Two inline comments left above.
📰 BREAKING: Report filed by Smoke Copilot · ● 14.4M
📰 VERDICT: Smoke Copilot has concluded. All systems operational. This is a developing story. 🎤
The `create_pull_request` safe-output path was easy for agents to treat as a runtime probe, which could burn budget on retries and accidentally publish placeholder PRs to downstream repositories. This change makes exploratory PR emissions fail fast and clarifies that safeoutputs are one-shot declarations of real side effects, not a sandbox.

- Guard obvious probe payloads in `create_pull_request`
  - Rejects trivial `title`/`body` values such as `test`, `probe`, `temp`, `wip`, or the observed `*-test-from-main-*` branch pattern.
  - Directs the agent to use `noop` or `report_incomplete` instead of retrying variants.
- Tighten safe-output tool guidance
  - Updates the `create_pull_request` tool description to explicitly say the call is a real write-intent, not something to test with placeholder content or auth experiments.
- Steer agent behavior away from CLI probing
  - Call `create_pull_request` exactly once when the final payload is ready.
  - No manual `git push` / remote rewrites / auth probing.
  - No `safeoutputs --help` / per-tool `--help` discovery for safeoutputs.

Example of the new failure mode for an exploratory PR attempt:

`{ "branch": "docs/pr-17198-test-from-main-1853f10f924372d4", "title": "test", "body": "test" }`

This now returns an error instead of recording a `create_pull_request` intent:

`{ "result": "error", "error": "Refusing to record an exploratory pull request. create_pull_request is for a real intended PR only ... use noop or report_incomplete instead." }`

Changeset

- Prevent exploratory `create_pull_request` safe-output calls from recording stray downstream pull requests and clarify that safeoutputs are one-shot declarations of real side effects.

✨ PR Review Safe Output Test - Run 26047160386
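
The probe guard described in this PR can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the placeholder values and the branch pattern come from the description above, while the function names (`isPlaceholder`, `checkPullRequestIntent`), the shortened error text, and the `{ result: "ok" }` success shape are assumptions. The sketch also assumes that a placeholder in either `title` or `body` is enough to reject the call.

```javascript
// Hypothetical sketch of probe detection; not the PR's actual handler code.
const PLACEHOLDER_VALUES = new Set(["test", "probe", "temp", "wip"]);

// Matches branch names such as "docs/pr-17198-test-from-main-1853f10f924372d4".
const PROBE_BRANCH_PATTERN = /-test-from-main-/;

// True when the value is exactly one of the trivial placeholder words.
function isPlaceholder(value) {
  return typeof value === "string" && PLACEHOLDER_VALUES.has(value.trim().toLowerCase());
}

// Returns an error result for obvious probes instead of recording the intent.
function checkPullRequestIntent({ title, body, branch }) {
  const looksLikeProbe =
    isPlaceholder(title) ||
    isPlaceholder(body) ||
    (typeof branch === "string" && PROBE_BRANCH_PATTERN.test(branch));

  if (looksLikeProbe) {
    return {
      result: "error",
      // Illustrative wording; the real message is longer (see the example above).
      error: "Refusing to record an exploratory pull request. Use noop or report_incomplete instead.",
    };
  }
  return { result: "ok" }; // assumed success shape, for illustration only
}
```

Matching whole values rather than substrings means a real title such as "Add probe validation tests" is not rejected; only payloads that consist entirely of a placeholder word, or a branch following the observed pattern, fail fast.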