Reinforce timeout + do-not-poll guidance in terminal tool descriptions by meganrogge · Pull Request #320141 · microsoft/vscode

meganrogge · 2026-06-05T16:58:07Z

Addresses microsoft/vscode-internalbacklog#7870.

The agent has been observed polling get_terminal_output for ~27 steps across three npm commands when a too-short timeout (120s for a 3–5 minute install) caused sync→background promotion. The system prompt preamble already says "do NOT poll", and runInTerminal's modelDescription already recommends generous timeouts — but neither of those signals attaches to the spot where the model is actually making the decision (the timeout parameter description for runInTerminal, and the top-level description for get_terminal_output).
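The failure mode above can be modelled in a few lines. This is a hypothetical sketch, not the actual runInTerminal implementation: `trackCommand` and `TrackingOutcome` are illustrative names, and the rule that an expired timeout promotes the command to the background is taken from this description only.

```typescript
// Hypothetical model of sync -> background promotion; not VS Code's actual code.
type TrackingOutcome = 'completed-in-sync' | 'promoted-to-background';

/**
 * Decide what happens to a sync command given how long it really runs.
 * An omitted timeout or a timeout of 0 both mean "no timeout".
 */
function trackCommand(actualDurationMs: number, timeoutMs?: number): TrackingOutcome {
    if (timeoutMs === undefined || timeoutMs === 0) {
        // No cap: the tool tracks the command until it finishes.
        return 'completed-in-sync';
    }
    // The cap expires before the command finishes: the command keeps running
    // in the background, which is the state that invites polling.
    return actualDurationMs <= timeoutMs ? 'completed-in-sync' : 'promoted-to-background';
}
```

Under this model, a 4-minute install (`trackCommand(240000, 120000)`) is promoted to the background, while the same install with a 10-minute cap (`trackCommand(240000, 600000)`) completes in sync mode.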

This PR reinforces both signals at the point of use:

runInTerminal → timeout parameter description

Before:

Optional hard cap in milliseconds on how long the tool tracks the command before returning. Omit to let the command run to completion (recommended for package installs, builds, and long-running scripts). Use 0 to explicitly indicate no timeout.

After:

Optional hard cap in milliseconds on how long the tool tracks the command before returning. Recommended: 600000 (10 min) for package installs, 900000 (15 min) for large builds. Omit entirely for commands that should run to completion (the safest default for installs and builds). Use 0 to explicitly indicate no timeout. A too-short timeout forces background promotion and triggers polling — prefer generous timeouts.
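To make the recommended values concrete, here is a minimal sketch of how a caller might pick a timeout under this guidance. `CommandKind`, `pickTimeoutMs`, and the `wantsHardCap` flag are hypothetical names introduced for illustration; only the two millisecond values and the "omit to run to completion" rule come from the description above.

```typescript
// Hypothetical helper; the names are illustrative, the values come from the description.
type CommandKind = 'package-install' | 'large-build';

const TEN_MINUTES_MS = 10 * 60 * 1000;     // 600000
const FIFTEEN_MINUTES_MS = 15 * 60 * 1000; // 900000

/**
 * Returns the timeout to pass, or undefined to omit the parameter entirely
 * so that the command runs to completion.
 */
function pickTimeoutMs(kind: CommandKind, wantsHardCap: boolean): number | undefined {
    if (!wantsHardCap) {
        return undefined; // omit the parameter: described as the safest default
    }
    return kind === 'package-install' ? TEN_MINUTES_MS : FIFTEEN_MINUTES_MS;
}
```

The numeric recommendations only apply when a cap is wanted at all; without one, the sketch omits the parameter rather than passing a guess.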

getTerminalOutput → modelDescription

Before:

Get output from an active terminal execution (identified by the id returned from run_in_terminal).

After:

Get output from an active terminal execution (identified by the id returned from run_in_terminal). Only use this tool if you need to inspect output from a terminal that was started in async mode and you have a concrete reason to read it now. If a sync command timed out and moved to the background, you will be automatically notified on your next turn when it completes — do NOT poll with this tool. End your turn and wait for the notification.
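The rule in this description can be read as a small decision procedure. The sketch below follows the wording quoted above (later commits on this PR revised that wording) and is an assumption-laden illustration: `TerminalExecutionState`, `NextAction`, and `decideNextAction` are hypothetical names, not part of the tool's API.

```typescript
// Hypothetical decision sketch based on the quoted description; not VS Code's code.
interface TerminalExecutionState {
    startedInAsyncMode: boolean;
    promotedFromSyncTimeout: boolean;
    hasConcreteReasonToReadNow: boolean;
}

type NextAction = 'read-output' | 'wait-for-notification' | 'do-not-read';

function decideNextAction(state: TerminalExecutionState): NextAction {
    if (state.promotedFromSyncTimeout) {
        // A completion notification arrives on the next turn, so do not poll.
        return 'wait-for-notification';
    }
    if (state.startedInAsyncMode && state.hasConcreteReasonToReadNow) {
        return 'read-output';
    }
    // Neither condition holds: the description gives no reason to call the tool.
    return 'do-not-read';
}
```

The promoted-from-sync check comes first so that a timed-out sync command never falls through to the read path, even if the caller believes it has a reason to look.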

Why not pull these from the system prompt preamble?

The preamble already contains both pieces of guidance, but models attend most strongly to tool parameter descriptions at call time, and the preamble has to compete with many other paragraphs. Restating the rule at the decision point is what gets it followed.

Risks / follow-ups

The two strings are longer now, so each tool definition costs more prompt tokens. Estimated cost: ~80 extra tokens combined per request that includes these tools. That is small compared to the ~27 polling steps, each returning ~1.5 KB of output, that this change is trying to prevent.
Issue #7870 (proposal P5c) also asks us to verify that the notification mechanism is reliable; that is a separate investigation and is not part of this PR.
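For a rough sense of scale, the comparison in the first item can be worked through as arithmetic. The step count, output size, and token estimate are the figures quoted above; the bytes-per-token ratio is an assumption (a common rough heuristic, not a measured value), and 1 KB is taken as 1000 bytes.

```typescript
// Figures quoted in the PR description.
const pollingSteps = 27;
const bytesPerPoll = 1500;        // "~1.5 KB", taking 1 KB = 1000 bytes
const extraTokensPerRequest = 80; // combined cost of the two longer strings

// ASSUMPTION: about 4 bytes of terminal output per token (rough heuristic).
const assumedBytesPerToken = 4;

const polledBytes = pollingSteps * bytesPerPoll;                   // 40500
const polledTokens = polledBytes / assumedBytesPerToken;           // 10125
const requestsToBreakEven = polledTokens / extraTokensPerRequest;  // 126.5625
```

Under these assumptions, one avoided polling episode offsets the added description cost of roughly 126 requests. The extra tokens are paid on every request that includes the tools, while the polling cost is only paid when the failure occurs, so the real trade-off depends on how often that happens.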

Tests

No assertions reference these exact strings (verified via grep for "Optional hard cap in milliseconds" and "Get output from an active terminal"). No test changes required.

vs-code-engineering · 2026-06-05T16:59:52Z

📬 CODENOTIFY

The following users are being notified based on files changed in this PR:

@anthonykim1

Matched files:

src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/getTerminalOutputTool.ts
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/runInTerminalTool.ts

Copilot

Pull request overview

This PR updates the terminal chat agent tool descriptions to more strongly discourage polling for terminal output and to steer agents toward using sufficiently generous timeout values so sync commands don’t get promoted to background unexpectedly.

Changes:

Strengthen run_in_terminal.timeout parameter guidance with concrete “generous timeout” examples and explicit warning about short timeouts causing background promotion/polling behavior.
Expand get_terminal_output’s modelDescription with “do not poll; wait for notification” guidance at the tool decision point.

Show a summary per file

File	Description
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/runInTerminalTool.ts	Updates the `timeout` parameter description to steer agents toward longer timeouts and avoid timeout-driven background promotion/polling.
src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/getTerminalOutputTool.ts	Updates the tool description to discourage polling and instruct waiting for completion notifications.

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 2

- getTerminalOutput: clarify valid use case (async output inspection) before the do-not-poll guidance
- timeout: restore 'Optional' label, add human-readable durations (600000 = 10 min, 900000 = 15 min), and explain background promotion in plain terms the model can reason about

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- getTerminalOutput: cover both async-mode and timed-out-sync use cases instead of restricting to async-only
- timeout: restructure to lead with 'if you set a timeout, be generous' so the numeric recommendations are clearly conditional, avoiding contradiction with the omit-entirely guidance

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

meganrogge · 2026-06-05T17:11:26Z

/requires-eval-assessment terminalbench2 gpt-5.4,claude-opus-4.6,claude-opus-4.7

vs-code-engineering · 2026-06-05T17:12:39Z

⏳ Queued vscode build for d6b25ffd4811be6c53c41c8680ee122b162fe253 (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445312
When this succeeds, the eval-assessment publish build will be queued automatically.

vs-code-engineering · 2026-06-05T18:14:32Z

🔄 First vscode build failed; retried failed stages: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445312

vs-code-engineering · 2026-06-05T20:04:43Z

⏳ Queued vscode build for f465bdf99855f7830d5cbde95684d5792ae623de (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445344
When this succeeds, the eval-assessment publish build will be queued automatically.

vs-code-engineering · 2026-06-05T21:01:33Z

🚀 Queued eval-assessment publish build for 5b01c33963060dbf8a51a628e8db202435bc7e12 (step 2/2).

Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445372
On success, publishes @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.5b01c33963 on the dev tag.

vs-code-engineering · 2026-06-05T21:15:58Z

✅ Eval-assessment build published.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.5b01c33963 (tag: dev)
Install: npm install @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.5b01c33963
Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445372

vs-code-engineering · 2026-06-05T21:16:13Z

🔬 Queued eval-assessment benchmark for 96e8d26961.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.5b01c33963 (dev tag)
Agent: vscode
Benchmark: terminalbench2
Tracking issues:
- terminalbench2 / gpt-5.4: https://github.com/github/evald/issues/29177
- terminalbench2 / claude-opus-4.6: https://github.com/github/evald/issues/29178
- terminalbench2 / claude-opus-4.7: https://github.com/github/evald/issues/29179

Results will be posted back here when the run completes.

vs-code-engineering · 2026-06-06T07:52:52Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/29179
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445372

🧪 Results

vs-code-engineering · 2026-06-06T08:29:42Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/29177
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445372

🧪 Results

vs-code-engineering · 2026-06-06T09:27:58Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/29178
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=445372

🧪 Results

Reinforce timeout + do-not-poll guidance in terminal tool descriptions

ad320b6

Copilot AI review requested due to automatic review settings June 5, 2026 16:58

Copilot started reviewing on behalf of meganrogge June 5, 2026 16:58

Trim tool description wording

fe833ae

Copilot AI reviewed Jun 5, 2026

Comment thread src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/runInTerminalTool.ts Outdated

Comment thread src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/getTerminalOutputTool.ts Outdated

Megan Rogge and others added 2 commits June 5, 2026 13:02

Restore original first sentence of getTerminalOutput description

e913bfe

meganrogge self-assigned this Jun 5, 2026

meganrogge added this to the 1.125.0 milestone Jun 5, 2026

meganrogge added the ~requires-eval-assessment label ("Evals will be run and will generate a report upon completion") Jun 5, 2026

vs-code-engineering Bot removed the ~requires-eval-assessment label ("Evals will be run and will generate a report upon completion") Jun 5, 2026

lramos15 approved these changes Jun 5, 2026

Merge branch 'main' into megan/terminal-tool-desc-no-poll

f465bdf

meganrogge added the ~requires-eval-assessment label ("Evals will be run and will generate a report upon completion") Jun 5, 2026

meganrogge modified the milestones: 1.125.0, 1.124.0 Jun 5, 2026

meganrogge enabled auto-merge (squash) June 5, 2026 20:15

meganrogge merged commit c2d6b5a into main Jun 5, 2026
25 checks passed

meganrogge deleted the megan/terminal-tool-desc-no-poll branch June 5, 2026 20:23

vs-code-engineering Bot removed the ~requires-eval-assessment label ("Evals will be run and will generate a report upon completion") Jun 5, 2026

3 participants
