Experiment with terminal output deltas for repeated polls by kevin-m-kent · Pull Request #315543 · microsoft/vscode

kevin-m-kent · 2026-05-10T15:12:44Z

This adds an experimental chat.tools.terminal.outputDeltas setting for repeated get_terminal_output polling.

When the experiment is enabled:

The first poll for a terminal execution returns the full output, preserving current context.
Repeated polls with identical output return a short unchanged marker instead of replaying the full terminal buffer.
Polls with appended output return only the new suffix.
Non-prefix changes, such as screen rewrites or truncated scrollback, fall back to returning the current full output.
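
The four cases above can be sketched as a single pure function. This is a minimal illustration, not the code from this PR: `PollResult` and `diffTerminalOutput` are hypothetical names, and the sketch assumes the previous snapshot is available as a full string.

```typescript
// Hypothetical names for illustration; the PR's actual implementation may differ.
type PollResult =
    | { kind: 'full'; text: string }
    | { kind: 'unchanged'; shownLength: number }
    | { kind: 'delta'; text: string };

function diffTerminalOutput(previous: string | undefined, current: string): PollResult {
    // First poll for this terminal execution: return everything.
    if (previous === undefined) {
        return { kind: 'full', text: current };
    }
    // Identical output: report only how much was already shown.
    if (current === previous) {
        return { kind: 'unchanged', shownLength: previous.length };
    }
    // Appended output: return only the new suffix.
    if (current.startsWith(previous)) {
        return { kind: 'delta', text: current.slice(previous.length) };
    }
    // Non-prefix change (screen rewrite, truncated scrollback): fall back to full output.
    return { kind: 'full', text: current };
}
```

A caller would store `current` as the new snapshot after every poll, so the next comparison is always against the most recent output.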

This change is stateful across explicit get_terminal_output calls and targets prompt/tool-result replay from terminal polling loops.

Add an experimental setting for get_terminal_output polling to return unchanged markers or appended deltas after the first full terminal snapshot. This keeps existing behavior when the flag is disabled and falls back to the current full output when the previous snapshot no longer matches.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds an experimental, stateful “output delta” mode for get_terminal_output polling to reduce repeated prompt/tool-result size when terminal output is unchanged or only appended.

Changes:

Introduces chat.tools.terminal.outputDeltas (experimental, auto-enabled experiment flag) configuration setting.
Updates GetTerminalOutputTool to optionally return unchanged markers or appended-output suffixes on repeated polls (falling back to full output for non-prefix changes).
Adds browser tests covering unchanged output, appended output, and non-prefix rewrite fallback behavior.

Show a summary per file

| File | Description |
| --- | --- |
| `src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/getTerminalOutputTool.test.ts` | Adds tests validating unchanged-marker, delta-suffix, and fallback-to-full-output behaviors when the experiment is enabled. |
| `src/vs/workbench/contrib/terminalContrib/chatAgentTools/common/terminalChatAgentToolsConfiguration.ts` | Registers the new `chat.tools.terminal.outputDeltas` experimental setting. |
| `src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/getTerminalOutputTool.ts` | Implements stateful output snapshotting and delta formatting behind the new configuration flag. |

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 0

kevin-m-kent · 2026-05-11T13:06:50Z

Validated locally that this works as expected with the flag enabled. After the initial poll, every subsequent poll returns a message like the following when the terminal hasn't updated since the last poll:

Output of terminal f7a744dc-de40-46e7-a631-776c4d73ea22 unchanged since previous poll (161 characters already shown). No new output.
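
For reference, a marker of this shape could be produced by a small formatter. The sketch below reproduces only the wording quoted above; the function name `formatUnchangedMarker` and its parameters are assumptions, not identifiers from the PR.

```typescript
// Hypothetical helper; only the message wording is taken from the comment above.
function formatUnchangedMarker(terminalId: string, shownLength: number): string {
    return `Output of terminal ${terminalId} unchanged since previous poll (${shownLength} characters already shown). No new output.`;
}
```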

meganrogge · 2026-05-11T14:50:53Z

The map stores full output strings (potentially large) for up to 100 terminals, with no cleanup on terminal disposal. Stale entries only get evicted when the cap is hit via FIFO.

Suggestions:

1. Clear on dispose: override dispose() to call this._lastOutputByTerminalId.clear().
2. Listen for terminal disposal: register a listener (e.g. onDidDisposeInstance) to delete the entry for that terminal ID, so we don't hold output strings for terminals that no longer exist.
3. Consider storing a hash + length instead of the full string: if we only need the "unchanged" and "startsWith" checks, we could store a hash of the previous output and the length. We'd lose the ability to do the prefix check cheaply, but it would drastically reduce memory. Alternatively, keep the current approach but lower the cap or add a per-entry size limit.
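
The first two suggestions can be illustrated with a small bounded cache. This is a sketch under assumptions: `BoundedSnapshotCache` is a hypothetical class, the cap of 100 comes from the comment above, and the wiring to a terminal disposal event is left to the caller rather than shown against a real VS Code API.

```typescript
// Hypothetical cache; relies on Map preserving insertion order for FIFO eviction.
class BoundedSnapshotCache {
    private readonly entries = new Map<string, string>();
    private readonly maxEntries: number;

    constructor(maxEntries: number = 100) {
        this.maxEntries = maxEntries;
    }

    get size(): number {
        return this.entries.size;
    }

    get(terminalId: string): string | undefined {
        return this.entries.get(terminalId);
    }

    set(terminalId: string, output: string): void {
        this.entries.set(terminalId, output);
        // Evict the oldest inserted entries once the cap is exceeded.
        while (this.entries.size > this.maxEntries) {
            const oldest = this.entries.keys().next().value;
            if (oldest === undefined) {
                break;
            }
            this.entries.delete(oldest);
        }
    }

    // Intended to be called when a terminal is disposed.
    delete(terminalId: string): void {
        this.entries.delete(terminalId);
    }

    // Intended to be called from the owning tool's dispose().
    clear(): void {
        this.entries.clear();
    }
}
```

With this shape, a terminal disposal listener would call `delete(id)` and the tool's own `dispose()` would call `clear()`, so FIFO eviction becomes a backstop rather than the only cleanup path.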

meganrogge

Thanks for this! See my comment

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kevin-m-kent · 2026-05-11T18:02:48Z

> The map stores full output strings (potentially large) for up to 100 terminals, with no cleanup on terminal disposal. Stale entries only get evicted when the cap is hit via FIFO.
>
> Suggestions:
>
> 1. Clear on dispose: override dispose() to call this._lastOutputByTerminalId.clear().
> 2. Listen for terminal disposal: register a listener (e.g. onDidDisposeInstance) to delete the entry for that terminal ID, so we don't hold output strings for terminals that no longer exist.
> 3. Consider storing a hash + length instead of the full string: if we only need the "unchanged" and "startsWith" checks, we could store a hash of the previous output and the length. We'd lose the ability to do the prefix check cheaply, but it would drastically reduce memory. Alternatively, keep the current approach but lower the cap or add a per-entry size limit.

Thanks @meganrogge - I think these are all great suggestions. I made some updates that I believe address your feedback and tested again locally. For #3 I went with the hash approach.
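
One way a hash-plus-length snapshot could still support the "unchanged" and "appended" checks is to rehash the first `length` characters of the new output and compare. The sketch below is an assumption about how such a scheme might look, not the PR's code: the FNV-1a-style hash, `OutputFingerprint`, `fingerprint`, and `classify` are all illustrative, and a hash collision could in principle misclassify a rewrite as unchanged or appended.

```typescript
// Hypothetical fingerprint stored instead of the full output string.
interface OutputFingerprint {
    length: number;
    hash: number;
}

// FNV-1a-style 32-bit hash over UTF-16 code units; chosen for illustration only.
function hashPrefix(text: string, end: number = text.length): number {
    let hash = 0x811c9dc5;
    for (let i = 0; i < end; i++) {
        hash ^= text.charCodeAt(i);
        hash = Math.imul(hash, 0x01000193);
    }
    return hash >>> 0;
}

function fingerprint(output: string): OutputFingerprint {
    return { length: output.length, hash: hashPrefix(output) };
}

type Change = 'unchanged' | 'appended' | 'rewritten';

function classify(previous: OutputFingerprint, current: string): Change {
    // Shorter output cannot have the previous output as a prefix.
    if (current.length < previous.length) {
        return 'rewritten';
    }
    // Rehash only the part that should match the previous snapshot.
    if (hashPrefix(current, previous.length) !== previous.hash) {
        return 'rewritten';
    }
    return current.length === previous.length ? 'unchanged' : 'appended';
}
```

When the result is `'appended'`, the new suffix is `current.slice(previous.length)`. The trade-off is that every poll rehashes the prefix, which costs time proportional to the previous output length but avoids retaining the string.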

bhavyaus · 2026-05-11T23:58:08Z

Probably not a big deal, but it's worth checking whether this works across compaction boundaries, since we lose the full tool output in that case: the model reuses terminals or kicks off async tasks, hits compaction, and then resumes the tasks.

kevin-m-kent · 2026-05-12T12:16:28Z

> Probably not a big deal, but it's worth checking whether this works across compaction boundaries, since we lose the full tool output in that case: the model reuses terminals or kicks off async tasks, hits compaction, and then resumes the tasks.

I was chatting about this with @isidorn as well. I'm not sure how we solve this scenario, other than relying on the compaction call to preserve what is relevant from the previous tool result.

isidorn · 2026-05-12T13:11:09Z

> relying on the compaction call to preserve what is relevant from the previous tool result.

I think this is a fair assumption to make.

meganrogge · 2026-05-12T13:48:39Z

/requires-eval-assessment terminalbench2 gpt-5.4,claude-opus-4.6,claude-opus-4.7

meganrogge · 2026-05-12T13:48:51Z

Seeing how this impacts evals

vs-code-engineering · 2026-05-12T13:49:44Z

⏳ Queued vscode build for f0bd87449265b04584b1d082863a7bd1ef6d01b6 (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438325
When this succeeds, the eval-assessment publish build will be queued automatically.

isidorn · 2026-05-12T13:58:22Z

Pretty cool! I did not know we have this evals integration!

Also consider doing evals for GPT-5.5 as well, since that is the latest OpenAI model.

meganrogge · 2026-05-12T14:18:34Z

/requires-eval-assessment terminalbench2 gpt-5.4,claude-opus-4.6,claude-opus-4.7,gpt-5.5

vs-code-engineering · 2026-05-12T22:10:00Z

⏳ Queued vscode build for f0bd87449265b04584b1d082863a7bd1ef6d01b6 (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438656
When this succeeds, the eval-assessment publish build will be queued automatically.

vs-code-engineering · 2026-05-12T22:59:48Z

🚀 Queued eval-assessment publish build for 858d4286b1e2e7ae6586ed44dad2c11af46cab92 (step 2/2).

Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438695
On success, publishes @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.858d4286b1 on the dev tag.

vs-code-engineering · 2026-05-12T23:10:18Z

🔬 Queued eval-assessment benchmark for 95df7cf309.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.858d4286b1 (dev tag)
Benchmark: terminalbench2
Tracking issues:
- terminalbench2 / gpt-5.4: https://github.com/github/evald/issues/22421
- terminalbench2 / claude-opus-4.6: https://github.com/github/evald/issues/22422
- terminalbench2 / claude-opus-4.7: https://github.com/github/evald/issues/22423
- terminalbench2 / gpt-5.5: https://github.com/github/evald/issues/22424

Results will be posted back here when the run completes.

vs-code-engineering · 2026-05-12T23:11:10Z

✅ Eval-assessment build published.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.858d4286b1 (tag: dev)
Install: npm install @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.858d4286b1
Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438695

isidorn · 2026-05-13T07:36:46Z

@jukasper I tried to access the https://github.com/github/evald/ links but I do not seem to have access. Can you add me please?

meganrogge · 2026-05-13T13:23:33Z

Repeating the eval run because we saw a regression on Claude Opus 4.6, though it was not large.

vs-code-engineering · 2026-05-13T13:24:32Z

⏳ Queued vscode build for f0bd87449265b04584b1d082863a7bd1ef6d01b6 (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438904
When this succeeds, the eval-assessment publish build will be queued automatically.

vs-code-engineering · 2026-05-13T14:12:25Z

🚀 Queued eval-assessment publish build for 537b6f0cc8115f78a903d0e1b2ab2706423aa2cc (step 2/2).

Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438928
On success, publishes @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.537b6f0cc8 on the dev tag.

vs-code-engineering · 2026-05-13T14:26:39Z

🔬 Queued eval-assessment benchmark for 641f97f4c9.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.537b6f0cc8 (dev tag)
Benchmark: terminalbench2
Tracking issues:
- terminalbench2 / gpt-5.4: https://github.com/github/evald/issues/22519
- terminalbench2 / claude-opus-4.6: https://github.com/github/evald/issues/22520
- terminalbench2 / claude-opus-4.7: https://github.com/github/evald/issues/22521
- terminalbench2 / gpt-5.5: https://github.com/github/evald/issues/22522

Results will be posted back here when the run completes.

vs-code-engineering · 2026-05-13T14:27:54Z

✅ Eval-assessment build published.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.537b6f0cc8 (tag: dev)
Install: npm install @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.537b6f0cc8
Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438928

meganrogge

Thanks! Let's see how it goes 😄. We'll want to kick off evals with and without this setting too.

vs-code-engineering · 2026-05-13T17:43:24Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/22522
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438928

🧪 Results

vs-code-engineering · 2026-05-13T21:51:22Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/22521
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438928

🧪 Results

vs-code-engineering · 2026-05-13T21:52:51Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/22520
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438928

🧪 Results

vs-code-engineering · 2026-05-13T21:54:36Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/22519
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=438928

🧪 Results

Copilot AI review requested due to automatic review settings May 10, 2026 15:12

Copilot started reviewing on behalf of kevin-m-kent May 10, 2026 15:13

Copilot AI reviewed May 10, 2026

vs-code-engineering Bot assigned meganrogge May 10, 2026

meganrogge requested changes May 11, 2026

Address terminal output snapshot cleanup

f0bd874

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kevin-m-kent requested a review from meganrogge May 11, 2026 20:05

meganrogge added the ~requires-eval-assessment Evals will be run and will generate a report upon completion label May 12, 2026

vs-code-engineering Bot removed the ~requires-eval-assessment Evals will be run and will generate a report upon completion label May 12, 2026

meganrogge added the ~requires-eval-assessment Evals will be run and will generate a report upon completion label May 12, 2026

vs-code-engineering Bot removed the ~requires-eval-assessment Evals will be run and will generate a report upon completion label May 12, 2026

meganrogge added the ~requires-eval-assessment Evals will be run and will generate a report upon completion label May 13, 2026

microsoft deleted a comment from vs-code-engineering Bot May 13, 2026

vs-code-engineering Bot removed the ~requires-eval-assessment Evals will be run and will generate a report upon completion label May 13, 2026

meganrogge approved these changes May 13, 2026

benvillalobos approved these changes May 13, 2026

meganrogge merged commit 91f7718 into microsoft:main May 13, 2026
25 checks passed

vs-code-engineering Bot added this to the 1.121.0 milestone May 13, 2026

NikolaRHristov pushed a commit to CodeEditorLand/Editor that referenced this pull request May 13, 2026

Experiment with terminal output deltas for repeated polls (microsoft#…

93c3806

…315543)


6 participants
