Fix RunInTerminalTool hang when shell exits before/during execute() by meganrogge · Pull Request #313249 · microsoft/vscode

meganrogge · 2026-04-29T15:09:46Z

Problem

In benchmark eval run 25073061392, 16 tests failed with X_AGENT_STILL_RESPONDING (60-minute outer timeout). All 16 traced to the same hang in the RunInTerminalTool rich execute strategy. The same bug was independently confirmed in run 25115115244, where 13/89 tasks timed out with the identical pattern: the shell process died (exit codes 1/2/22/127/130), onExit had already fired, and Event.toPromise hung forever.

The build under test already had #312827 (onExit listener) and #312854 (downgrade rich → basic on broken shell integration), so the listener wasn't missing. The actual bug was a subscription-ordering / already-fired-emitter race.

RichExecuteStrategy.execute() wired onExit / onDisposed after await this._instance.xtermReadyPromise. Two failure modes from this:

- Already-dead instance. If the pty had exited before execute() was entered, _onExit had already fired and been disposed (see terminalInstance._onProcessExit). Event.toPromise(onExit) then subscribed to a dead emitter and never resolved, so the tool hung until the agent's 60-min outer timeout.
- Race during xtermReadyPromise. If the pty exits during xterm init, the events fire before our subscription is attached and are missed.

BasicExecuteStrategy already wired its race before the await, but had the same upfront-already-dead hole.
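
The already-dead failure mode can be reproduced in isolation. The sketch below is not VS Code's Emitter or Event.toPromise: TinyEmitter, toPromise, settlesWithin and demo are hypothetical stand-ins that only assume what the description states, namely that an emitter which has fired and been disposed never calls a listener attached afterwards.

```typescript
// Hypothetical minimal emitter, loosely modelled on the fire-then-dispose behaviour described above.
type Listener<T> = (value: T) => void;

class TinyEmitter<T> {
	private listeners: Listener<T>[] = [];
	private disposed = false;

	// Subscribing after dispose() is a silent no-op: the listener is never stored.
	on(listener: Listener<T>): void {
		if (!this.disposed) {
			this.listeners.push(listener);
		}
	}

	fire(value: T): void {
		for (const listener of [...this.listeners]) {
			listener(value);
		}
	}

	dispose(): void {
		this.disposed = true;
		this.listeners = [];
	}
}

// Resolves on the next fire(); stays pending forever if the emitter never fires again.
function toPromise<T>(emitter: TinyEmitter<T>): Promise<T> {
	return new Promise<T>(resolve => emitter.on(resolve));
}

// Reports whether `p` settles within `ms` milliseconds.
function settlesWithin(p: Promise<unknown>, ms: number): Promise<boolean> {
	return new Promise<boolean>(resolve => {
		const timer = setTimeout(() => resolve(false), ms);
		const settled = () => { clearTimeout(timer); resolve(true); };
		p.then(settled, settled);
	});
}

async function demo(): Promise<{ late: boolean; early: boolean }> {
	// Late subscription: the exit event fired and the emitter was disposed first.
	const dead = new TinyEmitter<number>();
	dead.fire(127);
	dead.dispose();
	const late = await settlesWithin(toPromise(dead), 20);

	// Early subscription: the listener is attached before the event fires.
	const live = new TinyEmitter<number>();
	const pending = toPromise(live);
	live.fire(127);
	const early = await settlesWithin(pending, 20);

	return { late, early };
}
```

In this sketch the late subscription is still pending when the 20 ms window closes, while the early one settles, which mirrors the hang versus the expected behaviour.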

Fix

In both richExecuteStrategy.ts and basicExecuteStrategy.ts:

- Synchronous early-out at the top of execute():
  - instance.isDisposed → throw "The terminal was closed" (matches existing onDisposed race branch).
  - instance.exitCode !== undefined → resolve immediately with the captured exit code and additionalInformation: 'Command exited with code N'.
- In rich, move the Promise.race([...]) lifecycle subscriptions before await xtermReadyPromise so onExit / onDisposed are wired synchronously at function entry. This closes the in-flight death race.
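
The ordering described above can be sketched roughly as follows. This is a simplification under stated assumptions, not the code from richExecuteStrategy.ts: TerminalLike, ExecuteResult, exitedResult and the runCommand callback are hypothetical, and the real onExit event carries a richer payload than a bare exit code.

```typescript
// Hypothetical stand-ins: only the members named in the description are modelled.
interface TerminalLike {
	isDisposed: boolean;
	exitCode: number | undefined;
	xtermReadyPromise: Promise<void>;
	onExit(listener: (exitCode: number | undefined) => void): void;
	onDisposed(listener: () => void): void;
}

interface ExecuteResult {
	exitCode: number | undefined;
	additionalInformation?: string;
}

function exitedResult(exitCode: number | undefined): ExecuteResult {
	return { exitCode, additionalInformation: `Command exited with code ${exitCode}` };
}

function execute(instance: TerminalLike, runCommand: () => Promise<ExecuteResult>): Promise<ExecuteResult> {
	// Step 1: synchronous early-outs, so a fired-and-disposed emitter is never awaited.
	if (instance.isDisposed) {
		return Promise.reject(new Error('The terminal was closed'));
	}
	if (instance.exitCode !== undefined) {
		return Promise.resolve(exitedResult(instance.exitCode));
	}

	// Step 2: attach lifecycle listeners before anything asynchronous happens.
	const exited = new Promise<ExecuteResult>(resolve =>
		instance.onExit(code => resolve(exitedResult(code))));
	const disposed = new Promise<ExecuteResult>((_, reject) =>
		instance.onDisposed(() => reject(new Error('The terminal was closed'))));

	// Step 3: only now wait for xterm, racing it against the lifecycle promises.
	const completed = instance.xtermReadyPromise.then(runCommand);
	return Promise.race([completed, exited, disposed]);
}
```

Because the function is not async and the listeners are attached before the first promise is awaited, an exit that arrives while xtermReadyPromise is still pending resolves the race instead of being missed.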

Tests

Added unit tests to both richExecuteStrategy.test.ts and basicExecuteStrategy.test.ts:

returns immediately with captured exit code when pty has already exited before execute()
throws "The terminal was closed" when instance is already disposed before execute()

runCommand / sendText are stubbed to throw, proving the early-out path doesn't try to write to a dead shell.
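
The throwing-stub technique can be illustrated without the VS Code test harness. Everything in this sketch is hypothetical (StubInstance, executeSketch, deadInstance, runChecks); it models only the early-out path and is not the contents of richExecuteStrategy.test.ts or basicExecuteStrategy.test.ts.

```typescript
// Hypothetical, heavily simplified strategy: only the early-out path is modelled.
interface StubInstance {
	isDisposed: boolean;
	exitCode: number | undefined;
	sendText(text: string): void;
}

interface SketchResult {
	exitCode: number | undefined;
	additionalInformation?: string;
}

async function executeSketch(instance: StubInstance, commandLine: string): Promise<SketchResult> {
	if (instance.isDisposed) {
		throw new Error('The terminal was closed');
	}
	if (instance.exitCode !== undefined) {
		return {
			exitCode: instance.exitCode,
			additionalInformation: `Command exited with code ${instance.exitCode}`,
		};
	}
	// Only reached for a live shell.
	instance.sendText(commandLine);
	return { exitCode: undefined };
}

// A stub whose write path throws: if the early-out were skipped, the check would fail loudly.
function deadInstance(overrides: Partial<StubInstance>): StubInstance {
	return {
		isDisposed: false,
		exitCode: undefined,
		sendText: () => { throw new Error('sendText must not be called on a dead shell'); },
		...overrides,
	};
}

async function runChecks(): Promise<string[]> {
	const log: string[] = [];

	// Case 1: the pty already exited before execute().
	const result = await executeSketch(deadInstance({ exitCode: 127 }), 'echo hi');
	log.push(`exitCode=${result.exitCode}`);

	// Case 2: the instance was already disposed before execute().
	try {
		await executeSketch(deadInstance({ isDisposed: true }), 'echo hi');
		log.push('no error');
	} catch (e) {
		log.push((e as Error).message);
	}
	return log;
}
```

Making the stub throw, rather than merely recording calls, means a regression that writes to the dead shell surfaces as an unexpected error instead of a silently passing test.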

The rich execute strategy subscribed to onExit/onDisposed AFTER awaiting xtermReadyPromise. If the pty had already exited before execute() was entered, those emitters had already fired and been disposed, so Event.toPromise() subscribed to a dead emitter and never resolved, hanging the run-in-terminal tool until the agent's 60-minute outer timeout (16 X_AGENT_STILL_RESPONDING failures observed in eval run 25073061392).

- Add synchronous up-front check for instance.isDisposed / exitCode in both rich and basic strategies; resolve immediately with the captured exit code rather than subscribing to a fired-and-disposed emitter.
- In the rich strategy, move the Promise.race lifecycle subscription setup BEFORE 'await this._instance.xtermReadyPromise' so onExit / onDisposed are wired synchronously at function entry, closing the race window where the pty exits during xterm initialization.
- Add unit tests for both branches in rich and basic strategies.

Fixes #313248

meganrogge · 2026-04-29T15:12:03Z

/requires-eval-assessment terminalbench2 gpt-5.4,claude-opus-4.6,claude-opus-4.7

vs-code-engineering · 2026-04-29T15:12:31Z

⏳ Queued vscode build for 05871ecad4d8b9be3ecbb92f5aee1aa85ec948f9 (step 1/2).

Build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434890
When this succeeds, the eval-assessment publish build will be queued automatically.

Copilot

Pull request overview

Fixes a hang in the run_in_terminal tool’s rich/basic execute strategies when the terminal process has already exited (or exits during strategy setup), by ensuring lifecycle completion can’t be missed due to late event subscription.

Changes:

Add synchronous “already disposed / already exited” short-circuit handling at the start of execute() in both strategies.
In rich strategy, subscribe to lifecycle events before awaiting xtermReadyPromise to avoid missing onExit/onDisposed.
Add unit tests covering the “already exited” and “already disposed” early-out behavior for both strategies.

Show a summary per file

| File | Description |
| --- | --- |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/richExecuteStrategy.ts | Adds early-out checks and moves lifecycle subscriptions ahead of `await xtermReadyPromise` to prevent hangs. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/basicExecuteStrategy.ts | Adds early-out checks to avoid waiting on already-fired lifecycle events. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/richExecuteStrategy.test.ts | Adds tests for immediate return on captured exit code and for disposed-instance rejection. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/basicExecuteStrategy.test.ts | Adds analogous tests for the basic strategy. |

Copilot's findings

Files reviewed: 4/4 changed files
Comments generated: 2

vs-code-engineering · 2026-04-29T16:04:16Z

🚀 Queued eval-assessment publish build for ab48c2d6c59928e1a16d01cafce525816f064139 (step 2/2).

Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434905
On success, publishes @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.ab48c2d6c5 on the dev tag.

vs-code-engineering · 2026-04-29T16:13:22Z

🔬 Queued eval-assessment benchmark for cd9772c658.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.ab48c2d6c5 (dev tag)
Benchmark: terminalbench2
Tracking issues:
- gpt-5.4: https://github.com/github/evald/issues/18797
- claude-opus-4.6: https://github.com/github/evald/issues/18798
- claude-opus-4.7: https://github.com/github/evald/issues/18799

Results will be posted back here when the run completes.

vs-code-engineering · 2026-04-29T16:15:34Z

✅ Eval-assessment build published.

Package: @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.ab48c2d6c5 (tag: dev)
Install: npm install @vscode/vscode-copilot-evaluation-agent@0.0.0-dev.ab48c2d6c5
Pipeline run: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434905

vs-code-engineering · 2026-04-29T21:42:51Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/18799
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434905

Analysis Results

Resolution Rate

| Benchmark | Total Cases | Passed | Failed | Resolved Rate |
| --- | --- | --- | --- | --- |
| terminalbench2 | 89 | 56 | 33 | 62.92% |

Token Usage

| Metric | Value |
| --- | --- |
| Total Tokens | 79,996,111 |
| Input Tokens | 78,701,373 |
| Output Tokens | 1,294,738 |
| Cached Tokens | 75,538,682 |

Step Counts

| Metric | Value |
| --- | --- |
| Total Steps | 1,798 |
| Mean Steps/Instance | 20.20 |

vs-code-engineering · 2026-04-29T22:09:34Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/18798
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434905

Analysis Results

Resolution Rate

| Benchmark | Total Cases | Passed | Failed | Resolved Rate |
| --- | --- | --- | --- | --- |
| terminalbench2 | 89 | 52 | 37 | 58.43% |

Token Usage

| Metric | Value |
| --- | --- |
| Total Tokens | 89,462,678 |
| Input Tokens | 88,259,039 |
| Output Tokens | 1,203,639 |
| Cached Tokens | 85,560,673 |

Step Counts

| Metric | Value |
| --- | --- |
| Total Steps | 2,296 |
| Mean Steps/Instance | 25.80 |

vs-code-engineering · 2026-04-30T01:36:00Z

📊 Eval-assessment benchmark complete.

Tracking issue: https://github.com/github/evald/issues/18797
Publish build: https://dev.azure.com/monacotools/Monaco/_build/results?buildId=434905

Analysis Results

Resolution Rate

| Benchmark | Total Cases | Passed | Failed | Resolved Rate |
| --- | --- | --- | --- | --- |
| terminalbench2 | 89 | 48 | 41 | 53.93% |

Token Usage

| Metric | Value |
| --- | --- |
| Total Tokens | 50,061,498 |
| Input Tokens | 49,314,270 |
| Output Tokens | 747,228 |
| Cached Tokens | 42,203,392 |

Step Counts

| Metric | Value |
| --- | --- |
| Total Steps | 1,323 |
| Mean Steps/Instance | 14.87 |

Copilot AI review requested due to automatic review settings April 29, 2026 15:09

meganrogge self-assigned this Apr 29, 2026

meganrogge added this to the 1.119.0 milestone Apr 29, 2026

meganrogge enabled auto-merge (squash) April 29, 2026 15:10

meganrogge added the ~requires-eval-assessment Evals will be run and will generate a report upon completion label Apr 29, 2026

Copilot started reviewing on behalf of meganrogge April 29, 2026 15:10 View session

meganrogge removed the ~requires-eval-assessment Evals will be run and will generate a report upon completion label Apr 29, 2026

vs-code-engineering Bot added the ~requires-eval-assessment Evals will be run and will generate a report upon completion label Apr 29, 2026

Copilot AI reviewed Apr 29, 2026

Comment thread ...kbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/richExecuteStrategy.ts Outdated

Comment thread ...bench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/basicExecuteStrategy.ts Outdated

meganrogge and others added 2 commits April 29, 2026 11:44

Update src/vs/workbench/contrib/terminalContrib/chatAgentTools/browse…

44a7c2d

…r/executeStrategy/richExecuteStrategy.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update src/vs/workbench/contrib/terminalContrib/chatAgentTools/browse…

0223ad4

…r/executeStrategy/basicExecuteStrategy.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

pwang347 approved these changes Apr 29, 2026

Merge branch 'main' into fix/run-in-terminal-tool-already-exited

31de020

vs-code-engineering Bot removed the ~requires-eval-assessment Evals will be run and will generate a report upon completion label Apr 29, 2026

meganrogge merged commit cdbdcc3 into main Apr 29, 2026
26 checks passed

meganrogge deleted the fix/run-in-terminal-tool-already-exited branch April 29, 2026 16:19

vs-code-engineering Bot mentioned this pull request Apr 29, 2026

RunInTerminalTool hangs for 60min when shell exits before/during execute() (rich strategy) #313248

Closed

meganrogge mentioned this pull request May 1, 2026

Test plan: terminal agent tool end-to-end stability (week of 4/23) #313618

Open

56 tasks

vs-code-engineering Bot added the on-testplan label May 1, 2026

meganrogge merged 4 commits into main from fix/run-in-terminal-tool-already-exited

3 participants
