
Fix terminal output capture: strip command echo/prompt, fix premature idle detection, improve sandbox failure detection, force bash over sh #303754

Merged
alexdima merged 23 commits into main from alexdima/fix-303531-sandbox-no-output-leak
Mar 22, 2026


@alexdima alexdima commented Mar 21, 2026

Fixes #303531

Problem

When the run_in_terminal tool executes commands (especially sandbox-wrapped ones) that produce no output, the raw command echo and shell prompt leak into the tool result. The LLM and user see long, confusing strings like the full sandbox wrapper command instead of the expected "Command produced no output".

Additionally, several related issues with terminal command execution reliability were discovered and fixed:

  • Premature idle detection could cause the tool to capture output before the command even started executing
  • The sandbox execPath was incorrectly resolved in remote environments
  • Sandbox failures weren't detected when exit codes were unavailable

Root Cause

Without shell integration (NoneExecuteStrategy) or when shell integration markers misfire, the tool captures the terminal buffer between xterm markers using getContentsAsText(). This raw capture includes:

  1. The command echo line (the prompt + what sendText wrote, including the full sandbox wrapper)
  2. The actual command output (empty in these cases)
  3. The next shell prompt line(s)

Previously there was no stripping of (1) and (3), so the entire command echo was returned as "output".

Changes

1. Strip command echo and prompt from terminal output (strategyHelpers.ts)

New stripCommandEchoAndPrompt() function that removes leading command echo lines and trailing prompt lines from marker-based output:

  • findCommandEcho() — Locates the command text in the output, handling terminal line wrapping by stripping newlines and building an index mapping. Supports suffix matching for getOutput() cases where only the wrapped continuation appears
  • Prompt evidence narrowing — Uses the prompt prefix (text before the command echo) to determine the shell type (bash, zsh, PowerShell, cmd, Starship, Python REPL) and only checks relevant trailing prompt patterns, reducing false positives
  • Double-echo handling — When the shell re-echoes the command (prompt + echo appears twice), strips it a second time
  • Supports various prompt styles: user@host:path $, ] $, PS C:\>, C:\>, , >>>, and wrapped multi-line prompts

Applied in NoneExecuteStrategy, BasicExecuteStrategy, and RichExecuteStrategy (defensively, since getOutput() can also include echoes on some platforms).
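
A minimal sketch of the wrapped-echo lookup described above, assuming a terminal wrap shows up as a line break inside the echoed command. The names `stripNewLinesWithMapping`, `findEchoEnd`, and `stripLeadingEcho` are hypothetical and only illustrate the index-mapping idea; they are not the functions in strategyHelpers.ts, and suffix matching, prompt detection, and double-echo handling are left out.

```typescript
// Illustrative sketch only; these helpers are hypothetical stand-ins, not the
// actual code in strategyHelpers.ts.

interface StrippedText {
	text: string;      // input with line breaks removed
	mapping: number[]; // mapping[i] is the index in the original string of text[i]
}

function stripNewLinesWithMapping(raw: string): StrippedText {
	const chars: string[] = [];
	const mapping: number[] = [];
	for (let i = 0; i < raw.length; i++) {
		const ch = raw.charAt(i);
		if (ch === '\n' || ch === '\r') {
			continue; // assumed: a wrapped echo appears as a line break inside the command text
		}
		chars.push(ch);
		mapping.push(i);
	}
	// Joining an array once avoids repeated string concatenation.
	return { text: chars.join(''), mapping };
}

// Returns the index in `raw` just past the echoed command, or -1 if it is absent.
function findEchoEnd(raw: string, command: string): number {
	if (command.length === 0) {
		return -1;
	}
	const { text, mapping } = stripNewLinesWithMapping(raw);
	const start = text.indexOf(command);
	if (start === -1) {
		return -1;
	}
	const lastOriginalIndex = mapping[start + command.length - 1];
	return lastOriginalIndex === undefined ? -1 : lastOriginalIndex + 1;
}

// Drops everything up to and including the echoed command line.
function stripLeadingEcho(raw: string, command: string): string {
	const end = findEchoEnd(raw, command);
	return end === -1 ? raw : raw.slice(end).replace(/^\r?\n/, '');
}
```

The mapping lets the search run on the unwrapped text while the cut is made in the original buffer, so line breaks in the real output after the echo are preserved.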

2. Fix premature idle detection (noneExecuteStrategy.ts)

After sendText(), wait for the terminal cursor to move past the start marker line before beginning idle detection. Without this, the idle poll could resolve immediately on the existing prompt before the shell started processing the command.
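
The wait can be pictured roughly as follows. This is a sketch under assumptions: `getCursorLine` is a hypothetical callback standing in for however the strategy reads the cursor row, polling is used for simplicity (the real code may be event-driven), and the 5000 ms default mirrors the 5s timeout mentioned in the PR feedback section.

```typescript
// Hypothetical helper; not the code in noneExecuteStrategy.ts.
// Resolves true once the cursor has moved below `startLine`,
// or false if `timeoutMs` elapses first.
function waitForCursorPastLine(
	getCursorLine: () => number,
	startLine: number,
	timeoutMs: number = 5000,
	pollMs: number = 50,
): Promise<boolean> {
	return new Promise<boolean>(resolve => {
		const deadline = Date.now() + timeoutMs;
		const check = (): void => {
			if (getCursorLine() > startLine) {
				resolve(true); // the shell has started echoing or running the command
			} else if (Date.now() >= deadline) {
				resolve(false); // give up instead of hanging indefinitely
			} else {
				setTimeout(check, pollMs);
			}
		};
		check();
	});
}
```

A caller would await this after `sendText()` and only then start idle polling, so the poll cannot settle on the prompt that was already on screen.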

3. Fix start marker recreation (strategyHelpers.ts)

setupRecreatingStartMarker() now returns an IDisposable that stops the recreation loop. Callers dispose it before sending a command so that prompt re-renders (e.g. PSReadLine transient prompts) don't move the start marker past the command output.
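
To illustrate why the returned disposable matters, here is a small sketch with stand-in types. `FakePromptTerminal`, `MarkerSketch`, and `setupRecreatingMarkerSketch` are all hypothetical; the real function works against xterm markers, which are not modeled here.

```typescript
interface IDisposableLike {
	dispose(): void;
}

// Hypothetical stand-in for a terminal that notifies listeners on prompt re-render.
class FakePromptTerminal {
	cursorLine = 0;
	private listeners: Array<() => void> = [];

	onPromptRerender(listener: () => void): IDisposableLike {
		this.listeners.push(listener);
		return {
			dispose: () => {
				this.listeners = this.listeners.filter(l => l !== listener);
			},
		};
	}

	rerenderPrompt(atLine: number): void {
		this.cursorLine = atLine;
		this.listeners.slice().forEach(l => l());
	}
}

interface MarkerSketch {
	line: number;
}

// Keeps the marker pinned to the latest prompt line until the handle is disposed.
function setupRecreatingMarkerSketch(term: FakePromptTerminal, marker: MarkerSketch): IDisposableLike {
	marker.line = term.cursorLine;
	return term.onPromptRerender(() => {
		marker.line = term.cursorLine;
	});
}
```

Disposing the handle before sending the command freezes the marker, so a later prompt re-render no longer moves it past the command output.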

4. Sandbox reliability improvements

  • Heuristic sandbox failure detection (sandboxOutputAnalyzer.ts) — When exit code is unavailable (no shell integration), detect sandbox failures by pattern-matching output for "Operation not permitted", "Permission denied", "Read-only file system", etc.
  • Fix execPath resolution for remote environments (terminalSandboxService.ts) — Use remoteEnv.execPath directly instead of constructing a path to a node binary
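
A rough sketch of the output heuristic, assuming only the three messages quoted above. The function name `looksSandboxBlockedSketch`, the case-insensitive matching, and the whitespace collapsing are assumptions for illustration; the real `outputLooksSandboxBlocked()` may use a different pattern list and normalization.

```typescript
// Hypothetical pattern list, limited to the messages named in this PR description.
const SANDBOX_BLOCKED_PATTERNS: readonly RegExp[] = [
	/Operation not permitted/i,
	/Permission denied/i,
	/Read-only file system/i,
];

function looksSandboxBlockedSketch(output: string): boolean {
	// Collapse line breaks and runs of whitespace into single spaces so that a
	// message wrapped at a word boundary by the terminal still matches.
	const flattened = output.replace(/\s+/g, ' ');
	return SANDBOX_BLOCKED_PATTERNS.some(pattern => pattern.test(flattened));
}
```

A heuristic like this can produce false positives when a command legitimately prints one of these phrases, which is presumably why it is only used when no exit code is available.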

5. Configurable idle poll interval

  • New chat.tools.terminal.idlePollInterval setting (default: 1000ms, range: 50–10000ms)
  • All three execute strategies read this setting instead of hardcoding 1000ms
  • Enables integration tests to use fast polling (50ms) to reduce test overhead
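
As an illustration of how a strategy might read the setting defensively, the sketch below clamps a configured value into the documented 50–10000 ms range and falls back to 1000 ms. `resolveIdlePollIntervalSketch` is a hypothetical helper; whether the real code clamps or relies on the setting schema is an assumption.

```typescript
const DEFAULT_IDLE_POLL_MS = 1000;
const MIN_IDLE_POLL_MS = 50;
const MAX_IDLE_POLL_MS = 10000;

// `configured` is whatever the configuration service returned for
// chat.tools.terminal.idlePollInterval (possibly undefined or malformed).
function resolveIdlePollIntervalSketch(configured: unknown): number {
	if (typeof configured !== 'number' || !isFinite(configured)) {
		return DEFAULT_IDLE_POLL_MS;
	}
	return Math.min(MAX_IDLE_POLL_MS, Math.max(MIN_IDLE_POLL_MS, configured));
}
```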

6. Cosmetic fix: suppress spurious "tool simplified the command" message

CommandLinePreventHistoryRewriter prepends a space to prevent shell history recording. This was triggering a "Note: The tool simplified the command to..." message. The comparison now normalizes display forms before diffing.
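
The idea can be sketched as below. Both function names are hypothetical, and normalizing by trimming surrounding whitespace is an assumption about what "display form" means here.

```typescript
// Hypothetical normalization: ignore leading/trailing whitespace, which covers
// the single space prepended to keep the command out of shell history.
function toDisplayFormSketch(commandLine: string): string {
	return commandLine.trim();
}

// True only when the rewrite changed what the user would actually see.
function wasVisiblyRewrittenSketch(original: string, rewritten: string): boolean {
	return toDisplayFormSketch(original) !== toDisplayFormSketch(rewritten);
}
```

With this comparison, `"ls -la"` versus `" ls -la"` counts as unchanged, so no "simplified the command" note would be emitted for it.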

7. Address PR feedback: logging, performance, timeout

  • Strip sensitive data from debug logs (log metadata only, not full output)
  • Use array join instead of O(n²) string concat in stripNewLinesAndBuildMapping
  • Add 5s timeout to cursor-move wait to prevent indefinite hangs
  • Align shellIntegrationTimeout descriptions (0 = skip the wait)

8. Install bubblewrap and socat in Linux CI pipelines

These packages are required for terminal sandbox integration tests on Linux.

9. Force /bin/bash over /bin/sh for copilot terminal profile

When the default terminal profile resolves to /bin/sh, shell integration cannot be injected (pty host warns: "Shell integration cannot be enabled for executable /bin/sh"). This causes loss of exit code detection and degrades sandbox failure analysis.

Added a /bin/sh → /bin/bash fallback in getCopilotProfile(), matching the existing cmd.exe → powershell.exe override pattern. This fixes both CI environments (where the user's shell may be /bin/sh) and real users whose default profile resolves to /bin/sh.
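
A minimal sketch of such a fallback, assuming an exact path comparison. `ShellProfileSketch` and `preferBashOverShSketch` are hypothetical; the actual check inside getCopilotProfile() may differ (for example by also verifying that /bin/bash exists).

```typescript
interface ShellProfileSketch {
	path: string;
	args?: string[];
}

// Swap /bin/sh for /bin/bash so shell integration can be injected;
// every other profile is returned untouched.
function preferBashOverShSketch(profile: ShellProfileSketch): ShellProfileSketch {
	if (profile.path === '/bin/sh') {
		return { ...profile, path: '/bin/bash' };
	}
	return profile;
}
```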

Tests

Unit tests

  • strategyHelpers.test.ts — 20+ test cases for stripCommandEchoAndPrompt() covering bash, zsh, PowerShell, wrapped commands, sandbox-wrapped commands, edge cases
  • sandboxOutputAnalyzer.test.ts — Tests for outputLooksSandboxBlocked() heuristic detection
  • noneExecuteStrategy.test.ts — Strategy-level tests verifying "Command produced no output" and no sandbox wrapper leaks

Integration tests

  • chat.runInTerminal.test.ts — Comprehensive tests running under both shell integration modes (SI on/off), covering echo output, no-output commands, multi-line output, non-zero exit codes, special characters, and sandbox scenarios

cc @Tyriar @meganrogge

Prevent sandbox-wrapped command lines from leaking as output when
commands produce no actual output. Adds stripCommandEchoAndPrompt()
to isolate real output from marker-based terminal buffer captures.

Also adds configurable idle poll interval and shell integration
timeout=0 support for faster test execution.
Copilot AI review requested due to automatic review settings March 21, 2026 16:47
@vs-code-engineering vs-code-engineering bot added this to the 1.113.0 milestone Mar 21, 2026

Copilot AI left a comment

Pull request overview

This PR fixes run_in_terminal tool output sanitization when marker-based capture includes the echoed sandbox-wrapper command and the next prompt (most visible when the actual command produces no stdout/stderr), and adds configuration/testing support to reduce idle-detection latency in tests.

Changes:

  • Add stripCommandEchoAndPrompt() and apply it to marker-based output in NoneExecuteStrategy and BasicExecuteStrategy.
  • Introduce chat.tools.terminal.idlePollInterval and thread it through all execute strategies; extend trackIdleOnPrompt with a configurable fallback.
  • Update command-edit note detection to ignore cosmetic rewrites, allow terminal.integrated.shellIntegration.timeout=0, and add unit/integration coverage.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/strategyHelpers.ts | Adds output stripping helper for command echo + prompt removal. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/noneExecuteStrategy.ts | Uses configurable idle polling + strips echo/prompt from marker output. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/basicExecuteStrategy.ts | Uses configurable idle polling + strips echo/prompt from marker output. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/richExecuteStrategy.ts | Uses configurable idle polling for idle-on-prompt fallback. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/executeStrategy/executeStrategy.ts | Adds optional promptFallbackMs to trackIdleOnPrompt. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/browser/tools/runInTerminalTool.ts | Avoids “tool simplified the command” note for cosmetic rewrites by comparing display-normalized forms. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/common/terminalChatAgentToolsConfiguration.ts | Adds chat.tools.terminal.idlePollInterval setting metadata. |
| src/vs/workbench/contrib/terminal/common/terminalEnvironment.ts | Allows terminal.integrated.shellIntegration.timeout=0 to produce a 0ms wait. |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/strategyHelpers.test.ts | Unit tests for stripCommandEchoAndPrompt(). |
| src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/noneExecuteStrategy.test.ts | Unit tests ensuring no-output cases don’t leak sandbox wrapper text. |
| extensions/vscode-api-tests/src/singlefolder-tests/chat.runInTerminal.test.ts | Integration coverage for shell integration on/off and sandbox scenarios. |
Comments suppressed due to low confidence (1)

src/vs/workbench/contrib/terminalContrib/chatAgentTools/test/browser/noneExecuteStrategy.test.ts:109

  • Same issue as above: NoneExecuteStrategy now requires an IConfigurationService, but the test calls the constructor with only three arguments, which will fail to compile / mis-wire dependencies. Please adjust this instantiation to provide IConfigurationService (ideally via instantiationService + TestConfigurationService).
		const logService = createLogService();
		const strategy = store.add(new NoneExecuteStrategy(instance, () => false, logService));
		const cts = store.add(new CancellationTokenSource());

…utput

Anchor prompt-detection regexes to specific prompt shapes instead of
broadly matching any line ending with $, #, %, or >. This prevents
stripping real command output like "100%", "<div>", or "item #".
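
The difference between a broad and an anchored prompt pattern can be sketched as follows. The regexes here are illustrative assumptions covering only two prompt shapes (user@host:path $ and PS C:\>); they are not the patterns in strategyHelpers.ts.

```typescript
// Broad: any line ending in $, #, % or > looks like a prompt (too permissive).
const BROAD_PROMPT_SKETCH = /[$#%>]\s*$/;

// Anchored: the whole line must have a recognizable prompt shape.
const ANCHORED_PROMPTS_SKETCH: readonly RegExp[] = [
	/^\S+@\S+:\S*\s*[$#]\s*$/, // user@host:path $
	/^PS [A-Za-z]:\\.*>\s*$/,  // PS C:\Users\me>
];

function isBroadPromptSketch(line: string): boolean {
	return BROAD_PROMPT_SKETCH.test(line);
}

function isAnchoredPromptSketch(line: string): boolean {
	return ANCHORED_PROMPTS_SKETCH.some(pattern => pattern.test(line));
}
```

With these definitions, output lines such as `100%`, `<div>`, and `item #` satisfy the broad check but not the anchored one, while `user@host:~/src $` satisfies both.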
alexdima added 13 commits March 21, 2026 18:35
In CI, ^C cancellations leave stale prompt fragments before the actual
command echo line. The leading-strip loop now continues scanning past
unmatched lines until it finds the command echo, instead of breaking
on the first non-matching line.
- Add trailing prompt patterns for hostname:path user$ (no @ sign)
- Handle wrapped prompt fragments like "er$" at line boundaries
- Add stripCommandEchoAndPrompt to RichExecuteStrategy marker fallback
- Context-aware wrapped prompt continuation detection
…tripping

- Add bubblewrap and socat to Linux CI apt-get install
- Make sandbox test assertions platform-aware (macFileSystem vs linuxFileSystem)
- Make /etc/shells test accept both macOS and Linux first-line format
- Broaden wrapped prompt fragment regex to handle path chars (ts/testWorkspace$)
- Fix continuation pattern to match user@host:path wrapped lines
- Apply stripCommandEchoAndPrompt to getOutput() in BasicExecuteStrategy
  (basic shell integration lacks reliable 133;C markers so getOutput()
  can include command echo)
- Keep RichExecuteStrategy getOutput() unstripped (rich integration
  has reliable markers)
…ssage

- Handle /usr/bin/bash (Linux) vs /bin/bash (macOS) in /tmp write test
- Handle 'Read-only file system' (Linux) vs 'Operation not permitted' (macOS)
- Add 'Read-only file system' to outputLooksSandboxBlocked heuristic
- Replace newlines with spaces (not empty) to handle terminal wrapping
- Extract outputLooksSandboxBlocked as exported function with unit tests
Add execPath to IRemoteAgentEnvironment so the server sends its actual
process.execPath to the client. The sandbox service now uses this instead
of hardcoding appRoot + '/node', which only works in production builds.
…dle partial command echoes

- setupRecreatingStartMarker returns IDisposable to stop marker recreation
  before sending commands (prevents marker jumping on PSReadLine re-renders)
- noneExecuteStrategy waits for cursor to move past start line after sendText
  before starting idle detection (prevents end marker at same line as start)
- findCommandEcho supports suffix matching for partial command echoes from
  wrapped getOutput() results (shell integration ON with long commands)
- Suffix matching requires mid-word split to avoid false positives on output
  that happens to be a suffix of the command (e.g. echo output)
- Integration tests: use ; separator on Windows, add && conversion test,
  handle Windows exit code quirks with cmd /c
@alexdima alexdima changed the title from "Fix: strip command echo and prompt from terminal output when commands produce no output" to "Fix terminal output capture: strip command echo/prompt, fix idle detection, and improve sandbox reliability" Mar 22, 2026
@alexdima alexdima changed the title from "Fix terminal output capture: strip command echo/prompt, fix idle detection, and improve sandbox reliability" to "Fix terminal output capture: strip command echo/prompt, fix premature idle detection, improve sandbox failure detection" Mar 22, 2026
@alexdima alexdima requested a review from Copilot March 22, 2026 00:12

Copilot AI left a comment

Pull request overview

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

- Strip sensitive data from debug logs (log metadata only)
- Use array join instead of O(n^2) string concat in stripNewLinesAndBuildMapping
- Add 5s timeout to cursor-move wait to prevent indefinite hangs
- Align shellIntegrationTimeout descriptions (0 = skip the wait)

These are required for terminal sandbox integration tests.

Shell integration cannot be injected into /bin/sh, causing loss of
exit code detection. This matches the existing cmd.exe -> powershell
override pattern.
@alexdima alexdima changed the title from "Fix terminal output capture: strip command echo/prompt, fix premature idle detection, improve sandbox failure detection" to "Fix terminal output capture: strip command echo/prompt, fix premature idle detection, improve sandbox failure detection, force bash over sh" Mar 22, 2026
@alexdima alexdima marked this pull request as ready for review March 22, 2026 01:41
@alexdima alexdima enabled auto-merge March 22, 2026 01:41
… lines

- Extend bracketed prompt patterns from isUnixAt to isUnix so prompts
  like [W007DV9PF9-1:~/path] are recognized (CI macOS prompt format)
- Cap trailing prompt stripping at 2 non-empty lines to prevent
  over-stripping legitimate output
- Add unit tests for bracketed prompt without @ format
@alexdima alexdima marked this pull request as draft March 22, 2026 08:23
auto-merge was automatically disabled March 22, 2026 08:23

Pull request was converted to draft

Split trailing prompt patterns into two categories:
- Complete prompts (user@host:~ $, PS C:\>, etc.) stop stripping
  immediately — anything above is command output, not a wrapped prompt
- Fragment patterns (er$, ] $, [host:~/path...) allow continued
  stripping to reassemble wrapped prompts

This prevents falsely stripping output lines that happen to end with
$ or # when a real complete prompt sits below them. Added adversarial
tests verifying correct behavior for output containing prompt-like
characters.
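
The two-category rule, together with the cap of 2 non-empty lines from the earlier commit, might look roughly like this. The patterns and the function `stripTrailingPromptSketch` are hypothetical simplifications; the real code narrows patterns by shell type and handles more prompt shapes.

```typescript
// Complete prompt: stripping stops right after removing it.
const COMPLETE_PROMPT_SKETCH = /^\S+@\S+:\S*\s*[$#]\s*$/; // user@host:~ $

// Fragments of a wrapped prompt: stripping may continue upward.
const FRAGMENT_PROMPTS_SKETCH: readonly RegExp[] = [
	/^\w{1,3}[$#]\s*$/, // tail of a wrapped prompt, e.g. "er$"
	/\]\s*[$#]\s*$/,    // closing part of a bracketed prompt, e.g. "th] $"
	/^\[\S+:\S*$/,      // opening part of a bracketed prompt, e.g. "[host:~/pa"
];

const MAX_STRIPPED_LINES_SKETCH = 2;

function stripTrailingPromptSketch(lines: string[]): string[] {
	const result = lines.slice();
	let stripped = 0;
	while (result.length > 0 && stripped < MAX_STRIPPED_LINES_SKETCH) {
		const last = result[result.length - 1] ?? '';
		if (last.trim() === '') {
			result.pop(); // trailing blank lines do not count toward the cap
			continue;
		}
		if (COMPLETE_PROMPT_SKETCH.test(last)) {
			result.pop();
			break; // anything above a complete prompt is command output
		}
		if (FRAGMENT_PROMPTS_SKETCH.some(pattern => pattern.test(last))) {
			result.pop();
			stripped++;
			continue; // keep going to reassemble a wrapped prompt
		}
		break;
	}
	return result;
}
```

In this sketch an output line such as `5$` survives when a complete prompt sits below it, because stripping stops as soon as the complete prompt is removed.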
@alexdima alexdima marked this pull request as ready for review March 22, 2026 10:19
@alexdima alexdima enabled auto-merge March 22, 2026 10:20
@alexdima alexdima merged commit be95b65 into main Mar 22, 2026
43 of 45 checks passed
@alexdima alexdima deleted the alexdima/fix-303531-sandbox-no-output-leak branch March 22, 2026 10:37

Development

Successfully merging this pull request may close these issues.

Commands with no output show the sandbox command instead of "no output"

3 participants