Chat: [Sub-]agent intervention #296867

@kfsone

Description

Type: Feature Request

Scenario: a sub-agent screws up, such as emitting convoluted and badly escaped shell commands to access something, which could be addressed by (a) writing a script file, (b) using the read-file tools, (c) some other sensible way, or (d) correct shell syntax.

Current Options:
1- Retry: if the current checkpoint is distant, this is agonizing, and there is no guarantee the agent won't make worse mistakes or discard preceding high-quality work.
2- Discuss: (a) this leaves the mistake in context and poisons attention, and many models become distracted, treating this last correction as "the goal"; (b) when the issue is with a sub-agent step/action, it usually confuses the top-level agent, which assumes responsibility, addresses the issue directly itself, and reports "done".

Proposal:
3- Intervene. Allow the user to reject the proposed action/command/change, etc., potentially work with a sub-agent on an alternative, and finally close with a summary so that the agent that owns the problem gets feedback.

Example 1: Agent comes up with a stupid approach to line counting

[Allow v] ... `cd x/y/z; dir; foreach ($csv in Get-ChildItem *.csv) { Get-Content $csv | ConvertFrom-CSV ... } | Measure ...`

"Allow" has an "Intervene" option, user posts:

I'd noted that several of the files are over 2GB. Parsing every line and then counting the converted objects would be pathological. Below is a list of the line counts of each file.
<pastes>

Example 2: Agent follows the trained-in pattern of mixing PowerShell and Unix commands, and is corrected.

I need to check if we have any network connectivity.

[Allow v] `get-netadapter -includehidden | where status -eq up | head -5`

User intervenes with a helper sub-agent:

"head" is not powershell, and since the agent is remote it must already have network access. Can you tell what the actual problem is or find a better way to determine it?

Over several steps they determine the problem was that the original agent had ignored the 255 exit code of the server. What the agent's context sees is:

I need to check if we have any network connectivity.
[user interrupt]: actually the server had terminated immediately with a 255 exit code when you started it. We fixed the crash, restarted the server, and confirmed it is reachable. You can resume testing. Here are the results of the previous test command you ran:
$ testrun
.... output ...

Pathological cases:

  • Agents running in VSCode on Windows predicting the trained-in misuses of PowerShell:
`Get-NetAdapter | Where Status -eq 'Up' | head -5`
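For reference, a minimal sketch of what the pure-PowerShell form of that query looks like, assuming the intent is "first five adapters that are up" (the Unix `head` is replaced by `Select-Object -First`):

```powershell
# Pure PowerShell: Select-Object -First replaces the Unix 'head'
Get-NetAdapter -IncludeHidden | Where-Object Status -eq 'Up' | Select-Object -First 5
```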

Sometimes agents go an entire workflow without this, but once it happens, either the agent believes that correcting the mistake was the goal and reports "done", or, like a Facebook engagement A/B test, the correction causes the model to repeat the mistake more often.

It shouldn't be surprising that corrections are a strong, ground-state pattern for all the current LLMs out there, and that leaving them in the context is toxic: it allows the network to find a strong correlation with specific types of unproductive back-and-forth, and it skews prediction toward them.

  • How-To execution (reading files)
    Explanatory data in training material rarely contains agent-appropriate instruction. How do you count the number of entries in a CSV file? If 'powershell' carries a heavy attention weight, most models love to go through ConvertFrom-Csv rather than just `measure -line`; it looks like there are several instances in the training data where someone demonstrated capturing through a variable, and in this context the models just skip that step:

```powershell
# training
$data = Get-Content yolo.csv | ConvertFrom-Csv
$data | Measure-Object | Select-Object -ExpandProperty Count

# in vscode
foreach ($csv in Get-ChildItem *.csv) { Get-Content $csv | ConvertFrom-Csv | Measure-Object | ... format the name + count } ...
```

instead of a simpler line-counting approach.
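A minimal sketch of that simpler approach, the one the intervention in Example 1 would push the agent toward (file names are hypothetical):

```powershell
# Count lines per CSV without parsing every row into objects.
# Measure-Object -Line counts raw text lines, so multi-gigabyte
# files are streamed instead of being materialized as PSCustomObjects.
foreach ($csv in Get-ChildItem *.csv) {
    $lines = (Get-Content $csv | Measure-Object -Line).Lines
    '{0}: {1}' -f $csv.Name, $lines
}
```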

VS Code version: Code 1.109.5 (0725862, 2026-02-19T19:43:32.382Z)
OS version: Windows_NT x64 10.0.26200
Modes:
