feat: sandbox shell execution abstraction #14949

Merged
lgrammel merged 54 commits into main from lg/7eWtPnS on May 7, 2026

@lgrammel lgrammel (Collaborator) commented May 4, 2026

Background

Many agents work with filesystems through shell and file read/write tools, often inside separate sandbox environments. This pattern is common enough to warrant a first-class sandbox abstraction.

Summary

  • add Sandbox type
  • add sandbox option to generateText, streamText, ToolLoopAgent
  • make sandbox available in ToolExecutionOptions
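
To make the shape of the new type concrete, here is a minimal sketch inferred only from the examples in this description, which use `sandbox.description` and `sandbox.executeCommand({ command })`. The names `SandboxSketch`, `CommandResultSketch`, and `EchoSandbox` are hypothetical, and the result fields are an assumption, since the description does not say what `executeCommand` returns.

```typescript
// Hypothetical result shape; the PR does not state what executeCommand returns.
interface CommandResultSketch {
  stdout: string;
  stderr: string;
  exitCode: number;
}

// Only `description` and `executeCommand({ command })` appear in the PR examples.
interface SandboxSketch {
  description: string;
  executeCommand(options: { command: string }): Promise<CommandResultSketch>;
}

// In-memory stand-in that understands a single command, `echo`.
class EchoSandbox implements SandboxSketch {
  readonly description = 'in-memory sandbox that only supports echo';

  async executeCommand({
    command,
  }: {
    command: string;
  }): Promise<CommandResultSketch> {
    const match = /^echo\s+(.*)$/.exec(command.trim());
    if (match === null) {
      // Mimic a shell's "command not found" exit code.
      return {
        stdout: '',
        stderr: `unsupported command: ${command}`,
        exitCode: 127,
      };
    }
    return { stdout: `${match[1]}\n`, stderr: '', exitCode: 0 };
  }
}
```

A stand-in like this can be handy in unit tests for tools that only need something sandbox-shaped to call.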

Example

Tool definition:

import { tool } from 'ai';
import { z } from 'zod';

export function sandboxShellTool() {
  return tool({
    description: 'Run a shell command',
    inputSchema: z.object({
      command: z.string(),
    }),

    execute: async ({ command }, { sandbox }) => {
      // TODO figure out type inference to turn the runtime error into a type error
      if (!sandbox) {
        throw new Error('Sandbox is not available');
      }
      return sandbox.executeCommand({ command });
    },
  });
}
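
Until the TODO in the tool definition is resolved, the runtime check could be centralized in a small assertion function so that every sandbox tool fails with the same message. This is only a sketch: `assertSandbox`, `runShell`, and `SandboxLike` are hypothetical names, not part of the `ai` package, and the check is still a runtime error, not a type error.

```typescript
// Hypothetical minimal view of a sandbox; only the method used here is modeled.
interface SandboxLike {
  executeCommand(options: { command: string }): Promise<unknown>;
}

// Assertion function: after it returns, TypeScript treats `sandbox` as defined.
function assertSandbox<T extends SandboxLike>(
  sandbox: T | undefined,
  toolName: string,
): asserts sandbox is T {
  if (sandbox === undefined) {
    throw new Error(
      `Tool "${toolName}" requires a sandbox, but none was provided`,
    );
  }
}

// Stand-in for a tool's execute function.
async function runShell(
  command: string,
  options: { sandbox?: SandboxLike },
): Promise<unknown> {
  const { sandbox } = options;
  assertSandbox(sandbox, 'shell');
  // `sandbox` is narrowed to SandboxLike from here on.
  return sandbox.executeCommand({ command });
}
```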

Agent definition:

import { openai } from '@ai-sdk/openai';
import { ToolLoopAgent } from 'ai';
import { sandboxShellTool } from '../../tools/sandbox-shell-tool';

export const sandboxAgent = new ToolLoopAgent({
  model: openai('gpt-5.5'),

  tools: {
    shell: sandboxShellTool(),
  },

  prepareCall: ({ sandbox, ...rest }) => ({
    ...rest,
    instructions:
      `You are a helpful assistant that can run shell commands.\n` +
      `You are operating in the following sandbox: ${sandbox?.description}`,
  }),
});

Agent call:

import { Bash } from 'just-bash';
import { JustBashSandbox } from '../../sandbox/just-bash-sandbox';
import { sandboxAgent } from './sandbox-agent';

const sandbox = new JustBashSandbox(
  new Bash({
    cwd: '/home/user',
  }),
);

const result = await sandboxAgent.stream({
  prompt:
    'Create a file named greeting.txt with a short greeting, then list the files and show the file contents.',
  sandbox,
});

Manual Verification

  • agent - generate src/agent/openai/generate-local-sandbox
  • agent - stream src/agent/openai/stream-local-sandbox
  • test with ui examples/ai-e2e-next/app/chat/sandbox/page.tsx

Future Work

  • read/write files in sandboxes
  • explore generic typing of sandbox tools
  • move sandbox shell tool into ai package
  • allow choosing a sandbox in prepareStep
  • add timeout to executeCommand
  • figure out sandbox streaming
  • add an onFinalize callback (or similar) that is always invoked (on finish, abort, error, etc.)
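
As an illustration of the timeout item, a caller can already bound how long it waits by racing the command promise against a timer. This is a generic sketch, not the planned API: `withTimeout` is a hypothetical helper, and it only stops waiting. It does not terminate the command inside the sandbox.

```typescript
// Reject if `promise` has not settled within `ms` milliseconds.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Command timed out after ${ms} ms`)),
      ms,
    );
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

Usage would look like `await withTimeout(sandbox.executeCommand({ command }), 30_000)`. A built-in timeout would be preferable because the sandbox itself can stop the underlying process.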

@lgrammel lgrammel changed the title from "sandbox prototyping" to "feat: sandbox abstraction" May 4, 2026
@lgrammel lgrammel changed the title from "feat: sandbox abstraction" to "feat: sandbox shell execution abstraction" May 4, 2026
Comment thread packages/ai/src/agent/tool-loop-agent.ts
Comment thread examples/ai-e2e-next/agent/openai/sandbox-agent.ts
Comment thread content/docs/03-agents/02-building-agents.mdx
@lgrammel lgrammel merged commit 3015fc3 into main May 7, 2026
19 checks passed
@lgrammel lgrammel deleted the lg/7eWtPnS branch May 7, 2026 09:18
lgrammel added a commit that referenced this pull request May 7, 2026
)

## Background

We introduced a sandbox abstraction in #14949.

When possible, provider-defined tools should use it automatically
unless the user provides a custom execution function.

## Summary

Use the sandbox by default in Anthropic bash tools.

## Example

```ts
const result = await generateText({
  model: anthropic('claude-opus-4-7'),
  tools: {
    bash: anthropic.tools.bash_20250124(),
  },
  sandbox: new LocalSandbox({
    rootDirectory: `${process.env.HOME}/Downloads`,
  }),
  stopWhen: isStepCount(2),
  prompt: 'List the files in my home directory.',
});
```
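
The fallback rule from the Background (a custom execution function wins, otherwise the sandbox is used) can be sketched as follows. This is not the provider's actual implementation: `resolveBashExecute`, `BashExecute`, and `SandboxLike` are hypothetical names, and the tool input is simplified to a single `command` field.

```ts
// Hypothetical minimal view of a sandbox.
interface SandboxLike {
  executeCommand(options: { command: string }): Promise<unknown>;
}

type BashInput = { command: string };
type BashExecute = (
  input: BashInput,
  options: { sandbox?: SandboxLike },
) => Promise<unknown>;

// Prefer the user's execute function; otherwise delegate to the sandbox.
function resolveBashExecute(customExecute?: BashExecute): BashExecute {
  if (customExecute !== undefined) {
    return customExecute;
  }
  return async ({ command }, { sandbox }) => {
    if (sandbox === undefined) {
      throw new Error(
        'bash tool needs either a sandbox or a custom execute function',
      );
    }
    return sandbox.executeCommand({ command });
  };
}
```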

## Manual Verification

- [x] run and verify
`examples/ai-functions/src/generate-text/anthropic/bash-tool.ts`

## Future Work

- sandbox type safety
- apply to bash tools from other providers

## Related Issues

Builds upon #14949
lgrammel added a commit that referenced this pull request May 7, 2026
## Background

We introduced a `Sandbox` abstraction in #14949.

It should be possible to change/determine the sandbox in `prepareStep`.

## Summary

Allow changing the sandbox in `prepareStep`.
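
One way to picture the per-step override is sketched below. Everything here is an assumption for illustration: `PrepareStepSketch` is a heavily simplified stand-in for the real `prepareStep` signature, `sandboxForStep` is a hypothetical helper, and the sketch assumes an override applies only to the step that returned it.

```ts
// Hypothetical minimal view of a sandbox.
interface SandboxLike {
  description: string;
}

// Simplified stand-in: the real prepareStep receives and returns more fields.
type PrepareStepSketch = (options: {
  stepNumber: number;
  sandbox: SandboxLike;
}) => { sandbox?: SandboxLike } | undefined;

// Pick the sandbox for one step: the override if present, else the initial one.
function sandboxForStep(
  stepNumber: number,
  initial: SandboxLike,
  prepareStep?: PrepareStepSketch,
): SandboxLike {
  const override = prepareStep?.({ stepNumber, sandbox: initial });
  return override?.sandbox ?? initial;
}
```

For example, a `prepareStep` that returns `undefined` for step 0 and `{ sandbox: scratch }` afterwards would keep the initial sandbox for the first step and switch to `scratch` for later ones.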

## Related Issues

Follow up from #14949
lgrammel added a commit that referenced this pull request May 8, 2026
… execution (#15123)

## Background
We introduced a Sandbox abstraction for bash execution in #14949.

## Summary
Add `workingDirectory` option to `executeCommand`

## Related Issues
Builds on #14949
lgrammel added a commit that referenced this pull request May 8, 2026
…ion (#15124)

## Background
We introduced a Sandbox abstraction for bash execution in #14949.

## Summary
Add `abortSignal` option to `executeCommand`

## Related Issues
Builds on #14949
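
For a sandbox that runs commands on the local machine, the `workingDirectory` and `abortSignal` options from #15123 and #15124 map naturally onto the `cwd` and `signal` options of Node's `child_process.exec`. The sketch below assumes that mapping. `executeCommandLocally` and `ExecuteCommandOptions` are hypothetical names, and the real `LocalSandbox` may be implemented differently.

```ts
import { exec } from 'node:child_process';
import { promisify } from 'node:util';

const execAsync = promisify(exec);

// Option names taken from the two commit summaries; the exact types are assumed.
interface ExecuteCommandOptions {
  command: string;
  workingDirectory?: string;
  abortSignal?: AbortSignal;
}

async function executeCommandLocally({
  command,
  workingDirectory,
  abortSignal,
}: ExecuteCommandOptions): Promise<{ stdout: string; stderr: string }> {
  // `cwd` sets the directory the command runs in. Aborting `signal` kills the
  // child process and rejects the promise. A non-zero exit code also rejects.
  const { stdout, stderr } = await execAsync(command, {
    cwd: workingDirectory,
    signal: abortSignal,
  });
  return { stdout, stderr };
}
```
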
felixarntz added a commit that referenced this pull request May 18, 2026
… wrappers to `Experimental_Sandbox` abstraction (#15345)

## Background

`Experimental_Sandbox` (previous: #14949, #15253, #15301) only exposed
`description` and `runCommand`, so tools that needed file I/O had to
wrap every read/write in a shell command (`cat`, `tee`, `echo > …`).
That is fragile for binary content and impossible to type properly.

## Summary

- Adds six file methods to `Experimental_Sandbox`: streaming
`readFile`/`writeFile` as the foundation, plus
`readBinaryFile`/`readTextFile` and `writeBinaryFile`/`writeTextFile` as
convenience wrappers.
- All methods take a single options object so additional fields can be
added without breaking the signature.
- A stream is the lowest-level and most future-proof primitive for the
foundation, and it handles large files better.
- In a follow up PR, we'll add another reduced abstraction surface,
because technically a sandbox provider shouldn't have to implement the
convenience wrappers `readTextFile`, `writeTextFile`, etc. This can take
inspiration from (or continue with) #15311.
- Updates the three example sandbox implementations (`LocalSandbox`,
`JustBashSandbox`, both `VercelSandbox` copies) to implement the new
methods, using streaming `readFile`/`writeFile` as the foundation that
the binary and text variants delegate to.

## Checklist

- [x] All commits are signed (PRs with unsigned commits cannot be
merged)
- [x] Tests have been added / updated (for bug fixes / features)
- [x] Documentation has been added / updated (for bug fixes / features)
- [x] A _patch_ changeset for relevant packages has been added (for bug
fixes / features - run `pnpm changeset` in the project root)
- [x] I have reviewed this pull request (self-review)

## Future Work

See above: We'll need to add that reduced abstraction surface so that
only essential methods _have_ to be implemented by the provider. We can
provide a helper function to fill in the convenience wrappers
automatically.
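
A helper of that kind might look roughly like the sketch below, which derives the binary and text wrappers from the two streaming methods. The option shapes (`path`, `content`) and the names `StreamingFiles`, `collectBytes`, `streamFromBytes`, and `withConvenienceWrappers` are assumptions for illustration. The actual `Experimental_Sandbox` signatures may differ.

```ts
// Hypothetical streaming foundation; the real method signatures may differ.
interface StreamingFiles {
  readFile(options: { path: string }): Promise<ReadableStream<Uint8Array>>;
  writeFile(options: {
    path: string;
    content: ReadableStream<Uint8Array>;
  }): Promise<void>;
}

// Drain a byte stream into one contiguous buffer.
async function collectBytes(
  stream: ReadableStream<Uint8Array>,
): Promise<Uint8Array> {
  const reader = stream.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    chunks.push(result.value);
    total += result.value.byteLength;
  }
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

// Wrap a buffer in a single-chunk stream.
function streamFromBytes(bytes: Uint8Array): ReadableStream<Uint8Array> {
  return new ReadableStream<Uint8Array>({
    start(controller) {
      controller.enqueue(bytes);
      controller.close();
    },
  });
}

// Derive the four convenience wrappers from the two streaming methods.
function withConvenienceWrappers(base: StreamingFiles) {
  const readBinaryFile = async (options: { path: string }) =>
    collectBytes(await base.readFile(options));
  const writeBinaryFile = (options: { path: string; content: Uint8Array }) =>
    base.writeFile({
      path: options.path,
      content: streamFromBytes(options.content),
    });
  return {
    // Delegate explicitly so methods defined on a class prototype keep working.
    readFile: (options: { path: string }) => base.readFile(options),
    writeFile: (options: {
      path: string;
      content: ReadableStream<Uint8Array>;
    }) => base.writeFile(options),
    readBinaryFile,
    writeBinaryFile,
    readTextFile: async (options: { path: string }) =>
      new TextDecoder().decode(await readBinaryFile(options)),
    writeTextFile: (options: { path: string; content: string }) =>
      writeBinaryFile({
        path: options.path,
        content: new TextEncoder().encode(options.content),
      }),
  };
}
```

With a helper like this, a provider would implement only `readFile` and `writeFile`. The text variants in this sketch assume UTF-8, which is what `TextEncoder` always produces.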