Add E2BSandboxToolset integration for Haystack agents by tholor · Pull Request #448 · deepset-ai/haystack-experimental

tholor · 2026-03-11T07:46:07Z

Introduces `E2BSandboxToolset`, a Haystack `Toolset` subclass that connects to an E2B cloud sandbox and exposes four tools to any Haystack Agent: `run_bash_command`, `read_file`, `write_file`, and `list_directory`.

Key design points:

- Sandbox connection is established lazily via `warm_up()`, which is called automatically by the Haystack pipeline/agent before the first tool invocation and is idempotent.
- `close()` shuts down the sandbox and releases resources.
- API key is managed via Haystack's `Secret` (defaults to the `E2B_API_KEY` environment variable).
- Full `to_dict` / `from_dict` serialisation support; the live sandbox instance is not serialised and is re-created on `warm_up()`.
- `e2b` added as an optional test dependency in `pyproject.toml`.
- 38 unit tests covering init, warm-up lifecycle, each tool operation, error handling, and round-trip serialisation.

https://claude.ai/code/session_01DwDqKPEtssXgxqEaArcXiN
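
The lifecycle described in the design points above (lazy and idempotent `warm_up()`, `close()`, and serialisation that skips the live sandbox) can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `FakeSandbox` and `LazySandboxHolder` are hypothetical stand-ins, the default values are placeholders, and the `Secret`-based API key handling is left out so the snippet runs without `haystack` or `e2b` installed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


class FakeSandbox:
    """Hypothetical stand-in for a live E2B sandbox connection."""

    instances_created = 0  # counts how many connections were opened

    def __init__(self, template: str, timeout: int) -> None:
        FakeSandbox.instances_created += 1
        self.template = template
        self.timeout = timeout
        self.closed = False

    def kill(self) -> None:
        self.closed = True


@dataclass
class LazySandboxHolder:
    """Illustrative lifecycle holder (assumption: not the PR's actual class)."""

    sandbox_template: str = "base"  # placeholder default
    timeout: int = 120  # placeholder default
    environment_vars: Dict[str, str] = field(default_factory=dict)
    _sandbox: Optional[FakeSandbox] = field(default=None, init=False, repr=False)

    def warm_up(self) -> None:
        # Idempotent: a second call reuses the existing connection.
        if self._sandbox is None:
            self._sandbox = FakeSandbox(self.sandbox_template, self.timeout)

    def close(self) -> None:
        # Safe to call even if warm_up() never ran.
        if self._sandbox is not None:
            self._sandbox.kill()
            self._sandbox = None

    def to_dict(self) -> Dict[str, Any]:
        # Only configuration is serialised; the live sandbox is left out.
        return {
            "sandbox_template": self.sandbox_template,
            "timeout": self.timeout,
            "environment_vars": dict(self.environment_vars),
        }

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "LazySandboxHolder":
        # The restored object starts cold; warm_up() re-creates the sandbox.
        return cls(**data)
```

In this sketch a holder restored with `from_dict` carries the same configuration but no connection until `warm_up()` is called again.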

sjrl · 2026-03-11T08:05:45Z

Hey @tholor thanks for working on this!

Quick high-level question/suggestion. I wonder if we should consider creating separate pre-made tools for the four tools you mention: run_bash_command, read_file, write_file, and list_directory. This would in theory allow for the most customization (e.g. some users might only want to use a subset of the tools) instead of requiring all four tools to always be loaded into the Agent with the custom Toolset class.

Copilot

Pull request overview

Adds an E2B-backed Haystack Toolset to allow Haystack agents/pipelines to execute bash commands and perform basic filesystem operations inside an E2B cloud sandbox, with lazy lifecycle management and serialization support.

Changes:

- Introduces `E2BSandboxToolset` with `warm_up()`/`close()` lifecycle, 4 tools, and `to_dict()`/`from_dict()` serialization.
- Adds unit tests covering lifecycle, tool behavior, error wrapping, and serialization round-trips.
- Adds `e2b` to the test environment's optional dependencies.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `haystack_experimental/tools/e2b/sandbox_toolset.py` | Implements the new `E2BSandboxToolset` and its tools, lifecycle, and serialization. |
| `haystack_experimental/tools/e2b/__init__.py` | Exposes `E2BSandboxToolset` via lazy import structure. |
| `haystack_experimental/tools/__init__.py` | Introduces the `tools` package (license header). |
| `test/tools/e2b/test_sandbox_toolset.py` | Adds unit tests for initialization, lifecycle, tool calls, and serialization. |
| `test/tools/e2b/__init__.py` | Makes `test.tools.e2b` a package (license header). |
| `test/tools/__init__.py` | Makes `test.tools` a package (license header). |
| `pyproject.toml` | Adds `e2b` to test env extra dependencies. |

Copilot · 2026-03-11T08:06:08Z

test/tools/e2b/test_sandbox_toolset.py

+    def test_default_parameters(self):
+        toolset = _make_toolset()
+        assert toolset.sandbox_template == "base"
+        assert toolset.timeout == 120
+        assert toolset.environment_vars == {}
+        assert toolset._sandbox is None


test_default_parameters isn’t actually validating the toolset’s real defaults because _make_toolset() always overrides timeout (120) and sandbox_template ("base"). This can let regressions slip through if the class defaults change. Consider instantiating E2BSandboxToolset directly (only passing api_key) for the default-parameters test, or make _make_toolset() not override non-essential defaults.
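
The pitfall described in this review comment can be reproduced in isolation. The sketch below is not the PR's test code: `SandboxConfig` and `make_config` are hypothetical stand-ins for the class under test and the `_make_toolset()` helper, and the default values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SandboxConfig:
    """Hypothetical stand-in for the class under test; defaults are made up."""

    api_key: str
    sandbox_template: str = "base"
    timeout: int = 120


def make_config(**overrides) -> SandboxConfig:
    """Test helper that pins values, similar in spirit to `_make_toolset()`."""
    params = {"api_key": "test-key", "sandbox_template": "base", "timeout": 120}
    params.update(overrides)
    return SandboxConfig(**params)


def test_defaults_via_helper() -> None:
    # Weak: still passes if the class defaults change, because the helper
    # always passes timeout and sandbox_template explicitly.
    cfg = make_config()
    assert cfg.timeout == 120
    assert cfg.sandbox_template == "base"


def test_class_defaults() -> None:
    # Stronger: only the required argument is passed, so the assertions
    # exercise the defaults declared on the class itself.
    cfg = SandboxConfig(api_key="test-key")
    assert cfg.timeout == 120
    assert cfg.sandbox_template == "base"
```

If the class default for `timeout` were changed, only `test_class_defaults` would fail, which is the regression signal the review comment asks for.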

tholor · 2026-03-11T08:24:38Z

> Hey @tholor thanks for working on this!
>
> Quick high-level question/suggestion. I wonder if we should consider creating separate pre-made tools for the four tools you mention: run_bash_command, read_file, write_file, and list_directory. This would in theory allow for the most customization (e.g. some users might only want to use a subset of the tools) instead of requiring all four tools to always be loaded into the Agent with the custom Toolset class.

good point, thanks! let's do that

sjrl · 2026-03-11T08:28:31Z

> > Hey @tholor thanks for working on this!
> >
> > Quick high-level question/suggestion. I wonder if we should consider creating separate pre-made tools for the four tools you mention: run_bash_command, read_file, write_file, and list_directory. This would in theory allow for the most customization (e.g. some users might only want to use a subset of the tools) instead of requiring all four tools to always be loaded into the Agent with the custom Toolset class.
>
> good point, thanks! let's do that

As a starting point for the implementation you could take a look at how we did this for pre-made GitHub tools https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/github/src/haystack_integrations/tools/github

Address reviewer feedback (sjrl, tholor) to expose individual pre-made Tool objects instead of a monolithic Toolset, so users can load any subset of the four tools into their agent.

Changes:

- Replace `E2BSandboxToolset` (Toolset subclass) with `E2BSandbox` (plain dataclass) that manages the sandbox lifecycle (`warm_up` / `close` / `to_dict` / `from_dict`).
- Add four individual tool factory functions:
  - `create_run_bash_command_tool(sandbox)`
  - `create_read_file_tool(sandbox)`
  - `create_write_file_tool(sandbox)`
  - `create_list_directory_tool(sandbox)`
- Add `create_e2b_tools()` convenience factory that returns `(sandbox, tools)` so callers can pass any subset; all tools share the same `E2BSandbox` instance, preserving filesystem / process state across invocations.
- Update `__init__.py` to export the new public names.
- Rewrite tests to match the new API and fix the Copilot review comment: `test_class_defaults` now instantiates `E2BSandbox` with only `api_key` to validate the real class defaults rather than helper-overridden values.

https://claude.ai/code/session_01DwDqKPEtssXgxqEaArcXiN

`RunBashCommandTool`, `ReadFileTool`, `WriteFileTool`, `ListDirectoryTool` now subclass `haystack.tools.Tool` directly. Users instantiate them with a shared `E2BSandbox` instance, mirroring how chat generators are passed to Agent. The `create_e2b_tools()` convenience function is kept and updated to use the new classes.

- `e2b_sandbox.py`: `E2BSandbox`
- `bash_tool.py`: `RunBashCommandTool`
- `read_file_tool.py`: `ReadFileTool`
- `write_file_tool.py`: `WriteFileTool`
- `list_directory_tool.py`: `ListDirectoryTool`
- `sandbox_toolset.py`: `create_e2b_tools` (convenience function only)
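
The shared-sandbox design in these commit messages can be illustrated without Haystack or E2B. The sketch below is an assumption-laden stand-in, not the PR's implementation: `InMemorySandbox`, `WriteFile`, `ReadFile`, `ListDirectory`, and `create_tools` are hypothetical names, and the real tools subclass `haystack.tools.Tool` and talk to a remote sandbox rather than a dictionary.

```python
from typing import Dict, List, Tuple


class InMemorySandbox:
    """Hypothetical stand-in for E2BSandbox: state shared by all tools."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}


class WriteFile:
    """Illustrative tool that writes into the shared sandbox."""

    name = "write_file"

    def __init__(self, sandbox: InMemorySandbox) -> None:
        self.sandbox = sandbox

    def invoke(self, path: str, content: str) -> str:
        self.sandbox.files[path] = content
        return f"wrote {len(content)} characters to {path}"


class ReadFile:
    """Illustrative tool that reads from the shared sandbox."""

    name = "read_file"

    def __init__(self, sandbox: InMemorySandbox) -> None:
        self.sandbox = sandbox

    def invoke(self, path: str) -> str:
        return self.sandbox.files[path]


class ListDirectory:
    """Illustrative tool that lists paths under a directory prefix."""

    name = "list_directory"

    def __init__(self, sandbox: InMemorySandbox) -> None:
        self.sandbox = sandbox

    def invoke(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        return sorted(p for p in self.sandbox.files if p.startswith(prefix))


def create_tools() -> Tuple[InMemorySandbox, list]:
    """Convenience factory: one sandbox, every tool bound to it."""
    sandbox = InMemorySandbox()
    return sandbox, [WriteFile(sandbox), ReadFile(sandbox), ListDirectory(sandbox)]
```

Because every tool holds a reference to the same sandbox object, a file written through one tool is visible to the others, and a caller can hand an agent only a subset, for example `[t for t in tools if t.name != "write_file"]`.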

anakin87 · 2026-03-11T09:05:52Z

@tholor feel free to evaluate if it would make sense to directly create these new classes in haystack-core-integrations. We could create an E2B integration and release it with very experimental release numbers (e.g. 0.0.1).

In case you go this route, we recently added a scaffolding script that makes life easier for contributors: https://github.com/deepset-ai/haystack-core-integrations/blob/main/CONTRIBUTING.md#create-a-new-integration

julian-risch · 2026-03-16T14:58:51Z

I agree that this fits better into haystack-core-integrations. We have an open issue for a similar integration, but with exec-sandbox: deepset-ai/haystack-core-integrations#2933
Related to our earlier conversation, Malte, I looked into agentfs for isolated editing of files for agents. I had in mind something similar to https://www.llamaindex.ai/blog/making-coding-agents-safe-using-llamaindex and worked on a draft locally, but it doesn't cover run_bash_command, so I'll pause that work.

tholor requested a review from a team as a code owner March 11, 2026 07:46

tholor requested review from anakin87 and removed request for a team March 11, 2026 07:46

claude added 2 commits March 11, 2026 07:49

fix: D205 docstring summary line for E2BSandboxToolset

ca293f9

https://claude.ai/code/session_01DwDqKPEtssXgxqEaArcXiN

fix: apply ruff format to sandbox_toolset.py

d053ff6

https://claude.ai/code/session_01DwDqKPEtssXgxqEaArcXiN

tholor requested a review from Copilot March 11, 2026 07:56

Copilot started reviewing on behalf of tholor March 11, 2026 07:57

tholor removed the request for review from anakin87 March 11, 2026 08:01

Copilot AI reviewed Mar 11, 2026

claude added 2 commits March 11, 2026 08:31

Add E2B agent example script demonstrating shared sandbox across tools

2a061a6

tholor marked this pull request as draft March 11, 2026 08:53

claude added 3 commits March 11, 2026 08:57

Move e2b_agent_example.py into haystack_experimental/tools/e2b/

4e2491e

tholor added 2 commits March 11, 2026 18:06

adjust sandbox class and example script

714aa5c

fix serialization

4ab141a

tholor wants to merge 10 commits into main from claude/e2b-sandbox-integration-3cWWo

6 participants
