Add E2BSandboxToolset integration for Haystack agents #448
Introduces `E2BSandboxToolset`, a Haystack `Toolset` subclass that connects to an E2B cloud sandbox and exposes four tools to any Haystack Agent: `run_bash_command`, `read_file`, `write_file`, and `list_directory`.

Key design points:
- Sandbox connection is established lazily via `warm_up()`, which is called automatically by the Haystack pipeline/agent before the first tool invocation and is idempotent.
- `close()` shuts down the sandbox and releases resources.
- API key is managed via Haystack's `Secret` (defaults to the `E2B_API_KEY` environment variable).
- Full `to_dict` / `from_dict` serialisation support; the live sandbox instance is not serialised and is re-created on `warm_up()`.
- `e2b` added as an optional test dependency in `pyproject.toml`.
- 38 unit tests covering init, warm-up lifecycle, each tool operation, error handling, and round-trip serialisation.

https://claude.ai/code/session_01DwDqKPEtssXgxqEaArcXiN
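The lifecycle in the design points can be sketched roughly as follows. This is an illustration only, not the PR's code: `FakeSandbox`, `LazySandboxHolder`, and the injected `connect` factory are hypothetical stand-ins so the example runs without the `e2b` package or an API key, and `Secret` handling is left out.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


class FakeSandbox:
    """Hypothetical stand-in for a live E2B sandbox connection."""

    def __init__(self) -> None:
        self.alive = True

    def kill(self) -> None:
        self.alive = False


@dataclass
class LazySandboxHolder:
    """Illustrative only: mirrors the described lifecycle, not the PR's class."""

    sandbox_template: str
    timeout: int
    # Factory that opens the connection; injected so the sketch runs offline.
    connect: Callable[[], FakeSandbox] = FakeSandbox
    _sandbox: Optional[FakeSandbox] = field(default=None, init=False, repr=False)

    def warm_up(self) -> None:
        # Idempotent: a second call reuses the existing connection.
        if self._sandbox is None:
            self._sandbox = self.connect()

    def close(self) -> None:
        # Safe to call even if warm_up() never ran.
        if self._sandbox is not None:
            self._sandbox.kill()
            self._sandbox = None

    def to_dict(self) -> Dict[str, Any]:
        # Only configuration is serialised; the live connection is left out.
        return {"sandbox_template": self.sandbox_template, "timeout": self.timeout}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "LazySandboxHolder":
        # The restored object starts cold; warm_up() re-creates the connection.
        return cls(**data)
```

Under these assumptions, an object restored with `from_dict(holder.to_dict())` carries the same configuration but no connection until `warm_up()` is called again.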
Hey @tholor, thanks for working on this! Quick high-level question/suggestion: I wonder if we should consider creating separate pre-made tools for the four tools you mention.
Pull request overview
Adds an E2B-backed Haystack Toolset to allow Haystack agents/pipelines to execute bash commands and perform basic filesystem operations inside an E2B cloud sandbox, with lazy lifecycle management and serialization support.
Changes:
- Introduces `E2BSandboxToolset` with `warm_up()`/`close()` lifecycle, 4 tools, and `to_dict()`/`from_dict()` serialization.
- Adds unit tests covering lifecycle, tool behavior, error wrapping, and serialization round-trips.
- Adds `e2b` to the test environment’s optional dependencies.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `haystack_experimental/tools/e2b/sandbox_toolset.py` | Implements the new `E2BSandboxToolset` and its tools, lifecycle, and serialization. |
| `haystack_experimental/tools/e2b/__init__.py` | Exposes `E2BSandboxToolset` via lazy import structure. |
| `haystack_experimental/tools/__init__.py` | Introduces the tools package (license header). |
| `test/tools/e2b/test_sandbox_toolset.py` | Adds unit tests for initialization, lifecycle, tool calls, and serialization. |
| `test/tools/e2b/__init__.py` | Makes `test.tools.e2b` a package (license header). |
| `test/tools/__init__.py` | Makes `test.tools` a package (license header). |
| `pyproject.toml` | Adds `e2b` to test env extra dependencies. |
    def test_default_parameters(self):
        toolset = _make_toolset()
        assert toolset.sandbox_template == "base"
        assert toolset.timeout == 120
        assert toolset.environment_vars == {}
        assert toolset._sandbox is None
`test_default_parameters` isn’t actually validating the toolset’s real defaults because `_make_toolset()` always overrides `timeout` (120) and `sandbox_template` (`"base"`). This can let regressions slip through if the class defaults change. Consider instantiating `E2BSandboxToolset` directly (only passing `api_key`) for the default-parameters test, or make `_make_toolset()` not override non-essential defaults.
Good point, thanks! Let's do that.
As a starting point for the implementation, you could take a look at how we did this for the pre-made GitHub tools: https://github.com/deepset-ai/haystack-core-integrations/tree/main/integrations/github/src/haystack_integrations/tools/github
Address reviewer feedback (sjrl, tholor) to expose individual pre-made
Tool objects instead of a monolithic Toolset, so users can load any
subset of the four tools into their agent.
Changes:
- Replace E2BSandboxToolset (Toolset subclass) with E2BSandbox (plain
  dataclass) that manages the sandbox lifecycle (warm_up / close /
  to_dict / from_dict).
- Add four individual tool factory functions:
    create_run_bash_command_tool(sandbox)
    create_read_file_tool(sandbox)
    create_write_file_tool(sandbox)
    create_list_directory_tool(sandbox)
- Add create_e2b_tools() convenience factory that returns (sandbox, tools)
  so callers can pass any subset; all tools share the same E2BSandbox
  instance, preserving filesystem / process state across invocations.
- Update __init__.py to export the new public names.
- Rewrite tests to match the new API and fix the Copilot review comment:
  test_class_defaults now instantiates E2BSandbox with only api_key to
  validate the real class defaults rather than helper-overridden values.
https://claude.ai/code/session_01DwDqKPEtssXgxqEaArcXiN
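The factory pattern described in this commit might look roughly like the sketch below. It is not the PR's implementation: `InMemorySandbox` and `create_in_memory_tools` are hypothetical, the factories return plain functions rather than Haystack `Tool` objects, and a dict stands in for the remote filesystem so the example runs offline.

```python
from typing import Callable, Dict, List, Tuple


class InMemorySandbox:
    """Hypothetical stand-in for E2BSandbox: a dict plays the remote filesystem."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}


def create_write_file_tool(sandbox: InMemorySandbox) -> Callable[[str, str], str]:
    def write_file(path: str, content: str) -> str:
        sandbox.files[path] = content
        return f"wrote {len(content)} characters to {path}"

    return write_file


def create_read_file_tool(sandbox: InMemorySandbox) -> Callable[[str], str]:
    def read_file(path: str) -> str:
        return sandbox.files[path]

    return read_file


def create_list_directory_tool(sandbox: InMemorySandbox) -> Callable[[str], List[str]]:
    def list_directory(prefix: str) -> List[str]:
        # Treat the prefix as a directory path in the flat fake filesystem.
        return sorted(p for p in sandbox.files if p.startswith(prefix))

    return list_directory


def create_in_memory_tools() -> Tuple[InMemorySandbox, List[Callable]]:
    # Every tool closes over the same sandbox, so state written by one tool
    # is visible to the others.
    sandbox = InMemorySandbox()
    tools = [
        create_write_file_tool(sandbox),
        create_read_file_tool(sandbox),
        create_list_directory_tool(sandbox),
    ]
    return sandbox, tools
```

A caller who wants only a subset can filter the returned list, for example `[t for t in tools if t.__name__ != "write_file"]` for read-only access; the remaining tools still see the same sandbox.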
RunBashCommandTool, ReadFileTool, WriteFileTool, ListDirectoryTool now subclass haystack.tools.Tool directly. Users instantiate them with a shared E2BSandbox instance, mirroring how chat generators are passed to Agent. The create_e2b_tools() convenience function is kept and updated to use the new classes.
- e2b_sandbox.py: E2BSandbox
- bash_tool.py: RunBashCommandTool
- read_file_tool.py: ReadFileTool
- write_file_tool.py: WriteFileTool
- list_directory_tool.py: ListDirectoryTool
- sandbox_toolset.py: create_e2b_tools (convenience function only)
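As a rough illustration of the class-based variant, the sketch below passes one sandbox object into each tool's constructor. `ToolBase` is a hypothetical, minimal stand-in for `haystack.tools.Tool` (only `name`, `description`, and `invoke` are modelled), and `InMemorySandbox` is a made-up replacement for `E2BSandbox`; the real classes will differ.

```python
from typing import Any, Callable, Dict


class InMemorySandbox:
    """Hypothetical stand-in for E2BSandbox."""

    def __init__(self) -> None:
        self.files: Dict[str, str] = {}


class ToolBase:
    """Hypothetical minimal stand-in for haystack.tools.Tool."""

    def __init__(self, name: str, description: str, function: Callable[..., Any]) -> None:
        self.name = name
        self.description = description
        self.function = function

    def invoke(self, **kwargs: Any) -> Any:
        return self.function(**kwargs)


class WriteFileTool(ToolBase):
    def __init__(self, sandbox: InMemorySandbox) -> None:
        # The sandbox is injected, much like a chat generator is passed to an Agent.
        self.sandbox = sandbox
        super().__init__("write_file", "Write text to a file in the sandbox.", self._run)

    def _run(self, path: str, content: str) -> str:
        self.sandbox.files[path] = content
        return f"wrote {path}"


class ReadFileTool(ToolBase):
    def __init__(self, sandbox: InMemorySandbox) -> None:
        self.sandbox = sandbox
        super().__init__("read_file", "Read a text file from the sandbox.", self._run)

    def _run(self, path: str) -> str:
        return self.sandbox.files[path]
```

With this shape, a user builds one sandbox, instantiates only the tool classes they need with it, and hands that list to the agent.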
@tholor feel free to evaluate if it would make sense to directly create these new classes in haystack-core-integrations. We could create an E2B integration and release it with very experimental release numbers (e.g. 0.0.1). In case you go this route, we recently added a scaffolding script that makes life easier for contributors: https://github.com/deepset-ai/haystack-core-integrations/blob/main/CONTRIBUTING.md#create-a-new-integration
I agree that this fits better into haystack-core-integrations. We have an open issue for a similar integration, but with exec-sandbox: deepset-ai/haystack-core-integrations#2933