Refactor code interpreter tool #100
Pull request overview
Refactors the aieng-agents E2B CodeInterpreter tool API to separate sandbox lifetime, code execution timeout, and HTTP request timeout, while bumping the package to 0.3.0 and updating consumer dependency pins and examples.
Changes:
- Refactor `CodeInterpreter` constructor + execution flow (fresh sandbox per `run_code`, validation, structured JSON error outputs when enabled).
- Update integration tests to use `sandbox_timeout_seconds`.
- Bump `aieng-agents` to `0.3.0` and update monorepo dependency pins + README snippet.
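The validation mentioned in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: `TimeoutConfig`, `validate_str_dict`, the default values, and the rule that the code budget must fit inside the sandbox lifetime are all assumptions made for illustration. Only the three parameter names come from the PR.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class TimeoutConfig:
    """Hypothetical container for the three independent timeouts."""

    # Default values are illustrative assumptions, not the PR's defaults.
    sandbox_timeout_seconds: int = 300
    code_execution_timeout_seconds: float = 60.0
    request_timeout_seconds: float = 30.0

    def __post_init__(self) -> None:
        for name in (
            "sandbox_timeout_seconds",
            "code_execution_timeout_seconds",
            "request_timeout_seconds",
        ):
            value = getattr(self, name)
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value!r}")
        # Assumed rule: code cannot usefully run longer than the sandbox lives.
        if self.code_execution_timeout_seconds > self.sandbox_timeout_seconds:
            raise ValueError(
                "code_execution_timeout_seconds must not exceed "
                "sandbox_timeout_seconds"
            )


def validate_str_dict(name: str, value: Optional[Any]) -> dict:
    """Return a copy of `value` after checking it maps str keys to str values."""
    if value is None:
        return {}
    if not isinstance(value, dict):
        raise TypeError(f"{name} must be a dict, got {type(value).__name__}")
    for key, item in value.items():
        if not isinstance(key, str) or not isinstance(item, str):
            raise TypeError(f"{name} must map str to str, got {key!r}: {item!r}")
    return dict(value)
```

Failing fast in the constructor keeps a bad timeout or a non-string env value from surfacing later as an opaque sandbox error.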
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| aieng-agents/aieng/agents/tools/code_interpreter.py | Refactors API + timeout handling, adds validation and JSON error-return path, ensures sandbox cleanup. |
| aieng-agents/tests/tools/test_code_interpreter.py | Updates constructor usage for the breaking timeout parameter rename. |
| aieng-agents/README.md | Updates CodeInterpreter example (currently mismatched with new API/return type). |
| aieng-agents/pyproject.toml | Bumps package version to 0.3.0. |
| pyproject.toml | Updates dependency constraint to aieng-agents[all]>=0.3.0. |
| uv.lock | Locks aieng-agents to 0.3.0. |
    except TimeoutException as exc:
        if self.return_errors_as_json:
            return _failure_json(
                "ExecutionTimeout",
                f"{exc} (code read budget ~{self._code_execution_timeout_seconds:g}s; "
                f"sandbox VM up to ~{self.sandbox_timeout_seconds}s). Retry with less work "
                "per run—e.g. fewer downloads or model fits, smaller loops, or split logic "
                "across multiple tool calls (each run starts from a fresh sandbox).",
The new return_errors_as_json behavior (converting TimeoutException, httpx.TimeoutException, and httpx.RemoteProtocolError into structured CodeInterpreterOutput JSON) is not exercised by the existing tests in tests/tools/test_code_interpreter.py. Please add tests that force each failure mode (or mock AsyncSandbox.run_code) and assert the returned JSON has error.name set and remains parseable by CodeInterpreterOutput.
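One possible shape for such a test, sketched with stand-ins so it runs without E2B or httpx installed. Everything here is hypothetical: `FakeTimeoutSandbox` replaces a mocked `AsyncSandbox`, the built-in `TimeoutError` stands in for the three real exception types, and the JSON layout produced by `failure_json` is an assumed shape, not the actual `CodeInterpreterOutput` schema. A real test would parse the result with `CodeInterpreterOutput` instead of `json.loads`.

```python
import asyncio
import json


def failure_json(name: str, value: str) -> str:
    """Hypothetical stand-in for the PR's `_failure_json` helper."""
    # Assumed layout; the real CodeInterpreterOutput fields may differ.
    return json.dumps(
        {"stdout": [], "stderr": [], "error": {"name": name, "value": value}}
    )


class FakeTimeoutSandbox:
    """Stand-in sandbox whose run_code always times out."""

    async def run_code(self, code: str) -> str:
        raise TimeoutError("execution timed out")


async def run_with_json_errors(
    sandbox: FakeTimeoutSandbox, code: str, return_errors_as_json: bool = True
) -> str:
    """Convert a timeout into structured JSON, or re-raise when disabled."""
    try:
        return await sandbox.run_code(code)
    except TimeoutError as exc:
        if return_errors_as_json:
            return failure_json("ExecutionTimeout", str(exc))
        raise


def test_timeout_returns_parseable_json() -> None:
    raw = asyncio.run(run_with_json_errors(FakeTimeoutSandbox(), "while True: pass"))
    payload = json.loads(raw)
    assert payload["error"]["name"] == "ExecutionTimeout"


def test_timeout_raises_when_disabled() -> None:
    raised = False
    try:
        asyncio.run(
            run_with_json_errors(
                FakeTimeoutSandbox(), "while True: pass", return_errors_as_json=False
            )
        )
    except TimeoutError:
        raised = True
    assert raised
```

The same pattern could be repeated per failure mode, with one fake sandbox raising each exception type and one assertion on `error.name` for each.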
Summary
Refactors the E2B `CodeInterpreter` tool: clearer constructor parameters (sandbox vs code vs HTTP timeouts), stronger typing and validation, structured JSON errors for timeouts/stream failures when `return_errors_as_json=True`, and a fresh sandbox per `run_code` with cleanup in `finally`. Bumps aieng-agents to 0.3.0 and updates the monorepo dependency and README example accordingly.

Clickup Ticket(s): N/A
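The "fresh sandbox per `run_code` with cleanup in `finally`" pattern from the summary can be sketched as below. This is an illustration under stated assumptions: `FakeSandbox` and `run_in_fresh_sandbox` are hypothetical names, and the fake `kill` deliberately raises to show why the cleanup is wrapped in `contextlib.suppress`. The real tool uses E2B's `AsyncSandbox`, whose exact call signatures are not reproduced here.

```python
import asyncio
import contextlib


class FakeSandbox:
    """Hypothetical sandbox that counts creations and kills."""

    created = 0
    killed = 0

    @classmethod
    async def create(cls) -> "FakeSandbox":
        cls.created += 1
        return cls()

    async def run_code(self, code: str) -> str:
        if "boom" in code:
            raise RuntimeError("simulated failure")
        return f"ran: {code}"

    async def kill(self) -> None:
        type(self).killed += 1
        # Simulate a cleanup error, e.g. the sandbox has already expired.
        raise ConnectionError("sandbox already gone")


async def run_in_fresh_sandbox(code: str) -> str:
    """Create a sandbox for this call only and always attempt to kill it."""
    sandbox = await FakeSandbox.create()
    try:
        return await sandbox.run_code(code)
    finally:
        # A failing kill must not mask the result or the original exception.
        with contextlib.suppress(Exception):
            await sandbox.kill()
```

Because each call owns its sandbox, no state carries over between runs, which is why the timeout message in the diff suggests splitting work across multiple tool calls.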
Type of Change
Changes Made
- Replace `timeout_seconds` with explicit `sandbox_timeout_seconds`, `code_execution_timeout_seconds`, and `request_timeout_seconds`; validate ranges and document behavior in the class docstring.
- `AsyncSandbox.create` / `run_code` with template, envs, metadata, `allow_internet_access`, and httpx-style request timeouts.
- Add a `_failure_json` / `return_errors_as_json` path so `TimeoutException`, `httpx.TimeoutException`, and `httpx.RemoteProtocolError` can return `CodeInterpreterOutput` JSON instead of raising.
- Add `_validate_str_dict` for `envs` / `metadata`; tighten `CodeInterpreterOutput` typing (`**kwargs: Any`); use `contextlib.suppress` when killing the sandbox in `finally`.
- Update integration tests to use `sandbox_timeout_seconds=15`.
- Bump `aieng-agents` to `0.3.0` in `aieng-agents/pyproject.toml`, root `pyproject.toml`, and `uv.lock`; simplify the README `CodeInterpreter` snippet.

Testing
- `uv run pytest tests/`
- `uv run mypy <src_dir>`
- `uv run ruff check src_dir/`

Related Issues
Relates to consumers that instantiated `CodeInterpreter(timeout_seconds=...)`: they must migrate to `sandbox_timeout_seconds` (and optionally the new timeout knobs).

Deployment Notes
- Breaking change to the `CodeInterpreter` API.
- Depend on `aieng-agents[all]>=0.3.0` and update constructor kwargs.

Checklist