Refactor code interpreter tool #100

Merged
fcogidi merged 4 commits into main from refactor/code_interpreter on Apr 30, 2026

Conversation

@fcogidi (Collaborator) commented Apr 30, 2026

Summary

Refactors the E2B CodeInterpreter tool: clearer constructor parameters (sandbox vs code vs HTTP timeouts), stronger typing and validation, structured JSON errors for timeouts/stream failures when return_errors_as_json=True, and a fresh sandbox per run_code with cleanup in finally. Bumps aieng-agents to 0.3.0 and updates the monorepo dependency and README example accordingly.
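The "fresh sandbox per run_code with cleanup in finally" pattern can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FakeSandbox` is a hypothetical stand-in and does not reflect the E2B API. It only shows why the kill call is wrapped in `contextlib.suppress`.

```python
import asyncio
import contextlib


class FakeSandbox:
    """Hypothetical stand-in for a remote sandbox; not the E2B API."""

    instances = []  # records every sandbox created, for inspection

    def __init__(self) -> None:
        self.killed = False
        FakeSandbox.instances.append(self)

    async def run_code(self, code: str) -> str:
        if "boom" in code:
            raise RuntimeError("execution failed")
        return f"ran: {code}"

    async def kill(self) -> None:
        self.killed = True
        # Simulate cleanup failing, e.g. the sandbox already expired.
        raise ConnectionError("sandbox already gone")


async def run_code(code: str) -> str:
    sandbox = FakeSandbox()  # a fresh sandbox for every call, no shared state
    try:
        return await sandbox.run_code(code)
    finally:
        # Best-effort cleanup: a failed kill must not mask the result
        # or the original exception from run_code.
        with contextlib.suppress(Exception):
            await sandbox.kill()
```

Without the `suppress`, the `ConnectionError` raised during cleanup would replace both a successful return value and any execution error, which is presumably what the PR's `finally` block is guarding against.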

Clickup Ticket(s): N/A

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • 🔧 Refactoring (no functional changes)
  • ⚡ Performance improvement
  • 🧪 Test improvements
  • 🔒 Security fix

Changes Made

  • Replace timeout_seconds with explicit sandbox_timeout_seconds, code_execution_timeout_seconds, and request_timeout_seconds; validate ranges and document behavior in the class docstring.
  • Wire E2B AsyncSandbox.create / run_code with template, envs, metadata, allow_internet_access, and httpx-style request timeouts.
  • Add _failure_json / return_errors_as_json path so TimeoutException, httpx.TimeoutException, and httpx.RemoteProtocolError can return CodeInterpreterOutput JSON instead of raising.
  • Add _validate_str_dict for envs / metadata; tighten CodeInterpreterOutput typing (**kwargs: Any); use contextlib.suppress when killing the sandbox in finally.
  • Update integration tests to use sandbox_timeout_seconds=15.
  • Bump aieng-agents to 0.3.0 in aieng-agents/pyproject.toml, root pyproject.toml, and uv.lock; simplify the README CodeInterpreter snippet.
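As a rough illustration of the validation described in the list above, the sketch below checks timeout values and string-to-string mappings. The helper bodies are assumptions: `validate_str_dict` borrows its name from `_validate_str_dict` in this PR, but its actual implementation is not shown here. `validate_timeout` is a hypothetical name, and the "strictly positive" rule is an assumed range, not necessarily the one the PR enforces.

```python
from typing import Any, Mapping, Optional


def validate_timeout(name: str, value: Any) -> float:
    """Return value as a float, assuming timeouts must be positive numbers."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"{name} must be a number, got {type(value).__name__}")
    if value <= 0:
        raise ValueError(f"{name} must be > 0, got {value!r}")
    return float(value)


def validate_str_dict(name: str, value: Optional[Mapping[Any, Any]]) -> Optional[dict]:
    """Return a plain dict copy after checking every key and value is a str."""
    if value is None:
        return None
    if not isinstance(value, Mapping):
        raise TypeError(f"{name} must be a mapping, got {type(value).__name__}")
    for key, item in value.items():
        if not isinstance(key, str) or not isinstance(item, str):
            raise TypeError(f"{name} must map str to str, got {key!r}: {item!r}")
    return dict(value)
```

Failing at construction time keeps a bad `envs` or `metadata` value from surfacing later as an opaque error from the remote sandbox.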

Testing

  • Tests pass locally (uv run pytest tests/)
  • Type checking passes (uv run mypy <src_dir>)
  • Linting passes (uv run ruff check <src_dir>/)
  • Manual testing performed (describe below)

Related Issues

Affects consumers that instantiate CodeInterpreter(timeout_seconds=...): they must migrate to sandbox_timeout_seconds (and can optionally set the new timeout parameters).
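For callers that build constructor kwargs dynamically, the rename could be handled with a small shim like the one below. This is only a sketch: `migrate_code_interpreter_kwargs` is a hypothetical helper that is not part of aieng-agents, and it assumes the old `timeout_seconds` value maps directly onto `sandbox_timeout_seconds`.

```python
def migrate_code_interpreter_kwargs(kwargs: dict) -> dict:
    """Rename the legacy timeout_seconds key; hypothetical helper, not in aieng-agents."""
    migrated = dict(kwargs)  # copy so the caller's dict is left untouched
    if "timeout_seconds" in migrated:
        if "sandbox_timeout_seconds" in migrated:
            raise TypeError(
                "pass sandbox_timeout_seconds only; timeout_seconds was removed in 0.3.0"
            )
        migrated["sandbox_timeout_seconds"] = migrated.pop("timeout_seconds")
    return migrated
```

For example, `migrate_code_interpreter_kwargs({"timeout_seconds": 15})` yields `{"sandbox_timeout_seconds": 15}`, which matches the value the updated integration tests use.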

Deployment Notes

  • Semver: package version 0.3.0 reflects the breaking CodeInterpreter API.
  • Downstream apps should pin or bump aieng-agents[all]>=0.3.0 and update constructor kwargs.

Checklist

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Documentation updated (if applicable)
  • No sensitive information (API keys, credentials) exposed

@fcogidi fcogidi requested a review from Copilot April 30, 2026 19:23
@fcogidi fcogidi self-assigned this Apr 30, 2026
@fcogidi fcogidi changed the title from "Refactor/code interpreter" to "Refactor code interpreter tool" Apr 30, 2026

Copilot AI left a comment

Pull request overview

Refactors the aieng-agents E2B CodeInterpreter tool API to separate sandbox lifetime, code execution timeout, and HTTP request timeout, while bumping the package to 0.3.0 and updating consumer dependency pins and examples.

Changes:

  • Refactor CodeInterpreter constructor + execution flow (fresh sandbox per run_code, validation, structured JSON error outputs when enabled).
  • Update integration tests to use sandbox_timeout_seconds.
  • Bump aieng-agents to 0.3.0 and update monorepo dependency pins + README snippet.

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 5 comments.

File | Description
--- | ---
aieng-agents/aieng/agents/tools/code_interpreter.py | Refactors API + timeout handling, adds validation and JSON error-return path, ensures sandbox cleanup.
aieng-agents/tests/tools/test_code_interpreter.py | Updates constructor usage for the breaking timeout parameter rename.
aieng-agents/README.md | Updates CodeInterpreter example (currently mismatched with new API/return type).
aieng-agents/pyproject.toml | Bumps package version to 0.3.0.
pyproject.toml | Updates dependency constraint to aieng-agents[all]>=0.3.0.
uv.lock | Locks aieng-agents to 0.3.0.

Comment on lines +276 to +283
except TimeoutException as exc:
    if self.return_errors_as_json:
        return _failure_json(
            "ExecutionTimeout",
            f"{exc} (code read budget ~{self._code_execution_timeout_seconds:g}s; "
            f"sandbox VM up to ~{self.sandbox_timeout_seconds}s). Retry with less work "
            "per run—e.g. fewer downloads or model fits, smaller loops, or split logic "
            "across multiple tool calls (each run starts from a fresh sandbox).",

Copilot AI Apr 30, 2026

The new return_errors_as_json behavior (converting TimeoutException, httpx.TimeoutException, and httpx.RemoteProtocolError into structured CodeInterpreterOutput JSON) is not exercised by the existing tests in tests/tools/test_code_interpreter.py. Please add tests that force each failure mode (or mock AsyncSandbox.run_code) and assert the returned JSON has error.name set and remains parseable by CodeInterpreterOutput.
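A test along the lines requested here might look like the sketch below. It is an assumption-heavy illustration and does not import the real tool: `run_code` and `failure_json` are hypothetical stand-ins, the built-in `TimeoutError` substitutes for the E2B and httpx exception types, and the payload shape (`{"error": {"name": ..., "value": ...}}`) is assumed beyond the `error.name` field named in the comment.

```python
import asyncio
import json
from unittest.mock import AsyncMock


def failure_json(name: str, message: str) -> str:
    """Hypothetical stand-in for _failure_json; the real payload shape may differ."""
    return json.dumps({"error": {"name": name, "value": message}})


async def run_code(sandbox, code: str, return_errors_as_json: bool = True) -> str:
    """Simplified stand-in for the tool's run_code error handling."""
    try:
        return await sandbox.run_code(code)
    except TimeoutError as exc:  # built-in stand-in for the real timeout types
        if not return_errors_as_json:
            raise
        return failure_json("ExecutionTimeout", str(exc))


def check_timeout_is_reported_as_json() -> dict:
    # Force the failure mode by mocking the sandbox's run_code coroutine.
    sandbox = AsyncMock()
    sandbox.run_code.side_effect = TimeoutError("timed out")
    raw = asyncio.run(run_code(sandbox, "while True: pass"))
    payload = json.loads(raw)  # the output must stay parseable
    assert payload["error"]["name"] == "ExecutionTimeout"
    return payload
```

In the real suite the same idea would presumably patch `AsyncSandbox.run_code` and parse the result with `CodeInterpreterOutput`, with one case per exception type.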

Comment thread aieng-agents/README.md Outdated
Comment thread aieng-agents/README.md
Comment thread aieng-agents/aieng/agents/tools/code_interpreter.py
Comment thread aieng-agents/aieng/agents/tools/code_interpreter.py Outdated
@fcogidi fcogidi merged commit 22a2df3 into main Apr 30, 2026
7 checks passed
@fcogidi fcogidi deleted the refactor/code_interpreter branch April 30, 2026 19:39

2 participants