# 🤖 Welcome to Your Coding Agent Sandbox

You're about to build a coding agent that generates and executes code end to end. The catch?
It's open-ended: choose a domain (data analyzer, web app builder, or invent your own) and let your agent write code autonomously.

### 🎯 By the end of this project, you will:

- 🤖 **Build an autonomous coding agent** that uses LLM tool calling and iterative reasoning to solve multi-step tasks
- ☁️ **Integrate E2B cloud sandboxes** to safely execute untrusted code in isolated environments
- ✨ **Leverage AI-assisted development** with JupyterAI to generate agent code from natural language prompts

## What you'll build

By the end of this project, your coding agent will autonomously generate and execute code in a secure cloud sandbox. Here are two example outcomes:

<table>
<tr>
<td width="50%" valign="top">
<strong>üìä Data Analyzer Agent</strong><br/>
Generates synthetic datasets, performs statistical analysis, and creates visualizations
<br/><br/>
<img src="images/histogram.png" width="90%" style="max-height: 300px; object-fit: contain;" alt="Data Analyzer Output" />
</td>
<td width="50%" valign="top">
<strong>üåê Web Builder Agent</strong><br/>
Creates interactive web applications with HTML, CSS, and JavaScript
<br/><br/>
<img src="images/calculator_app.png" width="90%" style="max-height: 300px; object-fit: contain;" alt="Calculator App Output" />
</td>
</tr>
</table>

**The example prompts work out of the box!** Use them as-is to build the agents shown above, or copy and edit them to create something completely different. Your agent could manipulate files, process images, generate reports, or solve any problem you dream up.

---

**Your first step:** Choose your domain and write a specification defining your agent's capabilities, tools, and workflow.

> 💡 **New to JupyterAI?** Learn more about coding with AI in notebooks at [JupyterAI: Coding in Notebooks](https://www.deeplearning.ai/short-courses/jupyter-ai-coding-in-notebooks/)

> ⚠️ **Important:** Your workspace session lasts 2 hours. Remember to download `project.ipynb` and `spec.md` regularly to save your progress!

## What you'll craft step by step
1. **Spec draft** 📝 - write a spec file to capture your use case
2. **Tool calling** 🛠️ - define function schemas and execute tools
3. **Agent loop** 🔄 - build iterative agent with max steps
4. **Sandbox execution** ☁️ - run agent code safely in E2B cloud

```
     What you'll craft step by step

             [ Spec üìù ]
                   |
                   v
           [ Tools üõ†Ô∏è ]
                   |
                   v
          [ Agent Loop üîÑ ]
                   |
                   v
        [ Sandbox Execution ‚òÅÔ∏è ]
```

🔴 Need guidance? Use the left-hand chatbot for targeted questions at any stage.

* To open Jupyter chat, click the chat bubble icon in the left sidebar of JupyterLab:

  <img src="images/jupyter_chat_bordered.png" width="150" style="vertical-align: middle;">

Pick a use case, define your spec, and let's build.

<details>
<summary><strong>üóÇÔ∏è Reminder: Attach context to every chatbot prompt</strong></summary>

When you ask the chatbot for help, always upload these files so it has full project context:

- `project.ipynb`: shares your latest code and notebook state
- `spec.md`: your coding agent specification
- E2B + OpenAI documentation (provided in resources)

Bringing all context keeps responses grounded in what you've built, E2B capabilities, and your project design.

**Debugging tip:** When troubleshooting issues, share error messages and relevant code snippets with the chatbot. Follow iterative debugging practices: test small changes, verify outputs, and use the AI assistant to help diagnose problems step by step.
</details>

## ⚙️ Setup and Imports

This project uses:
- **JupyterAI** - For AI-assisted coding
- **E2B Code Interpreter** - For safe cloud code execution
- **OpenAI** - For LLM reasoning and tool calling
- **python-dotenv** - For environment configuration

### Useful Resources:
- **E2B documentation**: https://e2b.dev/docs
- **OpenAI Responses API**: https://platform.openai.com/docs/api-reference/responses
- **Function calling guide**: https://platform.openai.com/docs/guides/function-calling

### 🔴 Note ‼️ ‼️ You should also consult the chatbot on the left to ask more questions about this project or about the frameworks used.

In [None]:
import os
import json
import warnings
from typing import Dict, List, Callable

# Suppress warnings
warnings.filterwarnings('ignore')

# Load environment variables
from lib.helper import load_env
load_env()

# E2B and OpenAI imports
from openai import OpenAI
from e2b_code_interpreter import Sandbox

In [None]:
# Check that API keys are set before creating the client:
# OpenAI() raises an error when OPENAI_API_KEY is missing
if not os.getenv("OPENAI_API_KEY"):
    print("⚠️  OPENAI_API_KEY not found!")
    print("Please create a .env file with your OpenAI API key.")
elif not os.getenv("E2B_API_KEY"):
    print("⚠️  E2B_API_KEY not found!")
    print("Please add your E2B API key to the .env file.")
else:
    # Initialize OpenAI client
    client = OpenAI()
    print("✅ API keys loaded successfully")

# 📝 Spec File

A spec file captures the entire project vision. Draft it _before_ writing code so you and your AI helpers stay aligned on what needs to be built.

**🔴 Important: You must create a file called `spec.md` with your specification written inside it.**

Use the chatbot to help you draft the content, then save it to a file named `spec.md` in your workspace. You will attach this file to all further interactions with the chatbot.

- ✅ **Purpose:** Keep the project grounded in a single source of truth that you can append to every chatbot prompt.
- 🧠 **Brainstorm:** What kind of coding agent do you want? Data analysis? Web development? File manipulation? Choose a domain that interests you.
- 🧩 **Focus:** Define what tools your agent needs, what tasks it should accomplish, and what outputs it should produce.

> **Recommendation:** Start simple. You can always add more complex tasks once the MVP works.

## ✍️ What your `spec.md` should include
- **Project name** and one-sentence description
- **Use case summary** (what problem does your agent solve?)
- **Runtime inputs** (what parameters does the user provide?)
- **Agent capabilities** (what can your agent do?)
- **Tools needed** (execute_code, read_file, write_file, etc.)
- **Expected outputs** (what files or results does it produce?)
- **Example scenario** (walk through one complete use case)

<details>
<summary><strong>üìä Example Spec ‚Äî Data Analyzer Agent (click to expand)</strong></summary>

```markdown
# Data Analyzer Agent

## Project Summary
A coding agent that generates synthetic datasets, performs statistical analysis, and creates visualizations.

## Use Case
Data scientists and analysts who need quick exploratory analysis on synthetic data for prototyping and testing.

## Runtime Inputs
- `dataset_type`: Type of data to generate ("sales", "weather", "users")
- `num_records`: Number of records to generate
- `analysis_type`: Type of analysis ("summary", "trends", "correlations")

## Agent Capabilities
- Generate synthetic datasets matching specified schema
- Perform statistical calculations (mean, median, std dev)
- Create visualizations (bar charts, line plots, scatter plots)
- Save results to files (CSV data, PNG charts, TXT reports)

## Tools
- **execute_code**: Run Python code for data generation and analysis
- **write_file**: Save datasets and reports
- **read_file**: Load previously generated data (optional)

## Expected Outputs
- `data.csv`: Synthetic dataset
- `analysis.txt`: Statistical summary report
- `chart.png`: Visualization (if plotting library available)

## Example Scenario
User requests: "Generate 100 sales records and analyze monthly trends"
1. Agent generates synthetic sales data (product, quantity, price, date)
2. Agent calculates monthly revenue totals
3. Agent creates trend visualization
4. Agent saves data.csv, analysis.txt, and chart.png
```

</details>

<details>
<summary><strong>üåê Example Spec ‚Äî Web Builder Agent (click to expand)</strong></summary>

```markdown
# Web Builder Agent

## Project Summary
A coding agent that creates simple HTML/CSS/JS web applications based on user requirements.

## Use Case
Developers who need quick prototypes or simple interactive web tools (calculators, to-do lists, games).

## Runtime Inputs
- `app_type`: Type of app to build ("calculator", "todo", "quiz", "snake_game")
- `features`: List of features to include
- `style`: Styling preference ("minimal", "colorful", "dark")

## Agent Capabilities
- Generate HTML structure with semantic tags
- Write CSS for layout and styling
- Create JavaScript for interactivity and logic
- Implement game mechanics (for simple games)
- Create responsive layouts

## Tools
- **execute_code**: Test JavaScript logic
- **write_file**: Create index.html, style.css, script.js files

## Expected Outputs
- `index.html`: Complete web application
- Optionally: Separate CSS/JS files for larger apps

## Example Scenario
User requests: "Build a calculator app with basic operations"
1. Agent creates HTML with calculator button grid
2. Agent writes CSS for clean, centered layout
3. Agent implements JavaScript for arithmetic operations
4. Agent saves index.html with embedded CSS/JS
```

</details>

Keep your own spec concise: something you can paste into prompts and skim in under two minutes.

In [None]:
# Create your spec.md file using the chatbot
# This file will be attached to all future prompts

# 🛠️ Tool Calling

🎯 **Expected output:** Defines tool calling system with function schemas and execution handler

---

Time to define the tools your agent can use. Every tool extends what your agent can do, from executing code to manipulating files.

- 🎯 **Goal:** Create function schemas that tell the LLM what tools are available and how to call them
- 🔁 **Workflow:** Reference your `spec.md` to see what tools you defined, then implement their schemas and execution logic
- 💡 **Remember:** Tools are called by the LLM via function calling, so schemas must be precise

---

## 🔍 Function Schema Pattern

Every tool needs:
1. **Schema** - JSON object describing the function signature
2. **Implementation** - Python function that executes the tool
3. **Executor** - Handler that routes LLM tool calls to implementations

---

## 🧱 Tool Components

**Schema Structure:**
```python
{
    "type": "function",
    "name": "execute_code",
    "description": "Execute Python code and return result",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "Python code"}
        },
        "required": ["code"],
        "additionalProperties": False
    }
}
```

**Implementation:** Function that actually does the work

**Executor:** Parses LLM's tool call and invokes the implementation

📌 Reference your spec file for the tools you defined
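
The implementation and executor pieces described above can be sketched as follows. This is a minimal local version, assuming the `{"results": ..., "errors": ...}` return shape used in the prompt examples; the error message wording is illustrative, not a fixed API.

```python
import json
import sys
from io import StringIO
from typing import Callable, Dict


def execute_code(code: str) -> dict:
    """Run Python code locally with exec() and capture anything it prints."""
    buffer = StringIO()
    original_stdout = sys.stdout
    sys.stdout = buffer
    errors = None
    try:
        exec(code, {})  # fresh namespace for every call
    except Exception as exc:
        errors = f"{type(exc).__name__}: {exc}"
    finally:
        sys.stdout = original_stdout  # always restore stdout
    return {"results": buffer.getvalue(), "errors": errors}


# Registry that maps tool names (as the LLM sees them) to implementations
tools: Dict[str, Callable] = {"execute_code": execute_code}


def execute_tool(name: str, args: str, tools: Dict[str, Callable]) -> dict:
    """Parse the JSON arguments of a tool call and route it to its implementation."""
    if name not in tools:
        return {"results": None, "errors": f"Unknown tool: {name}"}
    try:
        parsed = json.loads(args)
    except json.JSONDecodeError as exc:
        return {"results": None, "errors": f"Invalid JSON arguments: {exc}"}
    try:
        return tools[name](**parsed)
    except Exception as exc:
        return {"results": None, "errors": f"{type(exc).__name__}: {exc}"}
```

Calling `execute_tool("execute_code", '{"code": "print(1 + 2)"}', tools)` returns a dict whose `"results"` value is the captured stdout. Running `exec()` locally is only a stepping stone; the sandbox section later replaces it with isolated execution.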

<details>
<summary><strong>üìä Prompt Example ‚Äî Data Analyzer Tools (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to define the tool calling system for my Data Analyzer Agent.
Return code only: no explanations, comments, or markdown.

Context: See attached spec.md

Requirements:
1. Import sys, json, StringIO from io, and Callable from typing
2. Define execute_code(code: str) -> dict function that runs code locally using exec() and captures stdout
3. Define execute_code_schema as dict with type, name, description, and parameters
4. Define tools dict mapping "execute_code" to the function
5. Define execute_tool(name: str, args: str, tools: dict) that parses JSON args and calls the tool
6. Handle errors gracefully (JSONDecodeError, KeyError, Exception) and return error messages
7. Return execution dict with "results" and "errors" keys

**Attachments:**
- `spec.md` for project info
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

<details>
<summary><strong>üåê Prompt Example ‚Äî Web Builder Tools (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to define the tool calling system for my Web Builder Agent.
Return code only: no explanations, comments, or markdown.

Context: See attached spec.md

Requirements:
1. Import sys, json, os, StringIO from io, and Callable from typing
2. Define execute_code(code: str) -> dict that runs Python code locally
3. Define write_file(content: str, file_path: str) -> dict that writes content to file and creates directories if needed
4. Define execute_code_schema and write_file_schema as function schema dicts
5. Create tools dict mapping "execute_code" and "write_file" to their functions
6. Define execute_tool(name: str, args: str, tools: dict) handler with error handling
7. Handle JSONDecodeError, KeyError, PermissionError, and general exceptions

**Attachments:**
- `spec.md` for project info
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>
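
For reference, a `write_file` tool of the kind the Web Builder prompt asks for might look like the sketch below. The return shape and the description strings in the schema are assumptions chosen to match the `execute_code` schema shown earlier, not something the course materials prescribe.

```python
import os


def write_file(content: str, file_path: str) -> dict:
    """Write content to file_path, creating parent directories if needed."""
    try:
        parent = os.path.dirname(file_path)
        if parent:
            os.makedirs(parent, exist_ok=True)
        with open(file_path, "w", encoding="utf-8") as f:
            f.write(content)
        return {"results": f"Wrote {len(content)} characters to {file_path}", "errors": None}
    except PermissionError as exc:
        return {"results": None, "errors": f"Permission denied: {exc}"}
    except OSError as exc:
        return {"results": None, "errors": f"{type(exc).__name__}: {exc}"}


# Schema the LLM sees; parameter names must match the function signature
write_file_schema = {
    "type": "function",
    "name": "write_file",
    "description": "Write text content to a file, creating directories if needed",
    "parameters": {
        "type": "object",
        "properties": {
            "content": {"type": "string", "description": "Text to write"},
            "file_path": {"type": "string", "description": "Destination path"},
        },
        "required": ["content", "file_path"],
        "additionalProperties": False,
    },
}
```

Keeping the schema's property names identical to the Python parameter names lets the executor pass the parsed JSON arguments straight through as keyword arguments.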

# 🔄 Agent Loop

🎯 **Expected output:** Agent iterates through steps, calling tools and processing results until task completion

---

Now build the iterative loop that powers your coding agent. This is where the LLM reasons, calls tools, receives results, and decides what to do next.

- 🎯 **Goal:** Implement a multi-step agent that iterates until task completion or max steps
- 🔁 **Workflow:** Create a loop that alternates between LLM calls and tool execution
- 💡 **Remember:** Use max_steps to prevent infinite loops, and stop when the LLM doesn't call any functions

---

## 🔍 Agent Loop Pattern

The agent follows this cycle:
1. **Send query** to LLM with system prompt and conversation history
2. **Process response** - check if LLM wants to call tools
3. **Execute tools** - run functions and add results to conversation
4. **Repeat** until LLM responds without tool calls or max_steps reached

---

## 🧱 Key Components

**Message History:**
```python
messages = [
    {"role": "user", "content": "Create a function that adds two numbers"},
    # LLM responses and tool results get appended here
]
```

**Stopping Conditions:**
- `steps >= max_steps`: Prevent infinite loops
- `not has_function_call`: LLM finished reasoning

**Function Call Result:**
```python
{
    "type": "function_call_output",
    "call_id": part.call_id,
    "output": json.dumps(result)
}
```

📌 Test with simple queries first, then try multi-step tasks
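
Putting the message history, stopping conditions, and function call result together, the loop could be sketched like this. It assumes `client` is an `OpenAI()` instance and uses the `coding_agent` signature and `gpt-4.1-mini` model named in the prompt examples; the inline error handling is a simplified stand-in for a separate tool executor.

```python
import json
from typing import Callable, Dict, List, Optional


def coding_agent(client, query: str, system: str, tools: Dict[str, Callable],
                 tools_schemas: List[dict], messages: Optional[list] = None,
                 max_steps: int = 5) -> list:
    """Alternate between LLM calls and tool execution until the model stops calling tools."""
    if messages is None:
        messages = []
    messages.append({"role": "user", "content": query})

    steps = 0
    while steps < max_steps:  # stopping condition 1: step budget
        response = client.responses.create(
            model="gpt-4.1-mini",
            input=[{"role": "developer", "content": system}, *messages],
            tools=tools_schemas,
        )
        has_function_call = False
        for part in response.output:
            messages.append(part)  # keep the model's own output in the history
            if part.type == "message":
                for item in part.content:
                    print(getattr(item, "text", ""))
            elif part.type == "function_call":
                has_function_call = True
                try:
                    result = tools[part.name](**json.loads(part.arguments))
                except Exception as exc:  # report failures back to the model
                    result = {"errors": f"{type(exc).__name__}: {exc}"}
                messages.append({
                    "type": "function_call_output",
                    "call_id": part.call_id,
                    "output": json.dumps(result),
                })
        if not has_function_call:  # stopping condition 2: model is done
            break
        steps += 1
    return messages
```

Returning `messages` lets you inspect the full trace afterwards or pass it back in to continue the conversation.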

<details>
<summary><strong>üìä Prompt Example ‚Äî Data Analyzer Agent Loop (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to implement the agent loop for my Data Analyzer Agent.
Return code only: no explanations, comments, or markdown.

Context: See attached spec.md

Requirements:
1. Define coding_agent(client: OpenAI, query: str, system: str, tools: dict, tools_schemas: list, messages: list = None, max_steps: int = 5)
2. Initialize messages list with user query if not provided
3. Create while loop that runs up to max_steps iterations
4. Call client.responses.create() with model="gpt-4.1-mini", developer role with system prompt, messages, and tools
5. Iterate over response.output parts and append to messages
6. For message parts, print the content
7. For function_call parts, execute the tool and append function_call_output to messages with call_id and JSON output
8. Track has_function_call flag and break loop if no function calls
9. Return final messages list

**Attachments:**
- `spec.md` for project info
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

<details>
<summary><strong>üåê Prompt Example ‚Äî Web Builder Agent Loop (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to implement the agent loop for my Web Builder Agent.
Return code only: no explanations, comments, or markdown.

Context: See attached spec.md

Requirements:
1. Define coding_agent(client: OpenAI, query: str, system: str, tools: dict, tools_schemas: list, max_steps: int = 5)
2. Initialize messages with user query dict
3. Create iteration loop with step counter up to max_steps
4. Call LLM with developer system prompt, messages history, and tool schemas
5. Process each output part: append to messages, print text content, execute function calls
6. For each function call, use execute_tool helper and append result with proper structure
7. Set has_function_call flag and break if False
8. Print step number and tool execution results for debugging
9. Return messages after loop completes

**Attachments:**
- `spec.md` for project info
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

# ☁️ Sandbox Execution

🎯 **Expected output:** Creates secure E2B sandbox, executes agent-generated code in isolated cloud environment, displays final results in notebook (visualizations or web applications)

---

Finally, move your agent to the cloud! Instead of running code locally, execute everything in an E2B sandbox, a secure, isolated environment perfect for untrusted code.

- 🎯 **Goal:** Integrate E2B sandbox with your coding agent for safe cloud execution
- 🔁 **Workflow:** Create sandbox, modify execute_code to use sandbox, update agent to pass sandbox to tools
- 💡 **Remember:** Sandboxes are persistent: you can reconnect, query by metadata, and serve websites from them

---

## 🔍 E2B Sandbox Features

1. **Isolated execution** - Code runs in secure microVM
2. **File system** - Create, read, write, delete files
3. **Multiple languages** - Python, JavaScript, Bash
4. **Web hosting** - Serve applications on custom ports
5. **Persistent** - Reconnect to existing sandboxes by ID

---

## 🧱 Sandbox Integration

**Create Sandbox:**
```python
sbx = Sandbox.create(timeout=60 * 60)  # 1 hour
```

**Execute Code:**
```python
execution = sbx.run_code("print('Hello')")
result = execution.to_json()
```

**Modified Agent:**
- Pass `sbx` parameter to agent function
- Update execute_code to use `sbx.run_code()`
- Handle execution results and metadata (images, charts)

📌 Test with a simple task first, then try complex multi-step projects
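
A sandbox-backed `execute_code` along the lines of the bullets above might look like the sketch below. It assumes `sbx` is an object created with `Sandbox.create(...)`, that each entry in `execution.results` may carry a base64 `png` attribute, and that a `metadata["images"]` list is how you choose to collect charts; the `(json, metadata)` return tuple follows the prompt examples rather than any fixed E2B convention.

```python
from typing import Any, Dict, Tuple


def execute_code(code: str, sbx: Any) -> Tuple[str, Dict[str, list]]:
    """Run code in the sandbox; return the serialized execution plus image metadata."""
    execution = sbx.run_code(code)
    metadata: Dict[str, list] = {"images": []}
    for result in execution.results:
        png = getattr(result, "png", None)
        if png:
            metadata["images"].append(png)  # base64-encoded PNG string
            result.png = None               # keep large image data out of the LLM context
    return execution.to_json(), metadata
```

The tool schema stays the same as before, since the LLM only supplies `code`; your executor is responsible for passing `sbx` in. Stripping the PNG payload before serializing keeps the conversation history small while still letting the notebook display the images afterwards.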

<details>
<summary><strong>üìä Prompt Example ‚Äî Data Analyzer Sandbox (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to integrate E2B sandbox with my Data Analyzer Agent.
Return code only: no explanations, comments, or markdown.

Context: See attached spec.md

Requirements:
1. Import display and Image from IPython.display, and import base64
2. Modify execute_code function to accept sbx: Sandbox parameter
3. Replace exec() with sbx.run_code(code) and capture execution object
4. Handle execution.results and execution.error
5. For results with PNG data, store in metadata dict and set result.png = None
6. Return execution.to_json() and metadata dict as tuple
7. Define execute_code_schema (same as before, LLM doesn't know about sbx parameter)
8. Update execute_tool to pass sbx=sbx kwarg to tools
9. Update coding_agent to accept sbx parameter and pass it to execute_tool
10. Create test: sbx = Sandbox.create(timeout=3600), run agent with query "Generate 50 random numbers and plot histogram"
11. After agent completes, decode and display images: for each png_data in metadata["images"], use display(Image(data=base64.b64decode(png_data))) to convert the base64 string from E2B to binary data for IPython

**Attachments:**
- `spec.md` for project info
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

<details>
<summary><strong>üåê Prompt Example ‚Äî Web Builder Sandbox (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to integrate E2B sandbox with my Web Builder Agent and serve the result.
Return code only: no explanations, comments, or markdown.

Context: See attached spec.md

Requirements:
1. Import display and IFrame from IPython.display, and import time
2. Modify execute_code to use sbx.run_code() instead of local exec()
3. Return execution.to_json() and metadata dict as tuple
4. Define execute_code_schema (same as before, LLM doesn't know about sbx parameter)
5. Modify execute_tool to pass sbx to all tool functions
6. Update coding_agent signature to include sbx: Sandbox parameter
7. Create sandbox with Sandbox.create(timeout=3600)
8. Define system prompt: "You are a web development agent. You MUST use execute_code to write all files to /home/user/ directory using Python code with open(). Never return code as text - always execute Python to write files."
9. Run agent with query "Create a simple calculator app with HTML/CSS/JS in index.html"
10. After agent completes, start HTTP server in /home/user directory: sbx.commands.run("cd /home/user && python -m http.server 3000 --bind 0.0.0.0", background=True)
11. Wait 3 seconds for server to start: time.sleep(3)
12. Get host URL with sbx.get_host(3000)
13. Use display(IFrame(f"https://{host}", width=800, height=600)) to render the calculator in notebook output

**Attachments:**
- `spec.md` for project info
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>
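
The serving steps in the Web Builder prompt can be wrapped in one small helper. This is only a sketch: `serve_sandbox_site` and its parameters are hypothetical names, and it relies on the `sbx.commands.run(..., background=True)` and `sbx.get_host(port)` calls quoted in that prompt.

```python
import time
from typing import Any


def serve_sandbox_site(sbx: Any, port: int = 3000, directory: str = "/home/user",
                       startup_wait: float = 3.0) -> str:
    """Start a static file server inside the sandbox and return its public URL."""
    sbx.commands.run(
        f"cd {directory} && python -m http.server {port} --bind 0.0.0.0",
        background=True,  # do not block the notebook while the server runs
    )
    time.sleep(startup_wait)  # give the server a moment to start
    host = sbx.get_host(port)
    return f"https://{host}"
```

In the notebook, the returned URL can then be rendered with `display(IFrame(url, width=800, height=600))` from `IPython.display`.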

---

## 📋 Optional Feedback Survey

We'd love to hear about your experience! Your feedback helps us create more valuable educational experiences.

**[Take the short survey →](https://rebrand.ly/e2b-course)**

This optional survey asks about:
- Project quality and engagement
- What you found most valuable
- How we can improve future projects

Thank you for your time! 🙏