#### Reference:
- https://learn.deeplearning.ai/courses/building-coding-agents-with-tool-execution/

## Overview: 
- This notebook shows how to build a coding agent that acts as a web builder.

You'll build this agent step-by-step, defining the tools it needs and implementing the agent loop that lets it reason through tasks.

You can write code by hand if you wish, but this project item includes access to a chatbot built right into the Jupyter notebook, which you can ask for help or instruct to write code for you. 

## 📚 About the Project

You'll build an agent that autonomously generates and executes code in a secure cloud sandbox to respond to user queries.

<details>
<summary><strong>Your project should include</strong></summary>

- A function that accepts a user query as input
- Tool functions with schemas that the LLM can call (like execute_code)
- Code that handles conversation memory and the **agent loop** described in the "Inside a coding agent" video
- Access to an E2B sandbox to enable safe code execution

</details>

Here's what the final product might look like:

<strong>üåê Web Builder Agent</strong><br/>
Creates interactive web applications with HTML, CSS, and JavaScript
<br/><br/>
<img src="images/calculator_app.png" width="90%" style="max-height: 300px; object-fit: contain;" alt="Calculator App Output" />


## Your project workflow

To build your coding agent, you'll carry out the following workflow:
1. **Tool calling** 🛠️: define tool schemas and functions to help your agent interact with files
2. **Agent loop** 🔄: build the iterative loop that lets your agent work through your task
3. **Sandbox execution** ☁️: give your agent access to an E2B sandbox

```

           [ Specify tools 🛠️ ]
                   |
                   v
          [ Implement agent loop 🔄 ]
                   |
                   v
        [ Set up sandbox execution ☁️ ]
```

<details>
<summary><strong>🗂️ Reminder: Attach context to every chatbot prompt</strong></summary>

When you ask the chatbot for help, always upload these files so it has full project context:

- `<Current project notebook>.ipynb`: shares your latest code and notebook state
- `docs.md`: E2B + OpenAI documentation (linked in the next section)

Attaching both files keeps responses grounded in what you've built and in what E2B can do.

**Debugging tip:** When troubleshooting, share error messages and relevant code snippets with the chatbot. Debug iteratively: test small changes, verify outputs, and use the AI assistant to help diagnose problems step by step.

</details>

# Step 1: 🛠️ Tool Calling

It's time to define the tools your agent can use! Every tool extends what your agent can do, from executing code to manipulating files.

- 🎯 **Goal:** Create function schemas that tell the LLM what tools are available and how to call them
- 🔁 **Workflow:** Think about what tools your agent needs (like execute_code for running Python, or write_file for creating files), then implement their schemas and execution logic
- 💡 **Remember:** Tools are called by the LLM via function calling, so schemas must be precise
- 💡 **Tip:** Use the JupyterAI chatbot with the **prompt examples below**
- 📎 **Attach these files:** `project.ipynb` and `docs.md` when asking the chatbot for help

---

## ✅ Your tool system should include:

- ✓ Function schemas that describe each tool's name, description, and parameters
- ✓ Implementation functions that execute the actual tool logic
- ✓ An executor handler that routes LLM tool calls to implementations
- ✓ Error handling for invalid tool calls or execution failures

---

<details>
<summary><strong>üìö Refresher: Function Schema Pattern (click to expand)</strong></summary>

Every tool needs three components:

1. **Schema** - JSON object describing the function signature for the LLM
2. **Implementation** - Python function that executes the tool
3. **Executor** - Handler that routes LLM tool calls to implementations

**Schema Structure:**
```python
{
    "type": "function",
    "name": "execute_code",
    "description": "Execute Python code and return result",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "Python code"}
        },
        "required": ["code"],
        "additionalProperties": False
    }
}
```

**Why function calling?**
- Gives the LLM structured ways to interact with your system
- Ensures type safety and validation
- Enables the LLM to use tools autonomously during reasoning

</details>
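
Because the schema lists required parameters and forbids extra ones, it can also be used to check a tool call before running it. The sketch below is only an illustration: `validate_args` and `JSON_TYPES` are hypothetical names (not part of the OpenAI SDK), and the check covers just required keys, unexpected keys, and a few basic types.

```python
import json

# Schema copied from the refresher above.
execute_code_schema = {
    "type": "function",
    "name": "execute_code",
    "description": "Execute Python code and return result",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "Python code"}
        },
        "required": ["code"],
        "additionalProperties": False,
    },
}

# Hypothetical lookup: JSON Schema type name -> Python type.
JSON_TYPES = {"string": str, "boolean": bool, "object": dict, "array": list}


def validate_args(schema: dict, raw_args: str) -> list:
    """Return a list of problems found in a JSON argument string (empty if none)."""
    params = schema["parameters"]
    try:
        args = json.loads(raw_args)
    except json.JSONDecodeError as e:
        return [f"invalid JSON: {e}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    problems = []
    # Every required key must be present.
    for key in params.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    # Every supplied key must be declared and have the declared type.
    for key, value in args.items():
        spec = params["properties"].get(key)
        if spec is None:
            if params.get("additionalProperties") is False:
                problems.append(f"unexpected argument: {key}")
            continue
        expected = JSON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            problems.append(f"{key} should be of type {spec['type']}")
    return problems


print(validate_args(execute_code_schema, '{"code": "print(1)"}'))
print(validate_args(execute_code_schema, '{"cod": "print(1)"}'))
```

Returning a list of problems (rather than raising) makes it easy to send the messages back to the LLM as a tool result so it can correct its call.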

<details>
<summary><strong>üìö Refresher: Tool Components (click to expand)</strong></summary>

**Implementation function:**
```python
def execute_code(code: str) -> dict:
    # Execute the code and capture output
    # Return dict with "results" and "errors" keys
    pass
```

**Executor function:**
```python
def execute_tool(name: str, args: str, tools: dict) -> dict:
    # Parse JSON args
    # Look up tool by name
    # Call tool function with args
    # Handle errors gracefully
    pass
```

**Tools dictionary:**
```python
tools = {
    "execute_code": execute_code,
    "write_file": write_file,
    # ... more tools
}
```

</details>

---

<details>
<summary><strong>üåê Prompt Example ‚Äî Web Builder Tools (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to define the tool calling system for my Web Builder Agent.
Return code only-no explanations, comments, or markdown.

Requirements:
1. Import warnings and suppress warnings with warnings.filterwarnings('ignore')
2. Import sys, StringIO, json, os, and Callable from typing
3. Import OpenAI from openai
4. Initialize client = OpenAI()
5. Define execute_code(code: str) -> dict that runs Python code locally
6. Define write_file(content: str, file_path: str) -> dict that writes content to file and creates directories if needed
7. Define execute_code_schema and write_file_schema as function schema dicts
8. Create tools dict mapping "execute_code" and "write_file" to their functions
9. Define execute_tool(name: str, args: str, tools: dict) handler with error handling
10. Handle JSONDecodeError, KeyError, PermissionError, and general exceptions

**Attachments:**
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

```python
import warnings
warnings.filterwarnings('ignore')
import sys
from io import StringIO
import json
import os
from typing import Callable
from openai import OpenAI

client = OpenAI()

def execute_code(code: str) -> dict:
    # Run the code locally and capture anything it prints to stdout
    execution = {"results": [], "errors": []}
    old_stdout = sys.stdout
    try:
        sys.stdout = StringIO()
        exec(code)
        result = sys.stdout.getvalue()
        execution["results"].append(result)
    except Exception as e:
        execution["errors"].append(str(e))
    finally:
        # Always restore stdout, even if the code raised
        sys.stdout = old_stdout
    return execution

def write_file(content: str, file_path: str) -> dict:
    try:
        # Only create directories when the path has a directory part;
        # os.makedirs("") raises FileNotFoundError for bare file names
        directory = os.path.dirname(file_path)
        if directory:
            os.makedirs(directory, exist_ok=True)
        with open(file_path, "w", encoding="utf-8") as f:
            f.write(content)
        return {"results": [f"File written to {file_path}"], "errors": []}
    except PermissionError as e:
        return {"results": [], "errors": [f"Permission denied: {str(e)}"]}
    except Exception as e:
        return {"results": [], "errors": [str(e)]}

execute_code_schema = {
    "type": "function",
    "name": "execute_code",
    "description": "Execute Python code locally and return results or errors.",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "Python code to execute"}
        },
        "required": ["code"],
        "additionalProperties": False
    }
}

write_file_schema = {
    "type": "function",
    "name": "write_file",
    "description": "Write content to a file and create directories if needed.",
    "parameters": {
        "type": "object",
        "properties": {
            "content": {"type": "string", "description": "Content to write"},
            "file_path": {"type": "string", "description": "Destination path"}
        },
        "required": ["content", "file_path"],
        "additionalProperties": False
    }
}

# Maps the tool names the LLM uses to the Python functions that implement them
tools = {
    "execute_code": execute_code,
    "write_file": write_file
}

def execute_tool(name: str, args: str, tools: dict) -> dict:
    try:
        # The LLM sends arguments as a JSON string
        args = json.loads(args)
        if name not in tools:
            return {"error": f"Tool {name} not found."}
        result = tools[name](**args)
        return result
    except json.JSONDecodeError as e:
        return {"error": f"JSON decode error: {str(e)}"}
    except KeyError as e:
        return {"error": f"Missing argument: {str(e)}"}
    except PermissionError as e:
        return {"error": f"Permission denied: {str(e)}"}
    except Exception as e:
        return {"error": str(e)}
```
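
The local `execute_code` in this step swaps `sys.stdout` by hand and runs `exec` inside the function's own scope. As an alternative sketch, the standard library's `contextlib.redirect_stdout` restores the stream automatically, and an explicit namespace dict gives each call fresh globals. The name `execute_code_isolated` is hypothetical, and this still runs on your own machine, so it is not a substitute for a sandbox.

```python
import contextlib
import io
import traceback


def execute_code_isolated(code: str) -> dict:
    """Run code in a fresh namespace and capture what it prints."""
    execution = {"results": [], "errors": []}
    buffer = io.StringIO()
    namespace = {"__name__": "__agent__"}  # fresh globals for each call
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception:
        # Keep only the final traceback line, which names the exception
        execution["errors"].append(traceback.format_exc().strip().splitlines()[-1])
    # Output printed before an error is preserved as well
    execution["results"].append(buffer.getvalue())
    return execution


print(execute_code_isolated("x = 2\nprint(x * 21)"))
print(execute_code_isolated("print('before')\n1 / 0"))
```

Keeping partial output alongside the error gives the LLM more context when it decides how to fix failing code.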

# Step 2: 🔄 Agent Loop

Now build the iterative loop that powers your coding agent! This is where the LLM reasons, calls tools, receives results, and decides what to do next.

- 🎯 **Goal:** Implement a multi-step agent that iterates until task completion or max steps
- 🔁 **Workflow:** Create a loop that alternates between LLM calls and tool execution
- 💡 **Remember:** Use max_steps to prevent infinite loops, and stop when the LLM doesn't call any functions
- 💡 **Tip:** Use the JupyterAI chatbot with the **prompt examples below**
- 📎 **Attach these files:** `project.ipynb` and `docs.md` when asking the chatbot for help

---

## ✅ Your agent loop should include:

- ✓ Message history that accumulates conversation context
- ✓ LLM API calls with developer system prompt, messages, and tool schemas
- ✓ Processing of response parts (text messages and function calls)
- ✓ Tool execution and result injection back into conversation
- ✓ Stopping conditions (max steps or no function calls)

---

<details>
<summary><strong>üìö Refresher: Agent Loop Pattern (click to expand)</strong></summary>

The agent follows this cycle:

1. **Send query** to LLM with system prompt and conversation history
2. **Process response** - check if LLM wants to call tools
3. **Execute tools** - run functions and add results to conversation
4. **Repeat** until LLM responds without tool calls or max_steps reached

**Why an agent loop?**
- Enables multi-step reasoning and tool use
- Allows the LLM to see tool results and adapt its strategy
- Prevents infinite loops with max_steps safeguard

</details>

<details>
<summary><strong>üìö Refresher: Key Components (click to expand)</strong></summary>

**Message History:**
```python
messages = [
    {"role": "user", "content": "Create a function that adds two numbers"},
    # LLM responses and tool results get appended here
]
```

**Stopping Conditions:**
- `steps >= max_steps`: Prevent infinite loops
- `not has_function_call`: LLM finished reasoning

**Function Call Result:**
```python
{
    "type": "function_call_output",
    "call_id": part.call_id,
    "output": json.dumps(result)
}
```

**Loop structure:**
```python
for step in range(max_steps):
    # 1. Call LLM
    response = client.responses.create(...)
    
    # 2. Process response parts
    for part in response.output:
        # Append to messages
        # Execute function calls
    
    # 3. Check if done
    if not has_function_call:
        break
```

</details>
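
To check the loop logic and both stopping conditions without spending API calls, one option is to drive the loop with a scripted stand-in for the model. The sketch below rests on several assumptions: `FakePart`, `scripted_model`, and `run_loop` are hypothetical names, and the fake parts only mimic the `type`, `name`, `arguments`, and `call_id` fields used in the loop structure above.

```python
import json
from dataclasses import dataclass


@dataclass
class FakePart:
    """Hypothetical stand-in for one item of response.output."""
    type: str
    name: str = ""
    arguments: str = ""
    call_id: str = ""
    text: str = ""


def scripted_model(turns):
    """Return a callable that yields one pre-written list of parts per call."""
    iterator = iter(turns)
    return lambda messages: next(iterator)


def run_loop(model, tools: dict, query: str, max_steps: int = 5) -> list:
    messages = [{"role": "user", "content": query}]
    for step in range(max_steps):
        has_function_call = False
        for part in model(messages):
            if part.type == "function_call":
                has_function_call = True
                messages.append({"type": "function_call", "name": part.name,
                                 "arguments": part.arguments, "call_id": part.call_id})
                result = tools[part.name](**json.loads(part.arguments))
                # Feed the tool result back so the next model call can see it
                messages.append({"type": "function_call_output",
                                 "call_id": part.call_id,
                                 "output": json.dumps(result)})
            else:
                messages.append({"role": "assistant", "content": part.text})
        if not has_function_call:  # the model answered in plain text: done
            break
    return messages


tools = {"add": lambda a, b: {"results": [a + b], "errors": []}}
model = scripted_model([
    [FakePart("function_call", name="add", arguments='{"a": 2, "b": 3}', call_id="c1")],
    [FakePart("message", text="The sum is 5.")],
])
history = run_loop(model, tools, "Add 2 and 3")
print(len(history), history[-1]["content"])
```

The same harness can script a model that never stops calling tools, which is a quick way to confirm that `max_steps` really ends the loop.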

---

📌 **Tip:** Test with simple queries first, then try multi-step tasks!

---

<details>
<summary><strong>üåê Prompt Example ‚Äî Web Builder Agent Loop (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to implement the agent loop for my Web Builder Agent.
Return code only-no explanations, comments, or markdown.

Requirements:
1. Define coding_agent(client: OpenAI, query: str, system: str, tools: dict, tools_schemas: list, max_steps: int = 5)
2. Initialize messages with user query dict
3. Create iteration loop with step counter up to max_steps
4. Call LLM with developer system prompt, messages history, and tool schemas
5. Process each output part: append to messages, print text content, execute function calls
6. For each function call, use execute_tool helper and append result with proper structure
7. Set has_function_call flag and break if False
8. Print step number and tool execution results for debugging
9. Return messages after loop completes

**Attachments:**
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

```python
def coding_agent(client: OpenAI, query: str, system: str, tools: dict, tools_schemas: list, max_steps: int = 5):
    messages = [{"role": "user", "content": query}]
    steps = 0
    while steps < max_steps:
        print(f"\n--- Step {steps + 1} ---")
        response = client.responses.create(
            model="gpt-4.1-mini",
            input=[
                {"role": "developer", "content": system},
                *messages
            ],
            tools=tools_schemas
        )
        has_function_call = False
        for part in response.output:
            # Keep every output part in the history so the next call sees it
            messages.append(part.to_dict())
            if part.type == "message":
                print(part.content)
            elif part.type == "function_call":
                has_function_call = True
                name = part.name
                args = part.arguments
                result = execute_tool(name, args, tools)
                print(f"Executed tool: {name}")
                # Return the tool result to the LLM, matched by call_id
                messages.append({
                    "type": "function_call_output",
                    "call_id": part.call_id,
                    "output": json.dumps(result)
                })
        # No tool call means the LLM has finished
        if not has_function_call:
            break
        steps += 1
    return messages
```
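
`coding_agent` returns the whole message history. To pull out just the assistant's last text reply, a small helper can walk the list backwards. This is a sketch: `final_text` is a hypothetical name, and it assumes each stored assistant message is a dict with a `content` list of `output_text` items, which is the shape expected from `part.to_dict()` on a message part.

```python
def final_text(messages: list) -> str:
    """Return the text of the most recent assistant message, or "" if none."""
    for item in reversed(messages):
        if item.get("type") != "message":
            continue  # skip user input, function calls, and tool outputs
        chunks = [
            piece.get("text", "")
            for piece in item.get("content", [])
            if isinstance(piece, dict) and piece.get("type") == "output_text"
        ]
        if chunks:
            return "".join(chunks)
    return ""


# Hand-written sample history (assumed shape, not real API output)
sample = [
    {"role": "user", "content": "Create a function that adds two numbers"},
    {"type": "function_call", "name": "execute_code", "arguments": "{}", "call_id": "c1"},
    {"type": "function_call_output", "call_id": "c1", "output": "{}"},
    {"type": "message", "role": "assistant",
     "content": [{"type": "output_text", "text": "Done: add(a, b) is defined."}]},
]
print(final_text(sample))
```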

# Step 3: ☁️ Sandbox Execution

Finally, move your agent to the cloud! Instead of running code locally, execute everything in an E2B sandbox: a secure, isolated environment suited to untrusted code.

- 🎯 **Goal:** Integrate E2B sandbox with your coding agent for safe cloud execution
- 🔁 **Workflow:** Create sandbox, modify execute_code to use sandbox, update agent to pass sandbox to tools
- 💡 **Remember:** Sandboxes are persistent: you can reconnect, query by metadata, and serve websites from them
- 💡 **Tip:** Use the JupyterAI chatbot with the **prompt examples below**
- 📎 **Attach these files:** `project.ipynb` and `docs.md` when asking the chatbot for help

---

## ✅ Your sandbox integration should include:

- ✓ Sandbox creation with appropriate timeout
- ✓ Modified execute_code function that uses `sbx.run_code()`
- ✓ Metadata handling for images and visualizations
- ✓ Agent function updated to accept and pass sandbox parameter
- ✓ Test execution with sample query

---

<details>
<summary><strong>üìö Refresher: E2B Sandbox Features (click to expand)</strong></summary>

E2B provides secure cloud sandboxes with these capabilities:

1. **Isolated execution** - Code runs in secure microVM
2. **File system** - Create, read, write, delete files
3. **Multiple languages** - Python, JavaScript, Bash
4. **Web hosting** - Serve applications on custom ports
5. **Persistent** - Reconnect to existing sandboxes by ID

**Why use sandboxes?**
- Execute untrusted LLM-generated code safely
- Avoid polluting your local environment
- Access pre-installed packages and tools
- Serve web applications with public URLs

</details>

<details>
<summary><strong>üìö Refresher: Sandbox Integration (click to expand)</strong></summary>

**Create Sandbox:**
```python
sbx = Sandbox.create(timeout=60 * 60)  # 1 hour
```

**Execute Code:**
```python
execution = sbx.run_code("print('Hello')")
result = execution.to_json()
```

**Handle Results:**
```python
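# Access stdout/stderr (captured in execution.logs)
print(execution.logs.stdout)
print(execution.logs.stderr)

# Access rich results (final expression value, charts, images)
print(execution.results)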

# Access images (PNG data)
for result in execution.results:
    if result.png:
        # Store base64 PNG data
        png_data = result.png
```

**Modified Agent Pattern:**
```python
def coding_agent(..., sbx: Sandbox):
    # Pass sandbox to execute_tool
    result = execute_tool(name, args, tools, sbx=sbx)
```

</details>
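
The key idea in the modified pattern is that `sbx` travels as a keyword argument the LLM never sees: the model supplies `code`, and your executor adds the sandbox. The offline sketch below illustrates only that hand-off. `FakeSandbox` and `FakeExecution` are hypothetical stand-ins that imitate the `run_code()` and `to_json()` calls shown above; they are not E2B classes and do not run anything.

```python
import json


class FakeExecution:
    """Hypothetical stand-in for the object returned by run_code()."""
    def __init__(self, code: str):
        self.code = code

    def to_json(self) -> str:
        return json.dumps({"ran": self.code})


class FakeSandbox:
    """Hypothetical stand-in for a sandbox; it records code instead of running it."""
    def __init__(self):
        self.calls = []

    def run_code(self, code: str) -> FakeExecution:
        self.calls.append(code)
        return FakeExecution(code)


def execute_code(code: str, sbx) -> tuple:
    execution = sbx.run_code(code)
    return execution.to_json(), {}


def execute_tool(name: str, args: str, tools: dict, **kwargs):
    # args come from the LLM; kwargs (such as sbx) come from your own code
    return tools[name](**json.loads(args), **kwargs)


sbx = FakeSandbox()
result, metadata = execute_tool("execute_code", '{"code": "print(1)"}',
                                {"execute_code": execute_code}, sbx=sbx)
print(result, sbx.calls)
```

Because the schema lists only `code`, the tool definition sent to the LLM stays the same whether the code runs locally or in a sandbox.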

---

📌 **Tip:** Test with a simple task first, then try complex multi-step projects!

---

<details>
<summary><strong>üåê Prompt Example ‚Äî Web Builder Sandbox (click to expand)</strong></summary>

```
You are my coding assistant. Generate Python code to integrate E2B sandbox with my Web Builder Agent and serve the result.
Return code only-no explanations, comments, or markdown.

Requirements:
1. Import display and IFrame from IPython.display, and import time
2. Modify execute_code to use sbx.run_code() instead of local exec()
3. Return execution.to_json() and metadata dict as tuple
4. Define execute_code_schema (same as before, LLM doesn't know about sbx parameter)
5. Modify execute_tool to pass sbx to all tool functions
6. Update coding_agent signature to include sbx: Sandbox parameter
7. Create sandbox with Sandbox.create(timeout=3600)
8. Define system prompt: "You are a web development agent. You MUST use execute_code to write all files to /home/user/ directory using Python code with open(). Never return code as text - always execute Python to write files."
9. Run agent with query "Create a simple calculator app with HTML/CSS/JS in index.html"
10. After agent completes, start HTTP server in /home/user directory: sbx.commands.run("cd /home/user && python -m http.server 3000 --bind 0.0.0.0", background=True)
11. Wait 3 seconds for server to start: time.sleep(3)
12. Get host URL with sbx.get_host(3000)
13. Use display(IFrame(f"https://{host}", width=800, height=600)) to render the calculator in notebook output

**Attachments:**
- `docs.md` for E2B + OpenAI documentation
- `project.ipynb` to see the progress of my project
```

</details>

```python
from IPython.display import display, IFrame
import time
from openai import OpenAI
from e2b_code_interpreter import Sandbox
import json

def execute_code(code: str, sbx: Sandbox):
    # Run the code inside the E2B sandbox instead of the local interpreter
    execution = sbx.run_code(code)
    metadata = {}
    return execution.to_json(), metadata

execute_code_schema = {
    "type": "function",
    "name": "execute_code",
    "description": "Execute Python code",
    "parameters": {
        "type": "object",
        "properties": {
            "code": {"type": "string", "description": "Python code to execute"}
        },
        "required": ["code"],
        "additionalProperties": False
    }
}

def execute_tool(name: str, args: str, tools: dict, **kwargs):
    try:
        args = json.loads(args)
        # Report unknown tools explicitly instead of as a KeyError
        if name not in tools:
            return {"error": f"Tool {name} not found."}, {}
        # kwargs carries sbx, which is not part of the schema the LLM sees
        result, metadata = tools[name](**args, **kwargs)
        return result, metadata
    except json.JSONDecodeError as e:
        return {"error": f"JSON decode error: {str(e)}"}, {}
    except KeyError as e:
        return {"error": f"Missing argument: {str(e)}"}, {}
    except Exception as e:
        return {"error": str(e)}, {}

def coding_agent(client: OpenAI, query: str, system: str, tools: dict, tools_schemas: list, sbx: Sandbox, max_steps: int = 5):
    messages = [{"role": "user", "content": query}]
    for step in range(max_steps):
        print(f"\n--- Step {step + 1} ---")
        response = client.responses.create(
            model="gpt-4.1-mini",
            input=[{"role": "developer", "content": system}, *messages],
            tools=tools_schemas
        )
        has_function_call = False
        for part in response.output:
            messages.append(part.to_dict())
            if part.type == "message":
                print(part.content)
            elif part.type == "function_call":
                has_function_call = True
                name = part.name
                result, metadata = execute_tool(name, part.arguments, tools, sbx=sbx)
                print(f"Executed tool: {name}")
                messages.append({
                    "type": "function_call_output",
                    "call_id": part.call_id,
                    "output": json.dumps(result)
                })
        if not has_function_call:
            break
    return messages

client = OpenAI()
sbx = Sandbox.create(timeout=3600)  # sandbox stays alive for 1 hour

tools = {"execute_code": execute_code}

system = "You are a web development agent. You MUST use execute_code to write all files to /home/user/ directory using Python code with open(). Never return code as text - always execute Python to write files."

query = "Create a simple calculator app with HTML/CSS/JS in index.html. The calculator should have result on top, '=' operator on bottom, and the rest in 4x4 layout"

messages = coding_agent(client, query, system, tools, [execute_code_schema], sbx=sbx)

# Serve the generated files from the sandbox and embed the page in the notebook
sbx.commands.run("cd /home/user && python -m http.server 3000 --bind 0.0.0.0", background=True)
time.sleep(3)
host = sbx.get_host(3000)
display(IFrame(f"https://{host}", width=800, height=600))
```


```
--- Step 1 ---

Executed tool: execute_code

--- Step 2 ---

[ResponseOutputText(annotations=[], text="The simple calculator app with the result on top, the '=' operator on the bottom spanning all columns, and the rest of the buttons in a 4x4 layout has been created in the file /home/user/index.html. You can open and use it in a web browser. Let me know if you need any changes or additional features!", type='output_text', logprobs=[])]
```
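
The fixed `time.sleep(3)` works when the server starts quickly, but polling until the server answers is more robust. A sketch of that idea, assuming two hypothetical helpers: `wait_until`, which retries any zero-argument check until it succeeds or a deadline passes, and `http_ready`, which builds such a check for a URL using the standard library.

```python
import time
import urllib.request


def wait_until(probe, timeout: float = 15.0, interval: float = 0.5) -> bool:
    """Call probe() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # treat errors (e.g. connection refused) as "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def http_ready(url: str):
    """Build a probe that checks whether url answers with HTTP 200."""
    def probe() -> bool:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    return probe


# Demo with a fake probe that succeeds on the third attempt
attempts = {"count": 0}


def flaky() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(flaky, timeout=5, interval=0.01))
```

In the notebook, the sleep could then be replaced by calling `wait_until(http_ready(f"https://{host}"))` once `host` has been obtained from `sbx.get_host(3000)`.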


---

## You've learned how to:

✅ Define tool schemas and execution functions for LLM function calling  
✅ Implement an agent loop with conversation memory and iterative reasoning  
✅ Integrate E2B sandboxes for safe cloud code execution  
✅ Handle tool results and build multi-step autonomous agents  

## Next Steps

Want to extend your project? Try:
- Adding more tools (file operations, web scraping, API calls)
- Implementing conversation memory persistence across sessions
- Creating specialized agents for different domains (data science, web dev, DevOps)
- Building multi-agent systems where agents collaborate on complex tasks
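
As a starting point for the persistence idea in the list above, the history returned by `coding_agent` is a list of dicts, so it can be stored as JSON between sessions. A minimal sketch, assuming every entry is JSON-serializable; `save_messages` and `load_messages` are hypothetical helper names.

```python
import json
import tempfile
from pathlib import Path


def save_messages(messages: list, path: str) -> None:
    """Write the conversation history to a JSON file."""
    Path(path).write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_messages(path: str) -> list:
    """Read a saved history, or start fresh if the file does not exist."""
    file = Path(path)
    if not file.exists():
        return []
    return json.loads(file.read_text(encoding="utf-8"))


# Round-trip demo in a temporary directory
with tempfile.TemporaryDirectory() as folder:
    path = str(Path(folder) / "history.json")
    save_messages([{"role": "user", "content": "hi"}], path)
    restored = load_messages(path)
    missing = load_messages(str(Path(folder) / "none.json"))
print(restored, missing)
```

A loaded history could be used as the starting value of `messages` in the agent loop so that a new session continues where the previous one stopped.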

**Resources:**
- [E2B Documentation](https://e2b.dev/docs)
- [OpenAI Function Calling Guide](https://platform.openai.com/docs/guides/function-calling)
- [E2B GitHub](https://github.com/e2b-dev/e2b)