# Code Interpreting with OpenAI models
This example uses E2B's [Code Interpreter](https://github.com/e2b-dev/code-interpreter) as a tool for an OpenAI model. You can choose any model with function-calling support, such as o1 or o3-mini.
We let the LLM write the code to analyze and visualize a temperature dataset from Kaggle. The E2B Code Interpreter SDK runs the LLM-generated code in a secure, isolated cloud environment.

In [21]:
%pip install openai e2b_code_interpreter==1.0.0


Note: you may need to restart the kernel to use updated packages.


In [22]:
import os
from dotenv import load_dotenv
from openai import OpenAI
import json
from e2b_code_interpreter import Sandbox

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
E2B_API_KEY = os.getenv("E2B_API_KEY")

SYSTEM_PROMPT = """
## your job & context
you are a python data scientist. you are given tasks to complete and you run python code to solve them.

Information about the temperature dataset:
- It's in the `/home/user/city_temperature.csv` file
- The CSV file is using `,` as the delimiter
- It has the following columns (examples included):
  - `Region`: "North America", "Europe"
  - `Country`: "Iceland"
  - `State`: for example "Texas" but can also be null
  - `City`: "Prague"
  - `Month`: "June"
  - `Day`: 1-31
  - `Year`: 2002
  - `AvgTemperature`: temperature in Celsius, for example 24

- the python code runs in jupyter notebook.
- every time you call `execute_python` tool, the python code is executed in a separate cell. it's okay to make multiple calls to `execute_python`.
- display visualizations using matplotlib or any other visualization library directly in the notebook. don't worry about saving the visualizations to a file.
- you have access to the internet and can make api requests.
- you also have access to the filesystem and can read/write files.
- you can install any pip package (if it exists) if you need to but the usual packages for data analysis are already preinstalled.
- you can run any python code you want, everything is running in a secure sandbox environment.
"""

tools = [
    {
        "type": "function",
        "function": {
            "name": "execute_python",
            "description": "Execute Python code in a Jupyter notebook cell and return any result, stdout, stderr, display_data, and error.",
            "parameters": {
                "type": "object",
                "properties": {
                    "code": {
                        "type": "string",
                        "description": "The python code to execute in a single cell."
                    }
                },
                "required": ["code"]
            }
        }
    }
]
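
The `tools` entry above declares one required string parameter, `code`. The model sends its arguments back as a JSON string, so it can help to validate that string before anything is executed. The following is a minimal sketch; `extract_code` is a hypothetical helper written for illustration and is not part of the OpenAI or E2B SDKs.

```python
import json


def extract_code(arguments: str) -> str:
    """Return the `code` value from a tool call's JSON arguments string.

    Hypothetical helper: it only mirrors the schema declared in `tools`.
    """
    try:
        payload = json.loads(arguments)
    except json.JSONDecodeError as error:
        raise ValueError(f"Tool arguments are not valid JSON: {error}") from error

    if not isinstance(payload, dict):
        raise ValueError("Tool arguments must be a JSON object.")

    code = payload.get("code")
    if not isinstance(code, str) or not code.strip():
        raise ValueError("Tool arguments must include a non-empty 'code' string.")
    return code


# Example with a hand-written arguments string (not real model output).
print(extract_code('{"code": "print(1 + 1)"}'))
```

Raising `ValueError` early keeps malformed arguments from reaching the sandbox, at the cost of one extra check per call.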

In [23]:
def code_interpret(code_interpreter, code):
    """Run `code` in the sandbox, stream its output, and return the results."""
    print("Running code interpreter...")

    # Named `execution` rather than `exec` to avoid shadowing the Python builtin.
    execution = code_interpreter.run_code(
        code,
        on_stderr=lambda stderr: print("[Code Interpreter]", stderr),
        on_stdout=lambda stdout: print("[Code Interpreter]", stdout)
    )

    if execution.error:
        print("[Code Interpreter ERROR]", execution.error)
        raise Exception(execution.error.value)

    return execution.results
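
`code_interpret` returns a list of results, and later in this notebook only the first one is checked for PNG data. If the generated code produces several plots, one option is to save every result that carries an image. The sketch below assumes only what this notebook already relies on, namely that a result may expose a base64-encoded `png` attribute. `save_png_results` is a hypothetical helper, and the demo uses stand-in objects rather than real sandbox results.

```python
import base64
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_png_results(results, out_dir=".", prefix="result"):
    """Write each result that has base64 PNG data to disk and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    saved = []
    for index, result in enumerate(results):
        png = getattr(result, "png", None)  # missing or empty means no image
        if not png:
            continue
        path = out / f"{prefix}_{index}.png"
        path.write_bytes(base64.b64decode(png))
        saved.append(path)
    return saved


# Stand-in results for illustration; the bytes are placeholders, not a real image.
fake_results = [
    SimpleNamespace(png=base64.b64encode(b"placeholder").decode()),
    SimpleNamespace(png=None),
]
with tempfile.TemporaryDirectory() as tmp_dir:
    print([path.name for path in save_png_results(fake_results, out_dir=tmp_dir)])
```

Results without image data are skipped rather than treated as errors, since text-only output is a normal outcome.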

In [24]:
client = OpenAI(api_key=OPENAI_API_KEY)

def process_tool_call(code_interpreter, tool_call):
    if tool_call.function.name == "execute_python":
        code = json.loads(tool_call.function.arguments)["code"]
        return code_interpret(code_interpreter, code)
    return []

def chat_with_llm(code_interpreter, user_message):
    print(f"\n{'='*50}\nUser Message: {user_message}\n{'='*50}")
    
    print('Waiting for the LLM to respond...')
    completion = client.chat.completions.create(
        model="o3",  # To use a different model, swap in the commented line below
        # model="o1",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message}
        ],
        tools=tools,
        tool_choice="auto"
    )
    
    message = completion.choices[0].message
    print('\nInitial Response:', message)
    
    if message.tool_calls:
        tool_call = message.tool_calls[0]
        print(f"\nTool Used: {tool_call.function.name}\nTool Input: {tool_call.function.arguments}")
        
        code_interpreter_results = process_tool_call(code_interpreter, tool_call)
        print(f"Tool Result: {code_interpreter_results}")
        return code_interpreter_results
    
    raise Exception('Tool calls not found in message content.')


def upload_dataset(code_interpreter):
    print('Uploading dataset to Code Interpreter sandbox...')
    dataset_path = './city_temperature.csv'
    
    if not os.path.exists(dataset_path):
        raise Exception('Dataset file not found')
    
    with open(dataset_path, 'rb') as f:
        file_buffer = f.read()
    
    try:
        remote_path = code_interpreter.files.write('city_temperature.csv', file_buffer)
        if not remote_path:
            raise Exception('Failed to upload dataset')
        print('Uploaded at', remote_path)
        return remote_path
    except Exception as error:
        print('Error during file upload:', error)
        raise
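
`chat_with_llm` above runs only the first entry in `message.tool_calls`. If the model returns several calls in one response, one option is to loop over all of them and dispatch by function name. The sketch below uses stand-in objects that mimic the `function.name` and `function.arguments` attributes used above; `run_all_tool_calls`, `fake_call`, and the `handlers` mapping are hypothetical names introduced for illustration.

```python
import json
from types import SimpleNamespace


def run_all_tool_calls(tool_calls, handlers):
    """Dispatch every tool call to its handler; unknown tools yield None."""
    outputs = []
    for tool_call in tool_calls:
        name = tool_call.function.name
        handler = handlers.get(name)
        if handler is None:
            outputs.append((name, None))
            continue
        arguments = json.loads(tool_call.function.arguments)
        outputs.append((name, handler(**arguments)))
    return outputs


def fake_call(name, arguments):
    """Build a stand-in object shaped like the tool calls used in this notebook."""
    return SimpleNamespace(
        function=SimpleNamespace(name=name, arguments=json.dumps(arguments))
    )


# Demo with a dummy handler instead of a real sandbox.
calls = [fake_call("execute_python", {"code": "1 + 1"}), fake_call("unknown_tool", {})]
handlers = {"execute_python": lambda code: f"would run: {code}"}
print(run_all_tool_calls(calls, handlers))
```

In the notebook itself, the handler for `execute_python` would wrap `code_interpret` with the active sandbox.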

In [25]:
import base64

def main():
    code_interpreter = Sandbox(api_key=E2B_API_KEY)
    
    try:
        # First upload the dataset
        remote_path = upload_dataset(code_interpreter)
        print('Remote path of the uploaded dataset:', remote_path)
        
        # Then execute your analysis
        code_interpreter_results = chat_with_llm(
            code_interpreter,
            'Analyze the temperature data for the top 5 hottest cities globally. Create a visualization showing their average temperatures over the years.'
        )
        
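        # The generated code may produce no displayable results, so guard the index.
        if not code_interpreter_results:
            raise Exception('No results returned from the code interpreter.')
        result = code_interpreter_results[0]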
        print('Result:', result)
        if hasattr(result, 'png') and result.png:
            with open('temperature_analysis.png', 'wb') as f:
                f.write(base64.b64decode(result.png))
            print('Success: Image generated and saved as temperature_analysis.png')
        else:
            print('Error: No PNG data available.')
            
    except Exception as error:
        print('An error occurred:', error)
        raise
    finally:
        code_interpreter.kill()

if __name__ == "__main__":
    main()

Uploading dataset to Code Interpreter sandbox...
Uploaded at EntryInfo(name='city_temperature.csv', type='file', path='/home/user/city_temperature.csv')
Remote path of the uploaded dataset: EntryInfo(name='city_temperature.csv', type='file', path='/home/user/city_temperature.csv')

User Message: Analyze the temperature data for the top 5 hottest cities globally. Create a visualization showing their average temperatures over the years.
Waiting for the LLM to respond...

Initial Response: ChatCompletionMessage(content=None, refusal=None, role='assistant', audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_OTCrG066LQJYJ8vBrYkmuPaJ', function=Function(arguments='{"code":"import pandas as pd\\nimport matplotlib.pyplot as plt\\nimport seaborn as sns\\n\\n# 1) Read the data\\nfilepath = \'/home/user/city_temperature.csv\'\\ndf = pd.read_csv(filepath)\\n\\n# 2) Convert the AvgTemperature column to numeric if needed\\n#   (assuming it\'s already numeric in Celsiu