# Part 15

# Using Code Interpreter

Setup code shared by the entire notebook.

In [3]:
# Uncomment the line below to make sure you have all the packages needed
# %pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [1]:
# Import necessary libraries
from openai import OpenAI  # Used for interacting with OpenAI's API
from typing_extensions import override  # Used for overriding methods in subclasses
from openai import AssistantEventHandler  # Used for handling events related to OpenAI assistants

In [2]:
# Create an instance of the OpenAI class to interact with the API.
# This assumes you have set the OPENAI_API_KEY environment variable.
client = OpenAI() 

In [20]:
# Event handler class to handle events related to streaming output from the assistant
class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        print("\nASSISTANT MESSAGE >\n", end="", flush=True)

    @override
    def on_tool_call_created(self, tool_call):
        print(f"\nASSISTANT MESSAGE >\n{tool_call.type}\n", flush=True)

    @override
    def on_message_done(self, message) -> None:
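        # Print the completed message, replacing annotation markers with numbered citations
        first_block = message.content[0]
        if first_block.type != "text":
            return  # e.g. an image block generated by Code Interpreter has no text to print
        message_content = first_block.text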
        annotations = message_content.annotations
        citations = []
        for index, annotation in enumerate(annotations):
            message_content.value = message_content.value.replace(
                annotation.text, f"[{index}]"
            )
            if file_citation := getattr(annotation, "file_citation", None):
                cited_file = client.files.retrieve(file_citation.file_id)
                citations.append(f"[{index}] {cited_file.filename}")

        print(message_content.value)
        print("\n".join(citations))
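
The annotation handling in `on_message_done` can be tried offline. The following is a minimal sketch, not the SDK's own types: `FakeAnnotation`, `number_annotations`, the `<ann0>` marker, and the `file-abc123` ID are hypothetical stand-ins, and the file ID is printed directly instead of being resolved through `client.files.retrieve`.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FakeAnnotation:
    """Hypothetical stand-in for an SDK annotation object."""
    text: str                      # marker string embedded in the message text
    file_id: Optional[str] = None  # set only when the annotation cites a file


def number_annotations(value: str, annotations: List[FakeAnnotation]) -> Tuple[str, List[str]]:
    """Replace each annotation marker with [index] and collect the cited file IDs."""
    citations = []
    for index, annotation in enumerate(annotations):
        value = value.replace(annotation.text, f"[{index}]")
        if annotation.file_id is not None:
            citations.append(f"[{index}] {annotation.file_id}")
    return value, citations


text, cites = number_annotations(
    "See the data<ann0> for details.",
    [FakeAnnotation(text="<ann0>", file_id="file-abc123")],
)
print(text)
print("\n".join(cites))
```

The handler above follows the same loop, except that it looks up each cited file's name through the API before printing.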

## Creating an Assistant with Code Interpreter Enabled

Our first step is to create an Assistant that can use Code Interpreter.

In [4]:
# Create an assistant using the client library.
assistant = client.beta.assistants.create(
    model="gpt-4o",  # Specify the model to be used.
    
    instructions=""" 
        You are a helpful assistant.
    """,
    
    name="Code Interpreter Assistant",  # Give the assistant a name.
    
    tools=[{"type": "code_interpreter"}], # Add the code interpreter capability to the assistant.
    
    metadata={  # Add metadata about the assistant's capabilities.
        "can_be_used_for_code_analysis": "True",
        "can_do_python": "True",
    },
    temperature=1,  # Set the temperature for response variability.
    top_p=1,  # Set the top_p for nucleus sampling.
)

# Print the details of the created assistant to check its properties.
print(assistant)  # Print the full assistant object.
print("\n\n")
print(assistant.name)  # Print the name of the assistant.
print(assistant.metadata)  # Print the metadata of the assistant.

Assistant(id='asst_GtoNnhX1Wkgpsr4SfxZ5xn6Y', created_at=1717707009, description=None, instructions=' \n        You are a helpful assistant.\n    ', metadata={'can_be_used_for_code_analysis': 'True', 'can_do_python': 'True'}, model='gpt-4o', name='Code Interpreter Assistant', object='assistant', tools=[CodeInterpreterTool(type='code_interpreter')], response_format='auto', temperature=1.0, tool_resources=ToolResources(code_interpreter=ToolResourcesCodeInterpreter(file_ids=[]), file_search=None), top_p=1.0)



Code Interpreter Assistant
{'can_be_used_for_code_analysis': 'True', 'can_do_python': 'True'}


## Passing Files to Code Interpreter

There are two main ways to make files available to Code Interpreter:
- Assistant files - available to every run that uses the assistant.
- Thread files - available only to runs that use the thread.

Let's review the code for the two main approaches.
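
Both approaches send the same `tool_resources` shape; only the object that receives it differs. As a small sketch, the hypothetical helper `code_interpreter_resources` below builds that payload. The file IDs are placeholders, not real uploads.

```python
from typing import Dict, List


def code_interpreter_resources(file_ids: List[str]) -> Dict[str, Dict[str, List[str]]]:
    """Build the tool_resources payload for Code Interpreter files."""
    return {"code_interpreter": {"file_ids": list(file_ids)}}


# Same payload shape, different scope depending on where it is sent:
assistant_scope = code_interpreter_resources(["file-assistant-example"])  # sent to an assistant
thread_scope = code_interpreter_resources(["file-thread-example"])        # sent to a thread
print(assistant_scope)
print(thread_scope)
```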

### Getting Files to the Assistant

First, upload a file so that we can pass it to our assistant.

In [6]:
# Upload a file with an "assistants" purpose
assistant_file = client.files.create(
    file=open("./artifacts/penguins_size.csv", "rb"),
    purpose='assistants'
)

print(assistant_file)

FileObject(id='file-aubKXKYvgIqRGBNIIqGp5tkA', bytes=13519, created_at=1717716265, filename='penguins_size.csv', object='file', purpose='assistants', status='processed', status_details=None)
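
The upload cell leaves the file handle open. One possible variation, sketched below, wraps the call in a `with` block so the handle is closed once the upload returns. The helper name `upload_for_assistants` is hypothetical, and `StubFiles` is a stand-in for the real client so the sketch runs without network access.

```python
import os
import tempfile
from types import SimpleNamespace


def upload_for_assistants(client, path):
    """Upload a file with purpose='assistants', closing the handle afterwards."""
    with open(path, "rb") as handle:
        return client.files.create(file=handle, purpose="assistants")


class StubFiles:
    """Hypothetical stub that mimics the shape of client.files for offline use."""

    def create(self, file, purpose):
        return SimpleNamespace(
            id="file-stub",
            filename=os.path.basename(file.name),
            bytes=len(file.read()),
            purpose=purpose,
        )


stub_client = SimpleNamespace(files=StubFiles())

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "example.csv")
    with open(csv_path, "wb") as f:
        f.write(b"a,b\n1,2\n")
    uploaded = upload_for_assistants(stub_client, csv_path)

print(uploaded.filename, uploaded.bytes, uploaded.purpose)
```

In the notebook, the same helper could be called with the real `client` and `"./artifacts/penguins_size.csv"`.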


Next, we need to modify our Assistant with the new file information. 

In [8]:
assistant = client.beta.assistants.update(
    assistant_id=assistant.id,
    tools=[{"type": "code_interpreter"}],
    tool_resources={
        "code_interpreter": {
            "file_ids": [assistant_file.id]
        }
    }
)

print(assistant)

Assistant(id='asst_GtoNnhX1Wkgpsr4SfxZ5xn6Y', created_at=1717707009, description=None, instructions=' \n        You are a helpful assistant.\n    ', metadata={'can_be_used_for_code_analysis': 'True', 'can_do_python': 'True'}, model='gpt-4o', name='Code Interpreter Assistant', object='assistant', tools=[CodeInterpreterTool(type='code_interpreter')], response_format='auto', temperature=1.0, tool_resources=ToolResources(code_interpreter=ToolResourcesCodeInterpreter(file_ids=['file-aubKXKYvgIqRGBNIIqGp5tkA']), file_search=None), top_p=1.0)


Finally, let's create a thread with a message, run it, and check that the assistant can read the file.

In [21]:
# Need a thread to send message and get output
assistant_thread = client.beta.threads.create(
    messages=[
        {
            "role": "user",
            "content": "Give me a summary of the file penguins_size.csv."
        },
    ]
)

In [25]:
# stream the output from the assistant
with client.beta.threads.runs.stream(
    thread_id=assistant_thread.id,
    assistant_id=assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()


ASSISTANT MESSAGE >
Here is a concise summary of the penguin data:

- **Total Entries:** 344
- **Columns and non-null counts:**
  - `species`: 344 (Three unique species: Adelie, Chinstrap, Gentoo)
  - `island`: 344 (Three unique islands: Biscoe, Dream, Torgersen)
  - `culmen_length_mm`: 342 (length of the bill, mean: 43.92 mm, range: 32.1 - 59.6 mm)
  - `culmen_depth_mm`: 342 (depth of the bill, mean: 17.15 mm, range: 13.1 - 21.5 mm)
  - `flipper_length_mm`: 342 (length of the flipper, mean: 200.92 mm, range: 172 - 231 mm)
  - `body_mass_g`: 342 (body mass, mean: 4201.75 g, range: 2700 - 6300 g)
  - `sex`: 334 (MALE, FEMALE)

The dataset contains measurements related to penguins' physical features from different islands. The data includes some missing values in the `culmen_length_mm`, `culmen_depth_mm`, `flipper_length_mm`, `body_mass_g`, and `sex` columns.

Would you like any specific analysis or visualization of this data?
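
The counts the assistant reports can be spot-checked locally. The sketch below uses only the standard library and a tiny inline sample rather than the real file. Treating blank cells and the literal `NA` as missing is an assumption about how the CSV encodes gaps, and `non_null_counts` is a hypothetical helper name.

```python
import csv
import io

# Assumption: blank cells and the literal "NA" mark missing values.
MISSING = {"", "NA"}


def non_null_counts(csv_text):
    """Return (row_count, {column: non-missing count}) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in counts:
            value = row.get(name)
            if value is not None and value.strip() not in MISSING:
                counts[name] += 1
    return rows, counts


# Tiny illustrative sample, not the real penguins_size.csv.
sample = """species,island,body_mass_g,sex
Adelie,Torgersen,3750,MALE
Adelie,Torgersen,NA,NA
Gentoo,Biscoe,5000,
"""
rows, counts = non_null_counts(sample)
print(rows, counts)
```

To check the real data, read `./artifacts/penguins_size.csv` into a string and pass it to the same function.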



### Getting Files to the Thread

First, we need to upload a file.


In [9]:
# Upload a file with an "assistants" purpose
thread_file = client.files.create(
    file=open("./artifacts/daily-bike-share.csv", "rb"),
    purpose='assistants'
)

print(thread_file)

FileObject(id='file-fEyocpUVgvY1tdBm8VJUwYfe', bytes=43599, created_at=1717717460, filename='daily-bike-share.csv', object='file', purpose='assistants', status='processed', status_details=None)


Second, we need a thread to attach the file to.

In [26]:
thread = client.beta.threads.create(
    messages=[
        {
            "role": "user",
            "content": "Give me a summary of the daily-bike-share.csv file."
        },
    ]
)

print(thread)

Thread(id='thread_kmjCFINMwQwy57imb6Fw3io5', created_at=1717719858, metadata={}, object='thread', tool_resources=ToolResources(code_interpreter=None, file_search=None))


Third, we can update the thread with the file information.

In [27]:
updated_thread = client.beta.threads.update(
    thread_id=thread.id,
    tool_resources={
        "code_interpreter": {
            "file_ids": [thread_file.id]
        }
    }
)

print(updated_thread)

Thread(id='thread_kmjCFINMwQwy57imb6Fw3io5', created_at=1717719858, metadata={}, object='thread', tool_resources=ToolResources(code_interpreter=ToolResourcesCodeInterpreter(file_ids=['file-fEyocpUVgvY1tdBm8VJUwYfe']), file_search=None))
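
The second and third steps can likely be merged, assuming `client.beta.threads.create` accepts `tool_resources` in the same shape as the update call. The sketch below only builds the keyword arguments, so it runs offline. `thread_create_kwargs` is a hypothetical helper and the file ID is a placeholder.

```python
from typing import Any, Dict, List


def thread_create_kwargs(prompt: str, file_ids: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for creating a thread that already carries its files."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "tool_resources": {"code_interpreter": {"file_ids": list(file_ids)}},
    }


kwargs = thread_create_kwargs(
    "Give me a summary of the daily-bike-share.csv file.",
    ["file-example-id"],  # placeholder; in the notebook this would be thread_file.id
)
print(kwargs)
# In the notebook, this would be passed as: client.beta.threads.create(**kwargs)
```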


Finally, let's run it against a new assistant and see the results.

In [31]:
# Create an assistant using the client library.
thread_assistant = client.beta.assistants.create(
    model="gpt-4o",  # Specify the model to be used.
    
    instructions=""" 
        You are a helpful assistant.
    """,
    
    name="Code Interpreter Assistant Using Thread Data",  # Give the assistant a name.
    
    tools=[{"type": "code_interpreter"}], # Add the code interpreter capability to the assistant.
    
    metadata={  # Add metadata about the assistant's capabilities.
        "can_be_used_for_code_analysis": "True",
        "can_do_python": "True",
    },
    temperature=1,  # Set the temperature for response variability.
    top_p=1,  # Set the top_p for nucleus sampling.
)

# Print the details of the newly created assistant to check its properties.
print(thread_assistant)  # Print the full assistant object.
print("\n\n")
print(thread_assistant.name)  # Print the name of the assistant.
print(thread_assistant.metadata)  # Print the metadata of the assistant.

Assistant(id='asst_ku6IFnWYEXdVbtf6PMOnenz5', created_at=1717719872, description=None, instructions=' \n        You are a helpful assistant.\n    ', metadata={'can_be_used_for_code_analysis': 'True', 'can_do_python': 'True'}, model='gpt-4o', name='Code Interpreter Assistant Using Thread Data', object='assistant', tools=[CodeInterpreterTool(type='code_interpreter')], response_format='auto', temperature=1.0, tool_resources=ToolResources(code_interpreter=ToolResourcesCodeInterpreter(file_ids=[]), file_search=None), top_p=1.0)



Code Interpreter Assistant Using Thread Data
{'can_be_used_for_code_analysis': 'True', 'can_do_python': 'True'}


In [32]:
# stream the output from the assistant
with client.beta.threads.runs.stream(
    thread_id=updated_thread.id,
    assistant_id=thread_assistant.id,
    event_handler=EventHandler(),
) as stream:
    stream.until_done()


ASSISTANT MESSAGE >
code_interpreter


ASSISTANT MESSAGE >
Here's a summary of the `daily-bike-share.csv` file:

- **day**: Day of the month
  - Total entries: 731
  - Mean: 15.74
  - Standard Deviation: 8.81
  - Min: 1
  - Max: 31

- **mnth**: Month (1 to 12)
  - Total entries: 731
  - Mean: 6.52
  - Standard Deviation: 3.45
  - Min: 1
  - Max: 12

- **year**: Year (2011 or 2012)
  - Total entries: 731
  - Mean: 2011.50
  - Standard Deviation: 0.5
  - Min: 2011
  - Max: 2012

- **season**: Season (1: winter, 2: spring, 3: summer, 4: fall)
  - Total entries: 731
  - Mean: 2.50
  - Standard Deviation: 1.11
  - Min: 1
  - Max: 4

- **holiday**: Whether the day is a holiday (0: No, 1: Yes)
  - Total entries: 731
  - Mean (percentage of holidays): ~2.87%
  - Standard Deviation: 0.17 (most days are not holidays)
  - Min: 0
  - Max: 1

- **weekday**: Day of the week (0 to 6)
  - Total entries: 731
  - Mean: 3
  - Standard Deviation: 2.00
  - Min: 0 (Sunday)
  - Max: 6 (Saturday)

- **workin