# OpenAI Assistants - Building Agentic RAG with the Function Calling, Retrieval, and Code Interpreter Tools

Today we'll explore using OpenAI's Python SDK to create, manage, and use Assistants through the OpenAI Assistants API!

We'll be doing the following in today's notebook:

1. Task 1: Simple Assistant
2. Task 2: Adding Tools
  - Task 2a: Creating an Assistant with File Search Tool
  - Task 2b: Creating an Assistant with Code Interpreter Tool
  - Task 2c: Creating an Assistant with Function Calling Tool



---

Colab Specific Instructions:

To get started, please make a copy of this notebook using `File > Save a copy in Drive`

![image](https://i.imgur.com/rNzMEfs.png)

You will be expected to submit a GitHub link to the completed notebook, so if you're completing the assignment on Colab - you'll want to download the notebook when you have completed it.

## Dependencies

We'll start, as we usually do, with some dependencies and our API key!

In [1]:
!pip install -qU openai




In [2]:
from getpass import getpass
import os

os.environ["OPENAI_API_KEY"] = getpass("OpenAI API Key:")

## Task 1: Simple Assistant

Let's start by creating a simple Assistant to understand more about how the API works!

### OpenAI Client

At the core of the OpenAI Python SDK is the Client!

> NOTE: For ease of use, we'll start with the synchronous `OpenAI()`. OpenAI does provide an `AsyncOpenAI()` that you could leverage as well!

In [3]:
from openai import OpenAI

client = OpenAI()

### Creating An Assistant

Leveraging what we know about the OpenAI API from previous sessions - we're going to start by simply initializing an Assistant.

Before we begin, we need to think about a few customization options we have:

- `name` - Straightforward enough, this is what our Assistant's name will be
- `instructions` - similar to a system message, but applied at an Assistant level, this is how we can guide the Assistant's tone, behaviour, functionality, and more!
- `model` - this will allow us to choose which model we would prefer to use for our Assistant

Let's start by setting some instructions for our Assistant.



In [4]:
# @markdown #### 🏗️ Build Activity 🏗️
# @markdown Fill out the fields below to add your Assistant's name, instructions, and desired model!

name = "openAI-Assistant" # @param {type: "string"}
instructions = "you have been an awesome assistant, perform through validations of inputs" # @param {type: "string"}
model = "gpt-4o" # @param ["gpt-3.5-turbo", "gpt-4-turbo-preview", "gpt-4", "gpt-4o"]

### Initialize Assistant

Now that we have our desired name, instruction, and model - we can initialize our Assistant!

In [5]:
assistant = client.beta.assistants.create(
    name=name,
    instructions=instructions,
    model=model,
)

Let's examine our `assistant` object and see what we find!

In [6]:
assistant

Assistant(id='asst_g3AVZnuT0koO2fkvCwo3KcLt', created_at=1720936525, description=None, instructions='you have been an awesome assistant, perform through validations of inputs', metadata={}, model='gpt-4o', name='openAI-Assistant', object='assistant', tools=[], response_format='auto', temperature=1.0, tool_resources=ToolResources(code_interpreter=None, file_search=None), top_p=1.0)

There are a number of useful parameters here, but we'll call out a few:

- `id` - since we may have multiple Assistants, knowing which Assistant we're interacting with will help us ensure the desired user experience!
- `description` - A natural language description of our Assistant could help others understand what it's supposed to do!
- `tool_resources` - if we wanted to use the File Search or Code Interpreter tools, this would let us know what vector stores or files we had given our Assistant

### Creating a Thread

Behind the scenes our Assistant is powered by the idea of "threads".

You can think of threads as individual conversations that interact with the Assistant.

Let's create a thread now!

In [7]:
thread = client.beta.threads.create()

Let's look at our `thread` object.

In [8]:
thread

Thread(id='thread_mai08DKi5T8Q95mV3ZVd2M4Q', created_at=1720936564, metadata={}, object='thread', tool_resources=ToolResources(code_interpreter=None, file_search=None))

Notice some key attributes:

- `id` - since each Thread is like a conversation, we need some way to specify which thread we're dealing with when interacting with them
- `tool_resources` - this will become more relevant as we add tools, since it shows which files and vector stores those tools can use within this Thread

### Adding Messages to Our Thread

Now that we have our Thread (or conversation) we can start adding messages to it!

Let's add a simple message that asks about how our Assistant is feeling.

Notice the parameters we're leveraging:

- `thread_id` - since each Thread is like a conversation, we need some way to address a specific conversation. We can use `thread.id` to do this.
- `role` - similar to when we used our chat completions endpoint, this parameter specifies who the message is coming from. You can leverage this in the same ways you would through the chat completions endpoint.
- `content` - this is where we can place the actual text our Assistant will interact with

> NOTE: Feel free to substitute a relevant message based on the Assistant you created

In [9]:
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="How old are you?"
)

Again, let's examine our `message` object!

In [10]:
message

Message(id='msg_2tqdxnIkpXE5bF1nM88t5lVN', assistant_id=None, attachments=[], completed_at=None, content=[TextContentBlock(text=Text(annotations=[], value='How old are you?'), type='text')], created_at=1720936648, incomplete_at=None, incomplete_details=None, metadata={}, object='thread.message', role='user', run_id=None, status=None, thread_id='thread_mai08DKi5T8Q95mV3ZVd2M4Q')

### Running Our Thread

Now that we have an Assistant, and we've given that Assistant a Thread, and we've added a Message to that Thread - we're ready to run our Assistant!

Notice that this process lets us add (potentially) multiple messages to our Assistant. We can leverage that behaviour for few/many-shot examples, and more!
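As a minimal sketch of the few-shot idea, the helper below turns example question/answer pairs into a list of message dictionaries. The name `build_few_shot_messages` and the sample questions are hypothetical, introduced only for illustration; the sketch assumes the Thread accepts both `user` and `assistant` roles for pre-filled messages.

```python
def build_few_shot_messages(examples, question):
    """Turn (user_text, assistant_text) pairs plus a final question into thread messages."""
    messages = []
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question goes last so that the run responds to it
    messages.append({"role": "user", "content": question})
    return messages


few_shot = build_few_shot_messages(
    examples=[("Is 7 prime?", "Yes. 7 has no divisors other than 1 and 7.")],
    question="Is 9 prime?",
)
```

A list like `few_shot` could then be passed as `client.beta.threads.create(messages=few_shot)` to start a Thread that already contains the examples.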

In [11]:
# @markdown #### 🏗️ Build Activity 🏗️
# @markdown We can also override the Assistant's instructions when we run a thread.

# @markdown Use one of the [Prompt Principles for Instruction](https://arxiv.org/pdf/2312.16171v1.pdf) to improve the likelihood of a correct or valuable response from your Assistant.

additional_instructions = "step by step instructions validations would be absolutely required" # @param {type: "string"}

Let's run our Thread!

In [12]:
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions=additional_instructions
)

Now that we've run our thread, let's look at the object!

In [13]:
run

Run(id='run_xZ7dcuoznCKZ9bxorTjuhdfX', assistant_id='asst_g3AVZnuT0koO2fkvCwo3KcLt', cancelled_at=None, completed_at=None, created_at=1720936694, expires_at=1720937294, failed_at=None, incomplete_details=None, instructions='step by step instructions validations would be absolutely required', last_error=None, max_completion_tokens=None, max_prompt_tokens=None, metadata={}, model='gpt-4o', object='thread.run', parallel_tool_calls=True, required_action=None, response_format='auto', started_at=None, status='queued', thread_id='thread_mai08DKi5T8Q95mV3ZVd2M4Q', tool_choice='auto', tools=[], truncation_strategy=TruncationStrategy(type='auto', last_messages=None), usage=None, temperature=1.0, top_p=1.0, tool_resources={})

Notice we have access to a few very powerful parameters in this `run` object.

- `completed_at` - this stays `None` until the run finishes, so it tells us whether (and when) a response became available to retrieve
- `failed_at` - this can highlight any issues our run ran into
- `status` - is another way we can understand how the flow is going

### Retrieving Our Run

Now that we've created our run, let's retrieve it.

We're going to wrap this in a simple loop to make sure we're not retrieving it too early.

In [14]:
import time

while run.status == "in_progress" or run.status == "queued":
  time.sleep(1)
  run = client.beta.threads.runs.retrieve(
    thread_id=thread.id,
    run_id=run.id
  )
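A polling loop like this can wait forever if a run gets stuck, so a timeout is often worth adding. The sketch below is one possible variant, not the SDK's own helper: `wait_for_run`, `SETTLED_STATUSES`, and the default intervals are hypothetical names and values, and the status set is an assumption based on the run statuses documented for the Assistants API.

```python
import time

# Statuses after which a run will not progress further without our intervention.
# Assumption: this set matches the run statuses documented for the Assistants API.
SETTLED_STATUSES = {
    "completed", "failed", "cancelled", "expired", "incomplete", "requires_action",
}


def wait_for_run(client, thread_id, run_id, poll_interval=1.0, timeout=120.0):
    """Poll a run until it settles, raising TimeoutError if it takes too long."""
    deadline = time.monotonic() + timeout
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in SETTLED_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run {run_id} is still '{run.status}' after {timeout} seconds")
        time.sleep(poll_interval)
```

With the objects from this notebook, the call would look like `run = wait_for_run(client, thread.id, run.id)`.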

In [15]:
print(run.status)

completed


Now that our run is completed - we can retrieve the messages from our thread!

Notice that our run helps us understand how things are going - but it isn't where we're going to find our responses or messages. Those are added on the backend into our thread.

This leads to a simple, but important, flow:

1. We add messages to a thread.
2. We create a run on that thread.
3. We wait until the run is finished.
4. We check our thread for the new messages.
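The four steps above can be sketched as a single helper. This is only an illustration: the function name `ask` is hypothetical, and it assumes an SDK version that provides `runs.create_and_poll`, which creates the run and waits for it in one call.

```python
def ask(client, assistant_id, thread_id, question):
    """Run the four-step flow once and return the Assistant's reply text."""
    # 1. Add a message to the thread
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=question
    )
    # 2 and 3. Create a run on the thread and wait until it is finished
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread_id, assistant_id=assistant_id
    )
    if run.status != "completed":
        raise RuntimeError(f"Run ended with status '{run.status}'")
    # 4. Check the thread for the messages this run added (listed newest first)
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    replies = [
        m for m in reversed(messages.data)
        if m.role == "assistant" and m.run_id == run.id
    ]
    return "\n".join(
        block.text.value
        for m in replies
        for block in m.content
        if block.type == "text"
    )
```

Filtering on `run_id` keeps replies from earlier runs on the same thread out of the result.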

### Checking Our Thread

Now we can get a list of messages from our thread!

In [16]:
messages = client.beta.threads.messages.list(
  thread_id=thread.id
)

In [17]:
messages.data[0]

Message(id='msg_gosx71HgMM0VdTsXmS8T0UFf', assistant_id='asst_g3AVZnuT0koO2fkvCwo3KcLt', attachments=[], completed_at=None, content=[TextContentBlock(text=Text(annotations=[], value='I\'m an artificial intelligence and don\'t age in the same way humans do. I was created by OpenAI and have been trained on data up to October 2021, but I don\'t have a specific "birthday" or age. How can I assist you today?'), type='text')], created_at=1720936695, incomplete_at=None, incomplete_details=None, metadata={}, object='thread.message', role='assistant', run_id='run_xZ7dcuoznCKZ9bxorTjuhdfX', status=None, thread_id='thread_mai08DKi5T8Q95mV3ZVd2M4Q')
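The reply text is nested inside the message's `content` list, so it can help to pull it out explicitly. Here is a minimal sketch; the helper name `message_text` is hypothetical, and it assumes every text block has the `type='text'` and `text.value` shape shown in the output above.

```python
def message_text(message):
    """Join the text of every text block in a message, skipping non-text blocks."""
    return "".join(
        block.text.value for block in message.content if block.type == "text"
    )
```

For example, `print(message_text(messages.data[0]))` would print only the Assistant's answer rather than the whole `Message` object.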

### Streaming Our Runs

With recent upgrades to the Assistant API - we can now *stream* our outputs!

In order to do this - we'll need something called an `EventHandler` which will help us to decide on what actions to take based on the output of the LLM.

Let's build it below!

In [18]:
from typing_extensions import override
from openai import AssistantEventHandler

class EventHandler(AssistantEventHandler):
  @override
  def on_text_created(self, text) -> None:
    print(f"\nassistant > ", end="", flush=True)

  @override
  def on_text_delta(self, delta, snapshot):
    print(delta.value, end="", flush=True)

Now we can create our `run` and stream the output as it comes in!

In [19]:
with client.beta.threads.runs.stream(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions=additional_instructions,
  event_handler=EventHandler(),
) as stream:
  stream.until_done()


assistant > I'm an artificial intelligence and don't age in the same way humans do. My training data goes up until October 2023, but I don't have a specific "birthday" or age. How can I assist you today?

## Task 2: Adding Tools

Now that we have an understanding of how an Assistant works, we can start thinking about adding tools.

We'll go through 3 separate tools and explore how we can leverage them!

Let's start with the most familiar tool - File Search, which handles retrieval!


### Task 2a: Creating an Assistant with the File Search Tool

The first thing we'll want to do is create an assistant with the File Search tool.

This is also going to require some data. We've provided data - but you're very much encouraged to use your own files to explore how the Assistant works for your use case.

#### Collect and Add Data to Vector Store

1. First, we need some data.
2. Second, we need to add the data to our Assistant!

Let's start with grabbing some data!

In [20]:
pip install wget



In [21]:
# Download the raw text file; the /blob/ URL would return GitHub's HTML page instead of the essays
!wget https://raw.githubusercontent.com/dbredvick/paul-graham-to-kindle/main/paul_graham_essays.txt


Now we can upload our file to our Vector Store!

Pay attention to [this](https://platform.openai.com/docs/assistants/tools/file-search/supported-files) documentation to see what kinds of files can be uploaded.

> NOTE: Per the OpenAI [docs](https://platform.openai.com/docs/assistants/tools/file-search/vector-stores), the maximum file size is 512 MB, and a file may contain no more than 2,000,000 tokens (computed automatically when you attach a file).

In [22]:
vector_store = client.beta.vector_stores.create(name="Paul Graham Essay Compilation")

In [23]:
file_paths = ["paul_graham_essays.txt"]
file_streams = [open(path, "rb") for path in file_paths]

file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
  vector_store_id=vector_store.id, files=file_streams
)
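The cell above leaves its file handles open. One way to close them automatically, sketched under the assumption that the same `upload_and_poll` call is used, is `contextlib.ExitStack`; the helper name `upload_to_vector_store` is hypothetical.

```python
from contextlib import ExitStack


def upload_to_vector_store(client, vector_store_id, file_paths):
    """Upload files to a vector store, closing every file handle afterwards."""
    with ExitStack() as stack:
        # Each opened file is registered with the stack, which closes it on exit
        streams = [stack.enter_context(open(path, "rb")) for path in file_paths]
        return client.beta.vector_stores.file_batches.upload_and_poll(
            vector_store_id=vector_store_id, files=streams
        )
```

With the objects from this notebook, the call would be `file_batch = upload_to_vector_store(client, vector_store.id, file_paths)`.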

Let's look at what our `file_batch` contains!

In [24]:
# upload_and_poll has already waited for the batch to finish, so we can inspect it directly
print(file_batch.status)
print(file_batch.file_counts)

#### Create and Use Assistant

Now that we have our file - we can attach it to an Assistant, and we can give that Assistant the ability to use it for retrieval through the File Search tool!

> NOTE: Your first GB is free and beyond that, usage is billed at $0.10/GB/day of vector storage. There are no other costs associated with vector store operations.

In [25]:
fs_assistant = client.beta.assistants.create(
  name=name,
  instructions=instructions,
  model=model,
  tools=[{"type": "file_search"}],
)

In [26]:
fs_assistant = client.beta.assistants.update(
  assistant_id=fs_assistant.id,
  tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

In [27]:
fs_thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "What did Paul Graham say about Silicon Valley?",
    }
  ]
)

We can use an extension of the `EventHandler` that we created above to stream our `run`!

Let's add a few things:

1. An `on_tool_call_created` method which tells us which tool is being used.
2. An `on_message_done` method that includes the citations used by our File Search tool - this is like returning the context *and* the response, as we saw in our Pythonic RAG implementation.

In [29]:
class FSEventHandler(AssistantEventHandler):
  @override
  def on_text_created(self, text) -> None:
    print(f"\nassistant > ", end="", flush=True)

  @override
  def on_tool_call_created(self, tool_call):
    print(f"\nassistant > {tool_call.type}\n", flush=True)

  @override
  def on_message_done(self, message) -> None:
    message_content = message.content[0].text
    annotations = message_content.annotations
    citations = []
    for index, annotation in enumerate(annotations):
      message_content.value = message_content.value.replace(
        annotation.text, f"[{index}]"
      )
      if file_citation := getattr(annotation, "file_citation", None):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f"[{index}] {cited_file.filename}")

    print(message_content.value)
    print("\n".join(citations))
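To see the citation handling in isolation, the same replacement logic can be written as a plain function and run on stand-in data. Everything in the demo is hypothetical: the `<<a>>` and `<<b>>` markers, the file id, and the filename stand in for whatever the API actually returns.

```python
from types import SimpleNamespace


def replace_citations(text, annotations, filename_for):
    """Swap each annotation marker for "[n]" and build a matching citation list."""
    citations = []
    for index, annotation in enumerate(annotations):
        text = text.replace(annotation.text, f"[{index}]")
        file_citation = getattr(annotation, "file_citation", None)
        if file_citation is not None:
            citations.append(f"[{index}] {filename_for(file_citation.file_id)}")
    return text, citations


# Stand-in annotations with made-up markers and a made-up file id
demo_annotations = [
    SimpleNamespace(text="<<a>>", file_citation=SimpleNamespace(file_id="file-1")),
    SimpleNamespace(text="<<b>>", file_citation=SimpleNamespace(file_id="file-1")),
]
demo_text, demo_citations = replace_citations(
    "Startups do better in the Valley<<a>>. Its culture is collaborative<<b>>.",
    demo_annotations,
    filename_for=lambda file_id: "paul_graham_essays.txt",
)
print(demo_text)       # Startups do better in the Valley[0]. Its culture is collaborative[1].
print(demo_citations)  # ['[0] paul_graham_essays.txt', '[1] paul_graham_essays.txt']
```

Passing the filename lookup in as a function keeps the logic testable without a network call; in the handler, that lookup is `client.files.retrieve(...).filename`.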

Let's look at the final result!

In [30]:
with client.beta.threads.runs.stream(
  thread_id=fs_thread.id,
  assistant_id=fs_assistant.id,
  event_handler=FSEventHandler(),
) as stream:
  stream.until_done()


assistant > file_search


assistant > Paul Graham has emphasized several key points about Silicon Valley in his writings:

1. **Startup Hub Superiority**: Graham argues that startups would generally perform better if they moved to Silicon Valley as it offers a more conducive environment for growth and success compared to other cities, including other major U.S. cities like Boston. He explains that this is not a nationalistic view, but a pragmatic one based on the advantages of being in a place where there is extensive infrastructure and a dense concentration of knowledgeable people and resources specific to startup needs[0][1].

2. **Unique Culture**: The unique cultural environment of Silicon Valley encourages innovation and support among startups. Unlike other industries where competition might mean secrecy and cutthroat tactics, the culture in Silicon Valley is more collaborative, with a high degree of information sharing and mutual aid among entrepreneurs. This collaborative natur

### Task 2b: Creating an Assistant with the Code Interpreter Tool

Now that we've explored the File Search tool - let's try the Code Interpreter tool!

The process will be almost exactly the same - but we can explore a different query, and we'll add our file at the Message level!

In [31]:
ci_assistant = client.beta.assistants.create(
  name=name + " + Code Interpreter",
  instructions=instructions,
  model=model,
  tools=[{"type": "code_interpreter"}],
)

In [32]:
!git clone https://github.com/ali-ce/datasets.git

fatal: destination path 'datasets' already exists and is not an empty directory.


In [33]:
file = client.files.create(
  file=open("datasets/Y-Combinator/Startups.csv", "rb"),
  purpose='assistants'
)

In the following example, we'll also see how we can package the Thread creation with the Message adding step!

> NOTE: Files added at the message/thread level will not be available to the Assistant outside of that Thread.

In [34]:
ci_thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "What kind of file is this?",
      "attachments": [
          {
              "file_id" : file.id,
              "tools" : [{"type" : "code_interpreter"}]
          }
      ]
    }
  ]
)

We'll once again need to create an `EventHandler`, except this time it will have an `on_tool_call_delta` method which will let us see the output of the code interpreter tool as well!

> NOTE: Remember that we create runs at the *thread* level - and so don't need the message object to continue.

In [35]:
class CIEventHandler(AssistantEventHandler):
  @override
  def on_text_created(self, text) -> None:
    print(f"\nassistant > ", end="", flush=True)

  @override
  def on_text_delta(self, delta, snapshot):
    print(delta.value, end="", flush=True)

  def on_tool_call_created(self, tool_call):
    print(f"\nassistant > {tool_call.type}\n", flush=True)

  def on_tool_call_delta(self, delta, snapshot):
    if delta.type == 'code_interpreter':
      if delta.code_interpreter.input:
        print(delta.code_interpreter.input, end="", flush=True)
      if delta.code_interpreter.outputs:
        print(f"\n\noutput >", flush=True)
        for output in delta.code_interpreter.outputs:
          if output.type == "logs":
            print(f"\n{output.logs}", flush=True)

Once again, we can use the streaming interface thanks to creating our `EventHandler`!

In [36]:
with client.beta.threads.runs.stream(
  thread_id=ci_thread.id,
  assistant_id=ci_assistant.id,
  instructions=additional_instructions,
  event_handler=CIEventHandler(),
) as stream:
  stream.until_done()


assistant > code_interpreter

import mimetypes

# Determine the type of the uploaded file
file_path = "/mnt/data/file-evZbY2DKmFeLNxzrz78hXEsO"
file_type, _ = mimetypes.guess_type(file_path)

file_type
assistant > It appears that the file type could not be determined using the standard method. Let's read the file in binary mode and analyze the first few bytes to make an educated guess about its format.
# Open the file in binary mode and read the first few bytes
with open(file_path, "rb") as file:
    file_header = file.read(20)

file_header
assistant > The file header suggests that this is a text-based file, likely a CSV (Comma-Separated Values) file, based on the pattern of data visible: "Company,Satus,Year F". 

Let's read the first few rows to confirm its structure.
import pandas as pd

# Read the first few rows of the file to inspect its structure
df = pd.read_csv(file_path)
df.head()
assistant > The file is indeed a CSV (Comma-Separated Values) file. It contains data related to var

And there you go!

We've fit our Assistant with an awesome Code Interpreter that lets our Assistant run code on our provided files!

### Task 2c: Creating an Assistant with a Function Calling Tool

Let's finally create an Assistant that utilizes the Function Calling API.

We'll start by creating a function that we wish to be called.

We'll utilize DuckDuckGo search to allow our Assistant to have the most up to date information!

In [37]:
!pip install -qU duckduckgo_search




In [38]:
from duckduckgo_search import DDGS

def duckduckgo_search(query):
  with DDGS() as ddgs:
    results = [r for r in ddgs.text(query, max_results=5)]
    return "\n".join(result["body"] for result in results)

Let's test our function to make sure it behaves as we expect it to.

In [39]:
duckduckgo_search("Who is the current captain of the Indian Team?")

"Hardik Himanshu Pandya (born 11 October 1993) is an Indian international cricketer who is the current captain of the Indian cricket team in T20I and vice-captain in One Day International (ODI). He is the captain of Mumbai Indians in IPL.A batting all-rounder who bowls right-arm fast-medium deliveries, Pandya has represented India in all 3 formats. He occasionally plays for his regional team ...\nODI[edit] This is a list of cricketers who have captained the Indian men's cricket team for at least one ODI. 27 players have captained the Indian men's team in ODIs of which MS Dhoni is the most successful captain of men's cricket team with 110 wins. Kapil Dev captained India to win in the 1983 Cricket World Cup, the first ever ICC trophy and ...\nRohit Gurunath Sharma is an Indian international cricketer and the current captain of the Indian cricket team in the Test and One-Day International (ODI) formats. He was the captain of India's Twenty20 International team until he announced his retir

Now we need to express how our function works in a way that is compatible with the OpenAI Function Calling API.

We'll want to provide a `JSON` object that includes what parameters we have, how to call them, and a short natural language description.

In [40]:
ddg_function = {
    "name" : "duckduckgo_search",
    "description" : "Answer non-technical questions. ",
    "parameters" : {
        "type" : "object",
        "properties" : {
            "query" : {
                # The key must be exactly "type" for the schema to be valid
                "type" : "string",
                "description" : "The search query to use. For example: 'Who is the current wicketkeeper of the India T20 Cricket Team?'"
            }
        },
        "required" : ["query"]
    }
}
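Python will happily accept a mistyped key such as `"type:"` in place of `"type"`, so a quick sanity check on the specification can catch problems before the Assistant is created. The sketch below is a deliberately simplified check, not a full JSON Schema validator; `check_function_spec` and the rules it enforces are this example's own assumptions.

```python
ALLOWED_TYPES = {"string", "number", "integer", "boolean", "object", "array", "null"}


def check_function_spec(spec):
    """Return a list of problems found in a function-calling specification."""
    problems = []
    for key in ("name", "description", "parameters"):
        if key not in spec:
            problems.append(f"missing top-level key '{key}'")
    parameters = spec.get("parameters", {})
    properties = parameters.get("properties", {})
    for prop_name, prop in properties.items():
        prop_type = prop.get("type")
        # Simplification: only plain string type names are treated as valid here
        if not isinstance(prop_type, str) or prop_type not in ALLOWED_TYPES:
            problems.append(f"property '{prop_name}' has no valid 'type'")
    for required in parameters.get("required", []):
        if required not in properties:
            problems.append(f"required parameter '{required}' is not defined")
    return problems
```

Calling `check_function_spec(ddg_function)` returns an empty list when nothing looks wrong, and a list of human-readable problems otherwise.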

#### ❓ Question

Why does the description key-value pair matter?

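The function specification is written as JSON, and the model only ever sees the fields in that specification, such as the name, description, and parameters. It never sees the function's source code.
The description key-value pair matters because it tells the Assistant what the function is meant to achieve and return, which is how it decides when to call it.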

Now when we create our Assistant - we'll want to include the function description as a tool using the following format.

In [41]:
fc_assistant = client.beta.assistants.create(
    name=name + " + Function Calling",
    instructions=instructions,
    tools=[
        {"type": "function",
         "function" : ddg_function
        }
    ],
    model=model
)

Now we can create our thread, and attach our message to it - just as we've been doing!

In [42]:
fc_thread = client.beta.threads.create()
fc_message = client.beta.threads.messages.create(
  thread_id=fc_thread.id,
  role="user",
  content="Can you describe the Twitter beef between Elon and LeCun?",
)

Once again, we'll need an `EventHandler` for the streaming response - notice that this time we're utilizing a few new methods:

1. `handle_requires_action` - this will handle whatever action needs to take place in *our local environment*.
2. `submit_tool_outputs` - this will let us submit the resultant outputs from our local function call back to the Assistant run in the required format.

To be very clear and explicit - we'll be following this pattern:

1. Make a call to the LLM which will decide if a local function call is required.
2. If a local function call is required - send a response that indicates a local function call is required.
3. Call the function using the arguments provided by the LLM.
4. Return the response from the function call with LLM provided arguments.
5. Receive a response based on the output of the function call from the LLM.
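Steps 3 and 4 of this pattern can be sketched without any API calls. The helper below is illustrative only: `run_tool_calls` and the `registry` dictionary are hypothetical names, and the sketch assumes each tool call carries an `id`, a `function.name`, and `function.arguments` as a JSON string.

```python
import json


def run_tool_calls(tool_calls, registry):
    """Execute each requested function locally and collect outputs for submission."""
    tool_outputs = []
    for call in tool_calls:
        function = registry.get(call.function.name)
        if function is None:
            output = f"Unknown function: {call.function.name}"
        else:
            # Arguments arrive as a JSON string, e.g. '{"query": "..."}'
            arguments = json.loads(call.function.arguments)
            output = str(function(**arguments))
        tool_outputs.append({"tool_call_id": call.id, "output": output})
    return tool_outputs
```

A registry such as `{"duckduckgo_search": duckduckgo_search}` maps the names the LLM asks for to local Python functions, and the returned list has the shape expected when submitting tool outputs.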

In [43]:
import json

class FCEventHandler(AssistantEventHandler):
  @override
  def on_event(self, event):
    # Retrieve events that are denoted with 'requires_action'
    # since these will have our tool_calls
    if event.event == 'thread.run.requires_action':
      run_id = event.data.id  # Retrieve the run ID from the event data
      self.handle_requires_action(event.data, run_id)

  def handle_requires_action(self, data, run_id):
    tool_outputs = []

    for tool in data.required_action.submit_tool_outputs.tool_calls:
      print(tool.function.arguments)
      if tool.function.name == "duckduckgo_search":
        # The arguments arrive as a JSON string, so parse them before calling our function
        arguments = json.loads(tool.function.arguments)
        tool_outputs.append({"tool_call_id": tool.id, "output": duckduckgo_search(arguments["query"])})

    # Submit all tool_outputs at the same time
    self.submit_tool_outputs(tool_outputs, run_id)

  def submit_tool_outputs(self, tool_outputs, run_id):
    # Use the submit_tool_outputs_stream helper
    with client.beta.threads.runs.submit_tool_outputs_stream(
      thread_id=self.current_run.thread_id,
      run_id=self.current_run.id,
      tool_outputs=tool_outputs,
      event_handler=EventHandler(),
    ) as stream:
      # EventHandler already prints every text delta, so printing the deltas
      # here as well would duplicate each token; we only wait for completion
      stream.until_done()
      print()

Thanks to our event handler - we can stream this process in our notebook!

In [44]:
with client.beta.threads.runs.stream(
  thread_id=fc_thread.id,
  assistant_id=fc_assistant.id,
  event_handler=FCEventHandler()
) as stream:
  stream.until_done()

{"query":"Twitter beef between Elon Musk and Yann LeCun"}

assistant > The Twitter dispute between Elon Musk and Yann LeCun, prominent voices in the tech industry, centers around differing perspectives on artificial intelligence (AI) safety and development.

1. **AI Safety Concerns**: Elon Musk, the CEO of companies like Tesla and SpaceX, has historically been vocal about the potential threats posed by advanced AI. He has frequently called for regulatory oversight to ensure AI doesn't develop in ways that could harm humanity.

2. 