<a href="https://colab.research.google.com/github/sensationalspace/colab/blob/main/openai_assistants_files_api_based_rag.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OpenAI Assistants and Files API based RAG Application

The **Assistants API** lets developers build AI assistants that handle the stateful operations required by LLM-based applications, including persistent threads and messages, files, and automatic RAG.

An Assistant has instructions and can leverage models, tools, and knowledge to respond to user queries. The Assistants API currently supports three types of tools:

- Code Interpreter
- Retrieval
- Function calling

Note that giving an `Assistant` access to the OpenAI-hosted tools listed above incurs an additional fee.

**RAG** (Retrieval Augmented Generation) is a natural language processing technique that combines retrieval-based models with generative models to improve the quality and relevance of generated text.

A document-based QA bot is a classic use case of RAG. Mainstream LLM frameworks such as `LangChain` and `LlamaIndex` support building this kind of RAG application.

In this tutorial, I will show you how to develop a RAG application with the OpenAI `Assistants` and `Files` APIs. The code may be cleaner and the solution more elegant, but it may cost a bit more because of the extra charge for the OpenAI-hosted **`Retrieval`** tool.

With this solution, you don't have to deal with the following tedious operations:

- Splitting text with a suitable strategy
- Vectorizing text chunks
- Persisting the vector data set
- Running similarity search
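
To make the four operations above concrete, here is a minimal sketch of what a hand-rolled pipeline might look like. It is only an illustration: word counts stand in for real embeddings, an in-memory list stands in for a vector store, and the function names and sample text are made up for this example.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=40):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def vectorize(chunk):
    """Turn a chunk into a sparse word-count vector (a toy stand-in for embeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", chunk.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, index, top_k=1):
    """Return the `top_k` chunks most similar to the query."""
    query_vector = vectorize(query)
    ranked = sorted(index, key=lambda item: cosine_similarity(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Made-up sample text; each sentence is exactly eight words long.
document = (
    "Venture capital financing slowed sharply during the year. "
    "Salary disclosure laws now apply to many employers. "
    "Convertible notes remained popular with early stage startups."
)
# "Persisting" here is just an in-memory list of (chunk, vector) pairs.
index = [(chunk, vectorize(chunk)) for chunk in split_text(document, chunk_size=8)]
print(search("salary disclosure laws", index))
```

A production pipeline would replace each piece with something more capable (a tokenizer-aware splitter, an embedding model, a vector database), which is exactly the work the `Retrieval` tool takes over.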


## Prepare the environment

To run the example code below, make sure you have a valid OpenAI API key with access to the model `gpt-4-1106-preview`.

You should also have a `.env` file in the current directory with content in the following pattern:

```shell
OPENAI_API_KEY=sk-xxxxxx
```
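
If the key does not seem to load, it can help to check the file content by hand. The sketch below is a simplified stand-in for what `python-dotenv` does, not its actual implementation: it ignores quoting, `export` prefixes, and variable expansion. The helper names are hypothetical, and the `sk-` prefix check is an assumption based on the pattern shown above.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def has_plausible_openai_key(values):
    """Check that OPENAI_API_KEY is present and looks like `sk-` plus more characters."""
    key = values.get("OPENAI_API_KEY", "")
    return key.startswith("sk-") and len(key) > len("sk-")


sample = "# local secrets\nOPENAI_API_KEY=sk-xxxxxx\n"
print(has_plausible_openai_key(parse_env_text(sample)))
```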

Now let's install the necessary Python packages.

In [1]:
!pip install openai python-dotenv -U -q

## Coding Time

We can start coding. In this example, we will use the following PDF file as the knowledge base:

[2023 Venture Capital Report](https://www.wilmerhale.com/-/media/files/shared_content/editorial/publications/documents/2023-wilmerhale-vc-report.pdf)

This is the 2023 Venture Capital Report published by WilmerHale.

### 1. Load the API key

In [2]:
from dotenv import load_dotenv
load_dotenv()

import os

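In [3]:
# Prefer the key loaded from .env; os.environ.get returns None instead of raising KeyError.
openai_api_key = os.environ.get("OPENAI_API_KEY")

In [4]:
# On Colab, fall back to the notebook secrets when the variable is not set.
if openai_api_key is None:
  from google.colab import userdata
  openai_api_key = userdata.get('OPENAI_API_KEY')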

In [5]:
import openai

# Passing api_key is optional when the OPENAI_API_KEY environment variable is set:
# the OpenAI client reads it automatically. We pass it explicitly here because
# the key may have come from Colab secrets instead.

openai_client = openai.OpenAI(api_key=openai_api_key)

### 2. Retrieve existing files

Files uploaded via the OpenAI `Files` API are persisted, so they can be reused by referring to their ids.

In this step, we retrieve the list of files already uploaded and check whether the PDF file `2023-WilmerHale-VC-Report.pdf` exists.

If it does, we delete it and upload it again.

In [6]:
# Retrieve the file list

uploaded_files = openai_client.files.list()

In [7]:
uploaded_files.data

[FileObject(id='file-OBCZJllc4DvZQw1COreqhT0M', bytes=1365871, created_at=1713637948, filename='Campbell Soup 10-k.pdf', object='file', purpose='assistants', status='processed', status_details=None),
 FileObject(id='file-ukOWB1ivgQxx8D66UEXb99LL', bytes=1365871, created_at=1711210141, filename='Campbell_Soup_10k_2023.pdf', object='file', purpose='assistants', status='processed', status_details=None),
 FileObject(id='file-cOs8oTdD8tH3JMVS0Y5IwyfP', bytes=2032131, created_at=1708867333, filename='Smucker_10k_2023.pdf', object='file', purpose='assistants', status='processed', status_details=None),
 FileObject(id='file-CjM8XqjUfrsQG4MAU88YVSMQ', bytes=1076676, created_at=1708867333, filename='Pepsi_10k_2022.pdf', object='file', purpose='assistants', status='processed', status_details=None),
 FileObject(id='file-9WXU1TptyAtUGiPFbOQzo8fC', bytes=2299116, created_at=1708867332, filename='Nestle_10k_2022.pdf', object='file', purpose='assistants', status='processed', status_details=None),
 ...]

In [None]:
# Find the file by name; the_file_id stays None when there is no match

filename_to_find = '2023-WilmerHale-VC-Report.pdf'

the_file_id = next(
  (f.id for f in uploaded_files.data if f.filename == filename_to_find),
  None,
)

In [None]:
the_file_id

'file-1zy5eXDx5nDotjUp4tybtNGl'

Delete the file if it already exists.

Note that this is for demonstration purposes only.

In [None]:
if the_file_id:
  delete_status = openai_client.files.delete(the_file_id)
  # A bare expression inside an if block is not displayed, so print it
  print(delete_status)
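
The cell above removes only the first match. Since each upload receives a new id, the same filename can presumably appear more than once in the list, so a sketch that removes every match might look like the following. `delete_files_named` and `FakeFiles` are hypothetical names introduced here; the only client calls used are `files.list()` and `files.delete()`, as in the cells above, and the stand-in client keeps the example runnable without network access.

```python
from types import SimpleNamespace


def delete_files_named(client, filename):
    """Delete every uploaded file whose name matches; return the deleted ids."""
    deleted_ids = []
    for file_obj in client.files.list().data:
        if file_obj.filename == filename:
            client.files.delete(file_obj.id)
            deleted_ids.append(file_obj.id)
    return deleted_ids


class FakeFiles:
    """In-memory stand-in exposing the two methods used above."""

    def __init__(self, files):
        self._files = list(files)

    def list(self):
        return SimpleNamespace(data=list(self._files))

    def delete(self, file_id):
        self._files = [f for f in self._files if f.id != file_id]


# Offline demonstration; with the real client the call would be
# delete_files_named(openai_client, '2023-WilmerHale-VC-Report.pdf').
fake_client = SimpleNamespace(files=FakeFiles([
    SimpleNamespace(id="file-a", filename="report.pdf"),
    SimpleNamespace(id="file-b", filename="other.pdf"),
    SimpleNamespace(id="file-c", filename="report.pdf"),
]))
print(delete_files_named(fake_client, "report.pdf"))
```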

### 2. Upload the PDF file

This PDF file will be the knowledge base of the RAG application.

In [None]:
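# Open the file in a context manager so the handle is closed after the upload
with open("2023-WilmerHale-VC-Report.pdf", "rb") as pdf_file:
  file = openai_client.files.create(
    file=pdf_file,
    purpose='assistants'
  )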

In [None]:
file

FileObject(id='file-p1e3aGURKtzkR83g91YUTBhn', bytes=1310948, created_at=1700171663, filename='2023-WilmerHale-VC-Report.pdf', object='file', purpose='assistants', status='processed', status_details=None)

### 3. Retrieve the file by id

This confirms that the file was uploaded successfully.

In [None]:
retrieved_file = openai_client.files.retrieve(file.id)
retrieved_file

FileObject(id='file-p1e3aGURKtzkR83g91YUTBhn', bytes=1310948, created_at=1700171663, filename='2023-WilmerHale-VC-Report.pdf', object='file', purpose='assistants', status='processed', status_details=None)
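
Rather than reading the printed object by eye, the check could be automated. The sketch below compares the `status` and `bytes` fields shown in the output above with the size of the local file. `upload_looks_complete` is a hypothetical helper, and treating `'processed'` as the success status is an assumption taken from that output.

```python
import os
import tempfile
from types import SimpleNamespace


def upload_looks_complete(file_obj, local_path):
    """Return True when the remote file is processed and matches the local size."""
    return (
        file_obj.status == "processed"
        and file_obj.bytes == os.path.getsize(local_path)
    )


# Offline demonstration with a temporary file and a stand-in for FileObject.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.4 demo")
    demo_path = tmp.name

demo_file = SimpleNamespace(status="processed", bytes=os.path.getsize(demo_path))
print(upload_looks_complete(demo_file, demo_path))
os.remove(demo_path)
```

In the notebook, the call would be `upload_looks_complete(retrieved_file, "2023-WilmerHale-VC-Report.pdf")`.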

### 4. Create an Assistant

This Assistant uses the `Retrieval` tool and is associated with the uploaded PDF file through the file's id.

In [None]:
assistant = openai_client.beta.assistants.create(
  instructions="Use the file provided as your knowledge base to best respond to customer queries.",
  model="gpt-4-1106-preview",
  tools=[
      { "type": "retrieval" }
    ],
  file_ids=[retrieved_file.id]
)

### 5. Retrieve the created Assistant

The Assistant is retrieved by its id.

Check the response to make sure the expected tool and file are associated with it.

In [None]:
my_assistant = openai_client.beta.assistants.retrieve(assistant.id)
my_assistant

Assistant(id='asst_YYN7UKBkyuFs5UghtNfzxO3z', created_at=1700171729, description=None, file_ids=['file-p1e3aGURKtzkR83g91YUTBhn'], instructions='Use the file provided as your knowledge base to best respond to customer queries.', metadata={}, model='gpt-4-1106-preview', name=None, object='assistant', tools=[ToolRetrieval(type='retrieval')])
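
The same check can be expressed in code. This is a minimal sketch that relies only on the `tools` and `file_ids` fields visible in the output above. `missing_associations` is a hypothetical helper, and the stand-in objects exist only to keep the example runnable offline.

```python
from types import SimpleNamespace


def missing_associations(assistant, file_id, tool_type="retrieval"):
    """List what is missing from the assistant: the tool, the file, or both."""
    problems = []
    if not any(tool.type == tool_type for tool in assistant.tools):
        problems.append(f"tool '{tool_type}' is not enabled")
    if file_id not in assistant.file_ids:
        problems.append(f"file '{file_id}' is not attached")
    return problems


# Stand-in objects that mirror the fields printed above.
ok = SimpleNamespace(tools=[SimpleNamespace(type="retrieval")], file_ids=["file-abc"])
empty = SimpleNamespace(tools=[], file_ids=[])
print(missing_associations(ok, "file-abc"))
print(missing_associations(empty, "file-abc"))
```

In the notebook, a non-empty result from `missing_associations(my_assistant, retrieved_file.id)` would be the signal to run the optional update step that follows.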

### 6. (Optional) Update the Assistant

I once noticed that a newly created assistant had neither the tool nor the file associated with it.

If this happens to you, use the `update` function to associate them again.

In [None]:
updated_assistant = openai_client.beta.assistants.update(
  assistant.id,
  tools=[{"type": "retrieval"}],
  file_ids=[retrieved_file.id],
)

updated_assistant

Assistant(id='asst_YYN7UKBkyuFs5UghtNfzxO3z', created_at=1700171729, description=None, file_ids=['file-p1e3aGURKtzkR83g91YUTBhn'], instructions='Use the file provided as your knowledge base to best respond to customer queries.', metadata={}, model='gpt-4-1106-preview', name=None, object='assistant', tools=[ToolRetrieval(type='retrieval')])

### 7. Create a Thread

In [None]:
thread = openai_client.beta.threads.create()

thread

Thread(id='thread_PIN1poM9aRX4sYjnnohlvZNa', created_at=1700171825, metadata={}, object='thread')

### 8. Create a Message

We use a message object to ask the Assistant to extract the content architecture (the section outline) of the PDF file.

In [None]:
thread_message = openai_client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="Please show me the content architecture of this report",
)
thread_message

ThreadMessage(id='msg_r00GeAZ3q4zaQYxaYlGsjMmd', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='Please show me the content architecture of this report'), type='text')], created_at=1700171844, file_ids=[], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_PIN1poM9aRX4sYjnnohlvZNa')

### 9. Create a Run

A `Run` invokes the Assistant on the Thread, which triggers the interaction with the LLM.

In [None]:
# Use assistant.id so this cell also works when the optional update step is skipped
run = openai_client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)

### 10. Retrieve the Run

A `Run` executes asynchronously, so we need to query its status by id.

In [None]:
retrieved_run = openai_client.beta.threads.runs.retrieve(
  thread_id=thread.id,
  run_id=run.id
)

In [None]:
retrieved_run

Run(id='run_uGwTGAemL5rqFwWlHc02Wxwl', assistant_id='asst_YYN7UKBkyuFs5UghtNfzxO3z', cancelled_at=None, completed_at=1700171869, created_at=1700171854, expires_at=None, failed_at=None, file_ids=['file-p1e3aGURKtzkR83g91YUTBhn'], instructions='Use the file provided as your knowledge base to best respond to customer queries.', last_error=None, metadata={}, model='gpt-4-1106-preview', object='thread.run', required_action=None, started_at=1700171854, status='completed', thread_id='thread_PIN1poM9aRX4sYjnnohlvZNa', tools=[ToolAssistantToolsRetrieval(type='retrieval')])
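
A single retrieve call may well return a Run that is still `queued` or `in_progress`, so in practice the status is usually polled. Below is a minimal polling sketch. `wait_for_run` is a hypothetical helper, the interval and timeout are arbitrary, and the stand-in client only imitates the `runs.retrieve` call used above so that the example runs offline.

```python
import time
from types import SimpleNamespace

# Statuses at which polling should stop.
FINAL_STATUSES = {"completed", "failed", "cancelled", "expired", "requires_action"}


def wait_for_run(client, thread_id, run_id, poll_seconds=1.0, timeout_seconds=120.0):
    """Poll the Run until it reaches a final status or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in FINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Run {run_id} still '{run.status}' after {timeout_seconds}s")
        time.sleep(poll_seconds)


def make_fake_client(statuses):
    """Build a stand-in client that reports the given statuses in turn."""
    remaining = iter(statuses)
    runs = SimpleNamespace(
        retrieve=lambda thread_id, run_id: SimpleNamespace(status=next(remaining))
    )
    return SimpleNamespace(beta=SimpleNamespace(threads=SimpleNamespace(runs=runs)))


# Offline demonstration: the Run is queued, then in progress, then completed.
fake_client = make_fake_client(["queued", "in_progress", "completed"])
print(wait_for_run(fake_client, "thread_demo", "run_demo", poll_seconds=0).status)
```

With the real client, the call would be `wait_for_run(openai_client, thread.id, run.id)`. The returned Run should still be checked for a `completed` status before reading the messages.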

### 11. Retrieve the message list of the Thread

Once the Run is completed, retrieve the message list and fetch its latest message, which is the response from the LLM.

In [None]:
thread_messages = openai_client.beta.threads.messages.list(thread.id)
thread_messages.data

[ThreadMessage(id='msg_UYQX4LrKPLpVNjhjBQRUGO07', assistant_id='asst_YYN7UKBkyuFs5UghtNfzxO3z', content=[MessageContentText(text=Text(annotations=[TextAnnotationFileCitation(end_index=817, file_citation=TextAnnotationFileCitationFileCitation(file_id='file-p1e3aGURKtzkR83g91YUTBhn', quote='2023 Venture Capital Report – What’s Inside\n\n\n2  US Market Review and Outlook\n\n\n6  Regional Market Review and Outlook\n\n\n10 Selected WilmerHale Venture Capital Financings\n\n\n12 New Law Requires Federal Reporting of Private Company Ownership\nMany Startups and Life Sciences Companies Will be Subject to Beneficial Ownership\nReporting\n\n\n13 Show Me the Money\nWhat Employers Need to Know About New Salary Disclosure Laws\n\n\n14 Navigating the Quiet Period Shoals\nSafe Harbors Aid Compliance With Quiet Period Requirements\n\n\n16 State Taxation of Qualified Small Business Stock\nFederal Tax Exclusion Not Always Replicated at State Level\n\n\n17 Trends in VC-Backed Company M&A Deal Terms\n\n\n1
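
Indexing `data[0]` works here because the newest message comes first and happens to be the Assistant's reply. A slightly more defensive sketch, assuming the same newest-first ordering and the `role`, `content`, `type`, and `text.value` fields seen in the outputs above, could look like this. `latest_assistant_text` is a hypothetical helper, and the stand-in messages are made up.

```python
from types import SimpleNamespace


def latest_assistant_text(messages):
    """Return the text of the newest assistant message, or None if there is none.

    Assumes `messages` is ordered newest first.
    """
    for message in messages:
        if message.role == "assistant":
            parts = [part.text.value for part in message.content if part.type == "text"]
            return "\n".join(parts)
    return None


def make_text_message(role, value):
    """Build a stand-in with the same shape as the message objects printed above."""
    part = SimpleNamespace(type="text", text=SimpleNamespace(value=value))
    return SimpleNamespace(role=role, content=[part])


demo_messages = [
    make_text_message("assistant", "The report has ten sections."),
    make_text_message("user", "Please show me the content architecture of this report"),
]
print(latest_assistant_text(demo_messages))
```

In the notebook, the call would be `latest_assistant_text(thread_messages.data)`.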

In [None]:
print(thread_messages.data[0].content[0].text.value)

The content architecture of the 2023 Venture Capital Report includes the following sections:

1. US Market Review and Outlook
2. Regional Market Review and Outlook
3. Selected WilmerHale Venture Capital Financings
4. New Law Requires Federal Reporting of Private Company Ownership (Many Startups and Life Sciences Companies Will be Subject to Beneficial Ownership Reporting)
5. Show Me the Money (What Employers Need to Know About New Salary Disclosure Laws)
6. Navigating the Quiet Period Shoals (Safe Harbors Aid Compliance With Quiet Period Requirements)
7. State Taxation of Qualified Small Business Stock (Federal Tax Exclusion Not Always Replicated at State Level)
8. Trends in VC-Backed Company M&A Deal Terms
9. Trends in Convertible Note and SAFE Terms
10. Trends in Venture Capital Financing Terms【7†source】.
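
The answer ends with an inline citation marker, `【7†source】`, which points back to the retrieved file. If the text is shown to end users, one option is to strip such markers. This is a minimal sketch that assumes markers always follow the pattern seen in this output; `strip_citation_markers` is a hypothetical helper.

```python
import re

# Matches markers such as 【7†source】, together with any whitespace before them.
CITATION_MARKER = re.compile(r"\s*【\d+†[^】]*】")


def strip_citation_markers(text):
    """Remove inline citation markers and the whitespace in front of them."""
    return CITATION_MARKER.sub("", text)


answer = "10. Trends in Venture Capital Financing Terms【7†source】."
print(strip_citation_markers(answer))
```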
