# OpenAI Assistants APIs

The Assistants API lets you build AI assistants into your applications. An assistant follows instructions and uses models, tools, and knowledge to answer user questions. In this notebook we use one of the tools, Retrieval, to query two PDF documents that we upload.

The architecture and data flow diagram below depicts how the components of the OpenAI Assistants API interact. The central pieces to understand are the Threads and the runtime, which executes asynchronously, adding messages to and reading messages from the Threads.

To integrate the Assistants API:

1. Create an Assistant with custom instructions and select a model. Optionally, enable tools such as Code Interpreter, Retrieval, and Function Calling.

2. Initiate a Thread for each user conversation.

3. Add user queries as Messages to the Thread.

4. Run the Assistant on the Thread to get responses; the Run automatically uses the enabled tools.

Below we follow those steps to demonstrate how to integrate the Assistants API with the Retrieval tool to a) upload a couple of PDF documents and b) use the Assistant to query their contents. Consider this a mini Retrieval Augmented Generation (RAG) application.

The OpenAI documentation describes in detail [how Assistants work](https://platform.openai.com/docs/assistants/how-it-works).

<img src="./images/assistant_ai_tools_retriever.png">


## How to use the Assistants API with the Retrieval tool and multiple documents

In [4]:
import warnings
import os
import time

import openai
from openai import OpenAI

from dotenv import load_dotenv, find_dotenv
from typing import List
from assistant_utils import print_thread_messages, upload_files, \
                            loop_until_completed, create_assistant_run

Load the *.env* file with the respective API keys and base URL endpoints. The code below reads settings for either OpenAI or Anyscale Endpoints.

**Note**: The Assistants API is not yet available on Anyscale Endpoints (which serves only open-source models).

In [5]:
warnings.filterwarnings('ignore')

_ = load_dotenv(find_dotenv()) # read local .env file

openai.api_base = os.getenv("ANYSCALE_API_BASE", os.getenv("OPENAI_API_BASE"))
openai.api_key = os.getenv("ANYSCALE_API_KEY", os.getenv("OPENAI_API_KEY"))
MODEL = os.getenv("MODEL")
print(f"Using MODEL={MODEL}; base={openai.api_base}")

Using MODEL=gpt-4-1106-preview; base=https://api.openai.com/v1
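
The nested `os.getenv` calls above give the Anyscale variable precedence and fall back to the OpenAI one. As a minimal sketch of the same precedence rule, the helper below returns the first variable that is set and raises early when none is. The name `first_env` and its `env` parameter are hypothetical, introduced here only so the rule can be exercised without touching the real environment. Unlike the nested `os.getenv` form, this sketch also skips variables that are set to an empty string.

```python
import os
from typing import Mapping, Optional


def first_env(*names: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of the first variable in `names` that is set and non-empty."""
    env = os.environ if env is None else env
    for name in names:
        value = env.get(name)
        if value:
            return value
    # Fail early with a readable message instead of passing None to the client.
    raise KeyError(f"none of these variables are set: {', '.join(names)}")


# Demo with an in-memory mapping instead of the real environment.
demo_env = {"OPENAI_API_BASE": "https://api.openai.com/v1"}
base = first_env("ANYSCALE_API_BASE", "OPENAI_API_BASE", env=demo_env)
print(base)
```

Failing at load time makes a missing key obvious before any request is sent.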


Upload two PDFs. OpenAI allows up to twenty files.

In [34]:
DOCS_TO_LOAD = ["docs/llm_survey_halluciantions.pdf",
                "docs/1001-math-problems-2nd.pdf"]

In [35]:
from openai import OpenAI

client = OpenAI(
    api_key = openai.api_key,
    base_url = openai.api_base
)

### Step 1: Create our knowledge base
This entails uploading your PDFs as the knowledge base for the retriever to use. Once you upload a file, OpenAI breaks it into smaller chunks, sorts and saves these chunks, and indexes and stores their embeddings as vectors.

The retriever uses your query to find the best semantic matches among the vectors in the knowledge base, then feeds them to the LLM along with the original query to generate a consolidated and comprehensive answer, similar to how a large-scale RAG retriever operates.
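
To make the chunk, embed, and retrieve flow concrete, here is a toy sketch. It is not how OpenAI implements the Retrieval tool, whose internals are not shown in this notebook. It assumes fixed-size word chunks and uses bag-of-words counts with cosine similarity as a stand-in for learned embeddings. The names `chunk_text`, `embed`, `cosine`, and `retrieve` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 8) -> List[str]:
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (a stand-in for a real vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[str]:
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


document = (
    "Fractions represent parts of a whole number. "
    "Hallucinations are model outputs that are not supported by facts."
)
chunks = chunk_text(document, chunk_size=7)
best = retrieve("what are hallucinations in model outputs", chunks)
print(best)
```

In a real RAG pipeline the retrieved chunks would be placed in the prompt next to the user's query; here they are only printed.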

Upload the data files from your storage.

In [36]:
file_objects = upload_files(client, DOCS_TO_LOAD)
file_objects

[FileObject(id='file-5BgVZhmHgOvNFwLZl3NJhK6s', bytes=1870663, created_at=1704922137, filename='llm_survey_halluciantions.pdf', object='file', purpose='assistants', status='processed', status_details=None),
 FileObject(id='file-K4ztRDmVvqem23JNYpZDiTdY', bytes=1857163, created_at=1704922138, filename='1001-math-problems-2nd.pdf', object='file', purpose='assistants', status='processed', status_details=None)]
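
The source of `assistant_utils.upload_files` is not shown in this notebook. As a rough sketch of what such a helper might do, the function below uploads each path through `client.files.create` with `purpose="assistants"`, which matches the `purpose` field in the output above. The name `upload_files_sketch` is hypothetical, and the real helper may differ.

```python
from pathlib import Path
from typing import Any, List


def upload_files_sketch(client: Any, paths: List[str]) -> List[Any]:
    """Upload each path with purpose='assistants' and return the file objects.

    Hypothetical stand-in for `assistant_utils.upload_files`.
    """
    uploaded = []
    for path in paths:
        file_path = Path(path)
        if not file_path.is_file():
            raise FileNotFoundError(f"document not found: {file_path}")
        # Open in binary mode; the context manager closes the handle after upload.
        with file_path.open("rb") as handle:
            uploaded.append(client.files.create(file=handle, purpose="assistants"))
    return uploaded
```

Checking that each path exists before uploading gives a clearer error than a failed request halfway through the list.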

In [37]:
# Extract file ids
file_obj_ids = [f_obj.id for f_obj in file_objects]
file_obj_ids

['file-5BgVZhmHgOvNFwLZl3NJhK6s', 'file-K4ztRDmVvqem23JNYpZDiTdY']

### Step 2: Create an Assistant 
Before you can start interacting with the Assistant to carry out any tasks, you need an AI assistant object. Supply the Assistant with a model to use, tools, and file ids to use for its knowledge base.

In [38]:
assistant = client.beta.assistants.create(name="AI Math and LLM survey Chatbot",
                                           instructions="""You are a knowledgeable chatbot trained to respond 
                                               inquires on documents accessible to you. 
                                               Use a professional advisory tone, 
                                               and only respond by consulting the 
                                               two files you are granted access to. 
                                               Do not make up answers. If you don't know answer, respond with 'Sorry, I'm afraid
                                               I don't have access to that information.'""",
                                           model=MODEL,
                                           tools = [{'type': 'retrieval'}],  # use the retrieval tool
                                           file_ids=file_obj_ids # use these files uploaded as part of your knowledge base
)                                        
assistant

Assistant(id='asst_dGpV5uEoPqiK9oZF37JBAzAv', created_at=1704922151, description=None, file_ids=['file-5BgVZhmHgOvNFwLZl3NJhK6s', 'file-K4ztRDmVvqem23JNYpZDiTdY'], instructions="You are a knowledgeable chatbot trained to respond \n                                               inquires on documents accessible to you. \n                                               Use a professional advisory tone, \n                                               and only respond by consulting the \n                                               two files you are granted access to. \n                                               Do not make up answers. If you don't know answer, respond with 'Sorry, I'm afraid\n                                               I don't have access to that information.'", metadata={}, model='gpt-4-1106-preview', name='AI Math and LLM survey Chatbot', object='assistant', tools=[ToolRetrieval(type='retrieval')])

### Step 3: Create a thread 
As the diagram above shows, the Thread is the object that the Assistant's runs interact with, fetching messages from it and adding messages to it. Think of a thread as a "conversation session between an Assistant and a user. Threads store Messages and automatically handle truncation to fit content into a model’s context window."

In [39]:
thread = client.beta.threads.create()
thread

Thread(id='thread_eP6P7EeIhFHFpaGMCS5ysCI9', created_at=1704922156, metadata={}, object='thread')

### Step 4: Add your message query to the thread for the Assistant

In [40]:
message = client.beta.threads.messages.create(
    thread_id=thread.id, 
    role="user",
    content="""
    # OBJECTIVE#
    Use "llm_survey_halluciantions" document for this 
    query. Give me a three paragraph overview of the 
    document.
    
    # STYLE #
    Use simple and compound-complex sentences for each paragraph. 
    """,
)
message.model_dump()

{'id': 'msg_2C3NfhUqUFTTrkTJvhDTgaX7',
 'assistant_id': None,
 'content': [{'text': {'annotations': [],
    'value': '\n    # OBJECTIVE#\n    Use "llm_survey_halluciantions" document for this \n    query. Give me a three paragraph overview of the \n    document.\n    \n    # STYLE #\n    Use simple and compound-complex sentences for each paragraph. \n    '},
   'type': 'text'}],
 'created_at': 1704922157,
 'file_ids': [],
 'metadata': {},
 'object': 'thread.message',
 'role': 'user',
 'run_id': None,
 'thread_id': 'thread_eP6P7EeIhFHFpaGMCS5ysCI9'}

### Step 5: Create a Run for the Assistant
A Run is an invocation of an Assistant on a Thread. The Assistant uses its configuration and the Thread’s Messages to perform tasks by calling models and tools. As part of a Run, the Assistant appends Messages to the Thread.

Note that the Assistant runs asynchronously. A Run moves through a lifecycle of statuses, such as *queued* and *in_progress*, and ends in one of the terminal states: [*expired, completed, failed, cancelled*].

<img src="https://cdn.openai.com/API/docs/images/diagram-1.png">

In [41]:
instruction_msg = """Please address the user as Jules Dmatrix.  
    Do not provide an answer to the question if the information was not retrieved from 
    the knowledge base.
"""
run = create_assistant_run(client, assistant, thread, instruction_msg)
print(run.model_dump_json(indent=2))

{
  "id": "run_FwKWtCqM7e3Jlcpm5sNPAVIN",
  "assistant_id": "asst_dGpV5uEoPqiK9oZF37JBAzAv",
  "cancelled_at": null,
  "completed_at": null,
  "created_at": 1704922157,
  "expires_at": 1704922757,
  "failed_at": null,
  "file_ids": [
    "file-5BgVZhmHgOvNFwLZl3NJhK6s",
    "file-K4ztRDmVvqem23JNYpZDiTdY"
  ],
  "instructions": "Please address the user as Jules Dmatrix.  \n    Do not provide an answer to the question if the information was not retrieved from \n    the knowledge base.\n",
  "last_error": null,
  "metadata": {},
  "model": "gpt-4-1106-preview",
  "object": "thread.run",
  "required_action": null,
  "started_at": null,
  "status": "queued",
  "thread_id": "thread_eP6P7EeIhFHFpaGMCS5ysCI9",
  "tools": [
    {
      "type": "retrieval"
    }
  ]
}
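
The helper `create_assistant_run` comes from `assistant_utils`, and its source is not shown here. A minimal sketch of what it might wrap is below, assuming it simply calls `client.beta.threads.runs.create`. The name `create_run_sketch` is hypothetical. In the output above, the run's `instructions` field holds only the run-level text, which suggests that instructions passed at run time take the place of the Assistant's own instructions for that run.

```python
from typing import Any


def create_run_sketch(client: Any, assistant: Any, thread: Any, instructions: str) -> Any:
    """Start a Run of `assistant` on `thread` with run-level instructions.

    Hypothetical stand-in for `assistant_utils.create_assistant_run`.
    """
    return client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant.id,
        instructions=instructions,
    )
```

The call returns immediately with the run in the `queued` status; the work itself happens asynchronously.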


### Step 6: Loop through the Assistant run until status is 'completed'

In [42]:
run_status = client.beta.threads.runs.retrieve(
    thread_id = thread.id,
    run_id = run.id
)
print(run_status.model_dump_json(indent=4))

{
    "id": "run_FwKWtCqM7e3Jlcpm5sNPAVIN",
    "assistant_id": "asst_dGpV5uEoPqiK9oZF37JBAzAv",
    "cancelled_at": null,
    "completed_at": null,
    "created_at": 1704922157,
    "expires_at": 1704922757,
    "failed_at": null,
    "file_ids": [
        "file-5BgVZhmHgOvNFwLZl3NJhK6s",
        "file-K4ztRDmVvqem23JNYpZDiTdY"
    ],
    "instructions": "Please address the user as Jules Dmatrix.  \n    Do not provide an answer to the question if the information was not retrieved from \n    the knowledge base.\n",
    "last_error": null,
    "metadata": {},
    "model": "gpt-4-1106-preview",
    "object": "thread.run",
    "required_action": null,
    "started_at": 1704922157,
    "status": "in_progress",
    "thread_id": "thread_eP6P7EeIhFHFpaGMCS5ysCI9",
    "tools": [
        {
            "type": "retrieval"
        }
    ]
}


#### Poll until Assistant run is completed

In [43]:
loop_until_completed(client, thread, run_status)

in_progress
in_progress
in_progress
completed
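
The polling helper `loop_until_completed` is imported from `assistant_utils` and not shown. One way such a loop could be written is sketched below, assuming it repeatedly calls `client.beta.threads.runs.retrieve` (the same call used in Step 6) and stops on any terminal status. The names `wait_for_run`, `poll_seconds`, and `timeout_seconds` are hypothetical, and the timeout is an addition of this sketch.

```python
import time
from typing import Any

# Terminal statuses listed in this notebook; a run in one of them will not change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_run(client: Any, thread_id: str, run_id: str,
                 poll_seconds: float = 2.0, timeout_seconds: float = 300.0) -> Any:
    """Poll a run until it reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        print(run.status)
        if run.status in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"run {run_id} still '{run.status}' after {timeout_seconds}s"
            )
        # Sleep between polls so the API is not hammered with requests.
        time.sleep(poll_seconds)
```

A call such as `wait_for_run(client, thread.id, run.id)` returns the final run object, so the caller can check whether the status is `completed` before reading the Thread rather than assuming success.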


### Step 7: Retrieve the message returned by the Assistant
Only when the run is **completed** can you fetch the messages from the Thread.

In [44]:
print_thread_messages(client, thread)

('assistant:The survey on hallucinations in Large Language Models (LLMs) '
 'delves into the phenomenon where state-of-the-art models in natural language '
 'processing generate content that, while seemingly plausible, is not '
 'supported by facts or user inputs. As the capability of LLMs to comprehend, '
 'generate, and reason with language has grown, so too has the tendency for '
 'these models to produce these so-called "hallucinations." The paper provides '
 'a novel taxonomy for categorizing hallucinations and examines the factors '
 'leading to their occurrence. By offering a cohesive overview of methods for '
 'detection and mitigation, alongside an evaluation of existing challenges and '
 'prospective pathways for future research, the survey endeavors to provide '
 'clarity and direction in the ongoing discussion surrounding LLM reliability '
 'and practical applicability.\n'
 '\n'
 'The introduction of a refined taxonomy distinguishes between "factuality '
 'hallucination," w
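
The helper `print_thread_messages` is also part of `assistant_utils`. A sketch of how the text could be collected is below, assuming the helper lists the Thread's messages and reads each text part's `text.value`, the same fields visible in the message dump in Step 4. The name `collect_text_sketch` is hypothetical, and the sketch reads only the first page of results.

```python
from typing import Any, List


def collect_text_sketch(client: Any, thread_id: str) -> List[str]:
    """Return 'role:text' strings for every text part in the thread, oldest first."""
    page = client.beta.threads.messages.list(thread_id=thread_id, order="asc")
    lines = []
    for message in page.data:
        for part in message.content:
            # Skip non-text parts such as image files.
            if part.type == "text":
                lines.append(f"{message.role}:{part.text.value}")
    return lines
```

Returning the strings instead of printing them directly lets the caller decide how to display or store the conversation.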

### Repeat the process for any additional messages
To add more query messages to the thread for the Assistant,
repeat steps 4 - 7.

### Add another message for the Assistant

In [55]:
message = client.beta.threads.messages.create(
    thread_id=thread.id, 
    role="user",
    content="""Use 1001-math-problems-2nd document for 
    this query.
    
    Select three random math problems from sections 2 on fractions).
    """,
)
message

ThreadMessage(id='msg_otx6Z4OMEx0etXLbnRwDQr40', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='Use 1001-math-problems-2nd document for \n    this query.\n    \n    Select three random math problems from sections 2 on fractions).\n    '), type='text')], created_at=1704922532, file_ids=[], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_eP6P7EeIhFHFpaGMCS5ysCI9')

### Create another run for the Assistant for the second message

In [56]:
run = create_assistant_run(client, assistant, thread, instruction_msg)

In [57]:
run_status = client.beta.threads.runs.retrieve(
    thread_id = thread.id,
    run_id = run.id
)

print(run_status.status)

in_progress


In [58]:
loop_until_completed(client, thread, run_status)

in_progress
in_progress
in_progress
in_progress
in_progress
completed


In [59]:
print_thread_messages(client, thread)

('assistant:Here are three random math problems from Section 2 on fractions in '
 'the "1001-math-problems-2nd" document:\n'
 '\n'
 '1. **Problem 118:** In the music department at a school, a music teacher '
 'counted the musical instruments and supplies in storage. There was:\n'
 '   - 1 violin valued at \\$1200\n'
 '   - 2 violin bows each valued at \\$350\n'
 '   - 3 music stands each valued at \\$55\n'
 '   - 1 trumpet valued at \\$235\n'
 '\n'
 '   In addition, there were a number of supplies totaling \\$125 and some '
 'sheet music worth \\$75. What was the total value of the musical supplies '
 'and instruments in storage?\n'
 '   - a. \\$2040\n'
 '   - b. \\$2500\n'
 '   - c. \\$3040【60†source】.\n'
 '\n'
 '2. **Problem 119:** What is the value of the expression 5(40)?\n'
 '   - a. 0\n'
 '   - b. 1\n'
 '   - c. 5\n'
 '   - d. 20【62†source】.\n'
 '\n'
 '3. **Problem 120:** If a population of yeast cells grows from 10 to 320 in a '
 'period of 5 hours, what is the rate of growth?\n

In [60]:
# Delete the assistant. Optionally, you can delete any files
# associated with it that you have uploaded onto the OpenAI platform

response = client.beta.assistants.delete(assistant.id)
print(response)

for file_id in file_obj_ids:
    print(f"deleting file id: {file_id}...")
    response = client.files.delete(file_id)
    print(response)

AssistantDeleted(id='asst_dGpV5uEoPqiK9oZF37JBAzAv', deleted=True, object='assistant.deleted')
deleting file id: file-5BgVZhmHgOvNFwLZl3NJhK6s...
FileDeleted(id='file-5BgVZhmHgOvNFwLZl3NJhK6s', deleted=True, object='file')
deleting file id: file-K4ztRDmVvqem23JNYpZDiTdY...
FileDeleted(id='file-K4ztRDmVvqem23JNYpZDiTdY', deleted=True, object='file')
