## Assistants API - Knowledge Retrieval

Retrieval augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. Once a file is uploaded and passed to the Assistant, OpenAI will automatically chunk your documents, index and store the embeddings, and implement vector search to retrieve relevant content to answer user queries.

https://platform.openai.com/docs/assistants/tools/knowledge-retrieval
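To make the chunk, embed, and search steps concrete, here is a toy sketch in plain Python. It is not OpenAI's implementation: the chunk size, the overlap, and the word-count "embedding" are all assumptions chosen for illustration, and the helper names (`chunk_text`, `embed`, `cosine`, `search`) are hypothetical. A real system would call an embedding model and a vector index instead.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (sizes are arbitrary assumptions)."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]
```

For example, `search("uvicorn workers", chunk_text(document_text))` would return the chunks of a hypothetical `document_text` string that share the most words with the query.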

#### How it works
The model decides when to retrieve content based on the user Messages. The Assistants API automatically chooses between two retrieval techniques:

1. For short documents, it passes the file content in the prompt.
2. For longer documents, it performs a vector search.

Retrieval currently optimizes for quality by adding all relevant content to the context of model calls. According to the OpenAI documentation, other retrieval strategies are planned so that developers can choose a different tradeoff between retrieval quality and model usage cost.

https://platform.openai.com/docs/assistants/tools/how-it-works
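The choice between the two techniques can be pictured as a simple size check. The sketch below is only an illustration under stated assumptions: the cut-off is not given in this section, so `max_prompt_words` is a made-up threshold, word count stands in for token count, and `keyword_search` is a hypothetical stand-in for a real vector search.

```python
from typing import Callable


def keyword_search(query: str, document: str) -> list[str]:
    """Stand-in for vector search: keep sentences that share a word with the query."""
    terms = set(query.lower().split())
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    return [s for s in sentences if terms & set(s.lower().split())]


def build_context(
    document: str,
    query: str,
    search: Callable[[str, str], list[str]] = keyword_search,
    max_prompt_words: int = 200,  # hypothetical threshold, not an API value
) -> tuple[str, str]:
    """Return (strategy, context) for a single document and query."""
    if len(document.split()) <= max_prompt_words:
        # Short document: pass the whole content in the prompt.
        return "full_document", document
    # Longer document: pass only the passages the search returns.
    return "vector_search", "\n".join(search(query, document))
```

Putting everything relevant into the context favours answer quality, while the search path keeps the prompt (and therefore the usage cost) smaller for long files.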

![ALT TEXT](https://raw.githubusercontent.com/panaverse/learn-generative-ai/788f968387b0ad38bc379c3a4718400e8e42948d/03_chatgpt/07_assistants_knowledge_retrieval/ret.jpg)

![ALT TEXT](https://raw.githubusercontent.com/panaverse/learn-generative-ai/788f968387b0ad38bc379c3a4718400e8e42948d/03_chatgpt/07_assistants_knowledge_retrieval/objects.jpeg)


https://cobusgreyling.medium.com/openai-assistant-with-retriever-tool-08e9158ca900

In [1]:
from dotenv import load_dotenv,find_dotenv
import os

load_dotenv(find_dotenv('C:/Code_Apps/Learn-Generative-AI/03_chatgpt/.env'))

api_key = os.environ.get('OPENAI_API_KEY')

In [2]:
from openai import OpenAI
client = OpenAI()  # picks up OPENAI_API_KEY from the environment loaded above

### Step 1: Upload the file and Create an Assistant

In [29]:
from openai.types.beta import Assistant

# Upload each file with the "assistants" purpose; the with-blocks close the file handles after upload
with open("Muhammad_Talha_Khalid_013.pdf","rb") as pdf:
    file1 = client.files.create(
        file=pdf,
        purpose="assistants"
    )

with open("fastapi-modern-python-web-development.pdf","rb") as pdf:
    file2 = client.files.create(
        file=pdf,
        purpose="assistants"
    )

list([file1,file2])

In [6]:
assistant : Assistant = client.beta.assistants.create(
    name="File Searching",
    model="gpt-3.5-turbo-1106",
    tools=[{"type":"retrieval"}],
    instructions="You are an expert teacher, understand concepts and explain it in simple words",
    file_ids=[file1.id,file2.id]
)

### Step 2: Create a Thread

In [7]:
from openai.types.beta.thread import Thread

thread : Thread = client.beta.threads.create()

dict(thread)

{'id': 'thread_6rjJIohXvSZowWkFZZvxFLH1',
 'created_at': 1706980067,
 'metadata': {},
 'object': 'thread'}

### Step 3: Add Messages to Thread

In [20]:
from openai.types.beta.threads.thread_message import ThreadMessage

message1 : ThreadMessage = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What are main teaching methods?"
)
message1

ThreadMessage(id='msg_z1kWa32A4NxL4RIyhvl3VQOo', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='What are main teaching methods?'), type='text')], created_at=1706980732, file_ids=[], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_6rjJIohXvSZowWkFZZvxFLH1')

In [21]:
message2 : ThreadMessage = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What are multiple workers"
)
message2

ThreadMessage(id='msg_vCkGZWmk4xUjuscU68cAXQmR', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='What are multiple workers'), type='text')], created_at=1706980740, file_ids=[], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_6rjJIohXvSZowWkFZZvxFLH1')

### Step 4: Run The Assistant

In [26]:
from openai.types.beta.threads.run import Run

# Run-level instructions override the Assistant's own instructions for this run only
run : Run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="Try to answer in short way"
)
run

Run(id='run_Wt5SIL72FDSaJJd8wzaGlrM5', assistant_id='asst_yCs8XcnbUZ2QUv5ozJ7iG4he', cancelled_at=None, completed_at=None, created_at=1706981016, expires_at=1706981616, failed_at=None, file_ids=['file-UcGhriFsd9Uoze7k9o5DhtkQ', 'file-afX2QALZrDqMP6bWENbCuVLd'], instructions='Try to answer in short way', last_error=None, metadata={}, model='gpt-3.5-turbo-1106', object='thread.run', required_action=None, started_at=None, status='queued', thread_id='thread_6rjJIohXvSZowWkFZZvxFLH1', tools=[ToolAssistantToolsRetrieval(type='retrieval')], usage=None)

### Step 5: Check The Run Status

In [27]:
client.beta.threads.runs.retrieve(
    run_id=run.id,
    thread_id=thread.id
)

Run(id='run_Wt5SIL72FDSaJJd8wzaGlrM5', assistant_id='asst_yCs8XcnbUZ2QUv5ozJ7iG4he', cancelled_at=None, completed_at=None, created_at=1706981016, expires_at=1706981616, failed_at=None, file_ids=['file-UcGhriFsd9Uoze7k9o5DhtkQ', 'file-afX2QALZrDqMP6bWENbCuVLd'], instructions='Try to answer in short way', last_error=None, metadata={}, model='gpt-3.5-turbo-1106', object='thread.run', required_action=None, started_at=1706981016, status='in_progress', thread_id='thread_6rjJIohXvSZowWkFZZvxFLH1', tools=[ToolAssistantToolsRetrieval(type='retrieval')], usage=None)
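The run above is still `in_progress`, so a single status check is usually not enough. One possible approach is to poll until the run reaches a final state. The helper below is a minimal sketch, not part of the OpenAI library: `wait_for_run`, its `interval` and `timeout` defaults, and the `fetch_status` callable are assumptions made for illustration. The status names are the ones the Run object reports.

```python
import time
from typing import Callable

# Statuses after which the run will not progress on its own.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}


def wait_for_run(
    fetch_status: Callable[[], str],
    interval: float = 1.0,
    timeout: float = 120.0,
) -> str:
    """Call fetch_status until the run finishes, needs action, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        # "requires_action" is not final, but the run waits for the caller to respond.
        if status in TERMINAL_STATUSES or status == "requires_action":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run still {status!r} after {timeout} seconds")
        time.sleep(interval)
```

In this notebook, `fetch_status` could be `lambda: client.beta.threads.runs.retrieve(run_id=run.id, thread_id=thread.id).status`, and the messages would be listed only once the helper returns `"completed"`.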

### Step 6: Display The Assistant Response

In [28]:
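from openai.pagination import SyncCursorPage

# messages.list returns a paginated page object (newest message first), not a plain list
response : SyncCursorPage[ThreadMessage] = client.beta.threads.messages.list(
    thread_id=thread.id
)

display(list(response))

# Print the conversation oldest-first
for msg in reversed(response.data):
    print(f"{msg.role}: {msg.content[0].text.value}")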


response.data

[ThreadMessage(id='msg_AO5a4FjVRB8piR54efVlqajE', assistant_id='asst_yCs8XcnbUZ2QUv5ozJ7iG4he', content=[MessageContentText(text=Text(annotations=[], value='Multiple workers in the context of FastAPI and ASGI (Asynchronous Server Gateway Interface) refer to the ability to manage multiple worker processes to handle requests. FastAPI is based on ASGI, and it uses a special Uvicorn worker class that can be managed by Gunicorn to supervise multiple workers. Both Gunicorn and Uvicorn can start multiple worker processes, with Gunicorn typically needing 4-12 worker processes to handle hundreds or thousands of requests per second【18†source】.'), type='text')], created_at=1706981017, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_Wt5SIL72FDSaJJd8wzaGlrM5', thread_id='thread_6rjJIohXvSZowWkFZZvxFLH1'),
 ThreadMessage(id='msg_q0RLfJerbDDQU4VGKvZsrJjH', assistant_id='asst_yCs8XcnbUZ2QUv5ozJ7iG4he', content=[MessageContentText(text=Text(annotations=[], value='The ma

user: What are main teaching methods?
user: What are multiple workers
user: What are main teaching methods?
user: What are multiple workers
assistant: The main teaching methods include:

1. Demonstration method: Involves the explanation and illustration of a concept or skill through performing a task, explaining the steps involved, and using visual aids to enhance understanding【9†source】.

2. Project method: Involves students working on an extended in-depth project over a period, emphasizing hands-on experiential learning, research, problem-solving, and presentation of their projects【9†source】.

3. Micro-teaching: Involves short focused teaching sessions in a controlled environment, allowing novice teachers to practice and refine their teaching skills, receive constructive feedback, and engage in self-reflection【9†source】.

As for "multiple workers," I will look for relevant information.
assistant: Multiple workers in the context of FastAPI and ASGI (Asynchronous Server Gateway Interfa

[ThreadMessage(id='msg_AO5a4FjVRB8piR54efVlqajE', assistant_id='asst_yCs8XcnbUZ2QUv5ozJ7iG4he', content=[MessageContentText(text=Text(annotations=[], value='Multiple workers in the context of FastAPI and ASGI (Asynchronous Server Gateway Interface) refer to the ability to manage multiple worker processes to handle requests. FastAPI is based on ASGI, and it uses a special Uvicorn worker class that can be managed by Gunicorn to supervise multiple workers. Both Gunicorn and Uvicorn can start multiple worker processes, with Gunicorn typically needing 4-12 worker processes to handle hundreds or thousands of requests per second【18†source】.'), type='text')], created_at=1706981017, file_ids=[], metadata={}, object='thread.message', role='assistant', run_id='run_Wt5SIL72FDSaJJd8wzaGlrM5', thread_id='thread_6rjJIohXvSZowWkFZZvxFLH1'),
 ThreadMessage(id='msg_q0RLfJerbDDQU4VGKvZsrJjH', assistant_id='asst_yCs8XcnbUZ2QUv5ozJ7iG4he', content=[MessageContentText(text=Text(annotations=[], value='The ma
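The assistant's answers above contain inline markers such as `【18†source】`, while the `annotations` list in the same output is empty. If the text is shown to end users, one option is to strip those markers. The sketch below assumes the marker format is exactly what appears in the output above; `strip_citations` is a hypothetical helper, not an API function.

```python
import re

# Matches markers like 【18†source】, plus any whitespace directly before them.
CITATION_MARKER = re.compile(r"\s*【\d+†[^】]*】")


def strip_citations(text: str) -> str:
    """Remove inline source markers from a message's text value."""
    return CITATION_MARKER.sub("", text)
```

It could be applied inside the print loop, for example `strip_citations(msg.content[0].text.value)`.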