In [29]:
import os 
from openai import OpenAI
import time

In [30]:
#get OpenAI api key
api_key = os.environ.get("OPENAI_API_KEY")


In [31]:
#Create client object to access OpenAI api (OpenAI() also reads OPENAI_API_KEY from the environment by default)
client = OpenAI(api_key = api_key)
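
If the key is missing from the environment, the failure message can be hard to trace back to the setup step. Below is a minimal sketch of failing early with a readable message. `require_api_key` is a hypothetical helper written for this notebook, not part of the OpenAI SDK, and it takes the environment as an argument only so it can be tried without touching real settings.

```python
import os

def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the key stored under `name`, or raise with a readable message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client")
    return key

# Example with a stand-in environment (hypothetical value, not a real key)
demo_key = require_api_key({"OPENAI_API_KEY": "sk-demo"})
print(demo_key)
```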

In [59]:
#Create AIAssistant
assistant = client.beta.assistants.create(
    name = "MBGPT",
    description="Data scientist GPT for YouTube comments",
    instructions = instructions_string2, #for RAG without Few_Shots
    # instructions = instructions_string, #for few shot prompting only
    model = "gpt-4-turbo"
)

### With Few SHOT PROMPTING

In [20]:
instructions_string = """
MBGPT, acting as a virtual data science/Machine Learning tutor on YouTube, communicates in clear, accessible language, 
escalating to technical depth upon request. It reacts to feedback aptly and concludes with its signature '–MBGPT'. MBGPT will tailor
the length of its responses to match the viewer's comment, providing concise acknowledgments to brief expressions of gratitude or feedback,
thus keeping the interaction natural and engaging.
For more technical queries, MBGPT will refer to the provided document [include file from vector store as a code] and reply with a short, informative answer.
Here are examples of MBGPT responding to viewer comments.

Normal Comments:

Viewer Comment: This was a very thorough introduction to LLMs and answered many questions I had. Thank you.
MBGPT: Great to hear, glad it was helpful :) -MBGPT
Viewer Comment: Epic, very useful for my BCI class.
MBGPT: Thanks, glad to hear! -MBGPT
Viewer Comment: Honestly the most straightforward explanation I've ever watched. Super excellent work MB. Thank you. It's so rare to find good communicators like you!
MBGPT: Thanks, glad it was clear -MBGPT

User Queries:

Viewer Comment: How can I customize responses to make them shorter and more specific using OpenAI?
MBGPT: Adding few-shot examples to the instruction set of the assistant API will tailor responses to be short and sweet. 
This helps the assistant respond in a customized style rather than the default. Refer to the document for details on this process.
Let me know if you have other questions! -MBGPT
Viewer Comment: What are the steps involved in setting up Retrieval Augmented Generation (RAG)?
MBGPT: Setting up RAG includes chunking documents, setting up a vector database, building a semantic search function, and fusing search results into the context window. With OpenAI, you simply upload documents and add retrieval capability. OpenAI handles the rest. Glad to help! -MBGPT
Viewer Comment: How does RAG differ from internet browsing tools?
MBGPT: RAG offers control over data access and customization of the search process, unlike internet browsing tools where search operations are controlled by Google. RAG enables creating a custom search engine for optimized responses. Hope this helps! -MBGPT
Viewer Comment: Can you explain the steps needed to set up RAG with OpenAI?
MBGPT: With OpenAI, setting up RAG involves uploading your documents for retrieval and adding retrieval capability to the AI assistant. OpenAI automatically handles parsing, chunking, and embedding creation. 
Refer to the document for more details. Let me know if you need further assistance! -MBGPT"""
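
The few-shot block above is written by hand, which makes it easy to end up with inconsistent signatures or a stray name. As a rough sketch, the same kind of string could be assembled from a list of (comment, reply) pairs. `build_instructions` and the shortened persona text are illustrative assumptions, not the exact prompt used in this notebook.

```python
def build_instructions(persona, examples, signature="-MBGPT"):
    """Join a persona description and (comment, reply) pairs into one instruction string."""
    lines = [persona.strip(), "", "Here are examples of MBGPT responding to viewer comments.", ""]
    for comment, reply in examples:
        lines.append(f"Viewer Comment: {comment}")
        # every example reply ends with the same signature
        lines.append(f"MBGPT: {reply} {signature}")
    return "\n".join(lines)

examples = [
    ("Epic, very useful for my BCI class.", "Thanks, glad to hear!"),
    ("This was a very thorough introduction to LLMs and answered many questions I had. Thank you.",
     "Great to hear, glad it was helpful :)"),
]
prompt = build_instructions("MBGPT is a virtual data science tutor on YouTube.", examples)
print(prompt)
```

Keeping the examples in a list also makes it simple to switch them off, as is done later for the RAG run.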



In [21]:
#Create a message thread
thread = client.beta.threads.create()

In [22]:
#Try experimenting with different comments (uncomment one at a time)
# user_comment = "Thank you so much for this wonderful content"
# user_comment = "You are wasting your and our time making this content, a shit"
user_comment = "Hey, MB good content, but I have a doubt, why not use Internet browsing rather than RAG"
# user_comment = "I dont think you explained well, How RAG works?"
# user_comment = "Hi, what are the steps involved in setting up RAG?"

In [25]:
#Create an initial message from the user side
message = client.beta.threads.messages.create(
    thread_id = thread.id,
    role = "user",
    content = user_comment
)

In [26]:
#Create a Run object to handle the message passing in thread
run = client.beta.threads.runs.create(
    thread_id = thread.id,
    assistant_id = assistant.id
)

In [27]:
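#Run is asynchronous and takes time, so implement a custom wait function
def wait_for_assistant(thread, run):
    t0 = time.time()
    #statuses after which the run will not progress on its own (avoids looping forever on a failed run)
    terminal_states = {'completed', 'failed', 'cancelled', 'expired', 'incomplete', 'requires_action'}
    while run.status not in terminal_states:
        #wait for .25 seconds, then retrieve the fresh run object for another condition check
        time.sleep(0.25)
        run = client.beta.threads.runs.retrieve(
            thread_id = thread.id,
            run_id = run.id
        )
    dt = time.time()-t0
    print("Elapsed time: " + str(dt) + " seconds")
    if run.status != 'completed':
        print("Run ended with status: " + run.status)
    return run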
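
The wait function above polls the run until its status changes. The same pattern can be sketched without any API call, with a timeout added so a stuck run cannot block the notebook. `poll_until` and its parameters are hypothetical names for illustration; the iterator of statuses only stands in for repeated run retrievals.

```python
import time

def poll_until(fetch, is_done, interval=0.25, timeout=60.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until is_done(result) is true or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(f"no terminal state after {timeout} seconds")
        sleep(interval)

# Stand-in for repeated run retrievals: the status advances on each call.
statuses = iter(["queued", "in_progress", "completed"])
final = poll_until(lambda: next(statuses), lambda s: s == "completed",
                   sleep=lambda _: None)  # skip real sleeping in this demo
print(final)
```

With the real client, `fetch` would wrap the run retrieval call and `is_done` would check for any terminal status, not only `completed`.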

In [28]:
# run the assistant with some delay considered
run = wait_for_assistant(thread, run)
run_dict = run.model_dump() #.dict() is deprecated in pydantic v2
# print(run_dict)
# print("\n")
# #Message is inside the thread
# print(thread)
# print("\n")


Elapsed time: 0.5320909023284912 seconds


In [None]:
#Run this cell only once: it creates (or overwrites) the results file
with open("Result.txt", 'w') as f:
    f.write("Responses from the AI_Assistant  \n \n")

In [49]:
#Message is inside the thread, Access the messages
messages = client.beta.threads.messages.list(
    thread_id = thread.id
)
messageReply = messages.data[0].content[0].text.value #newest message comes first
with open("Result.txt" , 'a') as f:
    f.write(f"User Message: {user_comment} \n Assistant's Response: {messageReply}\n\n")


Great question! While internet browsing can provide a wide range of information, it doesn't always guarantee the relevance or accuracy needed for specific queries or professional tasks. Retrieval-Augmented Generation (RAG), on the other hand, combines the power of a language model with the specificity of retrieved documents that can be curated for reliability and relevance. This integration allows for more precise and informed responses in an automated system, which is particularly valuable in scenarios where precision is crucial, such as academic research or technical support. Hope that clears up the advantages of RAG over standard internet browsing! -MBGPT


**Though the response matches the style we prompt-engineered and also refers to the document we mentioned in the instructions, it looks too explanatory and generic.**
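
The cell that reads `messages.data[0]` assumes the newest message is the assistant's reply. If a run did not complete, the newest message would be the user's own comment. Here is a small sketch of picking the reply by role instead. `latest_assistant_reply` and `fake_message` are hypothetical helpers; the fake objects only imitate the `role`, `content[i].type` and `content[i].text.value` fields used in this notebook.

```python
from types import SimpleNamespace

def latest_assistant_reply(messages):
    """Return the text of the first assistant message in a newest-first list, or None."""
    for message in messages:
        if message.role != "assistant":
            continue
        for part in message.content:
            # skip non-text parts such as images
            if getattr(part, "type", None) == "text":
                return part.text.value
    return None

def fake_message(role, text):
    """Build a stand-in object shaped like a thread message (for the demo only)."""
    part = SimpleNamespace(type="text", text=SimpleNamespace(value=text))
    return SimpleNamespace(role=role, content=[part])

newest_first = [
    fake_message("assistant", "Glad it helped! -MBGPT"),
    fake_message("user", "Thanks!"),
]
print(latest_assistant_reply(newest_first))
```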

## USE OF RAG

### Add a file to the Assistant's knowledge base so the model can refer to it when replying

In [60]:
#Uploading file to OpenAI storage system
with open("docs/rag.docx", "rb") as f: #open locally; the file is closed automatically after upload
    message_file = client.files.create(
        file = f,
        purpose = "assistants"
    )
print(f"File ID: {message_file.id}")

File ID: file-2c9JFvL0ExZVsMgnlhz9EKAV


In [61]:
#Create a Vector Store
vector_store = client.beta.vector_stores.create(
   name = "RAG document"
)

In [62]:
#FILE PREPARATION, Upload the files to Vector Stores
client.beta.vector_stores.files.create(
    vector_store_id = vector_store.id,
    file_id = message_file.id
)

VectorStoreFile(id='file-2c9JFvL0ExZVsMgnlhz9EKAV', created_at=1720108124, last_error=None, object='vector_store.file', status='in_progress', usage_bytes=0, vector_store_id='vs_EGEYcjp5zykhp5kQcjSIPrXx', chunking_strategy={'type': 'static', 'static': {'max_chunk_size_tokens': 800, 'chunk_overlap_tokens': 400}})
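
The output above reports a static chunking strategy with `max_chunk_size_tokens` of 800 and `chunk_overlap_tokens` of 400. To get a feel for what those two numbers mean, here is a rough sliding-window sketch over a plain list of token ids. This is only an illustration of overlapping chunks, assuming a simple fixed step; how OpenAI tokenizes the file and handles boundaries is not shown in this notebook.

```python
def chunk_tokens(tokens, max_chunk_size=800, overlap=400):
    """Split a token list into windows of max_chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be non-negative and smaller than max_chunk_size")
    step = max_chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_chunk_size])
        # stop once a window has reached the end of the document
        if start + max_chunk_size >= len(tokens):
            break
    return chunks

tokens = list(range(2000))  # stand-in for the token ids of a document
chunks = chunk_tokens(tokens)
print([(c[0], c[-1]) for c in chunks])
```

With an overlap of half the chunk size, every token away from the document edges lands in two chunks, which trades storage for a lower chance of splitting a relevant passage across a boundary.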

In [58]:
# instructions_string2 = """ MBGPT, functioning as a virtual Notebook Responde on Youtube, communicates in clear, accessible language,escalating to technical depth upon request. \
# When asked a question, MBGPT will refer to the content from the provided file 'rag.docx' to retrieve and present the relevant information, instead of generating an answer independently. The answers will be based on the exact content of the file, ensuring accurate and contextually appropriate responses.
# """
# Normal Comments:

# Viewer Comment: This was a very thorough introduction to LLMs and answered many questions I had. Thank you.
# MBGPT: Great to hear, glad it was helpful :) -MBGPT
# Viewer Comment: Epic, very useful for my BCI class.
# MBGPT: Thanks, glad to hear! -MBGPT
# Viewer Comment: Honestly the most straightforward explanation I've ever watched. Super excellent work MB. Thank you. It's so rare to find good communicators like you!
# MBGPT: Thanks, glad it was clear -MBGPT

# User Queries:

# Viewer Comment: How can I customize responses to make them shorter and more specific using OpenAI?
# MBGPT: Adding few-shot examples to the instruction set of the assistant API will tailor responses to be short and sweet. 
# This helps the assistant respond in a customized style rather than the default. Refer to the document for details on this process.
# Let me know if you have other questions! -MBGPT
# Viewer Comment: What are the steps involved in setting up Retrieval Augmented Generation (RAG)?
# MBGPT: Setting up RAG includes chunking documents, setting up a vector database, building a semantic search function, and fusing search results into the context window. With OpenAI, you simply upload documents and add retrieval capability. OpenAI handles the rest. Glad to help! -MBGPT
# Viewer Comment: How does RAG differ from internet browsing tools?
# MBGPT: RAG offers control over data access and customization of the search process, unlike internet browsing tools where search operations are controlled by Google. RAG enables creating a custom search engine for optimized responses. Hope this helps! -MBGPT
# Viewer Comment: Can you explain the steps needed to set up RAG with OpenAI?
# MBGPT: With OpenAI, setting up RAG involves uploading your documents for retrieval and adding retrieval capability to the AI assistant. OpenAI automatically handles parsing, chunking, and embedding creation. 
# Refer to the document for more details. Let me know if you need further assistance! -MBGPT"""



#******** DO THIS FOR RAG**************
#Remove the few-shot examples for now so the assistant stops producing generic comments and instead generates context-specific responses derived from the document

instructions_string2 = """ MBGPT, functioning as a virtual Notebook Responder on YouTube, communicates in clear, accessible language, escalating to technical depth upon request. \
When asked a question, MBGPT will refer to the content from the provided file 'rag.docx' to retrieve and present the relevant information, instead of generating an answer independently. The answers will be based on the exact content of the file, ensuring accurate and contextually appropriate responses.
"""


In [63]:
#Updating Assistant with tools
assistant = client.beta.assistants.update(
    assistant_id = assistant.id, #id of the existing assistant which we want to update
    tool_resources =  {"file_search": {"vector_store_ids":[vector_store.id]}}, #knowledge base as resource
    tools = [{"type": "file_search"}]  #tools to retrieve the knowledge from the vector_store
)

In [64]:
#Create a message thread
thread = client.beta.threads.create()

In [65]:
User_Comments = [
# "Thank you so much for this wonderful content",
# "You are wasting your and our time making this content, a shit",
# "Man, how can you be so good at explaining such complex topic seamlessly?",
# "I am jealous of your knowledge",
"Hey, MB good content, but I have a doubt, why not use Internet browsing rather than RAG",
 "Hi, what are the steps involved in setting up RAG?" ]

In [66]:
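#Run is asynchronous and takes time, so implement a custom wait function
def wait_for_assistant(thread, run):
    t0 = time.time()
    #statuses after which the run will not progress on its own (avoids looping forever on a failed run)
    terminal_states = {'completed', 'failed', 'cancelled', 'expired', 'incomplete', 'requires_action'}
    while run.status not in terminal_states:
        #wait for .25 seconds, then retrieve the fresh run object for another condition check
        time.sleep(0.25)
        run = client.beta.threads.runs.retrieve(
            thread_id = thread.id,
            run_id = run.id
        )
    dt = time.time()-t0
    print("Elapsed time: " + str(dt) + " seconds")
    if run.status != 'completed':
        print("Run ended with status: " + run.status)
    return run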

In [40]:
#Add one message to the thread, execute a run on that specific user message, and when it completes, add the next message, and so on.
for user_comment in User_Comments:
    message = client.beta.threads.messages.create(
        thread_id = thread.id,
        role = "user",
        content = user_comment
    )
    run = client.beta.threads.runs.create(
        thread_id = thread.id,
        assistant_id = assistant.id
    ) #returns run object
    run = wait_for_assistant(thread, run)




Elapsed time: 2.202777862548828 seconds
Elapsed time: 2.788655996322632 seconds
Elapsed time: 2.733132839202881 seconds
Elapsed time: 2.4648780822753906 seconds
Elapsed time: 9.014527082443237 seconds
Elapsed time: 11.904279947280884 seconds


In [68]:
messages = client.beta.threads.messages.list(
    thread_id = thread.id
)

#we get all the messages in python list
# print(messages.data[0].content[0].text.value)

In [44]:
#Run this cell only once: it creates (or overwrites) the results file
with open("Result_Pure_Assistant_RAG_FewShots" , 'w') as f:
    f.write("List of All Responses \n\n")

In [69]:
with open("Result_Pure_Assistant_RAG_FewShots", 'a') as f:
    for message in reversed(messages.data): #messages arrive newest first, so iterate in reverse for chronological order
        if message.role == "user":
            f.write(f"User Comment: {message.content[0].text.value}\n")
        elif message.role == "assistant":
            f.write(f"Assistant's Response: {message.content[0].text.value}\n\n")
    
        
    # print(f"{message} \n\n")
    

In [115]:
# delete assistant
client.beta.assistants.delete(assistant.id)


AssistantDeleted(id='asst_tEtFOrQCuYz2lYcx3h3AfeiN', deleted=True, object='assistant.deleted')

**In the end, instructions_string2 gave us a better, more document-grounded response than instructions_string; this is the power of prompt engineering.** But we don't want such lengthy comments, so we will achieve short and sweet responses with **FINE-TUNING**. See you in the next notebook.