# Building AI Assistants with GPT Models
## Contents
1. Before you start
2. Setup
3. Upload the papers
4. Add the files to a vector store
5. Create the assistant
6. Create a conversation thread
7. Run the Assistant
8. Add another Message and Run it again

# 1. Before you start

- Make sure you have an OPENAI_API_KEY and that your account has credit
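
A quick check like the one below can catch a missing key before any API call is made. It is only a sketch: the helper name `get_api_key` is hypothetical, and it assumes the key is stored in the `OPENAI_API_KEY` environment variable, which is where the OpenAI Python client looks by default.

```python
import os


def get_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or raise a clear error."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the notebook.")
    return key
```

Calling `get_api_key()` at the top of the notebook fails early with a readable message instead of an authentication error later on.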

# 2. Setup

In [None]:
!pip install openai

In [3]:
import os
import pandas as pd
from openai import OpenAI

In [None]:
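# Replace the placeholder with your own key and pass it to the client explicitly
OPENAI_API_KEY = 'INSERT_YOUR_OPEN_API_KEY'
client = OpenAI(api_key=OPENAI_API_KEY)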

# 3. Upload the papers

In [None]:
# Put the paths of the PDF files into a DataFrame
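
One possible way to collect the file paths is sketched below. The helper name `list_pdf_files` is hypothetical, and it assumes that all papers sit directly inside a single local directory.

```python
from pathlib import Path


def list_pdf_files(directory):
    """Return the sorted paths of all PDF files directly inside `directory`."""
    folder = Path(directory)
    return sorted(
        str(path)
        for path in folder.iterdir()
        # Accept both .pdf and .PDF extensions, and skip sub-directories.
        if path.is_file() and path.suffix.lower() == ".pdf"
    )
```

The result can then be wrapped in a DataFrame, for example `pd.DataFrame({'filename': list_pdf_files('papers')})`, where `'papers'` is a placeholder directory name.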

In [None]:
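def upload_file_for_assistant(file_path):
    # Open the PDF in binary read mode and upload it for use with assistants
    with open(file_path, 'rb') as pdf_file:
        uploaded_file = client.files.create(
            file=pdf_file,
            purpose='assistants'
        )
    return uploaded_file.id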

In [None]:
# One row per paper; fill the 'filename' column with the paths to your PDF files
papers = pd.DataFrame({'filename': []})

# Assign to uploaded_file_ids
uploaded_file_ids = papers['filename'].apply(upload_file_for_assistant).to_list()

# See the result
uploaded_file_ids

# 4. Add the files to a vector store

For the assistant to search the documents and return sensible results, the documents need to be split into small chunks and added to a vector database.

The Assistants API handles the chunking stage for you, so you only need to specify the IDs of the files that you want to add to the vector store.
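
To make the idea of chunking concrete, here is a rough illustration of splitting text into overlapping pieces. It is not the API's actual implementation: the function name `chunk_words`, the word-based splitting, and the sizes are all assumptions chosen for readability.

```python
def chunk_words(text, chunk_size=200, overlap=50):
    """Split `text` into chunks of up to `chunk_size` words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears whole in one of the two chunks.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Each chunk would then be embedded and stored in the vector database, so that a question can be matched against the most relevant passages rather than the whole paper.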

In [None]:
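# Note: depending on the installed openai version, vector stores may be
# exposed as client.vector_stores instead of client.beta.vector_stores
vstore = client.beta.vector_stores.create(
    file_ids=uploaded_file_ids,
    name='vector store name',
)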
vstore
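
Files are processed in the background after the vector store is created, so it can help to wait until processing has finished before asking questions. The helper below is a sketch under assumptions: the name `wait_for_vector_store` is hypothetical, it uses the same `client.beta.vector_stores` namespace as the cell above, and it relies on the vector store object reporting a `status` of `'in_progress'` while files are still being processed.

```python
import time


def wait_for_vector_store(client, vector_store_id, poll_seconds=2.0, timeout=300.0):
    """Poll the vector store until its status leaves 'in_progress' or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        store = client.beta.vector_stores.retrieve(vector_store_id)
        # Any status other than 'in_progress' means processing has stopped.
        if store.status != "in_progress":
            return store
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Vector store {vector_store_id} is still processing")
        time.sleep(poll_seconds)
```

A call such as `wait_for_vector_store(client, vstore.id)` returns the refreshed vector store object, whose status can be inspected before moving on.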

# 5. Create the assistant

The assistant needs a prompt describing how it should behave. This consists of a few paragraphs of text that tell GPT what its role is, what it should talk about, and how to phrase its responses.

    Pro tip:

Like any other piece of writing, the assistant prompt can be generated with ChatGPT (or any other LLM). The prompt below was drafted by ChatGPT and has had only minor human editing. Here is the ChatGPT prompt I used to create the assistant prompt:

    I'm going to make a GPT assistant that explains the content of journal articles about artificial general intelligence. The assistant, named "Aggie", must be able to read arxiv papers in PDF form, and explain the contents of those papers to an audience of data scientists. Please suggest a good instruction prompt for the AI assistant.

In [None]:
assistant_prompt = """Instruction for Aggie - The AGI Paper Explainer

Role:
You are Aggie, an expert AI assistant specialized in analyzing and explaining research papers on Artificial General Intelligence (AGI). Your primary audience consists of data scientists who are proficient in AI and machine learning but may not have expertise in every theoretical or mathematical aspect of AGI research.

Capabilities:

Read and process PDF files of academic papers from arXiv.
Extract key ideas, methodologies, results, and implications from these papers.
Summarize content in a structured and digestible manner.
Provide technical explanations with clarity, focusing on what is important for data scientists.
Simplify complex mathematical or theoretical concepts without losing depth.
Compare findings with previous research when relevant.
Identify potential real-world applications or limitations of the research.
Answer follow-up questions and clarify specific sections of the paper as requested.

Output Style:

Use a structured format: Title, Authors, Abstract Summary, Key Contributions, Methodology, Results, Discussion, and Implications.
Keep summaries concise yet detailed enough for data scientists to grasp key insights.
When necessary, include diagrams or pseudocode to enhance understanding.
Use professional but approachable language—technical yet not overly academic.
If a topic is highly theoretical, provide intuitive analogies where appropriate.

Examples of Requests You Handle:

"Summarize this AGI paper in 500 words."
"Explain the methodology in layman's terms but with technical accuracy."
"Compare this paper's findings with DeepMind's latest work on AGI."
"Does this paper propose a practical implementation, or is it purely theoretical?"

Additional Constraints:

Always ensure factual accuracy and avoid over-simplifications that may mislead.
Be neutral and objective—do not exaggerate claims or assume a paper’s conclusions are definitive unless clearly stated by the authors.
If an arXiv paper has known critiques or debates surrounding it, briefly mention them.
"""

Define the assistant and assign it to aggie:
- Call it "Aggie" (or another memorable name)
- Give it the assistant_prompt as its instructions
- Set the model to gpt-4o
- Give it access to the file search tool
- Give it access to the vector store resource

In [None]:
# Define the assistant. Assign to Aggie
aggie = client.beta.assistants.create(
    name='Aggie',
    instructions=assistant_prompt,
    model='gpt-4o',
    tools=[{'type': 'file_search'}],
    tool_resources={'file_search': {'vector_store_ids': [vstore.id]}}
)

# See the result
aggie

You can view the assistants in your account on the OpenAI platform.

# 6. Create a conversation thread

In [None]:
conversation = client.beta.threads.create()

conversation

Add a user message to the conversation. Assign to msg_what_is_agi
- Give it the thread id
- Make it a user message
- Ask "What are the most common definitions of AGI?"

In [None]:
msg_what_is_agi = client.beta.threads.messages.create(
    thread_id=conversation.id,
    role='user',
    content="What are the most common definitions of AGI?"
)

# See the result
msg_what_is_agi

# 7. Run the Assistant

In [None]:
# Run the code to define an event handler
from typing_extensions import override
from openai import AssistantEventHandler


class EventHandler(AssistantEventHandler):
    @override
    def on_text_created(self, text) -> None:
        # Called once when the assistant starts a new text response
        print('assistant > ', end='', flush=True)

    @override
    def on_text_delta(self, delta, snapshot):
        # Called for each streamed fragment of text
        print(delta.value, end='', flush=True)

    def on_tool_call_created(self, tool_call):
        # Called when the assistant starts using a tool, such as file search
        print(f'assistant > {tool_call.type}', flush=True)

    def on_tool_call_delta(self, delta, snapshot):
        # Only code interpreter calls stream their input and output
        if delta.type == 'code_interpreter':
            if delta.code_interpreter.input:
                print(delta.code_interpreter.input, end='', flush=True)
            if delta.code_interpreter.outputs:
                print('output > ', flush=True)
                for output in delta.code_interpreter.outputs:
                    if output.type == 'logs':
                        print(f'{output.logs}', flush=True)

Finally, we are ready to run the assistant and get it to answer our question. The code is the same every time, so we can wrap it in a function.

Streaming responses means that the text is displayed a few words at a time, rather than waiting for the entire text to be generated and printing it all at once.
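
The difference can be imitated without calling the API at all. In the sketch below, `fake_stream` and `print_streamed` are hypothetical helpers: the first stands in for the stream of text fragments, and the second prints each fragment as soon as it arrives, much like an event handler that prints every text delta.

```python
import sys
import time


def fake_stream(text, words_per_chunk=3):
    """Yield `text` a few words at a time, imitating streamed text deltas."""
    words = text.split(" ")
    for start in range(0, len(words), words_per_chunk):
        chunk = " ".join(words[start:start + words_per_chunk])
        # Re-attach the separating space to every chunk except the last one.
        if start + words_per_chunk < len(words):
            chunk += " "
        yield chunk


def print_streamed(chunks, delay=0.0, out=sys.stdout):
    """Write each chunk as soon as it arrives and return the full text."""
    received = []
    for chunk in chunks:
        out.write(chunk)
        out.flush()  # Show the fragment immediately instead of buffering it.
        received.append(chunk)
        time.sleep(delay)
    out.write("\n")
    return "".join(received)
```

Running `print_streamed(fake_stream("some long answer from the assistant"), delay=0.2)` shows the words appearing gradually, while the returned string is the same text that a non-streaming call would deliver in one piece.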

In [None]:
def run_aggie():
    with client.beta.threads.runs.stream(
        thread_id=conversation.id,
        assistant_id=aggie.id,
        event_handler=EventHandler(),
    ) as stream:
        stream.until_done()

In [None]:
run_aggie()

# 8. Add another Message and Run it again

In [None]:
# Create another user message, adding it to the conversation. Assign to msg_how_close_agi
msg_how_close_agi = client.beta.threads.messages.create(
    thread_id=conversation.id,
    role='user',
    content="How close are we to deploying AGI?"
)

msg_how_close_agi

In [None]:
run_aggie()
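
After several runs, it can be useful to review the whole conversation in one place. The sketch below assumes that messages can be listed per thread in ascending order and that each message holds a list of content blocks; the helper name `conversation_text` is hypothetical.

```python
def conversation_text(client, thread_id):
    """Return the thread's messages, oldest first, as 'role: text' lines."""
    messages = client.beta.threads.messages.list(thread_id=thread_id, order="asc")
    lines = []
    for message in messages:
        for block in message.content:
            # Skip non-text content blocks such as images.
            if block.type == "text":
                lines.append(f"{message.role}: {block.text.value}")
    return "\n".join(lines)
```

For example, `print(conversation_text(client, conversation.id))` prints both questions and both of Aggie's answers in the order they were exchanged.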