# GenerativeAI4DS-I
## Lab. Financial Assistant


## What I hope you'll get out of this lab
* A sense of where to start when you need to consume OpenAI services.
* Familiarity with OpenAI's best practices for developing assistants.

In [31]:
!pip install openai



In [32]:
!pip install --upgrade openai



In [33]:
from openai import OpenAI
import os
import json
from IPython.display import display, HTML  # IPython.core.display is deprecated for these names

In [34]:
import warnings

# Filter out all DeprecationWarnings
warnings.filterwarnings("ignore", category=DeprecationWarning)

In [35]:
def show_json(obj):
    display(json.loads(obj.model_dump_json()))

In [36]:
# Clone the course repository so its dataset files are available in Google Colab
!git clone https://github.com/thousandoaks/GenerativeAI4DS-I.git

fatal: destination path 'GenerativeAI4DS-I' already exists and is not an empty directory.


# 1. You have to get your [OpenAI API Key](https://platform.openai.com/account/api-keys)

In [None]:
# Used by the agent in this tutorial
os.environ["OPENAI_API_KEY"] = "YOUR_OWN_KEY"
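Hardcoding the key in a cell makes it easy to leak when the notebook is shared. As a minimal sketch, assuming you prefer to type the key at run time, the helper below reads `OPENAI_API_KEY` from the environment and only prompts when it is missing or still set to the placeholder. The name `load_api_key` and its parameters are illustrative and not part of the OpenAI SDK.

```python
import os
from getpass import getpass


def load_api_key(env=os.environ, prompt=getpass):
    """Return the OpenAI key from the environment, prompting only if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key or key == "YOUR_OWN_KEY":
        # Prompt without echoing, so the key never appears in the saved notebook.
        key = prompt("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No OpenAI API key was provided.")
    return key


# In the notebook this could be used as:
# os.environ["OPENAI_API_KEY"] = load_api_key()
```

The `env` and `prompt` parameters exist only so the helper can be exercised without a real key or an interactive prompt.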

In [None]:
client = OpenAI(
  api_key=os.environ['OPENAI_API_KEY'],  # this is also the default, it can be omitted
)

In [None]:
client

# 2. Financial Assistant
An Assistant represents an entity that can be configured to respond to a user's messages using several parameters like model, instructions, and tools.

This time we will create a Financial Assistant that can inspect financial documents and answer questions about them.





### 2.1. We create a new assistant with file search enabled

In [None]:
assistant = client.beta.assistants.create(
  name="Financial Analyst Assistant",
  instructions="You are an expert financial analyst. Use your knowledge base to answer questions about audited financial statements.",
  model="gpt-4o",
  tools=[{"type": "file_search"}],
)

### 2.2. We upload financial information
To access your files, the file_search tool uses the Vector Store object. Upload your files and create a Vector Store to contain them. Once the Vector Store is created, poll its status until all files have left the in_progress state, which ensures that all content has finished processing. The SDK provides helpers for uploading and polling in one call.
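To make "polling" concrete, here is a minimal sketch of the loop that such a helper performs conceptually. It is not the SDK's implementation: `poll_until_done` and its `fetch_status` callable are hypothetical names, and `fetch_status` stands in for whatever call returns the current status string.

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_status() until it stops returning "in_progress" or time runs out."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status != "in_progress":
            # Any other value (for example "completed" or "failed") ends the wait.
            return status
        if waited >= timeout:
            raise TimeoutError(f"still in_progress after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

In this lab you do not need to write the loop yourself, because the SDK helper used in the next cell uploads and polls for you.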

In [41]:
# Create a vector store called "Financial Statements"
vector_store = client.vector_stores.create(name="Financial Statements")

# Ready the files for upload to OpenAI
file_paths = ["/content/GenerativeAI4DS-I/datasets/Apple_10K.pdf","/content/GenerativeAI4DS-I/datasets/goog-10-k-2023.pdf"]
file_streams = [open(path, "rb") for path in file_paths]

# Use the upload and poll SDK helper to upload the files, add them to the vector store,
# and poll the status of the file batch for completion.
file_batch = client.vector_stores.file_batches.upload_and_poll(
  vector_store_id=vector_store.id, files=file_streams
)

# You can print the status and the file counts of the batch to see the result of this operation.
print(file_batch.status)
print(file_batch.file_counts)

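The upload cell opens the PDF files but never closes them. One possible way to tidy this up, sketched here with the standard library's `contextlib.ExitStack`, is to open all files inside a single `with` block so they are closed even if the upload is interrupted. The helper name `with_open_files` is illustrative.

```python
from contextlib import ExitStack


def with_open_files(paths, action):
    """Open every path in binary mode, pass the handles to action, then close them."""
    with ExitStack() as stack:
        streams = [stack.enter_context(open(path, "rb")) for path in paths]
        return action(streams)


# Hypothetical use with the names from the upload cell:
# file_batch = with_open_files(
#     file_paths,
#     lambda streams: client.vector_stores.file_batches.upload_and_poll(
#         vector_store_id=vector_store.id, files=streams
#     ),
# )
```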

### 2.3. Update the assistant to use the new Vector Store
To make the files accessible to your assistant, update the assistant’s tool_resources with the new vector_store id.

In [None]:
assistant = client.beta.assistants.update(
  assistant_id=assistant.id,
  tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

### 2.4. Create a thread

A thread holds the conversation between the user and the assistant. You can also attach files as Message attachments on a thread. Doing so creates another vector_store associated with the thread or, if the thread already has a vector store, adds the new files to it. When you create a Run on this thread, the file search tool queries both the assistant's vector_store and the thread's vector_store.

In this example, the message carries only the user's question and no attachment, so file search relies on the assistant's vector store, which already contains Apple's 10-K filing.

In [None]:
# Create a thread whose first message is the user's question
thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "How many shares of AAPL were outstanding at the end of October 2023?",
    }
  ]
)
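The cell above sends only the question. If you did want to attach a file to the message, as described at the start of this section, the message dictionary would gain an `attachments` entry. The sketch below only builds that dictionary. The helper name `message_with_attachment` is illustrative, and the file id is assumed to come from a file you have already uploaded.

```python
def message_with_attachment(question, file_id):
    """Build a user message that attaches one uploaded file for file search."""
    return {
        "role": "user",
        "content": question,
        "attachments": [
            # Each attachment names the file and the tool allowed to read it.
            {"file_id": file_id, "tools": [{"type": "file_search"}]}
        ],
    }


# Hypothetical use; `uploaded` stands for a file object returned by an earlier upload:
# thread = client.beta.threads.create(
#     messages=[message_with_attachment("How many shares were outstanding?", uploaded.id)]
# )
```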



### 2.5. Create a Run

Now, create a Run and observe that the model uses the File Search tool to provide a response to the user’s question.

In [None]:
# Use the create and poll SDK helper to create a run and poll the status of
# the run until it's in a terminal state.

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

messages = list(client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id))

message_content = messages[0].content[0].text
annotations = message_content.annotations
citations = []
for index, annotation in enumerate(annotations):
    message_content.value = message_content.value.replace(annotation.text, f"[{index}]")
    if file_citation := getattr(annotation, "file_citation", None):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f"[{index}] {cited_file.filename}")

print(message_content.value)
print("\n".join(citations))
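To see what the citation loop does without calling the API, the following sketch applies the same replacement logic to a stand-in annotation. The function `render_citations`, the sample text, the annotation marker, and the file id are all made up for illustration. Unlike the cell above, it returns a new string instead of modifying the message object.

```python
from types import SimpleNamespace


def render_citations(text, annotations, filename_for):
    """Replace each annotation marker with [index] and collect the cited file names."""
    citations = []
    for index, annotation in enumerate(annotations):
        text = text.replace(annotation.text, f"[{index}]")
        file_citation = getattr(annotation, "file_citation", None)
        if file_citation is not None:
            citations.append(f"[{index}] {filename_for(file_citation.file_id)}")
    return text, citations


# Stand-in for an annotation object; real ones come from the message content.
fake_annotation = SimpleNamespace(
    text="【4:0†source】",
    file_citation=SimpleNamespace(file_id="file-abc123"),
)
body, sources = render_citations(
    "Example answer text【4:0†source】.",
    [fake_annotation],
    filename_for=lambda file_id: "Apple_10K.pdf",  # replaces client.files.retrieve(...)
)
print(body)     # Example answer text[0].
print(sources)  # ['[0] Apple_10K.pdf']
```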

### 2.6. We add more messages to the same thread as needed

In [None]:
message2 = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="What was the net profit of Google in 2022 and 2023?"
)

show_json(message2)

In [None]:
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

# Inspect the run that was just created, not the message shown in the previous cell
show_json(run)

In [None]:
if run.status == 'completed':
    messages = list(client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id))

    message_content = messages[0].content[0].text
    annotations = message_content.annotations
    citations = []
    for index, annotation in enumerate(annotations):
        message_content.value = message_content.value.replace(annotation.text, f"[{index}]")
        if file_citation := getattr(annotation, "file_citation", None):
            cited_file = client.files.retrieve(file_citation.file_id)
            citations.append(f"[{index}] {cited_file.filename}")

    print(message_content.value)
    print("\n".join(citations))
else:
    print(run.status)
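When a run does not complete, the status alone rarely explains why. As a hedged sketch, assuming the run object exposes `last_error` (with `code` and `message`) and `incomplete_details` (with `reason`) as the Run object in the API reference does, a small helper can print a more useful line. The name `describe_run` is illustrative.

```python
def describe_run(run):
    """Summarise a finished run in one line, including the error if there is one."""
    if run.status == "completed":
        return "completed"
    # getattr keeps the helper tolerant of objects that lack these fields.
    last_error = getattr(run, "last_error", None)
    if last_error is not None:
        return f"{run.status}: {last_error.code} - {last_error.message}"
    details = getattr(run, "incomplete_details", None)
    if details is not None:
        return f"{run.status}: {details.reason}"
    return run.status


# In the notebook, the else branch could become:
# print(describe_run(run))
```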