## Step 1: Create a new Assistant with File Search Enabled
Create a new assistant and enable the file_search tool by listing it in the assistant's tools parameter.

In [1]:
from openai import OpenAI
from config import api_key
client = OpenAI(api_key=api_key)
client

<openai.OpenAI at 0x10c228c10>

In [2]:
assistant = client.beta.assistants.create(
  name="Story Assistant",
  instructions="You are a motivator who answers the question based on the story file",
  model="gpt-4o",
  tools=[{"type": "file_search"}],
)

## Step 2: Upload Files and Add Them to a Vector Store

- **Purpose**: Use the Vector Store object for file access via the file_search tool.
- **Tasks**:
  - **Upload Files**: 
    - Upload your files.
    - Create a Vector Store to contain the files.
  - **Monitor Processing Status**:
    - Poll the status of the Vector Store until all files are no longer in the `in_progress` state.
    - Ensure all content has finished processing.
- **Tools**: 
  - Use SDK helpers for uploading files and polling the status in one step.

In [3]:
# Create a vector store called "Story Text"
vector_store = client.beta.vector_stores.create(name="Story Text")
 
# Ready the files for upload to OpenAI
file_paths = ["story.txt"]
file_streams = [open(path, "rb") for path in file_paths]
 
# Use the upload and poll SDK helper to upload the files, add them to the vector store,
# and poll the status of the file batch for completion.
file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
  vector_store_id=vector_store.id, files=file_streams
)
 
# You can print the status and the file counts of the batch to see the result of this operation.
print(file_batch.status)
print(file_batch.file_counts)

completed
FileCounts(cancelled=0, completed=1, failed=0, in_progress=0, total=1)


In [6]:
vector_store

VectorStore(id='vs_hJpAJocyD7kagKj3VmLN7OZc', created_at=1721142113, file_counts=FileCounts(cancelled=0, completed=0, failed=0, in_progress=0, total=0), last_active_at=1721142113, metadata={}, name='Story Text', object='vector_store', status='completed', usage_bytes=0, expires_after=None, expires_at=None)
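The output above shows `total=0` even though the batch reported one completed file. The local `vector_store` variable is the snapshot returned by the `create` call, taken before any file was uploaded, and it is not updated automatically. A minimal sketch of re-fetching the current state follows; `refreshed_file_counts` is a hypothetical helper name, and the call assumes the same `client.beta.vector_stores` namespace used elsewhere in this notebook.

```python
def refreshed_file_counts(client, vector_store_id: str):
    """Re-fetch the vector store from the API and return its current file_counts."""
    # retrieve() returns a new VectorStore object reflecting server-side state.
    fresh = client.beta.vector_stores.retrieve(vector_store_id=vector_store_id)
    return fresh.file_counts
```

Calling `refreshed_file_counts(client, vector_store.id)` after the upload has completed should report the uploaded file rather than the zero counts of the original snapshot.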

## Step 3: Update the assistant to use the new Vector Store
To make the files accessible to your assistant, update the assistant’s tool_resources with the new vector_store id.

In [5]:
assistant = client.beta.assistants.update(
  assistant_id=assistant.id,
  tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

## Step 4: Create a thread
Create a thread that contains the user's question. You can also attach files as Message attachments on your thread. Doing so creates another vector_store associated with the thread or, if the thread already has a vector store attached, adds the new files to that existing one. When you create a Run on this thread, the file search tool queries both the vector_store from your assistant and the vector_store on the thread.

In [16]:
# Upload the user-provided file to OpenAI
message_file = client.files.create(
  file=open("pdf/cigna_member_handbook_2024.pdf", "rb"), purpose="assistants"
)
 
# Create a thread and attach the file to the message
thread = client.beta.threads.create(
  messages=[
    {
      "role": "user",
      "content": "what is the story about?",
      # Attach the new file to the message.
      "attachments": [
        { "file_id": message_file.id, "tools": [{"type": "file_search"}] }
      ],
    }
  ]
)
 
# The thread now has a vector store with that file in its tool resources.
print(thread.tool_resources.file_search)

ToolResourcesFileSearch(vector_store_ids=['vs_stMQ0QuhXUNJIv58dVFemWqb'])


In [17]:
thread

Thread(id='thread_phkJwOJgnHnNugFoH0EMpOvB', created_at=1721152571, metadata={}, object='thread', tool_resources=ToolResources(code_interpreter=None, file_search=ToolResourcesFileSearch(vector_store_ids=['vs_stMQ0QuhXUNJIv58dVFemWqb'])))

## Step 5: Create a run and check the output
Now, create a Run and observe that the model uses the File Search tool to provide a response to the user’s question.
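A run can end in a terminal state other than `completed` (for example `failed` or `expired`), in which case there may be no assistant message to read. A small guard like the one below can make that failure explicit. `require_completed` is a hypothetical helper for this sketch; it only relies on the run object's `status` and `last_error` attributes.

```python
def require_completed(run):
    """Return the run unchanged if it completed; otherwise raise with details."""
    if run.status != "completed":
        # last_error may be None for states such as cancelled or expired.
        detail = getattr(run, "last_error", None)
        raise RuntimeError(f"run ended with status {run.status!r}: {detail}")
    return run
```

It could wrap the polling call, as in `run = require_completed(client.beta.threads.runs.create_and_poll(...))`, so that the message-handling code only runs when a response exists.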



In [18]:
# Use the create and poll SDK helper to create a run and poll the status of
# the run until it's in a terminal state.

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

# Fetch only the messages produced by this run.
messages = list(client.beta.threads.messages.list(thread_id=thread.id, run_id=run.id))

# Take the text of the first content part of the first message.
message_content = messages[0].content[0].text
annotations = message_content.annotations
citations = []
for index, annotation in enumerate(annotations):
    # Replace each inline citation marker with a numbered reference such as [0].
    message_content.value = message_content.value.replace(annotation.text, f"[{index}]")
    # Only file citations carry a file_id that can be resolved to a filename.
    if file_citation := getattr(annotation, "file_citation", None):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f"[{index}] {cited_file.filename}")

print(message_content.value)
print("\n".join(citations))

The story, titled "The Odyssey of Lumina: Illuminating Lives," revolves around Dr. Michael Greene, a visionary inventor who creates "Lumina," a groundbreaking wearable device designed to enhance human cognitive function, mood, and overall well-being through the emission of a gentle light. Initially, Lumina is highly successful and well-received, with people from various walks of life using it to boost their productivity and potential.

However, challenges soon arise. As people become overly dependent on Lumina, they neglect their own innate abilities, and some experience withdrawal symptoms. This dependency leads to a moral dilemma for Dr. Greene: he questions whether his invention is a force for good or a crutch hindering human progress.

Facing declining sales and increasing scrutiny, Dr. Greene takes decisive action by assembling a team of experts to reevaluate Lumina's design. They develop Lumina 2.0, which includes features to promote balance and moderation. The new version allows
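The citation handling in the cell above can be isolated into a pure function, which makes it easier to try without calling the API. This is a sketch under assumptions: the `Annotation` dataclass, `number_citations`, and the `filename_for` callback are hypothetical stand-ins for the SDK's annotation objects and for `client.files.retrieve(...).filename`, and the `<<cite-0>>` marker in the usage note is made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Annotation:
    text: str                      # marker as it appears inside the message body
    file_id: Optional[str] = None  # set only for file citations


def number_citations(
    body: str,
    annotations: List[Annotation],
    filename_for: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """Replace each marker with [index] and collect one footnote per file citation."""
    citations = []
    for index, annotation in enumerate(annotations):
        body = body.replace(annotation.text, f"[{index}]")
        if annotation.file_id is not None:
            citations.append(f"[{index}] {filename_for(annotation.file_id)}")
    return body, citations
```

For example, `number_citations("Lumina glows<<cite-0>>.", [Annotation("<<cite-0>>", "file-abc")], lambda _id: "story.txt")` returns the body `"Lumina glows[0]."` together with the footnote list `["[0] story.txt"]`. Unlike the cell above, this version returns a new string instead of mutating the message object.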

In [15]:
assistant.id

'asst_94o2aTFbFx5WSKWjhm8fsR1o'