# Azure AI Agents - File Search

<img src="https://learn.microsoft.com/en-us/azure/ai-services/agents/media/agent-service-the-glue.png" width=800>

> https://learn.microsoft.com/en-us/azure/ai-services/agents/

In [1]:
import json
import os
import sys
import time

from azure.ai.agents import AgentsClient
from azure.ai.agents.models import (
    FileSearchTool,
    FilePurpose,
    ListSortOrder,
    MessageAttachment,
)
from azure.identity import DefaultAzureCredential
from datetime import datetime, timezone, timedelta
from dotenv import load_dotenv
from openai import AzureOpenAI

In [2]:
load_dotenv("azure.env")

True

In [3]:
sys.version

'3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0]'

## Project

In [4]:
endpoint = os.getenv("PROJECT_ENDPOINT")
credential = DefaultAzureCredential()

project_client = AgentsClient(endpoint=endpoint, credential=credential)

In [5]:
DATA_DIR = "data"

os.makedirs(DATA_DIR, exist_ok=True)

output_file = os.path.join(DATA_DIR, "document.pdf")

In [6]:
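# NOTE: the /abs/ URL serves the arXiv abstract page (HTML), not the paper itself;
# the PDF is available at https://arxiv.org/pdf/2311.06242
!wget https://arxiv.org/abs/2311.06242 -O $output_file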

--2025-06-10 09:11:01--  https://arxiv.org/abs/2311.06242
Resolving arxiv.org (arxiv.org)... 151.101.3.42, 151.101.67.42, 151.101.131.42, ...
Connecting to arxiv.org (arxiv.org)|151.101.3.42|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 48005 (47K) [text/html]
Saving to: ‘data/document.pdf’


2025-06-10 09:11:01 (30.0 MB/s) - ‘data/document.pdf’ saved [48005/48005]
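
The wget log above reports `[text/html]`, which suggests the saved file is an HTML page rather than a PDF, despite its `.pdf` name. A minimal sketch of a sanity check before uploading, assuming only that PDF files begin with the `%PDF-` signature. The helper name `looks_like_pdf` is hypothetical and not part of any Azure SDK.

```python
def looks_like_pdf(path: str) -> bool:
    """Return True if the file at `path` starts with the PDF signature bytes."""
    with open(path, "rb") as fh:
        # Every PDF file begins with the ASCII bytes "%PDF-".
        return fh.read(5) == b"%PDF-"
```

In this notebook it could be called as `looks_like_pdf(output_file)` right after the download, and the upload skipped or the URL corrected when it returns `False`.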



In [7]:
model = "gpt-4o-mini"

In [8]:
file = project_client.files.upload_and_poll(
    file_path=output_file, purpose=FilePurpose.AGENTS)

print(f"Uploaded file, file ID: {file.id}")

# create a vector store with the file you uploaded
vector_store = project_client.vector_stores.create_and_poll(
    file_ids=[file.id], name="document_vector_store")

print(f"Created vector store, vector store ID: {vector_store.id}")

Uploaded file, file ID: assistant-4wqVjrjAbM8KaxrYdQVKkn
Created vector store, vector store ID: vs_ofyHrvAfygkSFZYhyYh43jHH


In [9]:
# create a file search tool
file_search_tool = FileSearchTool(vector_store_ids=[vector_store.id])

# Note: both the FileSearchTool definitions (tools) and its tool_resources must be
# passed, or the agent will be unable to search the file
agent = project_client.create_agent(
    model=model,
    name="document_agent",
    instructions="You are an AI helpful agent to analyse document",
    tools=file_search_tool.definitions,
    tool_resources=file_search_tool.resources,
    description="Document Agent",
)

print(f"Created agent, agent ID: {agent.id}")

Created agent, agent ID: asst_8joTpOGQE80pao21DZIgzXPE


In [10]:
id1 = agent.id

In [11]:
# Create a thread
thread = project_client.threads.create()
print(f"Created thread, thread ID: {thread.id}")

# Upload the user-provided file as a message attachment
message_file = project_client.files.upload_and_poll(
    file_path=output_file, purpose=FilePurpose.AGENTS)

print(f"Uploaded file, file ID: {message_file.id}")

# Create a message with the file search attachment
# Note: when attachments are used, a temporary vector store is created with a default expiration policy of seven days.
attachment = MessageAttachment(file_id=message_file.id,
                               tools=FileSearchTool().definitions)

prompt = "Summarize this document in three lines."

message = project_client.messages.create(
    thread_id=thread.id,
    role="user",
    content=prompt,
    attachments=[attachment],
)

print(f"Created message, message ID: {message.id}")

Created thread, thread ID: thread_0uCqRY2EHWjLn5RJJgPsz5aD
Uploaded file, file ID: assistant-WwPQyftXKpxWh5wvx6NGZZ
Created message, message ID: msg_TpxHUQsrSZgT6kFFJZ7eOMyo


In [12]:
run = project_client.runs.create_and_process(
    thread_id=thread.id, agent_id=agent.id)
print(f"Created run, run ID: {run.id}")

messages = project_client.messages.list(thread_id=thread.id)
print(f"Messages: {messages}")

Created run, run ID: run_yc6KjmRXSW3sxwWnpcdN1QfQ
Messages: <iterator object azure.core.paging.ItemPaged at 0x7ff1fc4b3bb0>
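
`create_and_process` returns once the run reaches a terminal state, and that state is not necessarily a successful one. A hedged sketch of a guard that could run before reading the messages, assuming the run object exposes `id`, `status`, and `last_error` attributes. `ensure_run_succeeded` is a hypothetical helper, and the stand-in object below only mimics that shape for illustration.

```python
from types import SimpleNamespace


def ensure_run_succeeded(run) -> None:
    """Raise RuntimeError if the run ended in any state other than 'completed'."""
    # The status may be a plain string or a str-based enum; normalise to its value.
    status = getattr(run.status, "value", run.status)
    if status != "completed":
        error = getattr(run, "last_error", None)
        raise RuntimeError(f"Run {run.id} ended with status {status!r}: {error}")


# Stand-in object for illustration only (not a real SDK run).
demo_run = SimpleNamespace(id="run_demo", status="failed", last_error="rate limit exceeded")
try:
    ensure_run_succeeded(demo_run)
except RuntimeError as exc:
    print(exc)
```

Failing early here avoids listing a thread that contains no assistant reply.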


In [13]:
# Fetch all messages in chronological order
messages = project_client.messages.list(thread_id=thread.id, order=ListSortOrder.ASCENDING)

# Print the last text block of each message in the thread
for msg in messages:
    if msg.text_messages:
        last_text = msg.text_messages[-1]
        print(f"{msg.role}: {last_text.text.value}")

user: Summarize this document in three lines.
assistant: The document presents "Florence-2," a novel vision foundation model designed for various computer vision and vision-language tasks. It utilizes a unified, prompt-based approach, enabling versatile task performance including captioning, object detection, and segmentation, backed by a large dataset consisting of 5.4 billion visual annotations. Evaluations indicate Florence-2's strong capabilities in both zero-shot and fine-tuning scenarios, marking it as a competitive player in the field【4:8†source】.
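
The assistant's answer above ends with an inline citation marker (`【4:8†source】`). When the text is shown to end users, such markers may be unwanted. A small sketch that removes them with a regular expression, assuming markers are always wrapped in the `【` and `】` brackets seen in the output. `strip_citation_markers` is a hypothetical helper name.

```python
import re

# Matches any span enclosed in the lenticular brackets used by the citation markers.
CITATION_MARKER = re.compile(r"【[^】]*】")


def strip_citation_markers(text: str) -> str:
    """Return `text` with all bracketed citation markers removed."""
    return CITATION_MARKER.sub("", text)


print(strip_citation_markers("a competitive player in the field【4:8†source】."))
```

This discards the citation entirely. The message annotations carry the cited file ID if a reference needs to be kept.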


## Another question

In [14]:
prompt = "Summarize this document."

message = project_client.messages.create(
    thread_id=thread.id,
    role="user",
    content=prompt,
    attachments=[attachment],
)

print(f"Created message, message ID: {message.id}")

Created message, message ID: msg_lg3oavm8JL9w3ur5TVYEUcnb


In [15]:
run = project_client.runs.create_and_process(
    thread_id=thread.id, agent_id=agent.id)
print(f"Created run, run ID: {run.id}")

messages = project_client.messages.list(thread_id=thread.id)
print(f"Messages: {messages}")

Created run, run ID: run_012dh2hNXxBWhOhnU8ubOC4i
Messages: <iterator object azure.core.paging.ItemPaged at 0x7ff1fc3e1120>


In [16]:
messages = project_client.messages.list(thread_id=thread.id)

for message in messages:
    print(f"Message ID: {message.id}, Role: {message.role}, Content: {message.content}")

Message ID: msg_zSpUH2b2W6WsLtIECvtSIMXn, Role: assistant, Content: [{'type': 'text', 'text': {'value': 'The document introduces "Florence-2," a cutting-edge vision foundation model implemented with a unified, prompt-based representation facilitating diverse computer vision and vision-language tasks such as captioning, object detection, and segmentation. This model was trained on a vast dataset, FLD-5B, featuring 5.4 billion visual annotations derived from 126 million images, utilizing an iterative approach for data annotation. Extensive evaluations confirmed Florence-2\'s efficacy, showcasing strong zero-shot and fine-tuning capabilities that position it as a formidable player in the field of computer vision【10:0†source】.', 'annotations': [{'type': 'file_citation', 'text': '【10:0†source】', 'start_index': 612, 'end_index': 625, 'file_citation': {'file_id': 'assistant-WwPQyftXKpxWh5wvx6NGZZ'}}]}}]
Message ID: msg_lg3oavm8JL9w3ur5TVYEUcnb, Role: user, Content: [{'type': 'text', 'text': {

In [17]:
messages = list(project_client.messages.list(thread_id=thread.id))

if messages:
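    # No order argument is given, so messages come back newest first:
    # index 0 is the latest message in the thread
    last_message = messages[0]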
    print(f"Content: {last_message.content}")
else:
    print("No messages found.")

Content: [{'type': 'text', 'text': {'value': 'The document introduces "Florence-2," a cutting-edge vision foundation model implemented with a unified, prompt-based representation facilitating diverse computer vision and vision-language tasks such as captioning, object detection, and segmentation. This model was trained on a vast dataset, FLD-5B, featuring 5.4 billion visual annotations derived from 126 million images, utilizing an iterative approach for data annotation. Extensive evaluations confirmed Florence-2\'s efficacy, showcasing strong zero-shot and fine-tuning capabilities that position it as a formidable player in the field of computer vision【10:0†source】.', 'annotations': [{'type': 'file_citation', 'text': '【10:0†source】', 'start_index': 612, 'end_index': 625, 'file_citation': {'file_id': 'assistant-WwPQyftXKpxWh5wvx6NGZZ'}}]}}]
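
The printed content pairs each citation marker with a `file_citation` annotation that carries the source file ID. A minimal sketch of turning that structure into readable text with numbered footnotes, assuming the content is available as plain dicts in the shape printed above. `render_with_footnotes` and the trimmed sample data are hypothetical and for illustration only.

```python
def render_with_footnotes(content: list) -> str:
    """Replace citation markers with [n] and append one footnote line per annotation."""
    rendered = []
    for block in content:
        if block.get("type") != "text":
            continue  # skip non-text blocks such as images
        text = block["text"]["value"]
        notes = []
        for n, ann in enumerate(block["text"].get("annotations", []), start=1):
            # Swap the raw marker for a numbered reference.
            text = text.replace(ann["text"], f" [{n}]")
            file_id = ann.get("file_citation", {}).get("file_id", "unknown")
            notes.append(f"[{n}] {file_id}")
        rendered.append("\n".join([text, *notes]))
    return "\n\n".join(rendered)


# Trimmed, made-up sample in the same shape as the output above.
sample = [{
    "type": "text",
    "text": {
        "value": "Florence-2 is a vision foundation model【10:0†source】.",
        "annotations": [{
            "type": "file_citation",
            "text": "【10:0†source】",
            "file_citation": {"file_id": "assistant-EXAMPLE"},
        }],
    },
}]
print(render_with_footnotes(sample))
```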


In [18]:
prompt = "Describe the Florence model."

message = project_client.messages.create(
    thread_id=thread.id,
    role="user",
    content=prompt,
    attachments=[attachment],
)

print(f"Created message, message ID: {message.id}")

Created message, message ID: msg_vQVNI66ZteD9rsDJkKrhxH2z


In [19]:
run = project_client.runs.create_and_process(
    thread_id=thread.id, agent_id=agent.id)
print(f"Created run, run ID: {run.id}")

messages = project_client.messages.list(thread_id=thread.id)
print(f"Messages: {messages}")

Created run, run ID: run_crUeVdYfmNt0xJZhsGpBTkP8
Messages: <iterator object azure.core.paging.ItemPaged at 0x7ff1fc309c60>


In [20]:
messages = project_client.messages.list(thread_id=thread.id)

for message in messages:
    print(f"Message ID: {message.id}, Role: {message.role}, Content: {message.content}")

Message ID: msg_x4dAIVN0TQvDHmTX2oa7ZuYb, Role: assistant, Content: [{'type': 'text', 'text': {'value': 'The Florence model, specifically Florence-2, is a novel vision foundation model that employs a unified, prompt-based representation designed to address a wide array of computer vision and vision-language tasks. It allows for versatile applications, such as captioning, object detection, grounding, and segmentation, by taking text prompts as instructions and generating outcomes in text format. Developed through a sequence-to-sequence framework, Florence-2 was trained on the comprehensive FLD-5B dataset, which includes 5.4 billion visual annotations, enabling it to exhibit exceptional zero-shot and fine-tuning capabilities, positioning it as a strong contender in the vision model landscape【13:1†source】【13:0†source】【13:10†source】.', 'annotations': [{'type': 'file_citation', 'text': '【13:1†source】', 'start_index': 695, 'end_index': 708, 'file_citation': {'file_id': 'assistant-4wqVjrjAbM8

In [21]:
messages = list(project_client.messages.list(thread_id=thread.id))

if messages:
    last_message = messages[0]
    print(f"Content: {last_message.content}")
else:
    print("No messages found.")

Content: [{'type': 'text', 'text': {'value': 'The Florence model, specifically Florence-2, is a novel vision foundation model that employs a unified, prompt-based representation designed to address a wide array of computer vision and vision-language tasks. It allows for versatile applications, such as captioning, object detection, grounding, and segmentation, by taking text prompts as instructions and generating outcomes in text format. Developed through a sequence-to-sequence framework, Florence-2 was trained on the comprehensive FLD-5B dataset, which includes 5.4 billion visual annotations, enabling it to exhibit exceptional zero-shot and fine-tuning capabilities, positioning it as a strong contender in the vision model landscape【13:1†source】【13:0†source】【13:10†source】.', 'annotations': [{'type': 'file_citation', 'text': '【13:1†source】', 'start_index': 695, 'end_index': 708, 'file_citation': {'file_id': 'assistant-4wqVjrjAbM8KaxrYdQVKkn'}}, {'type': 'file_citation', 'text': '【13:0†so

## Post processing

In [22]:
# List all agents in the project
print("Listing all agents in the project:")

agents = project_client.list_agents()
for agent in agents:
    print(f"Agent ID: {agent.id}, Name: {agent.name}, Model: {agent.model}, Instructions: {agent.instructions}")

Listing all agents in the project:
Agent ID: asst_8joTpOGQE80pao21DZIgzXPE, Name: document_agent, Model: gpt-4o-mini, Instructions: You are an AI helpful agent to analyse document


In [23]:
project_client.delete_agent(id1)
print(f"Deleted agent, agent ID: {id1}")

Deleted agent, agent ID: asst_8joTpOGQE80pao21DZIgzXPE
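
Deleting the agent leaves the vector store, the two uploaded files, and the thread in place. A hedged sketch of a fuller cleanup, assuming the client exposes `vector_stores.delete`, `files.delete`, and `threads.delete` that each accept an ID, by analogy with the `create` and `upload_and_poll` calls used earlier. These method names are an assumption to verify against the installed SDK version, and `cleanup_resources` is a hypothetical helper.

```python
def cleanup_resources(client, vector_store_id, file_ids, thread_id):
    """Delete the vector store, the uploaded files, and the thread.

    Returns the list of IDs passed to a delete call, in order.
    The delete method names on `client` are assumed, not confirmed by the notebook.
    """
    deleted = []
    client.vector_stores.delete(vector_store_id)
    deleted.append(vector_store_id)
    for file_id in file_ids:
        client.files.delete(file_id)
        deleted.append(file_id)
    client.threads.delete(thread_id)
    deleted.append(thread_id)
    return deleted
```

With the names from this notebook, the call would look like `cleanup_resources(project_client, vector_store.id, [file.id, message_file.id], thread.id)`.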
