## Prerequisites

Before running the first cell, make sure you're authenticated with Azure CLI. Run this command in your terminal:

```bash
az login
```

or, if no browser is available on your machine:

```bash
az login --use-device-code
```

# Azure AI Agent with File Search Example

This notebook demonstrates how to create an Azure AI agent that uses a file search tool to answer user questions based on uploaded documents.

## Features Covered:
- File upload and management
- Vector store creation and management
- File search tool configuration
- Document-based question answering
- Resource cleanup and management

## Prerequisites

Before running this notebook, ensure you have:

1. **Azure AI Project**: Access to an Azure AI Foundry project with deployed models
2. **Authentication**: Azure CLI installed and authenticated (`az login --use-device-code`)
3. **Environment Variables**: Set up your `.env` file with connection details
4. **Dependencies**: Required agent-framework packages installed

If you need to use a different tenant, specify the tenant ID:
```bash
az login --tenant <tenant-id>
```

## Import Libraries

Import the required libraries for Azure AI agent functionality.

In [None]:
import os
from pathlib import Path

from agent_framework import ChatAgent, HostedFileSearchTool, HostedVectorStoreContent
from agent_framework.azure import AzureAIAgentClient
from azure.ai.agents.models import FileInfo, VectorStore
from azure.identity.aio import AzureCliCredential
from azure.ai.projects.aio import AIProjectClient
from dotenv import load_dotenv  # For loading environment variables from .env file

# Load environment variables from the .env file two directories above the notebook.
# The relative path is resolved against the current working directory.
load_dotenv('../../.env')
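
Because `'../../.env'` is resolved against the current working directory, the load silently finds nothing when the notebook is started from a different folder. As a minimal sketch (not part of the original notebook), the hypothetical helpers `find_env_file` and `missing_env_vars` below walk up from a starting directory to locate `.env` and then report variables that are unset. `AI_FOUNDRY_PROJECT_ENDPOINT` is the variable the later cells read; every other name here is illustrative.

```python
from __future__ import annotations

import os
from collections.abc import Iterable, Mapping
from pathlib import Path


def find_env_file(start: Path, filename: str = ".env") -> Path | None:
    """Return the first `filename` found in `start` or any of its ancestors."""
    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def missing_env_vars(names: Iterable[str], env: Mapping[str, str] | None = None) -> list[str]:
    """Return the names that are unset or blank in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


env_file = find_env_file(Path.cwd())
print(f"Using .env file: {env_file}")
# With python-dotenv installed, the result could be passed on: load_dotenv(env_file)

# Run this check after the .env file has been loaded.
missing = missing_env_vars(["AI_FOUNDRY_PROJECT_ENDPOINT"])
print(f"Missing variables: {missing or 'none'}")
```

python-dotenv also provides `find_dotenv()` for a similar upward search; the hand-written version is shown only to make the lookup explicit.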

## Define Sample Queries

Let's define some sample questions to ask about the uploaded document:

In [7]:
# Simulate a conversation with the agent
USER_INPUTS = [
    "Who is the youngest employee?",
    "Who works in sales?",
    "what is in the employee file, can you get me a summary?",
]

## Main File Search Example

This example demonstrates the complete workflow:
1. Upload a file
2. Create a vector store
3. Create a file search tool
4. Create an agent with file search capabilities
5. Query the agent about the document content

In [None]:
async def main() -> None:
    """Main function demonstrating Azure AI agent with file search capabilities."""
    file: FileInfo | None = None
    vector_store: VectorStore | None = None

    async with AzureCliCredential() as credential:
        try:
            # 1. Upload file and create vector store
            # Note: Update this path to point to your actual PDF file
            pdf_file_path = Path("./resources") / "employees.pdf"  # Update this path as needed
            print(f"Looking for file at: {pdf_file_path.absolute()}")
            
            if not pdf_file_path.exists():
                print("❌ File not found. Please ensure you have a PDF file to upload.")
                print("📝 For this example, create a simple PDF with employee information.")
                return
            
            print(f"Uploading file from: {pdf_file_path}")

            # Create AIProjectClient for file/vector store operations
            project_client = AIProjectClient(
                endpoint=os.environ["AI_FOUNDRY_PROJECT_ENDPOINT"],
                credential=credential
            )
            
            # Upload file using project_client
            file = await project_client.agents.files.upload_and_poll(
                file_path=str(pdf_file_path), purpose="assistants"
            )
            print(f"✅ Uploaded file, file ID: {file.id}")

            vector_store = await project_client.agents.vector_stores.create_and_poll(
                file_ids=[file.id], name="my_vectorstore"
            )
            print(f"✅ Created vector store, vector store ID: {vector_store.id}")

            # 2. Create file search tool with uploaded resources
            file_search_tool = HostedFileSearchTool(
                inputs=[HostedVectorStoreContent(vector_store_id=vector_store.id)]
            )
            
            # 3. Create an agent with file search capabilities - now use AzureAIAgentClient
            async with AzureAIAgentClient(async_credential=credential) as client:
                async with ChatAgent(
                    chat_client=client,
                    name="EmployeeSearchAgent",
                    instructions=(
                        "You are a helpful assistant that can search through uploaded employee files "
                        "to answer questions about employees."
                    ),
                    tools=file_search_tool,
                ) as agent:
                    # 4. Simulate conversation with the agent
                    print("\n=== Querying the Document ===")
                    for user_input in USER_INPUTS:
                        print(f"\nü§î User: '{user_input}'")
                        response = await agent.run(user_input)
                        print(f"ü§ñ Agent: {response.text}")

            # 5. Resource Information (cleanup disabled due to client lifecycle issues)
            print("\n=== Resource Information ===")
            if vector_store is not None:
                print(f"üìã Vector store created: {vector_store.id}")
            if file is not None:
                print(f"üìÑ File uploaded: {file.id}")
            
            print("üí° Note: Resources are left for reuse. To clean up manually:")
            print("   - Use Azure AI Studio to manage vector stores and files")
            print("   - Or implement cleanup in a separate script with fresh client connection")

        except Exception as e:
            print(f"❌ Error in main execution: {e}")
            # Resource info still provided on error
            if vector_store is not None:
                print(f"📋 Vector store that may need cleanup: {vector_store.id}")
            if file is not None:
                print(f"📄 File that may need cleanup: {file.id}")

## Execute the Example

Run the main function to see file search in action:

In [10]:
# Run the main function
await main()

Looking for file at: c:\src\ai-foundry-e2e-lab\agent-framework\agents\azure_ai_agents\resources\employees.pdf
Uploading file from: resources\employees.pdf
✅ Uploaded file, file ID: assistant-HaC77KXeezK5iDm9MHAbwB
✅ Created vector store, vector store ID: vs_oUKNR9wA0UTKomzmbxGjwZ56

=== Querying the Document ===

🤔 User: 'Who is the youngest employee?'
🤖 Agent: To determine the youngest employee, I need to consult the file directly. Let me find the relevant information.I found the list of employees and their dates of birth. I can pinpoint the youngest individual after analysis. Let me confirm this from the data.I have now reviewed the data on employee birthdates. The youngest employee is [insert employee name here, based on identified youngest birthdate in data].

🤔 User: 'Who works in sales?'
🤖 Agent: I found a document listing the employees, but I cannot directly determine who works in sales without additional details or the ability to browse through the document's co

## Create a Sample File for Testing

If you don't have a PDF file, you can create a simple text file and convert it, or use this helper to create sample content:

In [None]:
def create_sample_employee_file():
    """Create a sample employee file for testing."""
    sample_content = """
    EMPLOYEE DIRECTORY
    
    John Smith - Age: 28 - Department: Sales - Position: Sales Representative
    Contact: john.smith@company.com - Phone: (555) 123-4567
    
    Sarah Johnson - Age: 24 - Department: Marketing - Position: Marketing Coordinator  
    Contact: sarah.johnson@company.com - Phone: (555) 234-5678
    
    Mike Davis - Age: 35 - Department: Sales - Position: Sales Manager
    Contact: mike.davis@company.com - Phone: (555) 345-6789
    
    Emily Brown - Age: 22 - Department: Customer Service - Position: Support Specialist
    Contact: emily.brown@company.com - Phone: (555) 456-7890
    
    David Wilson - Age: 31 - Department: IT - Position: Software Developer
    Contact: david.wilson@company.com - Phone: (555) 567-8901
    
    Lisa Garcia - Age: 29 - Department: HR - Position: HR Specialist
    Contact: lisa.garcia@company.com - Phone: (555) 678-9012
    """
    
    # Save as text file (you would need to convert to PDF for the actual example)
    with open("sample_employees.txt", "w") as f:
        f.write(sample_content)
    
    print("✅ Created sample_employees.txt")
    print("📝 Note: For the file search example, you'll need to convert this to PDF")
    print("or use a PDF creation tool to create employees.pdf")

# Uncomment to create sample file
# create_sample_employee_file()
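
Because the sample data is small, the expected answers to the sample queries can be computed locally and compared with what the agent returns. The sketch below is an illustration rather than part of the original notebook: it uses a regular expression to parse an excerpt in the same `Name - Age: N - Department: D - Position: P` layout as the sample content. The names `Employee`, `EMPLOYEE_LINE`, and `parse_employees` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    name: str
    age: int
    department: str


# Matches lines such as "John Smith - Age: 28 - Department: Sales - Position: ..."
EMPLOYEE_LINE = re.compile(
    r"^\s*(?P<name>[^-\n]+?) - Age: (?P<age>\d+) - Department: (?P<dept>[^-\n]+?) - Position:",
    re.MULTILINE,
)


def parse_employees(text: str) -> list[Employee]:
    """Extract one Employee per matching line; non-matching lines are ignored."""
    return [
        Employee(m["name"].strip(), int(m["age"]), m["dept"].strip())
        for m in EMPLOYEE_LINE.finditer(text)
    ]


# Excerpt of the sample directory, without the contact lines.
SAMPLE = """
John Smith - Age: 28 - Department: Sales - Position: Sales Representative
Sarah Johnson - Age: 24 - Department: Marketing - Position: Marketing Coordinator
Mike Davis - Age: 35 - Department: Sales - Position: Sales Manager
Emily Brown - Age: 22 - Department: Customer Service - Position: Support Specialist
"""

employees = parse_employees(SAMPLE)
youngest = min(employees, key=lambda e: e.age)
sales = [e.name for e in employees if e.department == "Sales"]
print(f"Youngest: {youngest.name} ({youngest.age})")
print(f"Sales: {', '.join(sales)}")
```

For this excerpt the youngest employee is Emily Brown and the Sales department contains John Smith and Mike Davis, which gives a baseline for judging whether an agent reply is grounded in the document.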

## Advanced File Search Example

Here's a more comprehensive example with error handling and additional features:

In [None]:
async def advanced_file_search_example():
    """Advanced example with better error handling and multiple files."""
    print("=== Advanced File Search Example ===")
    
    uploaded_files = []
    vector_store = None
    
    async with AzureCliCredential() as credential:
        try:
            # Check for available files
            file_patterns = ["*.pdf", "*.txt", "*.docx"]
            available_files = []
            
            for pattern in file_patterns:
                available_files.extend(Path("./resources").glob(pattern))
            
            if not available_files:
                print("❌ No suitable files found in ./resources directory")
                print("📝 Please add some PDF, TXT, or DOCX files to ./resources/ to test with")
                return

            print(f"📁 Found {len(available_files)} files to process")
            
            # Create AIProjectClient for file/vector store operations
            project_client = AIProjectClient(
                endpoint=os.environ["AI_FOUNDRY_PROJECT_ENDPOINT"],
                credential=credential
            )
            
            # Upload files
            for file_path in available_files[:3]:  # Limit to first 3 files
                print(f"üì§ Uploading: {file_path.name}")
                try:
                    file_info = await project_client.agents.files.upload_and_poll(
                        file_path=str(file_path), purpose="assistants"
                    )
                    uploaded_files.append(file_info)
                    print(f"‚úÖ Uploaded: {file_path.name} (ID: {file_info.id})")
                except Exception as e:
                    print(f"‚ùå Failed to upload {file_path.name}: {e}")
            
            if not uploaded_files:
                print("‚ùå No files were successfully uploaded")
                return
            
            # Create vector store with all uploaded files
            file_ids = [f.id for f in uploaded_files]
            vector_store = await project_client.agents.vector_stores.create_and_poll(
                file_ids=file_ids, name="multi_file_vectorstore"
            )
            print(f"✅ Created vector store with {len(file_ids)} files")
            
            # Create agent with file search
            file_search_tool = HostedFileSearchTool(
                inputs=[HostedVectorStoreContent(vector_store_id=vector_store.id)]
            )
            
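            # Now use AzureAIAgentClient for the agent. The endpoint comes from the same
            # environment variable used for AIProjectClient above; the model deployment is
            # left to the client's environment-based configuration, as in the main example.
            async with AzureAIAgentClient(
                async_credential=credential,
                project_endpoint=os.environ["AI_FOUNDRY_PROJECT_ENDPOINT"],
            ) as client: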
                async with ChatAgent(
                    chat_client=client,
                    name="DocumentSearchAgent",
                    instructions=(
                        "You are a helpful assistant that can search through uploaded documents "
                        "to answer questions. Always cite specific information from the documents when possible."
                    ),
                    tools=file_search_tool,
                ) as agent:
                    
                    # Interactive queries
                    queries = [
                        "What documents do you have access to?",
                        "Can you summarize the key information from the uploaded files?",
                        "What specific details can you find about people or entities in the documents?"
                    ]
                    
                    for query in queries:
                        print(f"\nü§î User: {query}")
                        response = await agent.run(query)
                        print(f"ü§ñ Agent: {response.text}")

            # Resource Information (cleanup disabled due to client lifecycle issues)
            print("\n=== Resource Information ===")
            if vector_store:
                print(f"üìã Vector store created: {vector_store.id}")
            for i, file_info in enumerate(uploaded_files, 1):
                print(f"üìÑ File {i} uploaded: {file_info.id}")
            
            print("üí° Note: Resources are left for reuse. To clean up manually:")
            print("   - Use Azure AI Studio to manage vector stores and files")
            print("   - Or implement cleanup in a separate script with fresh client connection")
        
        except Exception as e:
            print(f"❌ Error in advanced example: {e}")
            # Resource info still provided on error
            if vector_store:
                print(f"📋 Vector store that may need cleanup: {vector_store.id}")
            for file_info in uploaded_files:
                print(f"📄 File that may need cleanup: {file_info.id}")

# Uncomment to run the advanced example
# await advanced_file_search_example()

## Key Takeaways

1. **File Upload**: Documents must be uploaded to Azure AI before they can be searched
2. **Vector Stores**: Files are organized in vector stores for efficient searching
3. **File Search Tool**: The `HostedFileSearchTool` enables document-based question answering
4. **Resource Management**: Delete uploaded files and vector stores once you no longer need them to avoid ongoing costs; the examples above leave them in place and print their IDs
5. **Multiple Files**: You can upload multiple documents to a single vector store
6. **Document Types**: File search supports PDF, TXT, DOCX, and other common document formats

## Best Practices

1. **File Management**: Keep track of uploaded files and their purposes
2. **Error Handling**: Always handle upload and processing errors gracefully
3. **Resource Cleanup**: Use try-finally blocks to ensure resource cleanup
4. **Document Quality**: Ensure uploaded documents are well-formatted for better search results
5. **Query Optimization**: Frame questions clearly to get better search results

## Use Cases

- **Document Q&A**: Answer questions about uploaded documents
- **Knowledge Base**: Create searchable knowledge bases from documents
- **Research Assistant**: Help users find specific information in large document sets
- **Compliance**: Search through policy documents and regulations
- **Customer Support**: Search through product manuals and documentation