# 🔍 RAG File Search Agent with Azure AI Foundry (Python)

## 📚 Retrieval-Augmented Generation (RAG) Overview

This notebook demonstrates how to build a RAG system using Azure AI Foundry's file search capabilities. You'll create an agent that searches uploaded documents and answers questions using the content of your knowledge base rather than the model's general knowledge alone.

**RAG Capabilities:**
- 📁 **Document Upload**: Add files to a searchable knowledge base
- 🔍 **Vector Search**: Find relevant content using semantic similarity
- 🧠 **Context Integration**: Combine search results with AI reasoning
- 💬 **Interactive Q&A**: Ask questions about your documents naturally

## 🏗️ RAG Architecture

### Core Components
- **Azure AI Foundry**: Enterprise-grade AI platform with built-in RAG support
- **Vector Store**: Semantic search infrastructure for document embeddings
- **File Search Tool**: Intelligent document retrieval and ranking system
- **Agent Integration**: Seamless combination of search and generation

### RAG Process Flow
```text
Document Upload → Vector Store → Embedding Creation → Search Index
```
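The indexing steps in the flow above, plus the query-time lookup they enable, can be sketched without any Azure services. The example below is a toy stand-in and not how Azure AI Foundry implements file search: `embed` counts words instead of calling a learned embedding model, so it only matches shared terms, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Stand-in for a vector store: each chunk is kept next to its vector."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def search(index: list[tuple[str, Counter]], query: str, top_k: int = 1) -> list[str]:
    """Rank chunks by similarity to the query and return the best top_k."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


# Hypothetical sample chunks standing in for uploaded document content
chunks = [
    "GraphRAG builds a knowledge graph from documents.",
    "Vector stores hold document embeddings for search.",
    "Azure CLI handles authentication.",
]
index = build_index(chunks)
results = search(index, "How are document embeddings stored?")
print(results[0])
```

A hosted vector store replaces `embed` with a real embedding model, which is what lets it match on meaning even when the query and the document share no words.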

### Advanced Features
- **Multi-Document Support**: Search across multiple files simultaneously
- **Semantic Understanding**: Goes beyond keyword matching
- **Context Ranking**: Prioritizes most relevant information
- **Real-time Processing**: Dynamic document updates and search

## 🔧 Technical Implementation

**Azure AI Services:**
- Azure AI Foundry project workspace
- Vector store management with automatic indexing
- File upload and processing pipeline
- Integrated search and generation capabilities

**Agent Framework Integration:**
- `HostedFileSearchTool`: Pre-built RAG functionality
- `HostedVectorStoreContent`: Document management interface
- `AzureAIAgentClient`: Unified client for all operations
- Async processing for optimal performance

## 📋 Use Cases

1. **Knowledge Base Q&A**: Answer questions from company documentation
2. **Research Assistant**: Find information across multiple research papers  
3. **Customer Support**: Provide answers from product manuals and FAQs
4. **Content Discovery**: Help users find relevant information in large document collections

## ⚙️ Prerequisites & Setup

**Azure Requirements:**
- Azure AI Foundry project with RAG capabilities enabled
- Appropriate permissions for document upload and vector store creation
- Azure CLI authentication configured

**File Preparation:**
- Documents in supported formats (Markdown, PDF, text, etc.)
- Organized in accessible file paths
- Content optimized for semantic search
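A quick local check before uploading can catch a wrong path or an empty file early. The sketch below is only illustrative: `check_document` is a hypothetical helper, and `ASSUMED_EXTENSIONS` is an assumed subset of formats based on the list above, not the service's authoritative list.

```python
from pathlib import Path

# Assumption: an illustrative subset of extensions; consult the service
# documentation for the formats that file search actually accepts.
ASSUMED_EXTENSIONS = {".md", ".pdf", ".txt"}


def check_document(path: str) -> list[str]:
    """Return a list of problems found with a candidate knowledge-base file."""
    p = Path(path)
    problems = []
    if p.suffix.lower() not in ASSUMED_EXTENSIONS:
        problems.append(f"unexpected extension: {p.suffix or '(none)'}")
    if not p.is_file():
        problems.append("file not found")
    elif p.stat().st_size == 0:
        problems.append("file is empty")
    return problems


# An empty list means no problems were detected for this path
print(check_document("../files/demo.md"))
```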

**Required Dependencies:**
```bash

pip uninstall agent-framework -y
pip uninstall agent-framework-azure-ai -y

pip install -r ../../../Installation/requirements.txt --constraint ../../../Installation/constraints.txt -U
```

Let's build an intelligent document search and Q&A system! 📖✨

In [1]:
# 📦 Import Required Libraries for RAG Implementation

import os  # For environment variable access and file path operations

# 🔍 Azure AI Agents Components for RAG
from azure.ai.agents.models import FilePurpose, VectorStore, FileSearchTool  # RAG-specific models and tools
from azure.ai.projects.aio import AIProjectClient  # Azure AI Foundry project client
from azure.identity.aio import AzureCliCredential  # Azure authentication via CLI
from dotenv import load_dotenv  # Environment variable management

# 🤖 Agent Framework Components for RAG Integration
from agent_framework import AgentRunResponse, ChatAgent, HostedFileSearchTool, HostedVectorStoreContent
from agent_framework.azure import AzureAIAgentClient  # Unified Azure AI agent client

In [2]:
# 🔧 Load Environment Configuration
# Initialize environment variables for Azure AI Foundry configuration
# Authentication handled via Azure CLI credentials
load_dotenv()

True
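`load_dotenv()` returning `True` only means a `.env` file was found and loaded, not that it holds the values the client needs. A small check like the one below can surface missing settings before any Azure call is made. The variable names in `REQUIRED_VARS` are assumptions made for illustration, and `missing_settings` is a hypothetical helper; substitute the names your own `.env` file defines.

```python
import os

# Assumption: illustrative variable names; replace them with the ones
# your project's .env file actually uses.
REQUIRED_VARS = ("AZURE_AI_PROJECT_ENDPOINT", "AZURE_AI_MODEL_DEPLOYMENT_NAME")


def missing_settings(env=os.environ, required=REQUIRED_VARS) -> list[str]:
    """Return the names of required settings that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```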

In [3]:
# 🗄️ Vector Store Creation Function for RAG
async def create_vector_store(client: AzureAIAgentClient) -> tuple[str, VectorStore]:
    """Create a vector store with sample documents for semantic search.
    
    This function demonstrates the RAG setup process:
    1. Upload a document file to Azure AI Foundry
    2. Create a vector store for semantic search
    3. Index the document for retrieval
    
    Args:
        client: AzureAIAgentClient instance for API operations
        
    Returns:
        Tuple containing file ID and VectorStore instance
    """
    # 📁 Upload Document to Azure AI Foundry
    # Upload a sample markdown file that will serve as our knowledge base
    file_path = '../files/demo.md'
    file = await client.project_client.agents.files.upload_and_poll(
        file_path=file_path, 
        purpose=FilePurpose.AGENTS  # Specify this file is for agent use
    )
    print(f"Uploaded file, file ID: {file.id}")

    # 🔍 Create Vector Store for Semantic Search
    # Create a vector store that will index the uploaded document
    # The vector store enables semantic search across document content
    vector_store = await client.project_client.agents.vector_stores.create_and_poll(
        file_ids=[file.id],  # Link the uploaded file to this vector store
        name="graph_knowledge_base"  # Give the vector store a descriptive name
    )
    print(f"Created vector store, ID: {vector_store.id}")

    return file.id, vector_store
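Because `create_vector_store` leaves a file and a vector store behind in the project, a matching cleanup helper is a natural companion. The sketch below is hypothetical: it assumes that `files.delete` and `vector_stores.delete` are available on the same `client.project_client.agents` object used above, which should be verified against the installed SDK version.

```python
async def delete_vector_store(client, file_id: str, vector_store_id: str) -> None:
    """Delete the uploaded file and the vector store built from it.

    Assumption: `files.delete` and `vector_stores.delete` exist on the
    `client.project_client.agents` object used for upload and creation.
    """
    # Remove the uploaded source document
    await client.project_client.agents.files.delete(file_id)
    # Remove the vector store that indexed it
    await client.project_client.agents.vector_stores.delete(vector_store_id)
```

Calling it from a `finally` block around the agent run would release both resources even if the query fails.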

In [4]:
# Run the RAG demo: build the knowledge base, create the agent, ask one question.
# Top-level `async with` works here because Jupyter supports top-level await.
async with (
    AzureCliCredential() as credential,
    AzureAIAgentClient(async_credential=credential) as chat_client,
):
    # Upload the sample document and index it in a new vector store
    file_id, vector_store = await create_vector_store(chat_client)

    # Point the file search tool at the new vector store
    file_search = FileSearchTool(vector_store_ids=[vector_store.id])

    agent = chat_client.create_agent(
        name="PythonRAGDemo",
        instructions=(
            "You are a helpful assistant that helps people find information in a set of files. "
            "If you can't find the answer in the files, just say you don't know. "
            "Do not make up an answer."
        ),
        tools=file_search.definitions,  # Tools available to the agent
        tool_resources=file_search.resources,  # Resources for the tool
    )

    print("Agent created. You can now ask questions about the uploaded document.")

    query = "What is GraphRAG?"
    # Stream the run and collect the updates into a single response
    response = await AgentRunResponse.from_agent_response_generator(
        agent.run_stream(
            query,
            tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
        )
    )
    print(f"Assistant: {response}")

Uploaded file, file ID: assistant-H8Am6z1bE4yLvYrtZj9mqc
Created vector store, ID: vs_toYiO5QrYf727HZG3UfS3qAr
Agent created. You can now ask questions about the uploaded document.
Assistant: GraphRAG is an AI-based content interpretation and search capability that uses large language models (LLMs) to parse data and create a knowledge graph. It answers user questions about a user-provided private dataset by connecting information across large volumes of data. GraphRAG can handle complex queries that span many documents and thematic questions such as identifying top themes in a dataset.

GraphRAG is designed to support critical information discovery and analysis where the information required spans many documents, may be noisy, or mixed with misinformation. It is intended for use by trained users who apply responsible analytic approaches and critical reasoning, with human analysis needed to verify and augment its responses. The system is meant to be deployed with domain-specific text co