# RAG with Existing Dataset/Vector Store

This sample demonstrates how to **reuse an existing vector store** that was created in any of these ways:
- Through the Azure AI Foundry portal
- From a previous code execution
- By another agent or application

**Key Benefits:**
- ✅ No need to re-upload documents
- ✅ Faster agent creation
- ✅ Consistent knowledge base across multiple agents
- ✅ Cost-effective (no duplicate processing)

**Required Environment Variables:**
- `AZURE_AI_PROJECT_ENDPOINT`: Your Azure AI project endpoint
- `AZURE_AI_MODEL_DEPLOYMENT_NAME`: The name of your model deployment

**Prerequisites:**
- An existing vector store (from portal or previous code execution)
- The vector store ID (shown in the Azure AI Foundry portal under Data + Indexes)
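
A quick pre-flight check can catch a missing variable before any Azure call is made. The sketch below is illustrative only: the helper name `missing_env_vars` is hypothetical and is not part of this sample or of any Azure library.

```python
import os

# The two variables this sample expects (names taken from the list above).
REQUIRED_VARS = ("AZURE_AI_PROJECT_ENDPOINT", "AZURE_AI_MODEL_DEPLOYMENT_NAME")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real environment.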

In [None]:
import os
from agent_framework import ChatAgent, HostedFileSearchTool, HostedVectorStoreContent
from agent_framework.azure import AzureAIAgentClient
# Only the async credential is used below; the sync credential and project client imports were unused.
from azure.identity.aio import AzureCliCredential

project_endpoint = os.environ.get('AZURE_AI_PROJECT_ENDPOINT')
model_name = os.environ.get('AZURE_AI_MODEL_DEPLOYMENT_NAME')

print(f"Project endpoint: {project_endpoint}")
print(f"Deployment name: {model_name}")

In [None]:
async def list_available_vector_stores(client: AzureAIAgentClient) -> list:
    """List all available vector stores in the project to help you find the right one."""
    print("📋 Listing available vector stores...\n")
    
    try:
        # The client is built with an async credential, so the project client is assumed
        # to return an async pager; it is consumed with `async for` instead of list().
        vector_stores_paged = client.project_client.agents.vector_stores.list()
        vector_stores = [store async for store in vector_stores_paged]
        
        if vector_stores:
            print(f"Found {len(vector_stores)} vector store(s):\n")
            for i, store in enumerate(vector_stores, 1):
                print(f"  {i}. Name: {store.name}")
                print(f"     ID: {store.id}")
                print(f"     Created: {store.created_at}")
                print(f"     File count: {store.file_counts.total if store.file_counts else 'Unknown'}")
                print(f"     Status: {store.status}")
                print()
            return vector_stores
        else:
            print("❌ No vector stores found.")
            print("💡 Create one using the original rag-with-agents.ipynb or Azure AI Foundry portal.")
            return []
            
    except Exception as e:
        print(f"❌ Error listing vector stores: {e}")
        return []

In [None]:
async def use_existing_vector_store_by_id(vector_store_id: str) -> HostedVectorStoreContent:
    """Create a HostedVectorStoreContent object from an existing vector store ID."""
    print(f"🔗 Using existing vector store: {vector_store_id}")
    
    # Create the vector store content object
    vector_store_content = HostedVectorStoreContent(vector_store_id=vector_store_id)
    
    print("✅ Vector store content object created successfully")
    return vector_store_content

In [None]:
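async def use_existing_vector_store_by_name(client: AzureAIAgentClient, store_name: str) -> HostedVectorStoreContent | None:
    """Find and use an existing vector store by name. Returns None if no store matches."""
    print(f"🔍 Looking for vector store with name: '{store_name}'")
    
    try:
        # The pager is assumed to be async (the client uses an async credential),
        # so it is consumed with `async for` instead of list().
        vector_stores_paged = client.project_client.agents.vector_stores.list()
        vector_stores = [store async for store in vector_stores_paged]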
        
        # Find store by name
        matching_store = None
        for store in vector_stores:
            if store.name == store_name:
                matching_store = store
                break
        
        if matching_store:
            print(f"✅ Found vector store: {matching_store.id}")
            print(f"   Status: {matching_store.status}")
            print(f"   File count: {matching_store.file_counts.total if matching_store.file_counts else 'Unknown'}")
            
            return HostedVectorStoreContent(vector_store_id=matching_store.id)
        else:
            print(f"❌ Vector store '{store_name}' not found")
            available_names = [store.name for store in vector_stores]
            print(f"Available stores: {available_names}")
            return None
            
    except Exception as e:
        print(f"❌ Error finding vector store: {e}")
        return None

In [None]:
# 🎯 METHOD 1: List all available vector stores and let user choose
async def demo_list_and_choose():
    """Demonstrate listing available vector stores and choosing one."""
    print("=== METHOD 1: LIST AND CHOOSE VECTOR STORE ===")
    
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentClient(async_credential=credential) as chat_client,
    ):
        # List available vector stores
        vector_stores = await list_available_vector_stores(chat_client)
        
        if not vector_stores:
            print("No vector stores available. Please create one first.")
            return
        
        # Use the first available vector store
        selected_store = vector_stores[0]
        print(f"🎯 Using vector store: {selected_store.name} ({selected_store.id})")
        
        # Create vector store content
        vector_store_content = HostedVectorStoreContent(vector_store_id=selected_store.id)
        file_search = HostedFileSearchTool(inputs=vector_store_content)
        
        # Create agent with existing vector store
        agent = chat_client.create_agent(
            name="ExistingDatasetAgent_ListMethod",
            instructions="""
                You are an AI assistant that uses an existing knowledge base to answer questions.
                
                - Answer questions using only the information from the uploaded documents
                - If information is not available in the documents, clearly state this
                - Cite the document source when providing answers
                - Be helpful and accurate in your responses
                """,
            tools=[file_search],
            tool_choice="auto"
        )
        
        print("\n🤖 Agent created successfully! Testing with a query...\n")
        
        # Test the agent
        query = "What information is available in the knowledge base?"
        print(f"User: {query}")
        print("Assistant: ", end="")
        
        async for chunk in agent.run_stream(
            query, 
            tool_resources={"file_search": {"vector_store_ids": [selected_store.id]}}
        ):
            if chunk.text:
                print(chunk.text, end="", flush=True)
        
        print("\n\n✅ METHOD 1 completed successfully!")

In [None]:
# 🎯 METHOD 2: Use specific vector store by ID (if you know it)
async def demo_use_by_id():
    """Demonstrate using a specific vector store by ID."""
    print("\n\n=== METHOD 2: USE SPECIFIC VECTOR STORE BY ID ===")
    
    # 🔧 CONFIGURATION: Replace with your actual vector store ID
    # You can get this from:
    # 1. Azure AI Foundry portal (Data + Indexes section)
    # 2. The output of rag-with-agents.ipynb
    # 3. The list printed by Method 1
    
    VECTOR_STORE_ID = "vs_your_vector_store_id_here"  # 👈 Replace this!
    
    if VECTOR_STORE_ID == "vs_your_vector_store_id_here":
        print("❌ Please update VECTOR_STORE_ID with your actual vector store ID")
        print("💡 You can get this from the list printed by Method 1 or the Azure AI Foundry portal")
        return
    
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentClient(async_credential=credential) as chat_client,
    ):
        # Use existing vector store by ID
        vector_store_content = await use_existing_vector_store_by_id(VECTOR_STORE_ID)
        file_search = HostedFileSearchTool(inputs=vector_store_content)
        
        # Create agent with specific vector store
        agent = chat_client.create_agent(
            name="ExistingDatasetAgent_ByID",
            instructions="""
                You are a specialized AI assistant with access to a specific knowledge base.
                
                - Provide detailed answers based on the available documents
                - If asked about topics not covered in the documents, explain the limitation
                - Always ground your responses in the provided content
                - Be conversational but accurate
                """,
            tools=[file_search],
            tool_choice="auto"
        )
        
        print("\n🤖 Agent created with specific vector store! Testing...\n")
        
        # Test the agent
        query = "Can you summarize the key information from the documents?"
        print(f"User: {query}")
        print("Assistant: ", end="")
        
        async for chunk in agent.run_stream(
            query, 
            tool_resources={"file_search": {"vector_store_ids": [VECTOR_STORE_ID]}}
        ):
            if chunk.text:
                print(chunk.text, end="", flush=True)
        
        print("\n\n✅ METHOD 2 completed successfully!")

In [None]:
# 🎯 METHOD 3: Find vector store by name
async def demo_use_by_name():
    """Demonstrate finding and using a vector store by name."""
    print("\n\n=== METHOD 3: USE VECTOR STORE BY NAME ===")
    
    # 🔧 CONFIGURATION: Replace with your actual vector store name
    VECTOR_STORE_NAME = "graph_knowledge_base"  # 👈 This matches the name from rag-with-agents.ipynb
    
    async with (
        AzureCliCredential() as credential,
        AzureAIAgentClient(async_credential=credential) as chat_client,
    ):
        # Find vector store by name
        vector_store_content = await use_existing_vector_store_by_name(chat_client, VECTOR_STORE_NAME)
        
        if not vector_store_content:
            print("❌ Could not find the specified vector store")
            return
        
        file_search = HostedFileSearchTool(inputs=vector_store_content)
        
        # Create agent with found vector store
        agent = chat_client.create_agent(
            name="ExistingDatasetAgent_ByName",
            instructions="""
                You are an expert AI assistant with access to curated knowledge.
                
                - Answer questions comprehensively using the available documents
                - Provide context and examples when available
                - If information is incomplete, suggest related topics that might be covered
                - Maintain a professional and helpful tone
                """,
            tools=[file_search],
            tool_choice="auto"
        )
        
        print("\n🤖 Agent created using named vector store! Testing...\n")
        
        # Test with multiple queries
        queries = [
            "What topics are covered in the knowledge base?",
            "Can you provide specific details about any services mentioned?"
        ]
        
        for i, query in enumerate(queries, 1):
            print(f"\n--- Query {i} ---")
            print(f"User: {query}")
            print("Assistant: ", end="")
            
            async for chunk in agent.run_stream(
                query, 
                tool_resources={"file_search": {"vector_store_ids": [vector_store_content.vector_store_id]}}
            ):
                if chunk.text:
                    print(chunk.text, end="", flush=True)
            print()
        
        print("\n✅ METHOD 3 completed successfully!")

In [None]:
# 🚀 Run the demonstrations
print("🔄 Starting demonstrations of reusing existing vector stores...\n")

# Run Method 1: List and choose
await demo_list_and_choose()

# Uncomment the methods you want to test:
# await demo_use_by_id()      # Method 2: Use by ID
# await demo_use_by_name()    # Method 3: Use by name

print("\n🎉 All demonstrations completed!")
print("\n💡 Key takeaways:")
print("   • Vector stores can be reused across multiple agents")
print("   • No need to re-upload documents for each agent")
print("   • Use specific IDs for production scenarios")
print("   • Search by name for more user-friendly workflows")

## 📋 How to Find Your Vector Store ID

### **Method 1: From Azure AI Foundry Portal**
1. Go to **Azure AI Foundry portal** (https://ai.azure.com)
2. Navigate to your project
3. Look for **"Data + Indexes"** or **"Indexes"** in the left sidebar
4. Click on your index/vector store
5. Copy the **Index ID** or **Vector Store ID**

### **Method 2: From Previous Code Execution**
When you run the original `rag-with-agents.ipynb`, look for output like:
```
Created vector store, ID: vs_abc123xyz789
```

### **Method 3: Run the List Function Above**
Run `demo_list_and_choose()` (Method 1 above), which calls `list_available_vector_stores()` and prints every available vector store with its ID.

## 🔧 **Configuration Options**

### **For Production Use:**
- Use **Method 2** (by ID) for reliability
- Store vector store IDs in environment variables
- Implement error handling for missing stores
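
The production advice above could look roughly like the following. This is a minimal sketch under stated assumptions: the variable name `AZURE_AI_VECTOR_STORE_ID` and the helper `load_vector_store_id` are hypothetical and are not defined by this sample or by the Azure SDK.

```python
import os

# Hypothetical variable name; the sample itself does not define one.
VECTOR_STORE_ENV_VAR = "AZURE_AI_VECTOR_STORE_ID"


def load_vector_store_id(env=None, var_name=VECTOR_STORE_ENV_VAR):
    """Read the vector store ID from the environment, failing fast if it is absent."""
    env = os.environ if env is None else env
    value = (env.get(var_name) or "").strip()
    if not value:
        raise RuntimeError(f"Set {var_name} to the ID of an existing vector store.")
    return value
```

The returned value can then be passed to `HostedVectorStoreContent(vector_store_id=...)` in the same way Method 2 uses `VECTOR_STORE_ID`.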

### **For Development/Testing:**
- Use **Method 1** (list and choose) for exploration
- Use **Method 3** (by name) for readable code
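
One reason name lookups suit development better than production is that a name may match more than one store. The sketch below assumes names are not guaranteed to be unique within a project and, unlike Method 3, refuses to pick the first match silently. The helper `resolve_store_id_by_name` is hypothetical, and the `SimpleNamespace` objects are stand-ins that expose only the `.id` and `.name` attributes used in this notebook.

```python
from types import SimpleNamespace


def resolve_store_id_by_name(stores, name):
    """Return the ID of the single store called `name`; raise if none or several match."""
    matches = [store.id for store in stores if store.name == name]
    if not matches:
        raise LookupError(f"No vector store named {name!r}")
    if len(matches) > 1:
        raise LookupError(f"Name {name!r} is ambiguous: {matches}")
    return matches[0]


# Stand-in objects, not real SDK vector store instances.
stores = [
    SimpleNamespace(id="vs_001", name="graph_knowledge_base"),
    SimpleNamespace(id="vs_002", name="scratch"),
]
print(resolve_store_id_by_name(stores, "graph_knowledge_base"))
```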

## 🎯 **Next Steps**

1. **Run Method 1** (`demo_list_and_choose()`) to see available vector stores
2. **Copy a vector store ID** from the output
3. **Update Method 2** with your actual ID
4. **Test different approaches** to see what works best for your use case

## 💡 **Best Practices**

- ✅ **Name your vector stores descriptively** for easy identification
- ✅ **Document vector store purposes** in your team
- ✅ **Use consistent naming conventions** across projects
- ✅ **Test vector store availability** before creating agents
- ✅ **Consider vector store versioning** for evolving knowledge bases
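
The availability check mentioned above can be sketched as a small predicate over the `status` and `file_counts.total` fields that the listing code in this notebook already prints. The helper `is_store_ready` is hypothetical, and the status string `"completed"` is an assumption about what the service reports for a fully processed store; adjust it if your stores report something else.

```python
from types import SimpleNamespace


def is_store_ready(store, ready_status="completed"):
    """Return True when the store reports the ready status and holds at least one file."""
    file_counts = getattr(store, "file_counts", None)
    total = getattr(file_counts, "total", 0) or 0
    return getattr(store, "status", None) == ready_status and total > 0


# Stand-in objects, not real SDK vector store instances.
ready = SimpleNamespace(status="completed", file_counts=SimpleNamespace(total=3))
pending = SimpleNamespace(status="in_progress", file_counts=SimpleNamespace(total=3))
empty = SimpleNamespace(status="completed", file_counts=None)

print(is_store_ready(ready), is_store_ready(pending), is_store_ready(empty))
```

Calling such a check before `create_agent` avoids building an agent on top of a store that is still processing or has no files.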