# 🏦 Banking Document Search Agent Tutorial 📄

Welcome to the **Banking Document Search Agent** tutorial! We'll use **Microsoft Foundry** SDKs to build an assistant that can:

1. **Upload** banking policy documents and loan guidelines into a vector store.
2. **Create an Agent** with a **File Search** tool.
3. **Search** these documents for loan policies, banking regulations, and compliance information.
4. **Answer** customer and employee questions about banking products and procedures.

### ⚠️ Important Financial Disclaimer ⚠️
> **All financial information in this notebook is for general educational purposes only and is not a substitute for professional financial or legal advice.** Always consult with qualified banking professionals and compliance officers for official guidance.

## Prerequisites

### 🔐 Required Roles
1. **Azure AI Developer** on your Microsoft Foundry project.
2. **Storage Blob Data Contributor** on the project's Storage account.
3. If standard agent setup is used with your own Search resource, also ensure you have **Cognitive Search Data Contributor** on that resource.

### 🌐 Storage Account Networking Configuration
The file upload to vector stores requires network access to the storage account. If you encounter a **403 Forbidden** error during file upload, you need to configure the storage account networking:

1. Go to **Azure Portal** → Navigate to your AI Foundry project's **Storage Account**
2. Go to **Networking** under **Security + networking**
3. Under **Public network access**, select **"Enabled from all networks"**
4. Click **Save** and wait 1-2 minutes for changes to propagate

> ⚠️ **Important**: For workshops/testing, enabling "from all networks" is the most reliable option. Other configurations (selected networks, adding IP addresses, resource instances) may not work reliably for file uploads. For production environments, consult your security team for appropriate network configurations.

## Let's Get Searching!
We'll show you how to upload sample banking documents, create a vector store for them, then spin up an agent that can search for loan policies, interest rates, and compliance guidelines. Enjoy!

## 🔐 Authentication Setup

Before running the next cell, make sure you're authenticated with Azure CLI. Run this command in your terminal:

```bash
az login --use-device-code
```

This will provide you with a device code and URL to authenticate in your browser, which is useful for:
- Remote development environments
- Systems without a default browser
- Corporate environments with strict security policies

After successful authentication, you can proceed with the notebook cells below.

## 1. Initial Setup
Here we import the libraries we need, load environment variables from `.env`, and initialize our **AIProjectClient**. Let's do this! 🎉

In [None]:
import os
import time
from pathlib import Path

from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition, FileSearchTool  # For new agent API

# Load environment variables from parent .env
notebook_path = Path().absolute()
parent_dir = notebook_path.parent
load_dotenv(parent_dir.parent / '.env')

# Get project endpoint
project_endpoint = os.environ.get("AI_FOUNDRY_PROJECT_ENDPOINT")

print(f"🔑 Using project endpoint: {project_endpoint}")

# Initialize AIProjectClient with DefaultAzureCredential (recommended approach)
try:
    print("🔐 Using DefaultAzureCredential for authentication...")
    
    credential = DefaultAzureCredential()
    
    # Create the project client using endpoint
    project_client = AIProjectClient(
        endpoint=project_endpoint,
        credential=credential
    )
    
    # Get OpenAI client for file operations
    openai_client = project_client.get_openai_client()
    
    print("✅ Successfully initialized AIProjectClient and OpenAI client")

except Exception as e:
    print(f"❌ Error initializing project client: {e}")
    print("💡 Make sure you are logged in with: az login")
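
Because `os.environ.get()` returns `None` silently when a variable is missing, it can help to check the settings before creating the client. The following is a minimal sketch, assuming the same variable names this notebook reads; `check_settings` and the two constants are hypothetical helpers, not part of any SDK.

```python
import os

# Hypothetical helper: these names mirror the variables read in this notebook.
REQUIRED_SETTINGS = ("AI_FOUNDRY_PROJECT_ENDPOINT",)
OPTIONAL_SETTINGS = {"AZURE_AI_MODEL_DEPLOYMENT_NAME": "gpt-4o-mini"}


def check_settings(env=os.environ):
    """Return (missing, resolved): unset required names and optional values with defaults."""
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    resolved = {name: env.get(name, default) for name, default in OPTIONAL_SETTINGS.items()}
    return missing, resolved


missing, resolved = check_settings()
if missing:
    print(f"Missing required settings: {', '.join(missing)}")
else:
    print("All required settings are present.")
print(f"Model deployment: {resolved['AZURE_AI_MODEL_DEPLOYMENT_NAME']}")
```

Running this before the initialization cell gives a clearer message than an authentication or endpoint error raised later.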

## 2. Prepare Sample Banking Documents 📄💼
We'll create sample markdown files for loan policies and banking guidelines. Then we'll store them in a vector store for searching.

In [None]:
def create_sample_files():
    loan_policies_md = """# Loan Policies and Guidelines

## Mortgage Loan Requirements
### Eligibility Criteria
- Minimum credit score: 620 for conventional loans, 580 for FHA loans
- Debt-to-income ratio: Maximum 43% for most loan programs
- Employment history: Minimum 2 years of stable employment
- Down payment: Minimum 3% for conventional, 3.5% for FHA

### Interest Rate Tiers
| Credit Score | Rate Adjustment |
|--------------|-----------------|
| 760+         | Best available rate |
| 700-759      | +0.25% |
| 660-699      | +0.50% |
| 620-659      | +0.75% |

## Auto Loan Guidelines
- Maximum loan-to-value: 125% for new vehicles, 100% for used
- Maximum term: 84 months for new, 72 months for used vehicles
- Vehicle age restrictions: Maximum 7 years old for used vehicles
- Required documentation: Proof of income, insurance verification

## Personal Loan Policies
- Unsecured loans up to $50,000
- Terms from 12 to 60 months
- Fixed interest rates based on creditworthiness
- No prepayment penalties

## Business Loan Requirements
- Minimum 2 years in business
- Annual revenue documentation required
- Business plan for loans over $100,000
- Personal guarantee may be required
"""

    compliance_guidelines_md = """# Banking Compliance Guidelines

## Truth in Lending Act (TILA) Requirements
- APR disclosure must be provided within 3 business days
- All fees must be clearly itemized
- Right to rescission for home equity loans (3 business day period)
- Clear disclosure of payment schedules

## Fair Lending Practices
- Equal Credit Opportunity Act compliance required
- No discrimination based on race, religion, national origin, sex, marital status, age
- Consistent underwriting criteria for all applicants
- Documentation of all lending decisions

## Know Your Customer (KYC) Requirements
- Government-issued ID verification
- Address verification through utility bill or bank statement
- Source of funds documentation for large transactions
- Enhanced due diligence for high-risk customers

## Anti-Money Laundering (AML) Compliance
- Currency Transaction Reports for transactions over $10,000
- Suspicious Activity Reports when warranted
- Customer identification program implementation
- Ongoing transaction monitoring

## Data Privacy Requirements
- GLBA compliance for customer information protection
- Secure data storage and transmission
- Customer consent for information sharing
- Annual privacy notice distribution
"""

    # Save to local files
    loan_filename = os.environ.get("LOAN_POLICIES_FILENAME", "loan_policies.md")
    compliance_filename = os.environ.get("COMPLIANCE_FILENAME", "compliance_guidelines.md")
    
    with open(loan_filename, "w", encoding="utf-8") as f:
        f.write(loan_policies_md)
    with open(compliance_filename, "w", encoding="utf-8") as f:
        f.write(compliance_guidelines_md)

    print(f"📄 Created sample banking documents: {loan_filename}, {compliance_filename}")
    return [loan_filename, compliance_filename]

sample_files = create_sample_files()
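
Before uploading, it may be worth confirming that the files exist and contain what you expect. This is a small sketch using only the standard library; `describe_documents` is a hypothetical helper introduced here for illustration.

```python
from pathlib import Path


def describe_documents(paths):
    """Return (path, size_in_bytes, heading_count) for each non-empty file."""
    report = []
    for p in map(Path, paths):
        if not p.is_file():
            raise FileNotFoundError(f"Expected document is missing: {p}")
        text = p.read_text(encoding="utf-8")
        if not text.strip():
            raise ValueError(f"Document is empty: {p}")
        # Count markdown heading lines as a rough structural sanity check
        headings = sum(1 for line in text.splitlines() if line.startswith("#"))
        report.append((str(p), p.stat().st_size, headings))
    return report
```

In this notebook you could call `describe_documents(sample_files)` and print each tuple to see the file sizes and heading counts.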

#### ✨ Note on Search Permissions
When creating the vector store, you must also have **Cognitive Search Data Contributor** role on your Azure AI Search resource (if you're using the standard agent setup with your own Search resource). Missing this role will often cause a **Forbidden** error. See [Authentication Setup](../../1-introduction/1-authentication.ipynb#4-add-agent-service-permissions) for details on configuring permissions.


## 3. Create a Vector Store for Banking Documents 📚
We'll upload our banking policy documents and group them into a single vector store for searching. This allows the agent to find relevant policy information quickly.

In [None]:
def create_vector_store(files, store_name="banking_documents"):
    """
    Create a vector store and upload files using the official SDK pattern:
    1. Create empty vector store
    2. Upload files with upload_and_poll()
    
    Note: Requires Storage Blob Data Contributor role on the project's Storage account.
    """
    try:
        # Step 1: Create empty vector store first (official pattern)
        print(f"📦 Creating vector store '{store_name}'...")
        vs = openai_client.vector_stores.create(name=store_name)
        print(f"✅ Vector store created (id: {vs.id})")

        # Step 2: Upload files to vector store using upload_and_poll (official pattern)
        uploaded_file_ids = []
        for fp in files:
            print(f"📤 Uploading {fp} to vector store...")
            # Use a context manager so the file handle is closed after the upload
            with open(fp, "rb") as fh:
                file = openai_client.vector_stores.files.upload_and_poll(
                    vector_store_id=vs.id,
                    file=fh
                )
            uploaded_file_ids.append(file.id)
            print(f"✅ File uploaded to vector store (id: {file.id})")

        print("🎉 All files uploaded successfully!")
        return vs, uploaded_file_ids
        
    except Exception as e:
        print(f"❌ Error creating vector store: {e}")
        
        # Check for common permission errors
        error_str = str(e).lower()
        if "403" in str(e) or "forbidden" in error_str:
            print("\n" + "="*60)
            print("⚠️ PERMISSION ERROR: 403 Forbidden")
            print("="*60)
            print("\nThis error usually means you need the 'Storage Blob Data Contributor' role")
            print("on your Microsoft Foundry project's storage account.")
            print("\n📋 To fix this:")
            print("1. Go to Azure Portal (portal.azure.com)")
            print("2. Navigate to your AI Foundry resource group")
            print("3. Find the Storage Account associated with your project")
            print("4. Go to 'Access control (IAM)' → 'Add role assignment'")
            print("5. Select 'Storage Blob Data Contributor'")
            print("6. Assign to your user account")
            print("7. Wait 5-10 minutes for the role to propagate")
            print("8. Re-run this cell")
            print("\nIf the role is already assigned, check that the storage account's")
            print("networking settings allow access (see Prerequisites).")
            print("\n💡 For this workshop, the agent will work in DEMO MODE")
            print("   using embedded knowledge instead of file search.")
            print("="*60)
        
        import traceback
        traceback.print_exc()
        return None, []

# Initialize empty variables
vector_store, file_ids = None, []

# Create a vector store from our banking documents
if sample_files:
    vector_store, file_ids = create_vector_store(sample_files, "banking_policies_store")
else:
    print("⚠️ No sample files available - please run the previous cell first")
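
The 403 detection in the cell above relies on matching text in the exception message. A slightly more reusable sketch is shown below. It assumes the exception may expose a numeric `status_code` attribute, as many HTTP client exceptions do, and falls back to the message text otherwise. `looks_like_permission_error` is a hypothetical name.

```python
def looks_like_permission_error(exc):
    """Heuristic: True when an exception suggests an HTTP 403 response."""
    # Assumption: some client exceptions carry a numeric status_code attribute.
    if getattr(exc, "status_code", None) == 403:
        return True
    # Fall back to the message text, as the notebook cell does.
    message = str(exc).lower()
    return "403" in message or "forbidden" in message


print(looks_like_permission_error(Exception("Error code: 403 - Forbidden")))
print(looks_like_permission_error(Exception("Connection timed out")))
```

Keeping the check in one function makes it easy to reuse in other cells that call storage-backed APIs.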

## 4. Create the Banking Document Search Agent 🔎
We use a **FileSearchTool** pointing to our newly created vector store, then create the Agent with instructions about banking policies, loan guidelines, and compliance information.

In [None]:
def create_banking_document_agent(vstore_id=None):
    """
    Create a banking document search agent.
    Uses FileSearchTool class when vector store is available.
    """
    try:
        # Define tools based on whether we have a vector store
        if vstore_id:
            # Use FileSearchTool class from azure.ai.projects.models
            tool = FileSearchTool(vector_store_ids=[vstore_id])
            tools = [tool]
            tool_note = "with FileSearchTool"
            banking_knowledge = ""  # Agent will search the documents
        else:
            tools = []
            tool_note = "(demo mode - no vector store)"
            # Banking policy knowledge embedded in instructions for demo
            banking_knowledge = """
            BANKING POLICY KNOWLEDGE BASE:
            
            MORTGAGE LOANS:
            - Minimum credit score: 620 for conventional, 580 for FHA loans
            - Maximum debt-to-income ratio: 43% for qualified mortgages
            - Down payment: Minimum 3% for conventional, 3.5% for FHA
            - Documentation required: Income verification, tax returns, bank statements
            
            KYC REQUIREMENTS:
            - Government-issued photo ID required
            - Proof of address (utility bill or bank statement within 90 days)
            - Social Security Number verification
            - Enhanced due diligence for high-risk customers
            
            BUSINESS LOANS:
            - Loans over $100,000 require: Business plan, 2 years financials
            - Collateral requirements may apply
            - Personal guarantee often required for small businesses
            
            COMPLIANCE:
            - All loans subject to TILA (Truth in Lending Act) disclosures
            - Fair lending practices required under ECOA
            - BSA/AML compliance for all customer accounts
            """
        
        # Create an AI agent using the Foundry API with create_version
        agent = project_client.agents.create_version(
            agent_name="banking-document-search-agent",
            definition=PromptAgentDefinition(
                model=os.environ.get("AZURE_AI_MODEL_DEPLOYMENT_NAME", "gpt-4o-mini"),
                instructions=f"""
                    You are a Banking Document Search Agent with access to loan policies and compliance guidelines.
                    {banking_knowledge}
                    
                    You:
                    1. Always search the uploaded documents to find accurate information
                    2. Always include financial and regulatory disclaimers
                    3. Provide accurate references to policy documents when possible
                    4. Focus on loan requirements, interest rates, and compliance guidelines
                    5. Encourage users to consult with loan officers for personalized advice
                    6. Explain banking terms in clear, customer-friendly language
                    7. Always cite the specific policy section when answering questions
                    
                    DISCLAIMER: This information is for general educational purposes only.
                    Always consult qualified banking professionals for official guidance.
                """,
                tools=tools
            ),
            description="Banking document search agent for policy and compliance queries."
        )
        print(f"🎉 Created banking document agent {tool_note}")
        print(f"   Agent name: {agent.name}, version: {agent.version}")
        return agent
    except Exception as e:
        print(f"❌ Error creating banking document agent: {e}")
        import traceback
        traceback.print_exc()
        return None

# Initialize placeholder
if 'banking_agent' not in globals():
    banking_agent = None

# Create the agent (with or without vector store)
print("Creating Banking Document Search Agent...")
if banking_agent:
    try:
        project_client.agents.delete_version(agent_name=banking_agent.name, agent_version=banking_agent.version)
        print("🗑️ Deleted old agent")
    except Exception as delete_err:
        print(f"⚠️ Could not delete previous agent: {delete_err}")

# Create agent - pass vector_store.id if available, otherwise None for demo mode
vstore_id = vector_store.id if vector_store else None
banking_agent = create_banking_document_agent(vstore_id)
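
The instructions in the cell above are written as an indented f-string, so the leading spaces are sent to the model as part of the prompt. One way to avoid that is sketched here with `textwrap.dedent`. The instruction text is abbreviated from the cell above, and `build_instructions` is a hypothetical helper.

```python
import textwrap

# Abbreviated copy of the notebook's instructions, kept flush-left after dedent.
BASE_INSTRUCTIONS = textwrap.dedent("""\
    You are a Banking Document Search Agent with access to loan policies and compliance guidelines.

    You:
    1. Always search the uploaded documents to find accurate information
    2. Always include financial and regulatory disclaimers
    3. Always cite the specific policy section when answering questions

    DISCLAIMER: This information is for general educational purposes only.
    Always consult qualified banking professionals for official guidance.
""")


def build_instructions(banking_knowledge=""):
    """Return the base instructions, plus embedded knowledge when in demo mode."""
    knowledge = textwrap.dedent(banking_knowledge).strip()
    if not knowledge:
        return BASE_INSTRUCTIONS.strip()
    return f"{BASE_INSTRUCTIONS.strip()}\n\n{knowledge}"


print(build_instructions().splitlines()[0])
```

The result could be passed as the `instructions` argument in place of the inline f-string. Whether the extra whitespace affects answers in practice is untested here.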

## 5. Searching Banking Documents 🏋️👩‍💼
We'll create a new conversation thread and ask queries like "What credit score do I need for a mortgage?" or "What are the KYC requirements?" The agent will search our banking documents to find relevant information.

In [None]:
def create_search_conversation(agent):
    try:
        # Get OpenAI client for the new API
        openai_client = project_client.get_openai_client()
        
        # Create a new conversation using the NEW Foundry API
        conversation = openai_client.conversations.create()
        print(f"📝 Created new search conversation, ID: {conversation.id}")
        return conversation
    except Exception as e:
        print(f"❌ Error creating search conversation: {e}")
        return None

def ask_search_question(agent, user_question, conversation_id=None):
    """Ask a search question using the new Foundry responses API."""
    try:
        # Get OpenAI client for the new API
        openai_client = project_client.get_openai_client()
        
        print(f"🔎 Searching: '{user_question}'")

        # Use the NEW responses API with agent reference
        response_params = {
            "extra_body": {
                "agent": {
                    "type": "agent_reference",
                    "name": agent.name,
                    "version": agent.version
                }
            },
            "input": user_question
        }
        
        # Add conversation context if provided
        if conversation_id:
            response_params["conversation"] = conversation_id
        
        response = openai_client.responses.create(**response_params)
        
        print("🤖 Search completed!")
        if response.output_text:
            print(f"\n{response.output_text}\n")
        return response
    except Exception as e:
        print(f"❌ Error with document search: {e}")
        import traceback
        traceback.print_exc()
        return None

# Test our banking document search
search_conversation = None

if banking_agent:
    search_conversation = create_search_conversation(banking_agent)

    if search_conversation:
        # Banking-focused test questions
        queries = [
            "What is the minimum credit score required for a mortgage loan?",
            "What are the KYC requirements for new customers?",
            "What documentation is needed for a business loan over $100,000?"
        ]

        for q in queries:
            ask_search_question(banking_agent, q, search_conversation.id)
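
The request in `ask_search_question` is a plain dictionary of keyword arguments, so it can be built and inspected separately from the network call. The sketch below mirrors the shape used in the cell above; `build_response_params` is a hypothetical helper, and the agent name and version are placeholders.

```python
def build_response_params(agent_name, agent_version, question, conversation_id=None):
    """Assemble keyword arguments in the shape this notebook passes to responses.create()."""
    params = {
        "input": question,
        "extra_body": {
            "agent": {
                "type": "agent_reference",
                "name": agent_name,
                "version": agent_version,
            }
        },
    }
    # Only attach a conversation when one was created, as in the notebook cell
    if conversation_id:
        params["conversation"] = conversation_id
    return params


params = build_response_params(
    "banking-document-search-agent", "1", "What are the KYC requirements?"
)
print(sorted(params))  # ['extra_body', 'input']
```

Separating construction from the call makes it easier to log or unit-test what is sent for each question.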

## 6. Cleanup & Best Practices 🧹
We'll optionally remove the vector store, the uploaded files, and the agent. In a production environment, you might keep them around longer. Meanwhile, here are some tips:

1. **Resource Management**
   - Keep files grouped by category, and regularly prune old or irrelevant files.
   - Clear out test agents or vector stores once you're done.

2. **Search Queries**
   - Provide precise or multi-part queries.
   - Consider synonyms or alternative keywords ("APR" vs "interest rate").
   
3. **Financial Information**
   - Always disclaim that the agent is not a licensed financial or legal advisor.
   - Encourage users to consult loan officers or compliance staff for specific decisions.

4. **Performance**
   - Keep an eye on vector store size.
   - Evaluate search accuracy with `azure-ai-evaluation`!
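
As a rough starting point before adopting a full evaluation library, you can check whether an answer mentions the facts you expect from the source documents. The sketch below is a simple keyword-recall stand-in and does not use the `azure-ai-evaluation` API; `keyword_recall` is a hypothetical helper.

```python
def keyword_recall(answer, expected_terms):
    """Fraction of expected terms that appear in the answer (case-insensitive)."""
    if not expected_terms:
        return 1.0
    text = answer.lower()
    hits = sum(1 for term in expected_terms if term.lower() in text)
    return hits / len(expected_terms)


answer = "Conventional loans need a minimum credit score of 620; FHA loans need 580."
print(keyword_recall(answer, ["620", "580", "FHA"]))  # 1.0
```

Substring matching is crude (it cannot tell a correct figure from the same digits in another context), so treat a low score as a prompt for manual review rather than a verdict.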


In [None]:
def cleanup_all():
    """Clean up all resources created during this notebook."""
    try:
        # Delete the AI agent first (before vector store)
        if 'banking_agent' in globals() and banking_agent:
            project_client.agents.delete_version(
                agent_name=banking_agent.name, 
                agent_version=banking_agent.version
            )
            print(f"🗑️ Deleted banking document agent: {banking_agent.name} (Version: {banking_agent.version})")

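        # Delete vector store (the underlying uploaded files are deleted separately below)
        if 'vector_store' in globals() and vector_store:
            openai_client.vector_stores.delete(vector_store.id)
            print(f"🗑️ Deleted vector store: {vector_store.id}")

        # Delete the uploaded files too; removing a vector store may leave them in file storage
        if 'file_ids' in globals() and file_ids:
            for fid in file_ids:
                try:
                    openai_client.files.delete(fid)
                except Exception as file_err:
                    print(f"⚠️ Could not delete file {fid}: {file_err}")
            print("🗑️ Finished deleting uploaded files.")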

        # Clean up local files
        if 'sample_files' in globals() and sample_files:
            for sf in sample_files:
                if os.path.exists(sf):
                    os.remove(sf)
            print("🗑️ Deleted local sample files.")

        print("\n✅ Cleanup completed!")

    except Exception as e:
        print(f"❌ Error during cleanup: {e}")
        import traceback
        traceback.print_exc()

# Run cleanup
print("🧹 Running cleanup to remove created resources...")
print("💡 Comment out the cleanup_all() call if you want to keep the agent for further testing.")
cleanup_all()
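
`cleanup_all()` wraps every step in a single `try`, so a failure while deleting the agent skips the vector store and local files. One alternative is to run each step independently and collect the failures. This is a sketch with simulated steps; `run_cleanup_steps` is a hypothetical helper.

```python
def run_cleanup_steps(steps):
    """Run each (label, callable) step; collect failures instead of stopping early."""
    failures = []
    for label, action in steps:
        try:
            action()
            print(f"Deleted {label}")
        except Exception as exc:  # keep going so later resources are still removed
            failures.append((label, exc))
            print(f"Could not delete {label}: {exc}")
    return failures


def _simulated_failure():
    raise RuntimeError("simulated failure")


# Simulated steps stand in for the real delete calls used in this notebook.
failures = run_cleanup_steps([
    ("agent", _simulated_failure),
    ("vector store", lambda: None),
])
print(len(failures))  # 1
```

In the notebook, each step would be a lambda wrapping one of the delete calls from `cleanup_all()`, listed in the same order (agent first, then vector store).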

# Congratulations! 🎉

You've successfully completed the **Banking Document Search Agent** tutorial! Here's what was accomplished:

## ✅ **What We Built**

### **🔍 File Search Agent**
- Created an AI agent with **FileSearchTool** capabilities
- Enabled the agent to search through uploaded banking documents using semantic search
- Configured banking-focused instructions with appropriate financial disclaimers

### **📚 Key Features Demonstrated**

1. **📄 Vector Store & File Upload**
   - Created a vector store using `openai_client.vector_stores.create()`
   - Uploaded files directly to vector store using `openai_client.vector_stores.files.upload_and_poll()`
   - Polled for completion status

2. **🔎 Semantic Document Search**
   - Agent searches through loan policies and compliance guidelines
   - Provides relevant answers based on uploaded document contents
   - Uses `FileSearchTool` class from `azure.ai.projects.models`

3. **🏦 Banking-Focused Responses**
   - Agent answered questions about mortgage requirements, KYC policies, and business loans
   - Provided responsible AI disclaimers about financial advice
   - Referenced content from uploaded policy documents

4. **🧹 Resource Management**
   - Properly cleaned up vector stores and agents
   - Demonstrated best practices for resource lifecycle management


## 🎯 **Key Concepts Learned**

- **FileSearchTool**: The proper class from `azure.ai.projects.models` for file search
- **Vector Stores**: Create empty first, then upload files with `upload_and_poll()`
- **Polling Pattern**: Wait for vector store status to be "completed"
- **Semantic Search**: Agents find relevant content even when exact words don't match
- **Responsible AI**: Always include financial disclaimers for banking-related content
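
The polling pattern mentioned above can be sketched generically. `upload_and_poll()` already waits internally, so this is only an illustration of the idea for cases where you check status yourself. The status strings and the `poll_until_complete` helper are assumptions, and the status source here is simulated.

```python
import time


def poll_until_complete(fetch_status, timeout_s=60.0, interval_s=1.0, sleep=time.sleep):
    """Call fetch_status() until it returns a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        # Assumed terminal states; adjust to the statuses your service reports
        if status in ("completed", "failed", "cancelled"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout_s} seconds")
        sleep(interval_s)


# Simulated status source standing in for a real status lookup.
statuses = iter(["in_progress", "in_progress", "completed"])
print(poll_until_complete(lambda: next(statuses), interval_s=0))  # completed
```

In practice `fetch_status` would wrap a status lookup such as the `retrieve()` call listed in the API reference table, returning its status field.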

## 🔍 **API Methods Reference**

| Method | Description |
|--------|-------------|
| `openai_client.vector_stores.create()` | Create empty vector store |
| `openai_client.vector_stores.files.upload_and_poll()` | Upload file and wait for processing |
| `openai_client.vector_stores.retrieve()` | Check vector store status |
| `FileSearchTool(vector_store_ids=[...])` | Create file search tool |
| `project_client.agents.create_version()` | Create agent with tools |
| `openai_client.conversations.create()` | Create conversation |
| `openai_client.responses.create()` | Get agent response |

## 💡 **Best Practices Recap**

1. **Vector Store Pattern** - Create empty vector store first, then upload files to it
2. **Use FileSearchTool Class** - Import from `azure.ai.projects.models`
3. **Polling** - Always wait for vector store status to be "completed"
4. **Banking Content** - Always include financial disclaimers
5. **Resource Cleanup** - Delete agents first, then vector stores
6. **Error Handling** - Check for 403 errors indicating missing permissions

## 🔧 **Troubleshooting Guide**

**If you get a 403 error during file upload:**
1. ✅ Ensure you have the **Storage Blob Data Contributor** role on the project's storage account
2. ✅ To assign it, go to Azure Portal → the project's Storage Account → Access control (IAM)
3. ✅ Add a role assignment for your user account
4. ✅ Wait a few minutes for the role to propagate
5. ✅ If the role is already assigned, check the storage account's **Networking** settings (see Prerequisites)

**If vector store creation fails:**
1. ✅ Use the correct pattern: create empty store, then upload files
2. ✅ Use `upload_and_poll()` instead of separate upload methods
3. ✅ Check the vector store status after uploads

## 📚 **Reference**

- [Azure AI Agents Documentation](https://learn.microsoft.com/azure/ai-services/agents/)

---

*Happy document searching!* 🔍💼