# Notebook 3: Advanced RAG

In this notebook, you will explore how [AzureAISearchContextProvider](https://learn.microsoft.com/en-us/python/api/agent-framework-core/agent_framework.azure.azureaisearchcontextprovider?view=agent-framework-python-latest) improves retrieval over the simple RAG approach from Notebook 2.

## Learning Objectives
- Learn to use [AzureAISearchContextProvider](https://learn.microsoft.com/en-us/python/api/agent-framework-core/agent_framework.azure.azureaisearchcontextprovider?view=agent-framework-python-latest) to combine vector and keyword searches with semantic ranking.
- Learn how to pass the AzureAISearchContextProvider to a ChatAgent through its context_providers argument.
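
To make "combine vector and keyword searches" concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the method Azure AI Search documents for merging keyword and vector rankings in hybrid queries. The function name, the ticket ids, and the constant `k=60` are assumptions made for illustration; this is not the service's implementation, and semantic ranking is a separate re-scoring step that is not modeled here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in,
    and the scores are summed across lists.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from a keyword query and a vector query.
keyword_hits = ["ticket-7", "ticket-2", "ticket-9"]
vector_hits = ["ticket-2", "ticket-5", "ticket-7"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print([doc_id for doc_id, _ in fused])
```

Documents that rank well in both lists rise to the top, which is why a hybrid query can surface results that either search alone would rank lower.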


### Set Up the Module Imports

In [19]:
import os

from agent_framework import ChatAgent
from agent_framework.azure import AzureOpenAIChatClient, AzureAISearchContextProvider
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from dotenv import load_dotenv
from openai import AzureOpenAI

### Get the Required Environment Variables

In [20]:
load_dotenv(override=True)

api_version = os.getenv("AZURE_OPENAI_API_VERSION")
search_endpoint = os.getenv("AZURE_SEARCH_ENDPOINT")
search_key = os.getenv("AZURE_SEARCH_API_KEY")
index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")
project_endpoint = os.getenv("AZURE_AI_PROJECT_ENDPOINT")
model_deployment = os.getenv("AZURE_AI_MODEL_DEPLOYMENT_NAME")
embedding_deployment = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT")
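
Because `os.getenv` returns `None` for an unset variable, a missing entry in `.env` only surfaces later as a confusing client error. The sketch below shows one way to fail early. The helper `missing_settings` and the choice of which variables count as required are assumptions, not part of the notebook.

```python
import os


def missing_settings(names: list[str]) -> list[str]:
    """Return the names whose environment variable is unset or blank."""
    return [name for name in names if not (os.getenv(name) or "").strip()]


# Assumed minimum set for this notebook; adjust it to match your own .env file.
REQUIRED_SETTINGS = [
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_INDEX_NAME",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT",
]

missing = missing_settings(REQUIRED_SETTINGS)
if missing:
    print(f"Missing settings: {', '.join(missing)}")
```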

Define a function that returns the embedding vector for each query we ask.

In [21]:

EMBEDDING_DIMENSIONS = 1536  # 1536 for text-embedding-ada-002 and text-embedding-3-small; 3072 for text-embedding-3-large

async def get_embeddings(text: str) -> list[float]:
    """Return the embedding for `text`, or a zero vector if the text is empty or the call fails."""
    if not text or not text.strip():
        # Return zero vector for empty text
        return [0.0] * EMBEDDING_DIMENSIONS
    
    # Truncate overly long text (the embedding models accept roughly 8,000 tokens)
    max_chars = 30000  # Approximate character budget for that token limit
    if len(text) > max_chars:
        text = text[:max_chars]
    
    # Create a token provider that returns a fresh bearer token on each call
    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(),
        "https://cognitiveservices.azure.com/.default",
    )

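    # The endpoint is read from the AZURE_OPENAI_ENDPOINT environment variable
    client = AzureOpenAI(
        azure_ad_token_provider=token_provider,
        api_version=api_version or "2024-02-01",  # Prefer the version loaded from .env
    )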

    try:
        response = client.embeddings.create(
            input=text,
            model=embedding_deployment
        )
        return response.data[0].embedding
    except Exception as e:
        print(f"Error generating embedding: {e}")
        # Return zero vector on error
        return [0.0] * EMBEDDING_DIMENSIONS
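
Since `get_embeddings` returns an all-zero vector both for empty input and when the API call fails, a failed call can pass for a real embedding unless it is checked. The helper below is a minimal sketch of such a check. The name `is_zero_vector` is hypothetical and not part of the notebook or of any library.

```python
def is_zero_vector(vector: list[float]) -> bool:
    """Return True when every component is 0.0 (an empty list also counts as zero)."""
    return all(value == 0.0 for value in vector)


# A fallback vector like the one returned for empty text or on error.
fallback = [0.0] * 1536
# A stand-in for a real embedding; the values are made up.
real_like = [0.0, 0.12, -0.03]

print(is_zero_vector(fallback))
print(is_zero_vector(real_like))
```

A zero vector has no direction, so similarity scores computed against it carry no information. Checking for it before running a query makes a silent embedding failure visible.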


Initialize the AzureAISearchContextProvider

In [22]:
search_provider = AzureAISearchContextProvider(
    endpoint=search_endpoint,
    index_name=index_name,
    api_key=search_key,
    mode="semantic",
    top_k=3,  # Retrieve top 3 most relevant documents
    vector_field_name="BodyEmbeddings",
    embedding_function=get_embeddings,
)

In [23]:
# Sample queries to demonstrate RAG
USER_INPUTS = [
    #"What problems are there with Surface devices?",
    "What sort of AWS problems have been reported?",
    #"Are there any issues logged for Dell XPS laptops?",
    #"Do we have more issues with MacBook Air computers or Dell XPS laptops?",
    #"What issues do we have with dell xps laptops?",
    #"What issues are for Dell XPS laptops and the user tried Win + Ctrl + Shift + B?",
    #"How many tickets were logged and Incidents for Human Resources and low priority?",
    #"Which Dell XPS issue does not mention Windows?",
    #"What department had consultants with Login Issues?",
]

In [24]:
agent = ChatAgent(
    chat_client=AzureOpenAIChatClient(credential=DefaultAzureCredential()),
    name="SearchAgent",
    instructions=(
        "You are a helpful assistant. Use the provided context from the knowledge base to answer questions accurately."
    ),
    context_providers=[search_provider]
)

In [27]:
import logging

# Reduce logging verbosity for Azure Search
logging.getLogger('azure.search').setLevel(logging.ERROR)

print("=== Azure AI Agent with Search Context (Semantic Mode) ===\n")

for user_input in USER_INPUTS:
    print(f"User: {user_input}")
    print("Agent: ", end="", flush=True)

    # Stream response
    async for chunk in agent.run_stream(user_input):
        if chunk.text:
            print(chunk.text, end="", flush=True)
    
    print("\n")
    print("=" * 80)

=== Azure AI Agent with Search Context (Semantic Mode) ===

User: What sort of AWS problems have been reported?
Agent: The reported AWS problems include:

1. An AWS service outage causing disruption and requiring a swift fix.
2. Issues with deployment and infrastructure optimization using AWS Management Service, possibly due to misconfiguration or resource allocation challenges.
3. Critical scalability and reliability problems in AWS infrastructure, including inconsistent elasticity in instance scaling and delays in deployment operations, impacting operational efficiency and customer satisfaction.



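Experiment with the questions in `USER_INPUTS`. You should find that the answers are better than those from the simple RAG approach in Notebook 2.

However, it still won't answer "How many tickets were logged and Incidents for Human Resources and low priority?" correctly. That question asks for a count across the whole index, while the provider passes only the top 3 retrieved documents (`top_k=3`) to the model.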
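
The counting limitation can be reproduced without any Azure service. The sketch below uses made-up ticket records and a stand-in for retrieval. The field names, the data, and the helper functions are all assumptions for illustration and do not reflect the real index schema.

```python
# Hypothetical ticket records; the field names are illustrative only.
TICKETS = [
    {"id": 1, "department": "Human Resources", "type": "Incident", "priority": "low"},
    {"id": 2, "department": "Human Resources", "type": "Incident", "priority": "low"},
    {"id": 3, "department": "IT", "type": "Incident", "priority": "high"},
    {"id": 4, "department": "Human Resources", "type": "Incident", "priority": "low"},
    {"id": 5, "department": "Human Resources", "type": "Request", "priority": "low"},
    {"id": 6, "department": "Human Resources", "type": "Incident", "priority": "low"},
    {"id": 7, "department": "Human Resources", "type": "Incident", "priority": "low"},
]


def matches(ticket: dict, criteria: dict) -> bool:
    """Return True when the ticket has every requested field value."""
    return all(ticket.get(field) == value for field, value in criteria.items())


def retrieve_top_k(tickets: list[dict], criteria: dict, k: int) -> list[dict]:
    """Stand-in for retrieval: return at most k matching tickets."""
    return [ticket for ticket in tickets if matches(ticket, criteria)][:k]


def count_matching(tickets: list[dict], criteria: dict) -> int:
    """Count every matching ticket, which requires scanning the full collection."""
    return sum(matches(ticket, criteria) for ticket in tickets)


criteria = {"department": "Human Resources", "type": "Incident", "priority": "low"}
context = retrieve_top_k(TICKETS, criteria, k=3)

print(f"Tickets visible to the model: {len(context)}")
print(f"Tickets that actually match: {count_matching(TICKETS, criteria)}")
```

Even when retrieval is perfect, the model sees only `k` documents, so any count it reports is bounded by `k`. Questions of this kind generally need a structured filter or aggregation over the full dataset rather than better ranking.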