# Azure AI Search Knowledge Agents with Blob Knowledge Source

> Open the project in VS Code from the `ignite25-LAB511-build-knowledge-agents...` folder. If you open it from the `LAB511` folder, you will get errors caused by missing dependencies.

This Python notebook demonstrates how to use [Azure AI Search Knowledge Agents](https://learn.microsoft.com/azure/search/search-agentic-retrieval-overview) with a [blob knowledge source](https://learn.microsoft.com/azure/search/search-knowledge-source-how-to-blob) to build intelligent retrieval applications.

## What You'll Learn

In this lab, you will:
* Connect to an existing blob knowledge source that contains indexed documents
* Create a knowledge agent that uses Azure OpenAI models for intelligent retrieval
* Query the knowledge agent with natural language to get grounded, citation-backed answers
* Explore the agent's activity and reasoning process

## Architecture Overview

The lab environment has already provisioned:
* **Azure Blob Storage** - Contains your source documents (PDFs about employee benefits)
* **Azure AI Search** - Hosts the knowledge source with vectorized and indexed content
* **Azure OpenAI** - Provides embedding and chat completion models

The blob knowledge source uses [integrated vectorization](https://learn.microsoft.com/azure/search/vector-search-integrated-vectorization) to automatically:
* Generate embeddings using Azure OpenAI's `text-embedding-3-large` model
* Create a search index optimized for semantic search
* Connect your documents to the knowledge agent for intelligent retrieval

## Prerequisites

The following resources have already been created by the lab setup:
- ✅ Azure Blob Storage with uploaded documents
- ✅ Azure AI Search service with knowledge source configured
- ✅ Azure OpenAI with deployed models
- ✅ Indexed documents ready for querying

Let's get started!

## Step 1: Load Environment Variables

Before we begin, we need to load the configuration for your Azure resources. The lab setup has created a `.env` file at the workspace root containing:
* Azure AI Search endpoint and credentials
* Azure OpenAI endpoint, API key, and model deployment names
* Azure Blob Storage connection information
* Knowledge source and agent names

Run the cell below to load these environment variables into your Python session.

**Note:** This example supports image verbalization, which uses an LLM to describe images embedded in your source content. To enable it, set `USE_VERBALIZATION` to `true` in the `.env` file (the default is `false`).

In [None]:
from dotenv import load_dotenv
from azure.identity.aio import DefaultAzureCredential
from azure.core.credentials import AzureKeyCredential
import os

load_dotenv(override=True) # take environment variables from .env.

# Variables not used here do not need to be updated in your .env file
endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
credential = AzureKeyCredential(os.getenv("AZURE_SEARCH_ADMIN_KEY")) if os.getenv("AZURE_SEARCH_ADMIN_KEY") else DefaultAzureCredential()
knowledge_source_name = os.getenv("AZURE_SEARCH_KNOWLEDGE_SOURCE", "blob-knowledge-source")
knowledge_agent_name = os.getenv("AZURE_SEARCH_KNOWLEDGE_AGENT", "blob-knowledge-agent")
blob_connection_string = os.environ["BLOB_CONNECTION_STRING"]
# The search blob data source connection string is optional and defaults to the blob connection string.
# Set it only if you use a managed identity (MI) to connect to the data source.
# https://learn.microsoft.com/azure/search/search-howto-indexing-azure-blob-storage#supported-credentials-and-connection-strings
search_blob_connection_string = os.getenv("SEARCH_BLOB_DATASOURCE_CONNECTION_STRING", blob_connection_string)
blob_container_name = os.getenv("BLOB_CONTAINER_NAME", "documents")
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
azure_openai_key = os.getenv("AZURE_OPENAI_KEY")
azure_openai_embedding_deployment = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT", "text-embedding-3-large")
azure_openai_embedding_model_name = os.getenv("AZURE_OPENAI_EMBEDDING_MODEL_NAME", "text-embedding-3-large")
azure_openai_chatgpt_deployment = os.getenv("AZURE_OPENAI_CHATGPT_DEPLOYMENT", "gpt-5")
azure_openai_chatgpt_model_name = os.getenv("AZURE_OPENAI_CHATGPT_MODEL_NAME", "gpt-5")
use_verbalization = os.getenv("USE_VERBALIZATION", "false") == "true"
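
The cell above reads some settings with `os.environ[...]`, so a missing entry surfaces as a bare `KeyError`. A minimal sketch of a friendlier pre-check follows, assuming you want to list everything that is missing at once. The helpers `missing_settings` and `env_flag` are hypothetical and not part of the lab code; the variable names are the ones the cell above reads.

```python
import os

# Settings the cell above reads with os.environ[...] (no default, so they must exist).
REQUIRED_SETTINGS = (
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "BLOB_CONNECTION_STRING",
    "AZURE_OPENAI_ENDPOINT",
)

def missing_settings(env=os.environ, required=REQUIRED_SETTINGS):
    """Return the required setting names that are absent or empty in env."""
    return [name for name in required if not env.get(name, "").strip()]

def env_flag(name, default=False, env=os.environ):
    """Read a boolean flag, accepting 'true', '1', or 'yes' in any letter case."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"true", "1", "yes"}
```

Calling `missing_settings()` after `load_dotenv(...)` returns an empty list when the `.env` file is complete. `env_flag("USE_VERBALIZATION")` is a more forgiving alternative to the exact `== "true"` comparison.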


## Step 2: Retrieve the Existing Knowledge Source

The lab setup has already created a [blob knowledge source](https://learn.microsoft.com/azure/search/search-knowledge-source-how-to-blob) on Azure AI Search and uploaded documents to Azure Blob Storage. The indexer has processed and vectorized the documents, making them ready for intelligent retrieval.

In this step, we will:
1. **Define the chat model** - Configure the Azure OpenAI model that the knowledge agent uses for query planning and answer generation
2. **Retrieve the knowledge source** - Connect to the existing knowledge source created during lab setup
3. **Check indexer status** - Verify that the documents have been successfully indexed

This ensures that all components are ready before creating the knowledge agent.
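
If the indexer is still running when you reach this step, one option is to poll until its last run finishes. The sketch below is a hedged illustration, not part of the lab: `wait_for_indexer` and its parameters are hypothetical, and the `inProgress` status string is assumed to match what the indexer status API reports for a run that has not finished. It accepts any async callable, so it does not depend on the SDK.

```python
import asyncio

async def wait_for_indexer(get_last_run_status, poll_seconds=5.0, max_polls=60):
    """Poll an async status callable until the last indexer run leaves 'inProgress'.

    get_last_run_status: async callable returning a status string, or None if
    the indexer has not run yet. Returns the final status string.
    """
    for _ in range(max_polls):
        status = await get_last_run_status()
        # None means no run has been recorded yet; keep waiting in both cases.
        if status is not None and status != "inProgress":
            return status
        await asyncio.sleep(poll_seconds)
    raise TimeoutError(f"Indexer still not finished after {max_polls} polls")
```

In the notebook you could pass a small wrapper that awaits `indexer_client.get_indexer_status(indexer_name)` and returns `last_result.status` (or `None` when `last_result` is empty), then `await wait_for_indexer(...)` directly because the notebook already runs an event loop.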

In [None]:
from azure.search.documents.indexes.models import AzureBlobKnowledgeSource, AzureOpenAIVectorizer, AzureOpenAIVectorizerParameters, KnowledgeAgentAzureOpenAIModel
from azure.search.documents.indexes.aio import SearchIndexClient, SearchIndexerClient

# Define the chat model (needed for agent creation)
chat_model = KnowledgeAgentAzureOpenAIModel(
    azure_open_ai_parameters=AzureOpenAIVectorizerParameters(
        resource_url=azure_openai_endpoint,
        deployment_name=azure_openai_chatgpt_deployment,
        api_key=azure_openai_key,
        model_name=azure_openai_chatgpt_model_name
    )
)

# Retrieve the existing knowledge source
async with SearchIndexClient(endpoint=endpoint, credential=credential) as index_client:
    knowledge_source = await index_client.get_knowledge_source(knowledge_source_name)
    print(f"Retrieved existing knowledge source: {knowledge_source.name}")

# Check indexer status
async with SearchIndexerClient(endpoint=endpoint, credential=credential) as indexer_client:
    indexer_name = f"{knowledge_source_name}-indexer"
    indexer_status = await indexer_client.get_indexer_status(indexer_name)
    print(f"Indexer status: {indexer_status.status}")
    print(f"Last run status: {indexer_status.last_result.status if indexer_status.last_result else 'Not run yet'}")

Retrieved existing knowledge source: blob-knowledge-source


## Step 3: Create a Knowledge Agent

Now we'll create a [knowledge agent](https://learn.microsoft.com/azure/search/search-agentic-retrieval-overview) that acts as an intelligent wrapper around your knowledge source and LLM deployment. The agent coordinates the retrieval and generation process to provide grounded answers.

### Output Modality Options

The agent supports two output modalities:
* **`EXTRACTIVE_DATA`** (default) - Returns exact content from your knowledge sources without generative alteration
* **`ANSWER_SYNTHESIS`** (used here) - Generates natural language answers using the LLM that cite the retrieved content

We're using `ANSWER_SYNTHESIS` with `include_activity=True` to get:
* **LLM-generated answers** that cite retrieved documents
* **Detailed activity logs** showing the agent's reasoning process, including subqueries and reranking decisions

Learn more about [answer synthesis in Azure AI Search](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-synthesize).

In [None]:
from azure.search.documents.indexes.models import KnowledgeAgent, KnowledgeSourceReference, KnowledgeAgentOutputConfiguration, KnowledgeAgentOutputConfigurationModality

output_config = KnowledgeAgentOutputConfiguration(
    modality=KnowledgeAgentOutputConfigurationModality.ANSWER_SYNTHESIS,
    include_activity=True
)

agent = KnowledgeAgent(
    name=knowledge_agent_name,
    models=[chat_model],
    knowledge_sources=[
        KnowledgeSourceReference(
            name=knowledge_source.name,
            include_reference_source_data=True,
            always_query_source=True
        )
    ],
    output_configuration=output_config
)

async with SearchIndexClient(endpoint=endpoint, credential=credential) as index_client:
    await index_client.create_or_update_agent(agent)
    print(f"Created knowledge agent: {agent.name}")

Created knowledge agent: blob-knowledge-agent


## Step 4: Query the Knowledge Agent

Now let's use agentic retrieval to query our documents! This step demonstrates how the knowledge agent processes queries to produce grounded, citation-backed answers.

### How Agentic Retrieval Works

Given a user query and conversation history, the knowledge agent:
1. **Analyzes the conversation** - Infers the user's information need from the full context
2. **Decomposes the query** - Breaks complex queries into focused subqueries
3. **Runs subqueries concurrently** - Executes the subqueries in parallel against your knowledge source
4. **Reranks semantically** - Uses the semantic ranker to rerank and filter results for relevance
5. **Synthesizes the answer** - Combines the top results into a natural-language answer with citations
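
To make these five stages concrete, here is a toy, self-contained sketch of the same flow over an in-memory corpus. It is not how the service is implemented: a naive split on " and " stands in for LLM query planning, word overlap stands in for the semantic ranker, and string concatenation stands in for answer synthesis. All function names and the sample corpus are hypothetical.

```python
import asyncio

# Hypothetical in-memory "knowledge source": chunk id -> text.
CORPUS = {
    0: "Northwind Health Plus covers emergency services in and out of network.",
    1: "Northwind Standard covers preventive care and generic prescription drugs.",
    2: "The company picnic is held every summer.",
}

def plan_subqueries(query):
    """Stand-in for LLM query planning: split a comparison on ' and '."""
    parts = [p.strip() for p in query.split(" and ") if p.strip()]
    return parts or [query]

async def run_subquery(subquery):
    """Stand-in for one search call: return (chunk_id, overlap) pairs."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    terms = set(subquery.lower().split())
    hits = []
    for chunk_id, text in CORPUS.items():
        overlap = len(terms & set(text.lower().replace(".", "").split()))
        if overlap:
            hits.append((chunk_id, overlap))
    return hits

def rerank(hit_lists, top=2):
    """Stand-in for reranking: keep each chunk's best score, sort descending."""
    best = {}
    for hits in hit_lists:
        for chunk_id, score in hits:
            best[chunk_id] = max(score, best.get(chunk_id, 0))
    return sorted(best, key=lambda cid: (-best[cid], cid))[:top]

async def retrieve(query):
    """Run the toy pipeline: plan, search concurrently, rerank, synthesize."""
    subqueries = plan_subqueries(query)
    # asyncio.gather runs all subqueries concurrently and preserves their order.
    hit_lists = await asyncio.gather(*(run_subquery(s) for s in subqueries))
    ranked = rerank(hit_lists)
    answer = " ".join(f"{CORPUS[cid]} [ref_id:{cid}]" for cid in ranked)
    return subqueries, ranked, answer
```

The point of the sketch is the shape of the pipeline: independent subqueries fan out concurrently, their results are merged and reranked once, and only the top chunks reach the synthesis step.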

### Example Query

The example below asks about differences between two employee benefit plans. The agent will:
* Search across the indexed PDF documents
* Find relevant sections from both plans
* Generate a comparative answer with specific citations

Try running the query, then modify it to ask your own questions!

In [None]:
from azure.search.documents.agent.aio import KnowledgeAgentRetrievalClient
from azure.search.documents.agent.models import KnowledgeAgentRetrievalRequest, KnowledgeAgentMessage, KnowledgeAgentMessageTextContent, SearchIndexKnowledgeSourceParams

messages = [
    KnowledgeAgentMessage(
        role="user",
        content=[KnowledgeAgentMessageTextContent(
            text="Differences between Northwind Health Plus and Standard"
        )]
    )
]

# Use the client as an async context manager so its connection is closed after the call
async with KnowledgeAgentRetrievalClient(endpoint=endpoint, agent_name=knowledge_agent_name, credential=credential) as agent_client:
    result = await agent_client.retrieve(KnowledgeAgentRetrievalRequest(messages=messages))
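
Because the agent takes conversation history into account, a follow-up question can be sent together with the earlier turns. The sketch below is one hedged way to manage that history as plain `(role, text)` pairs and convert them to the message classes used in the cell above. The helper names are hypothetical, and restricting roles to `user` and `assistant` is an assumption of this sketch rather than a documented limit.

```python
VALID_ROLES = ("user", "assistant")  # assumption of this sketch

def add_turn(history, role, text):
    """Return a new history list with one (role, text) turn appended."""
    if role not in VALID_ROLES:
        raise ValueError(f"Unsupported role: {role!r}")
    return [*history, (role, text)]

def to_agent_messages(history):
    """Convert (role, text) pairs to the SDK message objects used in the cell above."""
    # Imported lazily so add_turn stays usable without the SDK installed.
    from azure.search.documents.agent.models import (
        KnowledgeAgentMessage,
        KnowledgeAgentMessageTextContent,
    )
    return [
        KnowledgeAgentMessage(
            role=role,
            content=[KnowledgeAgentMessageTextContent(text=text)],
        )
        for role, text in history
    ]
```

A follow-up turn would then append the previous answer as an `assistant` turn and the new question as a `user` turn, and pass `to_agent_messages(history)` as `messages` in a new `KnowledgeAgentRetrievalRequest`.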

## Step 5: Review the Retrieval Response

The knowledge agent's response contains three key components that provide full transparency into the retrieval process:

### 1. Response Content (Answer)
An LLM-generated answer to your query that cites the retrieved documents. This is the primary output you'd show to end users.

### 2. Activity Content (Reasoning)
Detailed planning and execution information showing:
* **Subqueries generated** - How the agent broke down your query
* **Reranking decisions** - Which results were promoted or filtered
* **Intermediate steps** - The agent's thought process and execution flow

### 3. References Content (Sources)
Source documents and text chunks that contributed to the answer, including:
* Document names and metadata
* Specific text passages used
* Relevance scores and rankings

### Why This Matters

These three components enable you to:
* **Verify grounding** - Ensure answers are based on your actual content
* **Build traceable citations** - Create links back to source documents
* **Debug and optimize** - Understand why certain results were retrieved
* **Tune retrieval parameters** - Adjust reranker thresholds and knowledge source settings

**Tip:** Retrieval parameters (like reranker thresholds and knowledge source parameters) influence how aggressively your agent reranks results and which sources it queries. Inspect the activity and references to validate grounding quality.

Let's examine each component:

In [None]:
print(result.response[0].content[0].text)

Northwind Health Plus and Northwind Standard differ in several key areas:

- **Coverage Scope**: Northwind Health Plus is a comprehensive plan that covers medical, vision, and dental services, as well as prescription drugs, mental health and substance abuse services, preventive care, and emergency services (both in-network and out-of-network). Northwind Standard is a more basic plan that covers medical, vision, and dental services, preventive care, and prescription drugs, but does not cover emergency services, mental health and substance abuse services, or out-of-network services [ref_id:0][ref_id:2][ref_id:1].

- **Prescription Drugs**: Northwind Health Plus covers a wider range of prescription drugs, including generic, brand-name, and specialty drugs. Northwind Standard only covers generic and brand-name drugs [ref_id:0].

- **Vision and Dental**: Both plans cover vision and dental services, but Northwind Health Plus includes coverage for vision exams, glasses, contact lenses, and de
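
The answer text carries inline markers such as `[ref_id:0]`. As a minimal sketch of how you might verify grounding, the helpers below extract those markers and compare them with the references list. The function names are hypothetical, and the code assumes each reference dictionary exposes an `id` key that matches the number in the marker.

```python
import re

REF_PATTERN = re.compile(r"\[ref_id:(\d+)\]")

def cited_ref_ids(answer_text):
    """Return the distinct ref ids cited in the answer, in order of first appearance."""
    seen = []
    for match in REF_PATTERN.finditer(answer_text):
        ref_id = match.group(1)
        if ref_id not in seen:
            seen.append(ref_id)
    return seen

def check_citations(answer_text, references):
    """Compare cited ids with reference dicts (each assumed to carry an 'id' key).

    Returns (unknown, unused): ids cited without a matching reference, and
    reference ids that the answer never cites.
    """
    cited = cited_ref_ids(answer_text)
    known = {str(ref["id"]) for ref in references}
    unknown = [ref_id for ref_id in cited if ref_id not in known]
    unused = sorted(known - set(cited))
    return unknown, unused
```

In the notebook, this could be called as `check_citations(result.response[0].content[0].text, [r.as_dict() for r in result.references])`. A non-empty `unknown` list would indicate a citation that cannot be traced back to a source.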

In [None]:
import json

# Activity -> JSON string of activity as list of dicts

activity_content = json.dumps([a.as_dict() for a in result.activity], indent=2)
print("activity_content:\n", activity_content, "\n")

activity_content:
 [
  {
    "id": 0,
    "type": "modelQueryPlanning",
    "elapsed_ms": 913,
    "input_tokens": 2021,
    "output_tokens": 48
  },
  {
    "id": 1,
    "type": "azureBlob",
    "elapsed_ms": 555,
    "knowledge_source_name": "blob-knowledge-source",
    "query_time": "2025-09-09T05:40:40.117Z",
    "count": 50,
    "azure_blob_arguments": {
      "search": "Differences between Northwind Health Plus and Northwind Health Standard plans"
    }
  },
  {
    "id": 2,
    "type": "semanticReranker",
    "input_tokens": 0
  },
  {
    "id": 3,
    "type": "modelAnswerSynthesis",
    "elapsed_ms": 3968,
    "input_tokens": 7010,
    "output_tokens": 402
  }
] 
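
The activity list is convenient for quick cost and latency checks. The sketch below totals the timing and token fields across steps and finds the slowest step. `summarize_activity` is a hypothetical helper; the sample data copies only the numeric fields from the output above, and steps that omit a field count as zero.

```python
def summarize_activity(activity):
    """Total elapsed time and token counts over a list of activity dicts."""
    totals = {"elapsed_ms": 0, "input_tokens": 0, "output_tokens": 0}
    for step in activity:
        for key in totals:
            # Not every step reports every field (the reranker has no elapsed_ms).
            totals[key] += step.get(key, 0)
    slowest = max(activity, key=lambda step: step.get("elapsed_ms", 0))
    totals["slowest_step"] = slowest["type"]
    return totals

# Numeric fields copied from the activity output above.
sample_activity = [
    {"id": 0, "type": "modelQueryPlanning", "elapsed_ms": 913, "input_tokens": 2021, "output_tokens": 48},
    {"id": 1, "type": "azureBlob", "elapsed_ms": 555, "count": 50},
    {"id": 2, "type": "semanticReranker", "input_tokens": 0},
    {"id": 3, "type": "modelAnswerSynthesis", "elapsed_ms": 3968, "input_tokens": 7010, "output_tokens": 402},
]

summary = summarize_activity(sample_activity)
print(summary)
```

For this run, answer synthesis accounts for most of the elapsed time and most of the input tokens. In the notebook you could pass `[a.as_dict() for a in result.activity]` instead of the sample list.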



In [None]:
# References -> JSON string of references as list of dicts
references_content = json.dumps([r.as_dict() for r in result.references], indent=2)
print("references_content:\n", references_content, "\n")