# Quickstart: Agentic retrieval in Azure AI Search

Use this notebook to get started with [agentic retrieval](https://learn.microsoft.com/azure/search/search-agentic-retrieval-concept) in Azure AI Search, which integrates an Azure OpenAI chat completion model to process queries, retrieve relevant content from indexed documents, and generate natural-language answers.

Steps in this notebook include:

1. Creating and loading an `earth-at-night` search index.

1. Creating an `earth-knowledge-source` that targets your index.

1. Creating an `earth-knowledge-agent` that targets your knowledge source and an LLM for query planning and answer synthesis.

1. Using the agent to fetch, rank, and synthesize relevant information from the index.

This notebook provides a high-level demonstration of agentic retrieval. For more detailed guidance, see [Quickstart: Run agentic retrieval in Azure AI Search](https://learn.microsoft.com/azure/search/search-get-started-agentic-retrieval).

## Prerequisites

+ An [Azure AI Search service](https://learn.microsoft.com/azure/search/search-create-service-portal) on the Basic tier or higher with [semantic ranker enabled](https://learn.microsoft.com/azure/search/semantic-how-to-enable-disable).

+ An [Azure AI Foundry project](https://learn.microsoft.com/azure/ai-foundry/how-to/create-projects) and Azure AI Foundry resource. When you create a project, the resource is automatically created.

+ A [supported chat completion model](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-create#supported-models). This sample uses `gpt-5-mini`.

+ A text embedding model. This sample uses `text-embedding-3-large`.

+ [Visual Studio Code](https://code.visualstudio.com/download) with the [Python extension](https://marketplace.visualstudio.com/items?itemName=ms-python.python) and [Jupyter package](https://pypi.org/project/jupyter/).

## Configure access

This notebook assumes that you're using Microsoft Entra ID for authentication and role assignments for authorization.

To configure role-based access:

1. Sign in to the [Azure portal](https://portal.azure.com).

1. On your Azure AI Search service:

    1. [Enable role-based access](https://learn.microsoft.com/azure/search/search-security-enable-roles).
    
    1. [Create a system-assigned managed identity](https://learn.microsoft.com/azure/search/search-howto-managed-identities-data-sources#create-a-system-managed-identity).
    
    1. [Assign the following roles](https://learn.microsoft.com/azure/search/search-security-rbac#how-to-assign-roles-in-the-azure-portal) to yourself.
    
       + **Search Service Contributor**
    
       + **Search Index Data Contributor**
    
       + **Search Index Data Reader**

1. On your Azure AI Foundry resource, assign **Cognitive Services User** to the managed identity of your search service.

## Set up connections

The `sample.env` file contains environment variables for connections to Azure AI Search and Azure OpenAI in Azure AI Foundry. Agentic retrieval requires these connections for document retrieval, query planning, and query execution.

To set up the connections:

1. Sign in to the [Azure portal](https://portal.azure.com).

1. Get the endpoints for Azure AI Search (`https://your-search-service.search.windows.net`) and Azure OpenAI in Azure AI Foundry (`https://your-foundry-resource.openai.azure.com`).

1. Save the `sample.env` file as `.env` on your local system.

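1. Update the `.env` file with the retrieved endpoints. The code cells in this notebook also read `SEARCH_API_KEY`, `AOAI_API_KEY`, and `BLOB_CONNECTION_STRING` from this file, so add those values as well.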
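The code cells in this notebook read each setting with `os.environ[...]`, so a missing variable only surfaces as a `KeyError` partway through. The following is a minimal sketch of an up-front check, assuming the variable names that appear in this notebook's code cells. The `missing_settings` helper and the `REQUIRED_SETTINGS` list are hypothetical and not part of any Azure SDK.

```python
import os

# Variable names read by the code cells in this notebook
REQUIRED_SETTINGS = [
    "SEARCH_ENDPOINT", "SEARCH_API_KEY", "SEARCH_API_VERSION",
    "AOAI_ENDPOINT", "AOAI_API_KEY",
    "AOAI_EMBEDDING_MODEL", "AOAI_EMBEDDING_DEPLOYMENT",
    "AOAI_GPT_MODEL", "AOAI_GPT_DEPLOYMENT",
    "INDEX_NAME", "KNOWLEDGE_SOURCE_NAME", "KNOWLEDGE_AGENT_NAME",
    "BLOB_CONNECTION_STRING",
]

def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the required names that are absent or blank in `env`."""
    return [name for name in required if not str(env.get(name, "")).strip()]

# Check the current process environment (run this after load_dotenv in the notebook)
missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present.")
```

Running this check right after loading the `.env` file reports every missing value at once instead of one `KeyError` at a time.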

## Create a virtual environment

The `requirements.txt` file contains the dependencies for this notebook. You can use a virtual environment to install these dependencies in isolation.

To create a virtual environment:

1. In Visual Studio Code, open the folder that contains `quickstart-agentic-retrieval.ipynb`.

1. Press **Ctrl**+**Shift**+**P** to open the command palette.

1. Search for **Python: Create Environment**, and then select **Venv**.

1. Select a Python installation. We tested this notebook on Python 3.13.7.

1. Select `requirements.txt` for the dependencies.

Creating the virtual environment can take several minutes. When the environment is ready, proceed to the next step.
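To confirm that the notebook kernel is actually using the new environment, a quick check along these lines may help. It's a small sketch that relies only on the standard library. The `in_virtual_environment` helper is a hypothetical name.

```python
import sys

def in_virtual_environment():
    """Return True when the interpreter runs inside a venv-style environment."""
    # In a venv, sys.prefix points at the environment and sys.base_prefix at the base install
    return sys.prefix != sys.base_prefix

print("Python version:", ".".join(map(str, sys.version_info[:3])))
print("Virtual environment active:", in_virtual_environment())
```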

## Install packages and load connections

This step installs the packages for this notebook and establishes connections to Azure AI Search and Azure OpenAI in Azure AI Foundry.

In [1]:
! pip install -r requirements.txt --quiet

In [1]:
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
import os

# Take environment variables from .env
load_dotenv(override=True)

# This notebook uses the following variables from your .env file
search_endpoint = os.environ["SEARCH_ENDPOINT"]
credential = DefaultAzureCredential()  # Microsoft Entra ID credential (later cells use API keys instead)
aoai_endpoint = os.environ["AOAI_ENDPOINT"]
aoai_api_key = os.environ["AOAI_API_KEY"]  # API key for the Azure OpenAI deployments
aoai_embedding_model = os.environ["AOAI_EMBEDDING_MODEL"]
aoai_embedding_deployment = os.environ["AOAI_EMBEDDING_DEPLOYMENT"]
aoai_gpt_model = os.environ["AOAI_GPT_MODEL"]
aoai_gpt_deployment = os.environ["AOAI_GPT_DEPLOYMENT"]
index_name = os.environ["INDEX_NAME"]
knowledge_source_name = os.environ["KNOWLEDGE_SOURCE_NAME"]
knowledge_agent_name = os.environ["KNOWLEDGE_AGENT_NAME"]
search_api_version = os.environ["SEARCH_API_VERSION"]

## Create a search index

This step creates an index that contains plain text and vector content. You can use an existing index, but it must meet the criteria for [agentic retrieval workloads](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-index). The primary schema requirement is a semantic configuration with a `default_configuration_name`.

In [111]:
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    VectorSearch,
    VectorSearchProfile,
    HnswAlgorithmConfiguration,
    AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters,
    SemanticSearch,
    SemanticConfiguration,
    SemanticPrioritizedFields,
    SemanticField,
)
from azure.search.documents.indexes import SearchIndexClient
from azure.core.credentials import AzureKeyCredential
from openai import AzureOpenAI

# Load API keys from environment
aoai_api_key = os.environ["AOAI_API_KEY"]
search_api_key = os.environ["SEARCH_API_KEY"]

# Create search credential using API key
search_credential = AzureKeyCredential(search_api_key)

index = SearchIndex(
    name=index_name,
    fields=[
        # Original fields
        SearchField(name="id", type="Edm.String", key=True, filterable=True, sortable=True, facetable=True),
        SearchField(name="page_chunk", type="Edm.String", filterable=False, sortable=False, facetable=False),
        SearchField(name="page_embedding_text_3_large", type="Collection(Edm.Single)", stored=False, vector_search_dimensions=3072, vector_search_profile_name="hnsw_text_3_large"),
        SearchField(name="page_number", type="Edm.Int32", filterable=True, sortable=True, facetable=True),

        # Video metadata fields
        SearchField(name="content_type", type="Edm.String", filterable=True, facetable=True),  # "document" or "video"
        SearchField(name="video_id", type="Edm.String", filterable=True, sortable=True, facetable=True),
        SearchField(name="title", type="Edm.String", searchable=True, filterable=True, sortable=True),
        SearchField(name="description", type="Edm.String", searchable=True, filterable=False),
        SearchField(name="duration", type="Edm.Double", filterable=True, sortable=True, facetable=True),
        SearchField(name="category", type="Edm.String", filterable=True, facetable=True, searchable=True),
        SearchField(name="tags", type="Collection(Edm.String)", searchable=True, filterable=True),
        SearchField(name="instructor", type="Edm.String", filterable=True, facetable=True, searchable=True),
        SearchField(name="upload_date", type="Edm.DateTimeOffset", filterable=True, sortable=True),
        # SearchField has no `retrievable` attribute; `hidden=False` keeps the field returnable in results
        SearchField(name="thumbnail_url", type="Edm.String", hidden=False, stored=True),
        SearchField(name="video_url", type="Edm.String", hidden=False, stored=True),
        SearchField(name="transcript", type="Edm.String", searchable=True, filterable=False),  # For video transcripts if available
        SearchField(name="level", type="Edm.String", filterable=True, facetable=True)  # beginner, intermediate, advanced
    ],
    vector_search=VectorSearch(
        profiles=[VectorSearchProfile(name="hnsw_text_3_large", algorithm_configuration_name="alg", vectorizer_name="text-embedding-3-large")],
        algorithms=[HnswAlgorithmConfiguration(name="alg")],
        vectorizers=[
            AzureOpenAIVectorizer(
                vectorizer_name="text-embedding-3-large",
                parameters=AzureOpenAIVectorizerParameters(
                    resource_url=aoai_endpoint,
                    deployment_name=aoai_embedding_deployment,
                    model_name=aoai_embedding_model,
                    api_key=aoai_api_key
                )
            )
        ]
    ),
    semantic_search=SemanticSearch(
        default_configuration_name="semantic_config",
        configurations=[
            SemanticConfiguration(
                name="semantic_config",
                prioritized_fields=SemanticPrioritizedFields(
                    title_field=SemanticField(field_name="title"),  # Title field for semantic ranking
                    content_fields=[
                        SemanticField(field_name="page_chunk"),
                        SemanticField(field_name="description"),  # Video descriptions
                        SemanticField(field_name="transcript")    # Video transcripts
                    ],
                    keywords_fields=[
                        SemanticField(field_name="tags"),        # Video tags
                        SemanticField(field_name="category"),    # Video categories
                        SemanticField(field_name="video_url")    # Video URLs
                    ]
                )
            )
        ]
    )
)

# Use API key credential instead of managed identity
index_client = SearchIndexClient(endpoint=search_endpoint, credential=search_credential)
index_client.create_or_update_index(index)
print(f"Index '{index_name}' created or updated successfully.")

Index 'earth-at-night' created or updated successfully.
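Because agentic retrieval depends on the default semantic configuration, it can be useful to sanity-check that configuration before creating the index. The sketch below works on plain Python sets and dictionaries rather than SDK objects. The `check_semantic_config` helper and its simplified input shape are assumptions for illustration, while the field names are copied from the index definition above.

```python
def check_semantic_config(field_names, default_name, configurations):
    """Return a list of problems found in a simplified semantic configuration.

    `configurations` maps a configuration name to the field names it references.
    """
    problems = []
    if default_name not in configurations:
        problems.append(f"default configuration '{default_name}' is not defined")
    for config_name, referenced in configurations.items():
        for field in referenced:
            if field not in field_names:
                problems.append(f"'{config_name}' references unknown field '{field}'")
    return problems

# Field names and semantic fields taken from the index definition above
fields = {"id", "page_chunk", "page_embedding_text_3_large", "page_number",
          "content_type", "video_id", "title", "description", "duration",
          "category", "tags", "instructor", "upload_date", "thumbnail_url",
          "video_url", "transcript", "level"}
configs = {"semantic_config": ["title", "page_chunk", "description",
                               "transcript", "tags", "category", "video_url"]}

print(check_semantic_config(fields, "semantic_config", configs))  # prints []
```

An empty list means the default name resolves to a defined configuration and every referenced field exists in the schema.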


In [113]:
from azure.search.documents.indexes.models import SearchIndexerDataSourceConnection, SearchIndexerDataContainer
from azure.search.documents.indexes import SearchIndexerClient

# Load blob connection string from environment
blob_connection_string = os.environ["BLOB_CONNECTION_STRING"]

# Create SearchIndexerClient (not SearchIndexClient)
indexer_client = SearchIndexerClient(endpoint=search_endpoint, credential=search_credential)

# Create data source for your blob storage
blob_data_source = SearchIndexerDataSourceConnection(
    name="training-videos-datasource",
    type="azureblob",
    connection_string=blob_connection_string,
    container=SearchIndexerDataContainer(name="training-videos", query="metadata")
)

indexer_client.create_or_update_data_source_connection(blob_data_source)
print("Blob data source created successfully.")

Blob data source created successfully.


In [112]:
from azure.core.exceptions import ResourceNotFoundError
from azure.search.documents.indexes.models import SearchIndexer, FieldMapping

# Delete the existing indexer, if any, so it can be recreated with the field mappings below
try:
    indexer_client.delete_indexer("training-videos-indexer")
except ResourceNotFoundError:
    pass  # Nothing to delete on a first run

# Recreate the indexer that maps the video metadata JSON to index fields
video_indexer = SearchIndexer(
    name="training-videos-indexer",
    data_source_name="training-videos-datasource",
    target_index_name=index_name,
    field_mappings=[
        FieldMapping(source_field_name="video_id", target_field_name="id"),
        FieldMapping(source_field_name="title", target_field_name="title"),
        FieldMapping(source_field_name="description", target_field_name="description"),
        #FieldMapping(source_field_name="video_id", target_field_name="video_id"),
        FieldMapping(source_field_name="category", target_field_name="category"),
        FieldMapping(source_field_name="duration_seconds", target_field_name="duration"),
        FieldMapping(source_field_name="tags", target_field_name="tags"),
        FieldMapping(source_field_name="instructor", target_field_name="instructor"),
        FieldMapping(source_field_name="created_date", target_field_name="upload_date"),
        FieldMapping(source_field_name="thumbnail_url", target_field_name="thumbnail_url"),
        FieldMapping(source_field_name="video_url", target_field_name="video_url"),
        FieldMapping(source_field_name="transcript", target_field_name="page_chunk"),
        #FieldMapping(source_field_name="difficulty_level", target_field_name="level"),
        # Add explicit content_type mapping
        #FieldMapping(source_field_name="/document/content_type", target_field_name="content_type", target_field_value="video"),
    ],
    parameters={
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "parsingMode": "json"
        }
    }
)

indexer_client.create_or_update_indexer(video_indexer)
indexer_client.run_indexer("training-videos-indexer")
print("Indexer updated with content_type field")

Indexer updated with content_type field


## Upload sample documents

This notebook uses data from NASA's Earth at Night e-book. The data is retrieved from the [azure-search-sample-data](https://github.com/Azure-Samples/azure-search-sample-data) repository on GitHub and passed to the search client for indexing.

In [114]:
import requests
from azure.search.documents import SearchIndexingBufferedSender

url = "https://raw.githubusercontent.com/Azure-Samples/azure-search-sample-data/refs/heads/main/nasa-e-book/earth-at-night-json/documents.json"
documents = requests.get(url).json()

# Use the same search_credential that you created earlier
with SearchIndexingBufferedSender(endpoint=search_endpoint, index_name=index_name, credential=search_credential) as client:
    client.upload_documents(documents=documents)

print(f"Documents uploaded to index '{index_name}' successfully.")

Documents uploaded to index 'earth-at-night' successfully.
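Before sending documents to the index, a lightweight pre-check can catch shape problems early. The sketch below assumes the key field (`id`) and the 3,072-dimension vector field (`page_embedding_text_3_large`) defined in this notebook's index. The `find_invalid_documents` helper and the sample data are made up for illustration, and flagging duplicate keys assumes that a later document with the same key would overwrite an earlier one.

```python
def find_invalid_documents(documents, key_field="id",
                           vector_field="page_embedding_text_3_large", dimensions=3072):
    """Return (position, reason) pairs for documents that look inconsistent with the index schema."""
    problems = []
    seen_keys = set()
    for position, doc in enumerate(documents):
        key = doc.get(key_field)
        if not key:
            problems.append((position, f"missing '{key_field}'"))
        elif key in seen_keys:
            problems.append((position, f"duplicate key '{key}'"))
        else:
            seen_keys.add(key)
        vector = doc.get(vector_field)
        if vector is not None and len(vector) != dimensions:
            problems.append((position, f"'{vector_field}' has {len(vector)} values, expected {dimensions}"))
    return problems

# Tiny made-up sample; the real documents come from documents.json
sample = [
    {"id": "a", "page_embedding_text_3_large": [0.0] * 3072},
    {"id": "a", "page_embedding_text_3_large": [0.0] * 3072},
    {"page_embedding_text_3_large": [0.0] * 8},
]
for position, reason in find_invalid_documents(sample):
    print(position, reason)
```

In the notebook, the same function could be called on the `documents` list before it's passed to `upload_documents`.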


In [115]:
from azure.search.documents.indexes.models import SearchIndexer, FieldMapping

# Create indexer to process your JSON metadata files
video_indexer = SearchIndexer(
    name="training-videos-indexer",
    data_source_name="training-videos-datasource",
    target_index_name=index_name,
    field_mappings=[
        FieldMapping(source_field_name="metadata_storage_name", target_field_name="id"),
        FieldMapping(source_field_name="content", target_field_name="page_chunk"),  # JSON content as searchable text
    ],
    parameters={
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "parsingMode": "json",
            "documentRoot": "$"
        }
    }
)

indexer_client.create_or_update_indexer(video_indexer)
print("Video metadata indexer created successfully.")

Video metadata indexer created successfully.


## Create a knowledge source

This step creates a knowledge source that targets the index you previously created. In the next step, you create a knowledge agent that uses the knowledge source to orchestrate agentic retrieval.

In [116]:
from azure.search.documents.indexes.models import SearchIndexKnowledgeSource, SearchIndexKnowledgeSourceParameters
from azure.search.documents.indexes import SearchIndexClient

ks = SearchIndexKnowledgeSource(
    name=knowledge_source_name,
    description="Knowledge source for Earth at night data and training videos",
    search_index_parameters=SearchIndexKnowledgeSourceParameters(
        search_index_name=index_name,
        source_data_select="id,page_chunk,page_number,content_type,video_id,title,description,duration,category,tags,instructor,upload_date,thumbnail_url,video_url",
    ),
)

# Use the same search_credential with API key
index_client = SearchIndexClient(endpoint=search_endpoint, credential=search_credential)
index_client.create_or_update_knowledge_source(knowledge_source=ks, api_version=search_api_version)
print(f"Knowledge source '{knowledge_source_name}' created or updated successfully.")

Knowledge source 'earth-knowledge-source' created or updated successfully.


## Create a knowledge agent

This step creates a knowledge agent, which acts as a wrapper for your knowledge source and LLM deployment.

`EXTRACTIVE_DATA` is the default modality and returns content from your knowledge sources without generative alteration. However, this quickstart uses the `ANSWER_SYNTHESIS` modality for LLM-generated answers that cite the retrieved content.

In [117]:
from azure.search.documents.indexes.models import (
    KnowledgeAgent,
    KnowledgeAgentAzureOpenAIModel,
    KnowledgeSourceReference,
    AzureOpenAIVectorizerParameters,
    KnowledgeAgentOutputConfiguration,
    KnowledgeAgentOutputConfigurationModality,
)
from azure.search.documents.indexes import SearchIndexClient

# Connection details for the chat completion deployment used for query planning and answer synthesis
aoai_params = AzureOpenAIVectorizerParameters(
    resource_url=aoai_endpoint,
    deployment_name=aoai_gpt_deployment,
    model_name=aoai_gpt_model,
    api_key=aoai_api_key  # Authenticate to Azure OpenAI with the API key
)

output_cfg = KnowledgeAgentOutputConfiguration(
    modality=KnowledgeAgentOutputConfigurationModality.ANSWER_SYNTHESIS,
    include_activity=True,
)

agent = KnowledgeAgent(
    name=knowledge_agent_name,
    models=[KnowledgeAgentAzureOpenAIModel(azure_open_ai_parameters=aoai_params)],
    knowledge_sources=[
        KnowledgeSourceReference(
            name=knowledge_source_name,
            reranker_threshold=2.5,
        )
    ],
    output_configuration=output_cfg,
)

# Use the same search_credential with API key
index_client = SearchIndexClient(endpoint=search_endpoint, credential=search_credential)
index_client.create_or_update_agent(agent, api_version=search_api_version)
print(f"Knowledge agent '{knowledge_agent_name}' created or updated successfully.")

Knowledge agent 'earth-knowledge-agent' created or updated successfully.


## Set up messages

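Messages are the input for the retrieval route and contain the conversation history. Each message includes a `role` that indicates its origin, such as `system` or `user`, and `content` in natural language. The LLM you use determines which roles are valid. The retrieval requests in this notebook filter out `system` messages, so the instructions defined here aren't sent to the agent.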

In [118]:
instructions = """
You are a Q&A agent for Earth at night data and training videos.

CRITICAL: When users ask for training videos, you MUST return:
- Video title
- Complete video_url (the full https:// link from the video_url field)
- Instructor name
- Duration

Format video responses like this:
Title: [title]
Video URL: [video_url]
Instructor: [instructor]
Duration: [duration] seconds

You have access to video_url, instructor, duration, and other metadata fields. USE THEM.
Never just return the title - always include the complete video_url field value.
"""

# Start the conversation history that later cells append to
messages = [
    {
        "role": "system",
        "content": instructions
    }
]

## Use agentic retrieval to fetch results

This step runs the agentic retrieval pipeline to produce a grounded, citation-backed answer. Given the conversation history and retrieval parameters, your knowledge agent:

1. Analyzes the entire conversation to infer the user's information need.

1. Decomposes the compound query into focused subqueries.

1. Runs the subqueries concurrently against your knowledge source.

1. Uses semantic ranker to rerank and filter the results.

1. Synthesizes the top results into a natural-language answer.
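The following toy sketch mirrors the shape of steps 2 through 5 on an in-memory corpus. It is not how the service is implemented: word overlap stands in for semantic ranking, the corpus, scores, and threshold are made up, and the first step (inferring the information need with an LLM) isn't modeled. All function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up corpus that maps a document key to its text
CORPUS = {
    "doc-phoenix": "Phoenix street grid is brightly lit at night",
    "doc-holiday": "Suburban holiday lighting increases December brightness",
    "doc-ocean": "Ocean currents and sea surface temperature",
}

def decompose(question):
    """Split a compound question into subqueries, one per non-empty line."""
    return [line.strip() for line in question.strip().splitlines() if line.strip()]

def search(subquery):
    """Score every document by the number of words it shares with the subquery."""
    words = set(subquery.lower().replace("?", "").split())
    return [
        {"doc_key": key, "score": len(words & set(text.lower().split())), "subquery": subquery}
        for key, text in CORPUS.items()
    ]

def retrieve(question, threshold=2):
    """Run subqueries concurrently, drop low scores, and rank the best hit per document."""
    subqueries = decompose(question)
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(search, subqueries))
    best = {}
    for hits in batches:
        for hit in hits:
            if hit["score"] < threshold:
                continue  # Filtered out, similar in spirit to a reranker threshold
            current = best.get(hit["doc_key"])
            if current is None or hit["score"] > current["score"]:
                best[hit["doc_key"]] = hit
    return sorted(best.values(), key=lambda h: (-h["score"], h["doc_key"]))

def synthesize(ranked):
    """Stand-in for answer synthesis: list the documents the answer would cite."""
    return "Grounded on: " + ", ".join(hit["doc_key"] for hit in ranked)

question = """
Why is the Phoenix street grid visible at night?
Why does December brightness increase in suburban areas?
"""
ranked = retrieve(question)
print(synthesize(ranked))  # prints "Grounded on: doc-phoenix, doc-holiday"
```

In the real pipeline, the knowledge agent performs these stages on the service side, and the notebook only sends the request and reads the response.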

In [119]:
from azure.search.documents.agent import KnowledgeAgentRetrievalClient
from azure.search.documents.agent.models import KnowledgeAgentRetrievalRequest, KnowledgeAgentMessage, KnowledgeAgentMessageTextContent, SearchIndexKnowledgeSourceParams

# Authenticate with the API key credential created earlier
agent_client = KnowledgeAgentRetrievalClient(endpoint=search_endpoint, agent_name=knowledge_agent_name, credential=search_credential)
query_1 = """
    Why do suburban belts display larger December brightening than urban cores even though absolute light levels are higher downtown?
    Why is the Phoenix nighttime street grid so sharply visible from space, whereas large stretches of the interstate between midwestern cities remain comparatively dim?
    """

messages.append({
    "role": "user",
    "content": query_1
})

req = KnowledgeAgentRetrievalRequest(
    messages=[
        KnowledgeAgentMessage(
            role=m["role"],
            content=[KnowledgeAgentMessageTextContent(text=m["content"])]
        ) for m in messages if m["role"] != "system"
    ],
    knowledge_source_params=[
        SearchIndexKnowledgeSourceParams(
            knowledge_source_name=knowledge_source_name,
        )
    ]
)

result = agent_client.retrieve(retrieval_request=req, api_version=search_api_version)
print(f"Retrieved content from '{knowledge_source_name}' successfully.")

Retrieved content from 'earth-knowledge-source' successfully.
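The list comprehension that converts the chat history into request messages is repeated in several cells. A sketch of how it might be factored into one helper is shown below, using plain dictionaries as stand-ins for `KnowledgeAgentMessage` and `KnowledgeAgentMessageTextContent`. The `to_request_messages` name and the dictionary shape are assumptions for illustration.

```python
def to_request_messages(messages):
    """Convert chat-style messages to a nested content shape, skipping system messages."""
    return [
        {"role": m["role"], "content": [{"text": m["content"]}]}
        for m in messages
        if m["role"] != "system"  # Same filter as the retrieval cells in this notebook
    ]

# Made-up conversation history
history = [
    {"role": "system", "content": "You are a Q&A agent."},
    {"role": "user", "content": "Why is the street grid visible at night?"},
    {"role": "assistant", "content": "Because of dense street lighting."},
]

for message in to_request_messages(history):
    print(message["role"], "->", message["content"][0]["text"])
```

Keeping the conversion in one place makes the `system` filter explicit and avoids the copies drifting apart.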


In [125]:
# Add a video-related query to test video search functionality
query_2 = """
    What is the video_url field value for the Intermediate training video? I need the complete https:// link.
    """

# Add the video query to messages
messages.append({
    "role": "user", 
    "content": query_2
})

# Create request for the video query
req_2 = KnowledgeAgentRetrievalRequest(
    messages=[
        KnowledgeAgentMessage(
            role=m["role"],
            content=[KnowledgeAgentMessageTextContent(text=m["content"])]
        ) for m in messages if m["role"] != "system"
    ],
    knowledge_source_params=[
        SearchIndexKnowledgeSourceParams(
            knowledge_source_name=knowledge_source_name,
        )
    ]
)

# Execute the video query
result_2 = agent_client.retrieve(retrieval_request=req_2, api_version=search_api_version)
print(f"Retrieved video content from '{knowledge_source_name}' successfully.")

Retrieved video content from 'earth-knowledge-source' successfully.


### Review the retrieval response, activity, and results

Because your knowledge agent is configured for answer synthesis, the retrieval response contains the following values:

+ `response_content`: An LLM-generated answer to the query that cites the retrieved documents.

+ `activity_content`: Detailed planning and execution information, including subqueries, reranking decisions, and intermediate steps.

+ `references_content`: Source documents and chunks that contributed to the answer.

**Tip:** Retrieval parameters, such as reranker thresholds and knowledge source parameters, influence how aggressively your agent reranks and which sources it queries. Inspect the activity and references to validate grounding and build traceable citations.

In [126]:
response_contents = []
activity_contents = []
references_contents = []

In [127]:
import json

# Build a simple string value for response_content from each result

# Process both results
for i, result_obj in enumerate([result, result_2], 1):
    # Responses -> Concatenate text/value fields from all response contents
    response_parts = []
    if getattr(result_obj, "response", None):
        for resp in result_obj.response:
            for content in getattr(resp, "content", []):
                text = getattr(content, "text", None) or getattr(content, "value", None) or str(content)
                response_parts.append(text)
    
    response_content = "\n\n".join(response_parts) if response_parts else f"No response found on 'result_{i}'"
    response_contents.append(response_content)
    
    # Print the response content
    print(f"=== Response Content {i} ===")
    print("response_content:\n", response_content, "\n")

=== Response Content 1 ===
response_content:
 To find training content for advanced learners, you can access the advanced-level training video titled "Training Video". This course provides comprehensive instruction on the topic and is available at the following link: https://mariagana.blob.core.windows.net/training-videos/Advanced.mp4 [ref_id:0]. 

=== Response Content 2 ===
response_content:
 The video URL for the Intermediate training video is https://mariagana.blob.core.windows.net/training-videos/Intermediate.mp4 [ref_id:0]. 
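The synthesized answers above carry inline citation markers such as `[ref_id:0]`. A minimal sketch for pulling those markers out of an answer, assuming the marker format stays as shown in this output, could look like the following. The `extract_ref_ids` helper is a hypothetical name.

```python
import re

# Matches citation markers such as [ref_id:0]
REF_PATTERN = re.compile(r"\[ref_id:(\d+)\]")

def extract_ref_ids(answer):
    """Return the distinct reference ids cited in an answer, in order of first appearance."""
    seen = []
    for match in REF_PATTERN.finditer(answer):
        ref_id = int(match.group(1))
        if ref_id not in seen:
            seen.append(ref_id)
    return seen

answer = (
    "The video URL for the Intermediate training video is "
    "https://mariagana.blob.core.windows.net/training-videos/Intermediate.mp4 [ref_id:0]."
)
print(extract_ref_ids(answer))  # prints [0]
```

The extracted ids can then be compared with the `id` values in the references to confirm that each citation points to a returned source.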



In [94]:
messages.append({
    "role": "assistant",
    "content": response_content
})

In [95]:
# Activity -> JSON string of activity as list of dicts
if getattr(result, "activity", None):
    activity_content = json.dumps([a.as_dict() for a in result.activity], indent=2)
else:
    activity_content = "No activity found on 'result'"
    
activity_contents.append(activity_content)
print("activity_content:\n", activity_content, "\n")

activity_content:
 [
  {
    "id": 0,
    "type": "modelQueryPlanning",
    "elapsed_ms": 1328,
    "input_tokens": 2074,
    "output_tokens": 146
  },
  {
    "id": 1,
    "type": "searchIndex",
    "elapsed_ms": 242,
    "knowledge_source_name": "earth-knowledge-source",
    "query_time": "2025-10-22T06:15:33.379Z",
    "count": 0,
    "search_index_arguments": {
      "search": "Why do suburban belts display larger December brightening than urban cores?"
    }
  },
  {
    "id": 2,
    "type": "searchIndex",
    "elapsed_ms": 230,
    "knowledge_source_name": "earth-knowledge-source",
    "query_time": "2025-10-22T06:15:33.610Z",
    "count": 0,
    "search_index_arguments": {
      "search": "Why are absolute light levels higher downtown?"
    }
  },
  {
    "id": 3,
    "type": "searchIndex",
    "elapsed_ms": 167,
    "knowledge_source_name": "earth-knowledge-source",
    "query_time": "2025-10-22T06:15:33.777Z",
    "count": 1,
    "search_index_arguments": {
      "search": "Wh

In [96]:
# References -> JSON string of references as list of dicts
if getattr(result, "references", None):
    references_content = json.dumps([r.as_dict() for r in result.references], indent=2)
else:
    references_content = "No references found on 'result'"
    
references_contents.append(references_content)
print("references_content:\n", references_content)

references_content:
 [
  {
    "type": "searchIndex",
    "id": "0",
    "activity_source": 3,
    "reranker_score": 2.7294974,
    "doc_key": "earth_at_night_508_page_104_verbalized"
  }
]
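In the output above, the `activity_source` value of the reference appears to match the `id` of the activity record for the subquery that found it. Under that assumption, the sketch below pairs each reference with its subquery text. The `link_references_to_activity` helper is hypothetical, and the sample records are made up to follow the shape of the output rather than copied from it.

```python
def link_references_to_activity(activity, references):
    """Pair each reference with the subquery text of the activity record it points to."""
    searches = {
        step["id"]: step.get("search_index_arguments", {}).get("search")
        for step in activity
        if step.get("type") == "searchIndex"
    }
    return [
        {
            "doc_key": ref.get("doc_key"),
            "reranker_score": ref.get("reranker_score"),
            "subquery": searches.get(ref.get("activity_source")),
        }
        for ref in references
    ]

# Made-up records that follow the shape of the output above
activity = [
    {"id": 0, "type": "modelQueryPlanning"},
    {"id": 1, "type": "searchIndex", "count": 0,
     "search_index_arguments": {"search": "first example subquery"}},
    {"id": 2, "type": "searchIndex", "count": 1,
     "search_index_arguments": {"search": "second example subquery"}},
]
references = [
    {"type": "searchIndex", "id": "0", "activity_source": 2,
     "reranker_score": 2.7, "doc_key": "example-doc"},
]

for row in link_references_to_activity(activity, references):
    print(row)
```

In the notebook, the inputs could be built with `[a.as_dict() for a in result.activity]` and `[r.as_dict() for r in result.references]`, as the cells above already do.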


## Continue the conversation

This step continues the conversation with your knowledge agent. The cells below reset the message list with updated instructions, and then send follow-up queries to retrieve relevant information from your knowledge source.

In [98]:
instructions = """
A Q&A agent that can answer questions about the Earth at night and training videos. 
When recommending training videos, ALWAYS include the complete video URL, instructor name, and duration.
For video queries, provide the direct link in this format: "Video URL: [full URL]"
If you don't have the answer, respond with "I don't know".
"""

messages = [
    {
        "role": "system",
        "content": instructions
    }
]

In [123]:
query_2 = "How do I find training content for Advanced learners?"
messages.append({
    "role": "user",
    "content": query_2
})

req = KnowledgeAgentRetrievalRequest(
    messages=[
        KnowledgeAgentMessage(
            role=m["role"],
            content=[KnowledgeAgentMessageTextContent(text=m["content"])]
        ) for m in messages if m["role"] != "system"
    ],
    knowledge_source_params=[
        SearchIndexKnowledgeSourceParams(
            knowledge_source_name=knowledge_source_name,
        )
    ]
)

result = agent_client.retrieve(retrieval_request=req, api_version=search_api_version)
print(f"Retrieved content from '{knowledge_source_name}' successfully.")

Retrieved content from 'earth-knowledge-source' successfully.


### Review the new retrieval response, activity, and results

In [124]:
import json

# Build a simple string value for response_content

# Responses -> Concatenate text/value fields from all response contents
response_parts = []
if getattr(result, "response", None):
    for resp in result.response:
        for content in getattr(resp, "content", []):
            text = getattr(content, "text", None) or getattr(content, "value", None) or str(content)
            response_parts.append(text)
response_content = "\n\n".join(response_parts) if response_parts else "No response found on 'result'"

response_contents.append(response_content)

# Print the response content
print("response_content:\n", response_content, "\n")

response_content:
 To find training content for advanced learners, you can access the advanced-level training video titled "Training Video". This course provides comprehensive instruction on the topic and is available at the following link: https://mariagana.blob.core.windows.net/training-videos/Advanced.mp4 [ref_id:0]. 



In [143]:
# Query for beginner training videos, including video URLs and instructor information
query_combined = """
Any training videos available for beginners, any training videos would be fine. 
Please provide the video URLs and instructor information for any relevant courses.
"""

messages.append({
    "role": "user",
    "content": query_combined
})

req_combined = KnowledgeAgentRetrievalRequest(
    messages=[
        KnowledgeAgentMessage(
            role=m["role"],
            content=[KnowledgeAgentMessageTextContent(text=m["content"])]
        ) for m in messages if m["role"] != "system"
    ],
    knowledge_source_params=[
        SearchIndexKnowledgeSourceParams(
            knowledge_source_name=knowledge_source_name,
        )
    ]
)

result_combined = agent_client.retrieve(retrieval_request=req_combined, api_version=search_api_version)
print("Retrieved combined content successfully.")

Retrieved combined content successfully.


In [129]:
# Activity -> JSON string of activity as list of dicts
if getattr(result, "activity", None):
    activity_content = json.dumps([a.as_dict() for a in result.activity], indent=2)
else:
    activity_content = "No activity found on 'result'"
    
activity_contents.append(activity_content)
print("activity_content:\n", activity_content, "\n")

activity_content:
 [
  {
    "id": 0,
    "type": "modelQueryPlanning",
    "elapsed_ms": 1257,
    "input_tokens": 2188,
    "output_tokens": 79
  },
  {
    "id": 1,
    "type": "searchIndex",
    "elapsed_ms": 427,
    "knowledge_source_name": "earth-knowledge-source",
    "query_time": "2025-10-22T06:48:43.426Z",
    "count": 1,
    "search_index_arguments": {
      "search": "advanced training content for learners"
    }
  },
  {
    "id": 2,
    "type": "searchIndex",
    "elapsed_ms": 162,
    "knowledge_source_name": "earth-knowledge-source",
    "query_time": "2025-10-22T06:48:43.588Z",
    "count": 0,
    "search_index_arguments": {
      "search": "training resources for advanced learners"
    }
  },
  {
    "id": 3,
    "type": "searchIndex",
    "elapsed_ms": 372,
    "knowledge_source_name": "earth-knowledge-source",
    "query_time": "2025-10-22T06:48:43.960Z",
    "count": 0,
    "search_index_arguments": {
      "search": "advanced learning materials"
    }
  },
  {
  

In [144]:
# Extract and display the response content
response_parts = []
if getattr(result_combined, "response", None):
    for resp in result_combined.response:
        for content in getattr(resp, "content", []):
            text = getattr(content, "text", None) or getattr(content, "value", None) or str(content)
            response_parts.append(text)

response_content = "\n\n".join(response_parts) if response_parts else "No response found"
print("=== Combined Query Response ===")
print(response_content)
print("\n" + "="*50 + "\n")

# Display activity logs to see task decomposition
if getattr(result_combined, "activity", None):
    print("=== Task Decomposition Activity ===")
    for i, activity in enumerate(result_combined.activity):
        activity_dict = activity.as_dict()
        print(f"Step {i+1}: {activity_dict.get('type', 'Unknown')}")
        if 'search_index_arguments' in activity_dict:
            print(f"  Search Query: {activity_dict['search_index_arguments'].get('search', 'N/A')}")
        if 'count' in activity_dict:
            print(f"  Results Found: {activity_dict['count']}")
        if 'elapsed_ms' in activity_dict:
            print(f"  Time: {activity_dict['elapsed_ms']}ms")
        print()

# Display references to see which content was used
if getattr(result_combined, "references", None):
    print("=== References Used ===")
    for i, ref in enumerate(result_combined.references):
        ref_dict = ref.as_dict()
        print(f"Reference {i+1}:")
        print(f"  Document Key: {ref_dict.get('doc_key', 'N/A')}")
        print(f"  Reranker Score: {ref_dict.get('reranker_score', 'N/A')}")
        print(f"  Activity Source: {ref_dict.get('activity_source', 'N/A')}")
        print()

=== Combined Query Response ===
There is a training video available for beginners titled "Introduction." This course provides comprehensive instruction on the topic at a beginner level. You can access the training video through the following URL: https://mariagana.blob.core.windows.net/training-videos/Introduction.mp4 [ref_id:0]. No specific instructor information was found in the retrieved content.


=== Task Decomposition Activity ===
Step 1: modelQueryPlanning
  Time: 1334ms

Step 2: searchIndex
  Search Query: beginner training videos for urban lighting analysis
  Results Found: 0
  Time: 496ms

Step 3: searchIndex
  Search Query: video URLs for beginner training courses
  Results Found: 1
  Time: 420ms

Step 4: searchIndex
  Search Query: instructor information for beginner courses on urban lighting
  Results Found: 0
  Time: 405ms

Step 5: semanticReranker

Step 6: modelAnswerSynthesis
  Time: 961ms

=== References Used ===
Reference 1:
  Document Key: beginner-introduction
  Rer

In [15]:
# References -> JSON string of references as list of dicts
if getattr(result, "references", None):
    references_content = json.dumps([r.as_dict() for r in result.references], indent=2)
else:
    references_content = "No references found on 'result'"
    
references_contents.append(references_content)
print("references_content:\n", references_content)

references_content:
 [
  {
    "type": "searchIndex",
    "id": "0",
    "activity_source": 3,
    "reranker_score": 3.0506465,
    "doc_key": "earth_at_night_508_page_104_verbalized"
  },
  {
    "type": "searchIndex",
    "id": "1",
    "activity_source": 3,
    "reranker_score": 2.5884187,
    "doc_key": "earth_at_night_508_page_105_verbalized"
  }
]
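
Each reference carries an `activity_source` field, which appears to point back to the `id` of the activity step that retrieved it. The following is a minimal sketch, not part of the original notebook, that groups document keys by that field so you can see which subquery produced which document. The helper name `group_references_by_activity` is hypothetical, and the sample data is copied from the output above. Because `references_content` can also hold the plain-text fallback message, the helper returns an empty dictionary when the input isn't valid JSON.

```python
import json
from collections import defaultdict


def group_references_by_activity(references_json: str) -> dict[int, list[str]]:
    """Map each activity step id to the document keys it contributed."""
    try:
        references = json.loads(references_json)
    except json.JSONDecodeError:
        # The notebook stores a plain-text message when no references exist
        return {}

    grouped = defaultdict(list)
    for ref in references:
        grouped[ref["activity_source"]].append(ref["doc_key"])
    return dict(grouped)


# Sample copied from the references_content output shown above
sample = json.dumps([
    {
        "type": "searchIndex",
        "id": "0",
        "activity_source": 3,
        "reranker_score": 3.0506465,
        "doc_key": "earth_at_night_508_page_104_verbalized",
    },
    {
        "type": "searchIndex",
        "id": "1",
        "activity_source": 3,
        "reranker_score": 2.5884187,
        "doc_key": "earth_at_night_508_page_105_verbalized",
    },
])

for step_id, doc_keys in group_references_by_activity(sample).items():
    print(f"Activity step {step_id}: {', '.join(doc_keys)}")
```

In the notebook, you could pass `references_content` directly instead of `sample`.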


## Run an evaluation with Azure AI Foundry

To evaluate the groundedness and relevance of the pipeline, run an evaluation with Azure AI Foundry. For more detailed guidance, see [Run evaluations locally by using the Azure AI Foundry SDK](https://learn.microsoft.com/azure/ai-foundry/how-to/develop/evaluate-sdk).

### Prerequisites

+ The same [Azure AI Foundry project](https://learn.microsoft.com/azure/ai-foundry/how-to/create-projects) you used for agentic retrieval. Set `FOUNDRY_ENDPOINT` to your project endpoint in the `.env` file. You can find this endpoint on the **Overview** page of your project in the [Azure AI Foundry portal](https://ai.azure.com/).

+ The `azure-ai-evaluation` package. Run the following command to install it:

In [122]:
! pip install azure-ai-evaluation --quiet




In [16]:
# Load connections
from dotenv import load_dotenv
import os

load_dotenv(override=True)

foundry_endpoint = os.environ["FOUNDRY_ENDPOINT"]
aoai_api_version = os.environ["AOAI_API_VERSION"]

# Run the evaluation
from azure.ai.evaluation import AzureOpenAIModelConfiguration, GroundednessEvaluator, RelevanceEvaluator, evaluate
import json

evaluation_data = []
print("Preparing evaluation data...")
for q, r, g in zip([query_1, query_2], references_contents, response_contents):
    evaluation_data.append({
        "query": q,
        "response": g,
        "context": r,
    })

filename = "evaluation_data.jsonl"

with open(filename, "w") as f:
    for item in evaluation_data:
        f.write(json.dumps(item) + "\n")

model_config = AzureOpenAIModelConfiguration(
    azure_endpoint=aoai_endpoint,
    api_version=aoai_api_version,
    azure_deployment=aoai_gpt_model
)

# RAG quality metrics: groundedness and relevance
groundedness = GroundednessEvaluator(model_config=model_config)
relevance = RelevanceEvaluator(model_config=model_config)

print("Starting evaluation...")
result = evaluate(
    data=filename,
    evaluators={
        "groundedness": groundedness,
        "relevance": relevance,
    },
    azure_ai_project=foundry_endpoint,
)

print("Evaluation complete.")
studio_url = result.get("studio_url")
if studio_url:
    print("For more information, go to the Azure AI Foundry portal.")

Preparing evaluation data...


NameError: name 'query_2' is not defined
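
The `NameError` above means that `query_2` isn't defined in the current kernel session, most likely because the cell that assigns it hasn't run, so the evaluation data was never written. One possible guard, sketched below under that assumption, builds the rows from three parallel lists and fails early if their lengths differ, because `zip` otherwise silently truncates to the shortest input. The helper name `build_evaluation_rows` and the sample values are hypothetical and aren't part of the original notebook.

```python
import json


def build_evaluation_rows(queries, contexts, responses):
    """Pair each query with its context and response, or raise if lengths differ."""
    if not (len(queries) == len(contexts) == len(responses)):
        raise ValueError(
            f"Expected equal lengths, got {len(queries)} queries, "
            f"{len(contexts)} contexts, and {len(responses)} responses"
        )
    return [
        {"query": q, "response": g, "context": c}
        for q, c, g in zip(queries, contexts, responses)
    ]


# Hypothetical sample values that stand in for the notebook's collected lists
rows = build_evaluation_rows(
    queries=["sample query"],
    contexts=["sample references JSON"],
    responses=["sample response text"],
)

# Each row becomes one line of the JSONL file that evaluate() reads
for row in rows:
    print(json.dumps(row))
```

In the notebook, you would pass your own list of query strings together with `references_contents` and `response_contents`, and then write the returned rows to `evaluation_data.jsonl` as the evaluation cell does.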