# Quickstart: Agentic retrieval in Azure AI Search

Use this notebook to get started with [agentic retrieval](https://learn.microsoft.com/azure/search/search-agentic-retrieval-concept) in Azure AI Search, which uses conversation history and large language models (LLMs) on Azure OpenAI to plan complex queries, retrieve results, and synthesize them into a single response.

Steps in this notebook include:

+ Creating an `earth_at_night` search index.

+ Loading the index with documents from a GitHub URL.

+ Creating an `earth-search-agent` in Azure AI Search that points to an LLM for query planning.

+ Using the agent to fetch and rank relevant information from the index.

+ Generating answers using the Azure OpenAI client.

This notebook provides a high-level demonstration of agentic retrieval. For more detailed guidance, see [Quickstart: Run agentic retrieval in Azure AI Search](https://learn.microsoft.com/azure/search/search-get-started-agentic-retrieval).

## Prerequisites

+ An [Azure AI Search service](https://learn.microsoft.com/azure/search/search-create-service-portal) on the Basic tier or higher with [semantic ranker enabled](https://learn.microsoft.com/azure/search/semantic-how-to-enable-disable).

+ An [Azure OpenAI resource](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource).

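+ A [supported model](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-create#supported-models) deployed to your Azure OpenAI resource. This notebook uses `gpt-4.1-mini`. If your `.env` file doesn't set the model and deployment names, the code defaults to `gpt-4o`, so set them to match your deployment.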

## Configure access

This notebook assumes authentication and authorization using Microsoft Entra ID and role assignments. It also assumes that you run the code from your local device.

To configure role-based access:

1. Sign in to the [Azure portal](https://portal.azure.com).

1. [Enable role-based access](https://learn.microsoft.com/azure/search/search-security-enable-roles) on your Azure AI Search service.

1. [Create a system-assigned managed identity](https://learn.microsoft.com/azure/search/search-howto-managed-identities-data-sources#create-a-system-managed-identity) on your Azure AI Search service.

1. On your Azure AI Search service, [assign the following roles](https://learn.microsoft.com/azure/search/search-security-rbac#how-to-assign-roles-in-the-azure-portal) to yourself.

   + **Search Service Contributor**

   + **Search Index Data Contributor**

   + **Search Index Data Reader**

1. On your Azure OpenAI resource, assign **Cognitive Services User** to the managed identity of your search service.

## Set up connections

The `sample.env` file contains environment variables for connections to Azure AI Search and Azure OpenAI. Agentic retrieval requires these connections for document retrieval, query planning, query execution, and answer generation.

To set up connections:

1. Sign in to the [Azure portal](https://portal.azure.com).

2. Retrieve the endpoints for both Azure AI Search and Azure OpenAI.

3. Save the `sample.env` file as `.env` on your local device.

4. Update the `.env` file with the retrieved endpoints.
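
Before running the cells, it can help to confirm that the `.env` file defines the two endpoints the notebook reads without a fallback value. The following is a minimal sketch, assuming only the variable names `AZURE_SEARCH_ENDPOINT` and `AZURE_OPENAI_ENDPOINT` that the notebook code reads; `missing_settings` is a hypothetical helper, not part of any SDK.

```python
import os

# Variables the notebook reads with os.environ[...] and no fallback value.
REQUIRED_SETTINGS = ("AZURE_SEARCH_ENDPOINT", "AZURE_OPENAI_ENDPOINT")


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the required names that are absent or blank in `env`."""
    return [name for name in required if not (env.get(name) or "").strip()]


# Hypothetical usage: check the process environment after load_dotenv() has run.
missing = missing_settings(os.environ)
if missing:
    print("Add these to .env:", ", ".join(missing))
```

Variables that have defaults in the notebook, such as the index name, are left out of the check on purpose.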

## Create a virtual environment

The `requirements.txt` file contains the dependencies for this notebook. You can install these dependencies in isolation using a virtual environment.

To create a virtual environment:

1. In Visual Studio Code, open the folder that contains `quickstart.ipynb`.

1. Press **Ctrl**+**Shift**+**P** to open the command palette.

1. Search for **Python: Create Environment**, and then select **Venv**.

1. Select a Python installation. We tested this notebook on Python 3.13.

1. Select `requirements.txt` for the dependencies.

Creating the virtual environment can take several minutes. When the environment is ready, proceed to the next step.

## Install packages and load connections

This step installs the packages for this notebook and establishes connections to Azure AI Search and Azure OpenAI.

In [1]:
! pip install -r requirements.txt --quiet




In [2]:
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
import os

load_dotenv(override=True) # Take environment variables from .env.

# The following variables from your .env file are used in this notebook
answer_model = os.getenv("ANSWER_MODEL", "gpt-4o")
endpoint = os.environ["AZURE_SEARCH_ENDPOINT"]
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://search.azure.com/.default")
index_name = os.getenv("AZURE_SEARCH_INDEX", "earth_at_night")
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
azure_openai_gpt_deployment = os.getenv("AZURE_OPENAI_GPT_DEPLOYMENT", "gpt-4o")
azure_openai_gpt_model = os.getenv("AZURE_OPENAI_GPT_MODEL", "gpt-4o")
azure_openai_api_version = os.getenv("AZURE_OPENAI_API_VERSION", "2025-03-01-preview")
azure_openai_embedding_deployment = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT", "text-embedding-3-large")
azure_openai_embedding_model = os.getenv("AZURE_OPENAI_EMBEDDING_MODEL", "text-embedding-3-large")
agent_name = os.getenv("AZURE_SEARCH_AGENT_NAME", "earth-search-agent")
api_version = "2025-05-01-Preview"

## Create an index in Azure AI Search

This step creates a search index that contains plain text and vector content. You can use an existing index, but it must meet the criteria for [agentic retrieval workloads](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-index). The primary schema requirement is a semantic configuration with a `default_configuration_name`.

In [3]:
from azure.search.documents.indexes.models import (
    SearchIndex,
    SearchField,
    VectorSearch,
    VectorSearchProfile,
    HnswAlgorithmConfiguration,
    AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters,
    SemanticSearch,
    SemanticConfiguration,
    SemanticPrioritizedFields,
    SemanticField,
)
from azure.search.documents.indexes import SearchIndexClient

index = SearchIndex(
    name=index_name,
    fields=[
        SearchField(name="id", type="Edm.String", key=True, filterable=True, sortable=True, facetable=True),
        SearchField(name="page_chunk", type="Edm.String", filterable=False, sortable=False, facetable=False),
        SearchField(name="page_embedding_text_3_large", type="Collection(Edm.Single)", stored=False, vector_search_dimensions=3072, vector_search_profile_name="hnsw_text_3_large"),
        SearchField(name="page_number", type="Edm.Int32", filterable=True, sortable=True, facetable=True)
    ],
    vector_search=VectorSearch(
        profiles=[VectorSearchProfile(name="hnsw_text_3_large", algorithm_configuration_name="alg", vectorizer_name="azure_openai_text_3_large")],
        algorithms=[HnswAlgorithmConfiguration(name="alg")],
        vectorizers=[
            AzureOpenAIVectorizer(
                vectorizer_name="azure_openai_text_3_large",
                parameters=AzureOpenAIVectorizerParameters(
                    resource_url=azure_openai_endpoint,
                    deployment_name=azure_openai_embedding_deployment,
                    model_name=azure_openai_embedding_model
                )
            )
        ]
    ),
    semantic_search=SemanticSearch(
        default_configuration_name="semantic_config",
        configurations=[
            SemanticConfiguration(
                name="semantic_config",
                prioritized_fields=SemanticPrioritizedFields(
                    content_fields=[
                        SemanticField(field_name="page_chunk")
                    ]
                )
            )
        ]
    )
)

index_client = SearchIndexClient(endpoint=endpoint, credential=credential)
index_client.create_or_update_index(index)
print(f"Index '{index_name}' created or updated successfully")

Index 'earth_at_night' created or updated successfully
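
If you reuse an existing index, a quick check of the semantic configuration requirement might look like the sketch below. It assumes an object that exposes `semantic_search.default_configuration_name` and `semantic_search.configurations`, as the `SearchIndex` defined in this notebook does. `has_default_semantic_config` is a hypothetical helper, and the example uses a stand-in object so that it runs without a search service.

```python
from types import SimpleNamespace


def has_default_semantic_config(index):
    """True if the index names a default semantic configuration that exists."""
    semantic = getattr(index, "semantic_search", None)
    if semantic is None or not semantic.default_configuration_name:
        return False
    names = {cfg.name for cfg in (semantic.configurations or [])}
    return semantic.default_configuration_name in names


# Stand-in object mirroring the attributes set on the notebook's SearchIndex.
example = SimpleNamespace(
    semantic_search=SimpleNamespace(
        default_configuration_name="semantic_config",
        configurations=[SimpleNamespace(name="semantic_config")],
    )
)
print(has_default_semantic_config(example))
```

Against a real service, the same function could presumably be given the result of `index_client.get_index(index_name)`.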


## Upload sample documents

This notebook uses data from NASA's Earth at Night e-book. The data is retrieved from the [azure-search-sample-data](https://github.com/Azure-Samples/azure-search-sample-data) repository on GitHub and passed to the search client for indexing.

In [4]:
import requests
from azure.search.documents import SearchIndexingBufferedSender

url = "https://raw.githubusercontent.com/Azure-Samples/azure-search-sample-data/refs/heads/main/nasa-e-book/earth-at-night-json/documents.json"
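download = requests.get(url, timeout=60)
download.raise_for_status()  # Fail early if the download did not succeed.
documents = download.json()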

with SearchIndexingBufferedSender(endpoint=endpoint, index_name=index_name, credential=credential) as client:
    client.upload_documents(documents=documents)

print(f"Documents uploaded to index '{index_name}'")

Documents uploaded to index 'earth_at_night'
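
When you adapt this step to your own data, a pre-upload check that each document matches the index fields can catch naming mistakes early. The sketch below is illustrative only: the expected field names come from the index definition in this notebook, that your documents should carry exactly those keys is an assumption, and `find_schema_mismatches` is a hypothetical helper.

```python
# Field names taken from the index definition in this notebook.
EXPECTED_FIELDS = {"id", "page_chunk", "page_embedding_text_3_large", "page_number"}


def find_schema_mismatches(documents, expected=EXPECTED_FIELDS):
    """Map document position to (missing, unexpected) field names."""
    problems = {}
    for position, doc in enumerate(documents):
        keys = set(doc)
        missing, extra = expected - keys, keys - expected
        if missing or extra:
            problems[position] = (sorted(missing), sorted(extra))
    return problems


# Toy documents for illustration only; not taken from the sample data.
sample = [
    {"id": "a", "page_chunk": "text", "page_embedding_text_3_large": [0.1], "page_number": 1},
    {"id": "b", "page_chunk": "text", "page_num": 2},
]
print(find_schema_mismatches(sample))
```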


## Create an agent in Azure AI Search

This step creates a knowledge agent, which acts as a wrapper for the LLM you deployed to Azure OpenAI. The agent uses the LLM to plan the queries that the agentic retrieval pipeline runs against your index.

In [5]:
from azure.search.documents.indexes.models import KnowledgeAgent, KnowledgeAgentAzureOpenAIModel, KnowledgeAgentTargetIndex, AzureOpenAIVectorizerParameters

agent = KnowledgeAgent(
    name=agent_name,
    models=[
        KnowledgeAgentAzureOpenAIModel(
            azure_open_ai_parameters=AzureOpenAIVectorizerParameters(
                resource_url=azure_openai_endpoint,
                deployment_name=azure_openai_gpt_deployment,
                model_name=azure_openai_gpt_model
            )
        )
    ],
    target_indexes=[
        KnowledgeAgentTargetIndex(
            index_name=index_name,
            default_reranker_threshold=2.5
        )
    ],
)

index_client.create_or_update_agent(agent)
print(f"Knowledge agent '{agent_name}' created or updated successfully")


Knowledge agent 'earth-search-agent' created or updated successfully


## Set up messages

Messages are the input for the retrieval route and contain the conversation history. Each message includes a `role` that indicates its origin, such as `assistant` or `user`, and `content` in natural language. The LLM you use determines which roles are valid.

In [6]:
instructions = """
A Q&A agent that can answer questions about the Earth at night.
Sources have a JSON format with a ref_id that must be cited in the answer.
If you do not have the answer, respond with "I don't know".
"""

messages = [
    {
        "role": "system",
        "content": instructions
    }
]

## Use agentic retrieval to fetch results

This step runs the retrieval pipeline to extract relevant information from your search index. Based on the messages and parameters on the retrieval request, the LLM:

1. Analyzes the entire conversation history to determine the underlying information need.

1. Breaks down the compound user query into focused subqueries.
 
1. Runs each subquery simultaneously against text fields and vector embeddings in your index.

1. Uses semantic ranker to rerank the results of all subqueries.

1. Merges the results into a single string.

In [None]:
from azure.search.documents.agent import KnowledgeAgentRetrievalClient
from azure.search.documents.agent.models import KnowledgeAgentRetrievalRequest, KnowledgeAgentMessage, KnowledgeAgentMessageTextContent, KnowledgeAgentIndexParams

agent_client = KnowledgeAgentRetrievalClient(endpoint=endpoint, agent_name=agent_name, credential=credential)

messages.append({
    "role": "user",
    "content": """
    Why do suburban belts display larger December brightening than urban cores even though absolute light levels are higher downtown?
    Why is the Phoenix nighttime street grid so sharply visible from space, whereas large stretches of the interstate between midwestern cities remain comparatively dim?
    """
})

retrieval_result = agent_client.retrieve(
    retrieval_request=KnowledgeAgentRetrievalRequest(
        messages=[KnowledgeAgentMessage(role=msg["role"], content=[KnowledgeAgentMessageTextContent(text=msg["content"])]) for msg in messages if msg["role"] != "system"],
        target_index_params=[KnowledgeAgentIndexParams(index_name=index_name, reranker_threshold=2.5)]
    )
)
messages.append({
    "role": "assistant",
    "content": retrieval_result.response[0].content[0].text
})
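
The reranking and merging steps listed above can be pictured with a toy example. This is a sketch of the idea only, not the service's implementation: the `score` values, the rule of keeping the best-scoring copy of a duplicate chunk, the comparison against the threshold, and the `merge_subquery_results` helper are all assumptions made for illustration. The output is a JSON string with `ref_id` and `content` keys, similar in shape to the response this notebook prints.

```python
import json


def merge_subquery_results(subquery_results, threshold=2.5):
    """Filter by score, dedupe by doc_key, and serialize to one JSON string."""
    best = {}
    for hits in subquery_results:
        for hit in hits:
            if hit["score"] < threshold:
                continue  # Below the reranker threshold: dropped.
            current = best.get(hit["doc_key"])
            if current is None or hit["score"] > current["score"]:
                best[hit["doc_key"]] = hit
    ranked = sorted(best.values(), key=lambda h: h["score"], reverse=True)
    return json.dumps(
        [{"ref_id": i, "content": h["content"]} for i, h in enumerate(ranked)]
    )


# Made-up hits from two subqueries; scores are illustrative.
results = [
    [{"doc_key": "p1", "score": 3.1, "content": "Phoenix grid"},
     {"doc_key": "p2", "score": 1.9, "content": "Unrelated"}],
    [{"doc_key": "p1", "score": 2.8, "content": "Phoenix grid"},
     {"doc_key": "p3", "score": 2.6, "content": "Interstate lighting"}],
]
print(merge_subquery_results(results))
```

In this toy run, the chunk that both subqueries return appears once, and the chunk scored below 2.5 is excluded from the merged string.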

### Review the retrieval response, activity, and results

Each retrieval response from Azure AI Search includes:

+ A unified string that represents grounding data from the search results.

+ The query plan.

+ Reference data that shows which chunks of the source documents contributed to the unified string.

In [8]:
import textwrap

print("Response")
print(textwrap.fill(retrieval_result.response[0].content[0].text, width=120))

Response
[{"ref_id":0,"content":"# Urban Structure\n\n## March 16, 2013\n\n### Phoenix Metropolitan Area at Night\n\nThis figure
presents a nighttime satellite view of the Phoenix metropolitan area, highlighting urban structure and transport
corridors. City lights illuminate the layout of several cities and major thoroughfares.\n\n**Labeled Urban
Features:**\n\n- **Phoenix:** Central and brightest area in the right-center of the image.\n- **Glendale:** Located to
the west of Phoenix, this city is also brightly lit.\n- **Peoria:** Further northwest, this area is labeled and its
illuminated grid is seen.\n- **Grand Avenue:** Clearly visible as a diagonal, brightly lit thoroughfare running from
Phoenix through Glendale and Peoria.\n- **Salt River Channel:** Identified in the southeast portion, running through
illuminated sections.\n- **Phoenix Mountains:** Dark, undeveloped region to the northeast of Phoenix.\n- **Agricultural
Fields:** Southwestern corner of the image, grid patterns are 

In [9]:
import json
print("Activity")
print(json.dumps([a.as_dict() for a in retrieval_result.activity], indent=2))
print("Results")
print(json.dumps([r.as_dict() for r in retrieval_result.references], indent=2))

Activity
[
  {
    "id": 0,
    "type": "ModelQueryPlanning",
    "input_tokens": 1265,
    "output_tokens": 278
  },
  {
    "id": 1,
    "type": "AzureSearchQuery",
    "target_index": "earth_at_night",
    "query": {
      "search": "suburban belts December brightening urban cores comparison"
    },
    "query_time": "2025-05-13T22:07:16.959Z",
    "count": 0,
    "elapsed_ms": 864
  },
  {
    "id": 2,
    "type": "AzureSearchQuery",
    "target_index": "earth_at_night",
    "query": {
      "search": "Phoenix nighttime street grid visibility from space"
    },
    "query_time": "2025-05-13T22:07:17.386Z",
    "count": 2,
    "elapsed_ms": 413
  },
  {
    "id": 3,
    "type": "AzureSearchQuery",
    "target_index": "earth_at_night",
    "query": {
      "search": "midwestern cities interstate dimness compared to Phoenix"
    },
    "query_time": "2025-05-13T22:07:17.654Z",
    "count": 0,
    "elapsed_ms": 267
  },
  {
    "id": 4,
    "type": "AzureSearchSemanticRanker",
    "inp

## Create the Azure OpenAI client

So far, this notebook has used agentic retrieval for answer *extraction*, which you can extend to answer *generation* by using the Azure OpenAI client. This enables more detailed, context-rich responses that aren't strictly tied to indexed content.

In [10]:
from openai import AzureOpenAI
from azure.identity import get_bearer_token_provider

azure_openai_token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
client = AzureOpenAI(
    azure_endpoint=azure_openai_endpoint,
    azure_ad_token_provider=azure_openai_token_provider,
    api_version=azure_openai_api_version
)

### Use the Responses API to generate an answer

One option for answer generation is the Responses API, which passes the conversation history to the LLM for processing.

In [11]:
response = client.responses.create(
    model=answer_model,
    input=messages
)

wrapped = textwrap.fill(response.output_text, width=100)
print(wrapped)

Suburban belts often experience larger December brightening than urban cores due to differences in
light usage and urban layout. In suburban areas, there may be more widespread holiday lighting and
residential outdoor lighting, which contributes to increased brightness during December. Urban cores
have higher absolute light levels from commercial and street lighting, but the additional seasonal
lighting impact is relatively smaller compared to suburbs where individual residential displays can
have a more pronounced effect [1].  Phoenix's street grid is sharply visible from space because the
city is laid out in a regular grid pattern typical of central and western U.S. cities. This grid is
accentuated by the street lighting and prominent urban features, like the Grand Avenue corridor and
brightly lit commercial properties. In contrast, large stretches of interstate between Midwestern
cities can remain dim due to less dense urbanization and fewer built-up areas that provide extensive
lig

### Use the Chat Completions API to generate an answer

Alternatively, you can use the Chat Completions API for answer generation.

In [12]:
response = client.chat.completions.create(
    model=answer_model,
    messages=messages
)

wrapped = textwrap.fill(response.choices[0].message.content, width=100)
print(wrapped)

Suburban belts tend to exhibit larger December brightening than urban cores due to the widespread
holiday lighting and decoration typical in residential areas. These decorations enhance the lighting
levels during the holiday season, leading to increased brightness. Urban cores usually have high
absolute light levels year-round due to concentrated commercial and industrial activities, and thus
may not show as significant a change during December as residential areas do due to holiday
decorations [ref_id:1].  The distinct visibility of Phoenix's nighttime street grid from space can
be attributed to its urban planning structure. Phoenix, like other cities in the central and western
United States, has a clear grid-like layout that is accentuated by extensive street lighting
patterns. This setup makes the street grid sharply visible, whereas large stretches of interstate
highways between midwestern cities typically traverse rural areas that lack such dense lighting,
resulting in comparative

## Continue the conversation

This step continues the conversation with the knowledge agent, building upon the previous messages and queries to retrieve relevant information from your search index.

In [None]:
messages.append({
    "role": "user",
    "content": "How do I find lava at night?"
})

retrieval_result = agent_client.retrieve(
    retrieval_request=KnowledgeAgentRetrievalRequest(
        messages=[KnowledgeAgentMessage(role=msg["role"], content=[KnowledgeAgentMessageTextContent(text=msg["content"])]) for msg in messages if msg["role"] != "system"],
        target_index_params=[KnowledgeAgentIndexParams(index_name=index_name, reranker_threshold=2.5)]
    )
)
messages.append({
    "role": "assistant",
    "content": retrieval_result.response[0].content[0].text
})
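
Because every turn repeats the same pattern (append the user question, retrieve, append the grounding text as the assistant message), the pattern can be wrapped in a small function. This is a sketch under assumptions: `run_turn` and `fake_retrieve` are hypothetical names, and the retriever is passed in as a callable so that the example runs without Azure.

```python
def run_turn(messages, question, retrieve):
    """Append a user question, retrieve grounding text, and append it as the assistant turn."""
    messages.append({"role": "user", "content": question})
    # The notebook excludes the system message from the retrieval request.
    retrieval_messages = [m for m in messages if m["role"] != "system"]
    grounding = retrieve(retrieval_messages)
    messages.append({"role": "assistant", "content": grounding})
    return grounding


# Stub retriever so the sketch runs offline. A real one would wrap
# agent_client.retrieve(...) and return response[0].content[0].text.
def fake_retrieve(retrieval_messages):
    return f'[{{"ref_id":0,"content":"stub for {len(retrieval_messages)} message(s)"}}]'


history = [{"role": "system", "content": "instructions"}]
run_turn(history, "How do I find lava at night?", fake_retrieve)
print([m["role"] for m in history])
```

Keeping the system message in `messages` but out of the retrieval request matches the filtering in the notebook's own list comprehension.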

### Review the retrieval response, activity, and results

In [15]:
print("Response")
print(textwrap.fill(retrieval_result.response[0].content[0].text, width=120))

Response
[{"ref_id":0,"content":"## Nature's Light Shows\n\nAt night, with the light of the Sun removed, nature's brilliant glow
from Earth's surface becomes visible to the naked eye from space. Some of Earth's most spectacular light shows are
natural, like the aurora borealis, or Northern Lights, in the Northern Hemisphere (aurora australis, or Southern Lights,
in the Southern Hemisphere). The auroras are natural electrical phenomena caused by charged particles that race from the
Sun toward Earth, inducing chemical reactions in the upper atmosphere and creating the appearance of streamers of
reddish or greenish light in the sky, usually near the northern or southern magnetic pole. Other natural lights can
indicate danger, like a raging forest fire encroaching on a city, town, or community, or lava spewing from an erupting
volcano.\n\nWhatever the source, the ability of humans to monitor nature's light shows at night has practical
applications for society. For example, tracking fires d

In [None]:
import json
print("Activity")
print(json.dumps([a.as_dict() for a in retrieval_result.activity], indent=2))
print("Results")
print(json.dumps([r.as_dict() for r in retrieval_result.references], indent=2))

Activity
[
  {
    "id": 0,
    "type": "ModelQueryPlanning",
    "input_tokens": 2283,
    "output_tokens": 207
  },
  {
    "id": 1,
    "type": "AzureSearchQuery",
    "target_index": "earth_at_night",
    "query": {
      "search": "how to locate lava flows at night"
    },
    "query_time": "2025-05-06T15:44:00.218Z",
    "count": 6,
    "elapsed_ms": 497
  },
  {
    "id": 2,
    "type": "AzureSearchQuery",
    "target_index": "earth_at_night",
    "query": {
      "search": "best practices for observing lava at night"
    },
    "query_time": "2025-05-06T15:44:00.571Z",
    "elapsed_ms": 352
  }
]
Results
[
  {
    "type": "AzureSearchDoc",
    "id": "0",
    "activity_source": 1,
    "doc_key": "earth_at_night_508_page_60_verbalized"
  },
  {
    "type": "AzureSearchDoc",
    "id": "1",
    "activity_source": 1,
    "doc_key": "earth_at_night_508_page_64_verbalized"
  },
  {
    "type": "AzureSearchDoc",
    "id": "2",
    "activity_source": 1,
    "doc_key": "earth_at_night_50

### Generate an answer

In [16]:
response = client.responses.create(
    model=answer_model,
    input=messages
)

wrapped = textwrap.fill(response.output_text, width=100)
print(wrapped)

To find lava at night, you can use satellite imagery and nighttime observation techniques:  1.
**Satellite Imagery**: Satellites equipped with infrared sensors can detect the heat emitted by lava
flows. Instruments like the VIIRS Day/Night Band can capture the glow from active volcanoes using
moonlight and other faint light sources. [ref_id:3]  2. **Thermal Imaging**: Combining thermal data
with infrared wavelengths can reveal hot lava flows and cooling lava. This method is useful for
monitoring volcanic activity even when obstructed by clouds. [ref_id:5]  3. **Observation from
Space**: Active volcanoes like Mount Etna and Kilauea have been observed at night using these
techniques. The visible brightness from hot lava can be distinguished from city lights, making it
easier to identify volcanic activity. [ref_id:2]  These methods are essential for scientific
communities.
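
The generated answers cite sources as `[1]` or `[ref_id:1]`, so you might want to trace each citation back to a source document. The following sketch assumes that the `id` of each entry in the references list corresponds to the `ref_id` cited in the answer. `cited_sources` is a hypothetical helper, and the answer text is made up in the style of the output above.

```python
import re

# Matches both citation styles seen in the generated answers: [1] and [ref_id:1].
CITATION = re.compile(r"\[(?:ref_id:)?(\d+)\]")


def cited_sources(answer, references):
    """Map each cited ref_id in `answer` to a doc_key, or None if unknown."""
    by_id = {str(ref["id"]): ref["doc_key"] for ref in references}
    cited = dict.fromkeys(CITATION.findall(answer))  # Ordered, de-duplicated.
    return {ref_id: by_id.get(ref_id) for ref_id in cited}


# Reference entries copied from the "Results" output; the answer text is made up.
references = [
    {"type": "AzureSearchDoc", "id": "0", "doc_key": "earth_at_night_508_page_60_verbalized"},
    {"type": "AzureSearchDoc", "id": "1", "doc_key": "earth_at_night_508_page_64_verbalized"},
]
answer = "Infrared sensors detect lava heat. [ref_id:1] Volcanoes glow at night. [ref_id:3]"
print(cited_sources(answer, references))
```

A citation that maps to `None` points at a `ref_id` missing from the supplied references, which is worth checking before you show the answer to users.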


## Clean up objects and resources

If you no longer need Azure AI Search or Azure OpenAI, delete them from your Azure subscription. You can also start over by deleting individual objects.

### Delete the knowledge agent

In [17]:
index_client = SearchIndexClient(endpoint=endpoint, credential=credential)
index_client.delete_agent(agent_name)
print(f"Knowledge agent '{agent_name}' deleted successfully")

Knowledge agent 'earth-search-agent' deleted successfully


### Delete the search index

In [18]:
index_client = SearchIndexClient(endpoint=endpoint, credential=credential)
index_client.delete_index(index_name)
print(f"Index '{index_name}' deleted successfully")

Index 'earth_at_night' deleted successfully
