# Agentic retrieval in Azure AI Search (legal use case)

Use this notebook to get started with [agentic retrieval](https://learn.microsoft.com/azure/search/search-agentic-retrieval-concept) in Azure AI Search, which integrates conversation history and large language models (LLMs) on Azure OpenAI to plan, retrieve, and synthesize complex queries.

Steps in this notebook include:

+ Creating a search index.
+ Loading the index with documents from a GitHub URL.
+ Creating an agent in Azure AI Search that points to an LLM for query planning.
+ Using the agent to fetch and rank relevant information from the index.
+ Generating answers using the Azure OpenAI client.

This notebook provides a high-level demonstration of agentic retrieval. For more detailed guidance, see [Quickstart: Run agentic retrieval in Azure AI Search](https://learn.microsoft.com/azure/search/search-get-started-agentic-retrieval).

> Announcement: [Introducing agentic retrieval in Azure AI Search](https://techcommunity.microsoft.com/blog/azure-ai-services-blog/introducing-agentic-retrieval-in-azure-ai-search/4414677)

<img src="img1.jpg">

> This feature is in public preview.

## Architecture overview
Agentic retrieval builds on top of three core components of Azure AI Search:

- Index: Your search index holds both plain text and vectorized content, organized under a semantic configuration. Text fields marked as searchable and retrievable feed the LLM for query planning and grounding, while vector fields support similarity search when you enable a vectorizer.

- Agent: A new top-level resource that links your Azure AI Search service to an Azure OpenAI model. It encapsulates the model’s endpoint, authentication, and default parameters for reranking thresholds, reference inclusion, and runtime limits.

- Retrieval engine: Orchestrates the end-to-end flow: invoking the LLM for query planning, dispatching subqueries to the index in parallel, collecting results, reranking, and packaging the final grounding data along with metadata arrays.

Agentic retrieval marks a departure from traditional search features, and a shift to knowledge retrieval capabilities intentionally designed to ground agents.

## Prerequisites

+ An [Azure AI Search service](https://learn.microsoft.com/azure/search/search-create-service-portal) on the Basic tier or higher with [semantic ranker enabled](https://learn.microsoft.com/azure/search/semantic-how-to-enable-disable).
+ An [Azure OpenAI resource](https://learn.microsoft.com/azure/ai-services/openai/how-to/create-resource).
+ A [supported model](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-create#supported-models) deployed to your Azure OpenAI resource.

## Configure access

This notebook assumes authentication and authorization using Microsoft Entra ID and role assignments. It also assumes that you run the code from your local device.

To configure role-based access:

1. Sign in to the [Azure portal](https://portal.azure.com).
1. [Enable role-based access](https://learn.microsoft.com/azure/search/search-security-enable-roles) on your Azure AI Search service.
1. [Create a system-assigned managed identity](https://learn.microsoft.com/azure/search/search-howto-managed-identities-data-sources#create-a-system-managed-identity) on your Azure AI Search service.
1. On your Azure AI Search service, [assign the following roles](https://learn.microsoft.com/azure/search/search-security-rbac#how-to-assign-roles-in-the-azure-portal) to yourself.

   + **Owner/Contributor** or **Search Service Contributor**
   + **Search Index Data Contributor**
   + **Search Index Data Reader**

1. On your Azure OpenAI resource, assign **Cognitive Services User** to the managed identity of your search service.

## Set up connections

The `azure.env` file contains environment variables for connections to Azure AI Search and Azure OpenAI. Agentic retrieval requires these connections for document retrieval, query planning, query execution, and answer generation.
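
The settings cell later in this notebook reads two variables with `os.environ[...]` and no fallback (`azure_ai_search_endpoint` and `azure_openai_endpoint`), so a missing entry raises a `KeyError`. As a minimal sketch, a check like the following could be run after `load_dotenv("azure.env")` to report what is missing. The helper name `missing_settings` is hypothetical and not part of any SDK.

```python
import os

# Variables that the settings cell reads without a default value.
REQUIRED_SETTINGS = ("azure_ai_search_endpoint", "azure_openai_endpoint")


def missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the required names that are absent or empty in the given mapping."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print(f"Missing settings in azure.env: {', '.join(missing)}")
else:
    print("All required settings are present")
```

The other settings (`AZURE_SEARCH_INDEX`, `AZURE_OPENAI_GPT_DEPLOYMENT`, and so on) are read with `os.getenv` and fall back to defaults, so they are left out of the check.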

In [1]:
#!pip install azure-search-documents==11.6.0b12

In [2]:
import datetime
import json
import langchain
import openai
import os
import pandas as pd
import requests
import sys

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from azure.search.documents import SearchIndexingBufferedSender
from azure.search.documents.agent import KnowledgeAgentRetrievalClient
from azure.search.documents.agent.models import KnowledgeAgentRetrievalRequest, KnowledgeAgentMessage, KnowledgeAgentMessageTextContent, KnowledgeAgentIndexParams
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import AzureOpenAIVectorizer, AzureOpenAIVectorizerParameters, HnswAlgorithmConfiguration, KnowledgeAgent, KnowledgeAgentAzureOpenAIModel, KnowledgeAgentRequestLimits, KnowledgeAgentTargetIndex, SearchField, SearchIndex, SemanticConfiguration, SemanticField, SemanticPrioritizedFields, SemanticSearch, VectorSearch, VectorSearchProfile
from collections import defaultdict
from dotenv import load_dotenv
from openai import AzureOpenAI
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

## Settings

In [3]:
print(f"Python version: {sys.version}")
print(f"OpenAI version: {openai.__version__}")
print(f"Langchain version: {langchain.__version__}")

Python version: 3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0]
OpenAI version: 1.82.1
Langchain version: 0.3.25


In [4]:
print(f"Today is {datetime.datetime.today().strftime('%d-%b-%Y %H:%M:%S')}")

Today is 02-Jun-2025 08:09:34


In [5]:
load_dotenv("azure.env")

True

In [6]:
# Azure AI Search
credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(credential, "https://search.azure.com/.default")
azure_ai_search_endpoint = os.environ["azure_ai_search_endpoint"]
index_name = os.getenv("AZURE_SEARCH_INDEX", "agent-rag-index-demo")

# Chat model for query planning and answers (gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-nano, or gpt-4.1-mini)
aoai_endpoint = os.environ["azure_openai_endpoint"]  # model endpoint
aoai_gpt_deployment = os.getenv("AZURE_OPENAI_GPT_DEPLOYMENT", "gpt-4.1-mini")  # model deployment name
aoai_gpt_model = os.getenv("AZURE_OPENAI_GPT_MODEL", "gpt-4.1-mini")  # model name
aoai_api_version = os.getenv("AZURE_OPENAI_API_VERSION", "2025-03-01-preview")  # api version

# Embeddings
aoai_embedding_deployment = os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT", "text-embedding-3-large")  # model deployment name
aoai_embedding_model = os.getenv("AZURE_OPENAI_EMBEDDING_MODEL", "text-embedding-3-large")  # model name

# Agent
agent_name = os.getenv("AZURE_SEARCH_AGENT_NAME", "agent-rag-demo")  # agent name
api_version = "2025-05-01-Preview"  # api version for agentic rag

## File processing

In [7]:
file_name = "doc.pdf"

!wget https://www.equalrightstrust.org/ertdocumentbank/french_penal_code_33.pdf -O $file_name
!ls $file_name -lh

--2025-06-02 08:09:35--  https://www.equalrightstrust.org/ertdocumentbank/french_penal_code_33.pdf
Resolving www.equalrightstrust.org (www.equalrightstrust.org)... 193.37.35.248
Connecting to www.equalrightstrust.org (www.equalrightstrust.org)|193.37.35.248|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 331652 (324K) [application/pdf]
Saving to: ‘doc.pdf’


2025-06-02 08:09:36 (10.8 MB/s) - ‘doc.pdf’ saved [331652/331652]

-rwxrwxrwx 1 root root 324K Jun  2 08:09 doc.pdf


In [8]:
loader = PyPDFLoader(file_name)
pages = loader.load()

print(f"Document {file_name} has {len(pages)} pages")

Document doc.pdf has 132 pages


In [9]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=2000,
    chunk_overlap=200,
    length_function=len,
    is_separator_regex=False,
)

chunks = text_splitter.split_documents(pages)

transcripts = []
page_numbers = []
ids = []

# Group chunks by page_number
grouped_chunks = defaultdict(list)
for chunk in chunks:
    page_number = int(chunk.metadata["page_label"])
    grouped_chunks[page_number].append(chunk)

for page_number in sorted(grouped_chunks):
    for idx, chunk in enumerate(grouped_chunks[page_number], start=1):
        transcript = chunk.page_content
        unique_id = f"{os.path.splitext(file_name)[0]}_{page_number:03}_{idx:02}"
        transcripts.append(transcript)
        page_numbers.append(page_number)
        ids.append(unique_id)
        
df = pd.DataFrame({
    "text": transcripts,
    "page_number": page_numbers,
    "id": ids
})

df['page_number'] = df['page_number'].astype('int32')
df = df[['id', 'text', 'page_number']]

In [10]:
df

| | id | text | page_number |
|---|---|---|---|
| 0 | doc_001_01 | PENAL CODE\nPENAL CODE\nWith the participation... | 1 |
| 1 | doc_001_02 | ARTICLE 112-1\n Conduct is punishable on... | 1 |
| 2 | doc_002_01 | PENAL CODE\nARTICLE 112-4\n The immediat... | 2 |
| 3 | doc_002_02 | ARTICLE 113-5\n French criminal law is a... | 2 |
| 4 | doc_002_03 | misdemeanour subject to a penalty of at least ... | 2 |
| ... | ... | ... | ... |
| 391 | doc_131_01 | PENAL CODE\n€30,000.\n Proceeding to a p... | 131 |
| 392 | doc_131_02 | particularly serious disease."\nARTICLE 726-15... | 131 |
| 393 | doc_131_03 | means to cause or attempt to cause an artifici... | 131 |
| 394 | doc_132_01 | PENAL CODE\n 1° A fine, in the manner pr... | 132 |


In [11]:
len(df)

396
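
Each chunk ID above combines the file stem, a zero-padded page number, and a zero-padded chunk position, and the `id` field later serves as the document key in the index. Azure AI Search keys may contain only letters, digits, underscores, dashes, and equal signs, so a file name with spaces or dots in the stem would produce an invalid key. The sketch below reproduces the ID scheme and adds that validation. The function name `make_chunk_id` is hypothetical.

```python
import os
import re

# Characters allowed in an Azure AI Search document key.
_VALID_KEY = re.compile(r"^[A-Za-z0-9_\-=]+$")


def make_chunk_id(file_name, page_number, chunk_index):
    """Build an ID such as 'doc_068_02' and check that it is usable as a document key."""
    stem = os.path.splitext(file_name)[0]
    chunk_id = f"{stem}_{page_number:03}_{chunk_index:02}"
    if not _VALID_KEY.match(chunk_id):
        raise ValueError(f"'{chunk_id}' contains characters that are not allowed in a document key")
    return chunk_id


print(make_chunk_id("doc.pdf", 68, 2))
```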

## Create the Azure OpenAI client

In [12]:
azure_openai_token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")

aoai_client = AzureOpenAI(azure_endpoint=aoai_endpoint,
                          azure_ad_token_provider=azure_openai_token_provider,
                          api_version=aoai_api_version)

## Create an index in Azure AI Search

This step creates a search index that contains plain text and vector content. You can use an existing index, but it must meet the criteria for [agentic retrieval workloads](https://learn.microsoft.com/azure/search/search-agentic-retrieval-how-to-index). The primary schema requirement is a semantic configuration with a `default_configuration_name`.

In [13]:
from azure.search.documents.indexes.models import SearchIndex, SearchField, VectorSearch, VectorSearchProfile, HnswAlgorithmConfiguration, AzureOpenAIVectorizer, AzureOpenAIVectorizerParameters, SemanticSearch, SemanticConfiguration, SemanticPrioritizedFields, SemanticField
from azure.search.documents.indexes import SearchIndexClient

index = SearchIndex(
    name=index_name,
    fields=[
        SearchField(name="id", type="Edm.String", key=True, filterable=True, sortable=True, facetable=True),
        SearchField(name="text", type="Edm.String", filterable=False, sortable=False, facetable=False),
        SearchField(name="vectorembeddings", type="Collection(Edm.Single)", stored=False, vector_search_dimensions=3072, vector_search_profile_name="hnsw_text_3_large"),
        SearchField(name="page_number", type="Edm.Int32", filterable=True, sortable=True, facetable=True)
    ],
 
    vector_search=VectorSearch(
        profiles=[VectorSearchProfile(name="hnsw_text_3_large", algorithm_configuration_name="alg", vectorizer_name="azure_openai_text_3_large")],
        algorithms=[HnswAlgorithmConfiguration(name="alg")],
        vectorizers=[
            AzureOpenAIVectorizer(
                vectorizer_name="azure_openai_text_3_large",
                parameters=AzureOpenAIVectorizerParameters(
                    resource_url=aoai_endpoint,
                    deployment_name=aoai_embedding_deployment,
                    model_name=aoai_embedding_model,
                )
            )
        ]
    ),
    
    semantic_search=SemanticSearch(
        default_configuration_name="semantic_config",
        configurations=[
            SemanticConfiguration(
                name="semantic_config",
                prioritized_fields=SemanticPrioritizedFields(
                    content_fields=[
                        SemanticField(field_name="text")
                    ]
                )
            )
        ]
    )
)

index_client = SearchIndexClient(endpoint=azure_ai_search_endpoint, credential=credential)
index_client.create_or_update_index(index)

print(f"Index '{index_name}' created or updated successfully")

Index 'agent-rag-index-demo' created or updated successfully
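
If you reuse an existing index instead of creating one, it can help to confirm the semantic configuration requirement before building an agent on top of it. The following is a minimal sketch, not an official validation routine: `semantic_config_problems` is a hypothetical helper that only inspects the `semantic_search`, `default_configuration_name`, and `configurations` attributes used in the cell above.

```python
from types import SimpleNamespace


def semantic_config_problems(index):
    """Return a list of human-readable problems with the index's semantic settings."""
    semantic = getattr(index, "semantic_search", None)
    if semantic is None:
        return ["index has no semantic_search section"]
    default = getattr(semantic, "default_configuration_name", None)
    if not default:
        return ["no default_configuration_name is set"]
    names = [config.name for config in (semantic.configurations or [])]
    if default not in names:
        return [f"default configuration '{default}' is not defined"]
    return []


# Stand-in object for illustration; in the notebook you would pass a SearchIndex.
example = SimpleNamespace(
    semantic_search=SimpleNamespace(
        default_configuration_name="semantic_config",
        configurations=[SimpleNamespace(name="semantic_config")],
    )
)
print(semantic_config_problems(example))
```

In the notebook, the same function could be called as `semantic_config_problems(index_client.get_index(index_name))`.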


In [14]:
documents_json = df.to_json(orient='records')
documents = json.loads(documents_json)

with SearchIndexingBufferedSender(endpoint=azure_ai_search_endpoint,
                                  index_name=index_name,
                                  credential=credential) as client:
    client.upload_documents(documents=documents)

print(f"Documents uploaded to Azure AI Search index: '{index_name}'")

Documents uploaded to Azure AI Search index: 'agent-rag-index-demo'
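
Note that the upload cell sends only `id`, `text`, and `page_number`, so the `vectorembeddings` field is not populated by this notebook as written. If you want vector content in the index, one option is to compute embeddings before uploading. The sketch below shows the batching logic only and makes several assumptions: `attach_embeddings` is a hypothetical helper, `embed_fn` is any callable that maps a list of strings to a list of vectors, and the batch size of 16 is arbitrary.

```python
def attach_embeddings(documents, embed_fn, batch_size=16,
                      text_field="text", vector_field="vectorembeddings"):
    """Return copies of the documents with an embedding vector added to each one."""
    enriched = []
    for start in range(0, len(documents), batch_size):
        batch = documents[start:start + batch_size]
        vectors = embed_fn([doc[text_field] for doc in batch])
        if len(vectors) != len(batch):
            raise ValueError("embed_fn must return one vector per input text")
        for doc, vector in zip(batch, vectors):
            # Copy the document so the caller's list is left unchanged.
            enriched.append({**doc, vector_field: list(vector)})
    return enriched


# With the Azure OpenAI client from this notebook, embed_fn could look like this
# (left commented out because it needs a live endpoint):
#
# def embed_fn(texts):
#     result = aoai_client.embeddings.create(model=aoai_embedding_deployment, input=texts)
#     return [item.embedding for item in result.data]
#
# documents = attach_embeddings(documents, embed_fn)
```

The vectors must have the same length as the `vector_search_dimensions` value declared on the field (3072 in this index).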


## Create an agent in Azure AI Search

This step creates a search agent, which acts as a wrapper for the LLM you deployed to Azure OpenAI. The LLM is used to send queries to an agentic retrieval pipeline.

In [15]:
agent = KnowledgeAgent(
    name=agent_name,
    models=[
        KnowledgeAgentAzureOpenAIModel(
            azure_open_ai_parameters=AzureOpenAIVectorizerParameters(
                resource_url=aoai_endpoint,
                deployment_name=aoai_gpt_deployment,
                model_name=aoai_gpt_model)
        )
    ],
    target_indexes=[
        KnowledgeAgentTargetIndex(
            index_name=index_name,
            default_reranker_threshold=2.5,
            default_include_reference_source_data=True,
        )
    ],
)

index_client.create_or_update_agent(agent)
print(f"Knowledge agent: '{agent_name}' created or updated successfully")

Knowledge agent: 'agent-rag-demo' created or updated successfully


## Set up messages

Messages are the input for the retrieval route and contain the conversation history. Each message includes a `role` that indicates its origin, such as `assistant` or `user`, and `content` in natural language. The LLM you use determines which roles are valid.

In [16]:
instructions = """
You are a Q&A legal agent capable of answering questions related to a legal document.
The sources are provided in JSON format and include a "doc_key" reference that must be cited in the responses. 
If the answer is not available, the agent should reply with "I DO NOT KNOW".
Always add this sentence at the end of the answer: 'Note: This answer was generated by an AI'.
"""

messages = [{"role": "system", "content": instructions}]
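
The retrieval cells later in this notebook pass every message except the system prompt to the agent, and they repeat the same filtering expression each time. A small helper can make that step explicit. This is an illustration only: `conversation_for_retrieval` is a hypothetical name, and it returns plain dictionaries rather than SDK objects.

```python
def conversation_for_retrieval(messages, excluded_roles=("system",)):
    """Return the role and text of each message that should be sent to the agent."""
    return [
        {"role": message["role"], "text": message["content"]}
        for message in messages
        if message["role"] not in excluded_roles
    ]


history = [
    {"role": "system", "content": "You are a Q&A legal agent."},
    {"role": "user", "content": "What is article 313-8?"},
]

for turn in conversation_for_retrieval(history):
    print(turn["role"], "->", turn["text"])
```

In the notebook, each returned item would then be wrapped as `KnowledgeAgentMessage(role=turn["role"], content=[KnowledgeAgentMessageTextContent(text=turn["text"])])`.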

## Use agentic retrieval to fetch results

This step runs the retrieval pipeline to extract relevant information from your search index. Based on the messages and parameters on the retrieval request, the LLM:

1. Analyzes the entire conversation history to determine the underlying information need.
1. Breaks down the compound user query into focused subqueries.
1. Runs each subquery simultaneously against text fields and vector embeddings in your index.
1. Uses semantic ranker to rerank the results of all subqueries.
1. Merges the results into a single string.
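
To make steps 3 to 5 concrete, here is a toy, local-only sketch of the fan-out and merge. It is not how the service is implemented: query planning (steps 1 and 2) is skipped and the subqueries are passed in directly, `search_fn` is a hypothetical stand-in for a search call, the scores are made up, and treating the threshold as an inclusive minimum is an assumption.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def run_toy_pipeline(subqueries, search_fn, reranker_threshold=2.5):
    """Run subqueries in parallel, keep the best-scoring hits, and merge them into one string."""
    # Step 3: dispatch all subqueries at once.
    with ThreadPoolExecutor() as pool:
        result_sets = list(pool.map(search_fn, subqueries))

    # Step 4: keep the best score per document and drop hits below the threshold.
    best = {}
    for hits in result_sets:
        for hit in hits:
            key = hit["doc_key"]
            if hit["score"] < reranker_threshold:
                continue
            if key not in best or hit["score"] > best[key]["score"]:
                best[key] = hit
    ranked = sorted(best.values(), key=lambda hit: hit["score"], reverse=True)

    # Step 5: merge the surviving content into a single JSON string.
    return json.dumps(
        [{"ref_id": position, "content": hit["content"]} for position, hit in enumerate(ranked)]
    )


def fake_search(query):
    """Return canned hits so the sketch runs without a search service."""
    canned = {
        "penalties for theft": [
            {"doc_key": "doc_067_01", "score": 3.1, "content": "theft text"},
            {"doc_key": "doc_010_01", "score": 1.0, "content": "unrelated text"},
        ],
        "penalties for murder": [
            {"doc_key": "doc_029_02", "score": 3.4, "content": "murder text"},
        ],
    }
    return canned.get(query, [])


print(run_toy_pipeline(["penalties for theft", "penalties for murder"], fake_search))
```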

In [17]:
agent_client = KnowledgeAgentRetrievalClient(endpoint=azure_ai_search_endpoint,
                                             agent_name=agent_name,
                                             credential=credential)

In [18]:
messages.append({
    "role": "user",
    "content": """
    What are the penalties for theft? For murder? Identity theft?
    """
})

In [19]:
retrieval_result = agent_client.retrieve(
    retrieval_request=KnowledgeAgentRetrievalRequest(
        messages=[
            KnowledgeAgentMessage(role=msg["role"],
                                  content=[
                                      KnowledgeAgentMessageTextContent(
                                          text=msg["content"])
                                  ]) for msg in messages
            if msg["role"] != "system"
        ],
        target_index_params=[
            KnowledgeAgentIndexParams(index_name=index_name,
                                      reranker_threshold=2.5)
        ]))

In [20]:
messages.append({
    "role": "assistant",
    "content": retrieval_result.response[0].content[0].text
})

### Review the retrieval response, activity, and results

Each retrieval response from Azure AI Search includes:

+ A unified string that represents grounding data from the search results.
+ The query plan.
+ Reference data that shows which chunks of the source documents contributed to the unified string.
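
Judging by the output printed in the following cells, the unified string is a JSON array of objects with `ref_id` and `content`, and each reference carries a string `id` and a `doc_key`. Under the assumption that `ref_id` matches the reference `id`, the two can be joined so that every grounding passage carries its source key. `cite_sources` is a hypothetical helper, and the sample values are shortened placeholders.

```python
import json


def cite_sources(grounding_text, references):
    """Pair each grounding passage with the doc_key of the reference that has the same ID."""
    doc_keys = {int(reference["id"]): reference["doc_key"] for reference in references}
    return [
        {
            "ref_id": item["ref_id"],
            "doc_key": doc_keys.get(item["ref_id"]),
            "content": item["content"],
        }
        for item in json.loads(grounding_text)
    ]


# Shortened sample data in the same shape as the notebook output.
sample_text = '[{"ref_id": 0, "content": "Theft is punished by ..."}]'
sample_references = [{"type": "AzureSearchDoc", "id": "0", "doc_key": "doc_068_02"}]

for passage in cite_sources(sample_text, sample_references):
    print(passage["doc_key"], "->", passage["content"])
```

In the notebook, the inputs would be `retrieval_result.response[0].content[0].text` and `[r.as_dict() for r in retrieval_result.references]`.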

In [21]:
print("\033[1;31;34m")
print(retrieval_result.response[0].content[0].text)

[{"ref_id":0,"content":"half if, by alerting the legal or administrative authorities, he has allowed the offence which is underway to be stopped, or\nhas prevented it from resulting in loss of life or permanent disability, and, where relevant, has identified any other\nperpetrators or accomplices.\nARTICLE 311-10\n(Ordinance no. 2000-916 of 19 September 2000 Article 3 Official Journal of 22 September 2000 in force 1 January\n2002)\n       Theft is punished by criminal imprisonment for life and a fine of €150,000 where it is preceded, accompanied or\nfollowed either by violence causing death, or acts of torture or barbarity.\n       The first two paragraphs of article 132-3 governing the safety period are applicable to the offence referred to under\nthe present article.\nARTICLE 311-11\n       For the purpose of articles 311-4, 311-5, 311-6, 311-7, 311-9 and 311-10 theft followed by acts of violence\ncommitted to assist an escape or to ensure the impunity of a perpetrator or 

In [22]:
print("***** Activity *****")
print("\033[1;31;34m")
print(json.dumps([a.as_dict() for a in retrieval_result.activity], indent=5))

***** Activity *****
[
     {
          "id": 0,
          "type": "ModelQueryPlanning",
          "input_tokens": 1228,
          "output_tokens": 273
     },
     {
          "id": 1,
          "type": "AzureSearchQuery",
          "target_index": "agent-rag-index-demo",
          "query": {
               "search": "penalties for theft"
          },
          "query_time": "2025-06-02T08:10:21.290Z",
          "count": 6,
          "elapsed_ms": 540
     },
     {
          "id": 2,
          "type": "AzureSearchQuery",
          "target_index": "agent-rag-index-demo",
          "query": {
               "search": "penalties for murder"
          },
          "query_time": "2025-06-02T08:10:21.477Z",
          "count": 4,
          "elapsed_ms": 165
     },
     {
          "id": 3,
          "type": "AzureSearchQuery",
          "target_index": "agent-rag-index-demo",
          "query": {
               "search": "penalties for identity theft"
          },
          "que

In [23]:
print("***** Results *****")
print("\033[1;31;34m")
print(json.dumps([r.as_dict() for r in retrieval_result.references], indent=5))

***** Results *****
[
     {
          "type": "AzureSearchDoc",
          "id": "0",
          "activity_source": 1,
          "doc_key": "doc_068_02",
          "source_data": {
               "id": "doc_068_02",
               "text": "half if, by alerting the legal or administrative authorities, he has allowed the offence which is underway to be stopped, or\nhas prevented it from resulting in loss of life or permanent disability, and, where relevant, has identified any other\nperpetrators or accomplices.\nARTICLE 311-10\n(Ordinance no. 2000-916 of 19 September 2000 Article 3 Official Journal of 22 September 2000 in force 1 January\n2002)\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0Theft is punished by criminal imprisonment for life and a fine of \u20ac150,000 where it is preceded, accompanied or\nfollowed either by violence causing death, or acts of torture or barbarity.\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0The first two paragraphs of article 132-3 governing the saf

In [24]:
data = [r.as_dict() for r in retrieval_result.references]
sorted_data = sorted(data, key=lambda x: x["doc_key"])
print("Sources:")
print("\033[1;31;34m")

for item in sorted_data:
    # A doc_key such as "doc_068_02" encodes <file stem>_<page>_<chunk on that page>.
    source = item["doc_key"].split("_")
    print(f"Page = {source[1]} chunk page: {source[2]}")

Sources:
Page = 029 chunk page: 02
Page = 030 chunk page: 01
Page = 030 chunk page: 02
Page = 055 chunk page: 01
Page = 067 chunk page: 01
Page = 067 chunk page: 02
Page = 067 chunk page: 03
Page = 068 chunk page: 01
Page = 068 chunk page: 02
Page = 087 chunk page: 01
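
When several chunks come from the same page, a grouped view is easier to scan than one line per chunk. The sketch below groups document keys by page number. `pages_from_doc_keys` is a hypothetical helper, and it assumes the `<file stem>_<page>_<chunk>` key format used in this notebook.

```python
from collections import defaultdict


def pages_from_doc_keys(doc_keys):
    """Map each page number to the sorted chunk positions retrieved from that page."""
    pages = defaultdict(list)
    for key in sorted(doc_keys):
        # Split from the right so that a file stem containing underscores still works.
        _, page, chunk = key.rsplit("_", 2)
        pages[int(page)].append(int(chunk))
    return dict(pages)


keys = ["doc_068_02", "doc_067_01", "doc_067_03", "doc_067_02", "doc_029_02"]
for page, chunk_positions in pages_from_doc_keys(keys).items():
    print(f"Page {page}: chunks {chunk_positions}")
```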


### Use the Responses API to generate an answer

For details, see the [Responses API documentation](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/responses?tabs=python-secure).

In [25]:
response = aoai_client.responses.create(
    model=aoai_gpt_deployment,
    input=messages,
    temperature=0.7,
)

print("\033[1;31;34m")
print(response.output_text)

The penalties for theft vary depending on the circumstances:

- Simple theft is punished by three years' imprisonment and a fine of €45,000 (Article 311-3).
- Theft committed by two or more people as perpetrators or accomplices is punished by five years' imprisonment and a fine of €75,000 (Article 311-4).
- Theft preceded, accompanied, or followed by acts of violence causing a maximum total incapacity to work of eight days is punished by seven years' imprisonment and a fine of €100,000 (Article 311-5).
- Theft preceded, accompanied, or followed by acts of violence causing a total incapacity to work of more than eight days is punished by ten years' imprisonment and a fine of €150,000 (Article 311-6).
- Theft preceded, accompanied, or followed by acts of violence causing mutilation or permanent disability is punished by fifteen years' criminal imprisonment and a fine of €150,000 (Article 311-7).
- Theft committed with the use or threatened use of a weapon is punished by twenty

In [26]:
response

Response(id='resp_683d5c6e799481908c2d0bacd6a1718d0a0882e8cb9726e3', created_at=1748851822.0, error=None, incomplete_details=None, instructions=None, metadata={}, model='gpt-4.1-mini', object='response', output=[ResponseOutputMessage(id='msg_683d5c711cdc8190bca898549a3e25250a0882e8cb9726e3', content=[ResponseOutputText(annotations=[], text="The penalties for theft vary depending on the circumstances:\n\n- Simple theft is punished by three years' imprisonment and a fine of €45,000 (Article 311-3).\n- Theft committed by two or more people as perpetrators or accomplices is punished by five years' imprisonment and a fine of €75,000 (Article 311-4).\n- Theft preceded, accompanied, or followed by acts of violence causing a maximum total incapacity to work of eight days is punished by seven years' imprisonment and a fine of €100,000 (Article 311-5).\n- Theft preceded, accompanied, or followed by acts of violence causing a total incapacity to work of more than eight days is punished by ten yea

In [27]:
response.model

'gpt-4.1-mini'

In [28]:
response.usage

ResponseUsage(input_tokens=4634, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=532, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=5166)

## Continue the conversation

This step continues the conversation with the search agent, building upon the previous messages and queries to retrieve relevant information from your search index.

In [29]:
messages.append({
    "role": "user",
    "content": "What are the potential imprisonment consequences for a driver who unintentionally causes a personal injury that results in the victim being unable to work for four months?"
})

retrieval_result = agent_client.retrieve(
    retrieval_request=KnowledgeAgentRetrievalRequest(
        messages=[
            KnowledgeAgentMessage(role=msg["role"],
                                  content=[
                                      KnowledgeAgentMessageTextContent(
                                          text=msg["content"])
                                  ]) for msg in messages
            if msg["role"] != "system"
        ],
        target_index_params=[
            KnowledgeAgentIndexParams(index_name=index_name,
                                      reranker_threshold=2.5)
        ]))

messages.append({
    "role": "assistant",
    "content": retrieval_result.response[0].content[0].text
})

### Review the retrieval response, activity, and results

In [30]:
print("***** Response *****")
print("\033[1;31;34m")
print(retrieval_result.response[0].content[0].text)

***** Response *****
[{"ref_id":0,"content":"mission or against a health professional in the exercise of his duties, where the status of the victim is apparent or known\nto the perpetrator;\n       5° against a witness, a victim or civil party, either to prevent him from denouncing the action, filing a complaint or\nmaking a statement before a court, or because of his denunciation, complaint or statement;\n       5°bis because of the victim's actual or supposed membership or non-membership of a given ethnic group, nation,\nrace or religion;\n       5°ter because of the sexual orientation of the victim;\n       6° by the spouse or cohabitee of the victim;\n       7° by a person holding public authority or discharging a public service mission, in the exercise or at the occasion of\nthe exercise of the functions or mission;\n       8° by two or more acting as perpetrators or accomplices;\n       9° with premeditation;\n       10° with the use or threatened use of a weapon.\n   

In [31]:
print("***** Activity *****")
print("\033[1;31;34m")
print(json.dumps([a.as_dict() for a in retrieval_result.activity], indent=5))

***** Activity *****
[
     {
          "id": 0,
          "type": "ModelQueryPlanning",
          "input_tokens": 5793,
          "output_tokens": 240
     },
     {
          "id": 1,
          "type": "AzureSearchQuery",
          "target_index": "agent-rag-index-demo",
          "query": {
               "search": "Imprisonment penalties for unintentional personal injury causing four months incapacity to work"
          },
          "query_time": "2025-06-02T08:11:24.024Z",
          "count": 3,
          "elapsed_ms": 5943
     },
     {
          "id": 2,
          "type": "AzureSearchSemanticRanker",
          "input_tokens": 23189
     }
]
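
The activity array is easier to compare across turns when it is reduced to a few totals. The following sketch aggregates the record types that appear in the output above (`ModelQueryPlanning`, `AzureSearchQuery`, and `AzureSearchSemanticRanker`). `summarize_activity` is a hypothetical helper, and the key names in its result are chosen for illustration.

```python
def summarize_activity(activity):
    """Aggregate token counts, subqueries, and search timings from activity records."""
    summary = {
        "planning_tokens": 0,
        "ranker_tokens": 0,
        "queries": [],
        "result_count": 0,
        "search_ms": 0,
    }
    for step in activity:
        step_type = step.get("type")
        if step_type == "ModelQueryPlanning":
            summary["planning_tokens"] += step.get("input_tokens", 0) + step.get("output_tokens", 0)
        elif step_type == "AzureSearchQuery":
            summary["queries"].append(step["query"]["search"])
            summary["result_count"] += step.get("count", 0)
            summary["search_ms"] += step.get("elapsed_ms", 0)
        elif step_type == "AzureSearchSemanticRanker":
            summary["ranker_tokens"] += step.get("input_tokens", 0)
    return summary


# Records copied from the activity output above, with the query text shortened.
sample_activity = [
    {"id": 0, "type": "ModelQueryPlanning", "input_tokens": 5793, "output_tokens": 240},
    {"id": 1, "type": "AzureSearchQuery", "query": {"search": "Imprisonment penalties ..."},
     "count": 3, "elapsed_ms": 5943},
    {"id": 2, "type": "AzureSearchSemanticRanker", "input_tokens": 23189},
]
print(summarize_activity(sample_activity))
```

In the notebook, the input would be `[a.as_dict() for a in retrieval_result.activity]`.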


In [32]:
print("***** Results *****")
print("\033[1;31;34m")
print(json.dumps([r.as_dict() for r in retrieval_result.references], indent=5))

***** Results *****
[
     {
          "type": "AzureSearchDoc",
          "id": "0",
          "activity_source": 1,
          "doc_key": "doc_036_03",
          "source_data": {
               "id": "doc_036_03",
               "text": "mission or against a health professional in the exercise of his duties, where the status of the victim is apparent or known\nto the perpetrator;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a05\u00b0 against a witness, a victim or civil party, either to prevent him from denouncing the action, filing a complaint or\nmaking a statement before a court, or because of his denunciation, complaint or statement;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a05\u00b0bis because of the victim's actual or supposed membership or non-membership of a given ethnic group, nation,\nrace or religion;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a05\u00b0ter because of the sexual orientation of the victim;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a06\u00b0 by the spouse 

In [33]:
data = [r.as_dict() for r in retrieval_result.references]
sorted_data = sorted(data, key=lambda x: x["doc_key"])
print("Sources:")
print("\033[1;31;34m")

for item in sorted_data:
    # A doc_key such as "doc_036_03" encodes <file stem>_<page>_<chunk on that page>.
    source = item["doc_key"].split("_")
    print(f"Page = {source[1]} chunk page: {source[2]}")

Sources:
Page = 036 chunk page: 02
Page = 036 chunk page: 03
Page = 039 chunk page: 01


## Generate answer

In [34]:
response = aoai_client.responses.create(model=aoai_gpt_deployment,
                                        temperature=0.7,
                                        top_p=0.95,
                                        input=messages)

print("\033[1;31;34m")
print(response.output_text)

The penalties for a driver who unintentionally causes a personal injury that results in the victim being unable to work for four months are not explicitly stated for that exact duration in the provided excerpts. However, based on the information about acts of violence causing incapacity to work, the following can be inferred:

- Acts of violence causing an incapacity to work of eight days or less, or causing no incapacity to work, are punished by three years' imprisonment and a fine of €45,000 under certain conditions (e.g., against vulnerable persons or public officials) [ref_id: 2].
- The penalties are increased to five years' imprisonment and a fine of €75,000 where the unintended personal injury is committed with two or more aggravating circumstances (such as driving under the influence, excessive speed, or lack of valid license) [ref_id: 1].

Since the injury in question results in a total incapacity to work for four months (which is more than eight days), the applicabl

In [35]:
response.usage

ResponseUsage(input_tokens=6067, input_tokens_details=InputTokensDetails(cached_tokens=0), output_tokens=272, output_tokens_details=OutputTokensDetails(reasoning_tokens=0), total_tokens=6339)

## More examples

In [36]:
messages.append({
    "role": "user",
    "content": "What is article 313-8?"
})

retrieval_result = agent_client.retrieve(
    retrieval_request=KnowledgeAgentRetrievalRequest(
        messages=[
            KnowledgeAgentMessage(role=msg["role"],
                                  content=[
                                      KnowledgeAgentMessageTextContent(
                                          text=msg["content"])
                                  ]) for msg in messages
            if msg["role"] != "system"
        ],
        target_index_params=[
            KnowledgeAgentIndexParams(index_name=index_name,
                                      reranker_threshold=2.5)
        ]))
messages.append({
    "role": "assistant",
    "content": retrieval_result.response[0].content[0].text
})

In [37]:
print("***** Activity *****")
print("\033[1;31;34m")
print(json.dumps([a.as_dict() for a in retrieval_result.activity], indent=5))

***** Activity *****
[
     {
          "id": 0,
          "type": "ModelQueryPlanning",
          "input_tokens": 7212,
          "output_tokens": 149
     },
     {
          "id": 1,
          "type": "AzureSearchQuery",
          "target_index": "agent-rag-index-demo",
          "query": {
               "search": "Article 313-8 Penal Code"
          },
          "query_time": "2025-06-02T08:11:37.784Z",
          "count": 2,
          "elapsed_ms": 209
     },
     {
          "id": 2,
          "type": "AzureSearchSemanticRanker",
          "input_tokens": 24215
     }
]


In [38]:
print("***** Results *****")
print("\033[1;31;34m")
print(json.dumps([r.as_dict() for r in retrieval_result.references], indent=5))

***** Results *****
[
     {
          "type": "AzureSearchDoc",
          "id": "0",
          "activity_source": 1,
          "doc_key": "doc_073_01",
          "source_data": {
               "id": "doc_073_01",
               "text": "PENAL CODE\ncommitted, for a maximum period of five years;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a03\u00b0 closure, for a maximum period of five years, of the business premises or of one or more of the premises of the\nenterprise used to carry out the criminal behaviour;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a04\u00b0 confiscation of the thing which was used or was intended for use in the commission of the offence or of the thing\nwhich is the product of it, with the exception of articles subject to restitution;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a05\u00b0 area banishment pursuant to the conditions set out under article 131-31;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a06\u00b0 prohibition to draw cheques, except those allowing 

In [39]:
data = [r.as_dict() for r in retrieval_result.references]
sorted_data = sorted(data, key=lambda x: x["doc_key"])
print("Sources:")
print("\033[1;31;34m")

for item in sorted_data:
    # doc_key has the form "doc_<page>_<chunk>", for example "doc_073_01"
    source = item["doc_key"].split("_")
    page, chunk = source[1], source[2]
    print(f"Page = {page} chunk page: {chunk}")

Sources:
Page = 072 chunk page: 02
Page = 073 chunk page: 01
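
The source listing relies on the document keys following a `doc_<page>_<chunk>` pattern. The following is a small sketch of a stricter parser that validates this pattern and groups chunk numbers by page. The pattern is an assumption inferred from the keys printed above, and `parse_doc_key` and `group_chunks_by_page` are hypothetical helper names.

```python
import re

# Assumed key layout, inferred from keys such as "doc_073_01".
DOC_KEY_PATTERN = re.compile(r"doc_(\d+)_(\d+)")


def parse_doc_key(doc_key):
    """Return (page, chunk) as integers, or raise ValueError on other formats."""
    match = DOC_KEY_PATTERN.fullmatch(doc_key)
    if match is None:
        raise ValueError(f"Unexpected doc_key format: {doc_key!r}")
    return int(match.group(1)), int(match.group(2))


def group_chunks_by_page(doc_keys):
    """Map each page number to the sorted list of chunk numbers found on it."""
    pages = {}
    for page, chunk in sorted(parse_doc_key(key) for key in doc_keys):
        pages.setdefault(page, []).append(chunk)
    return pages


# Keys copied from the references returned for the first question.
pages = group_chunks_by_page(["doc_073_01", "doc_072_02"])
print(pages)
```

Parsing into integers sorts pages numerically rather than as strings, which matters only if the keys are not zero-padded consistently.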


In [40]:
response = aoai_client.responses.create(model=aoai_gpt_deployment,
                                        temperature=0.7,
                                        input=messages)

print("\033[1;31;34m")
print(response.output_text)

Article 313-8 states that natural persons convicted of any of the misdemeanours referred to under articles 313-1, 313-2, 313-6, and 313-6-1 also incur disqualification from public tenders for a maximum period of five years. 

Note: This answer was generated by an AI.


In [41]:
messages.append({
    "role": "user",
    "content": "What penalties might a driver face?"
})

retrieval_result = agent_client.retrieve(
    retrieval_request=KnowledgeAgentRetrievalRequest(
        messages=[
            KnowledgeAgentMessage(role=msg["role"],
                                  content=[
                                      KnowledgeAgentMessageTextContent(
                                          text=msg["content"])
                                  ]) for msg in messages
            if msg["role"] != "system"
        ],
        target_index_params=[
            KnowledgeAgentIndexParams(index_name=index_name,
                                      reranker_threshold=2.5)
        ]))
messages.append({
    "role": "assistant",
    "content": retrieval_result.response[0].content[0].text
})
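
The request above forwards every chat turn except the system prompt, wrapping each message's text in a one-element content list. The following sketch shows that filtering and nesting with plain dictionaries only. The dictionary shape is an illustrative stand-in for `KnowledgeAgentMessage` and `KnowledgeAgentMessageTextContent`, not the SDK's wire format, and `to_retrieval_messages` and `sample_messages` are hypothetical.

```python
def to_retrieval_messages(messages):
    """Drop system messages and wrap each text in a one-element content list."""
    return [
        {"role": msg["role"], "content": [{"text": msg["content"]}]}
        for msg in messages
        if msg["role"] != "system"
    ]


# Hypothetical conversation history in the same shape as `messages`.
sample_messages = [
    {"role": "system", "content": "Answer only from the retrieved sources."},
    {"role": "user", "content": "What does Article 313-8 say?"},
    {"role": "assistant", "content": "Retrieved grounding data for turn one."},
    {"role": "user", "content": "What penalties might a driver face?"},
]

retrieval_messages = to_retrieval_messages(sample_messages)
for message in retrieval_messages:
    print(message["role"], "->", message["content"][0]["text"])
```

Because the full history is sent on every turn, the query planner's input token count grows as the conversation gets longer, which matches the higher `input_tokens` value reported for the second question.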

In [42]:
print("***** Activity *****")
print("\033[1;31;34m")
print(json.dumps([a.as_dict() for a in retrieval_result.activity], indent=5))

***** Activity *****
[
     {
          "id": 0,
          "type": "ModelQueryPlanning",
          "input_tokens": 8296,
          "output_tokens": 116
     },
     {
          "id": 1,
          "type": "AzureSearchQuery",
          "target_index": "agent-rag-index-demo",
          "query": {
               "search": "Penalties for drivers under the law"
          },
          "query_time": "2025-06-02T08:11:50.283Z",
          "count": 6,
          "elapsed_ms": 191
     },
     {
          "id": 2,
          "type": "AzureSearchSemanticRanker",
          "input_tokens": 23239
     }
]


In [43]:
print("***** Results *****")
print("\033[1;31;34m")
print(json.dumps([r.as_dict() for r in retrieval_result.references], indent=5))

***** Results *****
[
     {
          "type": "AzureSearchDoc",
          "id": "0",
          "activity_source": 1,
          "doc_key": "doc_031_02",
          "source_data": {
               "id": "doc_031_02",
               "text": "for by article 221-6 is committed by the driver a motor vehicle, manslaughter is punished by five years' imprisonment and\nby a fine of \u20ac75,000.\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0The penalties are increased to seven years' imprisonment and to a fine of \u20ac100,000 where:\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a01\u00ba the driver has deliberately violated an obligation of safety or prudence imposed by statute or Regulations other\nthan those outlined below;\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a02\u00ba the driver was manifestly drunk or in an alcoholic state characterised by a level of alcohol in the blood or breath\ngreater than the limits fixed by the legislative or statutory provisions of the Traffic Code, or where

In [44]:
data = [r.as_dict() for r in retrieval_result.references]
sorted_data = sorted(data, key=lambda x: x["doc_key"])
print("Sources:")
print("\033[1;31;34m")

for item in sorted_data:
    # doc_key has the form "doc_<page>_<chunk>", for example "doc_031_02"
    source = item["doc_key"].split("_")
    page, chunk = source[1], source[2]
    print(f"Page = {page} chunk page: {chunk}")

Sources:
Page = 031 chunk page: 01
Page = 031 chunk page: 02
Page = 038 chunk page: 02
Page = 038 chunk page: 03
Page = 039 chunk page: 01
Page = 066 chunk page: 02


In [45]:
response = aoai_client.responses.create(model=aoai_gpt_deployment,
                                        temperature=0.7,
                                        input=messages)

print("\033[1;31;34m")
print(response.output_text)

A driver who unintentionally causes a personal injury resulting in the victim being unable to work for four months can face the following penalties according to the Penal Code:

- For an unintended personal injury causing a total incapacity to work in excess of three months, the driver is punishable by three years' imprisonment and a fine of €45,000.
- The penalties increase to five years' imprisonment and a fine of €75,000 if any of the following aggravating circumstances are present:
  1. The driver deliberately violated an obligation of safety or prudence imposed by statute or regulations.
  2. The driver was manifestly drunk or in an alcoholic state above legal limits, or refused to take alcohol tests.
  3. The driver had used drugs or refused drug tests.
  4. The driver did not hold a valid driving license or it was annulled, invalidated, suspended, or revoked.
  5. The driver exceeded the maximum speed limit by 50 km/h or more.
  6. The driver caused the accident and f

## Clean up resources

In [46]:
index_client = SearchIndexClient(endpoint=azure_ai_search_endpoint,
                                 credential=credential)

index_client.delete_agent(agent_name)
print(f"Agent '{agent_name}' is deleted")

index_client.delete_index(index_name)
print(f"Azure AI Search index '{index_name}' is deleted")

Agent 'agent-rag-demo' is deleted
Azure AI Search index 'agent-rag-index-demo' is deleted
