# Chunking Experiments

This notebook tests chunking strategies, libraries, and code snippets.

In [2]:
import os
from dotenv import load_dotenv

load_dotenv()

pdf_path = "../../data/cao-pdfs/Cao Bouw en Infra 2025 - 2027.pdf"

TOKENIZER_ENCODING = "cl100k_base"  # For OpenAI models
TOKENIZER_MAX_TOKENS = 8192  # Adjust based on your chosen model

MAX_TOKENS = 8192  # Adjust based on your chosen model
VECTOR_DIM = 1536  # Adjust based on your chosen embeddings model

AZURE_SEARCH_ENDPOINT = os.getenv("AZURE_SEARCH_ENDPOINT")
AZURE_SEARCH_API_KEY = os.getenv("AZURE_SEARCH_API_KEY")  # Ensure this is your Admin Key
AZURE_SEARCH_INDEX_NAME = os.getenv("AZURE_SEARCH_INDEX_NAME", "cao-rag-sample")
AZURE_OPENAI_ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
AZURE_OPENAI_API_KEY = os.getenv("AZURE_OPENAI_API_KEY")
AZURE_OPENAI_API_VERSION = os.getenv("AZURE_OPENAI_API_VERSION", "2024-10-21")
AZURE_OPENAI_CHAT_MODEL_NAME = os.getenv(
    "AZURE_OPENAI_CHAT_MODEL_NAME"
)
AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME = os.getenv(
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME", "text-embedding-3-large"
)  # Deployment name; defaults to a deployment called "text-embedding-3-large"
AZURE_OPENAI_EMBEDDING_MODEL_NAME = os.getenv(
    "AZURE_OPENAI_EMBEDDING_MODEL_NAME", "text-embedding-3-large"
)  # Underlying model name; defaults to "text-embedding-3-large"
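
Several of the settings above have no default (for example `AZURE_OPENAI_CHAT_MODEL_NAME`), so a missing `.env` entry only surfaces later as a `None` passed to a client. A minimal sketch of an early check follows; the helper name `missing_env_vars` and the list of required names are assumptions made for illustration, not part of the notebook.

```python
import os

# Hypothetical list; the names mirror the configuration cell above.
REQUIRED_ENV_VARS = [
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_CHAT_MODEL_NAME",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_ENV_VARS)
if missing:
    print(f"Missing environment variables: {', '.join(missing)}")
```

Running this right after `load_dotenv()` makes configuration gaps visible before any conversion or API call starts.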

In [3]:
import torch

print(f"PyTorch version: {torch.__version__}")

if torch.cuda.is_available():
    print(f"CUDA is available. Using GPU: {torch.cuda.get_device_name(0)}: {torch.cuda.device_count()} GPU(s)")
else:
    print("CUDA is not available. Using CPU.")
    # torch.cpu.device_count() counts CPU devices, not cores; os.cpu_count() reports logical cores
    print(f"{os.cpu_count()} CPU core(s) available")


PyTorch version: 2.8.0+cu129
CUDA is available. Using GPU: NVIDIA GeForce RTX 5080 Laptop GPU: 1 GPU(s)


In [4]:
from pathlib import Path
from docling.document_converter import DocumentConverter

converter = DocumentConverter() 



In [5]:
from docling.datamodel.accelerator_options import AcceleratorDevice, AcceleratorOptions
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    PdfPipelineOptions,
)
from docling.datamodel.settings import settings
from docling.document_converter import DocumentConverter, PdfFormatOption

# Explicitly set the accelerator options
accelerator_options = AcceleratorOptions(
    num_threads=8, device=AcceleratorDevice.CUDA
)

pipeline_options = PdfPipelineOptions()
pipeline_options.accelerator_options = accelerator_options
pipeline_options.do_ocr = False
pipeline_options.do_table_structure = True
pipeline_options.table_structure_options.do_cell_matching = True

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options,
        )
    }
)

In [6]:
# Convert the document
conversion_result = converter.convert(pdf_path)

2025-09-22 13:07:48,020 - INFO - detected formats: [<InputFormat.PDF: 'pdf'>]
2025-09-22 13:07:51,096 - INFO - Going to convert document batch...
2025-09-22 13:07:51,097 - INFO - Initializing pipeline for StandardPdfPipeline with options hash 12a1aaae4d2de3c2950eea38387e79f2
2025-09-22 13:07:51,129 - INFO - Loading plugin 'docling_defaults'
2025-09-22 13:07:51,134 - INFO - Registered picture descriptions: ['vlm', 'api']
2025-09-22 13:07:51,166 - INFO - Loading plugin 'docling_defaults'
2025-09-22 13:07:51,179 - INFO - Registered ocr engines: ['easyocr', 'ocrmac', 'rapidocr', 'tesserocr', 'tesseract']
2025-09-22 13:07:51,224 - INFO - Accelerator device: 'cuda:0'
2025-09-22 13:07:54,092 - INFO - Accelerator device: 'cuda:0'
2025-09-22 13:07:54,912 - INFO - Processing document Cao Bouw en Infra 2025 - 2027.pdf
2025-09-22 13:09:44,542 - INFO - Finished converting document Cao Bouw en Infra 2025 - 2027.pdf in 116.52 sec.


In [7]:
output_dir = Path("outputs/02-chunking-experiments")
output_dir.mkdir(parents=True, exist_ok=True)

doc_filename = conversion_result.input.file.stem

In [8]:
print(f"Document has {len(conversion_result.document.pages)} pages and {len(conversion_result.document.tables)} tables.")
# print(f"Document text content:\n{conversion_result.document.export_to_markdown()}...")

Document has 178 pages and 68 tables.


In [9]:
# Export tables (disabled; remove the surrounding triple quotes to run)
'''
import pandas as pd

for table_ix, table in enumerate(conversion_result.document.tables):
    table_df: pd.DataFrame = table.export_to_dataframe(doc=conversion_result.document)
    print(f"## Table {table_ix}")
    print(table_df.to_markdown())

    # Save the table as CSV
    element_csv_filename = output_dir / f"{doc_filename}-table-{table_ix + 1}.csv"
    table_df.to_csv(element_csv_filename)
'''


## Chunking

We split the converted document into smaller chunks for embedding and indexing. Docling's built-in `HierarchicalChunker` follows the document structure, so each chunk keeps its heading context.
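
To make "preserves structure" concrete, here is a toy, pure-Python sketch of the idea: each paragraph becomes one chunk that remembers the headings above it, and a contextualization step prefixes those headings to the text. This is not Docling's implementation; `ToyChunk`, `toy_hierarchical_chunks`, and `toy_contextualize` are hypothetical names, and the markdown-style input is an assumption chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class ToyChunk:
    headings: list[str]
    text: str


def toy_hierarchical_chunks(lines):
    """Yield one ToyChunk per paragraph, tagged with its current heading path."""
    path = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            level = len(stripped) - len(stripped.lstrip("#"))
            # Drop headings at the same or a deeper level, then push this one.
            path = path[: level - 1] + [stripped.lstrip("#").strip()]
        else:
            yield ToyChunk(headings=list(path), text=stripped)


def toy_contextualize(chunk):
    """Prefix the chunk text with its heading path, one heading per line."""
    return "\n".join(chunk.headings + [chunk.text])


sample = ["# Guide", "## Setup", "Install the package.", "## Usage", "Run the tool."]
toy_chunks = list(toy_hierarchical_chunks(sample))
print(toy_contextualize(toy_chunks[1]))
```

Embedding the contextualized text rather than the bare paragraph gives the embedding model the section name as a retrieval signal.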


In [10]:
from docling.chunking import HierarchicalChunker
import tiktoken
from docling_core.transforms.chunker.tokenizer.openai import OpenAITokenizer

# Initialize tiktoken encoding for OpenAI embedding models
encoding = tiktoken.get_encoding(TOKENIZER_ENCODING)

# Create Docling's OpenAITokenizer wrapper
tokenizer = OpenAITokenizer(tokenizer=encoding, max_tokens=TOKENIZER_MAX_TOKENS)

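# Instantiate the chunker. Note: `tokenizer` and `merge_peers` are options of Docling's
# HybridChunker; HierarchicalChunker may ignore them, so chunks may not be token-limited here.
chunker = HierarchicalChunker(tokenizer=tokenizer, merge_peers=True)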

In [11]:
from xxhash import xxh64
doc_chunks = list(chunker.chunk(conversion_result.document))

all_chunks = []
for chunk in doc_chunks:
    # Serialize the chunk together with its metadata (e.g. headings) for embedding
    chunk_text = chunker.contextualize(chunk)

    # Derive a deterministic chunk ID from a 64-bit hash of the text
    chunk_id = xxh64(chunk_text.encode("utf-8")).hexdigest()

    all_chunks.append((chunk_id, chunk_text))

print(f"Total chunks from PDF: {len(all_chunks)}")

Total chunks from PDF: 872
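
Because each ID is a hash of the chunk text, two chunks with identical text get the same ID. Uploading both under the same key would overwrite the first with the second, so the index can end up with fewer documents than the chunk count printed above. The sketch below shows one way to detect and drop such duplicates before upload. It uses `hashlib.blake2b` from the standard library as a stand-in for `xxh64`, and `content_id` and `dedupe_by_id` are hypothetical helper names.

```python
import hashlib


def content_id(text: str) -> str:
    """Deterministic 64-bit hex ID derived from the chunk text (stand-in for xxh64)."""
    return hashlib.blake2b(text.encode("utf-8"), digest_size=8).hexdigest()


def dedupe_by_id(texts):
    """Keep the first occurrence of each distinct text.

    Returns a list of (id, text) pairs and the number of dropped duplicates.
    """
    seen = {}
    for text in texts:
        seen.setdefault(content_id(text), text)
    return list(seen.items()), len(texts) - len(seen)


pairs, dropped = dedupe_by_id(["Artikel 1", "Artikel 2", "Artikel 1"])
print(f"{len(pairs)} unique chunks, {dropped} duplicate(s) dropped")
```

If duplicates should be kept as separate documents instead, the ID could include a position component such as the chunk's ordinal.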


### Create Azure AI Search Index and Push Chunk Embeddings
We’ll define a vector index in Azure AI Search, then embed each chunk with Azure OpenAI and upload the results in batches.

In [12]:
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    AzureOpenAIVectorizer,
    AzureOpenAIVectorizerParameters,
    HnswAlgorithmConfiguration,
    SearchableField,
    SearchField,
    SearchFieldDataType,
    SearchIndex,
    SimpleField,
    VectorSearch,
    VectorSearchProfile,
)

VECTOR_FIELD_NAME = "content_vector"
CONTENT_FIELD_NAME = "content"

index_client = SearchIndexClient(
    AZURE_SEARCH_ENDPOINT, AzureKeyCredential(AZURE_SEARCH_API_KEY)
)

def create_search_index(index_name: str):
    # Define fields
    fields = [
        SimpleField(name="chunk_id", type=SearchFieldDataType.String, key=True),
        SearchableField(name=CONTENT_FIELD_NAME, type=SearchFieldDataType.String),
        SearchField(
            name=VECTOR_FIELD_NAME,
            type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
            searchable=True,
            filterable=False,
            sortable=False,
            facetable=False,
            vector_search_dimensions=VECTOR_DIM,
            vector_search_profile_name="default",
        ),
    ]
    # Vector search config with an AzureOpenAIVectorizer
    vector_search = VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="default")],
        profiles=[
            VectorSearchProfile(
                name="default",
                algorithm_configuration_name="default",
                vectorizer_name="default",
            )
        ],
        vectorizers=[
            AzureOpenAIVectorizer(
                vectorizer_name="default",
                parameters=AzureOpenAIVectorizerParameters(
                    resource_url=AZURE_OPENAI_ENDPOINT,
                    deployment_name=AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME,
                    model_name=AZURE_OPENAI_EMBEDDING_MODEL_NAME,
                    api_key=AZURE_OPENAI_API_KEY,
                ),
            )
        ],
    )

    # Drop any existing index (ignoring the error if it does not exist), then recreate it
    new_index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
    try:
        index_client.delete_index(index_name)
    except Exception:
        pass

    index_client.create_or_update_index(new_index)
    print(f"Index '{index_name}' created.")
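
The vector field is declared with `vector_search_dimensions=VECTOR_DIM`, so every uploaded vector needs exactly that length. A small pre-upload check can catch a mismatch between the index definition and the embedding call. The following is a sketch only; `find_dimension_mismatches` is a hypothetical helper, and the toy documents stand in for the real upload payload.

```python
def find_dimension_mismatches(docs, vector_field, expected_dim, key_field="chunk_id"):
    """Return (key, actual_length) for every document whose vector has the wrong length."""
    return [
        (doc[key_field], len(doc[vector_field]))
        for doc in docs
        if len(doc[vector_field]) != expected_dim
    ]


toy_docs = [
    {"chunk_id": "a", "content_vector": [0.1, 0.2, 0.3]},
    {"chunk_id": "b", "content_vector": [0.1, 0.2]},
]
mismatches = find_dimension_mismatches(toy_docs, "content_vector", expected_dim=3)
print(mismatches)
```

In the notebook this could be called with `VECTOR_FIELD_NAME` and `VECTOR_DIM` just before the upload loop.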

In [13]:
create_search_index(AZURE_SEARCH_INDEX_NAME)


Index 'cao-rag-sample' created.


### Generate Embeddings and Upload to Azure AI Search

In [14]:
from azure.search.documents import SearchClient
from openai import AzureOpenAI

search_client = SearchClient(
    AZURE_SEARCH_ENDPOINT, AZURE_SEARCH_INDEX_NAME, AzureKeyCredential(AZURE_SEARCH_API_KEY)
)
openai_client = AzureOpenAI(
    api_key=AZURE_OPENAI_API_KEY,
    api_version=AZURE_OPENAI_API_VERSION,
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
)


def embed_text(text: str) -> list[float]:
    """
    Generate an embedding for `text` with Azure OpenAI.
    """
    # With the Azure client, `model` is the deployment name
    response = openai_client.embeddings.create(
        input=text, model=AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME, dimensions=VECTOR_DIM
    )
    return response.data[0].embedding
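
`embed_text` sends one request per chunk. The embeddings endpoint also accepts a list of inputs, so several chunks can share one request. The sketch below shows the batching logic only: `embed_in_batches` is a hypothetical helper, and `embed_batch` is any callable that maps a list of texts to a list of vectors in the same order. The commented-out wrapper is an assumption about how it could be wired to the client above.

```python
def embed_in_batches(texts, embed_batch, batch_size=16):
    """Embed `texts` in groups of `batch_size`, preserving input order."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start : start + batch_size])
        result = embed_batch(batch)
        if len(result) != len(batch):
            raise ValueError(f"Expected {len(batch)} vectors, got {len(result)}")
        vectors.extend(result)
    return vectors


# Possible wiring to the Azure client (assumption, not run here):
# def azure_embed_batch(batch):
#     response = openai_client.embeddings.create(
#         input=batch, model=AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME, dimensions=VECTOR_DIM
#     )
#     return [item.embedding for item in sorted(response.data, key=lambda d: d.index)]


def fake_embed_batch(batch):
    # Stand-in for a real API call: one 2-dimensional vector per input text
    return [[float(len(text)), 0.0] for text in batch]


vectors = embed_in_batches(["a", "bb", "ccc"], fake_embed_batch, batch_size=2)
print(len(vectors))
```

The batch size is bounded by the service's per-request input and token limits, so it needs tuning against the actual deployment.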

In [15]:
upload_docs = []
for chunk_id, chunk_text in all_chunks:
    embedding_vector = embed_text(chunk_text)
    upload_docs.append(
        {
            "chunk_id": chunk_id,
            "content": chunk_text,
            "content_vector": embedding_vector,
        }
    )

2025-09-22 13:09:55,458 - INFO - HTTP Request: POST https://cao-intel-ai-services.openai.azure.com/openai/deployments/text-embedding-3-large/embeddings?api-version=2024-12-01-preview "HTTP/1.1 200 OK"

In [16]:
BATCH_SIZE = 50
for i in range(0, len(upload_docs), BATCH_SIZE):
    subset = upload_docs[i : i + BATCH_SIZE]
    resp = search_client.upload_documents(documents=subset)

    all_succeeded = all(r.succeeded for r in resp)
    print(
        f"Uploaded batch {i} -> {i + len(subset)}; all_succeeded: {all_succeeded}, "
        f"first_doc_status_code: {resp[0].status_code}"
    )

print("All chunks uploaded to Azure Search.")

Uploaded batch 0 -> 50; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 50 -> 100; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 100 -> 150; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 150 -> 200; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 200 -> 250; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 250 -> 300; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 300 -> 350; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 350 -> 400; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 400 -> 450; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 450 -> 500; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 500 -> 550; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 550 -> 600; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 600 -> 650; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 650 -> 700; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 700 -> 750; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 750 -> 800; all_succeeded: True, first_doc_status_code: 201
Uploaded batch 800 -> 850; all_succeeded: True, first_doc_status_code: 201


Uploaded batch 850 -> 872; all_succeeded: True, first_doc_status_code: 201
All chunks uploaded to Azure Search.
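
The loop above reports only whether a whole batch succeeded and the status code of its first document. A variant that records which documents failed makes retries easier. The sketch below assumes each result exposes `key`, `succeeded`, and `status_code` attributes, as the results returned by `upload_documents` do; `upload_in_batches` and the fake uploader are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def upload_in_batches(docs, upload_fn, batch_size=50):
    """Upload `docs` in batches and return (key, status_code) for each failed document."""
    failed = []
    for start in range(0, len(docs), batch_size):
        batch = docs[start : start + batch_size]
        for result in upload_fn(batch):
            if not result.succeeded:
                failed.append((result.key, result.status_code))
    return failed


def fake_upload(batch):
    # Stand-in for search_client.upload_documents: the document keyed "bad" fails
    return [
        SimpleNamespace(
            key=doc["chunk_id"],
            succeeded=doc["chunk_id"] != "bad",
            status_code=201 if doc["chunk_id"] != "bad" else 400,
        )
        for doc in batch
    ]


toy_upload = [{"chunk_id": "a"}, {"chunk_id": "bad"}, {"chunk_id": "c"}]
failed = upload_in_batches(toy_upload, fake_upload, batch_size=2)
print(failed)
```

With the real client, `upload_fn` could be `lambda batch: search_client.upload_documents(documents=batch)`.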


### Perform RAG over PDF
Combine retrieval from Azure AI Search with Azure OpenAI Chat Completions (also known as grounding your LLM).
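
The query in this section passes both `search_text` and a vector query, which makes it a hybrid search: keyword and vector results are ranked separately and then merged. Azure AI Search documents Reciprocal Rank Fusion (RRF) as the merging method. The sketch below illustrates RRF on two toy rankings; the function name and the constant `k=60` are assumptions (60 is a commonly used value), not the service's internals.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list, best first.

    Each document scores the sum of 1 / (k + rank) over the lists it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_ranking = ["a", "b", "c"]
vector_ranking = ["b", "d", "a"]
fused = reciprocal_rank_fusion([keyword_ranking, vector_ranking])
print(fused)
```

Documents that appear near the top of both lists rise above documents that rank first in only one of them.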

In [17]:
from typing import Optional

from azure.search.documents.models import VectorizableTextQuery

def generate_chat_response(prompt: str, system_message: Optional[str] = None):
    """
    Generates a single-turn chat response using Azure OpenAI Chat.
    If you need multi-turn conversation or follow-up queries, you'll have to
    maintain the messages list externally.
    """
    messages = []
    if system_message:
        messages.append({"role": "system", "content": system_message})
    messages.append({"role": "user", "content": prompt})

    completion = openai_client.chat.completions.create(
        model=AZURE_OPENAI_CHAT_MODEL_NAME, messages=messages, temperature=1
    )
    return completion.choices[0].message.content


user_query = "What is the maximum probationary period for a 1-2 year employment contract under the CAO 2025-2027?"
user_embed = embed_text(user_query)  # Not used below: the index's vectorizer embeds the query text

vector_query = VectorizableTextQuery(
    text=user_query,  # raw text; the index's vectorizer embeds it server-side
    k_nearest_neighbors=5,
    fields=VECTOR_FIELD_NAME,
)

2025-09-22 13:15:00,169 - INFO - HTTP Request: POST https://cao-intel-ai-services.openai.azure.com/openai/deployments/text-embedding-3-large/embeddings?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


In [18]:
search_results = search_client.search(
    search_text=user_query, vector_queries=[vector_query], select=[CONTENT_FIELD_NAME], top=10
)

retrieved_chunks = []
for result in search_results:
    snippet = result[CONTENT_FIELD_NAME]
    retrieved_chunks.append(snippet)


In [19]:
context_str = "\n---\n".join(retrieved_chunks)
rag_prompt = f"""
You are an AI assistant that helps answer questions about the Dutch CAO.
Use ONLY the text below to answer the user's question.
If the answer isn't in the text, say you don't know.

Context:
{context_str}

Question: {user_query}
Answer:
"""

final_answer = generate_chat_response(rag_prompt)

2025-09-22 13:15:08,440 - INFO - HTTP Request: POST https://cao-intel-ai-services.openai.azure.com/openai/deployments/gpt-5-nano/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"


In [20]:
print("\nRAG Prompt:")
print(rag_prompt)


RAG Prompt:

You are an AI assistant that helps answer questions about the Dutch CAO.
Use ONLY the text below to answer the user's question.
If the answer isn't in the text, say you don't know.

Context:
10.13.1 Looptijd, verlenging, opzegging en vernieuwing
- Deze cao geldt van 1 januari 2025 tot en met 31 maart 2027.
- De cao wordt geacht telkens voor een jaar te zijn verlengd, tenzij een of meer cao-partijen deze hebben opgezegd.
- Voor opzegging van de cao gelden de volgende regels:
- -het moet ten minste drie maanden voorafgaand aan de einddatum gebeuren,
- -bij aangetekend schrijven aan alle cao-partijen.
- Heeft een cao-partij voorstellen ingediend om de cao aan te passen of te vernieuwen? Dan beginnen cao-partijen daar zo snel mogelijk onderhandelingen over.
---
Tabel 1.3.3  Proeftijd
duur arbeidsovereenkomst, 1 = specificatie. duur arbeidsovereenkomst, 2 = maximale proeftijd. duur arbeidsovereenkomst, 3 = maximale proeftijd. , 1 = . , 2 = bouwplaats. , 3 = uta. onbep

In [21]:
print("\nFinal Answer:")
print(final_answer)


Final Answer:
1 maand.
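
The RAG prompt above joins every retrieved chunk into the context, so a few large chunks (long tables, for instance) can dominate the prompt. One simple safeguard is to add chunks in rank order until a token budget is reached. The sketch below is an illustration under stated assumptions: `fit_to_budget` is a hypothetical helper, and the default whitespace-based counter is a crude stand-in for a real tokenizer such as the `tiktoken` encoding used earlier.

```python
def fit_to_budget(chunks, max_tokens, count_tokens=lambda text: len(text.split())):
    """Return the longest prefix of `chunks` whose total token count fits `max_tokens`."""
    selected, used = [], 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > max_tokens:
            break
        selected.append(chunk)
        used += cost
    return selected


ranked = ["one two three", "four five", "six seven eight nine"]
kept = fit_to_budget(ranked, max_tokens=5)
print(len(kept))
```

In the notebook, `count_tokens` could be `lambda text: len(encoding.encode(text))`, and the result would replace `retrieved_chunks` when building `context_str`.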
