# Part 7: Minimal Knowledge Base

In Parts 1-6, you used knowledge bases with full agentic reasoning capabilities. Part 7 dives into the **minimal reasoning effort** optimization, which trades some reasoning sophistication for significant improvements in speed and cost. This approach is ideal when you need fast, cost-effective retrieval for simpler queries.

## Step 1: Load Environment Variables

Run the cell below to load the configuration for your Azure resources, and choose the **.venv(3.11.9)** environment that was created for you.

This time you'll create a knowledge base optimized for minimal reasoning effort.

> **⚠️ Troubleshooting**
>
> If code cells get stuck and keep spinning, select **Restart** from the notebook toolbar at the top. If the issue persists after a couple of tries, close VS Code completely and reopen it.

In [1]:
import os

from azure.core.credentials import AzureKeyCredential
from dotenv import load_dotenv

load_dotenv(override=True) # take environment variables from .env.

# Azure AI Search configuration
endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
credential = AzureKeyCredential(os.environ["AZURE_SEARCH_ADMIN_KEY"])

# Knowledge base name
knowledge_base_name = "minimal-knowledge-base"

# Azure OpenAI configuration (identity-based auth, no API key needed)
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
azure_openai_chatgpt_deployment = os.getenv("AZURE_OPENAI_CHATGPT_DEPLOYMENT", "gpt-4.1")
azure_openai_chatgpt_model_name = os.getenv("AZURE_OPENAI_CHATGPT_MODEL_NAME", "gpt-4.1")

print("Environment variables loaded")

Environment variables loaded
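
Because `os.environ[...]` raises a `KeyError` on the first missing variable, it can help to check all required settings at once before running the rest of the notebook. The following is a minimal sketch and not part of the lab itself: the helper name `missing_env_vars` is hypothetical, and the variable names are the ones read in the cell above.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variable names taken from the configuration cell above.
REQUIRED_VARS = (
    "AZURE_SEARCH_SERVICE_ENDPOINT",
    "AZURE_SEARCH_ADMIN_KEY",
    "AZURE_OPENAI_ENDPOINT",
)


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


missing = missing_env_vars()
if missing:
    print("Missing settings in .env:", ", ".join(missing))
else:
    print("All required settings are present")
```

Running this right after `load_dotenv` reports every missing setting in one pass instead of stopping at the first one.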


## Step 2: Create Minimal Reasoning Knowledge Base

Let's create a knowledge base optimized for speed and cost using `KnowledgeRetrievalMinimalReasoningEffort`. 

Notice two key differences from previous parts:

1. **No Azure OpenAI model configuration**: The knowledge base skips LLM-powered query planning and answer synthesis
2. **`EXTRACTIVE_DATA` output mode**: Returns raw chunks instead of synthesized answers

This configuration prioritizes speed and cost savings over sophisticated reasoning capabilities.

In [2]:
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import KnowledgeBase, KnowledgeRetrievalMinimalReasoningEffort, KnowledgeRetrievalOutputMode, KnowledgeSourceReference

index_client = SearchIndexClient(endpoint=endpoint, credential=credential)

knowledge_base = KnowledgeBase(
    name=knowledge_base_name,
    knowledge_sources=[
        KnowledgeSourceReference(name="healthdocs-knowledge-source"),
        KnowledgeSourceReference(name="hrdocs-knowledge-source"),
    ],
    output_mode=KnowledgeRetrievalOutputMode.EXTRACTIVE_DATA,
    retrieval_reasoning_effort=KnowledgeRetrievalMinimalReasoningEffort()
)

index_client.create_or_update_knowledge_base(knowledge_base)
print(f"Knowledge base '{knowledge_base_name}' created or updated successfully.")

Knowledge base 'minimal-knowledge-base' created or updated successfully.


## Step 3: Query with Semantic Intents

Minimal reasoning knowledge bases use `KnowledgeRetrievalSemanticIntent` instead of conversational messages. Each intent executes as a direct search query against your knowledge sources.

The code below runs two semantic intents and returns a result that contains references and an activity log, but no synthesized answer.

In [3]:
from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient
from azure.search.documents.knowledgebases.models import KnowledgeBaseRetrievalRequest, KnowledgeRetrievalSemanticIntent, SearchIndexKnowledgeSourceParams

knowledge_base_client = KnowledgeBaseRetrievalClient(endpoint=endpoint, knowledge_base_name=knowledge_base_name, credential=credential)
healthdocs_ks_params = SearchIndexKnowledgeSourceParams(
    knowledge_source_name="healthdocs-knowledge-source",
    include_references=True,
    include_reference_source_data=True,
)
hrdocs_ks_params = SearchIndexKnowledgeSourceParams(
    knowledge_source_name="hrdocs-knowledge-source",
    include_references=True,
    include_reference_source_data=True,
)
req = KnowledgeBaseRetrievalRequest(
    intents=[
        KnowledgeRetrievalSemanticIntent(
            search="What is the responsibility of the Zava CEO?"
        ),
        KnowledgeRetrievalSemanticIntent(
            search="What Zava health plan would you recommend if they wanted the best coverage for mental health services?"
        )
    ],
    knowledge_source_params=[
        healthdocs_ks_params,
        hrdocs_ks_params
    ],
    include_activity=True,
    retrieval_reasoning_effort=KnowledgeRetrievalMinimalReasoningEffort()
)
result = knowledge_base_client.retrieve(retrieval_request=req)
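
Since each intent runs as its own search query, blank or duplicate question strings would presumably trigger searches that add nothing to the result. As a sketch only, the hypothetical helper `build_intents` below tidies a list of questions before wrapping each one. It defaults to plain dictionaries so that it runs without the SDK; in the notebook you could pass `lambda q: KnowledgeRetrievalSemanticIntent(search=q)` as the factory instead.

```python
from typing import Any, Callable, Iterable, List


def build_intents(
    questions: Iterable[str],
    intent_factory: Callable[[str], Any] = lambda q: {"search": q},
) -> List[Any]:
    """Normalize whitespace, skip blanks and case-insensitive duplicates, then wrap each question."""
    seen = set()
    intents = []
    for question in questions:
        cleaned = " ".join(question.split())  # collapse runs of whitespace
        key = cleaned.casefold()
        if not cleaned or key in seen:
            continue
        seen.add(key)
        intents.append(intent_factory(cleaned))
    return intents


intents = build_intents([
    "What is the responsibility of the Zava CEO?",
    "  what is the responsibility of the  Zava CEO? ",
    "",
])
print(intents)
```

The second entry differs from the first only in case and spacing, so only one intent is produced.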

## Step 4: Review Raw Retrieved Data

The results from the minimal reasoning knowledge base include the raw data chunks retrieved from your knowledge sources. A minimal-effort knowledge base is helpful when you are integrating agentic retrieval into an application that has its own LLM question-answering logic, or when you are surfacing the results directly.

Run the cells below and inspect the references and activity log that the minimal reasoning knowledge base returns for your queries.

In [4]:
import json

references = json.dumps([ref.as_dict() for ref in result.references], indent=2)
print(references)

[
  {
    "type": "searchIndex",
    "id": "0",
    "activity_source": 2,
    "source_data": {
      "uid": "810256edf976_aHR0cHM6Ly9tYWdvdHRlaWFkbHNnZW4yLmJsb2IuY29yZS53aW5kb3dzLm5ldC9sYWJkb2NzL3JvbGVfbGlicmFyeS5wZGY1_pages_0",
      "blob_path": "/hrdocs/role_library.pdf",
      "snippet": "Roles Descriptions at \n\nZava \n\n  \n\n\n\nThis document contains information generated using a language model (Azure OpenAI). The \n\ninformation contained in this document is only for demonstration purposes and does not \n\nreflect the opinions or beliefs of Microsoft. Microsoft makes no representations or \n\nwarranties of any kind, express or implied, about the completeness, accuracy, reliability, \n\nsuitability or availability with respect to the information contained in this document. \n\nAll rights reserved to Microsoft \n\n  \n\n\n\nZava Role Library \n\nLast Updated: 2023-03-05 \n\nChief Executive Officer \n\nJob Description: \n\nPosition: Chief Executive Officer \n\nCompany: Zava \n\n

In [5]:
import pandas as pd

activity_types = [{"type": a.type} for a in result.activity]

df = pd.DataFrame(activity_types)

print("Activity Log Steps")
df

Activity Log Steps


| | type |
|---|---|
| 0 | searchIndex |
| 1 | searchIndex |
| 2 | searchIndex |
| 3 | searchIndex |
| 4 | agenticReasoning |
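
The activity entries also carry timing and hit counts, so they can be totaled per knowledge source. The sketch below assumes a list of dictionaries shaped like the `a.as_dict()` output printed in the next cell (`type`, `elapsed_ms`, `knowledge_source_name`, `count`). The helper `summarize_activity` is hypothetical, the sample entries are illustrative, and fields missing from an entry are treated as zero.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_activity(activity: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, int]]:
    """Total the steps, elapsed time, and result counts per knowledge source."""
    summary = defaultdict(lambda: {"steps": 0, "elapsed_ms": 0, "count": 0})
    for step in activity:
        source = step.get("knowledge_source_name")
        if source is None:
            continue  # skip steps that are not tied to a knowledge source
        entry = summary[source]
        entry["steps"] += 1
        entry["elapsed_ms"] += step.get("elapsed_ms") or 0
        entry["count"] += step.get("count") or 0
    return dict(summary)


# Illustrative entries; in the notebook use [a.as_dict() for a in result.activity].
sample_activity = [
    {"type": "searchIndex", "knowledge_source_name": "healthdocs-knowledge-source",
     "elapsed_ms": 396, "count": 1},
    {"type": "searchIndex", "knowledge_source_name": "healthdocs-knowledge-source",
     "elapsed_ms": 524, "count": 43},
    {"type": "agenticReasoning"},  # assumed to carry no knowledge source name
]
for source, stats in summarize_activity(sample_activity).items():
    print(source, stats)
```

A per-source total like this makes it easier to see which knowledge source dominates latency when you compare reasoning effort levels.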


In [6]:
activity_content = json.dumps([a.as_dict() for a in result.activity], indent=2)
print("Activity Details")
print(activity_content)

Activity Details
[
  {
    "id": 0,
    "type": "searchIndex",
    "elapsed_ms": 396,
    "knowledge_source_name": "healthdocs-knowledge-source",
    "query_time": "2026-02-20T04:24:24.782Z",
    "count": 1,
    "search_index_arguments": {
      "search": "What is the responsibility of the Zava CEO?",
      "source_data_fields": [
        {
          "name": "snippet"
        },
        {
          "name": "uid"
        },
        {
          "name": "blob_path"
        }
      ],
      "search_fields": [
        {
          "name": "snippet"
        }
      ],
      "semantic_configuration_name": "semantic-configuration"
    }
  },
  {
    "id": 1,
    "type": "searchIndex",
    "elapsed_ms": 524,
    "knowledge_source_name": "healthdocs-knowledge-source",
    "query_time": "2026-02-20T04:24:25.307Z",
    "count": 43,
    "search_index_arguments": {
      "search": "What Zava health plan would you recommend if they wanted the best coverage for mental health services?",
      "source_d

## Step 5: Generate Synthesized Answer

If you want a synthesized answer from the retrieved chunks, you can pass them to an LLM for processing. The code below demonstrates how to generate a concise answer using the extracted data.

In [7]:
from IPython.display import display, Markdown
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AsyncAzureOpenAI

# Using identity-based auth via DefaultAzureCredential (no API key needed)
token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default")

openai_client = AsyncAzureOpenAI(
    azure_endpoint=azure_openai_endpoint,
    azure_ad_token_provider=token_provider,
    api_version="2025-03-01-preview"
)

response = await openai_client.responses.create(
    model=azure_openai_chatgpt_deployment,
    instructions="You are a helpful assistant that provides concise answers based on the provided context.",
    input=f"""
    Based on the following context, answer the question concisely.\n\nContext:\n{result.response[0].content[0].text}
    \n\nQuestion:\nWhat is the responsibility of the Zava CEO?
    What Zava health plan would you recommend if they wanted the best coverage for mental health services?
    """
)

display(Markdown(response.output_text))

The Zava CEO is responsible for providing strategic direction and oversight to ensure the company’s long-term success and profitability, managing daily operations, developing strategy, ensuring compliance, overseeing marketing, stakeholder relationships, and representing the company.

For the best coverage for mental health services, the recommended health plan is Northwind Health Plus, as it offers comprehensive coverage, access to a network of providers, and additional mental health resources.
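
The cell above passes the whole extractive response text to the model. If you would rather control which chunks reach the prompt, one option is to assemble the context from the reference dictionaries yourself. This is a sketch under assumptions: `build_context` and its `max_chars` budget are hypothetical, and each reference is assumed to look like the `ref.as_dict()` output in Step 4, with `source_data` holding `blob_path` and `snippet`.

```python
from typing import Any, Dict, Iterable


def build_context(references: Iterable[Dict[str, Any]], max_chars: int = 4000) -> str:
    """Join snippets labeled by source, stopping before the character budget is exceeded."""
    separator = "\n\n"
    parts = []
    used = 0
    for ref in references:
        data = ref.get("source_data") or {}
        snippet = " ".join((data.get("snippet") or "").split())  # collapse whitespace
        if not snippet:
            continue  # nothing useful to add from this reference
        block = f"[{data.get('blob_path', 'unknown source')}] {snippet}"
        extra = len(block) + (len(separator) if parts else 0)
        if used + extra > max_chars:
            break
        parts.append(block)
        used += extra
    return separator.join(parts)


# Illustrative data; in the notebook use [ref.as_dict() for ref in result.references].
sample_references = [
    {"id": "0", "source_data": {"blob_path": "/hrdocs/role_library.pdf",
                                "snippet": "Zava Role Library \n\nChief Executive Officer"}},
    {"id": "1", "source_data": {"blob_path": "/hrdocs/role_library.pdf", "snippet": ""}},
]
print(build_context(sample_references))
```

The returned string could then replace `result.response[0].content[0].text` in the prompt, which keeps the prompt size predictable at the cost of dropping chunks once the budget is reached.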

## Summary

You've now worked with a minimal reasoning knowledge base, which is optimized for speed and cost over sophisticated reasoning capabilities.

**Key concepts to remember:**
- `KnowledgeRetrievalMinimalReasoningEffort` prioritizes speed and cost savings over advanced reasoning
- `EXTRACTIVE_DATA` output mode returns raw chunks instead of synthesized answers
- `KnowledgeRetrievalSemanticIntent` structures queries as focused search intents rather than conversational messages
- No Azure OpenAI model required in the knowledge base configuration, reducing baseline costs
- Optional post-processing with Azure OpenAI gives you control over when to use LLM synthesis

### What's Next?

➡️ Continue to [Part 8: Medium Knowledge Base](part8-medium-knowledge-base.ipynb) to learn how medium reasoning effort balances sophistication with performance.