# Notebook 4: Agentic RAG Introduction

In this notebook you will learn how to classify a user's search question and route it to a specialized agent based on the question type. This notebook does not use the Azure AI Search index to build a full solution; you will do that in the hands-on notebook.

> NOTE: This notebook is heavily inspired by the [Simple Handoff sample](https://github.com/microsoft/agent-framework/blob/main/python/samples/getting_started/workflows/orchestration/handoff_simple.py) in the agent-framework source tree.

## Learning Objectives
- Learn about the Handoff pattern
- Implement a classifier to recognize the user's intent and route to a specialized agent
- Implement agents to respond to specific search types
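
Before bringing in the framework, the control flow behind the Handoff pattern can be sketched in plain Python. This is a toy illustration only: `run_handoff`, `route`, and the lambda specialist are hypothetical stand-ins, not agent-framework APIs. The coordinator sees every question first, either answers directly or hands off to a specialist, and control returns to the loop for the next question.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a specialist agent: a plain function, not a ChatAgent.
Specialist = Callable[[str], str]


def run_handoff(
    questions: list[str],
    route: Callable[[str], Optional[str]],
    specialists: dict[str, Specialist],
) -> list[str]:
    """Toy handoff loop: the coordinator sees every question first."""
    transcript: list[str] = []
    for question in questions:
        target = route(question)  # coordinator decides who should answer
        if target is None:
            # No handoff: the coordinator replies directly
            transcript.append(f"coordinator: How can I help with '{question}'?")
        else:
            # Handoff: the specialist answers, then control returns to the loop
            transcript.append(f"{target}: {specialists[target](question)}")
    return transcript


specialists = {"count_agent": lambda q: "42"}
route = lambda q: "count_agent" if "how many" in q.lower() else None
transcript = run_handoff(["Hello", "How many open tickets?"], route, specialists)
print("\n".join(transcript))
```

In the real workflow the routing decision is made by an LLM calling a handoff tool rather than by a string check, but the shape of the loop is the same.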


### Install Required Packages

In [None]:
%pip install agent-framework -q
%pip install azure-identity -q
%pip install python-dotenv -q

### Setup the Module Imports

In [None]:
from collections.abc import AsyncIterable
from typing import Annotated, cast

from agent_framework import (
    ChatAgent,
    ChatMessage,
    HandoffBuilder,
    HandoffUserInputRequest,
    RequestInfoEvent,
    Role,
    WorkflowEvent,
    WorkflowOutputEvent,
    WorkflowRunState,
    WorkflowStatusEvent,
    ai_function,
)
from agent_framework.azure import AzureOpenAIChatClient
from azure.identity import DefaultAzureCredential

Define AI functions you'll use for testing the specific search type in this notebook.

> NOTE: In the hands-on notebook, these functions are where the actual implementation of each search type goes.

In [None]:
@ai_function
def yes_or_no_search(user_question: Annotated[str, "User question to answer with yes or no"]) -> str:
    """Answers yes or no questions."""
    return f"Question: {user_question}\nAnswer: NO"

@ai_function
def count_search(user_question: Annotated[str, "User question requiring counting items"]) -> str:
    """Answers questions that require counting."""
    return f"Question: {user_question}\nAnswer: 42"

@ai_function
def semantic_search(user_question: Annotated[str, "User question requiring semantic search"]) -> str:
    """Answers simple questions that require semantic search."""
    return f"Question: {user_question}\nAnswer: They are all gray or metallic in color."
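
The `Annotated[str, "..."]` hints above carry a human-readable description alongside the type. The sketch below shows, using only the standard library, that this description is available at runtime, which is what makes it possible for a framework to pass it to the model as part of a tool definition. It is not a description of how `ai_function` is implemented; `count_search_stub` and `describe_parameters` are hypothetical names used for illustration.

```python
from typing import Annotated, get_type_hints


def count_search_stub(user_question: Annotated[str, "User question requiring counting items"]) -> str:
    """Hypothetical stand-in for a tool function (no decorator applied)."""
    return f"Question: {user_question}\nAnswer: 42"


def describe_parameters(func) -> dict[str, str]:
    """Return {parameter name: description} read from Annotated metadata."""
    hints = get_type_hints(func, include_extras=True)  # keep the Annotated wrapper
    descriptions: dict[str, str] = {}
    for name, hint in hints.items():
        if name == "return":
            continue
        metadata = getattr(hint, "__metadata__", ())  # empty for plain types
        if metadata:
            descriptions[name] = str(metadata[0])
    return descriptions


param_descriptions = describe_parameters(count_search_stub)
print(param_descriptions)
```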

Create a prompt for the classifier to be able to determine how to route the user's question.

In [None]:
CLASSIFIER_AGENT_INSTRUCTIONS = """
You are a query classification system for an IT support ticket database. Your task is to route each user question to the appropriate specialist search agent.

## Database Schema
The database contains IT support tickets with these fields:
- Id: unique identifier
- Subject: ticket subject
- Body: ticket question/description
- Answer: ticket response/solution
- Type: ticket type (e.g., "Incident", "Request", "Problem")
- Queue: department name (e.g., "Human Resources", "IT", "Finance")
- Priority: "high", "medium", or "low"
- Language: ticket language
- Business_Type: business category
- Tags: categorization tags

## Types of Searches Agents can do:

**YES_NO_AGENT**: Simple yes/no questions
   - Keywords: "is", "are", "can", "does", "do", "will", "should"
   - Examples:
     - "Is my account locked?"
     - "Can I access the VPN?"
     - "Does the printer support color printing?"
     - "Will my password expire soon?"

**COUNT_AGENT**: Questions requiring counting items
   - Keywords: "how many", "number of", "count of", "total"
   - Examples:
     - "How many devices are connected to the network?"
     - "What is the total number of open tickets?"
     - "Count of users with admin access"

**SEMANTIC_SEARCH_AGENT**: Queries looking for similar issues or solutions based on meaning
   - Keywords: "how to", "why", "what causes", "solve", "fix", "issue with", "problem with"
   - Examples:
     - "How do I reset my password?"
     - "Why is my VPN not connecting?"
     - "Issues with email synchronization"
     - "Laptop won't turn on"

**NO_SEARCH_NEEDED**: Queries that do not require any search
   - Examples:
     - "Hello I need assistance."
     - "Thank you for your help."
     - "Goodbye."
     - "No further questions."

Consider:
1. The primary intent - what is the user trying to achieve?
2. Whether they need specific data points (analytical) vs similar content (semantic)
3. Whether filters/constraints are present
4. The type of expected response (number, list, specific tickets, similar issues)
5. If no search is needed, respond with NO_SEARCH_NEEDED.

"""

Define a method that will create all the agents and connect the AI functions created earlier to their respective agents.

In [None]:
def create_agents(chat_client: AzureOpenAIChatClient) -> tuple[ChatAgent, ChatAgent, ChatAgent, ChatAgent]:
    """Create and configure the classifier and search specialist agents.

    The classifier agent is responsible for:
    - Receiving all user input first
    - Deciding whether to handle the request directly or hand off to a search specialist
    - Signaling handoff by calling one of the explicit handoff tools exposed to it

    Search specialist agents are invoked only when the classifier agent explicitly hands off to them.
    After a specialist responds, control returns to the classifier agent, which then prompts
    the user for their next message.

    Returns:
        Tuple of (classifier_agent, yes_no_agent, count_agent, semantic_search_agent)
    """
    # Classifier agent: Acts as the search planner and dispatcher
    classifier_agent = chat_client.create_agent(
        instructions=CLASSIFIER_AGENT_INSTRUCTIONS,
        name="classifier_agent",
    )

    # Yes/No search specialist: Handles yes/no questions
    yes_no_agent = chat_client.create_agent(
        instructions="You handle yes/no questions.",
        name="yes_no_agent",
        tools=[yes_or_no_search],
    )

    # Count search specialist: Handles counting questions
    count_agent = chat_client.create_agent(
        instructions="You handle questions that require counting items.",
        name="count_agent",
        tools=[count_search],
    )

    # Semantic search specialist: Handles semantic search questions
    semantic_search_agent = chat_client.create_agent(
        instructions="You handle questions that require semantic search.",
        name="semantic_search_agent",
        tools=[semantic_search],
    )

    return classifier_agent, yes_no_agent, count_agent, semantic_search_agent

Create a utility function that collects the workflow's streamed events into a list, so they can be processed synchronously after each workflow step completes.

In [None]:
async def _drain(stream: AsyncIterable[WorkflowEvent]) -> list[WorkflowEvent]:
    """Collect all events from an async stream into a list.

    This helper drains the workflow's event stream so we can process events
    synchronously after each workflow step completes.

    Args:
        stream: Async iterable of WorkflowEvent

    Returns:
        List of all events from the stream
    """
    return [event async for event in stream]
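
To try the draining idea without an Azure connection, here is a minimal sketch that runs the same async list comprehension against a fake stream. `fake_event_stream` is a hypothetical stand-in for the workflow's event stream, and the events are plain strings rather than `WorkflowEvent` objects.

```python
import asyncio
from collections.abc import AsyncIterator


async def fake_event_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a workflow event stream."""
    for name in ("status", "request", "output"):
        await asyncio.sleep(0)  # yield control, as a real stream would while waiting
        yield name


async def drain(stream) -> list:
    """Collect every item from an async iterable into a list."""
    return [event async for event in stream]


# In a notebook cell, use `events = await drain(fake_event_stream())` instead,
# because the notebook already runs an event loop.
events = asyncio.run(drain(fake_event_stream()))
print(events)
```

The trade-off of draining is that nothing is shown until the stream ends. That keeps the demo code simple, at the cost of not displaying events as they arrive.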

Define a utility function that prints the agent messages produced since the last user message.

In [None]:
def _print_agent_responses_since_last_user_message(request: HandoffUserInputRequest) -> None:
    """Display agent responses since the last user message in a handoff request.

    The HandoffUserInputRequest contains the full conversation history so far,
    allowing the user to see what's been discussed before providing their next input.

    Args:
        request: The user input request containing conversation and prompt
    """
    if not request.conversation:
        raise RuntimeError("HandoffUserInputRequest missing conversation history.")

    # Reverse iterate to collect agent responses since last user message
    agent_responses: list[ChatMessage] = []
    for message in request.conversation[::-1]:
        if message.role == Role.USER:
            break
        agent_responses.append(message)

    # Print agent responses in original order
    agent_responses.reverse()
    for message in agent_responses:
        speaker = message.author_name or message.role.value
        print(f"- {speaker}: {message.text}")
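
The reverse-iteration logic can be checked in isolation with a small sketch. `Message` here is a hypothetical stand-in for `ChatMessage` that keeps only a role and text, and roles are plain strings instead of `Role` values.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for ChatMessage with only the fields used here."""
    role: str
    text: str


def responses_since_last_user(conversation: list[Message]) -> list[Message]:
    """Return the trailing non-user messages, in their original order."""
    collected: list[Message] = []
    for message in reversed(conversation):
        if message.role == "user":
            break  # stop at the most recent user message
        collected.append(message)
    collected.reverse()  # restore chronological order
    return collected


conversation = [
    Message("user", "How many open tickets?"),
    Message("assistant", "Routing to count_agent."),
    Message("assistant", "42"),
]
latest = responses_since_last_user(conversation)
print([m.text for m in latest])
```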

Define a utility function that prints workflow status changes and conversation messages, and returns any pending user input requests.

In [None]:
def _handle_events(events: list[WorkflowEvent]) -> list[RequestInfoEvent]:
    """Process workflow events and extract any pending user input requests.

    This function inspects each event type and:
    - Prints workflow status changes (IDLE, IDLE_WITH_PENDING_REQUESTS, etc.)
    - Displays final conversation snapshots when workflow completes
    - Prints user input request prompts
    - Collects all RequestInfoEvent instances for response handling

    Args:
        events: List of WorkflowEvent to process

    Returns:
        List of RequestInfoEvent representing pending user input requests
    """
    requests: list[RequestInfoEvent] = []

    for event in events:
        # WorkflowStatusEvent: Indicates workflow state changes
        if isinstance(event, WorkflowStatusEvent) and event.state in {
            WorkflowRunState.IDLE,
            WorkflowRunState.IDLE_WITH_PENDING_REQUESTS,
        }:
            print(f"\n[Workflow Status] {event.state.name}")

        # WorkflowOutputEvent: Contains the final conversation when workflow terminates
        elif isinstance(event, WorkflowOutputEvent):
            conversation = cast(list[ChatMessage], event.data)
            if isinstance(conversation, list):
                print("\n=== Final Conversation Snapshot ===")
                for message in conversation:
                    speaker = message.author_name or message.role.value
                    print(f"- {speaker}: {message.text}")
                print("===================================")

        # RequestInfoEvent: Workflow is requesting user input
        elif isinstance(event, RequestInfoEvent):
            if isinstance(event.data, HandoffUserInputRequest):
                _print_agent_responses_since_last_user_message(event.data)
            requests.append(event)

    return requests
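
The dispatch-by-type structure of the handler can be exercised without the framework. In this sketch `StatusEvent` and `RequestEvent` are hypothetical stand-ins for `WorkflowStatusEvent` and `RequestInfoEvent`; only the branching and the collection of requests mirror the function above.

```python
from dataclasses import dataclass


@dataclass
class StatusEvent:
    """Hypothetical stand-in for WorkflowStatusEvent."""
    state: str


@dataclass
class RequestEvent:
    """Hypothetical stand-in for RequestInfoEvent."""
    request_id: str


def collect_requests(events: list[object]) -> list[RequestEvent]:
    """Print status changes and return the events that ask for user input."""
    requests: list[RequestEvent] = []
    for event in events:
        if isinstance(event, StatusEvent):
            print(f"[Workflow Status] {event.state}")
        elif isinstance(event, RequestEvent):
            requests.append(event)
    return requests


pending = collect_requests([StatusEvent("IDLE_WITH_PENDING_REQUESTS"), RequestEvent("req-1")])
print([request.request_id for request in pending])
```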

Now that all the code is defined, you get to use it.

Create a chat client and all the agents.

In [None]:
# Initialize the Azure OpenAI chat client
chat_client = AzureOpenAIChatClient(credential=DefaultAzureCredential())

# Create all agents: classifier + specialists
classifier, yes_no, count, semantic_search = create_agents(chat_client)
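
The client above is created without an explicit endpoint or deployment name, so those values presumably come from the environment. The sketch below checks for them up front. The variable names `AZURE_OPENAI_ENDPOINT` and `AZURE_OPENAI_CHAT_DEPLOYMENT_NAME` are assumptions, so confirm them against your own setup; `missing_settings` is a hypothetical helper, not part of any library. `DefaultAzureCredential` also needs a signed-in identity, for example from `az login`.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Assumed variable names; confirm them against your Azure OpenAI configuration.
REQUIRED_SETTINGS = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME")


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required setting names that are unset or empty."""
    source = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not source.get(name)]


missing = missing_settings()
if missing:
    print(f"Missing settings: {', '.join(missing)}")
else:
    print("All assumed settings are present.")
```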

Create the workflow, using the [HandoffBuilder](https://learn.microsoft.com/en-us/python/api/agent-framework-core/agent_framework.handoffbuilder?view=agent-framework-python-latest), setting all the agents as participants and the classifier as the handoff coordinator.

In [None]:
workflow = (
    HandoffBuilder(
        name="agentic_rag_workflow",
        participants=[classifier, yes_no, count, semantic_search],
    )
    .set_coordinator(classifier)
    .build()
)

For demo purposes, create a list of sample questions.

In [None]:
scripted_responses = [
    "How many tickets were logged and Incidents for Human Resources and low priority?", # count search
    "Are there any issues logged for Dell XPS laptops?" # yes/no search
]

Ask an initial question and then the demo questions above to see whether the classifier correctly identifies the search types.

In [None]:
initial_message = "What problems are there with Surface devices?" # semantic search
print(f"- User: {initial_message}")

events = await _drain(workflow.run_stream(initial_message))
pending_requests = _handle_events(events)

while pending_requests and scripted_responses:
    # Get the next scripted response
    user_response = scripted_responses.pop(0)
    print(f"\n- User: {user_response}")

    # Send response(s) to all pending requests
    # In this demo, there's typically one request per cycle, but the API supports multiple
    responses = {req.request_id: user_response for req in pending_requests}

    # Send responses and get new events
    # send_responses_streaming() yields events as they occur; _drain() collects them
    # into a list so they can be printed and checked for new requests once the step completes
    events = await _drain(workflow.send_responses_streaming(responses))
    pending_requests = _handle_events(events)
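
The request/response loop can be rehearsed offline with a stub. `FakeWorkflow` below is a hypothetical stand-in that issues one new input request after every turn; it is only meant to show how answers are keyed by request id and when the loop stops.

```python
from dataclasses import dataclass, field


@dataclass
class FakeWorkflow:
    """Hypothetical stand-in that asks for user input after every turn."""
    turn: int = 0
    received: list[str] = field(default_factory=list)

    def start(self, message: str) -> list[str]:
        self.received.append(message)
        return [f"req-{self.turn}"]  # ids of pending input requests

    def respond(self, responses: dict[str, str]) -> list[str]:
        self.received.extend(responses.values())
        self.turn += 1
        return [f"req-{self.turn}"]


workflow_stub = FakeWorkflow()
scripted = ["second question", "third question"]
pending = workflow_stub.start("first question")

while pending and scripted:
    answer = scripted.pop(0)
    # Every pending request id gets the same answer, as in the loop above
    pending = workflow_stub.respond({request_id: answer for request_id in pending})

print(workflow_stub.received)
print(pending)
```

With this stub, the loop ends because the scripted answers run out, not because the workflow finished: one request is still pending at the end.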

As I've pointed out in the other notebooks, the question "How many tickets were logged and Incidents for Human Resources and low priority?" is a hard one to get the system to answer - and this time the classifier correctly identified it as something the count_agent should take care of. That is a start!
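
For comparison, the keyword lists from the classifier prompt can be turned into a rule-based baseline. This is a minimal sketch: the keywords are copied from the prompt, but the matching order and the `classify` function are assumptions made for illustration, not what the classifier agent does internally.

```python
# Keyword lists copied from the classifier prompt; the matching logic is an assumption.
COUNT_PHRASES = ("how many", "number of", "count of", "total")
YES_NO_STARTERS = ("is", "are", "can", "does", "do", "will", "should")
SEMANTIC_PHRASES = ("how to", "why", "what causes", "solve", "fix", "issue with", "problem with")


def classify(question: str) -> str:
    """Return a routing label using keyword rules only."""
    text = question.lower().strip()
    if any(phrase in text for phrase in COUNT_PHRASES):
        return "COUNT_AGENT"
    first_word = text.split(maxsplit=1)[0] if text else ""
    if first_word in YES_NO_STARTERS:
        return "YES_NO_AGENT"
    if any(phrase in text for phrase in SEMANTIC_PHRASES):
        return "SEMANTIC_SEARCH_AGENT"
    return "NO_SEARCH_NEEDED"


for question in (
    "How many tickets were logged and Incidents for Human Resources and low priority?",
    "Are there any issues logged for Dell XPS laptops?",
    "What problems are there with Surface devices?",
):
    print(f"{classify(question):<22} {question}")
```

The baseline handles the count and yes/no questions, but the Surface question falls through to `NO_SEARCH_NEEDED` because none of the listed keywords appear in it. Gaps like this are the reason to let a language model weigh intent rather than rely on keywords alone.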

Next is the hands-on notebook, where you get to turn the lessons from these notebooks into an Agentic RAG solution for the IT support search index we've been using.