# Tools, API & Microservices

Now that we have seen the power of prompts and had a look at how they come together in a simple agent, let's formally explore a few other concepts.
Think of these concepts as the building blocks that allow our AI to not just talk, but to *do* things by interacting with other systems or data sources. This is much like how your monitoring tools use APIs to gather metrics, or how your automation scripts execute specific commands to manage infrastructure.

1.  Function calling
2.  Tool Calling
3.  Introduction to Agents
4.  Agents calling tools
5.  Agentic Patterns
6.  Agents and Microservices

_Each module typically depends on the successful completion of the prior modules._

### Verifying Python Software Installation

The following `pip install` command ensures all necessary Python software components (dependencies) are present. It's safe to run for verification, even if you've already completed Module 02 (which should have handled this).

In [1]:
%pip install -r requirements.txt


[notice] A new release of pip is available: 23.2.1 -> 25.1.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.


In [19]:
import openai
import re # Standard Python library for regular expressions (text pattern matching)
import httpx # A modern HTTP client library, used here for making web requests (like 'curl' or 'wget')
import os # Standard Python library for interacting with the operating system (e.g., environment variables)
# import rich # A library for rich text and beautiful formatting in the terminal (commented out)
import json # Standard Python library for working with JSON data
from openai import OpenAI # The official OpenAI library to interact with their models (or compatible APIs)
# import requests # Another popular library for making HTTP requests
from rich import print

os.environ["OPENAI_API_KEY"] = "sk-dummy_key"  # <-- This silences OpenAI tracing output

# --- Configuration for the AI Model ---

# API key for authentication. For local models, this might be a placeholder.
# In a production scenario, this would be a real secret, managed securely like any other API key.
api_key = "sk-placeholder"  

# Defines the specific Large Language Model (LLM) we want to use.
# Think of this like specifying a Docker image tag (e.g., 'nginx:latest' or 'python:3.9-slim').
# Different models have different capabilities, sizes, and performance characteristics.

model = "llama3.2:3b-instruct-fp16" 

# The network address (endpoint) of the LLM service.
# Here, it's pointing to a locally running service (e.g., Ollama, vLLM) on port 11434.
# In an Ops context, this is like the API endpoint for a service you manage or consume.
base_url = "http://localhost:11434/v1/" 

# --- Initialize the Client ---
# This creates a 'client' object that we'll use to send requests to the LLM.
# It's configured with the base_url and api_key defined above.
# This is like setting up your 'kubectl' context to point to the correct Kubernetes cluster
# or configuring an SDK to talk to a specific cloud service endpoint.
client = OpenAI(
    base_url=base_url,
    api_key=api_key,
)

print("[green] Imports complete, Client initialized, Model setup[/green]")

# Quick Test: Verify LLM Connectivity

Before we dive into the complexities of tool calling, let's perform a basic "health check." The code below sends a simple request to the Large Language Model (LLM) we just configured. This is analogous to pinging a server to ensure it's responsive or checking a service status endpoint before relying on it for critical operations. If this works, we know our basic setup and connection to the LLM are good.

> TIP: You can re-run cells and see varying results. For example, try changing `temperature=.1` below to, say, `.8` or `.9`.

In [20]:
# Create a chat completion request.
# This sends our message to the LLM and asks it to generate a response.
chat_completion = client.chat.completions.create(
    model=model, # Specifies which LLM to use (defined in the previous cell)
    messages=[ # The conversation history or current prompt
        # 'role: "user"' indicates the message is from the human user.
        # 'content: ...' is the actual text of the message.
        {"role": "user", "content": "What is AWS CloudFormation used for?"}
    ],
    temperature=.1, # Controls randomness. 0 means more deterministic, predictable output.
                   # Higher values (e.g., 0.7) make the output more creative/random.
)

# Print the model name being used for this request
print(f"Model used: {model}")
# Print the content of the LLM's response.
# chat_completion.choices[0].message.content extracts the text part of the reply.
print(f"LLM Response:\n{chat_completion.choices[0].message.content}")
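
The check above can also be wrapped in a small helper that retries a few times before giving up, much like a readiness probe. The following is only a sketch under stated assumptions: `check_llm_health`, `send_request`, and the retry parameters are hypothetical names, and the LLM call is passed in as a plain callable so the helper can run without a model server.

```python
import time
from typing import Callable


def check_llm_health(send_request: Callable[[], str], retries: int = 3,
                     delay_seconds: float = 0.0) -> bool:
    """Return True if send_request() yields a non-empty reply within `retries` attempts.

    `send_request` is a hypothetical zero-argument callable. In this notebook it
    could wrap client.chat.completions.create(...) and return the reply text.
    """
    for attempt in range(1, retries + 1):
        try:
            reply = send_request()
            if reply and reply.strip():
                return True
        except Exception as exc:  # Broad on purpose: any failure counts as "unhealthy"
            print(f"Attempt {attempt} failed: {exc}")
        time.sleep(delay_seconds)
    return False


# Stand-in for a real LLM call so the sketch runs offline.
def fake_llm_call() -> str:
    return "pong"


print(check_llm_health(fake_llm_call))
```

To use it against the real service, one option would be to pass a lambda that calls `client.chat.completions.create(...)` and returns `choices[0].message.content`.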

# Tool/Function Calling: Giving the AI New Capabilities

Now we get to the core idea: enabling our AI to use 'tools'. In an Ops context, a "tool" could be anything:
* A script that fetches current system load (`uptime`, `vmstat`).
* A command that restarts a service (`systemctl restart myapp`).
* An API call to a cloud provider (e.g., to list S3 buckets or check EC2 instance status).
* A database query.
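
To make the first bullet concrete, here is a minimal sketch of what such an Ops tool might look like as a Python function. The name `get_system_load` and the returned keys are hypothetical. It relies on `os.getloadavg()`, which is only available on Unix-like systems.

```python
import os


def get_system_load() -> dict:
    """Return the 1, 5 and 15 minute load averages (the same numbers `uptime` reports).

    Uses os.getloadavg(), which is available on Unix-like systems only.
    """
    one, five, fifteen = os.getloadavg()
    return {"load_1m": one, "load_5m": five, "load_15m": fifteen}


print(get_system_load())
```

A read-only tool like this is a low-risk starting point. A tool that restarts services would warrant stricter validation before execution.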

Here, we'll teach the AI how to use a pre-defined Python function as its first tool. This is the foundation of making the AI an active participant that can interact with external systems and data.

In [22]:
import requests # Ensure requests is imported if not done globally or if this cell is run independently

# This Python function, 'get_weather', acts as our external tool.
# Think of it as a script you might write to query a specific weather monitoring service or API.
def get_weather(latitude, longitude):
    """
    Fetches the current temperature for given latitude and longitude coordinates
    by calling an external weather API.
    Args:
        latitude (float): The latitude.
        longitude (float): The longitude.
    Returns:
        float: The current temperature in Celsius, or None if an error occurs.
    """
    # Construct the API URL with the provided latitude and longitude.
    # This is a public API from Open-Meteo.
    api_url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m"
    
    print(f"[get_weather tool] Querying API: {api_url}") # For Ops, logging what the tool is doing
    
    # Make the HTTP GET request to the weather API.
    # This is like running 'curl <api_url>'. The timeout keeps the tool from hanging on a slow endpoint.
    response = requests.get(api_url, timeout=10)
    
    # Check if the request was successful (HTTP status code 200)
    if response.status_code == 200:
        # Parse the JSON response from the API into a Python dictionary.
        # APIs often return data in JSON format.
        data = response.json()
        # Extract the specific piece of data we need: current temperature.
        # This requires knowing the structure of the API's JSON response.
        current_temperature = data['current']['temperature_2m']
        print(f"[get_weather tool] API Response (temperature): {current_temperature}°C")
        return current_temperature
    else:
        print(f"[get_weather tool] Error fetching weather data. Status code: {response.status_code}")
        return None # Return None or raise an error to indicate failure

print("[green] Defined the get_weather external tool [/green]")

# Validating the Tool Manually

Before we let the AI try to use our `get_weather` tool, it's good practice to test it ourselves. This ensures the tool (our Python function) works as expected and returns the data we need. This is like manually running a new script or an `ansible-playbook --check` before fully automating it or putting it into production.

In [23]:
# We're calling our 'get_weather' function directly, just like any other Python function,
# to test it with coordinates for Paris.
latitude_paris = 48.8566
longitude_paris = 2.3522

print(f"Manually calling get_weather for Paris (Lat: {latitude_paris}, Lon: {longitude_paris})...")
weather_in_paris = get_weather(latitude_paris, longitude_paris)

# Print the result from our manual call.
if weather_in_paris is not None:
    print(f"[green] Manual Test: Current temperature in Paris is {weather_in_paris}°C[/green]")
else:
    print("Manual Test: Failed to get weather for Paris.")
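
The manual call above depends on network access. The parsing step can also be checked offline against a canned payload. This is a sketch only: `extract_temperature` and `sample_payload` are hypothetical, and the only detail taken from the tool itself is the `current` / `temperature_2m` path it reads.

```python
from typing import Optional


def extract_temperature(payload: dict) -> Optional[float]:
    """Pull current.temperature_2m out of a weather payload, or return None if it is missing."""
    try:
        return float(payload["current"]["temperature_2m"])
    except (KeyError, TypeError, ValueError):
        return None


# Hypothetical, hand-written payload shaped like the fields get_weather reads.
sample_payload = {"current": {"temperature_2m": 17.4, "wind_speed_10m": 9.0}}

print(extract_temperature(sample_payload))   # 17.4
print(extract_temperature({"error": True}))  # None
```

Separating parsing from the HTTP call makes the failure path easy to exercise without waiting for the real API to misbehave.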

## LLM's Ability to Use Tools

The choice of LLM influences how well it understands *when* and *how* to use the tools you provide. Larger, more capable models generally reason better about tool use. However, today we are using the smaller, more compact `llama3.2:3b-instruct-fp16` model, which fits well on our relatively modest Nvidia L4 lab machines.

In more complex scenarios, where the decision to use a tool (and with what inputs) isn't straightforward, larger models are typically recommended.

Let's see how our current LLM decides whether to use the `get_weather` tool based on a user's question.

In [24]:
# --- Defining the Tool for the LLM ---
# This 'tools' list describes the 'get_weather' function in a way the LLM can understand.
# It's like providing a "man page" or API specification for the tool.

tools = [{
    "type": "function", # Specifies this tool is a function.
    "function": {
        "name": "get_weather", # The exact name of our Python function.
        "description": "Get the current temperature for a provided geographical location (latitude and longitude) in Celsius.",
                       # This description is CRUCIAL. The LLM uses it to decide WHEN this tool is appropriate.
        "parameters": { # Defines the inputs (arguments) the function expects.
            "type": "object", # Parameters are described as a JSON schema object.
            "properties": {   # Each property is an argument.
                "latitude": {
                    "type": "number", # The data type of the latitude argument.
                    "description": "The latitude of the location." # Description for the LLM.
                },
                "longitude": {
                    "type": "number", # The data type of the longitude argument.
                    "description": "The longitude of the location."
                }
            },
            "required": ["latitude", "longitude"], # Tells the LLM these arguments are mandatory.
            "additionalProperties": False # For stricter validation (not explicitly used here by Ollama but good practice)
        },
        "strict": True # Another flag for strictness (behavior might vary by LLM server)
    }
}]

# --- The User's Question ---
# This is the input that will make the LLM consider using the tool.
messages = [{"role": "user", "content": "What's the weather like in Paris today?"}]

print(f"Asking LLM: \"{messages[0]['content']}\" with tool definition provided.")

# --- Asking the LLM to Process the Message with Tool Information ---
# We send the message and the tool definition to the LLM.
# The LLM will NOT execute the function itself. It will respond by saying
# WHICH function it THINKS should be called, and with WHAT arguments.
completion = client.chat.completions.create(
    model=model,
    messages=messages,
    tools=tools, # Provide the list of available tools to the LLM.
    tool_choice="auto", # "auto" means the LLM decides if/which tool to call.
                       # Can also be {"type": "function", "function": {"name": "get_weather"}} to force a tool.
)

# --- Examining the LLM's Decision ---
# The LLM's response might include 'tool_calls' if it decided a tool is needed.
llm_decision = completion.choices[0].message.tool_calls
print("\nLLM's decision on tool usage:")
if llm_decision:
    print(llm_decision)
    # This output will show that the LLM wants to call 'get_weather'
    # and has (hopefully) determined the latitude and longitude for Paris.
else:
    print("LLM decided not to use any tool for this query.")

# Understanding the LLM's Decision

If you look at the output from the previous cell, you should see something like:
`[ChatCompletionMessageToolCall(id='call_ookamoag', function=Function(arguments='{"latitude":48.8566,"longitude":2.3522}', name='get_weather'), type='function', index=0)]`

This output is critical:
* **`name='get_weather'`**: The LLM correctly identified that our `get_weather` tool should be used to answer the question.
    * We would reference this, programmatically, as `llm_decision[0].function.name`.
* **`arguments='{"latitude":48.8566,"longitude":2.3522}'`**: Importantly, the LLM also figured out the *arguments* (latitude and longitude for Paris) to pass to the tool. It inferred these from the word "Paris" in our question. This demonstrates the LLM's reasoning capability.
* **The LLM has *not yet called* the function.** It has only stated its *intent* to call it and with which parameters. Our Python code is still in control. This is like a supervisor approving a plan before it's executed.

In [25]:
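# A naive first attempt: feed the LLM's tool-call message straight back WITHOUT running the tool.
# We work on a copy so the shared 'messages' history stays clean for the cells below.
naive_messages = list(messages)
naive_messages.append(completion.choices[0].message)  # append model's function call message
naive_messages.append({                               # append result message
    "role": "tool",
    "tool_call_id": llm_decision[0].id,
    # Placeholder only: this echoes the requested call, not a real temperature,
    # because get_weather has not been executed yet.
    "content": str(llm_decision[0].function)
})

completion_2 = client.chat.completions.create(
    model=model,
    messages=naive_messages,
    tools=tools,
)
print(completion_2.choices[0].message.content)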

In [26]:
# import json # Make sure json is imported

# Check if the LLM actually decided to call a tool.
if completion.choices[0].message.tool_calls:
    # --- Step 1: Extract the LLM's intended tool call ---
    # Get the first tool call the LLM decided on (it could potentially suggest multiple).
    tool_call = completion.choices[0].message.tool_calls[0]
    function_name = tool_call.function.name
    
    print(f"\nLLM wants to call function: '{function_name}'")
    
    # --- Step 2: Parse the arguments the LLM provided ---
    # The arguments are a JSON string, so we need to parse them into a Python dictionary.
    try:
        args = json.loads(tool_call.function.arguments)
        print(f"With arguments: {args}")
    except json.JSONDecodeError:
        print(f"Error: LLM provided invalid JSON arguments: {tool_call.function.arguments}")
        args = None # Handle error state

    # --- Step 3: Execute the actual Python function ---
    # This is where OUR code calls OUR function. The LLM doesn't run this directly.
    # This is a crucial security and control point.
    if function_name == "get_weather" and args:
        print(f"Our code is now executing the '{function_name}' function based on LLM's plan...")
        # Call our 'get_weather' function with the arguments extracted from the LLM's decision.
        result = get_weather(args["latitude"], args["longitude"])
        
        if result is not None:
            print(f"\nTool execution successful. Result for Paris: {result}°C")
        else:
            print(f"\nTool execution failed or returned no data for Paris.")
    else:
        print(f"Error: LLM wanted to call an unknown function ('{function_name}') or arguments were invalid.")
        result = "Error: Could not execute the tool as planned."
else:
    print("LLM did not request any tool calls in the previous step.")
    result = "No tool was called." # Placeholder if no tool call was made

# This entire flow is a two-step process:
# 1. LLM analyzes the request and decides *which* tool to use and *what* arguments to use (planning step).
# 2. Our Python code takes that plan, validates it (optional but recommended), and then *executes* the actual tool/function (execution step).
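
The validation mentioned in step 2 above can be sketched as an allowlist of tools plus a type check on the arguments. Everything here is illustrative: `PlannedCall`, `TOOL_REGISTRY`, `execute_planned_call`, and the stub `fake_get_weather` are hypothetical names, and the stub returns a fixed value so the example runs offline.

```python
import json
from dataclasses import dataclass


@dataclass
class PlannedCall:
    """Hypothetical stand-in for the name/arguments pair found on an LLM tool call."""
    name: str
    arguments: str  # JSON string, as the LLM returns it


def fake_get_weather(latitude: float, longitude: float) -> float:
    """Stub tool so the sketch runs offline; a real tool would call the weather API."""
    return 17.4


# Allowlist: only functions registered here can ever be executed.
# Each entry maps a tool name to (function, {argument name: accepted types}).
TOOL_REGISTRY = {
    "get_weather": (fake_get_weather, {"latitude": (int, float), "longitude": (int, float)}),
}


def execute_planned_call(call: PlannedCall) -> str:
    """Validate an LLM-proposed call against the registry, then run it."""
    if call.name not in TOOL_REGISTRY:
        return f"Error: unknown tool '{call.name}'"
    func, expected = TOOL_REGISTRY[call.name]
    try:
        args = json.loads(call.arguments)
    except json.JSONDecodeError:
        return "Error: arguments are not valid JSON"
    if not isinstance(args, dict) or set(args) != set(expected):
        return f"Error: expected arguments {sorted(expected)}"
    for key, allowed_types in expected.items():
        # bool is a subclass of int, so reject it explicitly for numeric fields.
        if isinstance(args[key], bool) or not isinstance(args[key], allowed_types):
            return f"Error: argument '{key}' has the wrong type"
    return str(func(**args))


print(execute_planned_call(PlannedCall("get_weather", '{"latitude":48.8566,"longitude":2.3522}')))
print(execute_planned_call(PlannedCall("delete_everything", "{}")))
```

Returning error strings rather than raising means the message can be handed back to the LLM as the tool result, giving it a chance to correct its request.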

In [27]:
# --- Continuing the Conversation with the LLM ---
# Now that we've executed the tool and have a result, we need to give this
# information back to the LLM so it can formulate a natural language answer
# to the user's original question.

# 'messages' currently contains: {"role": "user", "content": "What's the weather like in Paris today?"}

# Add the LLM's previous response (its decision to call a tool) to the conversation history.
# This tells the LLM "you decided to make a tool call".
if completion.choices[0].message.tool_calls: # Ensure a tool call was made
    messages.append(completion.choices[0].message)

    # Add a new message with the *result* from our tool execution.
    # 'role: "tool"' signifies that this message contains the output of a tool call.
    # 'tool_call_id' links this result back to the LLM's specific tool call request.
    # 'content' is the actual data returned by our 'get_weather' function.
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id, # From the LLM's previous response
        "content": str(result) # The temperature we got, converted to a string.
    })

    print("\nUpdated conversation history sent back to LLM (including tool result):")
    for msg_item in messages: # Renamed msg to msg_item to avoid conflict if msg was a global
        print(msg_item)

    # --- Ask the LLM to Generate a Final Answer Using the Tool's Output ---
    # Now, send the entire conversation history (original question + LLM's tool call + tool's result)
    # back to the LLM.
    print("\nRequesting final natural language answer from LLM based on tool result...")
    completion_with_tool_result = client.chat.completions.create(
        model=model,
        messages=messages, # The updated conversation history
        tools=tools,       # It's good practice to pass tools again, though some models might not need it here.
    )

    # The LLM should now respond with a natural language answer, like "The current temperature in Paris is 17.4°C."
    final_answer = completion_with_tool_result.choices[0].message.content
    print(f"\nLLM's Final Answer: {final_answer}")
else:
    print("\nNo tool call was made, so no result to send back to LLM for summarization.")
    # If the LLM didn't call a tool, its first response might already be the final answer.
    # Or, you might handle this case differently.
    # For this example, let's assume the first response was it if no tool call.
    final_answer = completion.choices[0].message.content
    print(f"\nLLM's Initial (and Final) Answer (no tool used): {final_answer}")

In [29]:
# --- Using an Agent Framework to Simplify Tool Calling ---
# The previous cells showed the manual, step-by-step process of tool calling:
# 1. Define tool for LLM.
# 2. LLM decides to use tool + suggests arguments.
# 3. Our code extracts this, calls our actual Python function.
# 4. Our code sends the function's result back to the LLM.
# 5. LLM gives final answer.
#
# Agent frameworks (like the 'agents' library used here, or LangChain, LlamaIndex)
# abstract away much of this boilerplate.

# This import brings in classes for defining Agents, Models, and Tools more easily.
# The 'agents' package is the OpenAI Agents SDK (installed as 'openai-agents').
from agents import Agent, ModelSettings, function_tool, Runner, AsyncOpenAI, OpenAIChatCompletionsModel

# --- Redefining our Tool using a Decorator ---
# The '@function_tool' decorator is a shortcut provided by the agent library.
# It automatically handles creating the necessary JSON schema tool definition
# (like we did manually in the 'tools' variable earlier) based on the Python function's
# signature and docstring. This makes tool definition much cleaner.
@function_tool
def get_weather(latitude: str, longitude: str) -> str: # Type hints (latitude: str, -> str) help the decorator
    """Fetches current temperature for given coordinates.
       Use this tool to find the weather when asked about temperature.
       The latitude and longitude are strings representing numbers.
    """ # The docstring is often used for the tool's "description" for the LLM.
    api_url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m"
    # Important: The agent framework might expect the tool to return a string.
    # The request sits inside the try block so that network failures are handled too.
    try:
        response = requests.get(api_url, timeout=10)
        data = response.json()
        temperature = data['current']['temperature_2m']
        return f"{temperature}" # Returning as string
    except (KeyError, TypeError, ValueError, requests.exceptions.RequestException) as e:
        # RequestException covers connection errors and timeouts; ValueError covers JSON decoding failures.
        print(f"[get_weather_tool_decorated] Error: {e}") # Log the error
        return f"Error retrieving temperature: {e}"


# --- Configure the Model for the Agent ---
# This sets up the LLM configuration, similar to how we initialized the 'client' earlier,
# but now it's wrapped in a way the agent framework understands.
# 'AsyncOpenAI' suggests it might be using asynchronous operations, common in modern Python.
model_config = OpenAIChatCompletionsModel( 
    model=model,
    openai_client=AsyncOpenAI(base_url="http://localhost:11434/v1",api_key=api_key) # LLM connection details
)

# --- Define the Agent ---
# An 'Agent' bundles the LLM, its instructions, and the tools it can use.
# Think of it as a specialized worker:
# - 'name': Just a label.
# - 'instructions': Tells the agent its purpose and how it should behave. This is its primary prompt.
# - 'model': The LLM configuration it will use.
# - 'tools': The list of tools (our decorated 'get_weather' function) available to this agent.

agent = Agent(
    name="WeatherBot", # More descriptive agent name
    instructions="You are a helpful assistant that provides weather information. Answer the question asked very precisely. Please think before answering and use available tools if necessary.",
    model= model_config, # Use the configured model
    tools=[get_weather], # Make our decorated function available to this agent
)

# --- Run the Agent ---
# The 'Runner' takes the agent and the user's question, and then manages the
# entire interaction flow automatically. This includes:
# - Sending the question and tool definitions to the LLM.
# - If the LLM decides to use a tool:
#   - Parsing the LLM's intended function call and arguments.
#   - Executing the actual Python tool function.
#   - Sending the tool's result back to the LLM.
# - Getting the final natural language response from the LLM.
# 'await' indicates this is an asynchronous operation.
print("Running agent to find out which is warmer: Paris or Manila...")
# The agent should automatically call get_weather twice, once for Paris and once for Manila.
result_from_agent = await Runner.run(agent, "which is warmer now: Paris or Manila?")

# Print the final output from the agent.
# The 'result_from_agent' object might contain more details about the interaction (e.g., tool calls made).
print("\nAgent's Final Output:")
print(result_from_agent.final_output)

# To see the step-by-step thought process and tool calls the agent made:
print("[green] Detailed agent interaction trace: [/green]")
print(result_from_agent) 

# Tools (Simplified Definition)

In the previous "agent" example, we used an "adornment" (specifically, a Python **decorator** called `@function_tool`) to easily define our `get_weather` function as a tool for the agent. This is a common and convenient pattern provided by many AI agent libraries, as it simplifies the manual JSON schema definition we did earlier.

More broadly, there's a trend in the AI community and industry towards standardizing how tools (or functions, or APIs) are described and made available to AI models. The Model Context Protocol (MCP) is one such initiative. The goal is to make it easier for AI agents to discover, understand, and reliably use a wide range of external capabilities, whether they are simple Python functions, complex microservices, or external SaaS APIs. For an Ops team, this means the "tools" your AI can use could be your existing monitoring endpoints, automation scripts, or infrastructure APIs.
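
To make concrete what a decorator like `@function_tool` saves you from writing by hand, here is a minimal sketch that derives a tool definition from a function's signature and docstring using the standard `inspect` module. It is an illustration of the idea only, not the `agents` library's actual implementation. `build_tool_schema` and `restart_service` are hypothetical names.

```python
import inspect
from typing import Callable

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_tool_schema(func: Callable) -> dict:
    """Derive an OpenAI-style tool definition from a function's signature and docstring."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unrecognized types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # No default value means the argument is mandatory
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {"type": "object", "properties": properties, "required": required},
        },
    }


def restart_service(service_name: str, force: bool = False) -> str:
    """Restart a named service (hypothetical example tool)."""
    return f"restarted {service_name}"


schema = build_tool_schema(restart_service)
print(schema["function"]["name"], schema["function"]["parameters"]["required"])
```

The resulting dictionary has the same shape as the hand-written `tools` entry from earlier, which is why good type hints and docstrings matter when tools are defined this way.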

# Agentic Patterns: Orchestrating AI Workers

Once you have AI agents that can use tools, you can start combining them in powerful ways. These "agentic patterns" are like design patterns in software engineering, but for building more complex AI systems. They describe common ways to structure interactions between multiple agents or between agents and tools.

Below we explore three common agentic patterns:

1.  **Agents Collaborating:** Multiple agents work together on a task, often with one reviewing or refining the work of another.
    * **Ops Analogy:** Imagine a junior admin drafting a complex firewall change request (first agent). A senior admin or a security specialist (second agent) then reviews the draft, provides feedback, or makes corrections before it's applied. This iterative process enhances quality and reduces errors.
    ![Collaboration Pattern](resources/images/agent_collaborate.png)

2.  **Agents Routing (or Supervising/Dispatching):** One agent acts as a controller or dispatcher, analyzing an incoming request and routing it to the appropriate specialized agent.
    * **Ops Analogy:** This is like a central IT helpdesk system or a primary on-call engineer. An incoming alert or ticket (the request) is first analyzed by a "triage agent." This agent then routes the issue to the specialized team (another agent) best equipped to handle it – e.g., the database team agent, the networking team agent, or the application support agent.
    ![pattern-1](resources/images/agent_supervisor_pattern.png)  
    ![pattern-2](resources/images/agent_hierarchical.png) 

3.  **Agents in a Workflow (or Chain/Sequence):** Agents perform tasks in a predefined sequence, where the output of one agent becomes the input for the next.
    * **Ops Analogy:** Think of an automated incident response plan or a CI/CD pipeline.
        * **Incident Response:** An "alert-processing agent" receives an alert. It passes details to a "log-analysis agent" to gather relevant logs. The log summary then goes to a "remediation-suggestion agent" which proposes actions. This is like an Ansible playbook executing a series of tasks in order.
        * **CI/CD:** Code commit (input) -> Build Agent -> Test Agent -> Deployment Agent.
    ![Workflow Pattern](resources/images/agent_plan_execute.png)

There are other agentic patterns as well, but these basic concepts should help you understand how more complex multi-agent systems can be designed to tackle sophisticated problems.

_The graphics are taken from the [LangGraph tutorials](https://github.com/langchain-ai/langgraph/blob/main/docs/docs/tutorials)._
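
The routing and workflow patterns described above can be sketched with plain Python functions standing in for agents. This is a structural illustration under heavy assumptions: every function name is hypothetical, the keyword rules stand in for an LLM's routing decision, and the strings each step returns are placeholders rather than real analysis.

```python
# --- Routing: a triage step picks exactly one specialist ---
def database_agent(ticket: str) -> str:
    return f"[database] investigating: {ticket}"


def network_agent(ticket: str) -> str:
    return f"[network] investigating: {ticket}"


def app_support_agent(ticket: str) -> str:
    return f"[app-support] investigating: {ticket}"


# Keyword rules stand in for the triage agent's (LLM-driven) decision.
ROUTES = [
    (("sql", "database", "replication"), database_agent),
    (("dns", "packet", "firewall"), network_agent),
]


def triage(ticket: str) -> str:
    lowered = ticket.lower()
    for keywords, specialist in ROUTES:
        if any(word in lowered for word in keywords):
            return specialist(ticket)
    return app_support_agent(ticket)  # Default route when nothing matches


# --- Workflow: a fixed sequence where each output feeds the next step ---
def process_alert(alert: str) -> str:
    return f"alert={alert}"


def analyze_logs(details: str) -> str:
    return f"{details}; logs=summarized"


def suggest_remediation(summary: str) -> str:
    return f"{summary}; suggestion=restart service"


def run_workflow(initial: str, steps: list) -> str:
    value = initial
    for step in steps:
        value = step(value)
    return value


print(triage("Replication lag on the SQL cluster"))
print(run_workflow("disk_full", [process_alert, analyze_logs, suggest_remediation]))
```

The difference between the two is where the control flow lives: in routing, a decision made at runtime selects one branch, while in a workflow the sequence is fixed in advance.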

## Agents Collaborating: Enhancing Quality through Teamwork

This pattern is particularly powerful for improving the accuracy and quality of AI-generated content or decisions.
1.  It simply demonstrates an agent reviewing the work of another agent – much like peer review among humans.
2.  This is one of the primary reasons why systems of smaller, specialized agents can sometimes outperform a single, much larger monolithic model. Just like in Ops, a "second pair of eyes" (another admin reviewing a script or a change plan) can catch errors or suggest improvements.
3.  This pattern can be used in many scenarios:
    * Generating code and having another agent review it for bugs or style.
    * Drafting a report and having another agent check it for factual accuracy or clarity.
    * Creating a plan and having another agent critique its feasibility.
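
Stripped of any LLM, the review loop behind this pattern looks roughly like the sketch below. The names `draft_agent`, `review_agent`, `Review`, and `collaborate` are hypothetical, and both "agents" are deterministic stand-ins so the control flow can be followed and run offline.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Review:
    score: str     # "pass" or "needs_improvement"
    feedback: str  # What to fix; empty when the draft passes


def draft_agent(topic: str, feedback: Optional[str]) -> str:
    """Stand-in for a generator agent: revises its draft when feedback is given."""
    draft = f"Plan for {topic}"
    if feedback:
        draft += " (includes rollback steps)"
    return draft


def review_agent(draft: str) -> Review:
    """Stand-in for a reviewer agent: passes only drafts that mention rollback."""
    if "rollback" in draft:
        return Review("pass", "")
    return Review("needs_improvement", "Add rollback steps.")


def collaborate(topic: str, max_iterations: int = 3) -> Tuple[str, int]:
    """Alternate drafting and reviewing until the reviewer passes or the limit is hit."""
    feedback = None
    draft = ""
    for iteration in range(1, max_iterations + 1):
        draft = draft_agent(topic, feedback)
        review = review_agent(draft)
        if review.score == "pass":
            return draft, iteration
        feedback = review.feedback  # Feed the critique into the next draft
    return draft, max_iterations


print(collaborate("firewall change"))
# ('Plan for firewall change (includes rollback steps)', 2)
```

Two details carry over to the real version: the reviewer's feedback must actually reach the generator on the next pass, and the iteration cap guarantees the loop terminates even if the reviewer never passes the draft.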

In [None]:
from dataclasses import dataclass
from typing import Literal

# Assuming 'Agent', 'ItemHelpers', 'Runner', 'TResponseInputItem', 'trace' are from the 'agents' library
from agents import Agent, ItemHelpers, Runner, TResponseInputItem, trace

"""
This example shows agents collaborating: one agent reviews the work of another and provides feedback.
- The first agent ('story_outline_generator') generates an outline for a story.
- The second agent ('evaluator') judges the outline and provides structured feedback.
We loop, refining the outline based on feedback, until the judge is satisfied.
This is like a dev writing code and a QA engineer testing it, iterating until it passes.
"""
# Model choice for this example:
# - The original version of this example used "gpt-4o" through the OpenAI API.
# - Here we reuse 'model_config' from the agent cell above, so both agents run on our local model.
# - A smaller local model's ability to follow complex evaluation instructions will vary.
# - Pinning a specific model version is generally better for reproducibility.
collab_model_to_use = model_config

# Agent 1: The Content Creator
story_outline_generator = Agent(
    name="story_outline_generator",
    instructions=(
        # Trailing spaces matter: adjacent string literals are joined with no separator.
        "You are a creative writer. Generate a very short story outline based on the user's input theme or topic. "
        "If feedback is provided by an evaluator, use that feedback to revise and improve your outline."
    ),
    model=collab_model_to_use,
)


# --- Define the Structure for Feedback ---
# This '@dataclass' defines a specific format for the evaluator's output.
# It's like defining the fields in a structured log or a JIRA ticket.
# This ensures the feedback is consistent and easy for the other agent (and our code) to parse.
@dataclass
class EvaluationFeedback:
    feedback: str  # Textual feedback on what to improve.
    score: Literal["pass", "needs_improvement", "fail"] # A fixed set of possible scores.
                                                        # 'Literal' ensures only these strings are valid.


# Agent 2: The Critic/Evaluator
evaluator = Agent(
    name="evaluator",
    instructions=(
        # Trailing spaces matter: adjacent string literals are joined with no separator.
        "You are a strict editor. Evaluate the provided story outline. "
        "Decide if it's good enough based on clarity, creativity, and completeness for a short story. "
        "Provide specific feedback on what needs to be improved if it's not 'pass'. "
        "You MUST output your evaluation in the format defined by 'EvaluationFeedback' (feedback string and score)."
    ),
    model=collab_model_to_use,
    output_type=EvaluationFeedback, # Crucial: Tells the agent to structure its output like this dataclass.
)

print("Collaborating agents (generator and evaluator) have been defined.")

In [None]:
# Get initial input from the user (e.g., the theme of the story).
msg = input("What kind of story would you like an outline for? (e.g., 'a space adventure', 'a mystery') ")

# Initial input for the story_outline_generator.
# 'TResponseInputItem' is likely a type defined by the 'agents' library for conversation history.
input_items: list[TResponseInputItem] = [{"content": msg, "role": "user"}]

latest_outline: str | None = None # To store the most recent outline from the generator.
max_iterations = 3 # Safety break to prevent infinite loops, like a circuit breaker.

print(f"\nStarting collaboration for a '{msg}' story outline (max {max_iterations} iterations)...")

# 'trace' is likely a context manager from the 'agents' library for logging or monitoring.
with trace("StoryOutlineCollaboration"): # Name for this traced operation.
    for i in range(max_iterations):
        print(f"\n--- Iteration {i + 1} ---")
        
        # --- Step 1: Generate Outline ---
        # The story_outline_generator takes the current input_items (which includes user request and any prior feedback).
        print("Generator agent is creating/revising the outline...")
        story_outline_result = await Runner.run(
            story_outline_generator,
            input_items,
        )

        # Update input_items with the generator's response for the evaluator.
        # 'to_input_list()' returns the original input plus the items produced by this run, ready to pass to the next agent.
        input_items = story_outline_result.to_input_list()
        # Extract the text of the generated outline.
        # 'ItemHelpers.text_message_outputs' concatenates the text of the message outputs in the given items.
        latest_outline = ItemHelpers.text_message_outputs(story_outline_result.new_items)
        print(f"Generator Agent Output:\n{latest_outline}")

        # --- Step 2: Evaluate Outline ---
        # The evaluator agent takes the generated outline (now part of input_items).
        print("\nEvaluator agent is reviewing the outline...")
        evaluator_result = await Runner.run(evaluator, input_items)
        
        # The 'evaluator' agent was defined with 'output_type=EvaluationFeedback',
        # so 'final_output' should be an instance of our EvaluationFeedback dataclass.
        evaluation: EvaluationFeedback = evaluator_result.final_output
        
        print(f"Evaluator Agent Score: {evaluation.score}")
        if evaluation.score != "pass":
            print(f"Evaluator Agent Feedback: {evaluation.feedback}")

        # --- Step 3: Decide to Continue or Stop ---
        if evaluation.score == "pass":
            print("\nStory outline is good enough! Collaboration complete.")
            break # Exit the loop.
        
        if i == max_iterations - 1: # Check if we've hit the max iterations
            print("\nMaximum number of iterations reached. Exiting collaboration.")
            break

        print("Outline needs improvement. Re-running generator with feedback...")
        
        # --- Step 4: Incorporate Feedback for Next Iteration ---
        # Add the evaluator's feedback as a new 'user' message to guide the generator's next attempt.
        # This simulates a conversation where the generator "reads" the feedback.
        input_items.append({"content": f"Evaluator's Feedback: {evaluation.feedback}", "role": "user"})
    # No 'for ... else' clause is needed here: both exit paths above (a 'pass' score, or the
    # final iteration) leave the loop via 'break', so an 'else' block would never run.

print("\n--- Final Story Outline ---")
if latest_outline:
    print(latest_outline)
else:
    print("No outline was successfully generated.")

## Agents Routing: The Intelligent Dispatcher

This pattern is about efficiently directing tasks to the correct specialist.
1.  It demonstrates a "supervisor" or "triage" agent that doesn't do the main work itself, but instead decides which other agent is best suited for the job.
2.  This is a very common and practical agentic pattern, especially for building systems that need to handle diverse requests.
    * **Ops Analogy:** Think of a sophisticated monitoring system's alert router. When an alert comes in, a central 'router' component analyzes the alert's source (e.g., server name, application ID, error type) and forwards it to the specific on-call team's dashboard or notification channel (database team, network team, application team).
3.  The experiment "Ask the question in German and see what happens!" highlights the need for robustness. **Crucially, in a real-world Ops system using this pattern, you *must* have a well-defined fallback or default handler.** This 'fallback agent' (like `know_all_agent` in the code) gracefully handles requests for which the triage agent has no specialist, preventing errors and ensuring a predictable user experience (e.g., "Sorry, I can only handle requests in English, French, or Spanish. Please rephrase your request.").
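
The fallback requirement can be illustrated without any LLM. The sketch below is a deliberately simplified stand-in for the triage step: `detect_language` is a toy keyword matcher (an assumption for illustration, not how the triage agent decides), and the handler functions are hypothetical placeholders for the specialist agents. The point is only that every unrecognized input lands on an explicit default handler.

```python
from typing import Callable, Optional

# Hypothetical handlers standing in for the specialist agents.
def handle_french(msg: str) -> str:
    return f"[french_agent] {msg}"

def handle_spanish(msg: str) -> str:
    return f"[spanish_agent] {msg}"

def handle_english(msg: str) -> str:
    return f"[english_agent] {msg}"

def handle_fallback(msg: str) -> str:
    # Default handler: always returns a predictable, polite refusal.
    return ("Sorry, I can only handle requests in English, French, or Spanish. "
            "Please rephrase your request.")

# Toy keyword lists (illustrative only; a real router would use a model or a library).
KEYWORDS: dict[str, set[str]] = {
    "french": {"bonjour", "merci", "monsieur"},
    "spanish": {"hola", "gracias", "como"},
    "english": {"hello", "thanks", "how"},
}

ROUTES: dict[str, Callable[[str], str]] = {
    "french": handle_french,
    "spanish": handle_spanish,
    "english": handle_english,
}

def detect_language(msg: str) -> Optional[str]:
    """Return the first language whose keywords appear in the message, else None."""
    words = set(msg.lower().split())
    for language, keywords in KEYWORDS.items():
        if words & keywords:
            return language
    return None

def route(msg: str) -> str:
    """Dispatch to a specialist, or to the fallback when no route matches."""
    language = detect_language(msg)
    handler = ROUTES.get(language, handle_fallback) if language else handle_fallback
    return handler(msg)
```

Calling `route("Wie geht es dir")` returns the fallback message, because none of the keyword sets match.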

In [None]:
import uuid # For generating unique IDs, though not explicitly used in this agent setup snippet.
#from openai.types.responses import ResponseContentPartDoneEvent, ResponseTextDeltaEvent # Not used in this snippet

# Assuming 'Agent', 'RawResponsesStreamEvent', 'Runner', 'TResponseInputItem', 'trace' are from the 'agents' library
from agents import Agent, RawResponsesStreamEvent, Runner, TResponseInputItem, trace

"""
This example shows the handoffs/routing pattern.
- The 'triage_agent' receives the user's first message.
- Based on the language of the request (or other criteria), it "hands off" the work
  to an appropriate specialist agent (French, Spanish, English).
- If no specialist is found, it hands off to a 'know_all_agent' (our fallback).
This is like a call center IVR that directs you to the right department.
"""

# Adapting for local model use:
model_for_routing = model_config

# --- Define Specialist Agents ---
# Each of these agents is specialized for a single language.
# Their instructions tell them their linguistic scope.
french_agent = Agent(
    name="french_agent",
    instructions="You are a helpful assistant. You MUST respond ONLY in French.",
    model=model_for_routing,
)

spanish_agent = Agent(
    name="spanish_agent",
    instructions="You are a helpful assistant. You MUST respond ONLY in Spanish.",
    model=model_for_routing,
)

english_agent = Agent(
    name="english_agent",
    instructions="You are a helpful assistant. You MUST respond ONLY in English. Answer the question you received directly.",
    model=model_for_routing,
)

# --- Define the Fallback Agent ---
# This agent handles cases where the language isn't recognized or supported by specialists.
# Its instructions are key to providing a good user experience for unhandled cases.
know_all_agent = Agent(
    name="know_all_agent", # Perhaps "fallback_handler_agent" or "unsupported_language_agent"
    instructions=(
        "You are a polite assistant. You primarily speak English. "
        "If the user's query is not in English, French, or Spanish, you should state that you can only fully assist in English, French, or Spanish. "
        "Politely ask the user to repeat their question in one of those languages. "
        "Do not attempt to answer questions in languages you are not explicitly designed for."
    ),
    model=model_for_routing,
)

# --- Define the Triage/Routing Agent ---
# This is the smart router.
# 'instructions' tell it how to make routing decisions.
# 'handoffs' lists the specialist agents it knows about and can delegate work to.
# The order in 'handoffs' might matter if the LLM considers them sequentially.
triage_agent = Agent(
    name="triage_agent",
    instructions=(
        "You are a language detection and routing specialist. "
        "Analyze the user's request to determine its language. "
        "If the language is French, handoff to 'french_agent'. "
        "If the language is Spanish, handoff to 'spanish_agent'. "
        "If the language is English, handoff to 'english_agent'. "
        "If the language is not one of these or you are unsure, handoff to 'know_all_agent'. "
        "You do not answer questions yourself; your sole job is to route to the correct agent."
    ),
    handoffs=[french_agent, spanish_agent, english_agent, know_all_agent], # The list of agents it can delegate to.
    model=model_for_routing,
)

print("Routing agents (triage, French, Spanish, English, fallback) defined.")
# Experiment idea from notebook:
# Remove the final sentence ("Do not attempt to answer questions in languages you are not
# explicitly designed for.") from know_all_agent's instructions, then ask "Wie geht es dir" (German).
# The goal is to see whether 'know_all_agent' still recognizes that it can't handle German
# and gives the polite refusal, rather than attempting a (potentially poor) German response.

In [None]:
# Get input from the user.
msg_route = input("Hi! We support French, Spanish, and English. How can I help you today? ") # Renamed msg to msg_route

# Prepare the input for the triage_agent.
inputs_route: list[TResponseInputItem] = [{"content": msg_route, "role": "user"}]

print(f"\nSending user message to triage_agent: '{msg_route}'")

# 'trace' is for logging/monitoring the agent interaction.
with trace("LanguageRouterFlow"): # Naming the traced operation
    # --- Run the Triage Agent ---
    # We send the user's input to our 'triage_agent'.
    # The 'triage_agent' will internally analyze the message, decide which specialist agent
    # (French, Spanish, English, or the fallback 'know_all_agent') should handle this,
    # and then transparently pass the work to that specialist.
    # The 'routing_result' we get back will hold the final output from whichever specialist agent handled the request.
    routing_result = await Runner.run(triage_agent, inputs_route)
    
    # For Ops, seeing the internal decision-making can be very useful for debugging.
    # The original notebook suggests uncommenting a pprint to see details.
    # For example, how does the 'agents' library represent the handoff in the trace?
    # from pprint import pprint
    # print("\n--- Detailed Trace of Routing ---")
    # pprint(routing_result) # This would show the full interaction details.

    print("--------------------------")
    print("Final Response (from the specialist agent via triage):")
    # 'final_output' should be the actual text response from the specialist agent.
    if routing_result and hasattr(routing_result, 'final_output'):
        print(routing_result.final_output)
    else:
        print("No valid response received from the agent system.")

# Try inputs like:
# "bonjour monsieur" (should go to french_agent)
# "hola como estas" (should go to spanish_agent)
# "hello how are you" (should go to english_agent)
# "Wie geht es dir" (should go to know_all_agent, our fallback)

## Agents in a Deterministic Workflow: The Assembly Line

This pattern demonstrates agents working together in a predefined, fixed sequence to complete a task, much like steps in an assembly line or a checklist. Each agent performs its specific part of the job and then hands off its output to the next agent in the chain.

1.  It shows agents calling other agents sequentially to complete a well-defined workflow.
2.  This is a very common and highly practical agentic pattern.
    * **Ops Analogy 1: CI/CD Pipeline.** A code commit triggers a "build agent." If successful, its output (a built artifact) goes to a "test agent." If tests pass, the output goes to a "deployment agent." Each step is distinct and sequential.
    * **Ops Analogy 2: Automated Server Provisioning.**
        1.  "Request_Validation_Agent": Validates user request for a new VM (e.g., required parameters like CPU, RAM, OS).
        2.  "Infrastructure_Provisioning_Agent": Takes validated request, calls cloud APIs (e.g., OpenStack, vSphere, AWS) to create the VM. Output is VM details.
        3.  "Configuration_Management_Agent": Takes VM details, runs Ansible/Chef/Puppet to install software and configure the OS. Output is configured VM status.
        4.  "Notification_Agent": Takes final status, notifies the requesting user.
3.  This workflow pattern can be combined with others, like the "collaboration pattern" (e.g., one step in the workflow might itself involve two agents collaborating) to build very sophisticated automated processes.
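
The server-provisioning analogy can be sketched as a fixed chain of plain Python functions, each taking the previous step's output. This is a hypothetical illustration: the `VMJob` dataclass, the step functions, and the fake VM ID are all assumptions made for the example, and no real cloud or configuration-management API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class VMJob:
    cpu: int
    ram_gb: int
    os: str
    vm_id: Optional[str] = None
    configured: bool = False
    history: list[str] = field(default_factory=list)  # Audit trail of completed steps.

def validate_request(job: VMJob) -> VMJob:
    # Step 1: reject requests with missing or nonsensical parameters.
    if job.cpu <= 0 or job.ram_gb <= 0 or not job.os:
        raise ValueError("invalid VM request")
    job.history.append("validated")
    return job

def provision_vm(job: VMJob) -> VMJob:
    # Step 2: stand-in for a cloud API call; just fabricates a deterministic ID.
    job.vm_id = f"vm-{job.os}-{job.cpu}cpu"
    job.history.append("provisioned")
    return job

def configure_vm(job: VMJob) -> VMJob:
    # Step 3: stand-in for running configuration management against the VM.
    job.configured = True
    job.history.append("configured")
    return job

def notify_user(job: VMJob) -> VMJob:
    # Step 4: stand-in for sending a notification.
    job.history.append("notified")
    return job

PIPELINE: list[Callable[[VMJob], VMJob]] = [
    validate_request, provision_vm, configure_vm, notify_user,
]

def run_pipeline(job: VMJob, steps: list[Callable[[VMJob], VMJob]] = PIPELINE) -> VMJob:
    """Run each step in order; an exception in any step stops the chain."""
    for step in steps:
        job = step(job)
    return job
```

Because the order is fixed in `PIPELINE`, a failure in validation stops the run before anything is provisioned, which is the property that makes this pattern easy to reason about.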

In [None]:
from dataclasses import dataclass
from typing import Literal # Literal isn't used here, but good practice if defining fixed choices.

# Assuming 'Agent', 'ItemHelpers', 'Runner', 'TResponseInputItem', 'trace' are from the 'agents' library.
from agents import Agent, ItemHelpers, Runner, TResponseInputItem, trace

"""
This example shows how different agents can be chained to complete a deterministic workflow.
The workflow is: Planner Agent -> Writer Agent -> Editor Agent.
1. User provides an essay topic.
2. Planner Agent creates an outline.
3. Writer Agent takes the outline and writes a draft essay.
4. Editor Agent takes the draft, polishes it, and produces the final version.
This is like an editorial process in publishing.
"""
# Adapting for local model use:
model_for_workflow = model_config

# --- Define Data Structures for Agent Outputs (Optional but Recommended) ---
# These dataclasses define the expected structure of the output from each agent.
# This helps with type safety and makes it clear what each agent is supposed to produce.
# The 'body: str' implies the main content from each agent will be a string.
@dataclass
class PlannerOutput: # Renamed for clarity from 'Planner' to avoid conflict if an Agent class was also 'Planner'
    body: str # The essay outline

@dataclass
class WriterOutput: # Renamed for clarity
    body: str # The draft essay

@dataclass
class EditorOutput: # Renamed for clarity
    body: str # The final, polished essay

# --- Define the Agents in the Workflow ---

# Agent 1: The Planner
# Takes a topic and creates an outline with source references.
planner_agent = Agent(
    name="planner_agent",
    instructions=(
        "You are an academic research assistant. Given a user's essay theme/topic, "
        "create a brief but comprehensive outline for the essay. "
        "The outline should clearly list the main points to be covered in each section (e.g., Introduction, Body Paragraphs, Conclusion). "
        "Crucially, make sure to suggest or include references to actual source materials where appropriate for an academic essay."
    ),
    model=model_for_workflow,
    output_type=PlannerOutput, # Expects output to match this structure.
)



# Agent 2: The Writer
# Takes the outline and expands it into a full essay draft.
writer_agent = Agent(
    name="writer_agent",
    instructions=(
        "You are a skilled academic writer. You will be given an essay outline. "
        "Your task is to expand this outline into a complete, well-structured essay. "
        "Ensure you elaborate on each point in the outline. "
        "Maintain an academic tone and cite references appropriately if they are provided or implied in the outline. "
        "Do not invent information; stick to plausible elaborations of the outline points."
    ),
    model=model_for_workflow,
    output_type=WriterOutput, # Expects output to match this structure.
)

# Agent 3: The Editor
# Takes the draft essay, reviews, polishes, and finalizes it.
editor_agent = Agent(
    name="editor_agent",
    instructions=(
        "You are a meticulous editor with a keen eye for detail. You will be given a draft essay. "
        "Your job is to review and polish the draft. "
        "Focus on improving language, fixing grammatical errors and inconsistencies, enhancing flow and coherence. "
        "Ensure the essay is logically sound and that arguments are well-supported. "
        "Verify that any references are correctly formatted (if present). "
        "Return the final, publication-ready version of the essay."
    ),
    model=model_for_workflow,
    output_type=EditorOutput, # Expects output to match this structure.
)

print("Workflow agents (planner, writer, editor) defined.")

In [23]:
# Get the essay topic from the user.
msg_workflow = input("Hi! I am an AI Research Assistant. Give me any topic, and I will write a well-researched essay about it: ") # Renamed msg

# Initial input for the first agent in the workflow (the planner).
inputs_workflow: list[TResponseInputItem] = [{"content": msg_workflow, "role": "user"}]

print(f"\nStarting essay workflow for topic: '{msg_workflow}'")

# 'trace' for logging the entire workflow.
with trace("EssayWritingWorkflow"):
    # --- Step 1: Planner Agent ---
    # The planner_agent takes the user's topic and creates an outline.
    print("\n---------- Planner Agent Output ----------")
    planner_result_container = await Runner.run(planner_agent, inputs_workflow)
    # Assuming the agent's output (an instance of PlannerOutput) is in 'final_output'.
    planner_output: PlannerOutput = planner_result_container.final_output
    print(planner_output.body) # Print the outline.
    
    # The output of the planner (the outline) becomes the input for the writer.
    writer_input_content = planner_output.body # The outline text.

    # --- Step 2: Writer Agent ---
    # The writer_agent takes the outline and writes a draft essay.
    print("\n---------- Writer Agent Output ----------")
    writer_inputs: list[TResponseInputItem] = [{"content": writer_input_content, "role": "user"}]
    writer_result_container = await Runner.run(writer_agent, writer_inputs)
    writer_output: WriterOutput = writer_result_container.final_output
    print(writer_output.body) # Print the draft essay.

    # The output of the writer (the draft essay) becomes the input for the editor.
    editor_input_content = writer_output.body # The draft essay text.

    # --- Step 3: Editor Agent ---
    # The editor_agent takes the draft essay and polishes it.
    print("\n---------- Editor Agent Output ----------")
    editor_inputs: list[TResponseInputItem] = [{"content": editor_input_content, "role": "user"}]
    editor_result_container = await Runner.run(editor_agent, editor_inputs)
    editor_output: EditorOutput = editor_result_container.final_output
    print(editor_output.body) # Print the final, polished essay.

    print("\n--- Essay workflow complete! ---")


# Microservices and AI Agents: An Ops Perspective

For those with an Operations background, there are several useful parallels (and key differences) between LLM-based AI agents (especially when they use tools) and the microservices architecture you're likely familiar with:

## Similarities

#### Specialized Functionality:
* **Microservices:** Each service is designed to handle a specific business capability (e.g., user authentication, payment processing, inventory management).
* **AI Agents:** Agents can be specialized for particular tasks (e.g., answering customer support FAQs), knowledge domains (e.g., medical information), or interactions (e.g., booking appointments via tools).
    * **Ops Takeaway:** You can build a "team" of specialized AI agents, much like a suite of microservices, each expert in its own area.

#### Independent Operation & Deployment (Potentially):
* **Microservices:** Can often be developed, deployed, scaled, and updated independently.
* **AI Agents:** Individual agents (especially if they are distinct processes or use different models/prompts) can potentially be managed somewhat independently. An agent that uses a specific tool (which might be a microservice itself) can be updated if that tool's API changes.
    * **Ops Takeaway:** This modularity can lead to more resilient and maintainable AI systems.

#### Communication via APIs/Messages:
* **Microservices:** Communicate via well-defined APIs (e.g., REST, gRPC) or message queues.
* **AI Agents:** Receive prompts (their "API input") and return responses. When using tools, the agent effectively makes an "API call" to the tool (which could be a Python function, an external HTTP API, or a command-line interface).
    * **Ops Takeaway:** Managing the "API contracts" for tools becomes important, just like managing microservice API versions.

#### Composability & Orchestration:
* **Microservices:** Combined and orchestrated (e.g., using Kubernetes, service meshes, or workflow engines like Camunda/Prefect) to build complex applications.
* **AI Agents:** Multiple AI agents can be combined in patterns (collaboration, routing, workflow as seen above) to create more sophisticated AI-driven solutions. The "agentic patterns" are forms of orchestration.
    * **Ops Takeaway:** You'll be thinking about how agents connect, pass data, and how to manage these multi-agent "applications."

#### Statelessness vs. Statefulness:
* **Microservices:** Often designed to be stateless for scalability, with state managed externally (e.g., in databases, caches). Some microservices are inherently stateful.
* **AI Agents:** Basic LLM calls can be stateless. However, "agents" often maintain conversation history (a form of state) to provide context. Tools used by agents might interact with stateful systems.
    * **Ops Takeaway:** Understanding where state is managed in an agentic system is crucial for reliability and debugging.
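
As a rough sketch of where state can live, the example below keeps the reply function stateless and moves conversation history into a separate store keyed by session ID. Everything here is hypothetical: `InMemoryHistoryStore` stands in for a database or cache, and `stateless_reply` stands in for an LLM call.

```python
from collections import defaultdict

Message = dict[str, str]

class InMemoryHistoryStore:
    """Stand-in for an external store (database, cache) holding per-session history."""

    def __init__(self) -> None:
        self._sessions = defaultdict(list)

    def load(self, session_id: str) -> list[Message]:
        # Return a copy so callers cannot mutate stored history by accident.
        return list(self._sessions[session_id])

    def append(self, session_id: str, message: Message) -> None:
        self._sessions[session_id].append(message)

def stateless_reply(history: list[Message]) -> str:
    # Stand-in for an LLM call: the output depends only on the history passed in.
    user_turns = sum(1 for m in history if m["role"] == "user")
    return f"You have sent {user_turns} message(s) in this session."

def handle_turn(store: InMemoryHistoryStore, session_id: str, user_text: str) -> str:
    """Record the user turn, compute a reply from stored history, record the reply."""
    store.append(session_id, {"role": "user", "content": user_text})
    reply = stateless_reply(store.load(session_id))
    store.append(session_id, {"role": "assistant", "content": reply})
    return reply
```

With this split, the reply function can be scaled or restarted freely, and debugging a bad answer starts with inspecting what the store held for that session.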

#### Scaling, Monitoring, and Versioning:
* **Microservices:** Face operational challenges around scaling individual services, monitoring their health and performance, and managing different versions.
* **AI Agents:** Similar challenges apply. How do you scale an agent that uses a rate-limited tool? How do you monitor an agent's "decision quality" or tool usage? How do you version prompts, models, and tool integrations?
    * **Ops Takeaway:** Your existing Ops principles for microservices (logging, metrics, tracing, version control, CI/CD) will be highly relevant for productionizing agentic AI.
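
One way to start monitoring tool usage is to wrap each tool function in a decorator that records call counts, error counts, and latency. The sketch below is a minimal, hypothetical version: the module-level dictionaries stand in for a real metrics backend, and `get_disk_usage` is an invented example tool.

```python
import functools
import time
from collections import Counter

# In-process stand-ins for a metrics backend.
CALL_COUNTS = Counter()
ERROR_COUNTS = Counter()
LATENCIES: dict[str, list[float]] = {}

def monitored_tool(func):
    """Record calls, errors, and latency for a tool function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        CALL_COUNTS[func.__name__] += 1
        try:
            return func(*args, **kwargs)
        except Exception:
            ERROR_COUNTS[func.__name__] += 1
            raise  # Re-raise so the caller still sees the failure.
        finally:
            # Latency is recorded for successful and failed calls alike.
            LATENCIES.setdefault(func.__name__, []).append(time.perf_counter() - start)
    return wrapper

@monitored_tool
def get_disk_usage(host: str) -> str:
    # Hypothetical tool: a real one would query the host.
    if not host:
        raise ValueError("host must not be empty")
    return f"{host}: 42% used"
```

The same wrapper applies to any tool regardless of which agent calls it, so tool-level metrics stay comparable across agents.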

## Key Differences

#### Implementation Logic:
* **Microservices:** Built with traditional, deterministic code. Given the same input and state, they produce the same output.
* **LLM Agents:** Core logic is based on probabilistic models (LLMs). Their behavior is learned from data, not explicitly programmed for every eventuality.
    * **Ops Takeaway:** This means agent behavior can be less predictable than traditional code. Robust error handling, fallbacks, and human oversight become even more critical. Tools called by agents, however, can (and often should) be deterministic. For example, a Python function used as a tool will execute deterministically.

#### Predictability of Output:
* **Microservices:** Highly predictable outputs for given inputs.
* **LLM Agents:** LLM responses can vary even for the same prompt (especially with higher "temperature" settings). While tool execution itself might be predictable, the LLM's decision to *use* a tool, *which* tool, and with *what arguments* can have variability.
    * **Ops Takeaway:** Testing agentic systems requires different strategies. You'll need to test not just for correctness but also for robustness against unexpected LLM behavior or "hallucinations" when it comes to tool parameters.
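
A common defensive measure against hallucinated tool parameters is to validate the model-supplied arguments before the tool runs. The following is a minimal sketch under stated assumptions: `restart_service`, the allowed-service list, and the 0 to 300 second delay limit are all invented for illustration, and the arguments are assumed to arrive as a JSON string.

```python
import json

ALLOWED_SERVICES = {"nginx", "postgres", "redis"}  # Hypothetical allow-list.
ALLOWED_KEYS = {"service", "delay_seconds"}

def restart_service(service: str, delay_seconds: int = 0) -> str:
    # Hypothetical deterministic tool; a real one would act on a system.
    return f"restart scheduled for {service} in {delay_seconds}s"

def validate_restart_args(raw_arguments: str) -> dict:
    """Parse and validate model-supplied arguments; raise ValueError if anything is off."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    unknown = set(args) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    service = args.get("service")
    if not isinstance(service, str) or service not in ALLOWED_SERVICES:
        raise ValueError(f"service not allowed: {service!r}")
    delay = args.get("delay_seconds", 0)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(delay, bool) or not isinstance(delay, int) or not 0 <= delay <= 300:
        raise ValueError(f"delay_seconds out of range: {delay!r}")
    return {"service": service, "delay_seconds": delay}

def safe_tool_call(raw_arguments: str) -> str:
    """Run the tool only when the arguments pass validation."""
    try:
        args = validate_restart_args(raw_arguments)
    except ValueError as exc:
        return f"tool call rejected: {exc}"
    return restart_service(**args)
```

Returning the rejection as text, rather than raising, lets the message be fed back to the model so it can retry with corrected arguments.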

**In essence:** Systems that expose tools (like APIs or Python functions) for AI agents to use are often built with traditional, deterministic code, just like your existing microservices. The "agent" part is the LLM deciding how and when to use these tools. This blend brings new capabilities but also new operational considerations.

# AFTERWORD: The Power and Responsibility of AI Agents

Agents are an extremely powerful construct in the field of Generative AI, moving LLMs from being passive text generators to active participants capable of accomplishing tasks.

1.  **Achieving Complex Tasks:** You can achieve complex, multi-step tasks by designing appropriate agents, equipping them with the right tools (which could be your existing scripts, APIs, or new functions), and orchestrating interactions between different agents (using patterns like collaboration, routing, or workflows).
    * **Ops View:** Think of this as building sophisticated automation that can reason and adapt in ways traditional scripts cannot.

2.  **Improving Accuracy:** There are known methods to improve the accuracy and reliability of agent outputs. Much like human teams use peer review or quality checks, having agents collaborate (e.g., one agent drafts, another critiques) can significantly enhance results.
    * **Ops View:** This is like implementing automated validation checks or "four-eyes principles" within your AI systems.

3.  **Interacting with the Real World (via Tools):** External data retrieval, system queries, and actions on external systems are carried out through tools. Tools are the agent's hands, eyes, and ears, allowing it to interact with and gather information from systems beyond its own pre-trained knowledge.
    * **Ops View:** This is how agents move from being just "chatbots" to actual "doers" that can integrate with your existing infrastructure and services. The tools are the interfaces to your operational environment.

4.  **Human-in-the-Loop (HITL) for Oversight:** If an agent's processing or actions need to be vetted or approved before proceeding (especially for critical or irreversible operations), ensure that a human is involved in the loop. This "human-in-the-loop" (HITL) step allows a person to review and approve (or reject) an agent's proposed plan or action before it's executed.
    * **Ops View:** This is absolutely critical for enterprise use and safety. It’s directly analogous to having a mandatory approval step in a change management process before a critical command is executed on a production server. It ensures human oversight, control, and accountability when agents are performing sensitive or high-impact tasks.
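
A human-in-the-loop gate can be as simple as a function that refuses to execute an irreversible action until an approver callback returns `True`. The sketch below is hypothetical throughout: `ProposedAction`, `execute_with_approval`, and `console_approver` are illustrative names, and the `execute` callback stands in for whatever actually runs the command.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProposedAction:
    description: str
    command: str
    irreversible: bool

def execute_with_approval(
    action: ProposedAction,
    approve: Callable[[ProposedAction], bool],
    execute: Callable[[str], str],
) -> str:
    """Execute the action, asking for approval first when it is irreversible."""
    if action.irreversible and not approve(action):
        return f"REJECTED: {action.description}"
    return execute(action.command)

def console_approver(action: ProposedAction) -> bool:
    # Simple interactive approver; anything other than 'y' counts as a rejection.
    answer = input(f"Approve '{action.description}' ({action.command})? [y/N] ")
    return answer.strip().lower() == "y"
```

Passing the approver in as a callback keeps the gate testable and lets the same check be backed by a console prompt, a ticketing system, or a chat approval.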

By understanding these concepts, an Operations-focused team can begin to see how AI agents can be integrated into their workflows, not as black boxes, but as manageable, tool-using components that can augment their capabilities.