# L5: Instruction Tuning and Guardrails

In this lesson, you'll take everything you've learned from previous lessons and add advanced control mechanisms that transform your agents from helpful assistants into specialized, production-ready systems.

### Building on previous lessons

So far in this course, you've built:

- **Lesson 1**: Basic agents with Google search capabilities
- **Lesson 2**: Session, State & Memory in agents
- **Lesson 3**: Interactive chat agents with financial data integration (`get_financial_context`)  
- **Lesson 4**: Coordinator agents with file persistence (`save_news_to_markdown`) and structured workflows

### What's new in Lesson 5

Now you'll add another piece: **programmatic control systems** to ensure your agents behave reliably in production environments:

- **Callback Systems**: Programmatic guardrails that automatically enforce policies
    1. **Domain Filtering**: A before-tool callback that blocks searches targeting certain sources, controlling which information the agent can access
    2. **Response Enhancement**: An after-tool callback that adds transparency and an audit trail to agent outputs
- **Callback-aware instructions**: You'll update the agent's instructions so that it understands what the callbacks do.

By the end of this lesson, you'll have a production-ready agent that uses all the tools from previous lessons but with effective control mechanisms.

## 5.1 Setting up your development environment

Before you dive into building production-ready agents with advanced control mechanisms, let's set up a new folder structure with ADK's built-in project scaffolding using the `adk create` command.

You'll continue to use the `gemini-2.0-flash-live-001` model and your Google Gemini API key.

In [None]:
from helper import *

# Load environment variables from .env file
load_env()

In [None]:
# First, create the expected agent folder
# You can explore the available options with: !adk create --help

!adk create --type=code app5 --model gemini-2.0-flash-live-001 --api_key $GEMINI_API_KEY

## 5.2 Reusing components from previous lessons

Let's start by implementing the tools you developed in previous lessons. You'll be building on the foundation you've already established.

First, you'll implement the `get_financial_context` function that you learned in Lesson 3. Run the cell below to write to the `agent.py` file created using the `adk create` command.

In [None]:
%%writefile app5/agent.py
import pathlib
from typing import Dict, List
import yfinance as yf

from google.adk.agents import Agent
from google.adk.tools import google_search


def get_financial_context(tickers: List[str]) -> Dict[str, str]:
    """
    Fetches the current stock price and daily change for a list of stock tickers
    using the yfinance library.  

    Args:
        tickers: A list of stock market tickers (e.g., ["NVDA", "MSFT"]).

    Returns:
        A dictionary mapping each ticker to its formatted financial data string.
    """
    financial_data: Dict[str, str] = {}
    for ticker_symbol in tickers:
        try:
            # Create a Ticker object
            stock = yf.Ticker(ticker_symbol)
            
            # Fetch the info dictionary
            info = stock.info
            
            # Safely access the required data points
            price = info.get("currentPrice") or info.get("regularMarketPrice")
            change_percent = info.get("regularMarketChangePercent")
            
            if price is not None and change_percent is not None:
                # Format the percentage and the final string
                change_str = f"{change_percent * 100:+.2f}%"
                financial_data[ticker_symbol] = f"\${price:.2f} ({change_str})"
            else:
                # Handle cases where the ticker is valid but data is missing
                financial_data[ticker_symbol] = "Price data not available."

        except Exception:
            # This handles invalid tickers or other yfinance errors gracefully
            financial_data[ticker_symbol] = "Invalid Ticker or Data Error"
            
    return financial_data
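
The formatting step inside `get_financial_context` can be tried without a network call. The sketch below is a minimal illustration: `format_quote` is a hypothetical helper (it is not part of the lesson code), and it assumes, as the function above does, that the daily change arrives as a fraction such as `0.0123`. If your data source already reports the change as a percentage, the multiplication by 100 would need to be dropped.

```python
def format_quote(price: float, change_fraction: float) -> str:
    """Format a price and a fractional daily change into one display string.

    Hypothetical helper: mirrors the two f-strings used in get_financial_context.
    """
    # "+" in the format spec forces an explicit sign for gains and losses.
    change_str = f"{change_fraction * 100:+.2f}%"
    return f"${price:.2f} ({change_str})"


up = format_quote(123.456, 0.0123)
down = format_quote(50.0, -0.005)
print(up)    # $123.46 (+1.23%)
print(down)  # $50.00 (-0.50%)
```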

Next, we add the `save_news_to_markdown` function from Lesson 4.

In [None]:
%%writefile -a app5/agent.py

def save_news_to_markdown(filename: str, content: str) -> Dict[str, str]:
    """
    Saves the given content to a Markdown file in the current directory.

    Args:
        filename: The name of the file to save (e.g., 'ai_news.md').
        content: The Markdown-formatted string to write to the file.

    Returns:
        A dictionary with the status of the operation.
    """
    try:
        if not filename.endswith(".md"):
            filename += ".md"
        current_directory = pathlib.Path.cwd()
        file_path = current_directory / filename
        file_path.write_text(content, encoding="utf-8")
        return {
            "status": "success",
            "message": f"Successfully saved news to {file_path.resolve()}",
        }
    except Exception as e:
        return {"status": "error", "message": f"Failed to save file: {str(e)}"}
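
To see the filename handling in isolation, here is a small sketch that writes into a temporary directory instead of the current one. `save_markdown` and its `directory` parameter are assumptions made for this illustration only; the lesson's `save_news_to_markdown` always writes to the current working directory.

```python
import pathlib
import tempfile


def save_markdown(directory: pathlib.Path, filename: str, content: str) -> pathlib.Path:
    """Write content to directory/filename, appending '.md' if it is missing.

    Hypothetical variant of save_news_to_markdown with an explicit target directory.
    """
    if not filename.endswith(".md"):
        filename += ".md"
    file_path = directory / filename
    file_path.write_text(content, encoding="utf-8")
    return file_path


# The temporary directory is removed when the block exits, so read back inside it.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_markdown(pathlib.Path(tmp), "ai_news", "# AI News\n")
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")

print(saved_name)  # ai_news.md
```

Passing the directory explicitly makes this kind of function easier to test, because nothing is left behind in the working directory.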


## 5.3 Callbacks

Now you'll implement the core of this lesson: **callback-based control mechanisms**. Callbacks are Python functions that run at specific checkpoints in an agent's lifecycle, providing programmatic control over behavior.

### Understanding ADK Callback types

ADK provides several callback points:
- **before_agent_callback**: Runs before agent execution starts
- **after_agent_callback**: Runs after agent execution completes  
- **before_tool_callback**: Runs before any tool is executed
- **after_tool_callback**: Runs after any tool completes
- **before_model_callback**: Runs before LLM calls
- **after_model_callback**: Runs after LLM responses
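
The tool callbacks used in this lesson follow a simple contract: a before-tool callback that returns `None` lets the call proceed, while one that returns a dictionary supplies the result in place of running the tool; an after-tool callback can replace the tool's result in the same way. The sketch below simulates that contract in plain Python so you can see the control flow. It is a simplified model, not ADK's implementation: `run_tool_with_hooks` and the sample hooks are hypothetical, the real callbacks also receive a `tool_context`, and ADK's exact ordering when a call is blocked may differ.

```python
from typing import Any, Callable, Dict, Optional

# Simplified hook types (hypothetical; ADK callbacks also take a tool_context).
BeforeHook = Callable[[str, Dict[str, Any]], Optional[Dict[str, Any]]]
AfterHook = Callable[[str, Dict[str, Any], Any], Optional[Any]]


def run_tool_with_hooks(
    tool_name: str,
    tool_fn: Callable[..., Any],
    args: Dict[str, Any],
    before: Optional[BeforeHook] = None,
    after: Optional[AfterHook] = None,
) -> Any:
    """Run a tool, letting hooks block the call or replace its result."""
    if before is not None:
        override = before(tool_name, args)
        if override is not None:
            return override  # The tool itself is never executed.
    result = tool_fn(**args)
    if after is not None:
        replaced = after(tool_name, args, result)
        if replaced is not None:
            return replaced
    return result


def echo_tool(query: str) -> str:
    return f"results for {query}"


def deny_secret(tool_name: str, args: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    # Returning a dict blocks the call; returning None allows it.
    if "secret" in args.get("query", ""):
        return {"error": "blocked"}
    return None


def wrap_result(tool_name: str, args: Dict[str, Any], result: Any) -> Dict[str, Any]:
    return {"tool": tool_name, "result": result}


allowed = run_tool_with_hooks("echo", echo_tool, {"query": "ai news"}, deny_secret, wrap_result)
blocked = run_tool_with_hooks("echo", echo_tool, {"query": "secret plans"}, deny_secret, wrap_result)
print(allowed)
print(blocked)
```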

### Callback 1: Source filtering Callback (Before Tool Callback)

This callback demonstrates **programmatic policy enforcement**. It automatically blocks search queries that require the agent to fetch news from certain sources.

#### How It Works:

  1. **Interception**: Runs before every `google_search` tool call
  2. **Query analysis**: Examines the search query for blocked domains
  3. **Policy enforcement**: Blocks searches targeting certain sources like Wikipedia, Reddit, Medium
  4. **Error response**: Returns structured error messages when domains are blocked
  5. **Transparency**: Logs allowed/blocked decisions for debugging


Note how the error messages are descriptive!

In [None]:
%%writefile -a app5/agent.py

BLOCKED_DOMAINS = [
    "wikipedia.org",      # General info, not latest news
    "reddit.com",         # Discussion forums, not primary news
    "youtube.com",        # Video content not useful for text processing
    "medium.com",         # Blog platform with variable quality
    "investopedia.com",   # Financial definitions, not tech news
    "quora.com",          # Q&A site, opinions not reports
]

def filter_news_sources_callback(tool, args, tool_context):
    """
    Callback: Blocks search requests that target certain domains which are not necessarily news sources.
    Demonstrates content quality enforcement through request blocking.
    """
    if tool.name == "google_search":
        query = args.get("query", "").lower()

        # Check if query explicitly targets blocked domains
        for domain in BLOCKED_DOMAINS:
            if f"site:{domain}" in query or domain.replace(".org", "").replace(".com", "") in query:
                print(f"BLOCKED: Domains from blocked list detected: '{query}'")
                return {
                    "error": "blocked_source",
                    "reason": f"Searches targeting {domain} or similar are not allowed. Please search for professional news sources."
                }

        print(f"ALLOWED: Professional source query: '{query}'")
        return None
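
The matching rule in this callback can be exercised on its own. The sketch below copies the rule into a hypothetical pure function, `find_blocked_domain`, with a shortened blocklist, so it runs without ADK. It also shows a side effect worth knowing about: because the rule matches the bare domain stem anywhere in the query, an ordinary word such as "medium" triggers a block.

```python
from typing import List, Optional

# Shortened blocklist for illustration only.
BLOCKED = ["wikipedia.org", "reddit.com", "medium.com"]


def find_blocked_domain(query: str, blocked: List[str]) -> Optional[str]:
    """Return the first blocked domain the query matches, or None.

    Hypothetical helper that mirrors the check in filter_news_sources_callback.
    """
    q = query.lower()
    for domain in blocked:
        stem = domain.replace(".org", "").replace(".com", "")
        if f"site:{domain}" in q or stem in q:
            return domain
    return None


explicit = find_blocked_domain("latest AI news site:reddit.com", BLOCKED)
clean = find_blocked_domain("latest AI chip announcements", BLOCKED)
false_positive = find_blocked_domain("AI adoption in medium-sized enterprises", BLOCKED)

print(explicit)        # reddit.com
print(clean)           # None
print(false_positive)  # medium.com
```

If such false positives matter for your use case, one option is to match only the explicit `site:` form, at the cost of missing queries that name a source in plain words.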


### Callback 2: Response enhancement (After Tool Callback)

The next callback demonstrates a sophisticated pattern: **response enhancement**. Instead of blocking requests, this callback enriches tool responses with additional metadata.

In previous lessons, when tools returned results, the agent had no visibility into what control mechanisms were active. This callback solves that by making the control system transparent to the LLM. This pattern is crucial for enterprise deployments where audit trails and transparency are required.

#### How this callback works

After the `google_search` tool finishes, this callback does the following:

1. **Callback trigger**: Monitors when `google_search` tools finish execution.
2. **Domain extraction**: Automatically parses URLs from search results to identify source domains.
3. **State management**: Maintains a persistent log across multiple tool calls using `tool_context.state`.
4. **Response transformation**: Converts simple string responses into structured data with metadata.
5. **Write to the report**: Makes callback actions visible to the LLM through process logs. This log is written to the generated markdown report.

This pattern transforms your agent from a "black box" into a transparent, auditable system suitable for production deployment.

In [None]:
%%writefile -a app5/agent.py

import re
from urllib.parse import urlparse

from google.adk.tools import ToolContext

def initialize_process_log(tool_context: ToolContext):
    """Helper to ensure the process_log list exists in the state."""
    if 'process_log' not in tool_context.state:
        tool_context.state['process_log'] = []

In [None]:
%%writefile -a app5/agent.py

def inject_process_log_after_search(tool, args, tool_context, tool_response):
    """
    Callback: After a successful search, this injects the process_log into the response
    and adds a specific note about which domains were sourced. This makes the callbacks'
    actions visible to the LLM.
    """
    if tool.name == "google_search" and isinstance(tool_response, str):
        # Extract source domains from the search results
        urls = re.findall(r'https?://[^\s/]+', tool_response)
        unique_domains = sorted(list(set(urlparse(url).netloc for url in urls)))
        
        if unique_domains:
            sourcing_log = f"Action: Sourced news from the following domains: {', '.join(unique_domains)}."
            # Prepend the new log to the existing one for better readability in the report
            current_log = tool_context.state.get('process_log', [])
            tool_context.state['process_log'] = [sourcing_log] + current_log

        final_log = tool_context.state.get('process_log', [])
        print(f"CALLBACK LOG: Injecting process log into tool response: {final_log}")
        return {
            "search_results": tool_response,
            "process_log": final_log
        }
    return tool_response
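
The domain extraction and log handling can be checked without running the agent. In the sketch below, a plain dictionary stands in for `tool_context.state`, and `extract_source_domains` and `record_sources` are hypothetical helpers that reuse the same regular expression and the same prepend-to-log behavior as the callback above.

```python
import re
from typing import Any, Dict, List
from urllib.parse import urlparse


def extract_source_domains(text: str) -> List[str]:
    """Return the sorted, de-duplicated host names of all URLs found in text."""
    # The pattern stops at the first "/" or whitespace, so it captures scheme + host.
    urls = re.findall(r"https?://[^\s/]+", text)
    return sorted({urlparse(url).netloc for url in urls})


def record_sources(state: Dict[str, Any], search_text: str) -> Dict[str, Any]:
    """Prepend a sourcing entry to state['process_log'] and wrap the response."""
    domains = extract_source_domains(search_text)
    if domains:
        entry = f"Action: Sourced news from the following domains: {', '.join(domains)}."
        state["process_log"] = [entry] + state.get("process_log", [])
    return {"search_results": search_text, "process_log": state.get("process_log", [])}


state: Dict[str, Any] = {}  # Stand-in for tool_context.state
first = record_sources(state, "Read https://example.com/a and http://news.example.org/b")
second = record_sources(state, "More at https://example.net/c")

print(first["process_log"])
print(second["process_log"])
```

Note that a URL with no path that is directly followed by punctuation (for example `https://example.com,`) would keep the trailing character in the captured host, so real search output may need a stricter pattern.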

## 5.4 Modifying the agent

Now, let's update the agent.

1) First, add the before and after tool callbacks to the agent. 

```
    before_tool_callback=[filter_news_sources_callback],
    after_tool_callback=[inject_process_log_after_search],
```

2) Then, **modify the agent's instructions** to add the following:

```    
    **Understanding Callback-Modified Tool Outputs:**
    The `google_search` tool is enhanced by pre- and post-processing callbacks. 
    Its final output is a JSON object with two keys:
    1.  `search_results`: A string containing the actual search results.
    2.  `process_log`: A list of strings describing the filtering actions performed, including which domains were sourced.

    **Callback System Awareness:**
    A before-tool callback, "filter_news_sources_callback", runs automatically before each tool call and 
    may block it. You do not call it yourself.

    **When Testing Callbacks:**
    If users ask you to test the callback system, be conversational and explain what's happening:
    - Acknowledge when callbacks block your search queries or modify tool outputs
    - Describe the policy enforcement you observe
    - Help users understand how the layered control system works in practice
```

The instructions are now callback-aware: the agent knows that `google_search` output arrives as a structured object containing a `process_log`, and its report format has a place to record that log. Combined with the descriptive error messages returned by the blocking callback, this gives the agent the error handling and transparency that production deployments typically require.

In [None]:
%%writefile -a app5/agent.py

root_agent = Agent(
    name="ai_news_research_coordinator",
    model="gemini-2.0-flash-live-001",
    tools=[google_search, get_financial_context, save_news_to_markdown],
    instruction="""
    **Your Core Identity and Sole Purpose:**
    You are a specialized AI News Assistant that creates structured podcast content. Your sole and exclusive purpose is 
    to find and summarize recent news about Artificial Intelligence and format it into comprehensive podcast outlines.

    **Execution Plan:**

    1.  
        *   **Step 1:** Call `google_search` to find 5 recent AI news articles.
        *   **Step 2:** Analyze the results to find company stock tickers.
        *   **Step 3:** Call `get_financial_context` with the list of tickers.
        *   **Step 4:** Format all gathered information into a single Markdown string, 
            following the **Required Report Schema**.
        *   **Step 5:** Call `save_news_to_markdown` with the filename `ai_research_report.md` and the 
            formatted Markdown content.

    2.  **After `save_news_to_markdown` succeeds, your final response to the user MUST be:** "All done. 
        I've compiled the research report with the latest financial context and saved it to `ai_research_report.md`."

    **Required Report Schema:**
    ```markdown
    # AI Industry News Report

    ## Top Headlines

    ### 1. {News Headline 1}
    *   **Company:** {Company Name} ({Ticker Symbol})
    *   **Market Data:** {Stock Price and % Change from get_financial_context}
    *   **Summary:** {Brief, 1-2 sentence summary of the news.}
    *   **Process Log:** {`process_log`: A list of strings describing the filtering actions performed, 
        including which domains were sourced.}

    (Continue for all news items)
    ```

    **Understanding Callback-Modified Tool Outputs:**
    The `google_search` tool is enhanced by pre- and post-processing callbacks. 
    Its final output is a JSON object with two keys:
    1.  `search_results`: A string containing the actual search results.
    2.  `process_log`: A list of strings describing the filtering actions performed, including which domains were sourced.

    **Callback System Awareness:**
    A before-tool callback, "filter_news_sources_callback", runs automatically before each tool call and 
    may block it. You do not call it yourself.

    **When Testing Callbacks:**
    If users ask you to test the callback system, be conversational and explain what's happening:
    - Acknowledge when callbacks block your search queries or modify tool outputs
    - Describe the policy enforcement you observe
    - Help users understand how the layered control system works in practice

    **Crucial Operational Rule:**
    Do NOT show any intermediate content (raw search results, draft summaries, or processing steps) in your responses. 
    Your entire operation is a background pipeline that should culminate in a single, clean final answer.  
    """,
    before_tool_callback=[
        filter_news_sources_callback,         # Exclude certain domains
    ],
    after_tool_callback=[
        inject_process_log_after_search,
    ]
)

## 5.5 Testing the agent

Let's test your agent that combines everything from previous lessons with the new callback-based control systems.

**Terminal Instructions**
1. To open the terminal, run the cell below.
2. Change to the current lesson directory with the following command: `cd L5`
3. Start your agent using the following command: `adk web --host 0.0.0.0 --port 8001`
4. Run the cell below the terminal, to display the proxy URL. (This step is only needed within this codelab. When running externally, you can directly access the local server link.)

In [None]:
# start a new terminal
import os
from IPython.display import IFrame

IFrame(f"{os.environ.get('DLAI_LOCAL_URL').format(port=8888)}terminals/5", 
       width=600, height=768)

# cd L5

### Testing the Callbacks

The callback system will automatically:
- Block searches to certain domains (wikipedia, reddit, etc.)
- Log which professional sources are being used in the markdown file
- Inject process information into the final report
- Maintain transparency about control mechanisms

Open the proxy URL below and try asking:
- "Find me the latest AI news", and watch how the callbacks work behind the scenes!
- "Can you explain the different callbacks you have?" The agent is callback-aware and should be able to walk you through how the callbacks work.

**Run the cell below and open the link in a new tab.**

In [None]:
import os
print(os.environ.get('DLAI_LOCAL_URL').format(port='8001'))

## 5.6 Display report

Run the cell below to view the generated Markdown report. Note how the "Process Log" mentions the sources used.

In [None]:
from IPython.display import Markdown, display

# Read and display the markdown file
with open('ai_research_report.md', 'r', encoding='utf-8') as f:
    content = f.read()
    
display(Markdown(content))

# 🚨 **IMPORTANT** 🚨

After finishing, make sure to run the cell below to close your connection so it does not interfere as you progress through the notebooks.

In [None]:
# Terminate ADK process
!pkill -f "adk web"

<p style="background-color:#f7fff8; padding:15px; border-width:3px; border-color:#e0f0e0; border-style:solid; border-radius:6px"> 🚨
&nbsp; <b>Different Run Results:</b> The output generated by AI chat models can vary with each execution due to their dynamic, probabilistic nature. Don't be surprised if your results differ from those shown in the video.</p>

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
<p> 💻 &nbsp; <b> To Access the <code>requirements.txt</code> file or the <code>papers</code> folder: </b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Open"</em> and finally 3) click on <em>"L5"</em>.</p>
</div>

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">


<p> ⬇ &nbsp; <b>Download Notebooks:</b> 1) click on the <em>"File"</em> option on the top menu of the notebook and then 2) click on <em>"Download as"</em> and select <em>"Notebook (.ipynb)"</em>.</p>

</div>