[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://drive.google.com/drive/folders/1IrwoNrb3AWLAhAqjlAkJNYa39p9eT9ui?usp=sharing)

# Flotorch Agent Usage Metric Evaluation (News Summary Agent Use Case)

This notebook demonstrates how to measure and analyze the **usage and cost metrics** of a **News Summary Agent** built with **Flotorch ADK** using the **Flotorch Eval** framework.

The **News Summary Agent** fetches and summarizes the latest news through a custom `get_top_news` tool backed by Google News RSS feeds. The evaluation relies on **OpenTelemetry Traces** generated during the agent's run to provide a detailed breakdown of token usage and costs across all operations (LLM calls, tool execution, etc.).

---

## Key Concepts

* **News Summary Agent**: An agent designed to fetch and summarize the latest news articles from various sources.
* **OpenTelemetry Traces**: Detailed records of the agent's execution steps (spans) used to measure token usage and costs for different operations.
* **UsageMetric**: A Flotorch Eval metric that extracts and summarizes usage information (tokens, costs) from the execution traces (agent trajectories). The evaluation metric used is **usage_summary**.
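
To make the idea of extracting usage from spans concrete, here is a minimal sketch that rolls up token counts per span kind. The span dictionaries and their field names (`kind`, `input_tokens`, `output_tokens`) are hypothetical stand-ins chosen for illustration; they are not the trace schema that Flotorch Eval actually reads.

```python
from collections import defaultdict
from typing import Any, Dict, List


def summarize_tokens(spans: List[Dict[str, Any]]) -> Dict[str, Dict[str, int]]:
    """Sum input/output tokens and call counts per span kind (hypothetical schema)."""
    totals: Dict[str, Dict[str, int]] = defaultdict(
        lambda: {"input_tokens": 0, "output_tokens": 0, "calls": 0}
    )
    for span in spans:
        bucket = totals[span.get("kind", "unknown")]
        # Spans without token fields (e.g. tool calls) contribute zero tokens.
        bucket["input_tokens"] += span.get("input_tokens", 0)
        bucket["output_tokens"] += span.get("output_tokens", 0)
        bucket["calls"] += 1
    return dict(totals)


# Illustrative spans only; real traces carry many more attributes.
example_spans = [
    {"kind": "llm", "input_tokens": 120, "output_tokens": 30},
    {"kind": "tool"},
    {"kind": "llm", "input_tokens": 400, "output_tokens": 250},
]
token_summary = summarize_tokens(example_spans)
print(token_summary)
```

Grouping by span kind keeps LLM calls and tool executions separate, which is the same distinction the usage breakdown later in this notebook relies on.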

---

### Architecture Overview

![Workflow Diagram](diagrams/06_UsageMetric_Workflow_Diagram.drawio.png)
*Figure 2: Detailed workflow diagram showing the step-by-step process of usage metric evaluation from agent execution through trace collection to metric computation.*

---

## Requirements

* Flotorch account with configured models.
* Valid Flotorch API key and gateway base URL.
* Agent configured with OpenTelemetry tracing enabled.

---

## Agent Setup in Flotorch Console

**Important**: Before running this notebook, you need to create an agent in the Flotorch Console. This section provides step-by-step instructions on how to set up the agent.

### Step 1: Access Flotorch Console

1. **Log in to Flotorch Console**:
   - Navigate to your Flotorch Console (e.g., `https://dev-console.flotorch.cloud`)
   - Ensure you have the necessary permissions to create agents

2. **Navigate to Agents Section**:
   - Click on **"Agents"** in the left sidebar
   - You should see the "Agent Builder" option selected

### Step 2: Create New Agent

1. **Click "Create FloTorch Agent"**:
   - Look for the blue **"+ Create FloTorch Agent"** button in the top right corner
   - Click it to start creating a new agent

2. **Agent Configuration**:
   - **Agent Name**: Choose a unique name for your agent (e.g., `news-summary-agent`)
     - **Important**: The name should only contain alphanumeric characters and dashes (a-z, A-Z, 0-9, -)
     - **Note**: Copy this agent name - you'll need to use it in the `agent_name` variable later
   - **Description** (Optional): Add a description if desired
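
If you want to check a candidate name before typing it into the console, the naming rule above can be sketched as a small regular-expression test. `is_valid_agent_name` is a hypothetical helper written for this notebook; the console may enforce additional constraints (such as length limits) that this sketch does not cover.

```python
import re

# Mirrors the rule stated above: only a-z, A-Z, 0-9, and dashes.
AGENT_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_agent_name(name: str) -> bool:
    """Return True when the whole name consists of allowed characters."""
    return AGENT_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_agent_name("news-summary-agent"))
print(is_valid_agent_name("news_summary agent"))
```

`fullmatch` is used rather than `match` so that a name with a valid prefix but an invalid character later on is still rejected.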

### Step 3: Configure Agent Details

After creating the agent, you'll be directed to the agent configuration page. Configure the following:

#### Required Configuration:

1. **Model** (`* Model`):
   - **Required**: Select a model from the available options
   - Example: `gpt-model` or any available model from your Flotorch gateway
   - Click the edit icon to configure

2. **Agent Details** (`* Agent Details`):
   - **Required**: Configure agent details
   - **System Prompt**: Copy and paste the following system prompt:

You are a helpful assistant. When the user asks about the news, call the get_top_news tool and return the top 7 current news stories from around the world, mainly from India.

Available Tools:
get_top_news



   - **Goal**: Copy and paste the following goal:
   
You are a helpful assistant. When the user asks about the news, call the get_top_news tool and return the top 7 current news stories from around the world, mainly from India.

#### Optional Configuration:

1. **Tools**:
   - Tools will be added programmatically via the notebook (see Sections 4 and 5)
   - You can leave this as "Not Configured" in the console

2. **Input Schema**:
   - Optional: Leave as "Not Configured" for this use case

3. **Output Schema**:
   - Optional: Leave as "Not Configured" for this use case

### Step 4: Publish the Agent

1. **Review Configuration**:
   - Ensure the Model and Agent Details are configured correctly
   - Verify the System Prompt and Goal are set

2. **Publish Agent**:
   - After configuration, click **"Publish"** or **"Make a revision"** to publish the agent
   - Once published, the agent will have a version number (e.g., v1)

3. **Note the Agent Name**:
   - **Important**: Copy the exact agent name you used when creating the agent
   - You will need to replace `<your_agent_name>` in the `agent_name` variable in Section 2.1 (Global Provider Models and Agent Configuration)

### Step 5: Update Notebook Configuration

1. **Update Agent Name**:
   - Navigate to Section 2.1 in this notebook
   - Find the `agent_name` variable
   - Replace `<your_agent_name>` with the exact agent name you created in the console

**Example**:
- If you created an agent named `news-summary-agent` in the console
- Set `agent_name = "news-summary-agent"` in the notebook

### Summary of Required vs Optional Settings

| Setting | Required/Optional | Value |
|---------|------------------|-------|
| **Agent Name** | **Required** | Choose a unique name (copy it for notebook) |
| **Model** | **Required** | Select from available models |
| **System Prompt** | **Required** | Use the system prompt provided above |
| **Goal** | **Required** | Use the goal provided above |
| **Tools** | **Optional** | Will be added via notebook code |
| **Input Schema** | **Optional** | Can leave as "Not Configured" |
| **Output Schema** | **Optional** | Can leave as "Not Configured" |

**Note**: The `get_top_news` tool is added to the agent programmatically in the notebook code, so you don't need to configure tools manually in the console.

---


## 1. Setup and Installation

### Purpose
Install the necessary packages for the Flotorch Evaluation framework required for usage and cost metric evaluation.

### Key Components
- **`flotorch-eval`**: Flotorch evaluation framework with all dependencies for usage metrics


In [None]:
# Install Flotorch Eval packages
# flotorch-eval: Flotorch evaluation framework with all dependencies

%pip install flotorch-eval==2.0.0b1 flotorch[adk]==3.1.0b1

## 2. Authentication and Credentials

### Purpose
Configure your Flotorch API credentials and gateway URL for authentication.

### Key Components
This cell configures the essential authentication and connection parameters:

**Authentication Parameters**:

| Parameter | Description | Example |
|-----------|-------------|---------|
| `FLOTORCH_API_KEY` | Your API authentication key (found in your Flotorch Console). Securely entered using `getpass` to avoid displaying in the notebook | `sk_...` |
| `FLOTORCH_BASE_URL` | Your Flotorch gateway endpoint URL | `https://dev-console.flotorch.cloud` |

**Note**: Use secure credential management in production environments.
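
For repeatable or non-interactive runs, one option is to read credentials from environment variables and fall back to a hidden prompt only when a variable is missing. The sketch below assumes this approach; `load_credential` is a hypothetical helper, and the environment variable names are a convention you would choose yourself rather than something Flotorch requires.

```python
import getpass
import os


def load_credential(env_var: str, prompt: str) -> str:
    """Return env_var's value if set, otherwise prompt for it without echoing."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        # Only reached when the environment variable is unset or empty.
        value = getpass.getpass(prompt).strip()
    if not value:
        raise ValueError(f"{env_var} is required but was not provided")
    return value
```

In this notebook it could be called as `load_credential("FLOTORCH_API_KEY", "Paste your API key here: ")`, which keeps the key out of the saved notebook either way.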


In [None]:
import getpass  # Prompts for input without echoing it in the notebook

# Authentication for Flotorch access
# getpass reports problems as warnings rather than exceptions, so check the value directly.
FLOTORCH_API_KEY = getpass.getpass("Paste your API key here: ").strip()
if FLOTORCH_API_KEY:
    print("✓ FLOTORCH_API_KEY set successfully")
else:
    print("✗ FLOTORCH_API_KEY not set")

# Flotorch gateway endpoint, e.g. https://dev-console.flotorch.cloud
FLOTORCH_BASE_URL = input("Paste your Flotorch Base URL here: ").strip()
print(f"✓ FLOTORCH_BASE_URL set: {FLOTORCH_BASE_URL}")

if FLOTORCH_API_KEY and FLOTORCH_BASE_URL:
    print("✓ All credentials configured successfully!")
else:
    print("✗ Credentials incomplete; re-run this cell before continuing")

### 2.1. Global Provider Models and Agent Configuration

### Purpose
Define available models from the Flotorch gateway and configure agent-specific parameters.

### Key Components

**Global Provider Models**: These are the available models from the Flotorch gateway that can be used for evaluation and agent operations:

| Model Variable | Model Name | Description |
|----------------|------------|-------------|
| `MODEL_CLAUDE_HAIKU` | `flotorch/flotorch-claude-haiku-4-5` | Claude Haiku model via Flotorch gateway |
| `MODEL_CLAUDE_SONNET` | `flotorch/flotorch-claude-sonnet-3-5-v2` | Claude Sonnet model via Flotorch gateway |
| `MODEL_AWS_NOVA_PRO` | `flotorch/flotorch-aws-nova-pro` | AWS Nova Pro model via Flotorch gateway |
| `MODEL_AWS_NOVA_LITE` | `flotorch/flotorch-aws-nova-lite` | AWS Nova Lite model via Flotorch gateway |
| `MODEL_AWS_NOVA_MICRO` | `flotorch/flotorch-aws-nova-micro` | AWS Nova Micro model via Flotorch gateway |

**Agent Configuration Parameters**:

| Parameter | Description | Example |
|-----------|-------------|---------|
| `default_evaluator` | The LLM model used for evaluation (can use MODEL_* variables above) | `MODEL_CLAUDE_SONNET` or `flotorch/flotorch-model` |
| `agent_name` | The name of your Flotorch ADK agent | `news-summary-agent` |
| `app_name` | The application name identifier | `agent-evaluation-app-name_06` |
| `user_id` | The user identifier | `agent-evaluation-user-06` |


In [None]:
# ============================================================================
# Global Provider Models (Flotorch Gateway Models)
# ============================================================================
# These models are available from the Flotorch gateway and can be used
# for evaluation, agent operations, and other tasks.

MODEL_CLAUDE_HAIKU = "flotorch/flotorch-claude-haiku-4-5"
MODEL_CLAUDE_SONNET = "flotorch/flotorch-claude-sonnet-3-5-v2"
MODEL_AWS_NOVA_PRO = "flotorch/flotorch-aws-nova-pro"
MODEL_AWS_NOVA_LITE = "flotorch/flotorch-aws-nova-lite"
MODEL_AWS_NOVA_MICRO = "flotorch/flotorch-aws-nova-micro"

print("✓ Global provider models defined")

# The LLM model used for evaluation.
# Can be modified to use any MODEL_* constant above (e.g., MODEL_CLAUDE_SONNET, MODEL_AWS_NOVA_PRO)
# You can use your own models from Flotorch Console as well
default_evaluator = "<your_default_evaluator>"                                                                 # ex : flotorch/gpt-4o-model

agent_name = "<your_agent_name>"  # The name of your Flotorch ADK agent                                        || ex : news-summary-agent
app_name = "<your_app_name>"  # The application name identifier                                                || ex : agent-evaluation-app-name_06
user_id = "<your_user_id>"  # The user identifier                                                              || ex : agent-evaluation-user-06

print("✓ Agent configuration parameters defined")


## 3. Import Required Libraries

### Purpose
Import all required components for evaluating the News Summary Agent usage metrics using Flotorch Eval.

### Key Components
- **`AgentEvaluator`**: Core client for agent evaluation orchestration and trace fetching
- **`UsageMetric`**: Flotorch Eval metric that evaluates usage and cost data
- **`FlotorchADKAgent`**: Creates and configures Flotorch ADK agents with custom tools and tracing
- **`FlotorchADKSession`**: Manages agent sessions for multi-turn conversations
- **`Runner`**: Executes agent queries and coordinates the agent execution flow
- **`FunctionTool`**: Wraps Python functions as tools that can be used by the agent
- **`types`**: Google ADK types for creating message content and handling agent events
- **`pandas`**: Data manipulation and display for formatted results tables
- **`display`**: IPython display utility for rendering formatted outputs

In [None]:
# Required imports
# Flotorch Eval components
from flotorch_eval.agent_eval.core.client import AgentEvaluator
from flotorch_eval.agent_eval.metrics.usage_metrics import UsageMetric

# Flotorch ADK components
from flotorch.adk.agent import FlotorchADKAgent
from flotorch.adk.sessions import FlotorchADKSession

# Google ADK components
from google.adk.runners import Runner
from google.adk.tools import FunctionTool
from google.genai import types

# Utilities
import pandas as pd
from IPython.display import display

print("✓ Imported necessary libraries successfully")

## 4. News Summary Agent Setup

### Purpose
Set up the News Summary Agent with OpenTelemetry tracing enabled to capture detailed execution data for usage and cost metric evaluation.

### Key Components
1. **FlotorchADKAgent** (`agent_client`):
   - Initializes the agent for news fetching and summarization tasks
   - Configures `tracer_config` with `enabled: True` and `sampling_rate: 1` to capture 100% of traces
   - Essential for evaluation as traces contain complete usage and cost information
2. **FlotorchADKSession** (`session_service`): Manages agent sessions for multi-turn conversations
3. **Runner** (`runner`): Executes agent queries and coordinates the agent execution flow

These components work together to run the News Summary Agent and generate OpenTelemetry traces for usage and cost analysis.

### Custom Tool: News Fetching

The News Summary Agent uses a custom tool (`get_top_news`) that integrates with RSS feeds to retrieve the latest news articles. This tool:
- Accepts a limit parameter to specify the maximum number of articles to return
- Uses Google News RSS feeds to fetch the latest news articles from worldwide sources, with priority on India
- Parses RSS feeds and extracts structured information including titles, descriptions, links, and publication dates
- Returns structured news information with article details
- Handles errors gracefully with exception handling

The tool is wrapped as a `FunctionTool` that can be used by the agent to deliver real-time news summaries via RSS feed integration.
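
The RSS parsing step described above can be tried offline before calling any live feed. The following is a minimal sketch that parses an inline sample document with `xml.etree.ElementTree`; the feed content and the `parse_items` helper are made up for illustration and are not part of the notebook's tool.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written RSS 2.0 document used only for this demonstration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First headline - Example Source</title>
      <link>https://example.com/1</link>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second headline - Example Source</title>
      <link>https://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_items(xml_text: str, max_items: int) -> list:
    """Extract up to max_items entries from an RSS document."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        if len(items) >= max_items:
            break
        items.append({
            # findtext returns the default when the child element is missing.
            "title": item.findtext("title", default="No title"),
            "url": item.findtext("link", default=""),
            "publishedAt": item.findtext("pubDate", default=""),
        })
    return items


sample_items = parse_items(SAMPLE_RSS, max_items=5)
print(sample_items)
```

Testing against a fixed string makes it easy to confirm how missing fields such as `pubDate` are handled without depending on network access.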

In [None]:
import requests
from typing import Dict, Any
from datetime import datetime
import xml.etree.ElementTree as ET

def get_top_news(limit: int = 7) -> Dict[str, Any]:
    """Get the latest top news articles from worldwide, with priority on India.
    Fetches the latest headlines from Google News RSS feeds.
    Returns today's latest news articles from major sources. No API key needed.

    Args:
        limit: The maximum number of articles to return (default: 7)

    Returns:
        A dictionary containing news articles with titles, descriptions, and links
    """
    try:
        articles = []
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
        }

        # Helper function to parse RSS feed and extract articles
        def parse_rss_feed(url: str, max_items: int) -> list:
            parsed_articles = []
            try:
                response = requests.get(url, timeout=10, headers=headers)
                if response.status_code == 200:
                    root = ET.fromstring(response.content)
                    for item in root.findall('.//item'):
                        if len(parsed_articles) >= max_items:
                            break
                        title_elem = item.find('title')
                        link_elem = item.find('link')
                        pub_date_elem = item.find('pubDate')
                        desc_elem = item.find('description')

                        title = title_elem.text if title_elem is not None and title_elem.text else "No title"
                        # Clean title (Google News appends the publisher as a " - Source" suffix)
                        if " - " in title:
                            title = title.rsplit(" - ", 1)[0]

                        # Skip duplicates
                        if not any(a["title"] == title for a in articles + parsed_articles):
                            parsed_articles.append({
                                "title": title,
                                "description": desc_elem.text if desc_elem is not None else "",
                                "url": link_elem.text if link_elem is not None else "",
                                "publishedAt": pub_date_elem.text if pub_date_elem is not None else str(datetime.now()),
                                "source": {"name": "Google News"}
                            })
            except Exception:
                pass
            return parsed_articles

        # Priority 1: Get India news (today's latest) - try multiple sources
        india_news_urls = [
            "https://news.google.com/rss/search?q=india+when:1d&hl=en-IN&gl=IN&ceid=IN:en",
            "https://news.google.com/rss/headlines/section/topic/WORLD?hl=en-IN&gl=IN&ceid=IN:en",
            "https://news.google.com/rss/search?q=India+latest+news+today&hl=en-IN&gl=IN&ceid=IN:en"
        ]

        for url in india_news_urls:
            if len(articles) >= limit:
                break
            new_articles = parse_rss_feed(url, limit - len(articles))
            articles.extend(new_articles)

        # Priority 2: Fill with worldwide news if needed
        if len(articles) < limit:
            world_news_urls = [
                "https://news.google.com/rss/search?q=world+news+when:1d&hl=en&gl=US&ceid=US:en",
                "https://news.google.com/rss/headlines/section/topic/WORLD?hl=en&gl=US&ceid=US:en"
            ]

            for url in world_news_urls:
                if len(articles) >= limit:
                    break
                new_articles = parse_rss_feed(url, limit - len(articles))
                articles.extend(new_articles)

        # Limit to requested number
        articles = articles[:limit]

        return {
            "status": "success",
            "totalResults": len(articles),
            "articles": articles,
            "fetchedAt": str(datetime.now())
        }

    except Exception as e:
        return {
            "status": "error",
            "message": f"Failed to fetch news: {str(e)}",
            "articles": []
        }

# --- Wrap as ADK Tool ---
tools = [FunctionTool(get_top_news)]

print("✓ News fetching tool defined and registered successfully")

## 5. Agent and Runner Initialization

### Purpose
Set up the Flotorch ADK Agent and Runner with OpenTelemetry tracing enabled to capture detailed execution data for usage and cost metric evaluation.

### Key Components
1. **FlotorchADKAgent** (`agent_client`):
   - Initializes the agent with custom news fetching tools
   - Configures `tracer_config` with `enabled: True` and `sampling_rate: 1` to capture 100% of traces
   - Essential for evaluation as traces contain complete usage and cost information
2. **FlotorchADKSession** (`session_service`): Manages agent sessions for multi-turn conversations
3. **Runner** (`runner`): Executes agent queries and coordinates the agent execution flow

These components work together to run the News Summary Agent and generate OpenTelemetry traces for usage and cost analysis.

In [None]:
# Initialize Flotorch ADK Agent with tracing enabled
agent_client = FlotorchADKAgent(
    agent_name=agent_name,
    custom_tools=tools,
    base_url=FLOTORCH_BASE_URL,
    api_key=FLOTORCH_API_KEY,
    tracer_config={
        "enabled": True,                                                   # Enable tracing for Usage measurement
        "endpoint": "https://dev-observability.flotorch.cloud/v1/traces",  # Dev observability OTLP HTTP endpoint (used by QA)
        "sampling_rate": 1                                                 # Sample 100% of traces
    }
)
agent = agent_client.get_agent()

# Initialize session service
session_service = FlotorchADKSession(
    api_key=FLOTORCH_API_KEY,
    base_url=FLOTORCH_BASE_URL,
)

# Create the ADK Runner to execute agent queries
runner = Runner(
    agent=agent,
    app_name=app_name,
    session_service=session_service
)

print("✓ Agent, session service, and runner initialized successfully")

## 6. Helper Function for Running a Query

### Purpose
Define a helper function that executes a single-turn query with the agent and extracts the final response. The agent execution is automatically traced for usage and cost metric evaluation.

### Functionality
The `run_single_turn` function:
- Accepts a `Runner`, query string, session ID, and user ID as parameters
- Creates a user message using Google ADK types
- Executes the query through the runner
- Iterates through events to find and return the final agent response
- Returns a fallback message if no response is found

This function simplifies the process of running queries and ensures trace generation during execution.
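
The event-scanning logic can be exercised without a live agent. The sketch below uses small stand-in classes (`FakePart`, `FakeContent`, `FakeEvent`) that imitate only the attributes the helper reads; they are hypothetical and are not the Google ADK event types.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional


@dataclass
class FakePart:
    text: str


@dataclass
class FakeContent:
    parts: List[FakePart] = field(default_factory=list)


@dataclass
class FakeEvent:
    content: Optional[FakeContent]
    final: bool = False

    def is_final_response(self) -> bool:
        return self.final


def first_final_text(events: Iterable[FakeEvent],
                     fallback: str = "No response from agent.") -> str:
    """Return the text of the first final event that has content, else fallback."""
    for event in events:
        if event.is_final_response() and event.content and event.content.parts:
            return event.content.parts[0].text
    return fallback


demo_events = [
    FakeEvent(FakeContent([FakePart("calling get_top_news")])),           # intermediate
    FakeEvent(FakeContent([FakePart("Here are the headlines")]), final=True),
]
print(first_final_text(demo_events))
```

Intermediate events such as tool calls are skipped, and an empty event stream yields the fallback message instead of raising.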

In [None]:
def run_single_turn(runner: Runner, query: str, session_id: str, user_id: str) -> str:
    """
    Execute a single-turn query with the agent and return the final response.
    The agent execution is traced automatically.
    """
    content = types.Content(role="user", parts=[types.Part(text=query)])
    events = runner.run(user_id=user_id, session_id=session_id, new_message=content)

    # Extract the final response
    for event in events:
        if event.is_final_response() and event.content and event.content.parts:
            return event.content.parts[0].text
    return "No response from agent."

print("✓ Helper function defined successfully")

## 7. Define Query

### Purpose
Define the sample news query that will be executed by the News Summary Agent to generate OpenTelemetry traces for usage and cost metric evaluation.

### Key Components
- **`query`**: A sample news request that will be processed by the agent
  - This query will trigger the agent to fetch and summarize news articles using the `get_top_news` tool
  - The query will test the agent's ability to retrieve and process multiple news articles
  - The execution will be automatically traced to capture token usage and cost information
  - The usage data (tokens, costs) will be evaluated using the UsageMetric to compute usage_summary metrics
  - Example: "Get me the top 7 latest news articles from around the world, especially from India"

The query can be modified to test different news scenarios and evaluate usage and cost metrics for various types of news-related requests.


In [None]:
# Define the query that will be executed to generate traces

query = "Get me the top 7 latest news articles from around the world, especially from India"

print(f"Query defined : {query}")

## 8. Run the Query and Get Trace ID

### Purpose
Execute a sample news query with the News Summary Agent to generate OpenTelemetry traces that contain usage and cost data for evaluation.

### Process
1. **Create Session**: Initialize a new session for the agent interaction
2. **Execute Query**: Run a sample news query (e.g., "Get me the top 7 latest news articles from around the world, especially from India") through the agent
3. **Retrieve Trace IDs**: Extract the generated trace IDs from the agent client
4. **Display Results**: Print the agent response and trace ID for verification

The execution automatically generates OpenTelemetry traces that record usage and cost information, which will be used for usage metric evaluation.

In [None]:
# Create a new session
session = await runner.session_service.create_session(
    app_name=app_name,
    user_id=user_id,
)
print(f"Session created: {session.id}")

response = run_single_turn(
    runner=runner,
    query=query,
    session_id=session.id,
    user_id=user_id
)

# Retrieve the generated trace IDs
trace_ids = agent_client.get_tracer_ids()
print(trace_ids)
print("Agent Response:")
print(response[:200] + "..." if len(response) > 200 else response)
print(f"Found {len(trace_ids)} trace(s). First trace ID: {trace_ids[0] if trace_ids else 'N/A'}")

print(f"✓ Query execution completed successfully")

In [None]:
# ============================================================================
# PATCH: Fix for model name lookup in pricing data (Colab compatibility)
# ============================================================================
# This patch handles cases where model names in traces (e.g., 'gpt-4o-2024-08-06')
# don't match the exact format in the pricing API (e.g., 'gpt-4o').

from flotorch_eval.common.cost_compute_utils import PriceCache

def normalize_model_name(model_id: str) -> list:
    """Generate possible model name variations for lookup."""
    variations = [model_id]  # Try original first

    # Remove date suffixes (e.g., 'gpt-4o-2024-08-06' -> 'gpt-4o')
    if '-' in model_id:
        parts = model_id.split('-')
        # Check if last 3 parts form a date (YYYY-MM-DD format)
        if len(parts) >= 4:
            try:
                if (len(parts[-3]) == 4 and parts[-3].isdigit() and
                    len(parts[-2]) == 2 and parts[-2].isdigit() and
                    len(parts[-1]) == 2 and parts[-1].isdigit()):
                    base_name = '-'.join(parts[:-3])
                    if base_name not in variations:
                        variations.append(base_name)
            except (ValueError, IndexError):
                pass

        # Also try removing just the last segment
        if len(parts) > 1:
            base_name = '-'.join(parts[:-1])
            if base_name not in variations:
                variations.append(base_name)

    # Handle flotorch/ prefix
    if model_id.startswith('flotorch/'):
        without_prefix = model_id[len('flotorch/'):]  # Strip only the leading prefix
        if without_prefix not in variations:
            variations.append(without_prefix)

    return variations

# Patch the PriceCache.get_model_cost method
original_get_model_cost = PriceCache.get_model_cost

async def patched_get_model_cost(self, model_id: str, force_refresh=False):
    """
    Patched version that tries multiple model name variations.
    """
    # Call get_price_list on the instance - this should work correctly
    price_list = await self.get_price_list(force_refresh=force_refresh)

    # Try original model_id first
    if model_id in price_list:
        return price_list[model_id]["cost"]

    # Try normalized variations
    variations = normalize_model_name(model_id)
    for variation in variations[1:]:  # Skip first since we already tried it
        if variation in price_list:
            return price_list[variation]["cost"]

    # If still not found, try force refresh to get latest pricing data
    if not force_refresh:
        try:
            price_list = await self.get_price_list(force_refresh=True)
            if model_id in price_list:
                return price_list[model_id]["cost"]

            # Try variations again with fresh data
            for variation in variations[1:]:
                if variation in price_list:
                    return price_list[variation]["cost"]
        except Exception:
            pass  # If refresh fails, continue to error

    # Generate helpful error message
    available_models = list(price_list.keys())[:10]  # Show first 10 as examples
    error_msg = (
        f"Model '{model_id}' not found in pricing data.\n"
        f"Tried variations: {', '.join(variations)}\n"
        f"Available models (sample): {', '.join(available_models)}...\n"
        f"Total models in pricing data: {len(price_list)}\n"
        f"Tip: The pricing data may need to be refreshed or the model name format may differ."
    )
    raise ValueError(error_msg)

# Apply the patch - assign directly to class, Python will bind it correctly
PriceCache.get_model_cost = patched_get_model_cost

print("✓ Model name lookup patch applied successfully (Colab compatibility fix)")


## 9. Usage Metric Evaluation with Flotorch Eval

### Purpose
Initialize the `AgentEvaluator`, fetch the OpenTelemetry trace, and run the `UsageMetric` to evaluate usage and cost metrics. The evaluation metric **usage_summary** provides detailed assessment of token usage and costs for the News Summary Agent.

### Key Components
1. **UsageMetric**: Initializes the usage metric that will analyze trace data
2. **AgentEvaluator** (`client`):
   - Connects to the Flotorch gateway using API credentials
   - Configured with a default evaluator model
   - Provides methods to fetch and evaluate traces
3. **Trace Fetching**: Retrieves the complete trace data using the trace ID generated during agent execution

The fetched trace contains detailed information about token usage and costs, which will be analyzed by the UsageMetric to compute the usage_summary score.

In [None]:
def display_metrics(result):
    """
    Display usage summary metrics in a formatted table.
    """
    # Find the usage_summary metric
    metric = next((m for m in result.scores if m.name == "usage_summary"), None)
    if not metric:
        print("No usage_summary metric found.")
        return

    # Extract metric details
    d = metric.details

    # Get cost information
    total_cost = d.get("total_cost", "0.000000")
    avg_cost_per_call = d.get("average_cost_per_call", "0.000000")
    cost_breakdown = d.get("cost_breakdown", [])

    # Format cost breakdown
    if cost_breakdown:
        breakdown_lines = []
        for item in cost_breakdown:
            if isinstance(item, dict):
                # Format each cost item
                item_str = f"    - {item.get('operation', 'Unknown')}: ${item.get('cost', '0.000000')}"
                if 'count' in item:
                    item_str += f" ({item['count']} calls)"
                breakdown_lines.append(item_str)
            else:
                breakdown_lines.append(f"    - {item}")
        breakdown_text = "\n".join(breakdown_lines)
    else:
        breakdown_text = "    No cost breakdown available."

    # Format the details string
    details = (
        f"Total Cost: ${total_cost}\n"
        f"Average Cost per Call: ${avg_cost_per_call}\n"
        f"\nCost Breakdown:\n{breakdown_text}"
    )

    # Create DataFrame for display
    df = pd.DataFrame([{
        "Metric": metric.name.replace("_", " ").title(),
        "Score": f"${total_cost}",
        "Details": details
    }])

    # Display DataFrame with multiline support
    display(df.style.set_properties(
        subset=['Details'],
        **{'white-space': 'pre-wrap', 'text-align': 'left'}
    ))

print("✓ Display metrics function defined successfully")

In [None]:
# Initialize the UsageMetric
metrics = [UsageMetric()]

# Initialize the AgentEvaluator client
client = AgentEvaluator(
    api_key=FLOTORCH_API_KEY,
    base_url=FLOTORCH_BASE_URL,
    default_evaluator=default_evaluator
)

traces = None
if trace_ids:
    # Fetch the trace data from the Flotorch gateway
    traces = client.fetch_traces(trace_ids[0])
    print(f"✓ Trace fetched successfully")
else:
    print("✗ No trace IDs found to fetch.")

## 10. Run Evaluation

### Purpose
Execute the usage metric evaluation by processing the fetched OpenTelemetry trace using the UsageMetric to assess token usage and costs.

### Process
- Calls `client.evaluate()` with the trace data and UsageMetric
- The evaluator processes the trace to analyze usage and cost data
- Computes the **usage_summary** metric which includes:
  - Total cost for all agent operations
  - Average cost per call
  - Cost breakdown for individual operations:
    - LLM calls with token usage (input/output tokens) and associated costs
    - Tool execution costs (if applicable)
    - Per-operation cost breakdown
- Returns evaluation results with trajectory ID and metric scores

This step generates the usage and cost analysis that will be displayed in the next section.
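
As a rough illustration of the arithmetic behind a total, an average per call, and a per-operation breakdown, the sketch below derives them from per-call token counts. The prices are placeholder numbers, `summarize_costs` is a hypothetical helper, and the output keys simply mirror the ones read by `display_metrics` in this notebook; how Flotorch Eval actually computes these values is not shown here.

```python
from typing import Dict, List


def summarize_costs(calls: List[Dict], price_per_1k: Dict[str, float]) -> Dict:
    """Build a usage-summary-like dict from per-call token counts (illustrative only)."""
    breakdown = []
    for call in calls:
        cost = (
            call["input_tokens"] / 1000 * price_per_1k["input"]
            + call["output_tokens"] / 1000 * price_per_1k["output"]
        )
        breakdown.append({"operation": call["operation"], "cost": round(cost, 6)})

    total = sum(item["cost"] for item in breakdown)
    average = total / len(breakdown) if breakdown else 0.0
    return {
        "total_cost": round(total, 6),
        "average_cost_per_call": round(average, 6),
        "cost_breakdown": breakdown,
    }


# Placeholder prices per 1,000 tokens; not real pricing for any model.
placeholder_prices = {"input": 0.001, "output": 0.002}
example_calls = [
    {"operation": "llm_call_1", "input_tokens": 1000, "output_tokens": 500},
    {"operation": "llm_call_2", "input_tokens": 2000, "output_tokens": 1000},
]
cost_summary = summarize_costs(example_calls, placeholder_prices)
print(cost_summary)
```

Keeping the per-operation entries alongside the totals is what makes it possible to spot which call dominates the bill.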

In [None]:
if 'traces' in locals() and traces:
    # Evaluate the trace using the UsageMetric
    results = await client.evaluate(
        trace=traces,
        metrics=metrics
    )

    print("✓ Evaluation completed successfully!")
else:
    print("Cannot evaluate: No traces were available.")

## 11. Display and Interpret Results

### Purpose
Format and display the evaluation output clearly using the `display_metrics` helper defined in Section 9, showing the usage_summary metric results in a readable format.

### Functionality
The `display_metrics` function:
- Extracts the `usage_summary` metric from evaluation results
- Formats the cost information and usage details
- Creates a structured display showing:
  - Total Cost
  - Average Cost per Call
  - Cost breakdown for individual operations with token counts
- Uses pandas DataFrame with styled formatting for clean presentation

This function provides a user-friendly way to visualize usage and cost metrics.
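As a rough illustration of the formatting step, the sketch below renders a usage summary as plain text. It is not the notebook's `display_metrics` function (which uses a styled pandas DataFrame); the `format_usage_details` helper and the dictionary keys it reads are assumptions made for this example, since the actual structure of the evaluation results may differ.

```python
def format_usage_details(details):
    """Render a usage summary dict as readable text (hypothetical input shape)."""
    lines = [
        f"Total Cost: ${details['total_cost']:.6f}",
        f"Average Cost per Call: ${details['average_cost_per_call']:.6f}",
        "Cost Breakdown:",
    ]
    # One line per operation, with its cost and token counts.
    for op in details.get("breakdown", []):
        lines.append(
            f"  - {op['name']}: ${op['cost']:.6f} "
            f"({op['input_tokens']} in / {op['output_tokens']} out tokens)"
        )
    return "\n".join(lines)


# Made-up sample data, not real agent output.
sample_details = {
    "total_cost": 0.0066,
    "average_cost_per_call": 0.0033,
    "breakdown": [
        {"name": "llm_call_1", "cost": 0.0021, "input_tokens": 1200, "output_tokens": 150},
        {"name": "llm_call_2", "cost": 0.0045, "input_tokens": 1800, "output_tokens": 400},
    ],
}
report = format_usage_details(sample_details)
print(report)
```

Six decimal places are used because per-call costs are often fractions of a cent and would round to zero at two decimals.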

## 12. View Usage Summary Results

### Purpose
Display the usage summary evaluation results in a formatted table showing the complete cost and usage assessment for the News Summary Agent.

### Output
The displayed table includes:
- **Metric**: The evaluation metric name (usage_summary)
- **Score**: The total cost in USD
- **Details**: Comprehensive usage and cost breakdown showing:
  - Total Cost
  - Average Cost per Call
  - Cost breakdown for individual operations:
    - LLM calls with token counts
    - Tool executions
    - Per-operation cost analysis

This visualization helps identify expensive operations and optimize the agent's cost efficiency.

In [None]:
if 'results' in locals():
    display_metrics(results)
else:
    print("No results object found. Please run section 10 (Run Evaluation) first.")

### Interpreting the Usage Summary

The **usage_summary** metric supports cost monitoring and optimization for the News Summary Agent:

* **Total Cost**: The cumulative cost for all agent operations, including LLM calls and tool executions. This provides an overall view of operational expenses.
* **Average Cost per Call**: The mean cost per agent invocation, useful for estimating operational expenses at scale.
* **Cost Breakdown**: Detailed cost analysis for specific operations:
    * Individual LLM calls with their token usage (input/output tokens) and associated costs
    * Tool execution costs (if applicable)
    * Per-operation cost breakdown showing which operations consume the most resources

For a News Summary Agent, understanding the usage summary helps identify:
- **Cost optimization opportunities**: Which operations (e.g., LLM calls or news fetching) consume a disproportionate share of resources
- **Token usage patterns**: How many tokens news summarization tasks use
- **Operational expenses**: What news summary generation is likely to cost at scale
- **Model selection**: Which model and usage settings best balance cost and quality
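The first and third points can be made concrete with a short sketch: rank operations by their share of total cost, then project a monthly bill from the average cost per call. The helper names, the operation names, and every number below are hypothetical, and the projection assumes a constant call volume and constant per-call cost.

```python
def rank_by_cost(operation_costs):
    """Return (name, cost, share_of_total) tuples, most expensive first."""
    total = sum(operation_costs.values())
    ranked = sorted(operation_costs.items(), key=lambda item: item[1], reverse=True)
    return [(name, cost, cost / total if total else 0.0) for name, cost in ranked]


def project_cost(average_cost_per_call, calls_per_day, days=30):
    """Linear projection; assumes volume and per-call cost stay constant."""
    return average_cost_per_call * calls_per_day * days


# Made-up sample data, not real agent output.
sample_costs = {"llm_plan": 0.0021, "llm_summarize": 0.0045, "fetch_news": 0.0}

ranked = rank_by_cost(sample_costs)
for name, cost, share in ranked:
    print(f"{name}: ${cost:.4f} ({share:.1%} of total)")

monthly = project_cost(average_cost_per_call=0.0033, calls_per_day=1000)
print(f"Projected 30-day cost: ${monthly:.2f}")
```

An operation that dominates the ranking is usually the first candidate for a cheaper model or a shorter prompt.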

## 13. Summary of News Summary Agent Usage Metric Evaluation Notebook

This notebook demonstrates how to evaluate the usage and cost metrics of a **News Summary Agent** built with **Flotorch ADK** using the **Flotorch Eval framework**.

**Use Case**: News Summary Agent - Fetches and summarizes the latest news using news APIs (e.g., NewsAPI).

**Evaluation Metric**: usage_summary

## Core Process

### 1. Setup and Instrumentation
- Configure a `FlotorchADKAgent` with custom tools (e.g., a news fetching function that uses NewsAPI).
- Enable **OpenTelemetry Tracing** via the `tracer_config`.
- This instrumentation allows detailed capture of token usage and cost data for all operations.

### 2. Execution and Data Generation
- Run a sample news query through the agent using the **Runner**.
- This automatically generates an **Agent Trajectory** in the form of OpenTelemetry traces.
- The trace records the usage and cost of each component, including:
  - LLM calls (input/output tokens)
  - Tool executions (news fetching)
  - Step-by-step agent operations
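To illustrate how token usage can be read from trace data, the sketch below sums token counts over span-like dictionaries. The span layout is a simplification invented for this example, and the attribute keys follow the OpenTelemetry generative AI semantic conventions; whether Flotorch traces use these exact keys is an assumption, so inspect a real trace before relying on them.

```python
# Assumed attribute keys (OpenTelemetry GenAI semantic conventions).
INPUT_KEY = "gen_ai.usage.input_tokens"
OUTPUT_KEY = "gen_ai.usage.output_tokens"


def token_totals(spans):
    """Sum token counts over spans that report usage; skip spans that do not."""
    totals = {"input_tokens": 0, "output_tokens": 0, "llm_spans": 0}
    for span in spans:
        attrs = span.get("attributes", {})
        if INPUT_KEY not in attrs and OUTPUT_KEY not in attrs:
            continue  # e.g. a tool span with no token usage
        totals["input_tokens"] += int(attrs.get(INPUT_KEY, 0))
        totals["output_tokens"] += int(attrs.get(OUTPUT_KEY, 0))
        totals["llm_spans"] += 1
    return totals


# Made-up spans, not real agent output.
sample_spans = [
    {"name": "llm_call", "attributes": {INPUT_KEY: 1200, OUTPUT_KEY: 150}},
    {"name": "fetch_news", "attributes": {"tool.name": "fetch_news"}},
    {"name": "llm_call", "attributes": {INPUT_KEY: 1800, OUTPUT_KEY: 400}},
]
totals = token_totals(sample_spans)
print(totals)
```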

### 3. Evaluation
- Use the `AgentEvaluator` client along with the specialized **UsageMetric** from `flotorch-eval`.
- The evaluator processes the trace data to compute usage and cost statistics using the **usage_summary** metric.

### 4. Analysis
- The notebook displays a thorough usage and cost breakdown, including:
  - **Total Cost**
  - **Average Cost per Call**
  - Cost breakdown for individual operations:
    - LLM calls with token counts
    - Tool executions
    - Per-operation cost analysis

## Purpose and Benefits

This evaluation provides **actionable cost and usage metrics** that help developers:

- Monitor and optimize operational costs for the News Summary Agent
- Track token usage patterns in news summarization tasks
- Identify expensive operations and optimize them
- Make informed decisions about model selection and usage optimization
- Estimate costs at scale for news summary generation