# Control Plane: Agent Fleet Management

----

This notebook demonstrates **batch registration** of agents/workflows and **real-time simulation** with live metrics tracking in Azure AI Foundry.

You will learn how to:

- **Batch Register Agents**: Create 5 agents at once for fleet management
- **Batch Register Workflows**: Create 5 workflows for multi-agent orchestration
- **Real-time Simulation**: Run continuous daemon simulations with live metrics
- **Sample Evaluation**: Evaluate agent performance on sample tasks 
- **Monitor in Portal**: View live metrics and traces in Azure AI Foundry Portal

* Reference Repository: [@guming3d, AI-Foundry-Agent-Simulation](https://github.com/guming3d/AI-Foundry-Agent-Simulation)

## Table of Contents

- [Setup](#setup)
- [Part 1: Batch Agent Registration](#part-1-batch-agent-registration)
- [Part 2: Batch Workflow Registration](#part-2-batch-workflow-registration)
- [Part 3: Real-time Daemon Simulation](#part-3-real-time-daemon-simulation)
- [Part 4: Sample Evaluations](#part-4-sample-evaluations)
- [Part 5: Portal Monitoring](#part-5-portal-monitoring)
- [Wrap-up](#wrap-up)

## Setup

This notebook reuses the configuration file (`.foundry_config.json`) created by `0_setup/1_setup.ipynb`.

- If the file is missing, run the setup notebook first.
- Make sure you are authenticated (e.g., via `az login`) so that `DefaultAzureCredential` can acquire a token.
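
Before running the cells below, it can help to confirm that the configuration contains the keys this notebook reads. The following is a minimal sketch and not part of the original notebook: the helper name `find_missing_keys` is hypothetical, and treating all four keys as required is an assumption made for illustration.

```python
# Keys this notebook reads from the config file. Treating every one of them
# as required is an assumption made for this sketch.
REQUIRED_KEYS = (
    "FOUNDRY_NAME",
    "RESOURCE_GROUP",
    "LOCATION",
    "AZURE_AI_PROJECT_ENDPOINT",
)


def find_missing_keys(config: dict, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in required if not config.get(key)]


# Hypothetical in-memory config used in place of a real file
example_config = {
    "FOUNDRY_NAME": "my-foundry",
    "RESOURCE_GROUP": "my-rg",
    "LOCATION": "",
}
missing = find_missing_keys(example_config)
print(missing)
```

An empty result suggests the file is complete. Anything listed is a hint to rerun the setup notebook.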

In [15]:
# Environment setup and PATH configuration
import json
import os
import subprocess
import asyncio
import threading
import time
import random
from datetime import datetime
from dataclasses import dataclass, field
from typing import List, Dict, Any, Optional
from dotenv import load_dotenv

load_dotenv(override=True)

# Ensure the notebook kernel can find Azure CLI (`az`) on PATH
possible_paths = [
    '/opt/homebrew/bin',   # macOS (Apple Silicon)
    '/usr/local/bin',      # macOS (Intel) / Linux
    '/usr/bin',            # Linux / Codespaces
    '/home/linuxbrew/.linuxbrew/bin',  # Linux Homebrew
]

az_path = None
try:
    result = subprocess.run(['which', 'az'], capture_output=True, text=True)
    if result.returncode == 0:
        az_path = os.path.dirname(result.stdout.strip())
        print(f'🔍 Azure CLI found: {result.stdout.strip()}')
except Exception:
    pass

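# Compare against individual PATH entries rather than substrings of the PATH string
path_entries = os.environ.get('PATH', '').split(os.pathsep)
paths_to_add: list[str] = []
if az_path:
    if az_path not in path_entries:
        paths_to_add.append(az_path)
else:
    # Azure CLI was not found: fall back to common install locations
    for path in possible_paths:
        if os.path.exists(path) and path not in path_entries:
            paths_to_add.append(path)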

if paths_to_add:
    os.environ['PATH'] = ':'.join(paths_to_add) + ':' + os.environ.get('PATH', '')
    print(f"✅ Added to PATH: {', '.join(paths_to_add)}")
else:
    print('✅ PATH looks good already')

print(f"\nPATH (first 150 chars): {os.environ['PATH'][:150]}...")

🔍 Azure CLI found: /anaconda/envs/azureml_py38/bin//az
✅ PATH looks good already

PATH (first 150 chars): /anaconda/envs/azureml_py38/bin/:/afh/code/agent-operator-lab/.venv/bin:/home/azureuser/.vscode-server/cli/servers/Stable-c9d77990917f3102ada88be140d2...


In [16]:
# Load Foundry project settings from .foundry_config.json
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import PromptAgentDefinition
from azure.identity import DefaultAzureCredential

config_file = '../0_setup/.foundry_config.json'
try:
    with open(config_file, 'r', encoding='utf-8') as f:
        config = json.load(f)
except FileNotFoundError:
    print(f"⚠️ Could not find '{config_file}'.")
    print('💡 Run 0_setup/1_setup.ipynb first to create it.')
    raise  # re-raise with the original traceback

FOUNDRY_NAME = config.get('FOUNDRY_NAME')
RESOURCE_GROUP = config.get('RESOURCE_GROUP')
LOCATION = config.get('LOCATION')
PROJECT_NAME = config.get('PROJECT_NAME', 'proj-default')
AZURE_AI_PROJECT_ENDPOINT = config.get('AZURE_AI_PROJECT_ENDPOINT')
AZURE_AI_MODEL_DEPLOYMENT_NAME = os.environ.get("AZURE_AI_MODEL_DEPLOYMENT_NAME")

os.environ['FOUNDRY_NAME'] = FOUNDRY_NAME or ''
os.environ['LOCATION'] = LOCATION or ''
os.environ['RESOURCE_GROUP'] = RESOURCE_GROUP or ''
os.environ['AZURE_SUBSCRIPTION_ID'] = config.get('AZURE_SUBSCRIPTION_ID') or ''  # guard against an explicit null in the JSON

print(f"✅ Loaded settings from '{config_file}'.")
print(f"\n📌 Foundry name: {FOUNDRY_NAME}")
print(f"📌 Resource group: {RESOURCE_GROUP}")
print(f"📌 Location: {LOCATION}")
print(f"📌 Project endpoint: {AZURE_AI_PROJECT_ENDPOINT}")
print(f"📌 Model deployment: {AZURE_AI_MODEL_DEPLOYMENT_NAME}")

# Initialize credential for Azure services
credential = DefaultAzureCredential()

✅ Loaded settings from '../0_setup/.foundry_config.json'.

📌 Foundry name: foundry-rq90gs
📌 Resource group: foundry-rg
📌 Location: swedencentral
📌 Project endpoint: https://foundry-rq90gs.services.ai.azure.com/api/projects/default-project
📌 Model deployment: gpt-5.2


## Part 1: Batch Agent Registration

Register **5 agents** at once for fleet management, one per agent type. Each agent has a specific role (e.g., CustomerSupport, DataAnalyst) to simulate a realistic enterprise deployment.

| Agent Type | Description | Count |
|------------|-------------|-------|
| CustomerSupport | Handle customer inquiries | 1 |
| DataAnalyst | Analyze business data | 1 |
| TechSupport | Technical troubleshooting | 1 |
| SalesAssistant | Sales and product queries | 1 |
| GeneralAssistant | General purpose queries | 1 |
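
The registration code in this notebook names each agent `fleet-<type>-<timestamp>-<index>`. As a rough sketch of how that scheme would extend to more than one agent per type, the snippet below builds the names without calling any Azure API. The helper `build_agent_names` and its `per_type` parameter are hypothetical and do not appear in the notebook.

```python
from datetime import datetime

AGENT_TYPE_NAMES = [
    "CustomerSupport",
    "DataAnalyst",
    "TechSupport",
    "SalesAssistant",
    "GeneralAssistant",
]


def build_agent_names(agent_types, per_type=1, suffix=None):
    """Build fleet agent names, `per_type` names for every agent type."""
    # Default to a timestamp suffix so repeated runs do not collide
    suffix = suffix or datetime.now().strftime("%Y%m%d%H%M%S")
    return [
        f"fleet-{agent_type}-{suffix}-{index + 1:02d}"
        for agent_type in agent_types
        for index in range(per_type)
    ]


# A fixed suffix keeps this example deterministic
names = build_agent_names(AGENT_TYPE_NAMES, per_type=2, suffix="20260101000000")
print(len(names))
print(names[0])
print(names[-1])
```

With `per_type=2`, the five types yield ten names; the notebook itself uses one agent per type.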

In [17]:
# Define agent types with their instructions
AGENT_TYPES = [
    ("CustomerSupport", "You are a customer support agent. Help customers with inquiries, complaints, and account issues."),
    ("DataAnalyst", "You are a data analyst. Analyze data, create reports, and provide business insights."),
    ("TechSupport", "You are a technical support agent. Help with software issues, troubleshooting, and technical guidance."),
    ("SalesAssistant", "You are a sales assistant. Help with product information, pricing, and purchase recommendations."),
    ("GeneralAssistant", "You are a general assistant. Help with various tasks and questions."),
]

# Batch create 5 agents (1 per type)
uuid_suffix = datetime.now().strftime("%Y%m%d%H%M%S")
created_agents = []

print("=" * 70)
print("🚀 Batch Agent Registration (5 Agents)")
print("=" * 70)

with AIProjectClient(endpoint=AZURE_AI_PROJECT_ENDPOINT, credential=credential) as project_client:
    agent_count = 0
    for agent_type, instructions in AGENT_TYPES:
        for i in range(1):  # 1 agent per type
            agent_name = f"fleet-{agent_type}-{uuid_suffix}-{i+1:02d}"
            try:
                agent = project_client.agents.create_version(
                    agent_name=agent_name,
                    definition=PromptAgentDefinition(
                        model=AZURE_AI_MODEL_DEPLOYMENT_NAME,
                        instructions=instructions,
                    ),
                )
                created_agents.append({
                    "name": agent.name,
                    "id": agent.id,
                    "type": agent_type,
                    "model": AZURE_AI_MODEL_DEPLOYMENT_NAME,
                })
                agent_count += 1
                print(f"   ✅ [{agent_count:02d}/05] Created: {agent_name}")
            except Exception as e:
                print(f"   ❌ Failed to create {agent_name}: {str(e)[:50]}")

print(f"\n📊 Summary: {len(created_agents)}/05 agents created successfully")
print("   Agent IDs saved for simulation")

🚀 Batch Agent Registration (5 Agents)
   ✅ [01/05] Created: fleet-CustomerSupport-20260204081706-01
   ✅ [02/05] Created: fleet-DataAnalyst-20260204081706-01
   ✅ [03/05] Created: fleet-TechSupport-20260204081706-01
   ✅ [04/05] Created: fleet-SalesAssistant-20260204081706-01
   ✅ [05/05] Created: fleet-GeneralAssistant-20260204081706-01

📊 Summary: 5/05 agents created successfully
   Agent IDs saved for simulation


## Part 2: Batch Workflow Registration

Register **5 workflows** to orchestrate multi-agent interactions. Each workflow defines a sequence of agent calls for complex tasks.

| Workflow Type | Description | Agents Used |
|---------------|-------------|-------------|
| CustomerJourney | End-to-end customer support | CustomerSupport → TechSupport |
| DataPipeline | Data analysis workflow | DataAnalyst → GeneralAssistant |
| SalesProcess | Sales funnel workflow | SalesAssistant → CustomerSupport |
| TechEscalation | Technical issue escalation | TechSupport → DataAnalyst |
| GeneralInquiry | General purpose workflow | GeneralAssistant → SalesAssistant |
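
Each workflow in the table refers to agent types by name, so a typo in a pairing would only surface at registration time. The sketch below shows one possible pre-flight check, assuming the pairings are held in a plain dictionary. The helper `unknown_agent_types` is hypothetical and not part of the notebook.

```python
# Workflow -> (primary agent type, secondary agent type), as in the table above
WORKFLOW_AGENT_PAIRS = {
    "CustomerJourney": ("CustomerSupport", "TechSupport"),
    "DataPipeline": ("DataAnalyst", "GeneralAssistant"),
    "SalesProcess": ("SalesAssistant", "CustomerSupport"),
    "TechEscalation": ("TechSupport", "DataAnalyst"),
    "GeneralInquiry": ("GeneralAssistant", "SalesAssistant"),
}

KNOWN_AGENT_TYPES = {
    "CustomerSupport",
    "DataAnalyst",
    "TechSupport",
    "SalesAssistant",
    "GeneralAssistant",
}


def unknown_agent_types(pairs: dict, known: set) -> dict:
    """Map each workflow to the agent types it references that are not in `known`."""
    problems = {}
    for workflow, agent_types in pairs.items():
        missing = [t for t in agent_types if t not in known]
        if missing:
            problems[workflow] = missing
    return problems


print(unknown_agent_types(WORKFLOW_AGENT_PAIRS, KNOWN_AGENT_TYPES))

# A deliberately misspelled pairing is reported instead of silently falling back
broken = {"CustomerJourney": ("CustomerSupport", "TechSuport")}
print(unknown_agent_types(broken, KNOWN_AGENT_TYPES))
```

An empty dictionary means every workflow references a known agent type.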

In [18]:
# Import WorkflowAgentDefinition for workflow registration
from azure.ai.projects.models import WorkflowAgentDefinition

# Define workflow types with patterns
WORKFLOW_TYPES = [
    ("CustomerJourney", "End-to-end customer support workflow", "sequential"),
    ("DataPipeline", "Data analysis and reporting pipeline", "sequential"),
    ("SalesProcess", "Sales funnel and conversion workflow", "sequential"),
    ("TechEscalation", "Technical issue escalation workflow", "review_loop"),
    ("GeneralInquiry", "General purpose inquiry workflow", "sequential"),
]

# Map workflow types to agent pairs (from created agents)
def get_agents_for_workflow(workflow_type: str, agents: list) -> tuple:
    """Get agent pair for a workflow based on type."""
    agent_map = {
        "CustomerJourney": ("CustomerSupport", "TechSupport"),
        "DataPipeline": ("DataAnalyst", "GeneralAssistant"),
        "SalesProcess": ("SalesAssistant", "CustomerSupport"),
        "TechEscalation": ("TechSupport", "DataAnalyst"),
        "GeneralInquiry": ("GeneralAssistant", "SalesAssistant"),
    }
    primary_type, secondary_type = agent_map.get(workflow_type, ("GeneralAssistant", "GeneralAssistant"))
    
    # Find matching agents
    primary = next((a for a in agents if a["type"] == primary_type), agents[0] if agents else None)
    secondary = next((a for a in agents if a["type"] == secondary_type), agents[-1] if agents else None)
    return primary, secondary

def build_sequential_workflow_yaml(primary_agent: str, secondary_agent: str) -> str:
    """Build a sequential workflow YAML with two agents."""
    return f"""kind: workflow
trigger:
  kind: OnConversationStart
  id: workflow_start
  actions:
    - kind: SetVariable
      id: set_variable_input
      variable: Local.LatestMessage
      value: "=UserMessage(System.LastMessageText)"
    - kind: CreateConversation
      id: create_primary_conversation
      conversationId: Local.PrimaryConversationId
    - kind: InvokeAzureAgent
      id: primary_agent
      description: Primary Agent
      conversationId: "=Local.PrimaryConversationId"
      agent:
        name: {primary_agent}
      input:
        messages: "=Local.LatestMessage"
      output:
        messages: Local.LatestMessage
    - kind: CreateConversation
      id: create_secondary_conversation
      conversationId: Local.SecondaryConversationId
    - kind: InvokeAzureAgent
      id: secondary_agent
      description: Secondary Agent
      conversationId: "=Local.SecondaryConversationId"
      agent:
        name: {secondary_agent}
      input:
        messages: "=Local.LatestMessage"
      output:
        messages: Local.FinalMessage
        autoSend: true"""

def build_review_loop_workflow_yaml(primary_agent: str, reviewer_agent: str) -> str:
    """Build a review loop workflow YAML with primary and reviewer agents."""
    return f"""kind: workflow
trigger:
  kind: OnConversationStart
  id: workflow_start
  actions:
    - kind: SetVariable
      id: set_variable_input
      variable: Local.LatestMessage
      value: "=UserMessage(System.LastMessageText)"
    - kind: SetVariable
      id: set_variable_turncount
      variable: Local.TurnCount
      value: "=0"
    - kind: CreateConversation
      id: create_primary_conversation
      conversationId: Local.PrimaryConversationId
    - kind: CreateConversation
      id: create_reviewer_conversation
      conversationId: Local.ReviewerConversationId
    - kind: InvokeAzureAgent
      id: primary_agent
      description: Primary Agent
      conversationId: "=Local.PrimaryConversationId"
      agent:
        name: {primary_agent}
      input:
        messages: "=Local.LatestMessage"
      output:
        messages: Local.LatestMessage
    - kind: InvokeAzureAgent
      id: reviewer_agent
      description: Reviewer Agent
      conversationId: "=Local.ReviewerConversationId"
      agent:
        name: {reviewer_agent}
      input:
        messages: "=Local.LatestMessage"
      output:
        messages: Local.LatestMessage
    - kind: SetVariable
      id: increment_turncount
      variable: Local.TurnCount
      value: "=Local.TurnCount + 1"
    - kind: ConditionGroup
      id: completion_check
      conditions:
        - condition: '=!IsBlank(Find("[COMPLETE]", Upper(Last(Local.LatestMessage).Text)))'
          id: check_done
          actions:
            - kind: EndConversation
              id: end_workflow
        - condition: "=Local.TurnCount >= 3"
          id: check_turn_count_exceeded
          actions:
            - kind: SendActivity
              id: send_final
              activity: "Review complete."
      elseActions:
        - kind: GotoAction
          id: goto_primary_agent
          actionId: primary_agent"""

# Batch create 5 workflows (1 per type) and register in Azure
created_workflows = []

print("=" * 70)
print("🔄 Batch Workflow Registration (5 Workflows)")
print("=" * 70)

if not created_agents:
    print("⚠️ No agents available. Run Part 1 first to create agents.")
else:
    with AIProjectClient(endpoint=AZURE_AI_PROJECT_ENDPOINT, credential=credential) as project_client:
        workflow_count = 0
        for workflow_type, description, pattern in WORKFLOW_TYPES:
            for i in range(1):  # 1 workflow per type
                workflow_name = f"wf-{workflow_type}-{uuid_suffix}-{i+1:02d}"
                
                # Get agents for this workflow
                primary_agent, secondary_agent = get_agents_for_workflow(workflow_type, created_agents)
                
                if not primary_agent or not secondary_agent:
                    print(f"   ⚠️ Skipping {workflow_name}: No agents available")
                    continue
                
                try:
                    # Build workflow YAML based on pattern
                    if pattern == "review_loop":
                        workflow_yaml = build_review_loop_workflow_yaml(
                            primary_agent["name"],
                            secondary_agent["name"]
                        )
                    else:
                        workflow_yaml = build_sequential_workflow_yaml(
                            primary_agent["name"],
                            secondary_agent["name"]
                        )
                    
                    # Register workflow in Azure AI Foundry
                    workflow = project_client.agents.create_version(
                        agent_name=workflow_name,
                        definition=WorkflowAgentDefinition(workflow=workflow_yaml),
                    )
                    
                    workflow_config = {
                        "name": workflow.name,
                        "id": workflow.id,
                        "version": workflow.version,
                        "type": workflow_type,
                        "pattern": pattern,
                        "description": description,
                        "agents": [primary_agent["name"], secondary_agent["name"]],
                        "created_at": datetime.now().isoformat(),
                    }
                    created_workflows.append(workflow_config)
                    workflow_count += 1
                    print(f"   ✅ [{workflow_count:02d}/05] Registered: {workflow_name} ({pattern})")

                except Exception as e:
                    print(f"   ❌ Failed to create {workflow_name}: {str(e)[:60]}")

    print(f"\n📊 Summary: {len(created_workflows)}/05 workflows registered in Azure AI Foundry")
    print("   Workflow patterns: sequential, review_loop")
    print("   Each workflow orchestrates 2 agents")

🔄 Batch Workflow Registration (5 Workflows)
   ✅ [01/05] Registered: wf-CustomerJourney-20260204081706-01 (sequential)
   ✅ [02/05] Registered: wf-DataPipeline-20260204081706-01 (sequential)
   ✅ [03/05] Registered: wf-SalesProcess-20260204081706-01 (sequential)
   ✅ [04/05] Registered: wf-TechEscalation-20260204081706-01 (review_loop)
   ✅ [05/05] Registered: wf-GeneralInquiry-20260204081706-01 (sequential)

📊 Summary: 5/05 workflows registered in Azure AI Foundry
   Workflow patterns: sequential, review_loop
   Each workflow orchestrates 2 agents


## Part 3: Real-time Daemon Simulation

Run a **continuous daemon simulation** that sends requests to agents and tracks live metrics. This simulates production traffic for monitoring and testing.

### Daemon Configuration

| Parameter | Value | Description |
|-----------|-------|-------------|
| Interval | 5 seconds | Time between batches |
| Calls per batch | 3-5 | Random calls per interval |
| Threads | 3 | Parallel execution threads |
| Duration | 60 seconds | Total simulation time |
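
The parameters above bound how much traffic one run can generate. As a back-of-the-envelope sketch, assuming each cycle consists of one batch followed by one sleep of `interval` seconds, the number of batches is roughly `ceil(duration / (interval + batch_latency))`. The helper `estimate_call_range` and its `batch_latency_s` parameter are hypothetical and introduced only for this estimate.

```python
import math


def estimate_call_range(duration_s, interval_s, calls_min, calls_max, batch_latency_s=0.0):
    """Estimate (batches, min_calls, max_calls) for one simulation run.

    Assumes every cycle takes `interval_s + batch_latency_s` seconds and that
    a new batch starts whenever the elapsed time is still below `duration_s`.
    """
    cycle_s = interval_s + batch_latency_s
    batches = math.ceil(duration_s / cycle_s)
    return batches, batches * calls_min, batches * calls_max


# Defaults from the table, ignoring the time the agent calls themselves take
print(estimate_call_range(60, 5.0, 3, 5))

# Same settings, assuming each batch takes about one second to complete
print(estimate_call_range(60, 5.0, 3, 5, batch_latency_s=1.0))
```

With zero batch latency, the defaults give at most 12 batches and between 36 and 60 calls. Real runs will land lower because the agent calls add to every cycle.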

In [19]:
# Daemon Configuration
@dataclass
class DaemonConfig:
    """Configuration for daemon simulation."""
    interval_seconds: float = 5.0
    calls_per_batch_min: int = 3
    calls_per_batch_max: int = 5
    threads: int = 3
    duration_seconds: int = 60
    run_evaluations: bool = False  # Enable/disable evaluations during simulation
    evaluation_count: int = 5     # Number of evaluation runs

@dataclass
class DaemonMetrics:
    """Live metrics for daemon monitoring."""
    total_calls: int = 0
    successful_calls: int = 0
    failed_calls: int = 0
    total_latency_ms: float = 0
    evaluation_runs: int = 0
    start_time: Optional[datetime] = None
    
    @property
    def success_rate(self) -> float:
        return (self.successful_calls / self.total_calls * 100) if self.total_calls > 0 else 0
    
    @property
    def avg_latency_ms(self) -> float:
        return (self.total_latency_ms / self.successful_calls) if self.successful_calls > 0 else 0
    
    @property
    def runtime(self) -> str:
        if not self.start_time:
            return "0s"
        elapsed = (datetime.now() - self.start_time).total_seconds()
        return f"{int(elapsed)}s"

# Sample queries for each agent type
QUERY_TEMPLATES = {
    "CustomerSupport": ["How do I reset my password?", "I need help with my order", "What's your return policy?"],
    "DataAnalyst": ["Analyze Q4 sales data", "Show revenue trends", "Compare regional performance"],
    "TechSupport": ["My app crashes on startup", "How do I install the SDK?", "Network connection issues"],
    "SalesAssistant": ["What's the pricing for enterprise?", "Compare product features", "Request a demo"],
    "GeneralAssistant": ["What time is it in Tokyo?", "Summarize this article", "Help me draft an email"],
}

print("✅ Daemon configuration ready")
print(f"   Interval: {DaemonConfig().interval_seconds}s")
print(f"   Calls/batch: {DaemonConfig().calls_per_batch_min}-{DaemonConfig().calls_per_batch_max}")
print(f"   Duration: {DaemonConfig().duration_seconds}s")
print(f"   Evaluations enabled: {DaemonConfig().run_evaluations}")
print(f"   Evaluation count: {DaemonConfig().evaluation_count}")

✅ Daemon configuration ready
   Interval: 5.0s
   Calls/batch: 3-5
   Duration: 60s
   Evaluations enabled: False
   Evaluation count: 5


In [None]:
# Daemon Runner - Simulates continuous production traffic with Evals API
from pathlib import Path
from typing import Union
from pprint import pprint
import yaml

# Import Evals API types
from openai.types.eval_create_params import DataSourceConfigCustom
from openai.types.evals.run_create_response import RunCreateResponse
from openai.types.evals.run_retrieve_response import RunRetrieveResponse

@dataclass
class EvaluationItem:
    """Single evaluation dataset row."""
    query: str
    context: str = ""
    ground_truth: str = ""

@dataclass
class EvaluationResult:
    """Result of a single evaluation run."""
    eval_id: str
    run_id: str
    status: str
    result_counts: Dict[str, Any]
    output_items: List[Any]
    success: bool
    latency_ms: float

class DaemonRunner:
    """Simple daemon runner for agent fleet simulation with Evals API evaluations."""
    
    def __init__(self, agents: List[Dict], config: DaemonConfig):
        self.agents = agents
        self.config = config
        self.metrics = DaemonMetrics()
        self._stop_requested = False
        self._lock = threading.Lock()
        self.evaluation_results: List[EvaluationResult] = []
    
    def _call_agent(self, agent: Dict) -> Dict[str, Any]:
        """Execute a single agent call and return metrics."""
        agent_type = agent.get("type", "GeneralAssistant")
        query = random.choice(QUERY_TEMPLATES.get(agent_type, QUERY_TEMPLATES["GeneralAssistant"]))
        
        start_time = time.time()
        success = False
        error_msg = None
        
        try:
            with AIProjectClient(endpoint=AZURE_AI_PROJECT_ENDPOINT, credential=credential) as client:
                openai_client = client.get_openai_client()
                conversation = openai_client.conversations.create()
                response = openai_client.responses.create(
                    conversation=conversation.id,
                    extra_body={"agent": {"name": agent["name"], "type": "agent_reference"}},
                    input=query,
                )
                success = True
        except Exception as e:
            error_msg = str(e)[:50]
        
        latency_ms = (time.time() - start_time) * 1000
        
        with self._lock:
            self.metrics.total_calls += 1
            if success:
                self.metrics.successful_calls += 1
                self.metrics.total_latency_ms += latency_ms
            else:
                self.metrics.failed_calls += 1
        
        return {"success": success, "latency_ms": latency_ms, "agent": agent["name"], "error": error_msg}
    
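    def _run_batch(self, batch_size: int):
        """Run a batch of agent calls in parallel, capped at config.threads workers."""
        # Local import keeps this cell runnable without changes to the setup cell
        from concurrent.futures import ThreadPoolExecutor

        # Pick a random agent for each call in the batch
        batch_agents = [random.choice(self.agents) for _ in range(batch_size)]
        # Honor the configured thread limit; the with-block waits for every call to finish
        with ThreadPoolExecutor(max_workers=max(1, self.config.threads)) as pool:
            list(pool.map(self._call_agent, batch_agents))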
    
    def _run_evals_api_evaluation(self, model: str = None) -> Optional[EvaluationResult]:
        """
        Run one evaluation through the OpenAI Evals API exposed by the project client.
        
        How it works:
        1. Creates the evaluation with openai_client.evals.create()
        2. Uses DataSourceConfigCustom to declare the item schema
        3. Uses built-in evaluators (violence, coherence, fluency, relevance)
        4. Starts a run, then polls until the service reports a terminal status
        """
        start_time = time.time()
        
        # Use provided model or default
        eval_model = model or AZURE_AI_MODEL_DEPLOYMENT_NAME or "gpt-4o"
        
        
        
        try:
            with (
                DefaultAzureCredential() as cred,
                AIProjectClient(endpoint=AZURE_AI_PROJECT_ENDPOINT, credential=cred) as project_client,
                project_client.get_openai_client() as openai_client,
            ):
                # 1. Create data source config (key difference: structured schema)
                data_source_config = DataSourceConfigCustom(
                    type="custom",
                    item_schema={
                        "type": "object",
                        "properties": {"query": {"type": "string"}},
                        "required": ["query"]
                    },
                    include_sample_schema=True,
                )
                
                # 2. Define testing criteria with builtin evaluators
                testing_criteria = [
                    {
                        "type": "azure_ai_evaluator",
                        "name": "violence_detection",
                        "evaluator_name": "builtin.violence",
                        "data_mapping": {
                            "query": "{{item.query}}",
                            "response": "{{sample.output_text}}"
                        },
                    },
                    {
                        "type": "azure_ai_evaluator",
                        "name": "coherence",
                        "evaluator_name": "builtin.coherence",
                        "initialization_parameters": {"deployment_name": f"{eval_model}"},
                        "data_mapping": {"query": "{{item.query}}", "response": "{{sample.output_text}}"},
                    },
                    {
                        "type": "azure_ai_evaluator",
                        "name": "fluency",
                        "evaluator_name": "builtin.fluency",
                        "initialization_parameters": {"deployment_name": f"{eval_model}"},
                        "data_mapping": {"query": "{{item.query}}", "response": "{{sample.output_text}}"},
                    },
                    {
                        "type": "azure_ai_evaluator",
                        "name": "relevance",
                        "evaluator_name": "builtin.relevance",
                        "initialization_parameters": {
                            "deployment_name": f"{eval_model}",
                            "is_reasoning_model": True, # if you use an AOAI reasoning model   
                        },
                        "data_mapping": {
                            "query": "{{item.query}}",
                            "response": "{{sample.output_text}}",
                        },
                    },
                    
                    
                ]
                
                # 3. Create evaluation object
                eval_object = openai_client.evals.create(
                    name=f"Fleet Evaluation {datetime.now().strftime('%H%M%S')}",
                    data_source_config=data_source_config,
                    testing_criteria=testing_criteria,  # type: ignore
                )
                print(f"      📝 Evaluation created (id: {eval_object.id})")
                
                # 4. Define data source with sample queries
                data_source = {
                    "type": "azure_ai_target_completions",
                    "source": {
                        "type": "file_content",
                        "content": [
                            {"item": {"query": "What is the capital of France?","response":"Paris"}},
                            {"item": {"query": "How do I reset my password?","response":"To reset your password, go to the settings page and click on 'Reset Password'."}},
                            {"item": {"query": "Explain machine learning briefly.","response":"Machine learning is a field of AI that uses algorithms to learn from data and make predictions or decisions without being explicitly programmed."}},
                        ],
                    },
                    "input_messages": {
                        "type": "template",
                        "template": [
                            {
                                "type": "message",
                                "role": "user",
                                "content": {"type": "input_text", "text": "{{item.query}}"}
                            }
                        ],
                    },
                    "target": {
                        "type": "azure_ai_model",
                        "model": eval_model,
                        "sampling_params": {
                            "top_p": 1.0,
                            "max_completion_tokens": 256,
                        },
                    },
                }
                
                # 5. Create and run evaluation
                eval_run: Union[RunCreateResponse, RunRetrieveResponse] = openai_client.evals.runs.create(
                    eval_id=eval_object.id,
                    name=f"Run for {eval_model}",
                    data_source=data_source  # type: ignore
                )
                print(f"      🚀 Eval run started (id: {eval_run.id})")
                
                # 6. Poll until the run reaches a terminal state (the run executes asynchronously)
                while eval_run.status not in ["completed", "failed", "canceled"]:
                    time.sleep(3)  # wait before re-checking to avoid a tight request loop
                    eval_run = openai_client.evals.runs.retrieve(
                        run_id=eval_run.id,
                        eval_id=eval_object.id
                    )
                    print(f"      ⏳ Status: {eval_run.status}...")
                
                latency_ms = (time.time() - start_time) * 1000
                
                # 7. Collect results
                if eval_run.status == "completed":
                    output_items = list(
                        openai_client.evals.runs.output_items.list(
                            run_id=eval_run.id,
                            eval_id=eval_object.id
                        )
                    )
                    print(output_items)
                    result = EvaluationResult(
                        eval_id=eval_object.id,
                        run_id=eval_run.id,
                        status="completed",
                        result_counts=eval_run.result_counts or {},
                        output_items=output_items,
                        success=True,
                        latency_ms=latency_ms,
                    )
                    
                    with self._lock:
                        self.metrics.evaluation_runs += 1
                    
                    return result
                else:
                    return EvaluationResult(
                        eval_id=eval_object.id,
                        run_id=eval_run.id,
                        status="failed",
                        result_counts={},
                        output_items=[],
                        success=False,
                        latency_ms=latency_ms,
                    )
                    
        except Exception as e:
            print(f"      ❌ Evaluation error: {str(e)[:80]}")
            return None
    
    def _run_evaluations_batch(self, count: int):
        """Run evaluation batch using Evals API."""
        print(f"\n   🧪 Running {count} evaluation(s) with Evals API...")
        
        for i in range(count):
            print(f"\n   [{i+1}/{count}] Starting evaluation...")
            result = self._run_evals_api_evaluation()
            
            if result:
                self.evaluation_results.append(result)
                status = "✅" if result.success else "❌"
                print(f"      {status} Completed in {result.latency_ms/1000:.1f}s")
                if result.result_counts:
                    print(f"      📊 Results: {result.result_counts}")
    
    def run(self):
        """Run the daemon simulation loop."""
        self.metrics.start_time = datetime.now()
        end_time = time.time() + self.config.duration_seconds
        batch_num = 0
        
        print("\n" + "=" * 70)
        print("🚀 Starting Daemon Simulation")
        print("=" * 70)
        
        while time.time() < end_time and not self._stop_requested:
            batch_num += 1
            batch_size = random.randint(self.config.calls_per_batch_min, self.config.calls_per_batch_max)
            
            self._run_batch(batch_size)
            
            # Print live metrics
            print(f"\r   📊 Batch {batch_num:03d} | "
                  f"Calls: {self.metrics.total_calls} | "
                  f"Success: {self.metrics.success_rate:.1f}% | "
                  f"Avg Latency: {self.metrics.avg_latency_ms:.0f}ms | "
                  f"Runtime: {self.metrics.runtime}", end="", flush=True)
            
            time.sleep(self.config.interval_seconds)
        
        # Run evaluations at the end if enabled
        if self.config.run_evaluations:
            self._run_evaluations_batch(self.config.evaluation_count)
        
        print("\n\n✅ Daemon simulation completed!")
        return self.metrics
    
    def stop(self):
        """Stop the daemon gracefully."""
        self._stop_requested = True

print("✅ DaemonRunner class defined (with Evals API support)")
print("   Key changes from previous version:")
print("   - Uses openai_client.evals.create() API")
print("   - Uses DataSourceConfigCustom for structured data")
print("   - Uses builtin evaluators (builtin.violence)")
print("   - Async execution with polling for completion")

✅ DaemonRunner class defined (with Evals API support)
   Key changes from previous version:
   - Uses openai_client.evals.create() API
   - Uses DataSourceConfigCustom for structured data
   - Uses builtin evaluators (builtin.violence)
   - Async execution with polling for completion


In [21]:
# Run the daemon simulation without evaluations
RUN_EVALUATIONS = False  

if not created_agents:
    print("⚠️ No agents created. Run Part 1 first.")
else:
    config = DaemonConfig(
        interval_seconds=5.0,
        calls_per_batch_min=2,
        calls_per_batch_max=4,
        threads=3,
        duration_seconds=60,        # 1 minute simulation
        run_evaluations=RUN_EVALUATIONS,
        evaluation_count=2,        # Ignored while run_evaluations is False
    )
    
    daemon = DaemonRunner(agents=created_agents, config=config)
    final_metrics = daemon.run()
    
    # Print final summary
    print("\n" + "=" * 70)
    print("📊 Final Simulation Metrics")
    print("=" * 70)
    print(f"   Total Calls:      {final_metrics.total_calls}")
    print(f"   Successful:       {final_metrics.successful_calls}")
    print(f"   Failed:           {final_metrics.failed_calls}")
    print(f"   Success Rate:     {final_metrics.success_rate:.1f}%")
    print(f"   Avg Latency:      {final_metrics.avg_latency_ms:.0f}ms")
    print(f"   Total Runtime:    {final_metrics.runtime}")


🚀 Starting Daemon Simulation
   📊 Batch 005 | Calls: 15 | Success: 100.0% | Avg Latency: 6180ms | Runtime: 57s

✅ Daemon simulation completed!

📊 Final Simulation Metrics
   Total Calls:      15
   Successful:       15
   Failed:           0
   Success Rate:     100.0%
   Avg Latency:      6180ms
   Total Runtime:    62s


## Part 4: Sample Evaluations

When evaluations are enabled, the daemon runs them through the **OpenAI Evals API** (`openai_client.evals`) after the traffic loop finishes.

### How Evals API Works

1. **Create Evaluation Object**: Define `data_source_config` and `testing_criteria`
2. **Create Eval Run**: Specify `data_source` with queries and target model
3. **Poll for Completion**: Wait for `status == "completed"`
4. **Retrieve Results**: Get `output_items` with detailed evaluation results
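
The polling step in the list above can be sketched as a small helper. This is a minimal illustration rather than the notebook's implementation: `poll_until_done`, `FakeRun`, and the `retrieve` callable are hypothetical stand-ins so that the snippet runs without an Azure connection, and the set of terminal statuses is an assumption. In the notebook, the status comes from the eval run returned by the Evals API.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Terminal states assumed for illustration; the notebook waits for "completed".
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


@dataclass
class FakeRun:
    """Hypothetical stand-in for an eval run object that exposes `status`."""
    id: str
    status: str


def poll_until_done(retrieve: Callable[[], FakeRun],
                    interval_seconds: float = 5.0,
                    timeout_seconds: float = 300.0) -> FakeRun:
    """Call `retrieve` repeatedly until the run reaches a terminal status.

    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = retrieve()
        if run.status in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Run {run.id} still {run.status!r} after {timeout_seconds}s"
            )
        time.sleep(interval_seconds)


# Simulated status sequence, similar to the "in_progress" lines a real run prints.
_statuses = iter(["queued", "in_progress", "in_progress", "completed"])
final_run = poll_until_done(
    lambda: FakeRun("evalrun_demo", next(_statuses)),
    interval_seconds=0.01,
    timeout_seconds=5.0,
)
print(final_run.status)  # completed
```

Bounding the loop with a deadline keeps a stuck run from blocking the notebook cell indefinitely. `time.monotonic()` is used so the timeout is unaffected by system clock changes.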

### How to Enable Evaluations

Set `RUN_EVALUATIONS = True` in the daemon run cell to enable evaluations.

In [None]:
# Run the daemon simulation with evaluations
RUN_EVALUATIONS = True  

if not created_agents:
    print("⚠️ No agents created. Run Part 1 first.")
else:
    config = DaemonConfig(
        interval_seconds=5.0,
        calls_per_batch_min=2,
        calls_per_batch_max=4,
        threads=3,
        duration_seconds=60,        # 1 minute simulation
        run_evaluations=RUN_EVALUATIONS,
        evaluation_count=2,        # Run 2 evaluations
    )
    
    daemon = DaemonRunner(agents=created_agents, config=config)
    final_metrics = daemon.run()
    
    # Print final summary
    print("\n" + "=" * 70)
    print("📊 Final Simulation Metrics")
    print("=" * 70)
    print(f"   Total Calls:      {final_metrics.total_calls}")
    print(f"   Successful:       {final_metrics.successful_calls}")
    print(f"   Failed:           {final_metrics.failed_calls}")
    print(f"   Success Rate:     {final_metrics.success_rate:.1f}%")
    print(f"   Avg Latency:      {final_metrics.avg_latency_ms:.0f}ms")
    print(f"   Total Runtime:    {final_metrics.runtime}")
    
    if RUN_EVALUATIONS:
        print("\n🧪 Evaluation Results:")
        print(f"   Evaluation Runs:  {final_metrics.evaluation_runs}")
        print(f"   Results Collected: {len(daemon.evaluation_results)}")


🚀 Starting Daemon Simulation


   📊 Batch 005 | Calls: 16 | Success: 100.0% | Avg Latency: 7246ms | Runtime: 76s
   🧪 Running 2 evaluation(s) with Evals API...

   [1/2] Starting evaluation...
      📝 Evaluation created (id: eval_b19dbb85eba0400997cc435c25bd29f8)
      🚀 Eval run started (id: evalrun_17433f24c8bb4d7184e8dd3d1d8ce61c)
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...
      ⏳ Status: in_progress...


In [None]:
# View evaluation results from Evals API
# Guard against a NameError when no daemon run cell has created `daemon` yet
if "daemon" in globals() and daemon.evaluation_results:
    print("=" * 70)
    print("🧪 Detailed Evaluation Results (Evals API)")
    print("=" * 70)
    
    for i, result in enumerate(daemon.evaluation_results, 1):
        status = "✅" if result.success else "❌"
        print(f"\n{i}. {status} Eval ID: {result.eval_id}")
        print(f"   Run ID: {result.run_id}")
        print(f"   Status: {result.status}")
        print(f"   Latency: {result.latency_ms/1000:.1f}s")
        
        if result.result_counts:
            print(f"   Result Counts: {result.result_counts}")
        
        if result.output_items:
            print(f"   Output Items ({len(result.output_items)}):")
            for j, item in enumerate(result.output_items[:3], 1):  # Show first 3
                print(f"      {j}. {str(item)[:100]}...")
else:
    print("No evaluation results available. Run daemon with RUN_EVALUATIONS=True")

🧪 Detailed Evaluation Results (Evals API)

1. ✅ Eval ID: eval_05f778a7df8247edbdd9c54b3afabeb7
   Run ID: evalrun_2a0b833e5e634dfba535d0c5d0b4a830
   Status: completed
   Latency: 43.3s
   Result Counts: ResultCounts(errored=0, failed=1, passed=2, total=3)
   Output Items (3):
      1. OutputItemListResponse(id='1', created_at=1770193198, datasource_item={'query': 'What is the capital...
      2. OutputItemListResponse(id='2', created_at=1770193198, datasource_item={'query': 'How do I reset my p...
      3. OutputItemListResponse(id='3', created_at=1770193198, datasource_item={'query': 'Explain machine lea...

2. ✅ Eval ID: eval_141a53017905457db97b1e6c1d804396
   Run ID: evalrun_51b91afe8b1e4b55b22a0aac23e27227
   Status: completed
   Latency: 39.0s
   Result Counts: ResultCounts(errored=0, failed=1, passed=2, total=3)
   Output Items (3):
      1. OutputItemListResponse(id='1', created_at=1770193239, datasource_item={'query': 'What is the capital...
      2. OutputItemListResp

## Wrap-up

### Key Takeaways

This notebook demonstrated:

| Feature | Description |
|---------|-------------|
| **Batch Agent Registration** | Created 5 agents with different roles in a single loop |
| **Workflow Registration** | Defined 5 workflows for multi-agent orchestration |
| **Daemon Simulation** | Ran continuous traffic simulation with live metrics |
| **Sample Evaluations** | Ran 2 evaluations through the Evals API with built-in evaluators (when enabled) |
| **Portal Integration** | Agents visible in Azure AI Foundry for monitoring |

### Metrics Summary

| Metric | Description |
|--------|-------------|
| `total_calls` | Total API calls made during simulation |
| `success_rate` | Percentage of successful calls |
| `avg_latency_ms` | Average response time in milliseconds |
| `evaluation_runs` | Number of evaluations executed (if enabled) |
| `runtime` | Total simulation duration |
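
The derived metrics in the table can be computed from a few running counters. The sketch below is an assumption about how such a container might look: `MetricsSketch` and `record` are hypothetical names, and the notebook's own metrics class (defined earlier) may differ, for example in whether failed calls count toward the average latency.

```python
from dataclasses import dataclass


@dataclass
class MetricsSketch:
    """Hypothetical metrics container; not the notebook's actual class."""
    total_calls: int = 0
    successful_calls: int = 0
    failed_calls: int = 0
    total_latency_ms: float = 0.0

    def record(self, success: bool, latency_ms: float) -> None:
        """Update the counters after one agent call."""
        self.total_calls += 1
        self.total_latency_ms += latency_ms
        if success:
            self.successful_calls += 1
        else:
            self.failed_calls += 1

    @property
    def success_rate(self) -> float:
        # Percentage in [0, 100]; 0.0 before any call is recorded.
        if self.total_calls == 0:
            return 0.0
        return 100.0 * self.successful_calls / self.total_calls

    @property
    def avg_latency_ms(self) -> float:
        # Assumption: the average covers every call, successful or not.
        if self.total_calls == 0:
            return 0.0
        return self.total_latency_ms / self.total_calls


metrics = MetricsSketch()
for ok, latency in [(True, 6000.0), (True, 7000.0), (False, 5000.0)]:
    metrics.record(ok, latency)

print(f"Success Rate: {metrics.success_rate:.1f}%")    # Success Rate: 66.7%
print(f"Avg Latency:  {metrics.avg_latency_ms:.0f}ms")  # Avg Latency:  6000ms
```

Guarding the zero-call case matters because the live metrics line can print before any call has been recorded.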

### Next Steps

1. **Enable Evaluations**: Set `RUN_EVALUATIONS = True` to run sample evaluations
2. **Increase Duration**: Change `duration_seconds` for longer simulations
3. **Add Tracing**: Connect Application Insights (see `1_foundry_agent_monitoring.ipynb`)
4. **Scale Up**: Increase `calls_per_batch_max` for higher load testing
5. **View in Portal**: Navigate to Azure AI Foundry to see live traces
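
Before changing `duration_seconds` or `calls_per_batch_max`, it can help to estimate how much traffic a configuration will generate. The helper below is a rough sketch: `estimate_load` is a hypothetical function, and `avg_batch_seconds` is an assumed input that must be measured from a previous run. In the logged run without evaluations, batch 5 finished at 57s after four 5-second sleeps, which suggests roughly (57 - 20) / 5 = 7.4s per batch.

```python
import math
from typing import Tuple


def estimate_load(duration_seconds: float,
                  interval_seconds: float,
                  calls_per_batch_min: int,
                  calls_per_batch_max: int,
                  avg_batch_seconds: float) -> Tuple[int, float]:
    """Estimate (batches, expected calls) for a run-then-sleep loop.

    Each cycle runs one batch and then sleeps for the interval, and a new
    batch starts only while the elapsed time is below the duration.
    """
    cycle_seconds = avg_batch_seconds + interval_seconds
    batches = math.ceil(duration_seconds / cycle_seconds)
    # Batch sizes are drawn uniformly from [min, max], so the mean is the midpoint.
    mean_batch_size = (calls_per_batch_min + calls_per_batch_max) / 2
    return batches, batches * mean_batch_size


batches, expected_calls = estimate_load(
    duration_seconds=60,
    interval_seconds=5.0,
    calls_per_batch_min=2,
    calls_per_batch_max=4,
    avg_batch_seconds=7.4,  # inferred from the logged run; measure your own
)
print(batches, expected_calls)  # 5 15.0
```

With the settings used in this notebook, the estimate matches the logged run (5 batches, 15 calls). Actual numbers vary with agent latency and the random batch sizes.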

## Additional Resources

- [Azure AI Foundry Documentation](https://learn.microsoft.com/en-us/azure/ai-services/agents/)
- [Agent Fleet Simulation Reference](https://github.com/guming3d/AI-Foundry-Agent-Simulation)
- [Azure AI Projects SDK](https://learn.microsoft.com/en-us/python/api/azure-ai-projects/)
- [Application Insights for Tracing](https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview)