# Microsoft Foundry Hosted Agents: Build + Deploy from Notebook

> **Author:** Ozgur Guler | AI Solution Leader, AI Innovation Hub
> **Contact:** [ozgur.guler1@gmail.com](mailto:ozgur.guler1@gmail.com)
> **© 2025 Ozgur Guler. All rights reserved.**

---

This notebook demonstrates the **hosted agents** execution model (containerized, code-first) using the flow from the Microsoft Learn hosted agents page.

## What You'll Learn

1. **Create** agent code (Agent Framework + Azure AI)
2. **Package** as a managed hosted agent
3. **Deploy** via `azd` to Azure AI Foundry
4. **Invoke** the deployed agent endpoint

## Why Hosted Agents?

| Feature | Benefit |
|---------|---------|
| **Containerized** | Your code + dependencies in your container |
| **Versioned** | Deploy/rollback like any managed service |
| **Governed** | Agent ID, Conditional Access, RBAC |
| **Observable** | Consistent traces via Foundry + App Insights |
| **MCP-ready** | Connect to enterprise tools via Model Context Protocol |

## 0) Prerequisites & Configuration

**Before running this notebook**, ensure you have:

1. **Environment variables** set in a `.env` file (or exported):
   - `AZURE_AI_PROJECT_ENDPOINT` or `PROJECT_ENDPOINT`: your Foundry project endpoint URL
   - `MODEL_DEPLOYMENT_NAME`: the name of your deployed model (e.g., `gpt-4o`)
   - `ACR_NAME`: your Azure Container Registry name (without `.azurecr.io`)

2. **Azure CLI logged in**: Run `az login` if not already authenticated

3. **Docker running**: The Docker daemon must be started

The cell below loads these values. If any are missing, you'll need to set them before proceeding to the deployment steps.
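
Before running the deployment cells, it can help to check the configuration up front. The sketch below is one possible check, not part of the notebook's own cells: `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names, and the variable names are the ones listed in the prerequisites above.

```python
import os

# Each tuple lists the acceptable alternatives for one required setting,
# using the variable names from the prerequisites list.
REQUIRED_SETTINGS = [
    ("AZURE_AI_PROJECT_ENDPOINT", "PROJECT_ENDPOINT"),
    ("MODEL_DEPLOYMENT_NAME",),
    ("ACR_NAME",),
]


def missing_settings(env):
    """Return a label for every required setting that is absent or blank in `env`."""
    missing = []
    for alternatives in REQUIRED_SETTINGS:
        # A setting counts as present if any of its alternative names has a value.
        if not any(env.get(name, "").strip() for name in alternatives):
            missing.append(" or ".join(alternatives))
    return missing


problems = missing_settings(os.environ)
if problems:
    print("Missing settings:", ", ".join(problems))
else:
    print("All required settings are present.")
```

Passing the environment as an argument keeps the check easy to exercise with a plain dictionary.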

### ⚠️ Regional Availability: IMPORTANT

**As of 30.12.2025, hosted agents are supported only in North Central US.**

If your Foundry project is in any other region, you'll get the error:
> `"Hosted Agents are not enabled in this region."`

**What to do:**
1. **Create a new Foundry project in North Central US**, then run `create_version(...)` there.
2. **Keep your ACR accessible** to that project. ACR can technically be in another region, but cross-region pulls add latency and potential policy friction, so keeping ACR in North Central US (or nearby) is cleaner.

This is a preview limitation and may expand to other regions in the future.
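
A small guard can catch the region problem before any deployment call is made. This is a sketch under the assumption that you already have the project's location string (for example from the Portal); `SUPPORTED_REGIONS` and both function names are hypothetical, and the set reflects only the limitation stated above.

```python
# Regions where hosted agents are available, per the note above.
# Assumption: update this set if the preview expands.
SUPPORTED_REGIONS = {"northcentralus"}


def normalize_region(name):
    """Turn 'North Central US' or 'northcentralus' into the compact lowercase form."""
    return "".join(name.lower().split())


def supports_hosted_agents(region):
    """Return True if the given region name is in the supported set."""
    return normalize_region(region) in SUPPORTED_REGIONS


for candidate in ("North Central US", "East US"):
    print(candidate, "->", supports_hosted_agents(candidate))
```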

### ⚠️ CRITICAL: ACR Resource Group Requirement

**Your Azure Container Registry (ACR) MUST be in the SAME resource group as your Foundry project.**

This is an undocumented but critical requirement for `azd deploy` to work. If your ACR is in a different resource group:

```
ERROR: The Resource 'Microsoft.ContainerRegistry/registries/yourregistry' 
under resource group 'rg-your-foundry-project' was not found.
```

**How azd discovers ACR:**
- `azd` reads `AZURE_CONTAINER_REGISTRY_ENDPOINT` from your `.azure/<env>/.env` file
- It extracts the registry name and looks for it in the **same resource group** as your Foundry project
- If the ACR exists in a *different* resource group, `azd` will fail with "ParentResourceNotFound"
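
The lookup described above can be pictured with a short sketch. It is an illustration of the observed behavior, not `azd`'s actual implementation; `registry_name` and `expected_acr_resource_id` are hypothetical helpers, and the IDs in the usage lines are placeholders.

```python
ACR_SUFFIX = ".azurecr.io"


def registry_name(endpoint):
    """Extract the registry name from a login server such as 'myacr.azurecr.io'."""
    # Drop an optional scheme and any trailing path.
    host = endpoint.strip().split("://")[-1].split("/")[0]
    if not host.endswith(ACR_SUFFIX):
        raise ValueError(f"Not an ACR login server: {endpoint!r}")
    return host[: -len(ACR_SUFFIX)]


def expected_acr_resource_id(subscription_id, resource_group, endpoint):
    """Build the resource ID that must exist for the lookup described above to succeed."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ContainerRegistry/registries/{registry_name(endpoint)}"
    )


# Placeholder values for illustration only.
print(expected_acr_resource_id("<sub>", "rg-your-foundry-project", "yournewacr.azurecr.io"))
```

If the printed ID does not match where your registry actually lives, the "not found" error shown earlier is the likely result.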

**Solution:**

1. **Create a new ACR in your Foundry resource group:**
   ```bash
   # Find your resource group (from .azure/<env>/.env or Portal)
   RESOURCE_GROUP="rg-your-foundry-project"
   
   # Create ACR in the same resource group
   az acr create --name yournewacr --resource-group $RESOURCE_GROUP --sku Basic
   ```

2. **Update your environment:**
   ```bash
   # Edit .azure/<env>/.env
   AZURE_CONTAINER_REGISTRY_ENDPOINT="yournewacr.azurecr.io"
   ```

3. **Grant AcrPull permission to the Foundry project managed identity:**
   ```bash
   # Find your project's managed identity principal ID
   az cognitiveservices account show \
     --name your-foundry-account \
     --resource-group $RESOURCE_GROUP \
     --query "identity.principalId" -o tsv
   
   # Grant AcrPull role
   az role assignment create \
     --assignee <principal-id-from-above> \
     --role "AcrPull" \
     --scope /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.ContainerRegistry/registries/yournewacr
   ```

**Why this isn't documented:**
This appears to be an implementation detail of how `azd` discovers resources. The Azure CLI and SDK approaches (manual deployment) work with ACRs in any resource group as long as the managed identity has AcrPull permissions.

**TL;DR:** ACR in wrong resource group = `azd deploy` fails. ACR in same resource group + AcrPull role = success.

### Required: Grant OpenAI Permissions to Managed Identity

When your hosted agent calls Azure OpenAI, it uses the **project's managed identity** for authentication. This identity needs the **"Cognitive Services OpenAI User"** role.

**Error you'll see without this permission:**
```
PermissionDenied: The principal `xxx` lacks the required data action 
`Microsoft.CognitiveServices/accounts/OpenAI/deployments/chat/completions/action`
```

**Solution:**

1. Find your project's managed identity principal ID:
```bash
az cognitiveservices account show \
  --name <your-foundry-account-name> \
  --resource-group <your-resource-group> \
  --query "identity.principalId" -o tsv
```

2. Grant the OpenAI User role:
```bash
az role assignment create \
  --assignee <principal-id-from-step-1> \
  --role "Cognitive Services OpenAI User" \
  --scope /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<account-name>
```

**Example for this project:**
```bash
az role assignment create \
  --assignee 1c170f70-01ac-4c0e-b55c-5ad20c7aec71 \
  --role "Cognitive Services OpenAI User" \
  --scope /subscriptions/a20bc194-9787-44ee-9c7f-7c3130e651b6/resourceGroups/rg-ozgurguler-7212/providers/Microsoft.CognitiveServices/accounts/ozgurguler-7212-resource
```

**Summary of required permissions for hosted agents:**
| Permission | Role | Why |
|------------|------|-----|
| Pull container images | AcrPull | To pull your agent container from ACR |
| Call Azure OpenAI | Cognitive Services OpenAI User | To make LLM API calls |
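
Both role assignments in the table follow the same pattern, so they can be generated from one helper. The sketch below only builds and prints the `az role assignment create` commands shown earlier; `REQUIRED_ROLES` and `role_assignment_commands` are hypothetical names, and all IDs are placeholders to replace with your own.

```python
import shlex

# Roles from the summary table, paired with the resource type each one is scoped to.
REQUIRED_ROLES = {
    "AcrPull": "Microsoft.ContainerRegistry/registries",
    "Cognitive Services OpenAI User": "Microsoft.CognitiveServices/accounts",
}


def role_assignment_commands(principal_id, subscription_id, resource_group, resource_names):
    """Build one `az role assignment create` argument list per required role.

    `resource_names` maps each role to the name of the registry or Foundry account.
    """
    commands = []
    for role, resource_type in REQUIRED_ROLES.items():
        scope = (
            f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
            f"/providers/{resource_type}/{resource_names[role]}"
        )
        commands.append([
            "az", "role", "assignment", "create",
            "--assignee", principal_id,
            "--role", role,
            "--scope", scope,
        ])
    return commands


# Placeholder values; nothing is executed, the commands are only printed.
for cmd in role_assignment_commands(
    principal_id="<principal-id>",
    subscription_id="<sub-id>",
    resource_group="<rg>",
    resource_names={
        "AcrPull": "<acr-name>",
        "Cognitive Services OpenAI User": "<account-name>",
    },
):
    print(shlex.join(cmd))
```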


## Deployment Options: azd vs Manual

This notebook supports **two deployment approaches**:

### Option 1: Azure Developer CLI (azd) - RECOMMENDED

The `azd` approach is the **recommended method** for production deployments. It provides:
- **One-command deployment**: `azd up` handles everything
- **Automatic infrastructure**: Provisions ACR, App Insights, RBAC
- **Configuration-driven**: Uses `agent.yaml` for agent definition
- **Version management**: Proper versioning and rollback support

```bash
# Quick start with azd
cd 02-azd-deploy-hosted-agent

# Option A: Initialize for existing Foundry project
azd ai agent init --project-id /subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{project}

# Option B: Start fresh with starter template
azd init -t https://github.com/Azure-Samples/azd-ai-starter-basic
azd ai agent init -m hosted_agent_app/agent.yaml

# Deploy everything
azd up
```

### Option 2: Manual SDK/CLI Approach

The manual approach (used in this notebook) gives you more visibility into each step:
- Explicit Docker build and push
- Direct SDK calls for agent version creation
- CLI commands for start/stop/manage

**Use the manual approach for:**
- Learning and understanding the internals
- Debugging deployment issues
- Environments where `azd` is not available

### Project Structure for azd

```
02-azd-deploy-hosted-agent/
├── azure.yaml              # azd project configuration
├── hosted_agent_app/
│   ├── agent.yaml          # Agent definition for azd
│   ├── main.py             # Agent code (BaseAgent pattern)
│   ├── Dockerfile          # Container definition
│   └── requirements.txt    # Python dependencies
└── af-foundry-agent-hosted.ipynb  # This notebook
```
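
A quick way to confirm a working copy matches this layout is to check for each file before running `azd`. This is a minimal sketch; `EXPECTED_FILES` and `missing_project_files` are hypothetical names, and the paths are the ones in the tree above.

```python
from pathlib import Path

# Relative paths taken from the project tree above.
EXPECTED_FILES = [
    "azure.yaml",
    "hosted_agent_app/agent.yaml",
    "hosted_agent_app/main.py",
    "hosted_agent_app/Dockerfile",
    "hosted_agent_app/requirements.txt",
]


def missing_project_files(root):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


missing = missing_project_files(".")
print("Missing files:", missing if missing else "none")
```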

> **Note**: The cells below demonstrate the **manual approach**. For `azd` deployment, use the CLI commands above.

## Option 1: Deploy with Azure Developer CLI (azd)

The `azd` approach is the **recommended method** for production. Run the cells below to deploy using `azd`.

> **Note**: If you prefer the manual step-by-step approach (for learning or debugging), skip to **Section 1** below.

In [None]:
# Step 1: Check if azd is installed and has the ai agent extension
import subprocess
import shutil

def check_azd():
    """Check azd installation and ai agent extension."""

    # Check if azd is installed
    azd_path = shutil.which("azd")
    if not azd_path:
        print("❌ azd is NOT installed.")
        print("\nInstall azd:")
        print("  macOS:   brew install azd")
        print("  Windows: winget install microsoft.azd")
        print("  Linux:   curl -fsSL https://aka.ms/install-azd.sh | bash")
        print("\nDocs: https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd")
        return False

    # Get azd version
    version = subprocess.check_output(["azd", "version"], text=True).strip()
    print(f"✅ azd installed: {version}")

    # Check for ai agent extension
    try:
        extensions = subprocess.check_output(["azd", "ext", "list"], text=True, stderr=subprocess.STDOUT)
        if "ai" in extensions.lower() or "agent" in extensions.lower():
            print("✅ ai agent extension found")
        else:
            print("⚠️  ai agent extension not found. Installing...")
            subprocess.run(["azd", "ext", "install", "ai"], check=True)
            print("✅ ai agent extension installed")
    except subprocess.CalledProcessError:
        print("⚠️  Could not check extensions. The extension may install automatically.")

    # Check Azure login
    try:
        account = subprocess.check_output(
            ["az", "account", "show", "--query", "name", "-o", "tsv"],
            text=True, stderr=subprocess.DEVNULL
        ).strip()
        print(f"✅ Azure CLI logged in: {account}")
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Catch only "az failed" and "az not installed"; a bare except
        # would also swallow KeyboardInterrupt.
        print("⚠️  Azure CLI not logged in. Run: az login")

    return True

AZD_AVAILABLE = check_azd()

In [None]:
# Step 2: Generate your Foundry Project ID for azd
# This constructs the full Azure Resource ID needed by azd ai agent init

import os
from dotenv import load_dotenv
load_dotenv(override=True)

# Get values from environment or set them here
SUBSCRIPTION_ID = os.getenv("AZ_SUBSCRIPTION_ID") or ""
RESOURCE_GROUP = os.getenv("AZ_RESOURCE_GROUP") or ""
FOUNDRY_ACCOUNT_NAME = os.getenv("FOUNDRY_ACCOUNT_NAME") or ""
FOUNDRY_PROJECT_NAME = os.getenv("FOUNDRY_PROJECT_NAME") or ""

# Build the project resource ID
if all([SUBSCRIPTION_ID, RESOURCE_GROUP, FOUNDRY_ACCOUNT_NAME, FOUNDRY_PROJECT_NAME]):
    PROJECT_RESOURCE_ID = (
        f"/subscriptions/{SUBSCRIPTION_ID}"
        f"/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.CognitiveServices"
        f"/accounts/{FOUNDRY_ACCOUNT_NAME}"
        f"/projects/{FOUNDRY_PROJECT_NAME}"
    )
    print("✅ Project Resource ID generated:\n")
    print(PROJECT_RESOURCE_ID)
    print("\n" + "="*60)
    print("Copy this for the next step, or run the next cell directly.")
else:
    missing = [k for k, v in {
        "AZ_SUBSCRIPTION_ID": SUBSCRIPTION_ID,
        "AZ_RESOURCE_GROUP": RESOURCE_GROUP,
        "FOUNDRY_ACCOUNT_NAME": FOUNDRY_ACCOUNT_NAME,
        "FOUNDRY_PROJECT_NAME": FOUNDRY_PROJECT_NAME,
    }.items() if not v]
    print(f"❌ Missing environment variables: {', '.join(missing)}")
    print("\nSet these in your .env file or run the helper cells in Section 7.")

In [None]:
# Step 3: Initialize azd for your Foundry project
# This configures azd to deploy to your existing Foundry project

import subprocess
import os

# Ensure we're in the right directory.
# Notebooks do not define __file__, so fall back to the current working directory.
NOTEBOOK_DIR = os.path.dirname(os.path.abspath(__file__)) if "__file__" in globals() else os.getcwd()
os.chdir(NOTEBOOK_DIR)

print(f"Working directory: {os.getcwd()}")
print("Agent manifest: hosted_agent_app/agent.yaml")
print()

# Check if PROJECT_RESOURCE_ID exists from previous cell
if 'PROJECT_RESOURCE_ID' not in globals() or not PROJECT_RESOURCE_ID:
    print("❌ Run the previous cell first to generate PROJECT_RESOURCE_ID")
else:
    print("Running: azd ai agent init")
    print(f"  --project-id {PROJECT_RESOURCE_ID}")
    print()
    
    try:
        # Run azd ai agent init
        result = subprocess.run(
            ["azd", "ai", "agent", "init", "--project-id", PROJECT_RESOURCE_ID],
            capture_output=True,
            text=True,
            timeout=120
        )
        
        if result.returncode == 0:
            print("✅ azd initialized successfully!")
            print(result.stdout)
        else:
            print(f"⚠️  azd returned code {result.returncode}")
            print("stdout:", result.stdout)
            print("stderr:", result.stderr)

            if "already initialized" in result.stderr.lower() or "already exists" in result.stderr.lower():
                print("\n✅ Project already initialized. You can proceed to 'azd up'.")

    except FileNotFoundError:
        print("❌ azd not found. Install it first (see Step 1).")
    except subprocess.TimeoutExpired:
        print("⚠️  Command timed out. Run manually in terminal:")
        print(f"  azd ai agent init --project-id {PROJECT_RESOURCE_ID}")

### Step 4: Deploy with `azd deploy`

```bash
cd /path/to/02-azd-deploy-hosted-agent
azd deploy
```

This is the **only supported deployment method**. It handles:
- Building the Docker image
- Pushing to ACR (correct path)
- Creating the agent version
- Starting the container

**Do NOT use manual SDK deployment** - it causes container readiness probe failures.

After successful deployment, you'll see:
```
SUCCESS: Your application was deployed to Azure
- Agent playground (portal): https://ai.azure.com/...
- Agent endpoint: https://...
```

In [None]:
# Step 4: Run azd up to deploy everything
# This cell streams output but terminal is recommended for better visibility

import subprocess
import sys

print("="*60)
print("DEPLOYING WITH azd up")
print("="*60)
print("\nThis will:")
print("  1. Provision infrastructure (ACR, App Insights)")
print("  2. Build container image")
print("  3. Push to ACR")
print("  4. Create hosted agent version")
print("  5. Start deployment")
print("\nThis may take 5-10 minutes...")
print("="*60 + "\n")

try:
    # Run azd up with real-time output streaming
    process = subprocess.Popen(
        ["azd", "up", "--no-prompt"],  # --no-prompt uses defaults
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1
    )
    
    # Stream output line by line
    for line in iter(process.stdout.readline, ''):
        print(line, end='')
        sys.stdout.flush()
    
    process.wait()
    
    if process.returncode == 0:
        print("\n" + "="*60)
        print("✅ DEPLOYMENT SUCCESSFUL!")
        print("="*60)
        print("\nYour hosted agent is now running.")
        print("Use the cells in Section 11 to invoke it, or check the Foundry Portal.")
    else:
        print(f"\n⚠️  azd up exited with code {process.returncode}")
        print("Check the output above for errors.")
        print("\nCommon issues:")
        print("  - Not logged in: run 'azd auth login'")
        print("  - Missing permissions: check RBAC roles")
        print("  - Region not supported: hosted agents only in North Central US")

except FileNotFoundError:
    print("❌ azd not found. Install it first (see Step 1).")
except KeyboardInterrupt:
    print("\n\n⚠️  Deployment interrupted by user.")
    print("You can resume with: azd up")

### Step 5: Manage Deployment with azd

After deploying with `azd up`, you can manage your deployment:

| Command | Purpose |
|---------|---------|
| `azd up` | Deploy/update the agent |
| `azd deploy` | Redeploy without reprovisioning |
| `azd down` | Delete all resources (cleanup) |
| `azd monitor` | Open Application Insights |

**For agent-specific operations** (start/stop/status), use the Azure CLI cells in **Section 10** below.

In [None]:
# (Optional) Cleanup with azd down
# WARNING: This will delete all deployed resources!
# Only run this when you're completely done with the project.

CONFIRM_DELETE = False  # Set to True to enable deletion

if CONFIRM_DELETE:
    import subprocess
    print("⚠️  DELETING ALL RESOURCES with azd down...")
    print("This may take several minutes...\n")

    result = subprocess.run(
        ["azd", "down", "--force", "--purge"],
        capture_output=True,
        text=True
    )

    if result.returncode == 0:
        print("✅ All resources deleted.")
    else:
        print(f"⚠️  azd down returned code {result.returncode}")
        print(result.stderr)
else:
    print("🔒 Cleanup is DISABLED for safety.")
    print("To delete all resources:")
    print("  1. Set CONFIRM_DELETE = True above")
    print("  2. Re-run this cell")
    print("\nOr run in terminal: azd down")

---

## Option 2: Manual Step-by-Step Deployment

The sections below demonstrate the **manual approach** using SDK and CLI commands directly.

**Use this approach if:**
- You want to understand each step in detail
- `azd` is not available in your environment
- You need fine-grained control over the deployment
- You're debugging deployment issues

> **Already deployed with azd?** You can skip to **Section 10** to manage your agent, or **Section 11** to invoke it.

In [1]:
import os
from pathlib import Path

# Minimal config for the early (local) sections.
# Provide Azure/ACR variables when you reach the deploy sections.
APP_DIR = Path("hosted_agent_app")
PROJECT_ENDPOINT = os.getenv("AZURE_AI_PROJECT_ENDPOINT") or os.getenv("PROJECT_ENDPOINT") or ""
MODEL_DEPLOYMENT_NAME = os.getenv("MODEL_DEPLOYMENT_NAME") or os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME") or ""

print("APP_DIR:", APP_DIR)


APP_DIR: hosted_agent_app


## 1) Install Python Dependencies

**What this step does:**
Installs the required Python packages for building and deploying hosted agents.

**Packages installed:**
- `azure-ai-projects`: SDK for interacting with Azure AI Foundry projects, including creating hosted agent versions
- `azure-identity`: Azure authentication (DefaultAzureCredential)
- `python-dotenv`: loads environment variables from `.env` files
- `requests`: HTTP client for REST API calls
- `azure-ai-agentserver-core` / `azure-ai-agentserver-agentframework`: **hosting adapter** packages that wrap your agent code into a Foundry-compatible HTTP service

**Why hosting adapters?**
The hosting adapter transforms your agent logic into a REST service that exposes the `/responses` endpoint. This is the contract Foundry expects from hosted agents: it sends requests to `/responses` and your agent returns responses in a compatible format.

In [2]:
# If you're on a clean env, uncomment.
# %pip install -U pip

# Azure AI Projects SDK (docs show a preview/beta in some sections)
# If you need the exact beta: %pip install --pre azure-ai-projects==2.0.0b2
%pip install -U azure-ai-projects azure-identity python-dotenv requests

# Hosting adapter packages (Python)
# These wrap your agent code into a Foundry-compatible HTTP service.
%pip install -U azure-ai-agentserver-core azure-ai-agentserver-agentframework


Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


## 2) Azure CLI Login + Sanity Checks

**What this step does:**
Verifies that your local environment is correctly configured before proceeding with Azure operations.

**Checks performed:**
1. **Azure subscription**: confirms you're logged in and shows which subscription is active
2. **Azure CLI version**: ensures you have a recent version (hosted agent commands require a recent CLI)
3. **Docker version**: confirms Docker is installed and accessible

**Why this matters:**
- If `az account show` fails → Run `az login` first
- If Docker fails → Start Docker Desktop or the Docker daemon
- Wrong subscription? → Run `az account set --subscription <name-or-id>`

**Alternative approach:**
This notebook uses `az` CLI + SDK. You could also use `azd` (Azure Developer CLI) for a more opinionated workflow, but the SDK approach shown here gives more visibility into each step.
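
Since the `sh()` helper in the next cell raises on the first failing command, a gentler pre-flight can report every missing tool at once. The sketch below uses only the standard library; `REQUIRED_TOOLS` and `missing_tools` are hypothetical names, and the tool list is an assumption based on the checks described above.

```python
import shutil

# Command-line tools the manual flow relies on (assumed from the checks above).
REQUIRED_TOOLS = ["az", "docker"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


not_found = missing_tools()
if not_found:
    print("Install or add to PATH:", ", ".join(not_found))
else:
    print("All required tools are on PATH.")
```

This only confirms the executables exist; it does not verify that you are logged in or that the Docker daemon is running.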

In [3]:
import subprocess

def sh(cmd: str) -> str:
    """Run a shell command and return stdout. Raises on failure."""
    print(f"$ {cmd}")
    out = subprocess.check_output(cmd, shell=True, text=True)
    return out.strip()

print(sh("az account show --query name -o tsv"))
print(sh("az account show --query id -o tsv"))
print(sh("az version -o json | python3 -c \"import sys,json; print(json.load(sys.stdin)['azure-cli'])\""))
print(sh("docker --version"))

$ az account show --query name -o tsv
MCAPS-Hybrid-REQ-102171-2024-ozgurguler
$ az account show --query id -o tsv
a20bc194-9787-44ee-9c7f-7c3130e651b6
$ az version -o json | python3 -c "import sys,json; print(json.load(sys.stdin)['azure-cli'])"
2.81.0
$ docker --version
Docker version 29.1.3, build f52814d454


## 3) Write a Hosted-Agent App (BaseAgent Pattern)

**What this step does:**
Creates the agent application files using the **BaseAgent pattern** recommended by Microsoft.

**Files created in `hosted_agent_app/`:**
- `main.py`: agent code extending BaseAgent (supports sync + streaming)
- `agent.yaml`: configuration for `azd` deployment
- `requirements.txt`: Python dependencies
- `.env`: environment variables for local testing

**Why use BaseAgent instead of a simple function?**

| Feature | Simple Function | BaseAgent Pattern |
|---------|----------------|-------------------|
| Synchronous responses | ✅ | ✅ `run()` |
| Streaming responses | ❌ | ✅ `run_stream()` |
| Thread integration | ❌ | ✅ Automatic |
| Production-ready | ⚠️ Basic | ✅ Full support |

**BaseAgent class structure:**
```python
class MyHostedAgent(BaseAgent):
    async def run(self, messages=None, *, thread=None, **kwargs) -> AgentRunResponse:
        # Non-streaming execution: return the complete response at once
        ...

    async def run_stream(self, messages=None, *, thread=None, **kwargs) -> AsyncIterable[AgentRunResponseUpdate]:
        # Streaming execution (yields chunks)
        ...
```

**Customization:**
Replace the response logic in `run()` and `run_stream()` with your actual agent implementation:
- Call Azure OpenAI for LLM responses
- Use LangChain/LangGraph for complex workflows
- Implement custom business logic
- Integrate external APIs and tools
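
To see the two-method shape in isolation, the sketch below mimics it with plain Python. It deliberately does not import `agent_framework`: `Reply`, `ReplyChunk`, `EchoAgent`, and `collect` are hypothetical stand-ins for illustration, so the real `BaseAgent`, `AgentRunResponse`, and `AgentRunResponseUpdate` types may differ in detail.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class Reply:
    """Hypothetical stand-in for AgentRunResponse."""
    text: str


@dataclass
class ReplyChunk:
    """Hypothetical stand-in for AgentRunResponseUpdate."""
    text: str


class EchoAgent:
    """Hypothetical agent; a real hosted agent would extend BaseAgent."""

    async def run(self, message: str) -> Reply:
        # Replace this line with your model call or business logic.
        return Reply(text=f"You said: {message}")

    async def run_stream(self, message: str) -> AsyncIterator[ReplyChunk]:
        # Stream the same answer word by word.
        full = (await self.run(message)).text
        for word in full.split(" "):
            yield ReplyChunk(text=word + " ")


async def collect(agent, message):
    """Join all streamed chunks back into one string."""
    chunks = [chunk.text async for chunk in agent.run_stream(message)]
    return "".join(chunks).rstrip()


print(asyncio.run(collect(EchoAgent(), "hello hosted agents")))
```

Keeping `run_stream()` built on the same logic as `run()` is one way to make sure both paths return consistent answers.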

In [None]:
APP_DIR.mkdir(exist_ok=True)

# Write main.py - Working ChatAgent with Azure OpenAI
# Key points:
# - Uses BaseAgent pattern with run() AND run_stream()
# - Lazy client initialization (avoids startup failures)
# - Hardcoded endpoint fallback (env vars may not pass through)
# - Uses managed identity via DefaultAzureCredential

(APP_DIR / "main.py").write_text(
    '''\
"""
Hosted Agent - Azure OpenAI Chat

Uses gpt-5-nano for intelligent conversations.
"""

import os
from typing import Any
from collections.abc import AsyncIterable

# Agent Framework imports
from agent_framework import BaseAgent, AgentRunResponse, AgentRunResponseUpdate, AgentThread, ChatMessage

# Hosting adapter
from azure.ai.agentserver.agentframework import from_agent_framework


class ChatAgent(BaseAgent):
    """A chat agent powered by Azure OpenAI."""

    def __init__(self):
        super().__init__(
            name="chat-agent",
            description="A helpful AI assistant powered by Azure OpenAI"
        )
        self._client = None

    def _get_client(self):
        """Lazy initialization of OpenAI client."""
        if self._client is None:
            from openai import AzureOpenAI
            from azure.identity import DefaultAzureCredential, get_bearer_token_provider

            token_provider = get_bearer_token_provider(
                DefaultAzureCredential(),
                "https://cognitiveservices.azure.com/.default"
            )
            self._client = AzureOpenAI(
                # Hardcode endpoint as fallback - env vars may not pass through
                azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT", "https://ozgurguler-7212-resource.openai.azure.com/"),
                azure_ad_token_provider=token_provider,
                api_version="2024-02-15-preview",
            )
        return self._client

    def _get_user_message(self, messages) -> str:
        """Extract user message text."""
        if messages is None:
            return ""
        if isinstance(messages, str):
            return messages
        if isinstance(messages, ChatMessage):
            return messages.text or ""
        if isinstance(messages, list) and len(messages) > 0:
            last = messages[-1]
            return last if isinstance(last, str) else (last.text or "")
        return str(messages)

    async def run(
        self,
        messages: str | ChatMessage | list[str] | list[ChatMessage] | None = None,
        *,
        thread: AgentThread | None = None,
        **kwargs: Any,
    ) -> AgentRunResponse:
        """Get a response from Azure OpenAI."""
        user_msg = self._get_user_message(messages)
        client = self._get_client()
        deployment = os.getenv("MODEL_DEPLOYMENT_NAME", "gpt-5-nano")

        response = client.chat.completions.create(
            model=deployment,
            messages=[
                {"role": "system", "content": "You are a helpful AI assistant."},
                {"role": "user", "content": user_msg}
            ]
        )

        return AgentRunResponse(text=response.choices[0].message.content)

    async def run_stream(
        self,
        messages: str | ChatMessage | list[str] | list[ChatMessage] | None = None,
        *,
        thread: AgentThread | None = None,
        **kwargs: Any,
    ) -> AsyncIterable[AgentRunResponseUpdate]:
        """Stream response from Azure OpenAI."""
        user_msg = self._get_user_message(messages)
        client = self._get_client()
        deployment = os.getenv("MODEL_DEPLOYMENT_NAME", "gpt-5-nano")

        stream = client.chat.completions.create(
            model=deployment,
            messages=[
                {"role": "system", "content": "You are a helpful AI assistant."},
                {"role": "user", "content": user_msg}
            ],
            stream=True
        )

        for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                yield AgentRunResponseUpdate(text=chunk.choices[0].delta.content)


if __name__ == "__main__":
    print("Initializing chat agent...")
    print(f"Project endpoint: {os.getenv(\'AZURE_AI_PROJECT_ENDPOINT\', \'not set\')}")

    agent = ChatAgent()

    print("Starting hosted agent server on port 8088...")
    from_agent_framework(agent).run()
''',
    encoding="utf-8",
)

# Write requirements.txt
(APP_DIR / "requirements.txt").write_text(
    """\
agent-framework-core
azure-ai-agentserver-agentframework
""",
    encoding="utf-8",
)

print("Created main.py and requirements.txt")

## 4) Run Locally + Smoke Test `/responses`

**What this step does:**
Provides commands to test your agent locally before containerizing it.

**Why test locally first?**
- Faster iteration: no need to build/push Docker images
- Easier debugging: full access to logs and stack traces
- Validates the hosting adapter integration

**How to test:**

1. **Start the server** (in a separate terminal):
   ```bash
   cd hosted_agent_app && python main.py
   ```

2. **Send a test request** (from another terminal):
   ```bash
   curl -s http://localhost:8088/responses \
     -H 'Content-Type: application/json' \
     -d '{"input": {"messages": [{"role": "user", "content": "Hello!"}]}}' | jq
   ```

**Expected behavior:**
- Server starts on port 8088
- `/responses` returns a JSON response with your agent's output
- If it works locally, the same code should work when containerized, provided the container has the same dependencies and can obtain Azure credentials

**Troubleshooting:**
- Port in use? Stop the process using port 8088, or change the port in `main.py`
- Import errors? Check package installation
- Auth errors? The local server may need Azure credentials if your agent calls Azure services
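
The same smoke test can be run from Python instead of `curl`. This sketch uses only the standard library and reuses the URL and payload shape from the `curl` command above; `build_request` and `smoke_test` are hypothetical helper names, and the response is returned as parsed JSON without assuming its structure.

```python
import json
import urllib.request


def build_request(prompt, base_url="http://localhost:8088"):
    """Build the POST request that the curl command above sends."""
    payload = {"input": {"messages": [{"role": "user", "content": prompt}]}}
    return urllib.request.Request(
        url=f"{base_url}/responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(prompt, timeout=30):
    """Send the request to the local server and return the parsed JSON body."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Inspect the request without sending it; call smoke_test("Hello!") once the server is up.
request = build_request("Hello!")
print(request.get_method(), request.full_url)
```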

In [None]:
# Start the local server in the background.
# In Jupyter, the simplest approach is to run it in a separate terminal.
# Here we print the command you should run.

print("Run in a terminal:")
print(f"cd {APP_DIR} && python main.py")

print("\nThen test from another terminal:")
print("curl -s http://localhost:8088/responses -H 'Content-Type: application/json' \\")
print("  -d '{\"input\": {\"messages\": [{\"role\": \"user\", \"content\": \"Where is Seattle?\"}]}}' | jq")

In [None]:
# Image naming (used for local build and ACR push)
IMAGE_NAME = os.getenv("IMAGE_NAME") or "my-hosted-agent"
IMAGE_TAG = os.getenv("IMAGE_TAG") or "v3"  # v3: BaseAgent pattern with streaming support
print("IMAGE:", f"{IMAGE_NAME}:{IMAGE_TAG}")

## 5) Create a Dockerfile + Build the Image

**What this step does:**
1. Creates a `Dockerfile` that packages your agent app into a container
2. Builds the Docker image locally

**Why containerize?**
Azure AI Foundry hosted agents run as containers. Containerization ensures:
- **Reproducibility**: same environment locally and in Azure
- **Isolation**: your dependencies don't conflict with others
- **Portability**: deploy anywhere that runs containers

**Dockerfile breakdown:**
```dockerfile
FROM python:3.11-slim          # Base image with Python
WORKDIR /app                   # Set working directory
COPY . /app                    # Copy your agent code
RUN pip install ...            # Install dependencies
EXPOSE 8088                    # Declare the port (documentation)
CMD ["python", "main.py"]      # Start your agent server
```

**Image naming:**
- `IMAGE_NAME`: the name of your image (default: `my-hosted-agent`)
- `IMAGE_TAG`: version tag (default in this notebook: `v3`, unless overridden by the `IMAGE_TAG` environment variable)
- Full local reference: `my-hosted-agent:v3`

**Tip:** Increment `IMAGE_TAG` (e.g., `v2`, `v3`) when you make changes to avoid caching issues.
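
If you follow the `v1`, `v2`, `v3` convention, the next tag can be computed rather than edited by hand. This is a small sketch assuming tags always look like `v<number>`; `next_tag` is a hypothetical helper, not something the notebook defines.

```python
import re


def next_tag(tag):
    """Return the tag that follows a 'v<number>' tag, e.g. 'v3' -> 'v4'."""
    match = re.fullmatch(r"v(\d+)", tag)
    if match is None:
        raise ValueError(f"Expected a tag like 'v3', got {tag!r}")
    return f"v{int(match.group(1)) + 1}"


print(next_tag("v3"))
```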

In [None]:
# Dockerfile using requirements.txt and main.py (BaseAgent pattern)
(APP_DIR / "Dockerfile").write_text(
"""\
FROM python:3.11-slim

WORKDIR /app

# Copy application code
COPY . /app

# Install dependencies from requirements.txt
RUN pip install --no-cache-dir -U pip \\
    && pip install --no-cache-dir -r requirements.txt

# Expose the hosting adapter port
EXPOSE 8088

# Run the agent (main.py uses BaseAgent pattern)
CMD ["python", "main.py"]
""",
    encoding="utf-8",
)

print("Dockerfile written.")
print("Building docker image...")
sh(f"docker build -t {IMAGE_NAME}:{IMAGE_TAG} {APP_DIR}")
print("Local image built:", f"{IMAGE_NAME}:{IMAGE_TAG}")

## 6) Push Image to Azure Container Registry (ACR)

**What this step does:**
1. Configures the ACR connection using your `ACR_NAME`
2. Logs into ACR using Azure CLI credentials
3. Tags your local image with the ACR path
4. Pushes the image to ACR

**Why ACR?**
Azure AI Foundry can only pull container images from Azure Container Registry. When you create a hosted agent version, you provide an ACR image reference, and Foundry pulls from there.

**Image reference format:**
```
<acr-name>.azurecr.io/<image-name>:<tag>
Example: myregistry.azurecr.io/my-hosted-agent:v1
```
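
Because a stray `.azurecr.io` suffix in `ACR_NAME` is an easy mistake, the reference can be built by a helper that tolerates it. This is a sketch under the format shown above; `image_ref` is a hypothetical function, and the alphanumeric check reflects ACR's naming rule for registry names.

```python
ACR_SUFFIX = ".azurecr.io"


def image_ref(acr_name, image_name, tag):
    """Build '<acr-name>.azurecr.io/<image-name>:<tag>' from its parts."""
    name = acr_name.strip().lower()
    # Accept the full login server too, and reduce it to the bare registry name.
    if name.endswith(ACR_SUFFIX):
        name = name[: -len(ACR_SUFFIX)]
    if not name.isalnum():
        raise ValueError(f"Invalid registry name: {acr_name!r}")
    return f"{name}{ACR_SUFFIX}/{image_name}:{tag}"


print(image_ref("myregistry", "my-hosted-agent", "v1"))
```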

**Prerequisites:**
- `ACR_NAME` environment variable set (just the registry name, not the full `.azurecr.io` URL)
- You must have `AcrPush` role on the registry (or be an Owner/Contributor)

**What happens behind the scenes:**
1. `az acr login`: gets a temporary Docker credential for your registry
2. `docker tag`: creates an alias pointing to ACR
3. `docker push`: uploads image layers to ACR

**Troubleshooting:**
- "unauthorized" → Check your Azure login and ACR permissions
- "not found" → Verify ACR_NAME is correct (no `.azurecr.io` suffix)

In [8]:
# Reload .env to pick up any changes
from dotenv import load_dotenv
load_dotenv(override=True)

# ACR config (required to push)
# NOTE: ACR_NAME should be just the registry name, e.g. "myregistry" (NOT "myregistry.azurecr.io")
ACR_NAME = os.getenv("ACR_NAME") or ""
if not ACR_NAME:
    raise ValueError("Set ACR_NAME in .env (e.g. ACR_NAME=myregistry, without .azurecr.io)")
ACR_LOGIN_SERVER = f"{ACR_NAME}.azurecr.io"
IMAGE_REF = f"{ACR_LOGIN_SERVER}/{IMAGE_NAME}:{IMAGE_TAG}"

# TIP: If you need to fix container code and redeploy:
#   1. Fix the code in hosted_agent_app/
#   2. Change IMAGE_TAG above (e.g., "v1" -> "v2")
#   3. Re-run Docker build and push cells
#   4. Create a new agent version with the new IMAGE_REF

print("IMAGE_REF:", IMAGE_REF)
print(f"(To change version, edit IMAGE_TAG in cell above, currently: {IMAGE_TAG})")

IMAGE_REF: containervault01.azurecr.io/my-hosted-agent:v1
(To change version, edit IMAGE_TAG in cell above, currently: v1)
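The redeploy tip in the cell above bumps `IMAGE_TAG` by hand (for example `"v1"` to `"v2"`). A minimal sketch of automating that step follows; `next_image_tag` is a hypothetical helper and assumes your tags end in an integer.

```python
import re


def next_image_tag(tag: str) -> str:
    """Increment the trailing integer of a tag, e.g. 'v1' -> 'v2'."""
    match = re.fullmatch(r"(.*?)(\d+)", tag)
    if not match:
        raise ValueError(f"Tag {tag!r} does not end in a number")
    prefix, number = match.groups()
    return f"{prefix}{int(number) + 1}"


print(next_image_tag("v1"))
```

After bumping the tag you would still re-run the build, push, and `create_version` cells so that the new `IMAGE_REF` is what Foundry pulls.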


### Push the Container Image to ACR

**What this cell does:**
1. **Logs in to ACR** using `az acr login` (authenticates Docker with your registry)
2. **Tags the local image** with the ACR registry path
3. **Pushes the image** to ACR so Foundry can pull it later

**Prerequisites:**
- `ACR_NAME` must be set in your `.env` (just the registry name, e.g., `myregistry`)
- Docker must be running
- You must have `AcrPush` permissions on the registry

**What happens behind the scenes:**
```
Local: my-hosted-agent:v1
  ↓ docker tag
ACR:   myregistry.azurecr.io/my-hosted-agent:v1
  ↓ docker push
Azure: Image now available for Foundry to pull
```

In [9]:
# Log in to ACR and push the image
print("Logging into ACR...")
print(f"$ az acr login --name {ACR_NAME}")
sh(f"az acr login --name {ACR_NAME}")

print("Tagging + pushing...")
print(f"$ docker tag {IMAGE_NAME}:{IMAGE_TAG} {IMAGE_REF}")
sh(f"docker tag {IMAGE_NAME}:{IMAGE_TAG} {IMAGE_REF}")

print(f"$ docker push {IMAGE_REF}")
sh(f"docker push {IMAGE_REF}")

print(f"Pushed: {IMAGE_REF}")

Logging into ACR...
$ az acr login --name containervault01
$ az acr login --name containervault01
Tagging + pushing...
$ docker tag my-hosted-agent:v1 containervault01.azurecr.io/my-hosted-agent:v1
$ docker tag my-hosted-agent:v1 containervault01.azurecr.io/my-hosted-agent:v1
$ docker push containervault01.azurecr.io/my-hosted-agent:v1
$ docker push containervault01.azurecr.io/my-hosted-agent:v1
Pushed: containervault01.azurecr.io/my-hosted-agent:v1


## 7) Grant ACR Pull Permissions to the Foundry Project Managed Identity

**What this step does:**
Grants the Azure AI Foundry project's managed identity permission to pull images from your ACR.

**Why is this needed?**
When Foundry deploys your hosted agent, it needs to pull your container image from ACR. This requires:
1. The Foundry project has a **system-assigned managed identity**
2. That identity has **AcrPull** role on your ACR

**How to find the Principal ID:**
1. Go to Azure Portal → Your Foundry project resource
2. Navigate to **Identity** → **System assigned**
3. Copy the **Object (principal) ID**
4. Set it as `FOUNDRY_PROJECT_PRINCIPAL_ID` in your `.env`

**What the code does:**
```bash
az role assignment create \
  --assignee-object-id <principal-id> \
  --assignee-principal-type ServicePrincipal \
  --role 'AcrPull' \
  --scope <acr-resource-id>
```

**Common issues:**
- **"Principal not found"** → Double-check the principal ID from the portal
- **"Authorization failed"** → You need Owner or User Access Administrator role
- **Already assigned?** → The command is idempotent; re-running is safe

**Note:** This is a one-time setup per project/ACR combination.
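As a rough sketch, the role assignment and a follow-up check can be built as argument lists rather than interpolated shell strings. The helper name `acr_pull_commands` and the placeholder IDs are hypothetical; the sketch only prints the commands and does not run them.

```python
import shlex


def acr_pull_commands(principal_id: str, acr_id: str) -> dict:
    """Build (but do not run) the az commands to grant and verify AcrPull."""
    create = [
        "az", "role", "assignment", "create",
        "--assignee-object-id", principal_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "AcrPull",
        "--scope", acr_id,
    ]
    # Listing assignments afterwards confirms the role is present at that scope.
    verify = [
        "az", "role", "assignment", "list",
        "--assignee", principal_id,
        "--role", "AcrPull",
        "--scope", acr_id,
        "-o", "table",
    ]
    return {"create": create, "verify": verify}


# Placeholder values for illustration only.
cmds = acr_pull_commands(
    "00000000-0000-0000-0000-000000000000",
    "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.ContainerRegistry/registries/<acr>",
)
for label, argv in cmds.items():
    print(f"{label}: {shlex.join(argv)}")
```

Passing a list to `subprocess.run` (instead of a formatted string) sidesteps quoting problems if an ID or scope ever contains shell-special characters.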

In [10]:
# ===== YOU MUST SET THIS =====
# This is the managed identity principal object id for your Foundry project.
# In portal: Foundry project -> Identity -> System assigned -> Object (principal) ID
FOUNDRY_PROJECT_PRINCIPAL_ID = os.getenv("FOUNDRY_PROJECT_PRINCIPAL_ID") or ""
if not FOUNDRY_PROJECT_PRINCIPAL_ID:
    print("Set FOUNDRY_PROJECT_PRINCIPAL_ID env var before running this cell.")
else:
    # Get ACR resource ID
    acr_id = sh(f"az acr show -n {ACR_NAME} --query id -o tsv")
    print("ACR ID:", acr_id)

    # Assign pull permissions
    # Role name may vary: this cell assigns 'AcrPull', while some docs reference
    # 'Container Registry Repository Reader'. If the role name isn't recognized,
    # list the roles available on your ACR and pick the correct one.
    sh(
        f"az role assignment create --assignee-object-id {FOUNDRY_PROJECT_PRINCIPAL_ID} "
        f"--assignee-principal-type ServicePrincipal "
        f"--role 'AcrPull' --scope {acr_id}"
    )
    print("Assigned AcrPull to Foundry project identity.")


Set FOUNDRY_PROJECT_PRINCIPAL_ID env var before running this cell.


In [11]:
# Step 1: Get your subscription ID
print("Your subscription ID:")
print(sh("az account show --query id -o tsv"))

Your subscription ID:
$ az account show --query id -o tsv
a20bc194-9787-44ee-9c7f-7c3130e651b6


In [12]:
# Step 2: Find your Foundry account and resource group
# Look for the account that matches your PROJECT_ENDPOINT subdomain
print("Your AI Services accounts (find the one matching your PROJECT_ENDPOINT):\n")
print(sh("az cognitiveservices account list -o table"))
print("\n" + "="*80)
print("HOW TO USE:")
print("1. Look at your PROJECT_ENDPOINT (e.g., https://ACCOUNT_NAME.services.ai.azure.com/...)")
print("2. Find that ACCOUNT_NAME in the 'Name' column above")
print("3. Copy the corresponding 'ResourceGroup' value")
print("="*80)

Your AI Services accounts (find the one matching your PROJECT_ENDPOINT):

$ az cognitiveservices account list -o table
Kind            Location        Name                              ResourceGroup
--------------  --------------  --------------------------------  ---------------------
AIServices      eastus          agent-ai-servicesgezg             rg-openai
AIServices      eastus          agent-ai-servicesjq3h             rg-openai
AIServices      eastus          agent-ai-services7fxt             rg-openai
AIServices      eastus2         ai-ozgurgulerai5658070475260732   rg-ozgurgulerai
AIServices      eastus2         ai-eastus2hubozguler527669401205  rg-openai
AIServices      eastus2         ai-hubx611882637128               rg-openai
AIServices      eastus2         ai-hubxx118150369322              rg-ozgurguler-3950_ai
FormRecognizer  uksouth         docint-ozguler                    rg_xbip
AIServices      northcentralus  ozgur-m3q1pn4n-northcentralus     rg-openai
AIServices   

In [13]:
# Step 3: Extract values from your PROJECT_ENDPOINT automatically
import re
endpoint = PROJECT_ENDPOINT or os.getenv("AZURE_AI_PROJECT_ENDPOINT") or ""

if endpoint:
    # Extract account name from endpoint URL
    match = re.search(r'https://([^.]+)\.services\.ai\.azure\.com/api/projects/([^/]+)', endpoint)
    if match:
        account_name = match.group(1)
        project_name = match.group(2)
        print(f"Detected from your PROJECT_ENDPOINT:")
        print(f"  FOUNDRY_ACCOUNT_NAME = {account_name}")
        print(f"  FOUNDRY_PROJECT_NAME = {project_name}")
        print(f"\nAdd these to your .env file!")
    else:
        print("Could not parse endpoint. Set manually.")
else:
    print("PROJECT_ENDPOINT not set. Run cell 2 first, or set AZURE_AI_PROJECT_ENDPOINT in .env")

Detected from your PROJECT_ENDPOINT:
  FOUNDRY_ACCOUNT_NAME = ozgurguler-7212-resource
  FOUNDRY_PROJECT_NAME = ozgurguler-7212

Add these to your .env file!


## 8) Capability hosts

> Note
> This section refers to the Microsoft Foundry (new) portal.

> Note
> Updating capability hosts is not supported. To modify a capability host, you must delete the existing one and recreate it with the new configuration.

Capability hosts are sub-resources that you define at both the **Foundry account** and **Foundry project** scopes. They specify where the Foundry Agent Service should store and process your agent data, including:

- Conversation history (threads)
- File uploads
- Vector stores

### Why use capability hosts?
Capability hosts allow you to bring your own Azure resources instead of using the default Microsoft-managed platform resources. This gives you:

- Data sovereignty: keep all agent data within your Azure subscription
- Security control: use your own storage accounts, databases, and search services
- Compliance: meet specific regulatory or organizational requirements

### How do capability hosts work?
Creating capability hosts is not required. If you don't create an **account-level and project-level** capability host, the Agent Service automatically uses Microsoft-managed Azure resources for:

- Thread storage (conversation history, agent definitions)
- File storage (uploaded documents)
- Vector search (embeddings and retrieval)

When you create capability hosts at both the **account** and **project** levels, all agent data is stored and processed using your own Azure resources within your subscription (a "standard agent setup").

> Note
> All Foundry workspace resources should be in the same region as the VNet, including Cosmos DB, Storage Account, AI Search, Foundry Account, Project, and Managed Identity.

### Configuration hierarchy
Capability hosts follow a hierarchy where more specific configurations override broader ones:

- Service defaults (Microsoft-managed search and storage): used when no capability host is configured
- Account-level capability host: shared defaults for all projects under the account
- Project-level capability host: overrides account-level and service defaults for that specific project
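The precedence described above can be pictured with a small resolver. This is only a mental model: `effective_capability_host` and `SERVICE_DEFAULTS` are hypothetical names, and the sketch assumes the most specific configured scope wins as a whole rather than being merged property by property.

```python
# Stand-in for the Microsoft-managed defaults (illustrative value only).
SERVICE_DEFAULTS = {"storage": "microsoft-managed"}


def effective_capability_host(account_host=None, project_host=None):
    """Return (scope, config) for the most specific configured scope."""
    if project_host:
        return "project", project_host
    if account_host:
        return "account", account_host
    return "service-default", SERVICE_DEFAULTS


account = {"storage": "shared-storage-connection"}
project = {"storage": "my-storage-connection"}

print(effective_capability_host())                  # nothing configured
print(effective_capability_host(account))           # account only
print(effective_capability_host(account, project))  # project overrides account
```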

### Understand capability host constraints
- One capability host per scope: each account and each project can only have one active capability host. Creating a second with a different name at the same scope will result in a `409 Conflict`.
- Configuration updates are not supported: delete and recreate the capability host to change configuration.

### Recommended setup

Required properties (at either the account or project level):

| Property | Purpose | Required Azure resource | Example connection name |
| --- | --- | --- | --- |
| `threadStorageConnections` | Stores agent definitions, conversation history and chat threads | Azure Cosmos DB | `my-cosmosdb-connection` |
| `vectorStoreConnections` | Handles vector storage for retrieval and search | Azure AI Search | `my-ai-search-connection` |
| `storageConnections` | Manages file uploads and blob storage | Azure Storage Account | `my-storage-connection` |

Optional property:
- `aiServicesConnections`: use your own model deployments (Azure OpenAI)
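A minimal sketch of assembling a request body from these properties follows. The helper `capability_host_body` is hypothetical; it targets the bring-your-own-resources setup, so it insists on all three connection properties and treats `aiServicesConnections` as optional. The connection names are the example values from the table.

```python
import json
from typing import Optional

REQUIRED = ("threadStorageConnections", "vectorStoreConnections", "storageConnections")


def capability_host_body(connections: dict, ai_services_connection: Optional[str] = None) -> dict:
    """Build the JSON body for a capability host that uses your own resources."""
    missing = [key for key in REQUIRED if not connections.get(key)]
    if missing:
        raise ValueError("Missing required connections: " + ", ".join(missing))
    properties = {"capabilityHostKind": "Agents"}
    for key in REQUIRED:
        # Each property takes a list of connection names.
        properties[key] = [connections[key]]
    if ai_services_connection:
        properties["aiServicesConnections"] = [ai_services_connection]
    return {"properties": properties}


body = capability_host_body({
    "threadStorageConnections": "my-cosmosdb-connection",
    "vectorStoreConnections": "my-ai-search-connection",
    "storageConnections": "my-storage-connection",
})
print(json.dumps(body, indent=2))
```

Validating locally before the PUT gives a clearer error than a `400` from the management API when one connection name is missing.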

### Management API examples (ARM)

Account capability host:
```
PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/capabilityHosts/{name}?api-version=2025-06-01

{
  "properties": {
    "capabilityHostKind": "Agents"
  }
}
```

Optional: account-level defaults with project overrides:
```
PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/capabilityHosts/{name}?api-version=2025-06-01

{
  "properties": {
    "capabilityHostKind": "Agents",
    "threadStorageConnections": ["shared-cosmosdb-connection"],
    "vectorStoreConnections": ["shared-ai-search-connection"],
    "storageConnections": ["shared-storage-connection"]
  }
}
```

Project capability host (overrides service defaults and any account-level settings):
```
PUT https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/projects/{projectName}/capabilityHosts/{name}?api-version=2025-06-01

{
  "properties": {
    "capabilityHostKind": "Agents",
    "threadStorageConnections": ["my-cosmos-db-connection"],
    "vectorStoreConnections": ["my-ai-search-connection"],
    "storageConnections": ["my-storage-account-connection"],
    "aiServicesConnections": ["my-azure-openai-connection"]
  }
}
```

Delete capability hosts (impacts dependent agents):

> Warning
> Deleting a capability host affects all agents that depend on it. If you delete the project and/or account capability host, agents may lose access to the files, threads, and vector stores they previously used.
```
DELETE https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/capabilityHosts/{name}?api-version=2025-06-01
DELETE https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/projects/{projectName}/capabilityHosts/{name}?api-version=2025-06-01
```

Validation (list existing capability hosts):
```
GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/capabilityHosts?api-version=2025-06-01
GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.CognitiveServices/accounts/{accountName}/projects/{projectName}/capabilityHosts?api-version=2025-06-01
```
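The account-scope and project-scope URLs above differ only in the optional `/projects/{projectName}` segment and the optional host name. A hedged sketch of one builder for both follows; `capability_host_url` is a hypothetical helper, and the API version is simply the one used in the examples above, which may change while the feature is in preview.

```python
ARM = "https://management.azure.com"
API_VERSION = "2025-06-01"  # taken from the examples above; may change in preview


def capability_host_url(subscription_id, resource_group, account_name,
                        project_name=None, host_name=None):
    """Build a capability host URL for account or project scope."""
    url = (
        f"{ARM}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
    )
    if project_name:
        url += f"/projects/{project_name}"
    url += "/capabilityHosts"
    if host_name:
        # With a name: single resource (PUT/DELETE). Without: list (GET).
        url += f"/{host_name}"
    return f"{url}?api-version={API_VERSION}"


# Placeholder identifiers for illustration only.
print(capability_host_url("<sub>", "<rg>", "<account>"))
print(capability_host_url("<sub>", "<rg>", "<account>", "<project>", "<name>"))
```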

Troubleshooting quick checks:
- `409 Conflict` typically means a capability host already exists at that scope (use it, or delete/recreate).
- If another operation is in progress, wait and retry (conflicts can be transient).
- Idempotency: same name + same configuration returns the existing resource; same name + different configuration returns `400`; different name returns `409`.
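The idempotency rules in the last bullet can be restated as a small decision function. This models the documented behavior for reasoning purposes only and is not the service's implementation; `expected_put_outcome` is a hypothetical name, and the "created" outcome for an empty scope is an assumption.

```python
def expected_put_outcome(existing, name, config):
    """Predict the result of a capability host PUT at one scope.

    `existing` is None or a (name, config) tuple for the host already there.
    """
    if existing is None:
        return "created"  # assumption: nothing exists at this scope yet
    existing_name, existing_config = existing
    if name != existing_name:
        return "409"      # one capability host per scope
    if config == existing_config:
        return "existing resource returned"
    return "400"          # same name, different configuration


current = ("caphost", {"storageConnections": ["my-storage-connection"]})
print(expected_put_outcome(current, "caphost", {"storageConnections": ["my-storage-connection"]}))
print(expected_put_outcome(current, "caphost", {"storageConnections": ["other"]}))
print(expected_put_outcome(current, "another-name", {}))
```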

> Note
> API versions and required fields can change (preview). If you hit `404`/`400`, compare the error payload with the current portal/REST docs for your region/tenant.

**Common errors:**
- 403/401 → You need Contributor role on the Foundry account
- "Update not supported" → This is expected: updates are not supported; delete/recreate instead

In [14]:
# Reload .env to pick up any changes you made
from dotenv import load_dotenv
load_dotenv(override=True)

# Azure resource context (required for management-plane + CLI operations)
SUBSCRIPTION_ID = os.getenv("AZ_SUBSCRIPTION_ID") or ""
RESOURCE_GROUP  = os.getenv("AZ_RESOURCE_GROUP") or ""
FOUNDRY_ACCOUNT_NAME = os.getenv("FOUNDRY_ACCOUNT_NAME") or ""
FOUNDRY_PROJECT_NAME = os.getenv("FOUNDRY_PROJECT_NAME") or ""

missing = [k for k, v in {
    "AZ_SUBSCRIPTION_ID": SUBSCRIPTION_ID,
    "AZ_RESOURCE_GROUP": RESOURCE_GROUP,
    "FOUNDRY_ACCOUNT_NAME": FOUNDRY_ACCOUNT_NAME,
    "FOUNDRY_PROJECT_NAME": FOUNDRY_PROJECT_NAME,
}.items() if not v]
if missing:
    raise ValueError("Missing required config for deploy sections: " + ", ".join(missing))

print("Azure context OK.")
print(f"  Subscription: {SUBSCRIPTION_ID}")
print(f"  Resource Group: {RESOURCE_GROUP}")
print(f"  Account: {FOUNDRY_ACCOUNT_NAME}")
print(f"  Project: {FOUNDRY_PROJECT_NAME}")

Azure context OK.
  Subscription: a20bc194-9787-44ee-9c7f-7c3130e651b6
  Resource Group: rg-ozgurguler-7212
  Account: ozgurguler-7212-resource
  Project: ozgurguler-7212


### Why do we need an ARM token and Capability Host?

**Capability hosts** are ARM (management-plane) resources under `management.azure.com`. If you need to list/create/delete them (account-level and/or project-level), you must authenticate to ARM with an access token.

**What the capability host calls involve:**
1. **Getting an ARM token**: Azure Resource Manager (ARM) is the management plane for Azure. You need an access token to make API calls that create, list, or delete capability hosts.
2. **Calling the management API**: Use `az rest` against the capability host endpoints (PUT/GET/DELETE).

The cell below only validates the Azure resource context that those calls need.

**Why it's needed:**
- Capability host operations are not available via the data-plane `services.ai.azure.com` endpoint.
- Updates aren't supported; treat changes as delete + recreate.
- Some hosted-agents preview flows still require one-time setup at the account level before `create_version()` succeeds.
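A minimal sketch of the two CLI calls involved, built as argument lists and printed rather than executed. The helper names are hypothetical. `az rest` signs requests to `management.azure.com` with your logged-in credential, so the explicit token command is mainly useful if you call the API from another HTTP client.

```python
import shlex

API_VERSION = "2025-06-01"  # from the examples above; may change in preview


def arm_token_command():
    """az command that prints an ARM access token for the current login."""
    return ["az", "account", "get-access-token",
            "--resource", "https://management.azure.com/",
            "--query", "accessToken", "-o", "tsv"]


def list_capability_hosts_command(subscription_id, resource_group, account_name):
    """az rest command that lists account-level capability hosts."""
    url = (
        f"https://management.azure.com/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.CognitiveServices/accounts/{account_name}"
        f"/capabilityHosts?api-version={API_VERSION}"
    )
    return ["az", "rest", "--method", "get", "--url", url]


# Placeholder identifiers for illustration only.
print(shlex.join(arm_token_command()))
print(shlex.join(list_capability_hosts_command("<sub>", "<rg>", "<account>")))
```

To run either command from the notebook, pass the list to `subprocess.run(argv, check=True, capture_output=True, text=True)`.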

In [15]:
# Azure resource context (required for management-plane + CLI operations)
SUBSCRIPTION_ID = os.getenv("AZ_SUBSCRIPTION_ID") or ""
RESOURCE_GROUP  = os.getenv("AZ_RESOURCE_GROUP") or ""
FOUNDRY_ACCOUNT_NAME = os.getenv("FOUNDRY_ACCOUNT_NAME") or ""
FOUNDRY_PROJECT_NAME = os.getenv("FOUNDRY_PROJECT_NAME") or ""

missing = [k for k, v in {
    "AZ_SUBSCRIPTION_ID": SUBSCRIPTION_ID,
    "AZ_RESOURCE_GROUP": RESOURCE_GROUP,
    "FOUNDRY_ACCOUNT_NAME": FOUNDRY_ACCOUNT_NAME,
    "FOUNDRY_PROJECT_NAME": FOUNDRY_PROJECT_NAME,
}.items() if not v]
if missing:
    raise ValueError("Missing required config for deploy sections: " + ", ".join(missing))

print("Azure context OK.")


Azure context OK.


## 9) Create Hosted Agent Version (ImageBasedHostedAgentDefinition)


In [16]:
# Hosted agent config
HOSTED_AGENT_NAME = os.getenv("HOSTED_AGENT_NAME") or "my-hosted-agent"
HOSTED_CPU = os.getenv("HOSTED_CPU") or "2"
HOSTED_MEMORY = os.getenv("HOSTED_MEMORY") or "4Gi"
CONTAINER_PROTOCOL_VERSION = os.getenv("CONTAINER_PROTOCOL_VERSION") or "1"

# Normalize memory to Gi units expected by the service
if not HOSTED_MEMORY.endswith("Gi"):
    HOSTED_MEMORY = f"{HOSTED_MEMORY}Gi"

print("HOSTED_AGENT_NAME:", HOSTED_AGENT_NAME)
print("HOSTED_CPU:", HOSTED_CPU)
print("HOSTED_MEMORY:", HOSTED_MEMORY)
print("CONTAINER_PROTOCOL_VERSION:", CONTAINER_PROTOCOL_VERSION)


HOSTED_AGENT_NAME: my-hosted-agent
HOSTED_CPU: 2
HOSTED_MEMORY: 4Gi
CONTAINER_PROTOCOL_VERSION: 1
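The memory normalization in the cell above appends `Gi` to anything that doesn't already end with it, so a value such as `4096Mi` would become `4096MiGi`. A stricter variant is sketched below; `normalize_memory` is a hypothetical helper, and it assumes only bare numbers and `Gi` values are acceptable.

```python
import re


def normalize_memory(value: str) -> str:
    """Return a memory string in Gi units, rejecting other suffixes."""
    text = value.strip()
    if re.fullmatch(r"\d+(\.\d+)?", text):
        return f"{text}Gi"   # bare number: assume gibibytes
    if re.fullmatch(r"\d+(\.\d+)?Gi", text):
        return text          # already in the expected form
    raise ValueError(f"Unsupported memory value {value!r}; use e.g. '4' or '4Gi'")


print(normalize_memory("4"))
print(normalize_memory("4Gi"))
```

Failing fast here surfaces a bad `HOSTED_MEMORY` value in the notebook instead of as a rejected `create_version` call.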


In [17]:
# Hosted agents SDK models require azure-ai-projects >= 1.0.0b11 (or 2.0.0b2 for some features)
# If you get ImportError, upgrade: pip install --pre 'azure-ai-projects>=1.0.0b11'

try:
    from azure.identity import DefaultAzureCredential
    from azure.ai.projects import AIProjectClient
    from azure.ai.projects.models import (
        ImageBasedHostedAgentDefinition,
        ProtocolVersionRecord,
        AgentProtocol,
    )
except ImportError as e:
    print(f"Import error: {e}")
    print("\nHosted agent models not found in your SDK version.")
    print("Upgrade with: pip install --pre 'azure-ai-projects>=1.0.0b11'")
    raise

# Hosted agent config (define here to avoid cell ordering issues)
HOSTED_AGENT_NAME = os.getenv("HOSTED_AGENT_NAME") or "my-hosted-agent"
HOSTED_CPU = os.getenv("HOSTED_CPU") or "2"
HOSTED_MEMORY = os.getenv("HOSTED_MEMORY") or "4Gi"
CONTAINER_PROTOCOL_VERSION = os.getenv("CONTAINER_PROTOCOL_VERSION") or "1"

# Normalize memory to Gi units expected by the service
if not HOSTED_MEMORY.endswith("Gi"):
    HOSTED_MEMORY = f"{HOSTED_MEMORY}Gi"

print("Creating agent version with:")
print(f"  HOSTED_AGENT_NAME: {HOSTED_AGENT_NAME}")
print(f"  IMAGE_REF: {IMAGE_REF}")
print(f"  CPU: {HOSTED_CPU}, Memory: {HOSTED_MEMORY}")

cred = DefaultAzureCredential()
client = AIProjectClient(endpoint=PROJECT_ENDPOINT, credential=cred)

# IMPORTANT:
# Your container might need env vars. Put only what your container expects.
env_vars = {
    "AZURE_AI_PROJECT_ENDPOINT": PROJECT_ENDPOINT,
    "MODEL_DEPLOYMENT_NAME": MODEL_DEPLOYMENT_NAME,
}

agent_version = client.agents.create_version(
    agent_name=HOSTED_AGENT_NAME,
    description=f"Hosted agent created from notebook (image: {IMAGE_TAG})",
    definition=ImageBasedHostedAgentDefinition(
        container_protocol_versions=[
            ProtocolVersionRecord(protocol=AgentProtocol.RESPONSES, version=CONTAINER_PROTOCOL_VERSION)
        ],
        cpu=HOSTED_CPU,
        memory=HOSTED_MEMORY,
        image=IMAGE_REF,
        environment_variables=env_vars,
    ),
)

# Store the version for later use
CREATED_VERSION = agent_version.version if hasattr(agent_version, "version") else "1"

print("\n✅ Created hosted agent version.")
print(f"Agent: {agent_version.name if hasattr(agent_version, 'name') else HOSTED_AGENT_NAME}")
print(f"Version: {CREATED_VERSION}")
print(f"\nNext: Run the 'Start Agent' cell below to deploy this version.")

Creating agent version with:
  HOSTED_AGENT_NAME: my-hosted-agent
  IMAGE_REF: containervault01.azurecr.io/my-hosted-agent:v1
  CPU: 2, Memory: 4Gi

✅ Created hosted agent version.
Agent: my-hosted-agent
Version: 3

Next: Run the 'Start Agent' cell below to deploy this version.


## 10) Start / Manage the Hosted Agent Deployment (Azure CLI)

**What this step does:**
Uses Azure CLI commands to manage your hosted agent's lifecycle.

**Available commands:**

| Command | Purpose |
|---------|---------|
| `az cognitiveservices agent start` | Start running a specific agent version |
| `az cognitiveservices agent stop` | Stop a running agent |
| `az cognitiveservices agent show` | Show the agent definition (in this preview CLI, deployment status comes from `start`) |
| `az cognitiveservices agent list-versions` | List all versions of an agent |
| `az cognitiveservices agent update` | Update deployment configuration |

**Required parameters:**
- `--account-name`: Your Foundry account name
- `--project-name`: Your Foundry project name
- `--name`: Your hosted agent name
- `--agent-version`: Which version to start (e.g., "1", "2")

**Agent lifecycle:**
```
Created → Starting → Running → Stopping → Stopped
                ↑__________________________|
```

**Important notes:**
- Starting an agent may take 1-2 minutes as Azure provisions the container
- Only one version can be running at a time per agent name
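- Confirm the container is running before invoking. In this preview CLI, re-running `az cognitiveservices agent start` returns the deployment status; `show` returns only the agent definition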

**Cost considerations:**
Running agents consume compute resources. Stop agents when not in use to avoid unnecessary charges.
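The lifecycle diagram above can be read as a small transition table. The sketch below encodes only the states shown in the diagram; the polling code later in this notebook also checks for a `Failed` status, which the diagram omits. `TRANSITIONS` and `can_transition` are hypothetical names.

```python
# Allowed transitions, read directly from the lifecycle diagram.
TRANSITIONS = {
    "Created": {"Starting"},
    "Starting": {"Running"},
    "Running": {"Stopping"},
    "Stopping": {"Stopped"},
    "Stopped": {"Starting"},  # the loop back: a stopped agent can be started again
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram allows moving from current to target."""
    return target in TRANSITIONS.get(current, set())


print(can_transition("Stopped", "Starting"))
print(can_transition("Created", "Running"))
```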

In [18]:
# Check existing agents and their deployment status BEFORE starting
import json

print("=== Existing Hosted Agents ===")
agents = json.loads(sh(
    f"az cognitiveservices agent list "
    f"--account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} "
    f"-o json"
))

for agent in agents:
    agent_name = agent.get("name")
    versions = agent.get("versions", {})
    latest = versions.get("latest", {})
    version_id = latest.get("version", "?")
    
    print(f"\n📦 Agent: {agent_name} (version {version_id})")
    
    # Check deployment status using 'start' (idempotent, returns status).
    # NOTE: 'start' also starts any listed agent version that is not already running.
    try:
        status = json.loads(sh(
            f"az cognitiveservices agent start "
            f"--account-name {FOUNDRY_ACCOUNT_NAME} "
            f"--project-name {FOUNDRY_PROJECT_NAME} "
            f"--name {agent_name} "
            f"--agent-version {version_id} "
            f"-o json"
        ))
        container = status.get("container") or {}
        print(f"   Status: {status.get('status')}")
        print(f"   Container: {container.get('status')}")
        if container.get("error_message"):
            print(f"   ⚠️ Error: {container.get('error_message')}")
    except Exception as e:
        print(f"   Could not get status: {e}")

print("\n" + "="*50)
print(f"Target agent for this notebook: {HOSTED_AGENT_NAME}")

=== Existing Hosted Agents ===
$ az cognitiveservices agent list --account-name ozgurguler-7212-resource --project-name ozgurguler-7212 -o json





📦 Agent: my-hosted-agent (version 3)
$ az cognitiveservices agent start --account-name ozgurguler-7212-resource --project-name ozgurguler-7212 --name my-hosted-agent --agent-version 3 -o json




   Status: InProgress
   Container: Starting

📦 Agent: ozgur-hosted-agent (version 1)
$ az cognitiveservices agent start --account-name ozgurguler-7212-resource --project-name ozgurguler-7212 --name ozgur-hosted-agent --agent-version 1 -o json




   Status: InProgress
   Container: Starting

Target agent for this notebook: my-hosted-agent


In [19]:
# ===== SET THE VERSION YOU WANT TO START =====
# Uses the version just created, or falls back to env var, or defaults to "1"
AGENT_VERSION_TO_START = (
    globals().get("CREATED_VERSION")  # From create_version cell above
    or os.getenv("AGENT_VERSION_TO_START") 
    or "1"
)
print(f"Starting agent version: {AGENT_VERSION_TO_START}")

import json
import time

start_resp = json.loads(sh(
    f"az cognitiveservices agent start "
    f"--account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} "
    f"--name {HOSTED_AGENT_NAME} "
    f"--agent-version {AGENT_VERSION_TO_START} "
    f"-o json"
))
print("Initial response:", json.dumps(start_resp, indent=2))

def wait_for_agent_running(timeout_sec=300, poll_sec=15):
    """
    Poll for agent deployment status.
    
    IMPORTANT CLI behavior:
    - 'start' is idempotent and returns deployment status (use this for polling)
    - 'show' only returns agent definition (no status/container fields)
    - 'start'/'stop' require --agent-version, 'show' does NOT accept it
    """
    deadline = time.time() + timeout_sec
    last = None
    while time.time() < deadline:
        # Use 'start' to get deployment status (it's idempotent)
        status = json.loads(sh(
            f"az cognitiveservices agent start "
            f"--account-name {FOUNDRY_ACCOUNT_NAME} "
            f"--project-name {FOUNDRY_PROJECT_NAME} "
            f"--name {HOSTED_AGENT_NAME} "
            f"--agent-version {AGENT_VERSION_TO_START} "
            f"-o json"
        ))
        overall = status.get("status")
        container = status.get("container") or {}
        container_status = container.get("status")
        error_msg = container.get("error_message")
        
        if container_status == "Running" or overall == "Succeeded":
            return status
        if overall == "Failed" or container_status == "Failed":
            raise RuntimeError(f"Agent failed to start: {error_msg or status}")
        if error_msg and error_msg.strip():
            print(f"⚠️ Error detected: {error_msg}")
            print("Common causes: container crash (check import), ACR permissions, capability host not set up")
            print("See Troubleshooting section at the end of this notebook.")
        
        if (overall, container_status) != last:
            print(f"Waiting... status={overall}, container={container_status}")
            last = (overall, container_status)
        time.sleep(poll_sec)

    raise TimeoutError(f"Timed out after {timeout_sec}s waiting for agent to reach Running state")

running_status = wait_for_agent_running()
print(
    "\n✅ Agent is running!",
    f"status={running_status.get('status')}",
    f"container={running_status.get('container', {}).get('status')}",
)

Starting agent version: 3
$ az cognitiveservices agent start --account-name ozgurguler-7212-resource --project-name ozgurguler-7212 --name my-hosted-agent --agent-version 3 -o json




Initial response: {
  "agent_id": "my-hosted-agent",
  "agent_version_id": "3",
  "container": {
    "created_at": "2025-12-30T09:41:42.3698986Z",
    "error_message": "",
    "max_replicas": 1,
    "min_replicas": 1,
    "object": "agent.container",
    "status": "Starting",
    "updated_at": "2025-12-30T09:41:42.4446158Z"
  },
  "id": "83f72f64-2215-4785-b668-f0b3f4dd8990",
  "status": "InProgress"
}
$ az cognitiveservices agent start --account-name ozgurguler-7212-resource --project-name ozgurguler-7212 --name my-hosted-agent --agent-version 3 -o json




Waiting... status=InProgress, container=Starting
KeyboardInterrupt: 

In [None]:
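# Inspect the hosted agent with 'show' (Azure CLI)
# Note: 'show' does NOT take --agent-version (only start/stop do). In this preview CLI it may
# return only the agent definition, in which case the status fields below print as None.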
import json

status = json.loads(sh(
    f"az cognitiveservices agent show "
    f"--account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} "
    f"--name {HOSTED_AGENT_NAME} "
    f"-o json"
))

overall = status.get("status")
container_status = (status.get("container") or {}).get("status")
print("Agent status:", overall, "| container:", container_status)
print("Full status:", json.dumps(status, indent=2))

## 11) Invoke the Hosted Agent via Responses API


In [None]:
# Invoke the hosted agent via OpenAI Responses API
# Note: The hosted agent container must be in "Running" state (re-run the start/polling cell above to confirm)

try:
    from azure.ai.projects.models import AgentReference
except ImportError:
    print("AgentReference not available - upgrade azure-ai-projects")
    raise

openai_client = client.get_openai_client()

# Create the agent reference directly (no need to retrieve the agent first)
agent_ref = AgentReference(name=HOSTED_AGENT_NAME, version=AGENT_VERSION_TO_START)

resp = openai_client.responses.create(
    input=[{"role": "user", "content": "Write a haiku about Azure AI Foundry hosted agents."}],
    extra_body={"agent": agent_ref.as_dict()},
)

print(resp.output_text)

## 12) (Optional) Classic Prompt-Based Agent for Comparison

**Why include this section?**
To illustrate the key difference between hosted agents and classic agents.

**Classic agents (AgentsClient):**
- You define the agent using prompts, instructions, and tool configurations
- Azure Foundry executes your agent definition in its managed runtime
- You don't control the runtime environment or dependencies
- Great for simple agents that use built-in tools (Code Interpreter, File Search, etc.)

**Hosted agents (this notebook):**
- You write your own agent code in Python (or any language)
- You containerize it and deploy it to Azure
- You have full control over dependencies, frameworks, and behavior
- Great for complex agents, custom logic, or specific framework requirements (LangChain, Semantic Kernel, etc.)

**When to use which?**

| Use Case | Recommended Approach |
|----------|---------------------|
| Simple Q&A with built-in tools | Classic agent |
| Custom business logic | Hosted agent |
| Third-party API integrations | Hosted agent |
| LangChain/LangGraph agents | Hosted agent |
| Quick prototyping | Classic agent |
| Production with custom scaling | Hosted agent |

**The code below** shows how you would create a classic agent for comparison. It's commented out but can be run if you want to see both approaches side by side.

## 13) Workshop Workflow: Stop / Start / Resume

**For workshop participants:** Use these commands to manage your hosted agent between sessions.

### Stop Agent (saves costs when not in use)
```bash
az cognitiveservices agent stop \
  --account-name <account> \
  --project-name <project> \
  --name <agent-name> \
  --agent-version <version>
```

### Start Agent (resume for next workshop step)
```bash
az cognitiveservices agent start \
  --account-name <account> \
  --project-name <project> \
  --name <agent-name> \
  --agent-version <version>
```

### Check Status
```bash
# Get deployment status (use start - it's idempotent)
az cognitiveservices agent start \
  --account-name <account> \
  --project-name <project> \
  --name <agent-name> \
  --agent-version <version> \
  -o json
```

> **💡 CLI Quirk:** The `start` command is safe to call repeatedly: it doesn't restart an already-running agent, it just returns the current deployment status.
>
> The preview CLI has no dedicated "status" command, so `start` doubles as both "start" and "check status".

### Expected Output When Agent is Running
When your agent is successfully running, you should see:
```json
{
  "agent_id": "my-hosted-agent",
  "agent_version_id": "1",
  "container": {
    "status": "Running",
    "error_message": ""
  },
  "status": "Succeeded"
}
```

**Proceeding to Agent Memory Step:**
Once the response shows `"status": "Running"` under `container`, you can proceed to the agent memory workshop step. The hosted agent is then reachable through your project endpoint.
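Because `start` doubles as a status check, waiting for the container can be sketched as a small polling loop. The helper below is a hypothetical illustration, not part of the CLI or SDK: `wait_until_running`, `fetch_status`, and the timing defaults are assumed names and values. The only response fields it relies on are the ones in the sample output above (`container.status` and `container.error_message`).

```python
import time
from typing import Any, Callable, Dict


def wait_until_running(
    fetch_status: Callable[[], Dict[str, Any]],
    timeout_s: float = 300.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_status() until container.status is "Running".

    Raises RuntimeError if the container reports an error message,
    and TimeoutError if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        resp = fetch_status()
        container = resp.get("container") or {}
        if container.get("status") == "Running":
            return resp
        if container.get("error_message"):
            raise RuntimeError(container["error_message"])
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Agent not running after {timeout_s}s "
                f"(last container status: {container.get('status')})"
            )
        sleep(interval_s)
```

Here `fetch_status` can be any callable that returns the parsed JSON, for example a lambda that wraps the notebook's `sh(...)` call to `az cognitiveservices agent start ... -o json` in `json.loads`.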

In [None]:
# === STOP AGENT (run when pausing workshop) ===
import json

AGENT_VERSION_TO_MANAGE = os.getenv("AGENT_VERSION_TO_START") or globals().get("CREATED_VERSION") or "1"

print(f"Stopping agent: {HOSTED_AGENT_NAME} version {AGENT_VERSION_TO_MANAGE}")
stop_resp = json.loads(sh(
    f"az cognitiveservices agent stop "
    f"--account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} "
    f"--name {HOSTED_AGENT_NAME} "
    f"--agent-version {AGENT_VERSION_TO_MANAGE} "
    f"-o json"
))
print("✅ Agent stopped.")
print(json.dumps(stop_resp, indent=2))

In [None]:
# === START AGENT (run when resuming workshop) ===
import json

AGENT_VERSION_TO_MANAGE = os.getenv("AGENT_VERSION_TO_START") or globals().get("CREATED_VERSION") or "1"

print(f"Starting agent: {HOSTED_AGENT_NAME} version {AGENT_VERSION_TO_MANAGE}")
start_resp = json.loads(sh(
    f"az cognitiveservices agent start "
    f"--account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} "
    f"--name {HOSTED_AGENT_NAME} "
    f"--agent-version {AGENT_VERSION_TO_MANAGE} "
    f"-o json"
))

container = start_resp.get("container", {})
status = start_resp.get("status")
container_status = container.get("status")

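# Only the container status says whether the agent is up; a top-level
# "Succeeded" can also be returned while the container is still starting.
if container_status == "Running":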
    print("✅ Agent is RUNNING! Ready for next workshop step.")
elif container_status == "Starting":
    print("⏳ Agent is starting... wait 1-2 minutes and re-run this cell.")
else:
    print(f"⚠️ Status: {status}, Container: {container_status}")
    if container.get("error_message"):
        print(f"   Error: {container.get('error_message')}")

print("\nFull response:")
print(json.dumps(start_resp, indent=2))

## 14) Cleanup (End of Workshop)

**When you're completely done with the workshop**, clean up resources to avoid charges.

**Hosted agent cleanup commands:**

```bash
# Stop the running agent (stops billing for compute)
az cognitiveservices agent stop \
  --account-name <account> \
  --project-name <project> \
  --name <agent-name> \
  --agent-version <version>

# Delete a specific agent version
az cognitiveservices agent delete \
  --account-name <account> \
  --project-name <project> \
  --name <agent-name> \
  --agent-version <version>

# Delete ALL versions of an agent
az cognitiveservices agent delete \
  --account-name <account> \
  --project-name <project> \
  --name <agent-name>
```

**Optional: Clean up ACR images**
```bash
# Delete a specific tag
az acr repository delete --name <acr-name> --image <image>:<tag>

# Or delete the whole repository
az acr repository delete --name <acr-name> --repository <image>
```

**What to keep vs. delete:**
- **Keep**: ACR (reusable), Foundry project, Capability Host (one-time setup)
- **Delete**: Old agent versions, stopped deployments you no longer need

**Cost considerations:**
- Stopped agents don't incur compute costs
- ACR storage has minimal costs for small images
- The Foundry project itself has no idle cost
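Choosing which old versions to delete can be sketched as a pure function. This is an illustrative helper (`versions_to_prune` is a hypothetical name). It assumes version identifiers are numeric strings such as `"1"` and `"2"`, as in the examples in this notebook, and it only computes the list. The deletion itself would still go through `az cognitiveservices agent delete` with `--agent-version <version>`.

```python
from typing import Iterable, List


def versions_to_prune(versions: Iterable[str], keep_latest: int = 1) -> List[str]:
    """Return the versions to delete, keeping the `keep_latest` highest ones.

    Assumes each version is a numeric string (an assumption, not a CLI guarantee).
    """
    if keep_latest < 0:
        raise ValueError("keep_latest must be >= 0")
    ordered = sorted(set(versions), key=int)  # oldest first, duplicates removed
    cutoff = len(ordered) - keep_latest
    return ordered[:max(cutoff, 0)]
```

Sorting with `key=int` avoids the lexical ordering trap where `"10"` would sort before `"2"`.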

In [None]:
# === FULL CLEANUP (run only at end of workshop) ===
# WARNING: This will delete your hosted agent!

import json

AGENT_VERSION_TO_DELETE = os.getenv("AGENT_VERSION_TO_START") or globals().get("CREATED_VERSION") or "1"

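# Uncomment the lines below to actually delete.
# Without --agent-version, this removes ALL versions of the agent.
# To remove one version only, add: f"--agent-version {AGENT_VERSION_TO_DELETE} "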
# print(f"Deleting agent: {HOSTED_AGENT_NAME}")
# delete_resp = sh(
#     f"az cognitiveservices agent delete "
#     f"--account-name {FOUNDRY_ACCOUNT_NAME} "
#     f"--project-name {FOUNDRY_PROJECT_NAME} "
#     f"--name {HOSTED_AGENT_NAME} "
#     f"-o json"
# )
# print("✅ Agent deleted.")

print("⚠️ Cleanup commands are commented out for safety.")
print("Uncomment the lines above to delete your agent.")
print(f"\nTo delete manually, run:")
print(f"  az cognitiveservices agent delete --account-name {FOUNDRY_ACCOUNT_NAME} --project-name {FOUNDRY_PROJECT_NAME} --name {HOSTED_AGENT_NAME}")

## Troubleshooting Guide

### Common Issues and Solutions

#### 1. Container crashes with `ImportError`

**Symptom:** Agent stuck in "Starting" with `InternalServerError`

**Root cause:** Wrong import in `agent_app.py`

```python
# WRONG - will crash
from azure.ai.agentserver.agentframework import from_agentframework

# CORRECT - use underscores
from azure.ai.agentserver.agentframework import from_agent_framework
```

**How to fix:**
1. Fix the import in `hosted_agent_app/agent_app.py`
2. Rebuild Docker image with new tag (e.g., `v2`)
3. Push to ACR
4. Create new agent version with new image
5. Start the new version
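One way to catch this class of error before rebuilding the image is to check that the module really exports the name you import. The sketch below uses only the standard library. `exports_name` is a hypothetical helper name, and running it against `azure.ai.agentserver.agentframework` assumes that package is installed wherever you run the check (for example, inside the container image).

```python
import importlib


def exports_name(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too: the package is missing or broken.
        return False
    return hasattr(module, attr)
```

Calling `exports_name("azure.ai.agentserver.agentframework", "from_agent_framework")` in the image's Python environment is a quick pre-build check for the import shown above.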

---

#### 2. `az cognitiveservices agent show` fails with `--agent-version`

**Symptom:** `ERROR: unrecognized arguments: --agent-version`

**Root cause:** The `show` command doesn't accept `--agent-version`

| Command | Has `--agent-version`? |
|---------|------------------------|
| `start` | ✅ Yes (required) |
| `stop` | ✅ Yes (required) |
| `show` | ❌ No |

**Correct usage:**
```bash
# Show agent definition (no version param)
az cognitiveservices agent show --account-name X --project-name Y --name Z

# Start specific version (requires version)
az cognitiveservices agent start --account-name X --project-name Y --name Z --agent-version 1
```

---

#### 3. Agent stuck in "Starting" status

**Diagnostic steps:**

1. **Check container locally first:**
   ```bash
   docker run --rm -e AZURE_AI_PROJECT_ENDPOINT="..." -e MODEL_DEPLOYMENT_NAME="..." your-image:tag
   ```
   If it crashes, check the error message.

2. **Verify ACR permissions:**
   ```bash
   az role assignment list --scope $(az acr show -n YOUR_ACR --query id -o tsv) -o table
   ```
   Look for `AcrPull` role assigned to the Foundry project's managed identity.

3. **Check Capability Host is enabled:**
   ```bash
   # Account-level capability host(s)
   az rest --method GET --url "https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/capabilityHosts?api-version=2025-06-01"

   # Project-level capability host(s)
   az rest --method GET --url "https://management.azure.com/subscriptions/{sub}/resourceGroups/{rg}/providers/Microsoft.CognitiveServices/accounts/{account}/projects/{projectName}/capabilityHosts?api-version=2025-06-01"
   ```

4. **Verify region:** Hosted agents are only available in **North Central US** (preview).
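
The two ARM URLs in step 3 differ only in the optional `projects/{projectName}` segment, so both can come from one function. A minimal sketch: `capability_hosts_url` is a hypothetical helper, and the `api_version` default is simply the value used in the commands above.

```python
from typing import Optional
from urllib.parse import quote


def capability_hosts_url(subscription: str, resource_group: str, account: str,
                         project: Optional[str] = None,
                         api_version: str = "2025-06-01") -> str:
    """Return the ARM URL that lists capability hosts for an account or a project."""
    segments = [
        "subscriptions", subscription,
        "resourceGroups", resource_group,
        "providers", "Microsoft.CognitiveServices",
        "accounts", account,
    ]
    if project is not None:
        segments += ["projects", project]
    segments.append("capabilityHosts")
    # Percent-encode each segment so unusual resource names cannot break the path.
    path = "/".join(quote(s, safe="") for s in segments)
    return f"https://management.azure.com/{path}?api-version={api_version}"
```

The result can be passed to `az rest --method GET --url "<url>"` exactly as in the commands above.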

---

#### 4. How to check deployment status

The `start` command is **idempotent** and returns current deployment status:

```python
import json
status = json.loads(sh(
    f"az cognitiveservices agent start "
    f"--account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} "
    f"--name {HOSTED_AGENT_NAME} "
    f"--agent-version {AGENT_VERSION_TO_MANAGE} "
    f"-o json"
))
print("Status:", status.get("status"))
print("Container:", status.get("container", {}).get("status"))
print("Error:", status.get("container", {}).get("error_message"))
```

---

#### 5. Multiple agent versions

When re-running this notebook, you may create multiple versions. To check existing agents:

```bash
# List all agents
az cognitiveservices agent list --account-name X --project-name Y -o table

# List versions of a specific agent
az cognitiveservices agent list-versions --account-name X --project-name Y --name Z -o table
```

---

#### 6. Environment variables not found

**Symptom:** `ValueError: Missing required config...`

**Fix:** Ensure `.env` file has all required variables and reload:
```python
from dotenv import load_dotenv
load_dotenv(override=True)  # override=True ensures fresh values
```
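
To see exactly which variables are missing before the `ValueError` is raised, a small check along these lines can help. It is a sketch: `missing_config` is a hypothetical helper, and the variable names are the ones listed in the Prerequisites section, where either `AZURE_AI_PROJECT_ENDPOINT` or `PROJECT_ENDPOINT` satisfies the endpoint requirement.

```python
import os
from typing import List, Mapping, Optional, Sequence, Tuple

# Each entry is a group of alternative names; one non-empty value per group is enough.
REQUIRED_GROUPS: Sequence[Tuple[str, ...]] = (
    ("AZURE_AI_PROJECT_ENDPOINT", "PROJECT_ENDPOINT"),
    ("MODEL_DEPLOYMENT_NAME",),
    ("ACR_NAME",),
)


def missing_config(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return a description of every required group that has no value set."""
    env = os.environ if env is None else env
    return [
        " or ".join(group)
        for group in REQUIRED_GROUPS
        if not any(env.get(name) for name in group)
    ]
```

Run `missing_config()` after `load_dotenv(override=True)`; an empty list means every required value is present.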

In [None]:
# === DIAGNOSTIC CELL ===
# Run this cell to diagnose common hosted agent issues

import json
import subprocess

def diag(title, cmd):
    print(f"\n{'='*50}")
    print(f"📋 {title}")
    print(f"{'='*50}")
    try:
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        if result.returncode == 0:
            print(result.stdout)
        else:
            print(f"❌ Error: {result.stderr}")
    except Exception as e:
        print(f"❌ Exception: {e}")

# 1. Check region
diag("1. Foundry Account Region", 
     f"az cognitiveservices account show --name {FOUNDRY_ACCOUNT_NAME} "
     f"--resource-group {RESOURCE_GROUP} --query location -o tsv 2>/dev/null || echo 'Could not get region'")

# 2. List existing agents
diag("2. Existing Hosted Agents",
     f"az cognitiveservices agent list --account-name {FOUNDRY_ACCOUNT_NAME} "
     f"--project-name {FOUNDRY_PROJECT_NAME} --query '[].name' -o tsv 2>/dev/null")

# 3. Check ACR image exists
diag("3. ACR Image Tags",
     f"az acr repository show-tags --name {ACR_NAME} --repository {IMAGE_NAME} -o tsv 2>/dev/null || echo 'Image not found'")

# 4. Check ACR permissions
diag("4. ACR Role Assignments (look for AcrPull)",
     f"az role assignment list --scope $(az acr show -n {ACR_NAME} --query id -o tsv 2>/dev/null) "
     f"--query \"[?contains(roleDefinitionName, 'Acr')].{{Principal:principalName, Role:roleDefinitionName}}\" -o table 2>/dev/null")

# 5. Test container locally (quick check)
print(f"\n{'='*50}")
print("📋 5. Container Local Test")
print("="*50)
print("Run this manually to test your container:")
print(f"  docker run --rm -e AZURE_AI_PROJECT_ENDPOINT=\"{PROJECT_ENDPOINT}\" \\")
print(f"    -e MODEL_DEPLOYMENT_NAME=\"{MODEL_DEPLOYMENT_NAME}\" \\")
print(f"    {IMAGE_REF}")

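# 6. Current agent status (start is idempotent and returns the deployment status)
DIAG_VERSION = globals().get("AGENT_VERSION_TO_MANAGE") or globals().get("CREATED_VERSION") or "1"
print(f"\n{'='*50}")
print("📋 6. Current Agent Deployment Status")
print("="*50)
status_proc = subprocess.run(
    f"az cognitiveservices agent start --account-name {FOUNDRY_ACCOUNT_NAME} "
    f"--project-name {FOUNDRY_PROJECT_NAME} --name {HOSTED_AGENT_NAME} "
    f"--agent-version {DIAG_VERSION} -o json",
    shell=True, capture_output=True, text=True,
)
try:
    # Parse the CLI's JSON output in Python rather than in a nested shell one-liner
    d = json.loads(status_proc.stdout)
    c = d.get("container", {})
    print(f"Status: {d.get('status')}")
    print(f"Container: {c.get('status')}")
    print(f"Error: {c.get('error_message') or 'none'}")
except json.JSONDecodeError:
    print("Could not get status")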

---

<div align="center">

## License & Attribution

This notebook is part of the **Azure AI Foundry Demo Repository**

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](../LICENSE)

**Original Author:** Ozgur Guler | AI Solution Leader, AI Innovation Hub

**Contact:** [ozgur.guler1@gmail.com](mailto:ozgur.guler1@gmail.com)

---

*If you use, modify, or distribute this work, you must provide appropriate credit to the original author as required by the [Apache License 2.0](../LICENSE).*

**Copyright © 2025 Ozgur Guler. All rights reserved.**

</div>