# 📜 Citadel Access Contracts - Testing Center

## Test multiple Access Contracts with different configurations and targets!

Use this Jupyter notebook to verify the Citadel Access Contracts deployment, including:
- Creating multiple access contracts for different use cases
- Testing Azure Key Vault integration
- Testing Microsoft Foundry connection integration
- Comparing request patterns across different contracts
- Visualizing API usage and throttling behavior

> **Note:** This notebook assumes you have already deployed your Citadel Governance Hub. If you haven't done so, please refer to the [Citadel Access Contracts Guide](../guides/full-deployment-guide.md) before proceeding.

## Azure Prerequisites

To take full advantage of this notebook, ensure you have the following Azure resources set up:
- An Azure Key Vault that holds the API key secrets (required only for testing Key Vault integration; deployed separately)
- A Microsoft Foundry account and project (required only for testing Foundry connection integration; deployed separately)
- Azure credentials with the permissions needed to access these resources

<a id='0'></a>
### 0️⃣ Initialize notebook variables

Before running the tests, ensure you have set the following variables according to your environment:

In [None]:
import os
import sys, json, requests, time
sys.path.insert(1, '../shared')  # add the shared directory to the Python path
import utils
from apimtools import APIMClientTool

inference_api_version = "2024-05-01-preview"

targetInferenceApi = "models"  # use 'models' for universal LLM API, or 'openai' for Azure OpenAI

governance_hub_resource_group = "REPLACE"  ## specify the resource group name where the Governance Hub is located
location = "REPLACE"  ## specify the location of the Governance Hub

# Optional: Azure AI Foundry configuration (for Foundry connection integration)
# Set use_foundry_integration = True to enable Foundry connection creation
use_foundry_integration = True
foundry_subscription_id = "00000000-0000-0000-0000-000000000000"  # Replace with your Foundry subscription ID
foundry_resource_group = "REPLACE"  # Replace with your Foundry resource group
foundry_account_name = "REPLACE"  # Replace with your AI Foundry account name
foundry_project_name = "REPLACE"  # Replace with your AI Foundry project name

# Optional: Key Vault configuration
use_keyvault_integration = True
keyvault_subscription_id = "00000000-0000-0000-0000-000000000000"  # Replace with your Key Vault subscription ID
keyvault_resource_group = "REPLACE"  # Replace with your Key Vault resource group
keyvault_name = "REPLACE"  # Replace with your Key Vault name
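
Before moving on, it can help to confirm that no placeholder values are left in the configuration. The check below is a minimal sketch: `find_unset_settings` is a hypothetical helper, not part of the notebook's `utils` module, and the example dictionary uses made-up values.

```python
def find_unset_settings(settings: dict) -> list:
    """Return the names of settings that still hold a placeholder value."""
    placeholders = {"REPLACE", "00000000-0000-0000-0000-000000000000", ""}
    return sorted(name for name, value in settings.items() if value in placeholders)


# Hypothetical example values; in the notebook you would pass the real variables.
example = {
    "governance_hub_resource_group": "rg-citadel-hub",
    "location": "REPLACE",
    "keyvault_name": "REPLACE",
}

unset = find_unset_settings(example)
if unset:
    print(f"Still unset: {', '.join(unset)}")
```

Settings that belong to a disabled integration can simply be left out of the dictionary passed to the helper.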

<a id='1'></a>
### 1️⃣ Verify the Azure CLI and the connected Azure subscription

The following command confirms that the Azure CLI is installed and signed in, and shows the user, tenant, and subscription it is connected to.

In [None]:
output = utils.run("az account show", "Retrieved az account", "Failed to get the current az account")

if output.success and output.json_data:
    current_user = output.json_data['user']['name']
    tenant_id = output.json_data['tenantId']
    subscription_id = output.json_data['id']

    utils.print_info(f"Current user: {current_user}")
    utils.print_info(f"Tenant ID: {tenant_id}")
    utils.print_info(f"Subscription ID: {subscription_id}")
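
For reference, the fields read above come from the JSON that `az account show` prints. The sketch below parses a trimmed, made-up sample of that output without calling the CLI; `parse_account` is a hypothetical helper and the IDs and user name are invented for illustration.

```python
import json


def parse_account(raw_json: str) -> dict:
    """Extract the user, tenant, and subscription from `az account show` output."""
    account = json.loads(raw_json)
    return {
        "current_user": account["user"]["name"],
        "tenant_id": account["tenantId"],
        "subscription_id": account["id"],
    }


# Made-up sample, trimmed to the fields this notebook uses.
sample = """
{
  "id": "11111111-1111-1111-1111-111111111111",
  "tenantId": "22222222-2222-2222-2222-222222222222",
  "user": {"name": "dev@example.com", "type": "user"}
}
"""

print(parse_account(sample))
```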

<a id='init'></a>
### ⚙️ Initialize client tool for your APIM service

👉 This step expects an existing Citadel Governance Hub deployment with LLM models already onboarded.

In [None]:
try:
    apimClientTool = APIMClientTool(
        governance_hub_resource_group
    )
    apimClientTool.initialize()
    apimClientTool.discover_api(targetInferenceApi)

    apim_resource_gateway_url = str(apimClientTool.apim_resource_gateway_url)
    azure_endpoint = str(apimClientTool.azure_endpoint)
    
    # Get supported models from the policy fragment
    supported_models = apimClientTool.get_policy_fragment_supported_models("set-backend-pools")
    utils.print_info(f"Supported models in APIM policy fragment 'set-backend-pools': {supported_models}")

    if targetInferenceApi == "openai":
        chat_completions_url = f"{azure_endpoint}openai/deployments/{{model_name}}/chat/completions?api-version={inference_api_version}"
    else:  # models
        chat_completions_url = f"{azure_endpoint}models/chat/completions?api-version={inference_api_version}"
    utils.print_info(f"Chat Completion Endpoint Template: {chat_completions_url}")

    utils.print_info(f"Using the following API: {apimClientTool.api_id}")

    utils.print_ok(f"Testing tool initialized successfully!")
except Exception as e:
    utils.print_error(f"Error initializing APIM Client Tool: {e}")
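
The two URL shapes built above differ only in whether the model (deployment) name appears in the path. The following standalone sketch shows the same idea with the model name filled in directly; `build_chat_url` is a hypothetical helper, and the gateway host and model name are placeholders.

```python
def build_chat_url(azure_endpoint: str, target_api: str, api_version: str, model_name: str) -> str:
    """Build a chat completions URL for the 'openai' or 'models' API style."""
    # Tolerate endpoints with or without a trailing slash.
    base = azure_endpoint if azure_endpoint.endswith("/") else azure_endpoint + "/"
    if target_api == "openai":
        # Azure OpenAI style: the deployment name is part of the path.
        return f"{base}openai/deployments/{model_name}/chat/completions?api-version={api_version}"
    # Universal 'models' style: the model is named in the request body instead.
    return f"{base}models/chat/completions?api-version={api_version}"


print(build_chat_url("https://example.azure-api.net", "openai", "2024-05-01-preview", "gpt-4o"))
print(build_chat_url("https://example.azure-api.net/", "models", "2024-05-01-preview", "gpt-4o"))
```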

<a id='2'></a>
### 2️⃣ Define Access Contract Configurations

We will create 3 different access contracts with varying default configurations:
1. **Sales-Assistant**: Key Vault integration only
2. **HR-ChatAgent**: Key Vault + Foundry connection integrations (if enabled)
3. **Support-Bot**: Direct output (neither Key Vault nor Foundry integration)

In [None]:
# Define the 3 access contracts to create
timestamp = time.strftime('%Y%m%d%H%M%S')

access_contracts = [
    {
        "name": f"sales-assistant-contract-{timestamp}",
        "business_unit": "Sales",
        "use_case_name": "Assistant",
        "environment": "DEV",
        "use_keyvault": True,
        "use_foundry": False,
        "endpoint_secret": "SALES-LLM-ENDPOINT",
        "apikey_secret": "SALES-LLM-KEY",
        "description": "Sales Assistant - Key Vault only"
    },
    {
        "name": f"hr-chatagent-contract-{timestamp}",
        "business_unit": "HR",
        "use_case_name": "ChatAgent",
        "environment": "DEV",
        "use_keyvault": True,
        "use_foundry": True,
        "endpoint_secret": "HR-LLM-ENDPOINT",
        "apikey_secret": "HR-LLM-KEY",
        "description": "HR Chat Agent - Key Vault + Foundry (if enabled)"
    },
    {
        "name": f"support-bot-contract-{timestamp}",
        "business_unit": "Support",
        "use_case_name": "Bot",
        "environment": "DEV",
        "use_keyvault": False,
        "use_foundry": False,
        "endpoint_secret": "SUPPORT-LLM-ENDPOINT",
        "apikey_secret": "SUPPORT-LLM-KEY",
        "description": "Support Bot - Direct output (no Key Vault nor Foundry connection integration)"
    }
]

utils.print_info(f"Defined {len(access_contracts)} access contracts to create:")
for i, contract in enumerate(access_contracts, 1):
    utils.print_info(f"  {i}. {contract['description']}")
    utils.print_info(f"     Product ID: LLM-{contract['business_unit']}-{contract['use_case_name']}-{contract['environment']}")
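
The product ID printed above follows the pattern `LLM-<business unit>-<use case>-<environment>`, and later cells look for a subscription named `<product ID>-SUB-01`. A small sketch of that naming, using hypothetical helper names that are not part of the notebook:

```python
def product_id_for(contract: dict, service_code: str = "LLM") -> str:
    """Compose the APIM product ID used throughout this notebook."""
    return f"{service_code}-{contract['business_unit']}-{contract['use_case_name']}-{contract['environment']}"


def subscription_name_for(contract: dict, index: int = 1) -> str:
    """Compose the subscription name the notebook searches for (e.g. ...-SUB-01)."""
    return f"{product_id_for(contract)}-SUB-{index:02d}"


sales = {"business_unit": "Sales", "use_case_name": "Assistant", "environment": "DEV"}
print(product_id_for(sales))         # LLM-Sales-Assistant-DEV
print(subscription_name_for(sales))  # LLM-Sales-Assistant-DEV-SUB-01
```

Keeping the pattern in one helper avoids repeating the f-string in the deployment, key retrieval, and cleanup cells.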

<a id='3'></a>
### 3️⃣ Create Access Contract Parameter Files

Generate Bicep parameter files (`.bicepparam`) for each access contract.
These files configure the APIM products, subscriptions, and optionally Key Vault secrets and Foundry connections.

In [None]:
import shutil

bicep_dir = "../bicep/infra/citadel-access-contracts"
template_file = os.path.join(bicep_dir, "main.bicep")
default_policy_file = os.path.join(bicep_dir, "policies", "default-ai-product-policy.xml")

# Store generated parameter files for deployment
generated_param_files = []

for i, contract in enumerate(access_contracts, 1):
    utils.print_info(f"\n{'='*60}")
    utils.print_info(f"Creating Parameter File {i}/{len(access_contracts)}: {contract['description']}")
    utils.print_info(f"{'='*60}")
    
    # Create folder structure: contracts/[businessunit-usecase]/[environment]/
    folder_name = f"{contract['business_unit'].lower()}-{contract['use_case_name'].lower()}"
    environment_folder = contract['environment'].lower()
    contract_folder = os.path.join(bicep_dir, "contracts", folder_name, environment_folder)
    os.makedirs(contract_folder, exist_ok=True)
    utils.print_info(f"📁 Created folder: {contract_folder}")
    
    # Copy the default policy file to the contract folder
    policy_file_dest = os.path.join(contract_folder, "ai-product-policy.xml")
    shutil.copy(default_policy_file, policy_file_dest)
    utils.print_info(f"📋 Copied policy file: {policy_file_dest}")
    
    # Generate parameter file path (relative path for bicep reference)
    params_file = os.path.join(contract_folder, f"main.bicepparam")
    
    # Policy file path relative to the parameter file location (for bicep loadTextContent)
    policy_relative_path = "ai-product-policy.xml"
    
    # Build Foundry configuration section
    foundry_params = ""
    if contract['use_foundry'] and use_foundry_integration:  # honor the notebook-level switch
        foundry_params = f"""
// Azure AI Foundry Integration
param useTargetFoundry = true

param foundry = {{
  subscriptionId: '{foundry_subscription_id}'
  resourceGroupName: '{foundry_resource_group}'
  accountName: '{foundry_account_name}'
  projectName: '{foundry_project_name}'
}}

param foundryConfig = {{
  connectionNamePrefix: ''
  deploymentInPath: 'false'
  isSharedToAll: false
  inferenceAPIVersion: ''
  deploymentAPIVersion: ''
  staticModels: []
  listModelsEndpoint: ''
  getModelEndpoint: ''
  deploymentProvider: ''
  customHeaders: {{}}
  authConfig: {{}}
}}
"""
    else:
        foundry_params = """
// Azure AI Foundry Integration (disabled)
param useTargetFoundry = false

param foundry = {
  subscriptionId: '00000000-0000-0000-0000-000000000000'
  resourceGroupName: 'placeholder'
  accountName: 'placeholder'
  projectName: 'placeholder'
}
"""

    # Update the using path to account for the additional environment subfolder
    params_content = f"""using '../../../main.bicep'

// ============================================================================
// {contract['description']} - Generated from Notebook
// ============================================================================

param apim = {{
  subscriptionId: '{subscription_id}'
  resourceGroupName: '{governance_hub_resource_group}'
  name: '{apimClientTool.apim_resource_name}'
}}

param keyVault = {{
  subscriptionId: '{keyvault_subscription_id}'
  resourceGroupName: '{keyvault_resource_group}'
  name: '{keyvault_name}'
}}

param useTargetAzureKeyVault = {str(contract['use_keyvault']).lower()}

param useCase = {{
  businessUnit: '{contract['business_unit']}'
  useCaseName: '{contract['use_case_name']}'
  environment: '{contract['environment']}'
}}

param apiNameMapping = {{
  LLM: ['universal-llm-api', 'azure-openai-api']
}}

param services = [
  {{
    code: 'LLM'
    endpointSecretName: '{contract['endpoint_secret']}'
    apiKeySecretName: '{contract['apikey_secret']}'
    policyXml: loadTextContent('{policy_relative_path}')
  }}
]

param productTerms = 'Access Contract created from testing notebook - {contract["description"]}'
{foundry_params}
"""

    # Write the parameter file
    with open(params_file, 'w') as f:
        f.write(params_content)
    
    # Store for deployment step
    generated_param_files.append({
        "contract": contract,
        "params_file": params_file,
        "contract_folder": contract_folder
    })
    
    utils.print_ok(f"✅ Parameter file created: {params_file}")

utils.print_ok(f"\n📁 Created {len(generated_param_files)} parameter files ready for deployment!")
utils.print_info("Each contract folder contains:")
utils.print_info("  • main.bicepparam - Bicep parameter file")
utils.print_info("  • ai-product-policy.xml - APIM product policy (customize as needed)")
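
Each parameter file sits three folders below `main.bicep`, which is why its `using` line points at `../../../main.bicep`. The sketch below derives both the contract folder and that relative path instead of hard-coding them; `contract_paths` is a hypothetical helper, and the `bicep_dir` value is only an example.

```python
import os


def contract_paths(bicep_dir: str, business_unit: str, use_case_name: str, environment: str):
    """Return (contract_folder, using_path) for one access contract."""
    folder_name = f"{business_unit.lower()}-{use_case_name.lower()}"
    contract_folder = os.path.join(bicep_dir, "contracts", folder_name, environment.lower())
    # Path from the contract folder back to the template, as written in the `using` line.
    using_path = os.path.relpath(os.path.join(bicep_dir, "main.bicep"), start=contract_folder)
    # Bicep paths use forward slashes regardless of the host OS.
    return contract_folder, using_path.replace(os.sep, "/")


folder, using_path = contract_paths("bicep/infra/citadel-access-contracts", "HR", "ChatAgent", "DEV")
print(folder)
print(using_path)
```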

<a id='3.1'></a>
### 3️⃣.1 Deploy Access Contracts using 🦾 Bicep

Deploy each access contract using the generated parameter files.
This creates the APIM products, subscriptions, and optionally Key Vault secrets and Foundry connections in Azure.

In [None]:
# Store deployment results for later use
deployment_results = []

for i, item in enumerate(generated_param_files, 1):
    contract = item['contract']
    params_file = item['params_file']
    
    utils.print_info(f"\n{'='*60}")
    utils.print_info(f"Deploying Access Contract {i}/{len(generated_param_files)}: {contract['description']}")
    utils.print_info(f"{'='*60}")
    
    # Deploy the access contract
    deployment_cmd = f"az deployment sub create --name {contract['name']} --location {location} --template-file {template_file} --parameters {params_file}"
    
    utils.print_info(f"Deploying {contract['name']}...")
    output = utils.run(
        deployment_cmd,
        f"Deployment '{contract['name']}' succeeded",
        f"Deployment '{contract['name']}' failed"
    )

    if output.success:
        # Deployment succeeded - try to get outputs if JSON data is available
        outputs = {}
        if output.json_data:
            outputs = output.json_data.get('properties', {}).get('outputs', {})
        
        deployment_results.append({
            "contract": contract,
            "outputs": outputs,
            "success": True
        })
        utils.print_ok(f"✅ Access Contract {i} deployed successfully!")
        
        # Show key outputs if available
        if outputs:
            for key, value in outputs.items():
                masked_value = utils.mask_sensitive_values(value.get('value'))
                utils.print_info(f"  {key}: {masked_value}")
        else:
            utils.print_info("  (No outputs returned - deployment completed)")
    else:
        deployment_results.append({
            "contract": contract,
            "outputs": {},
            "success": False
        })
        utils.print_error(f"❌ Access Contract {i} deployment failed!")

# Re-initialize APIM client to pick up new subscriptions
apimClientTool.initialize()
utils.print_ok(f"\n🎉 Completed deploying {len([r for r in deployment_results if r['success']])} access contracts!")
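
The deployment command above is assembled with plain string interpolation, which breaks if a path contains spaces. As a hedged alternative, the sketch below builds the same `az deployment sub create` command from an argument list and quotes it with `shlex.join` (Python 3.8+, POSIX shell quoting). `build_deployment_command` is a hypothetical helper; passing its result to `utils.run` assumes that function accepts a shell command string, as it does elsewhere in this notebook.

```python
import shlex


def build_deployment_command(name: str, location: str, template_file: str, params_file: str) -> str:
    """Assemble a subscription-scope deployment command with shell-safe quoting."""
    args = [
        "az", "deployment", "sub", "create",
        "--name", name,
        "--location", location,
        "--template-file", template_file,
        "--parameters", params_file,
    ]
    return shlex.join(args)


# Example values only; the contract name and paths are placeholders.
print(build_deployment_command("demo-contract", "westeurope", "main.bicep", "my dir/main.bicepparam"))
```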

<a id='4'></a>
### 4️⃣ Retrieve API Keys for Each Access Contract

Get the subscription keys created for each access contract to use in API testing.

In [None]:
# Map contract names to their subscription keys
contract_keys = {}

for result in deployment_results:
    if not result['success']:
        continue
    
    contract = result['contract']
    product_id = f"LLM-{contract['business_unit']}-{contract['use_case_name']}-{contract['environment']}"
    subscription_name = f"{product_id}-SUB-01"
    
    # Find the subscription key from APIM subscriptions
    for sub in apimClientTool.apim_subscriptions:
        if subscription_name.lower() in sub.get('name', '').lower():
            contract_keys[product_id] = {
                "key": sub.get('key'),
                "description": contract['description'],
                "use_keyvault": contract['use_keyvault'],
                "use_foundry": contract['use_foundry']
            }
            utils.print_ok(f"Found key for {product_id}")
            break
    else:
        # If not found in existing subscriptions, check outputs for direct credentials
        if not contract['use_keyvault']:
            endpoints = result['outputs'].get('endpoints', {}).get('value', [])
            for ep in endpoints:
                if ep.get('code') == 'LLM':
                    contract_keys[product_id] = {
                        "key": ep.get('apiKey'),
                        "endpoint": ep.get('endpoint'),
                        "description": contract['description'],
                        "use_keyvault": contract['use_keyvault'],
                        "use_foundry": contract['use_foundry']
                    }
                    utils.print_ok(f"Found direct key for {product_id}")
                    break

utils.print_info(f"\nRetrieved keys for {len(contract_keys)} access contracts:")
for product_id, info in contract_keys.items():
    utils.print_info(f"  • {product_id}: {info['description']}")
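
The lookup above is a case-insensitive substring match on the subscription name. The sketch below isolates that match and adds a small masking helper so keys are never printed in full. Both function names are hypothetical, and the subscription list is made-up data that only mirrors the `name` and `key` fields read from `apimClientTool.apim_subscriptions`.

```python
def find_subscription_key(subscriptions: list, subscription_name: str):
    """Return the key of the first subscription whose name contains subscription_name."""
    wanted = subscription_name.lower()
    for sub in subscriptions:
        if wanted in (sub.get("name") or "").lower():
            return sub.get("key")
    return None


def mask_key(key, visible: int = 4) -> str:
    """Hide all but the last few characters of a key for safe logging."""
    if not key:
        return "<none>"
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]


# Made-up subscriptions for illustration.
subs = [
    {"name": "llm-sales-assistant-dev-sub-01", "key": "abcd1234efgh5678"},
    {"name": "other-sub", "key": "zzzz"},
]
key = find_subscription_key(subs, "LLM-Sales-Assistant-DEV-SUB-01")
print(mask_key(key))
```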

<a id='5'></a>
### 5️⃣ Test API Requests Across All Access Contracts

Send test requests to each access contract and collect metrics for visualization.

In [None]:
model_name = supported_models[2] if len(supported_models) > 2 else supported_models[0]
utils.print_info(f"Using model: {model_name}")

# Store results for each contract
test_results = {product_id: [] for product_id in contract_keys.keys()}

messages = {
    "model": model_name,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant. Keep responses brief."},
        {"role": "user", "content": "What is 2+2?"}
    ]
}

# Send a single test request to each contract
for product_id, info in contract_keys.items():
    utils.print_info(f"\nTesting {product_id}...")
    
    api_key = info.get('key')
    if not api_key:
        utils.print_error(f"No API key found for {product_id}")
        continue
    
    try:
        response = requests.post(
            chat_completions_url.replace("{model_name}", model_name),  # fill the deployment placeholder used by the 'openai' API
            headers={'api-key': api_key},
            json=messages,
            timeout=30
        )
        
        utils.print_response_code(response)
        
        if response.status_code == 200:
            data = json.loads(response.text)
            content = data.get("choices", [{}])[0].get("message", {}).get("content", "")
            utils.print_ok(f"💬 Response: {content[:100]}..." if len(content) > 100 else f"💬 Response: {content}")
            utils.print_info(f"   Region: {response.headers.get('x-ms-region', 'N/A')}")
        else:
            utils.print_error(f"Error: {response.text[:200]}")
    except Exception as e:
        utils.print_error(f"Request failed: {e}")
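
Reading the reply with chained `.get()` calls still raises an `IndexError` if the service returns an empty `choices` list. A slightly more defensive sketch is shown below; `extract_reply` is a hypothetical helper, and the payloads are hand-written samples in the chat completions response shape.

```python
def extract_reply(data: dict) -> str:
    """Return the assistant message text, or an empty string if it is missing."""
    choices = data.get("choices") or [{}]
    message = choices[0].get("message") or {}
    return message.get("content") or ""


print(extract_reply({"choices": [{"message": {"role": "assistant", "content": "4"}}]}))
print(repr(extract_reply({"choices": []})))
```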

<a id='6'></a>
### 6️⃣ Run Load Test Across All Access Contracts

Send requests to each contract for 30 seconds, one contract after another, to exercise rate limiting and collect performance data.

In [None]:
import requests, json, time

# Run for 30 seconds per contract
test_duration = 30
all_api_runs = {product_id: [] for product_id in contract_keys.keys()}

model_name = supported_models[2] if len(supported_models) > 2 else supported_models[0]
utils.print_info(f"Using model: {model_name}")

messages = {
    "model": model_name,
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Count from 1 to 10."}
    ]
}

def run_api_test(product_id, api_key, duration):
    """Run API calls for a specific contract."""
    runs = []
    start_time = time.time()
    run_count = 0
    
    while (time.time() - start_time) < duration:
        run_count += 1
        call_start_time = time.time()
        
        try:
            response = requests.post(
                chat_completions_url.replace("{model_name}", model_name),  # fill the deployment placeholder used by the 'openai' API
                headers={'api-key': api_key},
                json=messages,
                timeout=30
            )
            
            elapsed = time.time() - start_time
            
            if response.status_code == 200:
                data = json.loads(response.text)
                total_tokens = data.get("usage", {}).get("total_tokens", 0)
            else:
                total_tokens = 0
            
            runs.append((call_start_time, total_tokens, response.status_code, elapsed))
            
        except Exception:
            # Client-side failure (timeout, connection error): recorded as a 500 so it counts as an error
            runs.append((call_start_time, 0, 500, time.time() - start_time))
        
        time.sleep(0.2)  # Small delay between requests
    
    return runs

# Run tests for each contract sequentially
for product_id, info in contract_keys.items():
    api_key = info.get('key')
    if not api_key:
        continue
    
    print(f"\n🕐 Testing {product_id} for {test_duration} seconds...")
    print(f"   {info['description']}")
    
    runs = run_api_test(product_id, api_key, test_duration)
    all_api_runs[product_id] = runs
    
    success = sum(1 for r in runs if r[2] == 200)
    throttled = sum(1 for r in runs if r[2] == 429)
    errors = sum(1 for r in runs if r[2] not in [200, 429])
    
    print(f"   ✅ Success: {success} | ⛔ Throttled: {throttled} | ❌ Errors: {errors}")

utils.print_ok(f"\n🏁 Load testing completed for all {len(contract_keys)} access contracts!")
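
Each run is stored as a tuple `(call_start_time, total_tokens, status_code, elapsed)`, and the same success, throttled, and error counts are recomputed in several cells. The sketch below gathers them in one place; `summarize_runs` is a hypothetical helper and the sample runs are invented.

```python
from collections import Counter


def summarize_runs(runs: list) -> dict:
    """Count outcomes and total tokens for a list of run tuples."""
    outcomes = Counter(
        "success" if status == 200 else "throttled" if status == 429 else "error"
        for _, _, status, _ in runs
    )
    return {
        "calls": len(runs),
        "success": outcomes["success"],
        "throttled": outcomes["throttled"],
        "errors": outcomes["error"],
        "tokens": sum(tokens for _, tokens, _, _ in runs),
    }


# Invented sample: (call_start_time, total_tokens, status_code, elapsed)
sample_runs = [(0.0, 55, 200, 0.8), (1.0, 60, 200, 1.9), (2.1, 0, 429, 2.3), (2.5, 0, 500, 32.5)]
print(summarize_runs(sample_runs))
```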

<a id='7'></a>
### 7️⃣ Visualize Results Across All Access Contracts

Compare API usage, token consumption, and throttling behavior across all access contracts.

In [None]:
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
from matplotlib.lines import Line2D
import numpy as np

# Check if we have data to plot
contracts_with_data = {k: v for k, v in all_api_runs.items() if v}

if contracts_with_data:
    # Print summary table
    print("\n" + "="*80)
    print("📊 SUMMARY: Access Contracts Performance Comparison")
    print("="*80)
    print(f"{'Contract':<40} {'Calls':<8} {'Success':<10} {'Throttled':<10} {'Tokens':<10}")
    print("-"*80)
    
    for product_id, runs in contracts_with_data.items():
        success = sum(1 for r in runs if r[2] == 200)
        throttled = sum(1 for r in runs if r[2] == 429)
        total_tokens = sum(r[1] for r in runs)
        print(f"{product_id:<40} {len(runs):<8} {success:<10} {throttled:<10} {total_tokens:<10}")
    
    print("="*80)
    
    num_contracts = len(contracts_with_data)
    fig, axes = plt.subplots(num_contracts, 1, figsize=(14, 5 * num_contracts), squeeze=False)
    
    colors_map = {'success': 'tab:green', 'throttled': 'tab:red', 'error': 'tab:orange'}
    
    for idx, (product_id, runs) in enumerate(contracts_with_data.items()):
        ax = axes[idx, 0]
        
        if not runs:
            ax.text(0.5, 0.5, 'No data', ha='center', va='center')
            ax.set_title(product_id)
            continue
        
        # Process data
        base_time = runs[0][0]
        times = [r[3] for r in runs]  # elapsed time
        tokens = [r[1] for r in runs]
        status_codes = [r[2] for r in runs]
        
        # Color bars based on status
        colors = [
            colors_map['success'] if code == 200 
            else colors_map['throttled'] if code == 429 
            else colors_map['error'] 
            for code in status_codes
        ]
        
        # Create bar chart
        ax.bar(times, tokens, color=colors, width=0.3, alpha=0.7)
        
        # Add throttled markers
        throttled_times = [t for t, code in zip(times, status_codes) if code == 429]
        if throttled_times:
            max_tokens = max(tokens) if tokens else 1
            ax.scatter(throttled_times, [max_tokens * 0.05] * len(throttled_times), 
                      marker='x', s=50, color='darkred', zorder=5)
        
        # Calculate stats
        success = sum(1 for code in status_codes if code == 200)
        throttled = sum(1 for code in status_codes if code == 429)
        total_tokens = sum(tokens)
        
        # Labels and title
        info = contract_keys.get(product_id, {})
        title = f"{product_id}\n{info.get('description', '')}"
        ax.set_title(title, fontsize=11, fontweight='bold')
        ax.set_xlabel('Time (seconds)')
        ax.set_ylabel('Tokens per call')
        
        # Add stats annotation
        stats_text = f"Total: {len(runs)} calls | Success: {success} | Throttled: {throttled} | Tokens: {total_tokens}"
        ax.text(0.02, 0.98, stats_text, transform=ax.transAxes, fontsize=9,
               verticalalignment='top', bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
    
    # Add shared legend
    legend_items = [
        Patch(facecolor='tab:green', alpha=0.7, label='Success (200)'),
        Patch(facecolor='tab:red', alpha=0.7, label='Throttled (429)'),
        Patch(facecolor='tab:orange', alpha=0.7, label='Error'),
        Line2D([0], [0], marker='x', color='darkred', markersize=8, linestyle='None', label='Throttle point')
    ]
    fig.legend(handles=legend_items, loc='upper right', bbox_to_anchor=(0.98, 0.99))
    
    plt.tight_layout()
    plt.subplots_adjust(top=0.95)
    plt.show()
    
else:
    print('No API test data available. Run the load test first to capture data.')

<a id='8'></a>
### 8️⃣ Compare Token Bucket Behavior

Visualize the token bucket algorithm behavior for each access contract.

In [None]:
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
from matplotlib.lines import Line2D

contracts_with_data = {k: v for k, v in all_api_runs.items() if v}

if contracts_with_data:
    # Token bucket parameters (default policy: 400 tokens/min)
    capacity = 400
    refill = capacity / 60  # tokens per second
    
    fig, axes = plt.subplots(len(contracts_with_data), 1, figsize=(14, 6 * len(contracts_with_data)), squeeze=False)
    
    for idx, (product_id, runs) in enumerate(contracts_with_data.items()):
        ax1 = axes[idx, 0]
        ax2 = ax1.twinx()
        
        # Process data for token bucket simulation
        calls = [(r[3], r[1] or 0, r[2]) for r in runs]  # (elapsed_time, tokens, status)
        
        bucket = capacity
        last_time = 0.0
        times, usage, status_codes, levels = [], [], [], []
        
        for call_time, tokens, status in calls:
            # Refill bucket
            bucket = min(capacity, bucket + (call_time - last_time) * refill)
            levels.append(bucket)
            times.append(call_time)
            usage.append(tokens)
            status_codes.append(status)
            # Consume tokens
            bucket = max(0, bucket - tokens)
            last_time = call_time
        
        # Colors based on status
        colors = ['tab:green' if code == 200 else 'tab:red' if code == 429 else 'tab:orange' for code in status_codes]
        
        # Plot bars for token usage
        ax1.bar(times, usage, color=colors, width=0.35, alpha=0.7)
        
        # Plot bucket level
        ax2.plot(times, levels, color='purple', linewidth=2)
        ax2.axhline(capacity, color='purple', linestyle='--', alpha=0.6)
        
        # Mark throttled points
        throttled_times = [t for t, code in zip(times, status_codes) if code == 429]
        throttled_usage = [u for u, code in zip(usage, status_codes) if code == 429]
        if throttled_times:
            max_usage = max(usage) if usage else 0
            throttled_marker_heights = [u + max_usage * 0.01 for u in throttled_usage]
            ax1.scatter(throttled_times, throttled_marker_heights, marker='o', s=20, 
                       color='darkred', edgecolors='white', linewidth=0.4, zorder=6)
        
        # Labels
        ax1.set_xlabel('Seconds')
        ax1.set_ylabel('Tokens per call')
        ax2.set_ylabel('Tokens in bucket', color='purple')
        ax2.tick_params(axis='y', labelcolor='purple')
        
        info = contract_keys.get(product_id, {})
        ax1.set_title(f'Token Bucket Behavior: {product_id}\n{info.get("description", "")}')
        
        # Stats
        success = sum(code == 200 for code in status_codes)
        throttled = sum(code == 429 for code in status_codes)
        print(f"{product_id}: Calls: {len(status_codes)} | Success: {success} | Throttled: {throttled}")
    
    # Add legend to first subplot
    legend_items = [
        Patch(facecolor='tab:green', alpha=0.7, label='Success (200)'),
        Line2D([0], [0], color='purple', linewidth=2, label='Bucket level'),
        Line2D([0], [0], color='purple', linestyle='--', label='Capacity'),
        Line2D([0], [0], marker='o', color='darkred', markersize=8, linestyle='None',
               markerfacecolor='darkred', markeredgecolor='white', label='Throttled (429)')
    ]
    axes[0, 0].legend(handles=legend_items, loc='upper right', bbox_to_anchor=(0.98, 0.85), framealpha=0.9)
    
    plt.tight_layout()
    plt.show()
else:
    print('Run the load test first to capture all_api_runs data.')
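
The bucket level plotted above comes from a simple simulation: before each call the bucket refills in proportion to the time elapsed, capped at capacity, and then the call's tokens are deducted. The sketch below isolates that logic without any plotting. `simulate_bucket` is a hypothetical helper; the defaults mirror the 400 tokens per minute assumed in the cell above, which models the policy rather than reading its actual state from APIM.

```python
def simulate_bucket(calls: list, capacity: float = 400.0, refill_per_sec: float = None) -> list:
    """Return the bucket level seen by each call, given (elapsed_seconds, tokens) pairs."""
    if refill_per_sec is None:
        refill_per_sec = capacity / 60  # a full refill takes one minute
    bucket = capacity
    last_time = 0.0
    levels = []
    for call_time, tokens in calls:
        # Refill for the time that passed since the previous call, capped at capacity.
        bucket = min(capacity, bucket + (call_time - last_time) * refill_per_sec)
        levels.append(bucket)
        # Consume this call's tokens; the level never drops below zero.
        bucket = max(0.0, bucket - tokens)
        last_time = call_time
    return levels


# Invented calls: (elapsed_seconds, tokens_used)
levels = simulate_bucket([(0.0, 100), (3.0, 100), (6.0, 300)])
print([round(level, 1) for level in levels])
```

A call that uses more tokens than the bucket holds empties it, which is the point at which a real limiter would begin returning 429 responses.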

<a id='cleanup'></a>
### 🧹 Cleanup (Optional)

Remove the test access contracts that this notebook session created in APIM.

> **Note:** This does not delete the secrets created in Azure Key Vault or the Microsoft Foundry connections.

In [None]:
# Set to True to delete the access contracts created in this session
cleanup_enabled = True

if cleanup_enabled:
    for result in deployment_results:
        if not result['success']:
            continue
        
        contract = result['contract']
        product_id = f"LLM-{contract['business_unit']}-{contract['use_case_name']}-{contract['environment']}"
        subscription_name = f"{product_id}-SUB-01"
        
        utils.print_info(f"Deleting {product_id}...")
        
        # Delete product and its associated subscriptions
        prod_cmd = f"az apim product delete --resource-group {governance_hub_resource_group} --service-name {apimClientTool.apim_resource_name} --product-id {product_id} --delete-subscriptions true --yes"
        utils.run(prod_cmd, f"Deleted product {product_id}", f"Failed to delete product {product_id}")
    
    utils.print_ok("Cleanup completed!")
else:
    utils.print_info("Cleanup is disabled. Set cleanup_enabled = True to remove test resources.")