# Fabric Capacity Utilization Monitoring

**Production monitoring solution** that collects capacity inventory and workload data from the Microsoft Fabric REST APIs and sends it to Azure Log Analytics.

## What This Monitors
- ✅ **Capacity Inventory**: Basic capacity details, workspace counts, item distribution
- ✅ **Workload Data**: Item-based workload tracking across all workspaces
- ✅ **Workspace Events**: Workspace lifecycle and item details

## Data Streams
- `Custom-FabricCapacityMetrics_CL` - Capacity inventory metrics (SKU, region, state, workspace count)
- `Custom-FabricCapacityWorkloads_CL` - Workload and item inventory
- `Custom-FabricWorkspaceEvents_CL` - Workspace events and changes
- `Custom-FabricItemDetails_CL` - Enhanced item metadata

In [None]:
# Install required packages
%pip install --quiet msal requests azure-identity azure-keyvault-secrets python-dotenv

In [None]:
# Configuration Parameters
import os
from dotenv import load_dotenv
load_dotenv()

# Capacity Configuration
capacity_ids = ["95e3bf4e-6e0e-427d-9c78-88823f6d507c"]  # Leave empty for auto-discovery

# Azure Log Analytics Configuration (from your Terraform/Bicep deployment)
dcr_endpoint_host = os.getenv("DCR_ENDPOINT_HOST")
dcr_immutable_id = os.getenv("DCR_IMMUTABLE_ID")
stream_capacity_metrics = "Custom-FabricCapacityMetrics_CL"
stream_capacity_workloads = "Custom-FabricCapacityWorkloads_CL"
stream_workspace_events = "Custom-FabricWorkspaceEvents_CL"
stream_item_details = "Custom-FabricItemDetails_CL"

# Authentication Configuration
tenant_id = os.getenv("FABRIC_TENANT_ID")
client_id = os.getenv("FABRIC_APP_ID")
client_secret = os.getenv("FABRIC_APP_SECRET")

# Key Vault Configuration (optional)
use_key_vault = False
key_vault_uri = os.getenv("AZURE_KEY_VAULT_URI", "https://kaydemokeyvault.vault.azure.net/")
key_vault_secret_name = os.getenv("AZURE_KEY_VAULT_SECRET_NAME", "FabricServicePrincipal")

# Validation: map each required setting to the environment variable it comes from
required = {
    "FABRIC_TENANT_ID": tenant_id,
    "FABRIC_APP_ID": client_id,
    "DCR_ENDPOINT_HOST": dcr_endpoint_host,
    "DCR_IMMUTABLE_ID": dcr_immutable_id,
}
if not use_key_vault:
    # The secret must come from the environment when Key Vault is disabled
    required["FABRIC_APP_SECRET"] = client_secret

missing = [name for name, value in required.items() if not value]
if missing:
    raise ValueError(f"Missing required environment variables: {', '.join(missing)}")

print("✅ Configuration loaded successfully")
print(f"📊 Monitoring: {'Specific capacities' if capacity_ids else 'All accessible capacities'}")

In [None]:
# Core Functions
import json
import datetime as dt
import requests
from typing import List, Dict, Any

FABRIC_SCOPE = "https://api.fabric.microsoft.com/.default"
MONITOR_SCOPE = "https://monitor.azure.com/.default"
FABRIC_API = "https://api.fabric.microsoft.com/v1"

def acquire_token(tenant: str, client_id: str, client_secret: str, scope: str) -> str:
    """Acquire OAuth token for API access"""
    import msal
    authority = f"https://login.microsoftonline.com/{tenant}"
    app = msal.ConfidentialClientApplication(client_id, authority=authority, client_credential=client_secret)
    result = app.acquire_token_for_client(scopes=[scope])
    if "access_token" not in result:
        raise RuntimeError(f"Failed to acquire token: {result}")
    return result["access_token"]

def get_secret_from_key_vault(vault_uri: str, secret_name: str) -> str:
    """Get secret from Azure Key Vault using DefaultAzureCredential (managed identity, environment, Azure CLI, etc.)"""
    from azure.keyvault.secrets import SecretClient
    from azure.identity import DefaultAzureCredential
    credential = DefaultAzureCredential()
    client = SecretClient(vault_url=vault_uri, credential=credential)
    return client.get_secret(secret_name).value

def iso_now() -> str:
    """Get current UTC timestamp in ISO 8601 format with a Z suffix"""
    # datetime.utcnow() is deprecated since Python 3.12; request an aware UTC time directly
    return dt.datetime.now(dt.timezone.utc).isoformat().replace("+00:00", "Z")

def post_to_log_analytics(endpoint_host: str, dcr_id: str, stream_name: str, data: List[Dict], token: str) -> Dict:
    """Send data to Azure Log Analytics via Data Collection Rule"""
    if not data:
        return {"sent": 0, "batches": 0}
    
    url = f"https://{endpoint_host}/dataCollectionRules/{dcr_id}/streams/{stream_name}?api-version=2023-01-01"
    headers = {
        "Authorization": f"Bearer {token}", 
        "Content-Type": "application/json"
    }
    
    # Split large payloads into batches (max 1MB per request)
    MAX_BYTES = 950_000
    batches = []
    current_batch = []
    current_size = 2  # Start with array brackets
    
    for record in data:
        record_size = len(json.dumps(record, separators=(",", ":")))
        if current_size + record_size + (1 if current_batch else 0) > MAX_BYTES:
            if current_batch:
                batches.append(current_batch)
                current_batch = []
                current_size = 2
        current_batch.append(record)
        current_size += record_size + (1 if len(current_batch) > 1 else 0)
    
    if current_batch:
        batches.append(current_batch)
    
    # Send each batch
    total_sent = 0
    for i, batch in enumerate(batches, 1):
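        # Serialize with the same compact separators used for the size estimate above,
        # otherwise the payload sent is larger than the size that was measured
        response = requests.post(url, headers=headers, data=json.dumps(batch, separators=(",", ":")), timeout=60)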
        if response.status_code >= 400:
            raise RuntimeError(f"Batch {i} failed ({response.status_code}): {response.text[:200]}")
        total_sent += len(batch)
    
    return {"sent": total_sent, "batches": len(batches)}

print("✅ Core functions loaded")
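
The batching step inside `post_to_log_analytics` can also be written as a standalone pure function, which makes it checkable without any network access. The sketch below is one possible shape, not the notebook's own implementation: `split_into_batches` is a hypothetical name, and the 950,000-byte default simply mirrors `MAX_BYTES` above.

```python
import json
from typing import Any, Dict, List


def split_into_batches(records: List[Dict[str, Any]], max_bytes: int = 950_000) -> List[List[Dict[str, Any]]]:
    """Group records so each batch serializes to at most max_bytes of compact JSON.

    A single record larger than max_bytes still gets a batch of its own.
    """
    batches: List[List[Dict[str, Any]]] = []
    current: List[Dict[str, Any]] = []
    current_size = 2  # the enclosing "[" and "]"

    for record in records:
        # Measure the UTF-8 size of the compact encoding, the same form that should be sent
        size = len(json.dumps(record, separators=(",", ":")).encode("utf-8"))
        separator = 1 if current else 0  # comma between records
        if current and current_size + separator + size > max_bytes:
            batches.append(current)
            current, current_size, separator = [], 2, 0
        current.append(record)
        current_size += separator + size

    if current:
        batches.append(current)
    return batches


# Small illustration with made-up records and an artificially low limit
sample_records = [{"k": "x" * 10} for _ in range(5)]
sample_batches = split_into_batches(sample_records, max_bytes=40)
```

Because the size is measured on the compact encoding, the request body has to be serialized with `separators=(",", ":")` as well for the limit to hold.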

In [None]:
# Data Collection Functions

def get_capacities(token: str) -> List[Dict]:
    """Get all accessible Fabric capacities"""
    headers = {"Authorization": f"Bearer {token}"}
    response = requests.get(f"{FABRIC_API}/capacities", headers=headers, timeout=60)
    response.raise_for_status()
    return response.json().get("value", [])

def get_workspaces(token: str) -> List[Dict]:
    """Get all accessible workspaces"""
    headers = {"Authorization": f"Bearer {token}"}
    response = requests.get(f"{FABRIC_API}/workspaces", headers=headers, timeout=60)
    response.raise_for_status()
    return response.json().get("value", [])

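def get_workspace_items(workspace_id: str, token: str) -> List[Dict]:
    """Get the items in a workspace; returns [] if the request does not succeed"""
    headers = {"Authorization": f"Bearer {token}"}
    response = requests.get(f"{FABRIC_API}/workspaces/{workspace_id}/items", headers=headers, timeout=60)
    if response.status_code == 200:
        return response.json().get("value", [])
    # Non-200 responses (for example, no access to the workspace) are skipped but reported
    print(f"   ⚠️  Could not list items for workspace {workspace_id} (HTTP {response.status_code})")
    return []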

def collect_capacity_metrics(capacities: List[Dict], workspaces: List[Dict]) -> List[Dict]:
    """Transform capacity data into metrics format"""
    metrics = []
    
    for capacity in capacities:
        capacity_id = capacity.get('id')
        capacity_workspaces = [ws for ws in workspaces if ws.get('capacityId') == capacity_id]
        
        # Determine capacity type
        sku = capacity.get('sku', '').upper()
        if sku.startswith('F'):
            capacity_type = "Fabric (F-SKU)"
        elif sku.startswith('P') and not sku.startswith('PP'):
            capacity_type = "Premium (P-SKU)"
        elif 'PREMIUM PER USER' in capacity.get('displayName', '').upper():
            capacity_type = "Premium Per User (PPU)"
        else:
            capacity_type = f"Other ({sku})"
        
        metric = {
            "TimeGenerated": iso_now(),
            "CapacityId": capacity_id,
            "CapacityName": capacity.get('displayName', 'Unknown'),
            "CapacityType": capacity_type,
            "CapacitySku": sku,
            "Region": capacity.get('region', 'Unknown'),
            "State": capacity.get('state', 'Unknown'),
            "WorkspaceCount": len(capacity_workspaces),
            "MonitoringMethod": "workspace-based"
        }
        metrics.append(metric)
    
    return metrics

def collect_workspace_workloads(workspaces: List[Dict], token: str) -> List[Dict]:
    """Collect workload data from workspace items"""
    workloads = []
    
    for workspace in workspaces:
        workspace_id = workspace.get('id')
        workspace_name = workspace.get('displayName', 'Unknown')
        capacity_id = workspace.get('capacityId', '')
        
        items = get_workspace_items(workspace_id, token)
        
        for item in items:
            workload = {
                "TimeGenerated": iso_now(),
                "CapacityId": capacity_id,
                "WorkspaceId": workspace_id,
                "WorkspaceName": workspace_name,
                "WorkloadType": item.get('type', 'Unknown'),
                "ItemId": item.get('id'),
                "ItemName": item.get('displayName', 'Unknown'),
                "ItemDescription": item.get('description', ''),
                "State": "Active"  # Items exist = active
            }
            workloads.append(workload)
    
    return workloads

def collect_workspace_events(workspaces: List[Dict], token: str) -> List[Dict]:
    """Collect workspace event data"""
    events = []
    
    for workspace in workspaces:
        workspace_id = workspace.get('id')
        items = get_workspace_items(workspace_id, token)
        
        # Count items by type
        item_types = {}
        for item in items:
            item_type = item.get('type', 'Unknown')
            item_types[item_type] = item_types.get(item_type, 0) + 1
        
        event = {
            "TimeGenerated": iso_now(),
            "EventType": "WorkspaceInventory",
            "WorkspaceId": workspace_id,
            "WorkspaceName": workspace.get('displayName', 'Unknown'),
            "WorkspaceType": workspace.get('type', 'Unknown'),
            "CapacityId": workspace.get('capacityId', ''),
            "ItemCount": len(items),
            "ItemTypes": json.dumps(item_types),
            # True only for a non-empty capacity ID (missing, None, and '' all count as False)
            "IsOnDedicatedCapacity": bool(workspace.get('capacityId'))
        }
        events.append(event)
    
    return events

def collect_item_details(workspaces: List[Dict], token: str) -> List[Dict]:
    """Collect detailed item information"""
    details = []
    
    for workspace in workspaces:
        workspace_id = workspace.get('id')
        workspace_name = workspace.get('displayName', 'Unknown')
        capacity_id = workspace.get('capacityId', '')
        
        items = get_workspace_items(workspace_id, token)
        
        for item in items:
            detail = {
                "TimeGenerated": iso_now(),
                "WorkspaceId": workspace_id,
                "WorkspaceName": workspace_name,
                "CapacityId": capacity_id,
                "ItemId": item.get('id'),
                "ItemName": item.get('displayName', 'Unknown'),
                "ItemType": item.get('type', 'Unknown'),
                "ItemDescription": item.get('description', ''),
                # bool() treats None and '' alike; the previous `!= ''` check reported True for None
                "IsOnDedicatedCapacity": bool(capacity_id)
            }
            details.append(detail)
    
    return details

print("✅ Data collection functions loaded")
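
The list helpers above read only the `value` array of a single response. If the Fabric list endpoints return results in pages and signal the next page with a `continuationToken` field in the response body (an assumption here that should be checked against the Fabric REST API reference), a loop along these lines would gather every page. `list_all_pages` and `fetch_page` are hypothetical names; `fetch_page` stands in for whatever performs the HTTP call.

```python
from typing import Any, Callable, Dict, List, Optional

Page = Dict[str, Any]


def list_all_pages(fetch_page: Callable[[Optional[str]], Page]) -> List[Dict[str, Any]]:
    """Collect the "value" entries of every page until no continuation token is returned.

    fetch_page receives the token from the previous page (None for the first call)
    and returns the decoded JSON body of one response.
    """
    results: List[Dict[str, Any]] = []
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        results.extend(page.get("value", []))
        token = page.get("continuationToken")  # assumed field name
        if not token:
            return results


# In-memory stand-in for the HTTP call, using made-up data
fake_pages = {
    None: {"value": [{"id": "a"}], "continuationToken": "t1"},
    "t1": {"value": [{"id": "b"}]},
}
all_items = list_all_pages(lambda token: fake_pages[token])
```

In the notebook, `fetch_page` could wrap the existing `requests.get` call and pass the token as a query parameter; the parameter name would also need to be confirmed against the API reference.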

In [None]:
# Authentication Setup
print("🔐 Setting up authentication...")

# Get client secret
if use_key_vault:
    print("   📋 Getting secret from Key Vault...")
    client_secret = get_secret_from_key_vault(key_vault_uri, key_vault_secret_name)
else:
    print("   📋 Using environment variable...")

# Acquire tokens
print("   🎫 Acquiring Fabric API token...")
fabric_token = acquire_token(tenant_id, client_id, client_secret, FABRIC_SCOPE)

print("   🎫 Acquiring Monitor API token...")
monitor_token = acquire_token(tenant_id, client_id, client_secret, MONITOR_SCOPE)

print("✅ Authentication completed")

In [None]:
# Data Collection
print("📊 Starting data collection...")

# Get base data
print("   🏢 Fetching capacities...")
all_capacities = get_capacities(fabric_token)

print("   📁 Fetching workspaces...")
all_workspaces = get_workspaces(fabric_token)

# Filter capacities if specific IDs provided
if capacity_ids:
    target_ids = [cid.lower() for cid in capacity_ids]
    capacities_to_monitor = [cap for cap in all_capacities 
                           if cap.get('id', '').lower() in target_ids]
    print(f"   🎯 Filtered to {len(capacities_to_monitor)} specific capacities")
else:
    capacities_to_monitor = all_capacities
    print(f"   🌐 Monitoring all {len(capacities_to_monitor)} accessible capacities")

# Collect monitoring data
print("   📈 Collecting capacity metrics...")
capacity_metrics = collect_capacity_metrics(capacities_to_monitor, all_workspaces)

print("   💼 Collecting workload data...")
workload_data = collect_workspace_workloads(all_workspaces, fabric_token)

print("   📋 Collecting workspace events...")
workspace_events = collect_workspace_events(all_workspaces, fabric_token)

print("   📦 Collecting item details...")
item_details = collect_item_details(all_workspaces, fabric_token)

# Summary
print("\n📊 Collection Summary:")
print(f"   Capacity metrics: {len(capacity_metrics)} records")
print(f"   Workload data: {len(workload_data)} records")
print(f"   Workspace events: {len(workspace_events)} records")
print(f"   Item details: {len(item_details)} records")

print("✅ Data collection completed")
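
`collect_workspace_workloads`, `collect_workspace_events`, and `collect_item_details` each call `get_workspace_items` for every workspace, so each item list is requested three times per run. One way to avoid the repeated calls is to fetch the items once and share the result. The sketch below shows only that caching step; `fetch_items_once` is a hypothetical helper, and the collectors would need a small change to read from the returned dictionary instead of calling the API themselves.

```python
from typing import Any, Callable, Dict, List


def fetch_items_once(
    workspaces: List[Dict[str, Any]],
    fetch_items: Callable[[str], List[Dict[str, Any]]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Call fetch_items exactly once per distinct workspace ID and key the results by ID."""
    cache: Dict[str, List[Dict[str, Any]]] = {}
    for workspace in workspaces:
        workspace_id = workspace.get("id")
        # Skip entries without an ID and IDs that were already fetched
        if workspace_id and workspace_id not in cache:
            cache[workspace_id] = fetch_items(workspace_id)
    return cache
```

With the notebook's own names, the call might look like `fetch_items_once(all_workspaces, lambda ws_id: get_workspace_items(ws_id, fabric_token))`.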

In [None]:
# Send Data to Log Analytics
print("📤 Sending data to Azure Log Analytics...")

results = {}

# Send capacity metrics
if capacity_metrics:
    print(f"   📈 Sending {len(capacity_metrics)} capacity metrics...")
    result = post_to_log_analytics(dcr_endpoint_host, dcr_immutable_id, 
                                 stream_capacity_metrics, capacity_metrics, monitor_token)
    results["capacity_metrics"] = result
    print(f"      ✅ Sent {result['sent']} records in {result['batches']} batches")

# Send workload data
if workload_data:
    print(f"   💼 Sending {len(workload_data)} workload records...")
    result = post_to_log_analytics(dcr_endpoint_host, dcr_immutable_id, 
                                 stream_capacity_workloads, workload_data, monitor_token)
    results["workload_data"] = result
    print(f"      ✅ Sent {result['sent']} records in {result['batches']} batches")

# Send workspace events (if DCR stream exists)
if workspace_events:
    try:
        print(f"   📋 Sending {len(workspace_events)} workspace events...")
        result = post_to_log_analytics(dcr_endpoint_host, dcr_immutable_id, 
                                     stream_workspace_events, workspace_events, monitor_token)
        results["workspace_events"] = result
        print(f"      ✅ Sent {result['sent']} records in {result['batches']} batches")
    except RuntimeError as e:
        if "not configured" in str(e):
            print(f"      ⚠️  Stream {stream_workspace_events} not configured in DCR - skipping")
        else:
            raise

# Send item details (if DCR stream exists)
if item_details:
    try:
        print(f"   📦 Sending {len(item_details)} item details...")
        result = post_to_log_analytics(dcr_endpoint_host, dcr_immutable_id, 
                                     stream_item_details, item_details, monitor_token)
        results["item_details"] = result
        print(f"      ✅ Sent {result['sent']} records in {result['batches']} batches")
    except RuntimeError as e:
        if "not configured" in str(e):
            print(f"      ⚠️  Stream {stream_item_details} not configured in DCR - skipping")
        else:
            raise

print("\n✅ Data ingestion completed successfully!")
print("\n📋 Final Results:")
for stream_type, result in results.items():
    print(f"   {stream_type}: {result['sent']} records sent")
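
`post_to_log_analytics` raises on the first batch that returns an error status, so a single transient failure ends the run. A minimal retry wrapper with exponential backoff is sketched below. It is not part of the notebook: `TransientError` and `with_retries` are hypothetical names, and the caller would have to decide which responses count as transient (for example throttling or server-side errors) and raise `TransientError` for those.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def with_retries(
    action: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run action, retrying on TransientError with delays of base_delay * 2**n seconds."""
    for attempt in range(attempts):
        try:
            return action()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter exists so the delay can be replaced in tests; in normal use the default `time.sleep` applies.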

## ✅ Monitoring Complete

### Infrastructure Setup
**Important**: Before running this notebook, deploy the required Azure infrastructure using:
- **Terraform**: `logAnalytics/terraform/` folder
- **Bicep**: `logAnalytics/bicep/` folder

Both create:
- Log Analytics Workspace with custom tables
- Data Collection Endpoint (DCE)
- Data Collection Rules (DCR) with all required streams

### Data Collected
- **Capacity Metrics**: Basic capacity inventory with workspace counts
- **Workload Data**: Item-based workload tracking across workspaces  
- **Workspace Events**: Workspace inventory and item distribution
- **Item Details**: Enhanced metadata for all Fabric items

### Next Steps
1. **Deploy Infrastructure**: Use Terraform or Bicep templates first
2. **Configure Environment**: Set `DCR_ENDPOINT_HOST` and `DCR_IMMUTABLE_ID` from deployment outputs
3. **Schedule this notebook** to run regularly (daily/weekly)
4. **Build dashboards** in Azure Monitor/Power BI using the collected data
5. **Set up alerts** based on capacity utilization trends

### KQL Query Examples
```kql
// Capacity overview (latest record per capacity)
FabricCapacityMetrics_CL
| summarize arg_max(TimeGenerated, *) by CapacityId
| project CapacityName, CapacityType, WorkspaceCount, Region

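// Item distribution by type (latest record per item, so repeated runs are not double-counted)
FabricCapacityWorkloads_CL
| summarize arg_max(TimeGenerated, *) by ItemId
| summarize count() by WorkloadType, CapacityId
| order by count_ desc

// Current item count per workspace (latest snapshot within the last 7 days)
FabricWorkspaceEvents_CL
| where TimeGenerated > ago(7d)
| summarize arg_max(TimeGenerated, ItemCount) by WorkspaceId, WorkspaceName
| project WorkspaceName, TotalItems = ItemCount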
```

### Infrastructure Files
- `logAnalytics/terraform/` - Terraform templates for Azure resources
- `logAnalytics/bicep/` - Bicep templates for Azure resources
- `logAnalytics/common/` - Shared JSON templates for table and DCR definitions