# Fabric Spark Monitoring with Log Analytics Integration

## Overview
This notebook demonstrates Spark monitoring for Microsoft Fabric using the **Livy Sessions** and **Resource Usage** APIs, and ingests the collected data into Azure Monitor Log Analytics.

### 🎯 What This Notebook Collects

#### Livy Sessions Monitoring
1. **Workspace Livy Sessions** - All interactive Spark sessions in the workspace
2. **Notebook Sessions** - Spark sessions launched from notebooks
3. **Spark Job Sessions** - Sessions from Spark job definitions
4. **Lakehouse Sessions** - Sessions accessing Lakehouse data
5. **Session Logs** - Driver and executor logs per session
6. **History Metrics** - Performance metrics from Spark History Server

#### Resource Usage Monitoring
7. **Spark Resource Usage** - CPU, memory, disk, network metrics
8. **Active Session Resources** - Real-time resource tracking
9. **Bottleneck Detection** - Identify CPU/memory/disk/network issues
10. **Capacity Analysis** - Executor efficiency and resource utilization

### 📊 Data Ingestion Streams
All data is sent to Azure Monitor via Data Collection Rules (DCR):
- `FabricSparkLivySession_CL` - Session metadata (17 columns)
- `FabricSparkLogs_CL` - Driver/executor logs (13 columns)
- `FabricSparkHistoryMetrics_CL` - Performance metrics (19 columns)
- `FabricSparkResourceUsage_CL` - Resource utilization (24 columns)
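
The stream names that the notebook passes to the ingestion client carry a `Custom-` prefix in front of the table name; the configuration output in this notebook shows `Custom-FabricSparkLivySession_CL` and `Custom-FabricSparkResourceUsage_CL`. A minimal sketch of that mapping follows. `stream_name_for` is a hypothetical helper, not part of FabricLA-Connector, and the stream names for the logs and history-metrics tables are assumed to follow the same pattern.

```python
# Hypothetical helper (not part of fabricla_connector): derive the DCR stream
# name from a custom table name by adding the "Custom-" prefix.
TABLES = [
    "FabricSparkLivySession_CL",
    "FabricSparkLogs_CL",            # stream name assumed, not shown in this notebook
    "FabricSparkHistoryMetrics_CL",  # stream name assumed, not shown in this notebook
    "FabricSparkResourceUsage_CL",
]


def stream_name_for(table_name: str) -> str:
    """Return the stream name for a table, leaving already-prefixed names untouched."""
    prefix = "Custom-"
    return table_name if table_name.startswith(prefix) else prefix + table_name


STREAMS = {table: stream_name_for(table) for table in TABLES}
```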

### Prerequisites
- ✅ Azure authentication (DefaultAzureCredential or service principal)
- ✅ Infrastructure deployed (DCE, DCR, Log Analytics workspace)
- ✅ `.env` file configured with Azure Monitor settings
- ✅ Fabric workspace with active Spark sessions

### Value Proposition
- 🎯 **Complete Spark Observability** - Sessions, logs, metrics, resources
- 📈 **Performance Analysis** - Identify slow jobs, resource bottlenecks
- 🚨 **Proactive Alerting** - Detect failures and resource exhaustion
- 💰 **Cost Optimization** - Track executor usage and capacity waste
- 📊 **Historical Trends** - Long-term performance and capacity planning

## Setup and Configuration

### Load Environment Configuration

Load environment variables from `.env` file and verify Azure Monitor configuration.

In [54]:
import os
from pathlib import Path
from dotenv import load_dotenv

print("=" * 80)
print("⚙️ LOADING ENVIRONMENT CONFIGURATION")
print("=" * 80)

# Try to load .env from multiple locations
env_locations = [
    Path.cwd() / ".env",  # Current directory
    Path.cwd().parent / ".env",  # Parent directory (if running from notebooks/)
    (
        Path(__file__).parent.parent / ".env" if "__file__" in globals() else None
    ),  # Project root
]

env_loaded = False
env_file_path = None

for env_path in env_locations:
    if env_path and env_path.exists():
        print(f"📁 Found .env file: {env_path}")
        load_dotenv(env_path, override=True)
        env_loaded = True
        env_file_path = env_path
        break

if not env_loaded:
    print("⚠️ No .env file found in expected locations")
    print("   Searched:")
    for loc in env_locations:
        if loc:
            print(f"   - {loc}")
    print("\n💡 Create .env file from .env.example:")
    print("   cp .env.example .env")
else:
    print(f"✅ Environment loaded from: {env_file_path}\n")

    # Validate critical environment variables
    required_vars = {
        "Azure Monitor": [
            ("AZURE_MONITOR_DCE_ENDPOINT", "DCE endpoint URL"),
            ("AZURE_MONITOR_DCR_IMMUTABLE_ID", "DCR immutable ID"),
            ("LOG_ANALYTICS_WORKSPACE_NAME", "Log Analytics workspace name"),
        ],
        "Fabric Workspace": [
            ("FABRIC_WORKSPACE_ID", "Workspace GUID"),
            ("FABRIC_WORKSPACE_NAME", "Workspace name"),
        ],
        "Stream Names": [
            ("AZURE_MONITOR_STREAM_LIVY_SESSION", "Livy session stream"),
            ("AZURE_MONITOR_STREAM_RESOURCE_USAGE", "Resource usage stream"),
        ],
    }

    print("🔍 Validating Configuration:")
    all_valid = True

    for category, vars_list in required_vars.items():
        print(f"\n   {category}:")
        for var_name, description in vars_list:
            value = os.getenv(var_name)
            if value and value != f"your-{var_name.lower().replace('_', '-')}-here":
                status = "✅"
                display_value = value[:50] + "..." if len(value) > 50 else value
            else:
                status = "❌"
                display_value = "Not configured"
                all_valid = False
            print(f"      {status} {description}: {display_value}")

    print("\n" + "=" * 80)
    if all_valid:
        print("✅ All required environment variables are configured!")
    else:
        print("⚠️ Some environment variables need to be configured in .env file")
        print("   Edit .env and add your Azure Monitor and Fabric workspace details")
    print("=" * 80)

⚙️ LOADING ENVIRONMENT CONFIGURATION
📁 Found .env file: c:\Dvlp\fabric-la-connector\notebooks\.env
✅ Environment loaded from: c:\Dvlp\fabric-la-connector\notebooks\.env

🔍 Validating Configuration:

   Azure Monitor:
      ✅ DCE endpoint URL: https://dce-fabric-monitoring-lede.canadacentral-1...
      ✅ DCR immutable ID: dcr-6987822159f748c38d622d990a60351c
      ✅ Log Analytics workspace name: law-fabric-monitoring

   Fabric Workspace:
      ✅ Workspace GUID: 8457f746-f2d9-4d27-8221-5714601e40c6
      ✅ Workspace name: YourWorkspaceName

   Stream Names:
      ✅ Livy session stream: Custom-FabricSparkLivySession_CL
      ✅ Resource usage stream: Custom-FabricSparkResourceUsage_CL

✅ All required environment variables are configured!
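
The validation cell compares each value against a placeholder derived from the variable name (for example `your-fabric-workspace-id-here`), so it would not catch the shorter `your-workspace-id-here` default that a later cell uses for `FABRIC_WORKSPACE_ID`. A looser check could be sketched as follows, assuming placeholders in `.env.example` follow a `your-...-here` pattern; `is_configured` is a hypothetical name.

```python
import re

# Assumption: template placeholders look like "your-<something>-here".
_PLACEHOLDER = re.compile(r"^your-.+-here$", re.IGNORECASE)


def is_configured(value):
    """Return True when value is set and does not look like a template placeholder."""
    if value is None:
        return False
    value = value.strip()
    return bool(value) and not _PLACEHOLDER.match(value)
```

In the validation loop, `if is_configured(os.getenv(var_name)):` would replace the string comparison.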


In [55]:
# Import required modules
import os
import json
from datetime import datetime, timedelta

# Import FabricLA-Connector components
from fabricla_connector import workflows
from fabricla_connector.config import validate_config
from fabricla_connector.collectors import (
    # Livy Sessions collectors
    collect_livy_sessions_workspace,
    collect_livy_sessions_notebook,
    collect_livy_sessions_sparkjob,
    collect_livy_sessions_lakehouse,
    collect_spark_logs,
    collect_spark_metrics,
    # Resource Usage collectors
    collect_spark_resource_usage,
    collect_resource_usage_for_active_sessions,
    # General Spark collectors
    collect_spark_applications_workspace,
    collect_spark_applications_item,
)

print("=" * 80)
print("📦 FABRICLA-CONNECTOR SPARK MONITORING")
print("=" * 80)
print("✅ Modules imported successfully")
print("✅ Phase 1 APIs: Livy Sessions, Logs, Metrics")
print("✅ Phase 4 APIs: Resource Usage, Bottleneck Detection")
print("=" * 80)

📦 FABRICLA-CONNECTOR SPARK MONITORING
✅ Modules imported successfully
✅ Phase 1 APIs: Livy Sessions, Logs, Metrics
✅ Phase 4 APIs: Resource Usage, Bottleneck Detection


### Validate Configuration

Verify Azure Monitor and Fabric workspace settings are properly configured.

In [56]:
# Configuration and validation
print("⚙️ Validating configuration...\n")

# Validate all configuration sections
config_status = validate_config("all")
print(f"Configuration Status: {'✅ Valid' if config_status else '❌ Invalid'}\n")

# Check Azure Monitor configuration
dce_endpoint = os.getenv("AZURE_MONITOR_DCE_ENDPOINT")
dcr_id = os.getenv("AZURE_MONITOR_DCR_IMMUTABLE_ID")
workspace_name = os.getenv("LOG_ANALYTICS_WORKSPACE_NAME")

print("🔧 Azure Monitor Configuration:")
print(
    f"   DCE Endpoint: {dce_endpoint[:50]}..."
    if dce_endpoint
    else "   DCE Endpoint: ❌ Not configured"
)
print(f"   DCR ID: {dcr_id}" if dcr_id else "   DCR ID: ❌ Not configured")
print(
    f"   Log Analytics: {workspace_name}"
    if workspace_name
    else "   Log Analytics: ❌ Not configured"
)

# Set workspace ID (update with your workspace ID)
WORKSPACE_ID = os.getenv("FABRIC_WORKSPACE_ID", "your-workspace-id-here")
WORKSPACE_NAME = os.getenv("FABRIC_WORKSPACE_NAME", "YourWorkspace")

if WORKSPACE_ID == "your-workspace-id-here":
    print("\n⚠️ Please update FABRIC_WORKSPACE_ID in .env file")
else:
    print(f"\n🎯 Target workspace: {WORKSPACE_NAME} ({WORKSPACE_ID})")

# Collection configuration
LOOKBACK_HOURS = int(os.getenv("FABRIC_LOOKBACK_HOURS", "24"))

print(f"\n📊 Collection Settings:")
print(f"   Lookback: {LOOKBACK_HOURS} hours")
print("=" * 80)

⚙️ Validating configuration...

Configuration Status: ✅ Valid

🔧 Azure Monitor Configuration:
   DCE Endpoint: https://dce-fabric-monitoring-lede.canadacentral-1...
   DCR ID: dcr-6987822159f748c38d622d990a60351c
   Log Analytics: law-fabric-monitoring

🎯 Target workspace: YourWorkspaceName (8457f746-f2d9-4d27-8221-5714601e40c6)

📊 Collection Settings:
   Lookback: 43200 hours


### Validate Spark Availability

Quick check to verify Spark is enabled in the workspace.

In [57]:
print("Validating Spark availability...")
print(f"Workspace: {WORKSPACE_ID}")
print(f"Endpoint: /v1/workspaces/{WORKSPACE_ID}/spark/livySessions\n")

try:
    test_generator = collect_livy_sessions_workspace(
        workspace_id=WORKSPACE_ID, 
        lookback_hours=1
    )
    
    test_sessions = list(test_generator)
    
    print(f"SUCCESS - Spark API accessible")
    print(f"Sessions found: {len(test_sessions)}\n")
    
    if len(test_sessions) == 0:
        print("Note: No active sessions in last hour (this is normal)")
        print("      Collections will work once Spark sessions exist\n")
    
    SPARK_AVAILABLE = True

except Exception as e:
    error_msg = str(e)
    print(f"ERROR: {error_msg}\n")
    
    if "401" in error_msg or "403" in error_msg:
        SPARK_AVAILABLE = False
        print("Authentication issue - check service principal permissions")
    elif "404" in error_msg:
        print("Workspace not found or Spark not enabled")
        print("Will attempt collection anyway...")
        SPARK_AVAILABLE = True
    else:
        print("Unexpected error - will attempt collection anyway")
        SPARK_AVAILABLE = True

print("=" * 60)
print("READY" if SPARK_AVAILABLE else "BLOCKED - Fix auth first")
print("=" * 60)

Validating Spark availability...
Workspace: 8457f746-f2d9-4d27-8221-5714601e40c6
Endpoint: /v1/workspaces/8457f746-f2d9-4d27-8221-5714601e40c6/spark/livySessions

[Auth] Fabric authentication not available: No module named 'notebookutils'
[Auth] Using service principal authentication
[Auth] SUCCESS: Using credentials from Environment Variables


SUCCESS: Token acquired for https://api.fabric.microsoft.com/.default: eyJ0eXAiOi...KAZ-WNen9A
[DEBUG] API call: Workspace Livy Sessions - 8457f746-f2d9-4d27-8221-5714601e40c6 - Status: 200
Found 16 Livy sessions
Collected 0 sessions
SUCCESS - Spark API accessible
Sessions found: 0

Note: No active sessions in last hour (this is normal)
      Collections will work once Spark sessions exist

READY
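
The availability check classifies failures by looking for `"401"`, `"403"` or `"404"` anywhere in the error text, which can also match digits inside a GUID or request ID. A stricter variant might extract a standalone status code first. This is a sketch under the assumption that the connector's error messages contain the HTTP status as a separate token, as in the messages shown in this notebook; `classify_api_error` is a hypothetical helper.

```python
import re

# Match a standalone 4xx/5xx status, not digits embedded in a GUID or request ID.
_STATUS = re.compile(r"(?<![0-9A-Za-z-])([45]\d{2})(?![0-9A-Za-z-])")


def classify_api_error(message):
    """Map an error message to 'auth', 'not_found' or 'other'."""
    match = _STATUS.search(message)
    status = int(match.group(1)) if match else None
    if status in (401, 403):
        return "auth"
    if status == 404:
        return "not_found"
    return "other"
```

With this helper, only the `"auth"` result would set `SPARK_AVAILABLE = False`, which matches the behaviour of the cell.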


## Phase 1: Livy Sessions Collection

### 1.1 Workspace Livy Sessions
Collect all interactive Spark sessions in the workspace (notebooks, Spark jobs, etc.).

In [58]:
# Livy Sessions Collection
print("\n" + "=" * 60)
print("PHASE 1.1: WORKSPACE LIVY SESSIONS")
print("=" * 60)

if not SPARK_AVAILABLE:
    print("Skipped - Spark not available")
    workspace_sessions = []
else:
    try:
        session_generator = collect_livy_sessions_workspace(
            workspace_id=WORKSPACE_ID, lookback_hours=LOOKBACK_HOURS
        )

        workspace_sessions = list(session_generator)
        print(f"\nCollection complete: {len(workspace_sessions)} sessions")

        # Ingest to Azure Monitor
        if workspace_sessions:
            from fabricla_connector.ingestion import FabricIngestion

            dce_endpoint = os.getenv("AZURE_MONITOR_DCE_ENDPOINT")
            dcr_id = os.getenv("AZURE_MONITOR_DCR_IMMUTABLE_ID")
            stream_name = os.getenv(
                "AZURE_MONITOR_STREAM_LIVY_SESSION", "Custom-FabricSparkLivySession_CL"
            )

            ingestion = FabricIngestion(
                endpoint_host=dce_endpoint, dcr_id=dcr_id, stream_name=stream_name
            )

            ingestion_result = ingestion.ingest_enhanced(
                records=workspace_sessions, troubleshoot=True
            )

            print(f"Ingested: {ingestion_result.get('successful_records', 0)}")
            if ingestion_result.get("failed_records", 0) > 0:
                print(f"Failed: {ingestion_result.get('failed_records', 0)}")

            # Display session states (already inside the `if workspace_sessions:` branch)
            states = {}
            for session in workspace_sessions:
                state = session.get("State", "unknown")
                states[state] = states.get(state, 0) + 1
            print("\nSession states:")
            for state, count in states.items():
                print(f"  {state}: {count}")
        else:
            print("No sessions to ingest")

    except Exception as e:
        error_msg = str(e)
        print(f"ERROR: {error_msg}")

        if "404" in error_msg:
            print("\nSpark not enabled or no sessions exist")
            print("Run a Spark notebook first, then retry")
        elif "401" in error_msg or "403" in error_msg:
            print("\nAuthentication issue - check credentials")
        
        workspace_sessions = []


PHASE 1.1: WORKSPACE LIVY SESSIONS
[Auth] Fabric authentication not available: No module named 'notebookutils'
[Auth] Using service principal authentication
[Auth] SUCCESS: Using credentials from Environment Variables
SUCCESS: Token acquired for https://api.fabric.microsoft.com/.default: eyJ0eXAiOi...1dCsx3WZsQ
[DEBUG] API call: Workspace Livy Sessions - 8457f746-f2d9-4d27-8221-5714601e40c6 - Status: 200
Found 16 Livy sessions
Collected 16 sessions

Collection complete: 16 sessions


Access forbidden (403). Check RBAC permissions for DCE https://dce-fabric-monitoring-lede.canadacentral-1.ingest.monitor.azure.com


Ingested: 0
Failed: 16

Session states:
  Completed: 15
  Failed: 1
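
The per-state tally in the cell above can also be written with `collections.Counter`. This is a minimal sketch; the sample records are illustrative and only mimic the `State` field of the collector output.

```python
from collections import Counter


def count_states(sessions):
    """Count sessions per State value, treating a missing or empty State as 'unknown'."""
    return Counter((s.get("State") or "unknown") for s in sessions)


# Illustrative records shaped like the collector output (only "State" is used here).
sample = [{"State": "Completed"}] * 15 + [{"State": "Failed"}]
for state, count in count_states(sample).most_common():
    print(f"  {state}: {count}")
```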


### 1.2 Notebook Livy Sessions
Collect Spark sessions specifically launched from Fabric notebooks.

In [59]:
# Notebook Livy Sessions
print("\n" + "=" * 80)
print("📓 PHASE 1.2: NOTEBOOK LIVY SESSIONS")
print("=" * 80)

if not SPARK_AVAILABLE:
    print("⏭️ Skipped - Spark is not available in this workspace")
else:
    try:
        # Get a notebook ID from your workspace (update with actual notebook ID)
        NOTEBOOK_ID = os.getenv("FABRIC_NOTEBOOK_ID", "skip")

        if NOTEBOOK_ID != "skip":
            # Collect session data from the generator
            session_generator = collect_livy_sessions_notebook(
                workspace_id=WORKSPACE_ID,
                notebook_id=NOTEBOOK_ID,
                lookback_hours=LOOKBACK_HOURS,
            )

            # Convert generator to list
            notebook_sessions = list(session_generator)

            print(f"✅ Notebook sessions collected!")
            print(f"   Sessions found: {len(notebook_sessions)}")

            # Ingest to Azure Monitor
            if notebook_sessions:
                from fabricla_connector.ingestion import FabricIngestion

                dce_endpoint = os.getenv("AZURE_MONITOR_DCE_ENDPOINT")
                dcr_id = os.getenv("AZURE_MONITOR_DCR_IMMUTABLE_ID")
                stream_name = os.getenv(
                    "AZURE_MONITOR_STREAM_LIVY_SESSION",
                    "Custom-FabricSparkLivySession_CL",
                )

                ingestion = FabricIngestion(
                    endpoint_host=dce_endpoint, dcr_id=dcr_id, stream_name=stream_name
                )

                ingestion_result = ingestion.ingest_enhanced(
                    records=notebook_sessions, troubleshoot=True
                )

                print(
                    f"   Records ingested: {ingestion_result.get('successful_records', 0)}"
                )
            else:
                print(f"   ℹ️ No sessions found for this notebook")
        else:
            print(
                "⏭️ Skipped - Set FABRIC_NOTEBOOK_ID in .env to collect notebook sessions"
            )

    except Exception as e:
        error_msg = str(e)
        print(f"❌ Error: {error_msg}")
        if "404" in error_msg or "EntityNotFound" in error_msg:
            print(
                f"   💡 Tip: Verify the notebook ID exists and has run Spark sessions"
            )
        import traceback

        traceback.print_exc()


📓 PHASE 1.2: NOTEBOOK LIVY SESSIONS
⏭️ Skipped - Set FABRIC_NOTEBOOK_ID in .env to collect notebook sessions


### 1.3 Spark Job Livy Sessions
Collect sessions from Spark job definitions.

In [60]:
# Spark Job Livy Sessions
print("\n" + "=" * 80)
print("⚡ PHASE 1.3: SPARK JOB LIVY SESSIONS")
print("=" * 80)

if not SPARK_AVAILABLE:
    print("⏭️ Skipped - Spark is not available in this workspace")
else:
    try:
        SPARKJOB_ID = os.getenv("FABRIC_SPARKJOB_ID", "skip")

        if SPARKJOB_ID != "skip":
            # Collect session data from the generator
            session_generator = collect_livy_sessions_sparkjob(
                workspace_id=WORKSPACE_ID,
                sparkjob_id=SPARKJOB_ID,
                lookback_hours=LOOKBACK_HOURS,
            )

            # Convert generator to list
            sparkjob_sessions = list(session_generator)

            print(f"✅ Spark job sessions collected!")
            print(f"   Sessions found: {len(sparkjob_sessions)}")

            # Ingest to Azure Monitor
            if sparkjob_sessions:
                from fabricla_connector.ingestion import FabricIngestion

                dce_endpoint = os.getenv("AZURE_MONITOR_DCE_ENDPOINT")
                dcr_id = os.getenv("AZURE_MONITOR_DCR_IMMUTABLE_ID")
                stream_name = os.getenv(
                    "AZURE_MONITOR_STREAM_LIVY_SESSION",
                    "Custom-FabricSparkLivySession_CL",
                )

                ingestion = FabricIngestion(
                    endpoint_host=dce_endpoint, dcr_id=dcr_id, stream_name=stream_name
                )

                ingestion_result = ingestion.ingest_enhanced(
                    records=sparkjob_sessions, troubleshoot=True
                )

                print(
                    f"   Records ingested: {ingestion_result.get('successful_records', 0)}"
                )
            else:
                print(f"   ℹ️ No sessions found for this Spark job")
        else:
            print(
                "⏭️ Skipped - Set FABRIC_SPARKJOB_ID in .env to collect spark job sessions"
            )

    except Exception as e:
        error_msg = str(e)
        print(f"❌ Error: {error_msg}")
        if "404" in error_msg or "EntityNotFound" in error_msg:
            print(f"   💡 Tip: Verify the Spark job ID exists and has been executed")
        import traceback

        traceback.print_exc()


⚡ PHASE 1.3: SPARK JOB LIVY SESSIONS
⏭️ Skipped - Set FABRIC_SPARKJOB_ID in .env to collect spark job sessions
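
Sections 1.1 to 1.3 repeat the same ingestion steps: read the DCE endpoint and DCR ID from the environment, build a `FabricIngestion` client, and call `ingest_enhanced`. One way to factor that out is sketched below. `ingest_records` is a hypothetical helper; it only assumes that the class passed in behaves like `FabricIngestion` as used in this notebook.

```python
import os


def ingest_records(records, ingestion_cls, stream_env, default_stream, env=os.environ):
    """Send records to one DCR stream and return (successful, failed) counts.

    ingestion_cls is expected to behave like FabricIngestion as used in this
    notebook: constructed with endpoint_host, dcr_id and stream_name, and
    exposing ingest_enhanced(records=..., troubleshoot=...) that returns a dict.
    """
    if not records:
        return 0, 0
    client = ingestion_cls(
        endpoint_host=env.get("AZURE_MONITOR_DCE_ENDPOINT"),
        dcr_id=env.get("AZURE_MONITOR_DCR_IMMUTABLE_ID"),
        stream_name=env.get(stream_env, default_stream),
    )
    result = client.ingest_enhanced(records=records, troubleshoot=True)
    return result.get("successful_records", 0), result.get("failed_records", 0)
```

A cell could then call `ingest_records(notebook_sessions, FabricIngestion, "AZURE_MONITOR_STREAM_LIVY_SESSION", "Custom-FabricSparkLivySession_CL")` and print the two counts.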


### 1.4 Lakehouse Livy Sessions
Collect sessions accessing Lakehouse data.

In [61]:
# Lakehouse Livy Sessions 
print("\n" + "=" * 80)
print("🏠 PHASE 1.4: LAKEHOUSE LIVY SESSIONS")
print("=" * 80)

if not SPARK_AVAILABLE:
    print("⏭️ Skipped - Spark is not available in this workspace")
else:
    try:
        LAKEHOUSE_ID = os.getenv("FABRIC_LAKEHOUSE_ID", "skip")
        LAKEHOUSE_NAME = os.getenv("FABRIC_LAKEHOUSE_NAME", "DefaultLakehouse")

        if LAKEHOUSE_ID != "skip":
            # Collect session data from the generator
            session_generator = collect_livy_sessions_lakehouse(
                workspace_id=WORKSPACE_ID,
                lakehouse_id=LAKEHOUSE_ID,
                lakehouse_name=LAKEHOUSE_NAME,
                workspace_name=WORKSPACE_NAME,
                lookback_hours=LOOKBACK_HOURS,
            )

            # Convert generator to list
            lakehouse_sessions = list(session_generator)

            print(f"✅ Lakehouse sessions collected!")
            print(f"   Sessions found: {len(lakehouse_sessions)}")

            # Ingest to Azure Monitor
            if lakehouse_sessions:
                from fabricla_connector.ingestion import FabricIngestion

                dce_endpoint = os.getenv("AZURE_MONITOR_DCE_ENDPOINT")
                dcr_id = os.getenv("AZURE_MONITOR_DCR_IMMUTABLE_ID")
                stream_name = os.getenv(
                    "AZURE_MONITOR_STREAM_LIVY_SESSION",
                    "Custom-FabricSparkLivySession_CL",
                )

                ingestion = FabricIngestion(
                    endpoint_host=dce_endpoint, dcr_id=dcr_id, stream_name=stream_name
                )

                ingestion_result = ingestion.ingest_enhanced(
                    records=lakehouse_sessions, troubleshoot=True
                )

                print(
                    f"   Records ingested: {ingestion_result.get('successful_records', 0)}"
                )
            else:
                print(f"   ℹ️ No sessions found for this Lakehouse")
        else:
            print(
                "⏭️ Skipped - Set FABRIC_LAKEHOUSE_ID in .env to collect lakehouse sessions"
            )

    except Exception as e:
        error_msg = str(e)
        print(f"❌ Error: {error_msg}")
        if "404" in error_msg or "EntityNotFound" in error_msg:
            print(
                f"   💡 Tip: Verify the Lakehouse ID exists and has been accessed by Spark"
            )
        import traceback

        traceback.print_exc()



🏠 PHASE 1.4: LAKEHOUSE LIVY SESSIONS
[Auth] Fabric authentication not available: No module named 'notebookutils'
[Auth] Using service principal authentication
[Auth] SUCCESS: Using credentials from Environment Variables
SUCCESS: Token acquired for https://api.fabric.microsoft.com/.default: eyJ0eXAiOi...KOAasuphJA
[DEBUG] API call: Lakehouse Livy Sessions - DefaultLakehouse - Status: 404
ERROR: 404 Not Found - Resource doesn't exist for Lakehouse Livy Sessions - DefaultLakehouse
   Response: {"requestId":"b5823558-079d-4435-8bd0-af72d5107fdf","errorCode":"EntityNotFound","message":"The requested resource could not be found"}
✅ Lakehouse sessions collected!
   Sessions found: 0
   ℹ️ No sessions found for this Lakehouse

## Phase 2: Resource Usage Monitoring

### 2.1 Spark Resource Usage Collection
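Collect resource metrics (CPU, memory, disk, network) for the Spark sessions in the workspace. The connector groups these APIs under Phase 4, so the cell banners and code comments in this section read `PHASE 4.1` and `PHASE 4.2`.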

In [62]:
print("\n" + "=" * 80)
print("💻 PHASE 4.1: SPARK RESOURCE USAGE")
print("=" * 80)

if not SPARK_AVAILABLE:
    print("⏭️ Skipped - Spark is not available in this workspace")
else:
    try:
        # Note: collect_resource_usage_for_active_sessions collects resources for all active sessions
        # It internally calls collect_livy_sessions_workspace first
        resource_generator = collect_resource_usage_for_active_sessions(
            workspace_id=WORKSPACE_ID, lookback_hours=LOOKBACK_HOURS
        )

        # Convert generator to list
        resource_records = list(resource_generator)

        print(f"✅ Resource usage collected!")
        print(f"   Resource records: {len(resource_records)}")

        # Ingest to Azure Monitor
        if resource_records:
            from fabricla_connector.ingestion import FabricIngestion

            dce_endpoint = os.getenv("AZURE_MONITOR_DCE_ENDPOINT")
            dcr_id = os.getenv("AZURE_MONITOR_DCR_IMMUTABLE_ID")
            stream_name = os.getenv(
                "AZURE_MONITOR_STREAM_RESOURCE_USAGE",
                "Custom-FabricSparkResourceUsage_CL",
            )

            ingestion = FabricIngestion(
                endpoint_host=dce_endpoint, dcr_id=dcr_id, stream_name=stream_name
            )

            ingestion_result = ingestion.ingest_enhanced(
                records=resource_records, troubleshoot=True
            )

            print(
                f"   Records ingested: {ingestion_result.get('successful_records', 0)}"
            )
            print(f"   Failed: {ingestion_result.get('failed_records', 0)}")

            # Display resource summary
            print(f"\n📊 Resource Summary:")
            # Treat missing or null values as 0 so one incomplete record does not abort the summary
            total_cpu = sum(float(r.get("TotalCPUCores") or 0) for r in resource_records)
            total_mem = sum(float(r.get("TotalMemoryGB") or 0) for r in resource_records)
            total_disk = sum(float(r.get("TotalDiskGB") or 0) for r in resource_records)

            print(f"   Total CPU cores: {total_cpu:.1f}")
            print(f"   Total Memory: {total_mem:.2f} GB")
            print(f"   Total Disk: {total_disk:.2f} GB")

            # Show bottleneck analysis
            bottlenecks = {}
            for r in resource_records:
                bottleneck = r.get("BottleneckType", "none")
                if bottleneck and bottleneck != "none":
                    bottlenecks[bottleneck] = bottlenecks.get(bottleneck, 0) + 1

            if bottlenecks:
                print(f"\n⚠️ Bottlenecks detected:")
                for btype, count in bottlenecks.items():
                    print(f"   {btype}: {count} sessions")
        else:
            print(f"   ℹ️ No resource data available")
            print(
                f"   💡 Tip: Resource usage requires sessions with running Spark applications"
            )

    except Exception as e:
        error_msg = str(e)
        print(f"❌ Error: {error_msg}")
        if "404" in error_msg or "EntityNotFound" in error_msg:
            print(f"   💡 Tip: No active Spark sessions found for resource monitoring")
        import traceback

        traceback.print_exc()


💻 PHASE 4.1: SPARK RESOURCE USAGE
[Auth] Fabric authentication not available: No module named 'notebookutils'
[Auth] Using service principal authentication
[Auth] SUCCESS: Using credentials from Environment Variables
SUCCESS: Token acquired for https://api.fabric.microsoft.com/.default: eyJ0eXAiOi...PAqm7f_IRQ
[DEBUG] API call: Workspace Livy Sessions - 8457f746-f2d9-4d27-8221-5714601e40c6 - Status: 200
Found 16 Livy sessions
Collected 16 sessions
INFO: Collecting resource usage for 16 sessions
✅ Resource usage collected!
   Resource records: 0
   ℹ️ No resource data available
   💡 Tip: Resource usage requires sessions with running Spark applications
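
The summary logic in this cell (totals plus a bottleneck tally) can be isolated into a small function so it can be checked without calling the API. This is a sketch only: the field names are the ones the cell reads, and `summarize_resources` is a hypothetical helper that treats missing, null, or empty values as zero.

```python
from collections import Counter


def _num(record, key):
    """Read a numeric field, treating missing, None, or empty values as 0."""
    value = record.get(key)
    return float(value) if value not in (None, "") else 0.0


def summarize_resources(records):
    """Return (totals, bottleneck_counts) for a list of resource usage records."""
    totals = {
        key: sum(_num(r, key) for r in records)
        for key in ("TotalCPUCores", "TotalMemoryGB", "TotalDiskGB")
    }
    bottlenecks = Counter(
        r["BottleneckType"]
        for r in records
        if r.get("BottleneckType") not in (None, "", "none")
    )
    return totals, dict(bottlenecks)
```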

### 2.2 Active Sessions Resource Tracking
Real-time resource monitoring for currently active Spark sessions.

In [63]:
print("\n" + "=" * 80)
print("⏱️ PHASE 4.2: ACTIVE SESSION RESOURCES")
print("=" * 80)

if not SPARK_AVAILABLE:
    print("⏭️ Skipped - Spark is not available in this workspace")
else:
    try:
        # Same collector call as 4.1; this cell only displays the returned records
        # and applies no additional filtering of its own
        resource_generator = collect_resource_usage_for_active_sessions(
            workspace_id=WORKSPACE_ID, lookback_hours=LOOKBACK_HOURS
        )

        # Convert generator to list
        active_resources = list(resource_generator)

        print(f"✅ Active session resources collected!")
        print(f"   Resource records: {len(active_resources)}")

        if active_resources:
            # Note: Data already ingested in Phase 4.1 if same function was used
            # This cell focuses on displaying active session details

            print(f"\n🔥 Currently Active Sessions:")
            for i, resource in enumerate(active_resources[:10], 1):  # Show first 10
                session_id = str(resource.get("SessionId", "unknown"))  # str() keeps the slice below safe for non-string IDs
                state = resource.get("State", "unknown")
                cpu = float(resource.get("TotalCPUCores", 0))
                mem = float(resource.get("TotalMemoryGB", 0))
                efficiency = float(resource.get("ExecutorEfficiency", 0))
                bottleneck = resource.get("BottleneckType", "none")

                print(f"\n   {i}. Session {session_id[:12]}... [{state}]")
                print(f"      CPU: {cpu:.1f} cores, Memory: {mem:.2f} GB")
                print(f"      Executor Efficiency: {efficiency:.1f}%")
                if bottleneck and bottleneck != "none":
                    severity = float(resource.get("BottleneckSeverity", 0))
                    print(
                        f"      ⚠️ Bottleneck: {bottleneck} (severity: {severity:.2f})"
                    )

            if len(active_resources) > 10:
                print(f"\n   ... and {len(active_resources) - 10} more active sessions")
        else:
            print(f"   ℹ️ No active sessions with resource data at this time")

    except Exception as e:
        error_msg = str(e)
        print(f"❌ Error: {error_msg}")
        if "404" in error_msg or "EntityNotFound" in error_msg:
            print(f"   💡 Tip: No active Spark sessions found for resource monitoring")
        import traceback

        traceback.print_exc()


⏱️ PHASE 4.2: ACTIVE SESSION RESOURCES
[Auth] Fabric authentication not available: No module named 'notebookutils'
[Auth] Using service principal authentication
[Auth] SUCCESS: Using credentials from Environment Variables
SUCCESS: Token acquired for https://api.fabric.microsoft.com/.default: eyJ0eXAiOi...10b_J1EqaA
[DEBUG] API call: Workspace Livy Sessions - 8457f746-f2d9-4d27-8221-5714601e40c6 - Status: 200
Found 16 Livy sessions
Collected 16 sessions
INFO: Collecting resource usage for 16 sessions
✅ Active session resources collected!
   Resource records: 0
   ℹ️ No active sessions with resource data at this time

## 📊 Collection Complete

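When the ingestion cells report successful records, the Spark monitoring data is available in Azure Monitor Log Analytics. In the run captured above, ingestion of the 16 workspace sessions was rejected with a 403 from the DCE; data appears only after the calling identity is granted access to the Data Collection Rule (the Logs Ingestion API requires the Monitoring Metrics Publisher role on the DCR).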

### Next Steps
1. **Wait 2-5 minutes** for data to appear in Log Analytics
2. **Query your data** using KQL in Azure Portal → Log Analytics
3. **Create dashboards** for visualizing Spark performance
4. **Set up alerts** for failures and resource bottlenecks
5. **Schedule this notebook** to run periodically (every 15-30 minutes)
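
The captured run used a lookback of 43200 hours (read from `FABRIC_LOOKBACK_HOURS`). On a 15-30 minute schedule, a window that large would re-collect the same sessions on every run, and whether the connector deduplicates them is not shown in this notebook. A sketch for deriving a smaller window from the schedule interval follows; `lookback_hours_for` and the overlap factor of 2 are assumptions, not connector settings.

```python
import math


def lookback_hours_for(interval_minutes, overlap_factor=2.0):
    """Smallest whole number of hours that covers overlap_factor schedule intervals."""
    if interval_minutes <= 0:
        raise ValueError("interval_minutes must be positive")
    return max(1, math.ceil(interval_minutes * overlap_factor / 60))
```

The result can be written to `FABRIC_LOOKBACK_HOURS` in `.env`. Because the notebook reads the lookback as a whole number of hours, the value is rounded up and never drops below 1, so consecutive runs still overlap.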

### Sample KQL Queries

#### Query 1: Recent Livy Sessions
```kql
FabricSparkLivySession_CL
| where TimeGenerated > ago(24h)
| project TimeGenerated, WorkspaceName, SessionName, State, ExecutorCount, ExecutorCores
| order by TimeGenerated desc
```

#### Query 2: Resource Usage by Session
```kql
FabricSparkResourceUsage_CL
| where TimeGenerated > ago(24h)
| summarize 
    AvgCPU = avg(TotalCPUCores),
    AvgMemory = avg(TotalMemoryGB),
    AvgEfficiency = avg(ExecutorEfficiency)
    by SessionId, SessionName
| order by AvgMemory desc
```

#### Query 3: Bottleneck Detection
```kql
FabricSparkResourceUsage_CL
| where TimeGenerated > ago(24h)
| where BottleneckType != "none"
| summarize 
    Count = count(),
    AvgSeverity = avg(BottleneckSeverity)
    by BottleneckType, WorkspaceName
| order by Count desc
```

#### Query 4: Failed Sessions
```kql
FabricSparkLivySession_CL
| where TimeGenerated > ago(24h)
| where State in~ ("failed", "error", "dead", "killed")
| project TimeGenerated, WorkspaceName, SessionName, State, Log
| order by TimeGenerated desc
```

### Log Analytics Portal
Access your data: [Azure Portal - Log Analytics](https://portal.azure.com)