# Monitor Hub Analysis

This notebook provides an interactive environment to run the Monitor Hub Analysis pipeline and explore the results.

## Recent Updates (v0.1.14)
- **CSV-Based Analysis**: The notebook has been reverted to use **CSV Reports** (`activities_master_*.csv`) as the source for analysis, ensuring compatibility with existing workflows and avoiding Parquet mount issues.
- **Strict Authentication Enforcement**: The authentication module now **strictly enforces** the use of Service Principal credentials if they are provided in the `.env` file.
- **Smart Scope Detection**: The pipeline attempts **Tenant-Wide** extraction first (Admin APIs). If permissions are missing (403 Forbidden), it automatically falls back to **Member-Only** scope.
- **Pipeline Integration**: The notebook uses the updated `MonitorHubPipeline` class for end-to-end execution.

## Usage
1. Ensure your environment is activated: `conda activate fabric-monitoring`
2. Run the cells below to execute the analysis.
3. The pipeline will:
    - Extract historical data (Tenant-Wide with Fallback).
    - Enrich data with job details.
    - Generate CSV reports in the `exports/monitor_hub_analysis` directory (or configured output).

In [1]:
# ‚úÖ VERIFY INSTALLATION
# Since we have uploaded the .whl to your Fabric Environment, it should be installed automatically.
# Run this cell to confirm the correct version (v0.1.14) is loaded.

import importlib.metadata

try:
    version = importlib.metadata.version("usf_fabric_monitoring")
    print(f"‚úÖ Library found: usf_fabric_monitoring v{version}")
    
    if version >= "0.1.14":
        print("   You are using the correct version.")
    else:
        print(f"‚ö†Ô∏è  WARNING: Expected v0.1.14+ but found v{version}.")
        print("   Please check your Fabric Environment settings and ensure the new wheel is published.")
        
except importlib.metadata.PackageNotFoundError:
    print("‚ùå Library NOT found.")
    print("   Please ensure you have attached the 'Fabric Environment' containing the .whl file to this notebook.")
    print("   Alternatively, upload the .whl file to the Lakehouse 'Files' section and pip install it from there.")

‚úÖ Library found: usf_fabric_monitoring v0.1.6
   You are using the correct version.


# Monitor Hub Analysis Pipeline

## Overview
This notebook executes the **Monitor Hub Analysis Pipeline**, which is designed to provide deep insights into Microsoft Fabric activity. It extracts historical data, calculates key performance metrics, and generates comprehensive reports to help identify:
- Constant failures and reliability issues.
- Excess activity by users, locations, or domains.
- Historical performance trends over the last 90 days.

## Key Features & Recent Updates (v0.1.14)
The pipeline has been enhanced to support enterprise-grade monitoring workflows:

1.  **CSV-Based Analysis (v0.1.14)**:
    -   **Source of Truth**: The notebook now loads data from the generated `activities_master_*.csv` reports.
    -   **Benefit**: Ensures consistent analysis using the same data that is exported to stakeholders, avoiding format discrepancies.

2.  **Strict Authentication (v0.1.13)**:
    -   **Problem**: Previous versions would silently fall back to a restricted identity if the Service Principal login failed.
    -   **Solution**: The system now raises an immediate error if configured credentials fail, forcing you to fix the root cause.

3.  **Smart Scope Detection (v0.1.12)**:
    -   **Primary Strategy**: Attempts to use Power BI Admin APIs for full **Tenant-Wide** visibility.
    -   **Automatic Fallback**: If Admin permissions are missing (401/403), it gracefully reverts to **Member-Only** mode.

4.  **Automatic Persistence & Path Resolution**:
    -   **Automatic Lakehouse Resolution**: Relative paths (e.g., `exports/`) are automatically mapped to `/lakehouse/default/Files/` in Fabric.
    -   **Sequential Orchestration**: Handles the entire data lifecycle (Activity Extraction -> Job Detail Extraction -> Merging -> Analysis).

## How to Use
1. **Install Package**: The first cell installs the `usf_fabric_monitoring` package into the current session.
2. **Configure Credentials**: Ensure your Service Principal credentials (`AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_TENANT_ID`) are available.
3. **Set Parameters**:
    - `DAYS_TO_ANALYZE`: Number of days of history to fetch (default: 90).
    - `OUTPUT_DIR`: Path where reports will be saved (can now be relative!).
4. **Run Analysis**: Execute the pipeline cell. It will:
    - Fetch data from Fabric APIs.
    - Process and enrich the data.
    - Save CSV reports and Parquet files to the specified `OUTPUT_DIR`.

In [None]:
from usf_fabric_monitoring.core.pipeline import MonitorHubPipeline
import os

In [None]:
import inspect
import os
import usf_fabric_monitoring
from usf_fabric_monitoring.core.pipeline import MonitorHubPipeline

print(f"üì¶ Package Location: {os.path.dirname(usf_fabric_monitoring.__file__)}")

# Verify we are running the NEW code (v0.1.14)
try:
    # Check for the new _save_to_parquet method in pipeline which indicates v0.1.8+
    src = inspect.getsource(MonitorHubPipeline)
    if "_save_to_parquet" in src:
        print("‚úÖ SUCCESS: You are running the updated code (v0.1.14).")
        print("   Feature Verified: CSV Analysis & Strict Auth")
    else:
        print("‚ùå WARNING: You are still running the OLD code.")
        print("   üëâ ACTION: Restart the kernel and run the install cell above again.")
except AttributeError:
    print("‚ùå WARNING: Could not inspect source code. You might be running an optimized .pyc version.")
except Exception as e:
    print(f"‚ö†Ô∏è Could not verify source code: {e}")

In [None]:
import os
import base64
import json
from dotenv import load_dotenv

# --- CREDENTIAL MANAGEMENT ---

# Option 1: Load from .env file (Lakehouse or Local)
# We check the Lakehouse path first, then fallback to local .env
LAKEHOUSE_ENV_PATH = "/lakehouse/default/Files/dot_env_files/.env"
LOCAL_ENV_PATH = ".env"

# Force override=True to ensure we pick up changes to the file even if env vars are already set
if os.path.exists(LAKEHOUSE_ENV_PATH):
    print(f"Loading configuration from Lakehouse: {LAKEHOUSE_ENV_PATH}")
    load_dotenv(LAKEHOUSE_ENV_PATH, override=True)
elif os.path.exists(LOCAL_ENV_PATH):
    print(f"Loading configuration from Local: {os.path.abspath(LOCAL_ENV_PATH)}")
    load_dotenv(LOCAL_ENV_PATH, override=True)
else:
    print(f"Warning: No .env file found at {LAKEHOUSE_ENV_PATH} or {LOCAL_ENV_PATH}")

# Verify credentials are present
required_vars = ["AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET", "AZURE_TENANT_ID"]
missing = [v for v in required_vars if not os.getenv(v)]

print("\nüîê IDENTITY CHECK:")
if missing:
    print(f"‚ùå Missing required environment variables: {', '.join(missing)}")
    print("   ‚ö†Ô∏è  System will fallback to DefaultAzureCredential (User Identity or Managed Identity)")
else:
    client_id = os.getenv("AZURE_CLIENT_ID")
    masked_id = f"{client_id[:4]}...{client_id[-4:]}" if client_id and len(client_id) > 8 else "********"
    print(f"‚úÖ Service Principal Configured in Environment")
    print(f"   Client ID: {masked_id}")
    print(f"   Tenant ID: {os.getenv('AZURE_TENANT_ID')}")

# --- TOKEN IDENTITY INSPECTION ---
# This block decodes the actual token being used to prove identity
try:
    from usf_fabric_monitoring.core.auth import create_authenticator_from_env
    auth = create_authenticator_from_env()
    token = auth.get_fabric_token()
    
    # Decode JWT (no signature verification needed for inspection)
    parts = token.split('.')
    if len(parts) > 1:
        # Add padding if needed
        payload_part = parts[1]
        padded = payload_part + '=' * (4 - len(payload_part) % 4)
        decoded = base64.urlsafe_b64decode(padded)
        claims = json.loads(decoded)
        
        print("\nüïµÔ∏è  ACTIVE TOKEN IDENTITY:")
        if 'upn' in claims:
            print(f"   User Principal Name: {claims['upn']}")
            print("   üëâ You are logged in as a USER.")
        elif 'appid' in claims:
            print(f"   Application ID: {claims['appid']}")
            if client_id and claims['appid'] == client_id:
                print("   üëâ You are logged in as the CONFIGURED SERVICE PRINCIPAL.")
            else:
                print("   üëâ You are logged in as a DIFFERENT Service Principal/Managed Identity.")
        else:
            print(f"   Subject: {claims.get('sub', 'Unknown')}")
            
        print(f"   Audience: {claims.get('aud', 'Unknown')}")
except Exception as e:
    print(f"\n‚ö†Ô∏è  Could not inspect token identity: {e}")

In [None]:
# Configuration
DAYS_TO_ANALYZE = 28

# OUTPUT_DIR: Where to save the reports.
# v0.1.6+ Update: You can now provide a relative path (e.g., "monitor_hub_analysis") 
# and it will automatically resolve to "/lakehouse/default/Files/monitor_hub_analysis" 
# when running in Fabric.
OUTPUT_DIR = "monitor_hub_analysis" 

# If you prefer an explicit absolute path, you can still use it:
# OUTPUT_DIR = "/lakehouse/default/Files/monitor_hub_analysis"

In [None]:
pipeline = MonitorHubPipeline(OUTPUT_DIR)
results = pipeline.run_complete_analysis(days=DAYS_TO_ANALYZE)
pipeline.print_results_summary(results)

## 5. Advanced Analysis & Visualization (Spark)
The following cells use PySpark to load the raw data generated by the pipeline and provide interactive visualizations of failures, error codes, and trends.

In [None]:
# 1. Setup Spark & Paths
import os
import glob
from usf_fabric_monitoring.core.utils import resolve_path

# Initialize Spark Session (if not already active)
spark = None
try:
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, to_timestamp, when, count, desc, lit, unix_timestamp, coalesce, abs as abs_val, split, initcap, regexp_replace, element_at, substring, avg, max, min
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    if 'spark' not in locals() or spark is None:
        print("‚öôÔ∏è Initializing Spark Session...")
        spark = SparkSession.builder \
            .appName("FabricFailureAnalysis") \
            .getOrCreate()
        print(f"‚úÖ Spark Session Created: {spark.version}")
except ImportError:
    print("‚ö†Ô∏è PySpark not installed or configured. Skipping Spark-based analysis.")
except Exception as e:
    print(f"‚ö†Ô∏è Failed to initialize Spark: {e}. Skipping Spark-based analysis.")

# Resolve the output directory to an absolute path
# This ensures that if you used a relative path like "monitor_hub_analysis",
# it is correctly resolved to "/lakehouse/default/Files/monitor_hub_analysis" for Spark.
resolved_output_dir = str(resolve_path(OUTPUT_DIR))

BASE_PATH = os.path.join(resolved_output_dir, "fabric_item_details")
AUDIT_LOG_PATH = os.path.join(resolved_output_dir, "raw_data/daily")

print(f"üìÇ Analysis Paths:")
print(f"  - Item Details: {BASE_PATH}")
print(f"  - Audit Logs:   {AUDIT_LOG_PATH}")

In [None]:
# 2. Load Data from CSV (Aggregated Reports)

import os
from pyspark.sql.functions import col, to_timestamp, unix_timestamp, coalesce, initcap, regexp_replace, element_at, split, when, lit

# Use relative path for CSVs to avoid mount issues
CSV_PATH = "Files/monitor_hub_analysis"

def load_csv_data():
    """Loads the activity data from the generated CSV reports."""
    try:
        # Match the master activities report
        path_pattern = f"{CSV_PATH}/activities_master_*.csv"
        print(f"üìÇ Loading CSV files from {path_pattern}...")
        
        # Read CSV with header
        # inferSchema=True allows Spark to detect dates and numbers automatically
        df = spark.read.option("header", "true").option("inferSchema", "true").csv(path_pattern)
        
        # Filter for Failures
        if "status" in df.columns:
            return df.filter(col("status") == "Failed")
        elif "Status" in df.columns:
            return df.filter(col("Status") == "Failed")
        else:
            print("‚ö†Ô∏è 'status' column not found in CSV data.")
            return df
            
    except Exception as e:
        print(f"‚ö†Ô∏è Could not load CSV data: {str(e)}")
        print("   Tip: Ensure the pipeline ran successfully and generated CSV reports.")
        return None

# Execute Loading
final_df = load_csv_data()

if final_df:
    print(f"‚úÖ Successfully loaded {final_df.count()} failure records from CSV.")
    
    # Helper to safely get column or null
    def safe_col(c):
        return col(c) if c in final_df.columns else lit(None)

    # Map CSV columns to expected analysis columns
    final_df = final_df.select(
        # Try to get workspace name, fallback to ID if name missing in older CSVs
        coalesce(safe_col("workspace_name"), safe_col("WorkSpaceName"), safe_col("workspace_id")).alias("Workspace"),
        coalesce(safe_col("item_name"), safe_col("ItemName")).alias("Item Name"),
        coalesce(safe_col("item_type"), safe_col("ItemType")).alias("Item Type"),
        coalesce(safe_col("activity_type"), safe_col("Operation")).alias("Invoke Type"),
        coalesce(safe_col("start_time"), safe_col("CreationTime")).alias("Start Time"),
        coalesce(safe_col("end_time"), safe_col("EndTime")).alias("End Time"),
        coalesce(safe_col("duration_seconds"), safe_col("Duration")).alias("Duration (s)"),
        coalesce(safe_col("submitted_by"), safe_col("UserId")).alias("User ID"),
        
        # User Name Extraction
        coalesce(
            initcap(regexp_replace(element_at(split(coalesce(safe_col("submitted_by"), safe_col("UserId")), "@"), 1), "\\.", " ")),
            safe_col("submitted_by"), 
            safe_col("UserId")
        ).alias("User Name"),
        
        # Error Details - CSV might not have structured failure_reason
        # We use a placeholder or look for error columns if they exist
        lit(None).alias("Error Code"), 
        lit(None).alias("Error Message")
    )
else:
    print("‚ùå No failure data found.")

In [None]:
# 3. Analysis & Display

if final_df:
    # --- 1. Summary Statistics ---
    total_failures = final_df.count()
    unique_workspaces = final_df.select("Workspace").distinct().count()
    unique_items = final_df.select("Item Name").distinct().count()
    
    print(f"\nüìä SUMMARY STATISTICS")
    print(f"Total Failures: {total_failures}")
    print(f"Affected Workspaces: {unique_workspaces}")
    print(f"Affected Items: {unique_items}")

    # --- 2. Top 10 Failing Items ---
    print("\nüèÜ TOP 10 FAILING ITEMS")
    top_items = final_df.groupBy("Workspace", "Item Name", "Item Type") \
        .count() \
        .orderBy(col("count").desc()) \
        .limit(10)
    top_items.show(truncate=False)

    # --- 3. Failures by User ---
    print("\nüë§ FAILURES BY USER")
    user_stats = final_df.groupBy("User Name") \
        .count() \
        .orderBy(col("count").desc())
    user_stats.show(truncate=False)

    # --- 4. Error Code Distribution ---
    print("\n‚ö†Ô∏è ERROR CODE DISTRIBUTION")
    error_stats = final_df.groupBy("Error Code") \
        .count() \
        .orderBy(col("count").desc())
    error_stats.show(truncate=False)

    # --- 5. Recent Failures (Last 20) ---
    print("\nüïí MOST RECENT FAILURES")
    final_df.select("Start Time", "Workspace", "Item Name", "User Name", "Error Message") \
        .orderBy(col("Start Time").desc()) \
        .show(20, truncate=50)
else:
    print("No data available for analysis.")