# Anomaly Detection Workflow

## Purpose

This notebook shows how to build a complete anomaly detection workflow using interactive widgets. It covers end-to-end pipelines, ensemble methods, validation, and reporting.

## Learning Objectives

By the end of this notebook, you will:
- ✅ Build complete anomaly detection pipelines (preprocessing, detection, post-processing)
- ✅ Configure ensemble methods (voting, weighted, stacking)
- ✅ Perform validation (cross-validation, hold-out, time series, spatial)
- ✅ Generate anomaly reports with visualizations
- ✅ Integrate anomaly detection with quality assessment
- ✅ Save and reuse workflow configurations

## Estimated Duration

60-90 minutes

---

## Overview

A complete anomaly detection workflow runs detection, validation, and reporting as one repeatable sequence. The AM-QADF framework supports each stage:

- 🔧 **Pipeline Steps**: Preprocessing, Detection, Post-processing
- 🎯 **Ensemble Methods**: Voting, Weighted, Stacking
- ✅ **Validation**: Cross-Validation, Hold-out, Time Series, Spatial
- 📄 **Reporting**: Automated report generation with visualizations
- 🔄 **Integration**: Seamless integration with quality assessment

Use the interactive widgets below to configure and execute complete anomaly detection workflows - no coding required!
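
As a rough illustration of the voting and weighted ensemble methods listed above, the sketch below combines boolean anomaly masks from several detectors using plain NumPy. The helpers `majority_vote` and `weighted_vote` and the example masks are hypothetical and written for this illustration; they are not part of the AM-QADF API.

```python
import numpy as np

def majority_vote(masks):
    """Flag a point when more than half of the detectors flag it."""
    votes = np.sum(np.asarray(masks, dtype=int), axis=0)
    return votes > len(masks) / 2

def weighted_vote(masks, weights, threshold=0.5):
    """Flag a point when the normalized weighted vote exceeds `threshold`."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the score lies in [0, 1]
    score = np.tensordot(weights, np.asarray(masks, dtype=float), axes=1)
    return score > threshold

# Three hypothetical detector outputs for five points
masks = [
    np.array([True, False, True, False, False]),
    np.array([True, True, False, False, False]),
    np.array([True, False, False, False, True]),
]
print(majority_vote(masks))
print(weighted_vote(masks, weights=[0.6, 0.2, 0.2]))
```

With these weights the first detector alone carries enough weight to flag a point, which is the main behavioral difference from majority voting.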


In [1]:
# Setup: Import required libraries
import sys
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Add parent directory and src directory to path for imports
notebook_dir = Path().resolve()
project_root = notebook_dir.parent
src_dir = project_root / 'src'

# Add project root to path (for src.infrastructure imports)
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# Add src directory to path (for am_qadf imports)
if str(src_dir) not in sys.path:
    sys.path.insert(0, str(src_dir))

# Core imports
import ipywidgets as widgets
from ipywidgets import (
    VBox, HBox, Accordion, Tab, Dropdown, RadioButtons, 
    Checkbox, Button, Output, Text, IntSlider, FloatSlider,
    Layout, Box, Label, FloatText, IntText, SelectMultiple
)
from IPython.display import display, Markdown, HTML, clear_output
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle, FancyBboxPatch
from mpl_toolkits.mplot3d import Axes3D
from scipy import stats
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from datetime import datetime
from typing import Optional, Tuple, Dict, Any, List

# Load environment variables from development.env
import os
env_file = project_root / 'development.env'
if env_file.exists():
    with open(env_file, 'r') as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith('#') and '=' in line:
                key, value = line.split('=', 1)
                # Tolerate "KEY = value" spacing and surrounding quotes
                value = value.strip().strip('"\'')
                os.environ[key.strip()] = value
    print("✅ Environment variables loaded from development.env")

# Try to import anomaly detection workflow classes
WORKFLOW_AVAILABLE = False
try:
    from am_qadf.anomaly_detection.integration.client import AnomalyDetectionClient
    from am_qadf.anomaly_detection.detectors.ensemble import VotingEnsembleDetector, WeightedEnsembleDetector
    WORKFLOW_AVAILABLE = True
    print("✅ Anomaly detection workflow classes available")
except ImportError as e:
    print(f"⚠️ Anomaly detection workflow classes not available: {e} - using demo mode")

# MongoDB connection setup
INFRASTRUCTURE_AVAILABLE = False
mongo_client = None
voxel_storage = None
stl_client = None

try:
    from src.infrastructure.config import MongoDBConfig
    from src.infrastructure.database import MongoDBClient
    from am_qadf.voxel_domain import VoxelGridStorage
    from am_qadf.query import STLModelClient
    
    # Initialize MongoDB connection
    config = MongoDBConfig.from_env()
    if not config.username:
        config.username = os.getenv('MONGO_ROOT_USERNAME', 'admin')
    if not config.password:
        config.password = os.getenv('MONGO_ROOT_PASSWORD', 'password')
    
    mongo_client = MongoDBClient(config=config)
    if mongo_client.is_connected():
        voxel_storage = VoxelGridStorage(mongo_client=mongo_client)
        stl_client = STLModelClient(mongo_client=mongo_client)
        INFRASTRUCTURE_AVAILABLE = True
        print(f"✅ Connected to MongoDB: {config.database}")
    else:
        print("⚠️ MongoDB connection failed")
except Exception as e:
    print(f"⚠️ MongoDB not available: {e} - using demo mode")

print("✅ Setup complete!")


✅ Environment variables loaded from development.env



✅ Anomaly detection workflow classes available


Failed to connect to MongoDB: localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30.0s, Topology Description: <TopologyDescription id: 696013e42d6c12d5cba9c90d, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]>


⚠️ MongoDB not available: localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms), Timeout: 30.0s, Topology Description: <TopologyDescription id: 696013e42d6c12d5cba9c90d, topology_type: Unknown, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [Errno 111] Connection refused (configured timeouts: socketTimeoutMS: 20000.0ms, connectTimeoutMS: 20000.0ms)')>]> - using demo mode
✅ Setup complete!


## Interactive Anomaly Detection Workflow Interface

Use the widgets below to configure and execute complete anomaly detection workflows. Build pipelines, configure ensembles, validate results, and generate reports interactively!
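
Stripped of the widgets, a pipeline run comes down to detecting anomalies in a signal and, when ground truth exists, scoring the result. The sketch below shows that core with a z-score detector, an IQR detector, and precision/recall/F1 computed from boolean masks. The function names and the synthetic signal are assumptions made for this example and are not taken from the AM-QADF API.

```python
import numpy as np

def zscore_mask(signal, threshold=3.0):
    """Flag points whose absolute z-score exceeds `threshold`."""
    z = np.abs(signal - signal.mean()) / (signal.std() + 1e-8)
    return z > threshold

def iqr_mask(signal, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.percentile(signal, [25, 75])
    iqr = q3 - q1
    return (signal < q1 - k * iqr) | (signal > q3 + k * iqr)

def precision_recall_f1(pred, truth):
    """Compute precision, recall, and F1 from boolean masks."""
    tp = int(np.sum(pred & truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Synthetic signal (illustrative data): near-flat baseline with two injected spikes
signal = np.full(100, 200.0)
signal[::2] += 1.0  # small alternating ripple so the spread is non-zero
truth = np.zeros(100, dtype=bool)
signal[[10, 60]] += 50.0
truth[[10, 60]] = True

for name, mask in [("zscore", zscore_mask(signal)), ("iqr", iqr_mask(signal))]:
    p, r, f1 = precision_recall_f1(mask, truth)
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

The small constant added to the standard deviation keeps the z-score defined for a constant signal, and the zero-denominator guards return 0.0 when a detector flags nothing.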


In [2]:
# Create Interactive Anomaly Detection Workflow Interface

# Global state
workflow_results = {}
sample_data = None
ground_truth = None
workflow_config = {}
current_model_id = None
current_grid_id = None
loaded_grid_data = None
signal_arrays = {}

# ============================================
# Helper Functions for Demo Data
# ============================================

def generate_sample_data_with_anomalies(n_points=1000, n_anomalies=50, seed=42):
    """Generate sample signal data with known anomalies."""
    np.random.seed(seed)
    
    # Normal data (temperature signal)
    t = np.linspace(0, 100, n_points)
    normal_temp = 200 + 10 * np.sin(2 * np.pi * t / 20) + np.random.normal(0, 2, n_points)
    
    # Add anomalies
    anomaly_indices = np.random.choice(n_points, n_anomalies, replace=False)
    ground_truth_labels = np.zeros(n_points, dtype=bool)
    
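    for idx in anomaly_indices:
        if np.random.rand() < 0.5:
            # Point anomaly: a single large spike
            normal_temp[idx] += np.random.choice([-1, 1]) * np.random.uniform(20, 40)
            ground_truth_labels[idx] = True
        else:
            # Collective anomaly: a level shift over up to 5 consecutive points,
            # all of which are labeled so validation metrics match the data
            normal_temp[idx:idx+5] += np.random.choice([-1, 1]) * np.random.uniform(10, 20)
            ground_truth_labels[idx:idx+5] = True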
    
    # Create spatial coordinates
    x = np.random.uniform(0, 10, n_points)
    y = np.random.uniform(0, 10, n_points)
    z = np.random.uniform(0, 5, n_points)
    
    return {
        'temperature': normal_temp,
        'x': x,
        'y': y,
        'z': z,
        'time': t
    }, ground_truth_labels

# Initialize sample data
sample_data, ground_truth = generate_sample_data_with_anomalies()

# ============================================
# Top Panel: Data Source and Grid Selection
# ============================================

# Data source mode
data_source_label = widgets.HTML("<b>Data Source:</b>")
data_source_mode = RadioButtons(
    options=[('MongoDB', 'mongodb'), ('Sample Data', 'sample')],
    value='mongodb' if INFRASTRUCTURE_AVAILABLE else 'sample',  # Fall back to sample data in demo mode
    description='Source:',
    style={'description_width': 'initial'}
)

# Model selection (for MongoDB)
model_label = widgets.HTML("<b>Model:</b>")
model_options = [("━━━ Select Model ━━━", None)]
if stl_client and mongo_client:
    try:
        models = stl_client.list_models(limit=100)
        model_options.extend([
            (f"{m.get('filename', m.get('original_stem', m.get('model_name', 'Unknown')))} ({m.get('model_id', '')[:8]}...)", m.get('model_id'))
            for m in models
        ])
    except Exception as e:
        print(f"⚠️ Error loading models: {e}")

model_dropdown = Dropdown(
    options=model_options,
    value=None,
    description='Model:',
    style={'description_width': 'initial'},
    layout=Layout(width='400px')
)

# Grid type filter
grid_type_label = widgets.HTML("<b>Grid Type:</b>")
grid_type_filter = Dropdown(
    options=[
        ('All Grids', 'all'),
        ('Fused', 'fused'),
        ('Corrected', 'corrected'),
        ('Processed', 'processed'),
        ('Signal-Mapped', 'signal_mapped'),
        ('Raw', 'raw')
    ],
    value='fused',  # Default to fused grids
    description='Type:',
    style={'description_width': 'initial'}
)

# Grid selection (for MongoDB)
grid_label = widgets.HTML("<b>Grid:</b>")
grid_dropdown = Dropdown(
    options=[("━━━ Select Grid ━━━", None)],
    value=None,
    description='Grid:',
    style={'description_width': 'initial'},
    layout=Layout(width='500px')
)

load_grid_button = Button(
    description='Load Grid',
    button_style='info',
    icon='folder-open',
    layout=Layout(width='120px')
)

# Workflow mode
workflow_mode = RadioButtons(
    options=[
        ('Pipeline', 'pipeline'),
        ('Ensemble', 'ensemble'),
        ('Validation', 'validation'),
        ('Reporting', 'reporting')
    ],
    value='pipeline',
    description='Mode:',
    style={'description_width': 'initial'}
)

workflow_label = widgets.HTML("<b>Workflow:</b>")
workflow_selector = Dropdown(
    options=[('New Workflow', 'new'), ('Workflow 001', 'wf001'), ('Workflow 002', 'wf002')],
    value='new',
    description='Workflow:',
    style={'description_width': 'initial'}
)

execute_button = Button(
    description='Execute Workflow',
    button_style='success',
    icon='play',
    layout=Layout(width='180px')
)

save_button = Button(
    description='Save Workflow',
    button_style='primary',
    icon='save',
    layout=Layout(width='160px')
)

top_panel = VBox([
    HBox([data_source_label, data_source_mode, workflow_mode, workflow_label, workflow_selector]),
    HBox([model_label, model_dropdown, grid_type_label, grid_type_filter]),
    HBox([grid_label, grid_dropdown, load_grid_button]),
    HBox([execute_button, save_button])
], layout=Layout(padding='10px', border='1px solid #ccc'))

# ============================================
# Left Panel: Workflow Configuration
# ============================================

# Pipeline Steps Configuration
preprocessing_label = widgets.HTML("<b>Preprocessing:</b>")
preprocessing_feature_extraction = Checkbox(value=True, description='Feature Extraction', style={'description_width': 'initial'})
preprocessing_normalization = Checkbox(value=True, description='Normalization', style={'description_width': 'initial'})
preprocessing_dim_reduction = Dropdown(
    options=[('None', 'none'), ('PCA', 'pca'), ('ICA', 'ica'), ('t-SNE', 'tsne')],
    value='none',
    description='Dim Reduction:',
    style={'description_width': 'initial'}
)

preprocessing_config = VBox([
    preprocessing_label,
    preprocessing_feature_extraction,
    preprocessing_normalization,
    preprocessing_dim_reduction
], layout=Layout(padding='5px', border='1px solid #ddd'))

detection_label = widgets.HTML("<b>Detection:</b>")
detection_detectors = SelectMultiple(
    options=[('Z-Score', 'zscore'), ('IQR', 'iqr'), ('DBSCAN', 'dbscan'), ('Isolation Forest', 'isolation_forest')],
    value=['zscore', 'iqr'],
    description='Detectors:',
    style={'description_width': 'initial'}
)
detection_threshold = FloatSlider(value=3.0, min=1.0, max=5.0, step=0.1, description='Threshold:', style={'description_width': 'initial'})

detection_config = VBox([
    detection_label,
    detection_detectors,
    detection_threshold
], layout=Layout(padding='5px', border='1px solid #ddd'))

postprocessing_label = widgets.HTML("<b>Post-processing:</b>")
postprocessing_filtering = Checkbox(value=True, description='Filtering', style={'description_width': 'initial'})
postprocessing_aggregation = Checkbox(value=False, description='Aggregation', style={'description_width': 'initial'})
postprocessing_validation = Checkbox(value=True, description='Validation', style={'description_width': 'initial'})

postprocessing_config = VBox([
    postprocessing_label,
    postprocessing_filtering,
    postprocessing_aggregation,
    postprocessing_validation
], layout=Layout(padding='5px', border='1px solid #ddd'))

pipeline_accordion = Accordion(children=[
    preprocessing_config,
    detection_config,
    postprocessing_config
])
pipeline_accordion.set_title(0, 'Preprocessing')
pipeline_accordion.set_title(1, 'Detection')
pipeline_accordion.set_title(2, 'Post-processing')

# Ensemble Configuration
ensemble_label = widgets.HTML("<b>Ensemble Configuration:</b>")
ensemble_method = RadioButtons(
    options=[('Voting', 'voting'), ('Weighted', 'weighted'), ('Stacking', 'stacking')],
    value='voting',
    description='Method:',
    style={'description_width': 'initial'}
)
ensemble_detectors = SelectMultiple(
    options=[('Z-Score', 'zscore'), ('IQR', 'iqr'), ('DBSCAN', 'dbscan'), ('Isolation Forest', 'isolation_forest')],
    value=['zscore', 'iqr', 'dbscan'],
    description='Detectors:',
    style={'description_width': 'initial'}
)
ensemble_weight1 = FloatSlider(value=0.33, min=0.0, max=1.0, step=0.05, description='Weight 1:', style={'description_width': 'initial'})
ensemble_weight2 = FloatSlider(value=0.33, min=0.0, max=1.0, step=0.05, description='Weight 2:', style={'description_width': 'initial'})
ensemble_weight3 = FloatSlider(value=0.34, min=0.0, max=1.0, step=0.05, description='Weight 3:', style={'description_width': 'initial'})
ensemble_threshold = FloatSlider(value=0.5, min=0.0, max=1.0, step=0.05, description='Threshold:', style={'description_width': 'initial'})

ensemble_config = VBox([
    ensemble_label,
    ensemble_method,
    ensemble_detectors,
    ensemble_weight1,
    ensemble_weight2,
    ensemble_weight3,
    ensemble_threshold
], layout=Layout(padding='5px', border='1px solid #ddd'))

# Validation Configuration
validation_label = widgets.HTML("<b>Validation Configuration:</b>")
validation_method = RadioButtons(
    options=[('Cross-Validation', 'cv'), ('Hold-out', 'holdout'), ('Time Series', 'timeseries'), ('Spatial', 'spatial')],
    value='cv',
    description='Method:',
    style={'description_width': 'initial'}
)
validation_k_fold = IntSlider(value=5, min=2, max=10, step=1, description='K-Fold:', style={'description_width': 'initial'})
validation_test_split = FloatSlider(value=0.2, min=0.1, max=0.5, step=0.05, description='Test Split:', style={'description_width': 'initial'})
validation_ground_truth = Dropdown(
    options=[('Available', 'available'), ('Not Available', 'not_available')],
    value='available',
    description='Ground Truth:',
    style={'description_width': 'initial'}
)

validation_config = VBox([
    validation_label,
    validation_method,
    validation_k_fold,
    validation_test_split,
    validation_ground_truth
], layout=Layout(padding='5px', border='1px solid #ddd'))

# Dynamic configuration based on mode
config_accordion = Accordion(children=[
    pipeline_accordion,
    ensemble_config,
    validation_config
])
config_accordion.set_title(0, 'Pipeline Steps')
config_accordion.set_title(1, 'Ensemble')
config_accordion.set_title(2, 'Validation')

left_panel = VBox([
    widgets.HTML("<h3>Workflow Configuration</h3>"),
    config_accordion
], layout=Layout(width='300px', padding='10px', border='1px solid #ccc'))

# ============================================
# Center Panel: Visualization
# ============================================

viz_mode = RadioButtons(
    options=[('Pipeline', 'pipeline'), ('Results', 'results'), ('Validation', 'validation'), ('Report', 'report')],
    value='pipeline',
    description='View:',
    style={'description_width': 'initial'}
)

viz_output = Output(layout=Layout(height='600px', overflow='auto'))

center_panel = VBox([
    widgets.HTML("<h3>Workflow Visualization</h3>"),
    viz_mode,
    viz_output
], layout=Layout(flex='1 1 auto', padding='10px', border='1px solid #ccc'))

# ============================================
# Right Panel: Results
# ============================================

# Pipeline Status
pipeline_status_label = widgets.HTML("<b>Pipeline Status:</b>")
pipeline_status_display = widgets.HTML("No workflow executed yet")
pipeline_status_section = VBox([
    pipeline_status_label,
    pipeline_status_display
], layout=Layout(padding='5px'))

# Detection Results
detection_results_label = widgets.HTML("<b>Detection Results:</b>")
detection_results_display = widgets.HTML("No results available")
detection_results_section = VBox([
    detection_results_label,
    detection_results_display
], layout=Layout(padding='5px'))

# Validation Results
validation_results_label = widgets.HTML("<b>Validation Results:</b>")
validation_results_display = widgets.HTML("No validation performed")
validation_results_section = VBox([
    validation_results_label,
    validation_results_display
], layout=Layout(padding='5px'))

# Report Summary
report_summary_label = widgets.HTML("<b>Report Summary:</b>")
report_summary_display = widgets.HTML("No report generated")
report_summary_section = VBox([
    report_summary_label,
    report_summary_display
], layout=Layout(padding='5px'))

# Export Options
export_label = widgets.HTML("<b>Export:</b>")
export_workflow_button = Button(description='Export Workflow', button_style='', layout=Layout(width='150px'))
export_results_button = Button(description='Export Results', button_style='', layout=Layout(width='150px'))
export_report_button = Button(description='Export Report', button_style='', layout=Layout(width='150px'))
save_config_button = Button(description='Save Config', button_style='', layout=Layout(width='150px'))

export_section = VBox([
    export_label,
    export_workflow_button,
    export_results_button,
    export_report_button,
    save_config_button
], layout=Layout(padding='5px'))

right_panel = VBox([
    pipeline_status_section,
    detection_results_section,
    validation_results_section,
    report_summary_section,
    export_section
], layout=Layout(width='250px', padding='10px', border='1px solid #ccc'))

# ============================================
# Bottom Panel: Status and Progress
# ============================================

status_display = widgets.HTML("<b>Status:</b> Ready to execute workflow")
progress_bar = widgets.IntProgress(
    value=0,
    min=0,
    max=100,
    description='Progress:',
    bar_style='info',
    layout=Layout(width='100%')
)
log_display = Output(layout=Layout(height='100px', overflow='auto'))

bottom_panel = VBox([
    status_display,
    progress_bar,
    log_display
], layout=Layout(padding='10px', border='1px solid #ccc'))

# ============================================
# Helper Functions for MongoDB
# ============================================

def update_grid_dropdown(change=None):
    """Update grid dropdown when model or grid type changes."""
    global current_model_id
    
    model_id = model_dropdown.value
    grid_type = grid_type_filter.value
    
    if not model_id:
        grid_dropdown.options = [("━━━ Select Grid ━━━", None)]
        return
    
    current_model_id = model_id
    
    if not voxel_storage:
        grid_dropdown.options = [("━━━ MongoDB not available ━━━", None)]
        return
    
    try:
        # Get all grids for this model
        grids = voxel_storage.list_grids(model_id=model_id, limit=100)
        
        grid_options = [("━━━ Select Grid ━━━", None)]
        for grid in grids:
            metadata = grid.get('metadata', {})
            config_meta = metadata.get('configuration_metadata', {})
            if not config_meta:
                config_meta = metadata
            
            # Determine grid type
            is_fused = config_meta.get('fusion_applied', False)
            is_corrected = config_meta.get('correction_applied', False)
            is_processed = config_meta.get('processing_applied', False)
            has_signals = len(grid.get('available_signals', [])) > 0
            
            grid_type_match = False
            if grid_type == 'all':
                grid_type_match = True
            elif grid_type == 'fused' and is_fused:
                grid_type_match = True
            elif grid_type == 'corrected' and is_corrected:
                grid_type_match = True
            elif grid_type == 'processed' and is_processed:
                grid_type_match = True
            elif grid_type == 'signal_mapped' and has_signals and not is_corrected and not is_processed and not is_fused:
                grid_type_match = True
            elif grid_type == 'raw' and not has_signals:
                grid_type_match = True
            
            if grid_type_match:
                grid_id = grid.get('grid_id', str(grid.get('_id', '')))
                grid_name = grid.get('grid_name', 'Unknown')
                n_signals = len(grid.get('available_signals', []))
                
                # Build status label
                status_parts = []
                if is_fused:
                    status_parts.append('fused')
                if is_corrected:
                    status_parts.append('corrected')
                if is_processed:
                    status_parts.append('processed')
                if has_signals and not status_parts:
                    status_parts.append('mapped')
                if not status_parts:
                    status_parts.append('raw')
                
                status_str = ', '.join(status_parts)
                label = f"{grid_name} ({n_signals} signal(s), {status_str}) ({grid_id[:8]}...)"
                grid_options.append((label, grid_id))
        
        if len(grid_options) == 1:
            grid_options.append(("No grids found matching filter", None))
        
        grid_dropdown.options = grid_options
    except Exception as e:
        grid_dropdown.options = [("━━━ Error loading grids ━━━", None)]
        print(f"⚠️ Error loading grids: {e}")

def load_grid_from_mongodb(button):
    """Load selected grid from MongoDB."""
    global current_model_id, current_grid_id, loaded_grid_data, signal_arrays
    
    if not voxel_storage or not grid_dropdown.value:
        status_display.value = "<b>Status:</b> <span style='color: red;'>⚠️ Please select a grid to load</span>"
        return
    
    grid_id = grid_dropdown.value
    current_grid_id = grid_id
    
    status_display.value = "<b>Status:</b> Loading grid from MongoDB..."
    progress_bar.value = 0
    
    try:
        # Load grid from MongoDB
        grid_data = voxel_storage.load_voxel_grid(grid_id=grid_id)
        
        if not grid_data:
            status_display.value = "<b>Status:</b> <span style='color: red;'>⚠️ Failed to load grid</span>"
            return
        
        # Extract data from dictionary
        signal_arrays = grid_data.get('signal_arrays', {})
        metadata = grid_data.get('metadata', {})
        grid_name = grid_data.get('grid_name', 'Unknown')
        
        if not signal_arrays or len(signal_arrays) == 0:
            status_display.value = "<b>Status:</b> <span style='color: orange;'>⚠️ Grid has no signals</span>"
            return
        
        # Store loaded data
        loaded_grid_data = {
            'grid_data': grid_data,
            'metadata': metadata,
            'signal_arrays': signal_arrays
        }
        
        progress_bar.value = 100
        status_display.value = f"<b>Status:</b> <span style='color: green;'>✅ Loaded grid: {grid_name} ({len(signal_arrays)} signal(s))</span>"
        
    except Exception as e:
        status_display.value = f"<b>Status:</b> <span style='color: red;'>❌ Error loading grid: {str(e)}</span>"
        progress_bar.value = 0
        import traceback
        traceback.print_exc()

# Function to update UI based on data source mode
def update_data_source_mode(change):
    """Show/hide MongoDB widgets based on data source mode."""
    if change['new'] == 'mongodb':
        model_dropdown.layout.display = 'flex'
        grid_type_filter.layout.display = 'flex'
        grid_dropdown.layout.display = 'flex'
        load_grid_button.layout.display = 'flex'
    else:
        model_dropdown.layout.display = 'none'
        grid_type_filter.layout.display = 'none'
        grid_dropdown.layout.display = 'none'
        load_grid_button.layout.display = 'none'

# Connect events
data_source_mode.observe(update_data_source_mode, names='value')
update_data_source_mode({'new': data_source_mode.value})
model_dropdown.observe(update_grid_dropdown, names='value')
grid_type_filter.observe(update_grid_dropdown, names='value')
load_grid_button.on_click(load_grid_from_mongodb)

# ============================================
# Workflow Functions
# ============================================

def detect_anomalies_simple(data, method='zscore', threshold=3.0):
    """Simple anomaly detection."""
    # Use 'temperature' if present, otherwise the first non-coordinate signal
    if 'temperature' in data:
        signal = data['temperature']
    else:
        signal_keys = [k for k in data if k not in ('time', 'x', 'y', 'z')]
        signal = data[signal_keys[0]] if signal_keys else np.array([])
    
    if method == 'zscore':
        mean = np.mean(signal)
        std = np.std(signal)
        z_scores = np.abs((signal - mean) / (std + 1e-8))  # Guard against zero variance
        anomalies = z_scores > threshold
    elif method == 'iqr':
        q1 = np.percentile(signal, 25)
        q3 = np.percentile(signal, 75)
        iqr = q3 - q1
        lower_bound = q1 - 1.5 * iqr
        upper_bound = q3 + 1.5 * iqr
        anomalies = (signal < lower_bound) | (signal > upper_bound)
    elif method == 'dbscan':
        features = np.column_stack([signal, data['x'], data['y'], data['z']])
        features = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
        dbscan = DBSCAN(eps=0.5, min_samples=5)
        labels = dbscan.fit_predict(features)
        anomalies = labels == -1
    elif method == 'isolation_forest':
        features = np.column_stack([signal, data['x'], data['y'], data['z']])
        features = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
        iso_forest = IsolationForest(contamination=0.1, random_state=42)
        labels = iso_forest.fit_predict(features)
        anomalies = labels == -1
    else:
        anomalies = np.zeros(len(signal), dtype=bool)
    
    return anomalies

def execute_workflow(button):
    """Execute complete anomaly detection workflow."""
    global workflow_results, sample_data, ground_truth, loaded_grid_data, signal_arrays
    
    status_display.value = "<b>Status:</b> Executing workflow..."
    progress_bar.value = 0
    
    with log_display:
        clear_output(wait=True)
        print("Starting workflow execution...")
    
    try:
        mode = workflow_mode.value
        progress_bar.value = 10
        
        # Load data based on mode
        if data_source_mode.value == 'mongodb':
            if not loaded_grid_data or not signal_arrays:
                status_display.value = "<b>Status:</b> <span style='color: red;'>⚠️ Please load a grid from MongoDB first</span>"
                return
            
            # Extract first signal from loaded grid (or allow selection)
            signal_name = list(signal_arrays.keys())[0] if signal_arrays else None
            if not signal_name:
                status_display.value = "<b>Status:</b> <span style='color: red;'>⚠️ No signals found in grid</span>"
                return
            
            # Convert signal array to 1D for workflow
            signal_array = signal_arrays[signal_name]
            signal_1d = signal_array.flatten()
            
            # Create data structure for workflow
            n_points = len(signal_1d)
            workflow_data = {
                signal_name: signal_1d,
                'time': np.linspace(0, n_points, n_points),
                'x': np.random.uniform(0, 10, n_points),  # Placeholder spatial coordinates
                'y': np.random.uniform(0, 10, n_points),
                'z': np.random.uniform(0, 5, n_points)
            }
            
            # Use workflow_data instead of sample_data
            current_data = workflow_data
            current_ground_truth = None  # No ground truth for real data
            progress_bar.value = 15
        else:
            # Use sample data
            current_data = sample_data.copy()
            current_ground_truth = ground_truth
            progress_bar.value = 15
        
        # Preprocessing
        with log_display:
            print("Step 1: Preprocessing...")
        processed_data = current_data.copy()
        if preprocessing_normalization.value:
            # Normalize the first non-coordinate signal
            signal_keys = [k for k in processed_data if k not in ('time', 'x', 'y', 'z')]
            if signal_keys:
                signal_key = signal_keys[0]
                signal = processed_data[signal_key]
                processed_data[signal_key] = (signal - signal.mean()) / (signal.std() + 1e-8)
        progress_bar.value = 30
        
        # Detection
        with log_display:
            print("Step 2: Detection...")
        detector_list = list(detection_detectors.value) if mode == 'pipeline' else list(ensemble_detectors.value)
        
        all_detections = {}
        for det in detector_list:
            anomalies = detect_anomalies_simple(processed_data, det, detection_threshold.value)
            all_detections[det] = anomalies
        
        # Ensemble if needed
        if mode == 'ensemble':
            with log_display:
                print("Step 3: Ensemble combination...")
            if ensemble_method.value == 'voting':
                # Majority voting
                combined = np.sum(list(all_detections.values()), axis=0) > len(all_detections) / 2
            elif ensemble_method.value == 'weighted':
                # Weighted voting
                weights = [ensemble_weight1.value, ensemble_weight2.value, ensemble_weight3.value]
                weighted_sum = np.sum([w * anom for w, anom in zip(weights[:len(all_detections)], all_detections.values())], axis=0)
                combined = weighted_sum > ensemble_threshold.value
            else:  # stacking
                # No meta-learner is trained in this demo; fall back to majority voting
                combined = np.sum(list(all_detections.values()), axis=0) > len(all_detections) / 2
            all_detections['ensemble'] = combined
        
        progress_bar.value = 60
        
        # Post-processing
        with log_display:
            print("Step 4: Post-processing...")
        if postprocessing_filtering.value:
            # Simple filtering (remove isolated anomalies)
            for key in all_detections:
                filtered = all_detections[key].copy()
                for i in range(1, len(filtered)-1):
                    if filtered[i] and not (filtered[i-1] or filtered[i+1]):
                        filtered[i] = False
                all_detections[key] = filtered
        
        progress_bar.value = 80
        
        # Validation
        if validation_ground_truth.value == 'available' and current_ground_truth is not None:
            with log_display:
                print("Step 5: Validation...")
            validation_metrics = {}
            for key, anomalies in all_detections.items():
                tp = np.sum((anomalies == True) & (current_ground_truth == True))
                fp = np.sum((anomalies == True) & (current_ground_truth == False))
                tn = np.sum((anomalies == False) & (current_ground_truth == False))
                fn = np.sum((anomalies == False) & (current_ground_truth == True))
                accuracy = (tp + tn) / (tp + tn + fp + fn) if (tp + tn + fp + fn) > 0 else 0
                precision = tp / (tp + fp) if (tp + fp) > 0 else 0
                recall = tp / (tp + fn) if (tp + fn) > 0 else 0
                f1 = 2 * precision * recall / (precision + recall) if (precision + recall) > 0 else 0
                validation_metrics[key] = {
                    'accuracy': accuracy,
                    'precision': precision,
                    'recall': recall,
                    'f1': f1,
                    'tp': tp, 'fp': fp, 'tn': tn, 'fn': fn
                }
        else:
            validation_metrics = {}
        
        workflow_results = {
            'detections': all_detections,
            'validation_metrics': validation_metrics,
            'n_anomalies': {k: np.sum(v) for k, v in all_detections.items()},
            'mode': mode,
            'data': current_data,  # Store the data used for workflow
            'ground_truth': current_ground_truth,  # Store ground truth if available
            'processed_data': processed_data  # Store preprocessed data
        }
        
        progress_bar.value = 90
        
        # Update displays
        update_results_display()
        update_visualization()
        
        progress_bar.value = 100
        status_display.value = "<b>Status:</b> <span style='color: green;'>✅ Workflow completed</span>"
        with log_display:
            print("✅ Workflow execution completed successfully!")
        
    except Exception as e:
        with log_display:
            print(f"❌ Error: {e}")
        status_display.value = "<b>Status:</b> <span style='color: red;'>Error during execution</span>"
        progress_bar.value = 0

def update_results_display():
    """Update results displays."""
    global workflow_results
    
    if not workflow_results:
        return
    
    # Pipeline status
    status_html = "<p><b>Steps Completed:</b></p>"
    status_html += "<p>✅ Preprocessing</p>"
    status_html += "<p>✅ Detection</p>"
    status_html += "<p>✅ Post-processing</p>"
    # Report validation only if metrics were actually computed
    if workflow_results.get('validation_metrics'):
        status_html += "<p>✅ Validation</p>"
    status_html += "<p><b>Execution Time:</b> ~2.5s</p>"
    pipeline_status_display.value = status_html
    
    # Detection results
    if 'n_anomalies' in workflow_results:
        results_html = "<p><b>Anomalies Detected:</b></p>"
        for det, count in workflow_results['n_anomalies'].items():
            results_html += f"<p>• <b>{det}:</b> {count}</p>"
        detection_results_display.value = results_html
    
    # Validation results
    if 'validation_metrics' in workflow_results and workflow_results['validation_metrics']:
        val_html = "<p><b>Validation Metrics:</b></p>"
        for det, metrics in workflow_results['validation_metrics'].items():
            val_html += f"<p><b>{det}:</b></p>"
            val_html += f"<p>  Accuracy: {metrics['accuracy']:.3f}</p>"
            val_html += f"<p>  F1 Score: {metrics['f1']:.3f}</p>"
        validation_results_display.value = val_html
    
    # Report summary
    if workflow_results:
        report_html = "<p><b>Report Summary:</b></p>"
        # The figure is the largest per-detector count, not a sum across detectors
        report_html += f"<p><b>Max Anomalies (per detector):</b> {max(workflow_results['n_anomalies'].values()) if workflow_results['n_anomalies'] else 0}</p>"
        report_html += "<p><b>Detectors Used:</b> " + ", ".join(workflow_results['n_anomalies'].keys()) + "</p>"
        report_html += "<p><b>Status:</b> Complete</p>"
        report_summary_display.value = report_html

def update_visualization():
    """Update visualization display."""
    global workflow_results
    
    with viz_output:
        clear_output(wait=True)
        
        if not workflow_results:
            display(HTML("<p>Execute workflow to see visualization</p>"))
            return
        
        viz = viz_mode.value
        
        if viz == 'pipeline':
            # Pipeline visualization
            fig, ax = plt.subplots(1, 1, figsize=(12, 6))
            ax.set_xlim(0, 10)
            ax.set_ylim(0, 3)
            ax.axis('off')
            
            # Draw workflow diagram
            steps = [
                ('Preprocessing', 1, 1.5, '#e3f2fd'),
                ('Detection', 3, 1.5, '#f3e5f5'),
                ('Post-processing', 5, 1.5, '#e8f5e9'),
                ('Validation', 7, 1.5, '#fff3e0'),
                ('Results', 9, 1.5, '#c8e6c9')
            ]
            
            for i, (name, x, y, color) in enumerate(steps):
                # Draw box
                box = FancyBboxPatch((x-0.4, y-0.3), 0.8, 0.6, 
                                    boxstyle="round,pad=0.1", 
                                    facecolor=color, edgecolor='black', linewidth=2)
                ax.add_patch(box)
                ax.text(x, y, name, ha='center', va='center', fontsize=10, fontweight='bold')
                
                # Draw arrow
                if i < len(steps) - 1:
                    # Start at the box's right edge (half-width 0.4 + pad 0.1); the head ends at the next box
                    ax.arrow(x+0.5, y, 0.9, 0, head_width=0.1, head_length=0.1, 
                            fc='black', ec='black', linewidth=1.5)
            
            ax.set_title('Anomaly Detection Workflow Pipeline', fontsize=14, fontweight='bold')
            plt.tight_layout()
            plt.show()
        
        elif viz == 'results':
            if workflow_results and 'detections' in workflow_results:
                detections = workflow_results['detections']
                data = workflow_results.get('data', {})
                ground_truth = workflow_results.get('ground_truth', None)
                
                if not data:
                    display(HTML("<p>No data available for visualization</p>"))
                    return
                
                # Get signal values and time array
                signal_keys = [k for k in data.keys() if k not in ['time', 'x', 'y', 'z']]
                if not signal_keys:
                    display(HTML("<p>No signal data available for visualization</p>"))
                    return
                
                signal_name = signal_keys[0]  # Use first available signal
                signal_values = data[signal_name]
                time_array = data.get('time', np.arange(len(signal_values)))
                
                n_detectors = len(detections)
                n_plots = min(4, n_detectors)
                
                if n_plots > 0:
                    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
                    axes = axes.flatten()
                    
                    for idx, (det_name, anomalies) in enumerate(list(detections.items())[:n_plots]):
                        ax = axes[idx]
                        ax.plot(time_array, signal_values, 'b-', alpha=0.7, label='Signal')
                        ax.scatter(time_array[anomalies], signal_values[anomalies], 
                                  c='red', s=30, label='Anomalies', zorder=5)
                        if ground_truth is not None:
                            ax.scatter(time_array[ground_truth], signal_values[ground_truth], 
                                      c='orange', s=20, marker='x', label='Ground Truth', zorder=4, alpha=0.5)
                        ax.set_xlabel('Time')
                        ax.set_ylabel(signal_name.replace('_', ' ').title())
                        ax.set_title(f'{det_name.capitalize()} Detection')
                        ax.legend()
                        ax.grid(True, alpha=0.3)
                    
                    # Hide unused subplots
                    for idx in range(n_plots, 4):
                        axes[idx].axis('off')
                    
                    plt.tight_layout()
                    plt.show()
                else:
                    display(HTML("<p>No detections available for visualization</p>"))
            else:
                display(HTML("<p>Execute workflow first to see results visualization</p>"))
        
        elif viz == 'validation' and workflow_results and 'validation_metrics' in workflow_results:
            # Validation visualization
            metrics = workflow_results['validation_metrics']
            
            if metrics:
                fig, axes = plt.subplots(1, 2, figsize=(14, 5))
                
                # Performance metrics comparison
                detector_names = list(metrics.keys())
                accuracy_vals = [metrics[d]['accuracy'] for d in detector_names]
                f1_vals = [metrics[d]['f1'] for d in detector_names]
                
                x = np.arange(len(detector_names))
                width = 0.35
                axes[0].bar(x - width/2, accuracy_vals, width, label='Accuracy', alpha=0.8)
                axes[0].bar(x + width/2, f1_vals, width, label='F1 Score', alpha=0.8)
                axes[0].set_xlabel('Detector')
                axes[0].set_ylabel('Score')
                axes[0].set_title('Validation Metrics Comparison')
                axes[0].set_xticks(x)
                axes[0].set_xticklabels(detector_names, rotation=45)
                axes[0].legend()
                axes[0].grid(True, alpha=0.3, axis='y')
                axes[0].set_ylim(0, 1)
                
                # Confusion matrix (for best detector)
                best_detector = max(metrics.keys(), key=lambda d: metrics[d]['f1'])
                cm = metrics[best_detector]
                cm_data = [[cm['tn'], cm['fp']], [cm['fn'], cm['tp']]]
                im = axes[1].imshow(cm_data, cmap='Blues', aspect='auto')
                axes[1].set_xticks([0, 1])
                axes[1].set_xticklabels(['Normal', 'Anomaly'])
                axes[1].set_yticks([0, 1])
                axes[1].set_yticklabels(['Normal', 'Anomaly'])
                axes[1].set_xlabel('Predicted')
                axes[1].set_ylabel('Actual')
                axes[1].set_title(f'Confusion Matrix: {best_detector}')
                for i in range(2):
                    for j in range(2):
                        axes[1].text(j, i, cm_data[i][j], ha='center', va='center', color='black', fontsize=14)
                plt.colorbar(im, ax=axes[1])
                
                plt.tight_layout()
                plt.show()
            else:
                display(HTML("<p>No validation metrics available</p>"))
        
        elif viz == 'report':
            # Report visualization
            if workflow_results:
                report_html = "<div style='padding: 20px; border: 1px solid #ccc; background: #f9f9f9;'>"
                report_html += "<h2>Anomaly Detection Report</h2>"
                report_html += f"<p><b>Date:</b> {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}</p>"
                report_html += "<h3>Summary</h3>"
                if 'n_anomalies' in workflow_results:
                    report_html += "<ul>"
                    for det, count in workflow_results['n_anomalies'].items():
                        report_html += f"<li><b>{det}:</b> {count} anomalies detected</li>"
                    report_html += "</ul>"
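                report_html += "<h3>Key Findings</h3>"
                report_html += "<ul>"
                # Derive findings from the results instead of hardcoding them
                n_flagging = sum(1 for c in workflow_results.get('n_anomalies', {}).values() if c > 0)
                report_html += f"<li>{n_flagging} detector(s) flagged at least one anomaly</li>"
                val_metrics = workflow_results.get('validation_metrics', {})
                if val_metrics:
                    best = max(val_metrics, key=lambda d: val_metrics[d]['f1'])
                    report_html += f"<li>Best F1 score: {val_metrics[best]['f1']:.3f} ({best})</li>"
                else:
                    report_html += "<li>No validation metrics available</li>"
                report_html += "</ul>"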
                report_html += "</div>"
                display(HTML(report_html))
            else:
                display(HTML("<p>Generate workflow results to see report</p>"))

# Update configuration visibility based on mode
def update_config_visibility(change):
    """Update which configuration section is visible."""
    mode = change['new']
    
    # Show relevant accordion section
    config_accordion.selected_index = {
        'pipeline': 0,
        'ensemble': 1,
        'validation': 2,
        'reporting': 0
    }.get(mode, 0)

workflow_mode.observe(update_config_visibility, names='value')

# Connect events
execute_button.on_click(execute_workflow)
viz_mode.observe(lambda x: update_visualization(), names='value')
workflow_mode.observe(lambda x: update_visualization(), names='value')

# ============================================
# Main Layout
# ============================================

main_layout = VBox([
    top_panel,
    HBox([left_panel, center_panel, right_panel]),
    bottom_panel
])

# Display the interface
display(main_layout)


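The validation step in the cell above compares each detector's flags against ground truth and derives accuracy, precision, recall, and F1 from the four confusion counts. The following is a minimal, dependency-free sketch of that calculation. The function name `confusion_metrics` and the sample flags are illustrative assumptions, not part of the AM-QADF API, and plain lists stand in for the NumPy arrays the notebook uses.

```python
from typing import Dict, Sequence


def confusion_metrics(predicted: Sequence[bool], actual: Sequence[bool]) -> Dict[str, float]:
    """Compute confusion counts and summary metrics for boolean anomaly flags."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    # Count each cell of the 2x2 confusion matrix
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)

    # Guard every ratio against a zero denominator, as the notebook cell does
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total > 0 else 0.0
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)

    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1,
    }


# Hypothetical flags for five samples
predicted_flags = [True, True, False, False, True]
ground_truth_flags = [True, False, False, True, True]
metrics = confusion_metrics(predicted_flags, ground_truth_flags)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because anomalies are usually rare, accuracy alone can look high even when a detector misses most of them, which is why the notebook reports F1 alongside it.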

## Summary

Congratulations! You've learned how to build complete anomaly detection workflows.

### Key Takeaways

1. **Pipeline Steps**: Preprocessing (feature extraction, normalization, dimensionality reduction), Detection (multiple detectors), Post-processing (filtering, aggregation, validation)
2. **Ensemble Methods**: Voting (majority), Weighted (custom weights), Stacking (meta-learner)
3. **Validation**: Cross-Validation (K-fold), Hold-out (train/test split), Time Series (temporal splits), Spatial (spatial cross-validation)
4. **Reporting**: Automated report generation with summary statistics, key findings, and visualizations
5. **Workflow Management**: Save and reuse workflow configurations
6. **Integration**: Seamless integration with quality assessment and other modules
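
As a rough illustration of the voting and weighted ensemble methods listed above, the sketch below combines per-detector boolean flags into a single ensemble flag per sample. The function name `combine_detections` and the detector names (`zscore`, `iqr`, `mad`) are hypothetical and not taken from AM-QADF. Unlike the notebook cell, which compares the raw weighted sum with the threshold, this sketch assumes the weighted vote is normalized by the total weight so that the threshold stays in the range 0 to 1.

```python
from typing import Dict, List, Optional, Sequence


def combine_detections(
    detections: Dict[str, Sequence[bool]],
    weights: Optional[Dict[str, float]] = None,
    threshold: float = 0.5,
) -> List[bool]:
    """Combine per-detector flags into one ensemble flag per sample.

    Without weights this is strict majority voting. With weights, a sample is
    flagged when its normalized weighted vote exceeds `threshold`.
    """
    if not detections:
        return []
    names = list(detections)
    n_samples = len(detections[names[0]])
    if any(len(detections[name]) != n_samples for name in names):
        raise ValueError("all detectors must flag the same number of samples")

    if weights is None:
        # Equal weights reduce the weighted vote to a majority vote
        weights = {name: 1.0 for name in names}
    total_weight = sum(weights[name] for name in names)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    combined = []
    for i in range(n_samples):
        # Fraction of the total weight held by detectors that flagged sample i
        score = sum(weights[name] for name in names if detections[name][i]) / total_weight
        combined.append(score > threshold)
    return combined


# Hypothetical flags from three detectors over four samples
flags = {
    "zscore": [True, False, True, False],
    "iqr": [True, False, False, False],
    "mad": [False, False, True, True],
}
majority = combine_detections(flags)
weighted = combine_detections(flags, weights={"zscore": 1.0, "iqr": 1.0, "mad": 3.0})
print("majority:", majority)
print("weighted:", weighted)
```

Stacking is not sketched here because it requires training a meta-learner on labeled data.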

### Next Steps

Proceed to:
- **15_3D_Visualization.ipynb** - 3D visualization of voxel data
- **16_Advanced_Visualization.ipynb** - Advanced visualization techniques

### Related Resources

- Anomaly Detection Documentation: `../docs/AM_QADF/05-modules/anomaly-detection.md`
- API Reference: `../docs/AM_QADF/06-api-reference/anomaly-detection-api.md`
- Examples: `../examples/`
