{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Drift Detection Framework Demo\n",
    "\n",
    "This notebook demonstrates how to use the drift detection framework to detect concept drift in machine learning models."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "import sys\n",
    "import os\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# Add the parent directory to the path if running from the notebooks directory\n",
    "sys.path.append('..')\n",
    "\n",
    "# Import drift detection modules\n",
    "from drift_detection import (\n",
    "    load_and_preprocess_nsl_kdd,\n",
    "    prepare_drift_experiment_data,\n",
    "    DriftMonitor,\n",
    "    NeuralDriftMonitor,\n",
    "    run_drift_detection_comparison\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Load and Prepare Data\n",
    "\n",
    "First, we'll load the NSL-KDD dataset and prepare it for our drift detection experiment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Load and preprocess the NSL-KDD dataset\n",
    "df_train, df_test = load_and_preprocess_nsl_kdd(\n",
    "    train_path='../data/KDDTrain+.txt',\n",
    "    test_path='../data/KDDTest+.txt'\n",
    ")\n",
    "\n",
    "# Check dataset shapes\n",
    "print(f\"Training data shape: {df_train.shape}\")\n",
    "print(f\"Test data shape: {df_test.shape}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Prepare data for drift detection experiment\n",
    "X_train, y_train, X_test, y_test, front_size, original_size = prepare_drift_experiment_data(\n",
    "    df_train, df_test, validation_split=0.3\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Basic Drift Detection with Multiple Classifiers\n",
    "\n",
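    "Now, we'll use the `DriftMonitor` class to detect drift with multiple classifiers. For each window, the KS-test p-value is compared against `threshold` and the PSI value against `psi_threshold`."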
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Initialize drift monitor with a specific window size\n",
    "window_size = 600\n",
    "monitor = DriftMonitor(window_size=window_size, threshold=0.01, psi_threshold=0.2)\n",
    "\n",
    "# Detect drift\n",
    "results = monitor.detect_drift(X_train, y_train, X_test, y_test)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Plot the results for the XGBoost classifier\n",
    "xgb_results = results['XGBoost']\n",
    "\n",
    "plt.figure(figsize=(15, 12))\n",
    "\n",
    "# KS Drift Scores plot\n",
    "plt.subplot(3, 1, 1)\n",
    "plt.plot(xgb_results['drift_scores'], label='KS Score', color='blue', linewidth=2)\n",
    "plt.axhline(y=0.01, color='r', linestyle='--', label='KS Threshold', linewidth=2)\n",
    "plt.title('KS Test Drift Scores - XGBoost', fontsize=16)\n",
    "plt.xlabel('Window Index', fontsize=14)\n",
    "plt.ylabel('KS Score (p-value)', fontsize=14)\n",
    "plt.yscale('log')\n",
    "plt.legend()\n",
    "plt.grid(True, alpha=0.3)\n",
    "\n",
    "# PSI Scores plot\n",
    "plt.subplot(3, 1, 2)\n",
    "plt.plot(xgb_results['psi_scores'], label='PSI Score', color='green', linewidth=2)\n",
    "plt.axhline(y=0.2, color='r', linestyle='--', label='PSI Threshold', linewidth=2)\n",
    "plt.title('PSI Scores - XGBoost', fontsize=16)\n",
    "plt.xlabel('Window Index', fontsize=14)\n",
    "plt.ylabel('PSI Value', fontsize=14)\n",
    "plt.legend()\n",
    "plt.grid(True, alpha=0.3)\n",
    "\n",
    "# Accuracy plot\n",
    "plt.subplot(3, 1, 3)\n",
    "plt.plot(xgb_results['accuracies'], label='Accuracy', color='orange', linewidth=2)\n",
    "plt.title('Model Accuracy - XGBoost', fontsize=16)\n",
    "plt.xlabel('Window Index', fontsize=14)\n",
    "plt.ylabel('Accuracy', fontsize=14)\n",
    "plt.legend()\n",
    "plt.grid(True, alpha=0.3)\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Comparison Across Multiple Classifiers\n",
    "\n",
    "Let's compare drift detection across different classifiers and window sizes (400 and 600)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Run drift detection comparison with multiple window sizes\n",
    "comparison_results = run_drift_detection_comparison(\n",
    "    X_train, y_train, X_test, y_test,\n",
    "    window_sizes=[400, 600],\n",
    "    front_size=front_size,\n",
    "    original_size=original_size\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Neural Network-Based Drift Detection\n",
    "\n",
    "Now, let's use neural networks for drift detection. This step requires TensorFlow and is skipped if it is not installed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Try to import TensorFlow\n",
    "try:\n",
    "    import tensorflow as tf\n",
    "    tf_available = True\n",
    "except ImportError:\n",
    "    tf_available = False\n",
    "    print(\"TensorFlow not available. Skipping neural network-based drift detection.\")\n",
    "\n",
    "if tf_available:\n",
    "    # Initialize neural drift monitor\n",
    "    neural_monitor = NeuralDriftMonitor(window_size=600, threshold=0.01, psi_threshold=0.2)\n",
    "    \n",
    "    # Detect drift using neural networks\n",
    "    nn_results = neural_monitor.detect_drift_with_nn(X_train, y_train, X_test, y_test)\n",
    "    \n",
    "    # Print summary\n",
    "    print(\"\\nNeural network models drift detection summary:\")\n",
    "    for model_name, model_results in nn_results.items():\n",
    "        # A window counts as drifted when its drift score falls below the threshold\n",
    "        drift_count = sum(1 for score in model_results['drift_scores'] if score < neural_monitor.threshold)\n",
    "        accuracies = model_results['accuracies']\n",
    "        avg_accuracy = sum(accuracies) / len(accuracies) if accuracies else 0\n",
    "        print(f\"{model_name}: Detected drift in {drift_count} windows, Average accuracy: {avg_accuracy:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Interactive Drift Simulation\n",
    "\n",
    "Finally, let's run the interactive drift simulation. It requires PyTorch and is skipped if PyTorch is not installed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "source": [
    "# Try to import PyTorch for the simulation\n",
    "try:\n",
    "    import torch\n",
    "    torch_available = True\n",
    "except ImportError:\n",
    "    torch_available = False\n",
    "    print(\"PyTorch not available. Skipping interactive simulation.\")\n",
    "\n",
    "if torch_available:\n",
    "    # Import and display the simulator; report any failure instead of raising\n",
    "    try:\n",
    "        from drift_detection import DriftDetectionSimulator\n",
    "\n",
    "        simulator = DriftDetectionSimulator()\n",
    "        simulator.display()\n",
    "    except Exception as e:\n",
    "        print(f\"Error initializing simulator: {e}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Conclusion\n",
    "\n",
    "In this notebook, we demonstrated how to use the drift detection framework to detect concept drift in machine learning models. We used both traditional classifiers and neural networks, and we compared the results across different window sizes.\n",
    "\n",
    "Key observations:\n",
    "- Different classifiers may detect drift at different points\n",
    "- Neural networks may pick up more subtle forms of drift\n",
    "- Combining multiple drift detection methods (KS test, PSI) can make detection more robust\n",
    "- Window size affects the sensitivity of drift detection\n",
    "\n",
    "For more information, check out the README.md file and the source code documentation."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}