# Monitoring and Model Drift

In this notebook, we'll learn how to **monitor machine learning models** in production and detect **model drift** — when a model's performance degrades over time due to changing data patterns.

## 🎯 Objectives
- Understand the importance of ML model monitoring.
- Learn different types of drift (data, concept, and prediction drift).
- Implement drift detection techniques.
- Visualize drift using Python tools.

---

## 🧠 1. Why Monitoring is Important

Monitoring ensures that your deployed ML model remains **accurate, reliable, and consistent** over time.

Without monitoring:
- Model accuracy can degrade silently.
- Predictions may become biased or irrelevant.
- Business decisions might rely on outdated insights.

### Key Monitoring Metrics
- **Performance Metrics:** Accuracy, Precision, Recall, F1-score.
- **Data Drift:** Input data distribution changes.
- **Concept Drift:** Relationship between inputs and outputs changes.
- **Prediction Drift:** Shift in the distribution of the model's predictions.
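
As a rough illustration of the performance metrics listed above, the sketch below computes them with scikit-learn. The arrays `y_true` and `y_pred` are made-up values chosen for illustration, not output from the model trained in this notebook.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical ground-truth labels and model predictions (illustrative only)
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Each metric summarizes a different aspect of the same confusion matrix
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred),
    "recall": recall_score(y_true, y_pred),
    "f1": f1_score(y_true, y_pred),
}
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In a monitoring setup, these values would typically be recomputed per time window once true labels become available, so that a downward trend is visible.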

## 🧩 2. Setup — Simulating an ML Model

We’ll train a simple classification model and then simulate new incoming data that has drifted from the original distribution.

In [None]:
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns

# Generate initial training data
X_train, y_train = make_classification(n_samples=1000, n_features=5, random_state=42)
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

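# Generate new (drifted) data: `shift` moves every feature by a constant offset.
# Note: the different random_state also builds a different feature-label relationship,
# so this simulates concept drift as well as a shift in the inputs.
X_new, y_new = make_classification(n_samples=500, n_features=5, shift=1.5, random_state=99)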

# Predictions and accuracy
y_pred = model.predict(X_new)
acc = accuracy_score(y_new, y_pred)
print(f"Model accuracy on new (possibly drifted) data: {acc:.2f}")
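
An accuracy figure on new data is easier to judge against a baseline measured on data from the training distribution. The sketch below is one possible way to get such a baseline: it holds out part of the original data before fitting. The names `X_holdout`, `baseline_model`, and `baseline_acc` are introduced here for illustration and are not used elsewhere in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Same synthetic data as the training set above
X, y = make_classification(n_samples=1000, n_features=5, random_state=42)

# Hold out 20% of the data that the model never sees during fitting
X_fit, X_holdout, y_fit, y_holdout = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

baseline_model = RandomForestClassifier(random_state=42)
baseline_model.fit(X_fit, y_fit)

# Accuracy on unseen data from the same distribution serves as the reference point
baseline_acc = accuracy_score(y_holdout, baseline_model.predict(X_holdout))
print(f"Baseline accuracy on held-out data: {baseline_acc:.2f}")
```

A large gap between this baseline and the accuracy on new data is a stronger drift signal than the new-data accuracy alone.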

## 📊 3. Visualizing Data Drift

We'll compare feature distributions from training and new data to detect **data drift**.

In [None]:
train_df = pd.DataFrame(X_train, columns=[f'feature_{i}' for i in range(5)])
new_df = pd.DataFrame(X_new, columns=[f'feature_{i}' for i in range(5)])

fig, axes = plt.subplots(1, 2, figsize=(12, 5))
sns.kdeplot(train_df['feature_0'], fill=True, ax=axes[0], label='Train')
sns.kdeplot(new_df['feature_0'], fill=True, ax=axes[0], label='New')
axes[0].set_title('Feature 0 Distribution Drift')
axes[0].legend()

sns.kdeplot(train_df['feature_1'], fill=True, ax=axes[1], label='Train')
sns.kdeplot(new_df['feature_1'], fill=True, ax=axes[1], label='New')
axes[1].set_title('Feature 1 Distribution Drift')
axes[1].legend()
plt.show()

Where the two curves for a feature are visibly offset or differ in shape, that feature's distribution has **shifted**, which points to possible drift in the data source.
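
A visual comparison can be backed by a statistical test. The sketch below uses the two-sample Kolmogorov-Smirnov test from SciPy, which is not part of the original notebook and is shown only as one possible complement to the plots. The `reference` and `current` arrays are synthetic stand-ins for a single feature, and the 0.05 cut-off is an assumed convention.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical reference and current samples of one feature (illustrative only)
reference = rng.normal(loc=0.0, scale=1.0, size=1000)
current = rng.normal(loc=1.0, scale=1.0, size=500)

# The KS statistic is the largest gap between the two empirical CDFs
result = ks_2samp(reference, current)
print(f"KS statistic: {result.statistic:.3f}, p-value: {result.pvalue:.3g}")

# A small p-value suggests the two samples come from different distributions
drift_suspected = result.pvalue < 0.05
print("Drift suspected:", drift_suspected)
```

With large samples the test flags even tiny differences, so the statistic itself is often more informative than the p-value.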

## 📏 4. Quantifying Drift — Population Stability Index (PSI)

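PSI measures how much the distribution of a variable has changed between a reference sample (here, the training data) and a current sample. It bins the variable and compares the share of observations that fall into each bin in the two samples.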

Typical PSI interpretation:
- **< 0.1:** No significant drift.
- **0.1 – 0.25:** Moderate drift.
- **> 0.25:** Significant drift detected.
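
To make the arithmetic concrete: with per-bin shares `e` for the reference data and `a` for the current data, PSI is the sum of `(e - a) * ln(e / a)` over all bins. The sketch below works through a two-bin example by hand. The helpers `psi_from_shares` and `interpret_psi` are hypothetical names introduced for this illustration, and the bands follow the rule-of-thumb thresholds listed above.

```python
import numpy as np

def psi_from_shares(expected_shares, actual_shares):
    """PSI = sum((e - a) * ln(e / a)) over bins, given per-bin shares."""
    e = np.asarray(expected_shares, dtype=float)
    a = np.asarray(actual_shares, dtype=float)
    return float(np.sum((e - a) * np.log(e / a)))

def interpret_psi(psi):
    """Map a PSI value to the rule-of-thumb bands listed above."""
    if psi < 0.1:
        return "no significant drift"
    if psi <= 0.25:
        return "moderate drift"
    return "significant drift"

# Hypothetical two-bin example: the split moves from 50/50 to 60/40
psi = psi_from_shares([0.5, 0.5], [0.6, 0.4])
print(f"PSI = {psi:.4f} -> {interpret_psi(psi)}")
# prints: PSI = 0.0405 -> no significant drift
```

Both terms of the sum are non-negative, so PSI is zero only when the two sets of shares are identical.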

In [None]:
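def calculate_psi(expected, actual, buckets=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample."""
    expected = np.asarray(expected, dtype=float)
    actual = np.asarray(actual, dtype=float)

    # Bin edges are the quantiles of the reference (expected) data
    breakpoints = np.quantile(expected, np.linspace(0, 1, buckets + 1))
    # Clip current values into the reference range so outliers land in the outer bins
    actual = np.clip(actual, breakpoints[0], breakpoints[-1])

    expected_percents = np.histogram(expected, bins=breakpoints)[0] / len(expected)
    actual_percents = np.histogram(actual, bins=breakpoints)[0] / len(actual)

    # Replace empty-bin shares with a small value to avoid division by zero and log(0)
    expected_percents = np.clip(expected_percents, eps, None)
    actual_percents = np.clip(actual_percents, eps, None)

    psi_value = np.sum((expected_percents - actual_percents) * np.log(expected_percents / actual_percents))
    return psi_value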

psi_values = {f'feature_{i}': calculate_psi(train_df[f'feature_{i}'], new_df[f'feature_{i}']) for i in range(5)}
psi_df = pd.DataFrame(list(psi_values.items()), columns=['Feature', 'PSI'])
psi_df

If PSI > 0.25 for any feature, it’s a clear sign that the input distribution has **drifted significantly** from the training data.

## 🔄 5. Monitoring Prediction Drift

We can track changes in **model prediction probabilities** or **label distributions** to identify drift in predictions.

In [None]:
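# Compare predicted probabilities for the positive class rather than hard 0/1 labels,
# because a density plot of binary labels is not meaningful
train_preds = model.predict_proba(X_train)[:, 1]
new_preds = model.predict_proba(X_new)[:, 1]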

plt.figure(figsize=(6, 4))
sns.kdeplot(train_preds, label='Train Predictions', fill=True)
sns.kdeplot(new_preds, label='New Predictions', fill=True)
plt.title('Prediction Distribution Drift')
plt.legend()
plt.show()

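If the distribution of predictions shifts markedly, it usually means the model is receiving inputs unlike its training data. On its own this does not prove **concept drift** (a change in the relationship between input features and output labels); confirming that requires comparing predictions against true labels once they arrive.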
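
Once true labels are available, one simple way to look for concept drift is to track accuracy over consecutive batches and watch for a sustained drop. The sketch below is a minimal version of that idea; `windowed_accuracy` and the toy arrays are hypothetical and only illustrate the mechanics.

```python
import numpy as np

def windowed_accuracy(y_true, y_pred, window):
    """Accuracy over consecutive, non-overlapping windows of `window` samples."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for start in range(0, len(y_true), window):
        stop = start + window
        scores.append(float(np.mean(y_true[start:stop] == y_pred[start:stop])))
    return scores

# Hypothetical stream: predictions are all correct at first, then half are wrong
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0]
print(windowed_accuracy(y_true, y_pred, window=4))
# prints: [1.0, 0.5]
```

In practice the window size is a trade-off: small windows react quickly but are noisy, large windows are stable but slow to show a change.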

## 🧩 6. Automating Drift Monitoring

In real-world scenarios, drift monitoring is automated using tools such as:
- **Evidently AI** – Open-source library for model monitoring.
- **WhyLabs** – Data logging and drift alerts.
- **Prometheus + Grafana** – Metric visualization and alerting.
- **MLflow / Neptune.ai** – Tracking model metrics over time.

Example using `evidently`:

```python
# Import paths for the legacy Report API (evidently 0.4.x); newer releases changed this API
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

report = Report(metrics=[DataDriftPreset()])
report.run(reference_data=train_df, current_data=new_df)
report.show(mode='inline')
```

## ⚙️ 7. Model Retraining Trigger

Once significant drift is detected (PSI > 0.25 for any feature, or accuracy below a chosen threshold), retraining can be triggered automatically:

```python
if acc < 0.8 or any(psi_df['PSI'] > 0.25):
    print('⚠️ Drift detected! Triggering model retraining pipeline...')
else:
    print('✅ Model performance stable. No retraining needed.')
```
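
For alerting or logging, it can help to record why retraining was triggered, not only that it was. The sketch below is one possible shape for such a check. The function `retraining_reasons`, its parameter names, and the sample values are assumptions made for illustration; the default thresholds mirror the ones used above.

```python
def retraining_reasons(accuracy, psi_by_feature, min_accuracy=0.8, max_psi=0.25):
    """Return a list of human-readable reasons to retrain (empty if none)."""
    reasons = []
    if accuracy < min_accuracy:
        reasons.append(f"accuracy {accuracy:.2f} below {min_accuracy:.2f}")
    for feature, psi in psi_by_feature.items():
        if psi > max_psi:
            reasons.append(f"PSI for {feature} is {psi:.2f} (> {max_psi:.2f})")
    return reasons

# Hypothetical monitoring results (illustrative values only)
reasons = retraining_reasons(0.85, {"feature_0": 0.31, "feature_1": 0.04})
if reasons:
    print("Drift detected, retraining suggested:", "; ".join(reasons))
else:
    print("Model stable, no retraining needed.")
```

Keeping the thresholds as parameters makes it straightforward to tune them per model instead of hard-coding a single rule.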

## ✅ Summary

- Monitoring is essential for maintaining ML model reliability.
- We visualized **data drift** and **prediction drift**, and saw how they relate to **concept drift**.
- PSI was used to quantify drift.
- Automated retraining can be triggered based on thresholds.

---
Next → **09-ML_Model_Registry_and_Versioning.ipynb** 🗂️