# Production Drift Analysis

This notebook analyzes the production logs of the credit scoring API to detect:
- **Data drift** (drift in the input features), using Evidently AI
- **Prediction drift** (drift in the model outputs)
- **Operational metrics** (latency, error rate)

In [13]:
import json
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from evidently.presets import DataDriftPreset
from evidently import Report

sns.set_theme(style="whitegrid")
pd.set_option("display.max_columns", 15)

In [14]:
# Load reference data (training distribution)
FEATURE_COLUMNS = [
    "EXT_SOURCES_MEAN", "CREDIT_TERM", "EXT_SOURCE_3",
    "GOODS_PRICE_CREDIT_PERCENT", "INSTAL_AMT_PAYMENT_sum",
    "AMT_ANNUITY", "POS_CNT_INSTALMENT_FUTURE_mean",
    "DAYS_BIRTH", "EXT_SOURCES_WEIGHTED", "EXT_SOURCE_2",
]

ref_data = pd.read_csv("../data/dataset_top10_features_data.csv")
ref_features = ref_data[FEATURE_COLUMNS]

print(f"Reference data: {ref_features.shape[0]:,} rows, {ref_features.shape[1]} features")
ref_features.describe()

Reference data: 307,511 rows, 10 features


statistic,EXT_SOURCES_MEAN,CREDIT_TERM,EXT_SOURCE_3,GOODS_PRICE_CREDIT_PERCENT,INSTAL_AMT_PAYMENT_sum,AMT_ANNUITY,POS_CNT_INSTALMENT_FUTURE_mean,DAYS_BIRTH,EXT_SOURCES_WEIGHTED,EXT_SOURCE_2
count,307511.0,307511.0,307511.0,307511.0,307511.0,307511.0,307511.0,307511.0,307511.0,307511.0
mean,0.50926,0.053695,0.515695,0.900683,660010.6,27108.487841,9.047393,-16036.995067,1.636083,0.5145034
std,0.149761,0.022481,0.174736,0.096587,895763.8,14493.461065,6.359185,4363.988632,2.308806,0.1908699
min,6e-06,0.022073,0.000527,0.166667,0.0,1615.5,0.0,-25229.0,0.0,8.173617e-08
25%,0.413716,0.0369,0.4171,0.834725,137614.2,16524.0,5.052632,-19682.0,0.0,0.3929737
50%,0.524502,0.05,0.535276,0.893815,318619.5,24903.0,6.95,-15750.0,0.0,0.5659614
75%,0.622757,0.064043,0.636376,1.0,786505.2,34596.0,11.264706,-12413.0,3.991048,0.6634218
max,0.878903,0.12443,0.89601,6.666667,25537050.0,258025.5,60.0,-7489.0,7.677298,0.8549997


In [15]:
# Load production logs
log_path = Path("../logs/predictions.jsonl")

# Parse one JSON object per line, skipping blank lines
with open(log_path, encoding="utf-8") as f:
    logs = [json.loads(line) for line in f if line.strip()]

prod_df = pd.DataFrame(logs)
prod_df["timestamp"] = pd.to_datetime(prod_df["timestamp"])

print(f"Total log entries: {len(prod_df)}")
print(f"Period: {prod_df['timestamp'].min()} to {prod_df['timestamp'].max()}")
print(f"Status codes: {prod_df['status_code'].value_counts().to_dict()}")

# Filter successful predictions and extract features
success_df = prod_df[prod_df["status_code"] == 200].copy()
prod_features = pd.DataFrame(success_df["input_features"].tolist())
prod_features = prod_features[FEATURE_COLUMNS]

print(f"\nSuccessful predictions: {len(prod_features):,}")
prod_features.describe()

Total log entries: 1000
Period: 2026-01-31 13:55:55.350211+00:00 to 2026-02-07 13:34:28.134995+00:00
Status codes: {200: 979, 422: 21}

Successful predictions: 979


statistic,EXT_SOURCES_MEAN,CREDIT_TERM,EXT_SOURCE_3,GOODS_PRICE_CREDIT_PERCENT,INSTAL_AMT_PAYMENT_sum,AMT_ANNUITY,POS_CNT_INSTALMENT_FUTURE_mean,DAYS_BIRTH,EXT_SOURCES_WEIGHTED,EXT_SOURCE_2
count,979.0,979.0,979.0,979.0,979.0,979.0,979.0,979.0,979.0,979.0
mean,0.50613,0.053241,0.515693,0.894956,668741.2,31972.638815,8.819687,-13048.536261,1.481557,0.367757
std,0.153519,0.022466,0.176764,0.093526,912167.9,16478.792009,6.175073,4392.643264,2.256078,0.179519
min,0.003038,0.025278,0.000527,0.60241,2232.765,4433.4,0.0,-21983.0,0.0,0.0
25%,0.397158,0.036163,0.420611,0.834725,137497.6,19299.6,5.2,-16606.0,0.0,0.239989
50%,0.522035,0.04893,0.535276,0.893815,318619.5,29700.0,6.95,-12885.0,0.0,0.412413
75%,0.625625,0.061426,0.634706,1.0,806092.3,41064.3,10.5,-9392.0,3.66955,0.513939
max,0.826205,0.121201,0.881027,1.0,8131873.0,111634.2,45.055556,-4686.0,7.112295,0.653957


In [None]:
# Evidently Data Drift Report
drift_report = Report(metrics=[DataDriftPreset()])
drift_snapshot = drift_report.run(reference_data=ref_features, current_data=prod_features)
drift_snapshot

In [None]:
# Save drift report as HTML
report_path = Path("data_drift_report.html")
drift_snapshot.save_html(str(report_path))
print(f"Drift report saved to {report_path.resolve()}")

In [None]:
# Distribution comparison: Reference vs Production
fig, axes = plt.subplots(5, 2, figsize=(16, 20))
axes = axes.flatten()

for i, col in enumerate(FEATURE_COLUMNS):
    ax = axes[i]
    ax.hist(ref_features[col].dropna(), bins=50, alpha=0.5, label="Reference", density=True, color="steelblue")
    ax.hist(prod_features[col].dropna(), bins=50, alpha=0.5, label="Production", density=True, color="coral")
    ax.set_title(col, fontsize=11, fontweight="bold")
    ax.legend(fontsize=9)
    ax.set_ylabel("Density")

fig.suptitle("Feature distributions: Reference vs Production", fontsize=14, fontweight="bold", y=1.01)
plt.tight_layout()
plt.show()

In [None]:
# Prediction drift analysis
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Probability distribution
axes[0].hist(success_df["probability_default"], bins=50, color="steelblue", edgecolor="white")
axes[0].axvline(x=0.10, color="red", linestyle="--", label="Threshold (0.10)")
axes[0].set_title("Distribution of default probabilities", fontweight="bold")
axes[0].set_xlabel("Probability of default")
axes[0].set_ylabel("Number of predictions")
axes[0].legend()

# Decision counts
decision_counts = success_df["credit_decision"].value_counts()
colors = ["#2ecc71" if d == "approved" else "#e74c3c" for d in decision_counts.index]
axes[1].bar(decision_counts.index, decision_counts.values, color=colors)
axes[1].set_title("Credit decisions", fontweight="bold")
axes[1].set_ylabel("Count")
for j, v in enumerate(decision_counts.values):
    axes[1].text(j, v + 5, str(v), ha="center", fontweight="bold")

# Rolling average of probability over time
ts_sorted = success_df.sort_values("timestamp").reset_index(drop=True)
ts_sorted["prob_rolling"] = ts_sorted["probability_default"].rolling(window=50, min_periods=1).mean()
axes[2].plot(ts_sorted["timestamp"], ts_sorted["prob_rolling"], color="steelblue", linewidth=1.5)
axes[2].axhline(y=0.10, color="red", linestyle="--", label="Threshold (0.10)")
axes[2].set_title("Rolling mean probability (window=50)", fontweight="bold")
axes[2].set_xlabel("Time")
axes[2].set_ylabel("Mean probability")
axes[2].legend()
axes[2].tick_params(axis="x", rotation=30)

plt.tight_layout()
plt.show()

In [None]:
# Operational monitoring
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Error rate by hour
prod_df["hour"] = prod_df["timestamp"].dt.floor("h")
hourly = prod_df.groupby("hour").agg(
    total=("status_code", "count"),
    errors=("status_code", lambda x: (x >= 400).sum()),
)
hourly["error_rate"] = hourly["errors"] / hourly["total"]
axes[0].bar(range(len(hourly)), hourly["error_rate"], color="coral", alpha=0.8)
axes[0].set_title("Error rate per hour", fontweight="bold")
axes[0].set_xlabel("Hour (index)")
axes[0].set_ylabel("Error rate")
axes[0].axhline(y=0.05, color="red", linestyle="--", alpha=0.5, label="5% threshold")
axes[0].legend()

# Latency distribution
p95 = prod_df["duration_ms"].quantile(0.95)
axes[1].hist(prod_df["duration_ms"], bins=50, color="steelblue", edgecolor="white")
axes[1].axvline(x=p95, color="red", linestyle="--", label=f"P95 = {p95:.0f}ms")
axes[1].set_title("Latency distribution", fontweight="bold")
axes[1].set_xlabel("Latency (ms)")
axes[1].set_ylabel("Number of requests")
axes[1].legend()

# Latency over time
ts_all = prod_df.sort_values("timestamp").reset_index(drop=True)
ts_all["latency_rolling"] = ts_all["duration_ms"].rolling(window=50, min_periods=1).mean()
axes[2].plot(ts_all["timestamp"], ts_all["latency_rolling"], color="steelblue", linewidth=1.5)
axes[2].axhline(y=p95, color="red", linestyle="--", alpha=0.5, label=f"P95 = {p95:.0f}ms")
axes[2].set_title("Rolling mean latency (window=50)", fontweight="bold")
axes[2].set_xlabel("Time")
axes[2].set_ylabel("Latency (ms)")
axes[2].legend()
axes[2].tick_params(axis="x", rotation=30)

plt.tight_layout()
plt.show()

In [None]:
# Summary statistics
total = len(prod_df)
errors = (prod_df["status_code"] >= 400).sum()
ok = total - errors
denied = (success_df["credit_decision"] == "denied").sum()

print("=" * 50)
print("  DASHBOARD - OPERATIONAL METRICS")
print("=" * 50)
print(f"  Total requests:          {total:,}")
print(f"  Successes:               {ok:,}")
print(f"  Errors:                  {errors:,} ({errors/total:.1%})")
print(f"  Credits denied:          {denied:,}/{ok:,} ({denied/ok:.1%})")
print()
print("  Latency (ms):")
print(f"    Median (P50):          {prod_df['duration_ms'].quantile(0.50):.1f}")
print(f"    P90:                   {prod_df['duration_ms'].quantile(0.90):.1f}")
print(f"    P95:                   {prod_df['duration_ms'].quantile(0.95):.1f}")
print(f"    P99:                   {prod_df['duration_ms'].quantile(0.99):.1f}")
print("=" * 50)

## Recommendations

### Points of attention - drift detected

The Evidently analysis reveals significant drift on **3 features**:

| Feature | Drift type | Probable cause | Impact |
|---------|------------|----------------|--------|
| `EXT_SOURCE_2` | Downward shift (mean -0.15) | Change in the credit bureau's scoring | Higher risk scores |
| `DAYS_BIRTH` | Shift toward 0 (mean +3,000 days) | Younger customer base | Different risk profile |
| `AMT_ANNUITY` | Increase (mean +18%, median +19%) | Inflation / larger loan amounts | Possible overestimation of risk |
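
The shift sizes in the table can be checked against the means printed by the two `describe()` calls earlier in the notebook. The snippet below is a small arithmetic sketch: the values are copied by hand from those outputs, and `mean_shift` is a helper name introduced here for illustration only.

```python
# (reference mean, production mean), copied from the describe() outputs above
MEANS = {
    "EXT_SOURCE_2": (0.5145034, 0.367757),
    "DAYS_BIRTH": (-16036.995067, -13048.536261),
    "AMT_ANNUITY": (27108.487841, 31972.638815),
}


def mean_shift(ref_mean, prod_mean):
    """Return (absolute shift, shift relative to the reference magnitude)."""
    absolute = prod_mean - ref_mean
    relative = absolute / abs(ref_mean)
    return absolute, relative


for name, (ref_mean, prod_mean) in MEANS.items():
    absolute, relative = mean_shift(ref_mean, prod_mean)
    print(f"{name}: {absolute:+.4g} ({relative:+.1%})")
```

Comparing means is only a quick sanity check. It says nothing about changes in the shape of a distribution, which is what the Evidently report and the histograms are for.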

### Retraining trigger thresholds

1. **Statistical drift**: if the Kolmogorov-Smirnov test detects drift on more than 3 features simultaneously (p-value < 0.05), schedule a retraining.
2. **Denial rate**: if the denial rate exceeds 50% over a 7-day rolling window, investigate.
3. **Operational performance**: if the error rate exceeds 5% or P95 latency exceeds 500 ms, raise an alert.
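
Rule 1 can be sketched with `scipy.stats.ks_2samp`, applied independently of whichever statistical test the Evidently preset selects. This is a minimal illustration under stated assumptions: the function names, constants and toy data are hypothetical and not part of the notebook, and missing values are assumed to have been dropped beforehand.

```python
from scipy.stats import ks_2samp

ALPHA = 0.05     # p-value cutoff from rule 1
MAX_DRIFTED = 3  # retrain when strictly more than this many features drift


def drifted_features(reference, current, alpha=ALPHA):
    """Names of features whose two-sample K-S p-value is below `alpha`.

    `reference` and `current` map feature name -> sequence of values
    (dicts of lists and pandas DataFrames both work).
    """
    drifted = []
    for name in reference:
        result = ks_2samp(reference[name], current[name])
        if result.pvalue < alpha:
            drifted.append(name)
    return drifted


def should_retrain(drifted, max_drifted=MAX_DRIFTED):
    """Rule 1: more than `max_drifted` features drift at the same time."""
    return len(drifted) > max_drifted


# Toy data (hypothetical): "stable" is unchanged, while "shifted" moves
# entirely outside the reference range.
reference = {
    "stable": [i / 100 for i in range(100)],
    "shifted": [i / 100 for i in range(100)],
}
current = {
    "stable": [i / 100 for i in range(100)],
    "shifted": [10 + i / 100 for i in range(100)],
}

drifted = drifted_features(reference, current)
print(drifted, should_retrain(drifted))
```

In this notebook the call would be `drifted_features(ref_features, prod_features)`. Because the rule is written as "more than 3", exactly 3 drifted features do not trigger a retraining on their own.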

### Recommended actions

- **Short term**: recalibrate the decision threshold (currently 0.10) on recent data
- **Medium term**: retrain the model, including recent production data
- **Long term**: automate drift detection with alerts (Evidently + cron job or Airflow)
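
As a starting point for the automation item, the operational half of the alerting could be a small check run on a schedule. The sketch below uses only the standard library and is not the project's actual tooling: the function names are assumptions, while the `status_code` and `duration_ms` fields and the 5% / 500 ms limits come from this notebook.

```python
import json
import statistics

MAX_ERROR_RATE = 0.05     # error-rate limit from the thresholds above
MAX_P95_LATENCY_MS = 500  # P95 latency limit from the thresholds above


def load_logs(path):
    """Read a JSONL log file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def operational_alerts(entries):
    """Return alert messages for the given log entries (empty list if healthy)."""
    if len(entries) < 2:
        return []  # not enough data to estimate a percentile
    error_rate = sum(e["status_code"] >= 400 for e in entries) / len(entries)
    durations = [e["duration_ms"] for e in entries]
    # "inclusive" interpolates linearly between observed values
    p95 = statistics.quantiles(durations, n=100, method="inclusive")[94]
    alerts = []
    if error_rate > MAX_ERROR_RATE:
        alerts.append(f"error rate {error_rate:.1%} exceeds {MAX_ERROR_RATE:.0%}")
    if p95 > MAX_P95_LATENCY_MS:
        alerts.append(f"P95 latency {p95:.0f} ms exceeds {MAX_P95_LATENCY_MS} ms")
    return alerts


# Synthetic entries (hypothetical) to exercise both code paths
healthy = [{"status_code": 200, "duration_ms": 120.0} for _ in range(100)]
degraded = healthy[:90] + [{"status_code": 422, "duration_ms": 900.0} for _ in range(10)]

print(operational_alerts(healthy))
print(operational_alerts(degraded))
```

A cron job or Airflow task could call `operational_alerts(load_logs("../logs/predictions.jsonl"))` and exit with a non-zero status when the list is not empty, so that the scheduler's own failure notifications carry the alert. The drift checks themselves would still come from the Evidently report.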