[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/klar74/WS2025_lecture/blob/main/Vorlesung_27/AI4I_Colab_Notebook.ipynb)

# AI4I 2020 Predictive Maintenance: Complete CRISP-DM Workflow

**Lecture 28: Smart Operations Management**  
**Topic:** Predictive maintenance with machine learning

---

## 🎯 Learning Objectives

In this notebook we walk through **all phases of CRISP-DM** using an industry-oriented example:

1. **Business Understanding**: Why predictive maintenance? ROI calculation
2. **Data Understanding**: Explore the dataset, statistics, visualizations
3. **Data Preparation**: Feature engineering, encoding, scaling
4. **Modeling**: Logistic regression, decision trees, random forest
5. **Evaluation**: Confusion matrix, precision/recall, ROC, cost-sensitive metrics
6. **Deployment**: Considerations for a production environment

---

## 📊 The AI4I 2020 Dataset

**Source:** HTW Berlin (Prof. Stephan Matzka, 2020)  
**License:** CC BY 4.0  
**Size:** 10,000 data points  
**Context:** Synthetic but realistic production dataset

**Failure Modes:**
- **TWF**: Tool Wear Failure
- **HDF**: Heat Dissipation Failure
- **PWF**: Power Failure (power outside the permitted range)
- **OSF**: Overstrain Failure
- **RNF**: Random Failures

**Challenge:** Only 3.39% of the data points are failures (strongly imbalanced!)

---

## 📚 Review of Key Concepts

This notebook revisits:
- ✅ Descriptive statistics
- ✅ Exploratory data analysis (EDA)
- ✅ Feature engineering
- ✅ Train-test split (stratified!)
- ✅ Supervised learning (classification)
- ✅ Decision trees & ensemble methods
- ✅ Imbalanced data handling
- ✅ Classification metrics (precision, recall, F1, ROC)
- ✅ Cost-sensitive learning
- ✅ Model interpretability

---

**Direct URL to the dataset:**  
`https://archive.ics.uci.edu/ml/machine-learning-databases/00601/ai4i2020.csv`

In [None]:
# %%capture
# Optional: install helper packages (usually not needed in Colab)
# !pip install -q ucimlrepo


# 1️⃣ Business Understanding

## Why Predictive Maintenance?

**The problem:**
- Unplanned machine downtime costs €1,000 to €10,000 per hour
- Late deliveries → contractual penalties, lost customers
- Emergency repairs are expensive

**Traditional approaches:**
- **Reactive**: repair once it is broken → high downtime costs
- **Preventive**: fixed intervals → often too early (waste) or too late (failure)

**Predictive maintenance:**
- Use condition data to **predict** failures
- Perform maintenance **just in time**

## Business Case (Example)

**Situation:** 10 machines, each with 2 unplanned failures per year lasting 8 h

- Production downtime: 10 × 2 × 8 h × €2,000/h = **€320,000/year**
- Emergency repairs: 10 × 2 × €5,000 = **€100,000/year**
- **Total cost: €420,000/year**

**Investment:**
- Sensors & infrastructure: €50,000
- ML development: €100,000
- Annual operation: €30,000/year

**Expected benefit:** 70% of failures avoided
- Saved costs: €420,000 × 0.7 = **€294,000/year**
- Net benefit in year 1: €294,000 − €150,000 − €30,000 = **€114,000**
- **ROI in year 1: 76%** (net benefit relative to the €150,000 upfront investment)
- **Payback: ~7 months** (€150,000 ÷ (€294,000 − €30,000) per year ≈ 0.57 years)
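
The business-case arithmetic can be reproduced in a few lines. This is a minimal sketch that uses only the figures from the example above; the variable names are illustrative, and ROI is taken here as the year-1 net benefit divided by the upfront investment, which is the reading that yields 76%.

```python
# Figures from the business case above (all amounts in EUR)
machines, failures_per_year, hours_per_failure = 10, 2, 8
downtime_cost_per_hour = 2_000
repair_cost_per_failure = 5_000

downtime_cost = machines * failures_per_year * hours_per_failure * downtime_cost_per_hour
repair_cost = machines * failures_per_year * repair_cost_per_failure
total_cost = downtime_cost + repair_cost

upfront_investment = 50_000 + 100_000  # sensors/infrastructure + ML development
annual_operation = 30_000
avoided_share = 0.7                    # assumed share of failures avoided

annual_savings = total_cost * avoided_share
net_benefit_year1 = annual_savings - upfront_investment - annual_operation
roi_year1 = net_benefit_year1 / upfront_investment
# Payback: months until the running net savings cover the upfront investment
payback_months = upfront_investment / ((annual_savings - annual_operation) / 12)

print(f"Total failure cost:  {total_cost:>10,.0f} EUR/year")
print(f"Annual savings:      {annual_savings:>10,.0f} EUR/year")
print(f"Net benefit, year 1: {net_benefit_year1:>10,.0f} EUR")
print(f"ROI, year 1:         {roi_year1:>10.0%}")
print(f"Payback:             {payback_months:>10.1f} months")
```

Changing `avoided_share` shows how sensitive the case is to the assumed 70%: the net benefit in year 1 only turns positive once roughly 43% of failures are avoided.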

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import (classification_report, confusion_matrix, roc_auc_score,
                             RocCurveDisplay, PrecisionRecallDisplay)
import time
import os


# 2️⃣ Data Understanding: Loading and Exploring the Data

## CRISP-DM Phase 2: Data Understanding

**Goals:**
- Get to know the dataset
- Check data quality
- Form initial hypotheses

In [None]:
# Load the dataset (SSL workaround only as a fallback)
import ssl
import urllib.error
import urllib.request

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/00601/ai4i2020.csv"

try:
    df = pd.read_csv(url)
except urllib.error.URLError:
    # Fallback for expired certificates: skip verification for this one request
    # instead of disabling it globally for the whole session
    print("Loading data with SSL workaround...")
    unverified_context = ssl._create_unverified_context()
    with urllib.request.urlopen(url, context=unverified_context) as response:
        df = pd.read_csv(response)

df.head()

In [None]:
df.shape, df.dtypes

### 📊 First Analysis

**Questions:**
- How many data points are there?
- Which features are numerical and which are categorical?
- Are there missing values?
- What does the class distribution look like?

In [None]:
# First inspection: column names and data types
print("📋 DATASET INFO")
print("="*50)
print(f"Shape: {df.shape}")
print(f"\nColumns ({len(df.columns)}):")
print(df.columns.tolist())
print("\n" + "="*50)

## Harmonize Column Names

In [None]:
df.columns = (df.columns
              .str.strip()
              .str.replace(r"\s+", "_", regex=True)
              .str.replace(r"[\[\]\(\)]", "", regex=True)
              .str.replace("%", "pct")
              .str.lower())
df.head()

### 📊 Analyze the Class Distribution

**The imbalance problem:**
- How many failures are there compared with normal operating states?
- Which failure modes are the most frequent?

In [None]:
# Class distribution: the imbalance problem
print("📊 CLASS DISTRIBUTION")
print("="*50)
print(df['machine_failure'].value_counts())
print("\nRelative frequencies:")
print(df['machine_failure'].value_counts(normalize=True))

# Analyze the failure modes
print("\n--- FAILURE MODES ---")
failure_modes = ['twf', 'hdf', 'pwf', 'osf', 'rnf']
for mode in failure_modes:
    if mode in df.columns:
        count = df[mode].sum()
        print(f"{mode.upper()}: {count:4d} cases")

print("\n" + "="*50)
print(f"Total failures:  {df['machine_failure'].sum()}")
print(f"Failure Rate:    {df['machine_failure'].mean():.2%}")
print("="*50)

### 📈 Descriptive Statistics

**Review:**
- Mean, median, standard deviation
- Min, max, quartiles
- Recognizing distributions

In [None]:
df.describe(include='all')

In [None]:
# Visualize the most important features
import seaborn as sns

fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.flatten()

# Features to plot
important_features = ['air_temperature_k', 'process_temperature_k', 
                     'rotational_speed_rpm', 'torque_nm', 'tool_wear_min']

for idx, col in enumerate(important_features):
    if col in df.columns:
        # Shared bin edges so that both histograms are directly comparable
        bins = np.histogram_bin_edges(df[col], bins=30)
        axes[idx].hist(df[col], bins=bins, edgecolor='black', alpha=0.7, label='All')
        axes[idx].set_title(f'Distribution: {col}')
        axes[idx].set_xlabel(col)
        axes[idx].set_ylabel('Frequency')
        
        # Failures only, overlaid on the overall distribution
        axes[idx].hist(df.loc[df['machine_failure'] == 1, col], bins=bins,
                       alpha=0.5, color='red', label='Failure')
        axes[idx].legend()

# Hide the unused sixth subplot
for ax in axes[len(important_features):]:
    ax.set_visible(False)

plt.tight_layout()
plt.show()

### 🔍 Correlation Analysis

**Review:**
- Pearson correlation: −1 to +1
- Which features correlate with failures?
- Detecting multicollinearity

In [None]:
# Correlation with the target variable
numeric_cols = df.select_dtypes(include=[np.number]).columns.tolist()
correlations = df[numeric_cols].corr()['machine_failure'].sort_values(ascending=False)

print("Correlation with machine failure:")
print(correlations)

# Heatmap
plt.figure(figsize=(12, 10))
sns.heatmap(df[numeric_cols].corr(), annot=True, fmt='.2f', cmap='coolwarm', center=0)
plt.title('Correlation Heatmap')
plt.tight_layout()
plt.show()

# 3️⃣ Data Preparation: Feature Engineering

## CRISP-DM Phase 3: Data Preparation

**Goals:**
- Create new features (based on the failure-mode rules)
- Encode categorical features
- Scale features

In [None]:
# Feature engineering based on the failure-mode rules

# 1. Power (relevant for PWF)
# Power [W] = torque [Nm] × angular velocity [rad/s], with rad/s = rpm × 2π / 60
df['power_w'] = df['torque_nm'] * df['rotational_speed_rpm'] * 2 * np.pi / 60

# 2. Temperature difference (relevant for HDF)
df['temp_diff_k'] = df['process_temperature_k'] - df['air_temperature_k']

# 3. Strain (relevant for OSF): torque × tool wear, in Nm·min
df['strain'] = df['torque_nm'] * df['tool_wear_min']

print("New features created:")
print(df[['power_w', 'temp_diff_k', 'strain']].head())
print("\nSummary statistics:")
print(df[['power_w', 'temp_diff_k', 'strain']].describe())

### ⚠️ Important Note: Reality vs. Teaching Example

**In this dataset we know the failure modes (TWF, HDF, PWF, OSF, RNF).** That is helpful for teaching, but **unrealistic**!

**In reality:**
- We only have **sensor data** (temperature, rotational speed, torque, ...) and **failure yes/no**
- We do **not** know *why* a machine fails (tool wear? overheating? overload?)
- This is exactly the **task of Explainable AI (XAI)**!

**XAI techniques help here:**
1. **Feature importance** (e.g. from a random forest) → Which sensors matter?
2. **SHAP values** → How do individual measurements influence the prediction?
3. **Decision trees** (with `max_depth`) → Which thresholds separate failure from OK?
4. **Clustering of residuals** → Are there hidden failure patterns?

**Our advantage here:** We can check whether the model learns *plausible* features (e.g. `power_w` for power failure). In practice we would first have to **discover** these patterns, and that is exactly the added value of XAI!

**Example from industry:**
- A predictive maintenance model predicts "failure within 48 h"
- The maintenance team asks: "Why? What should we check?"
- XAI answers: "High power (7.5 kW) + low temperature difference (8.2 K) → check heat dissipation!"
- Without XAI: the model is a black box and the team has no actionable guidance

### 🔀 Train-Validation-Test Split

**Review:**
- **Stratified split**: the class distribution is the same in all sets
- 60% train, 20% validation, 20% test
- Why does this matter for imbalanced data?
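
To see what stratification does, the following sketch splits toy labels that have the same imbalance as AI4I (339 failures in 10,000 rows) class by class. It is a simplified pure-Python illustration, not scikit-learn's implementation; `stratified_split` is a hypothetical helper introduced only for this example.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction, seed=42):
    """Return (train_idx, test_idx) with the class ratio preserved in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        # Shuffle within each class, then cut off the same fraction everywhere
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

# Toy labels: 1 = failure, 0 = OK
labels = [1] * 339 + [0] * 9661
train_idx, test_idx = stratified_split(labels, test_fraction=0.2)

train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
print(f"Train failure rate: {train_rate:.2%}")
print(f"Test failure rate:  {test_rate:.2%}")
```

With a purely random split, the roughly 68 failures expected in a 2,000-row test set can fluctuate noticeably from seed to seed, which makes recall and precision estimates on the minority class less stable. Splitting per class removes that source of variation.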

In [None]:
# Separate the target variable from the features
target_col = 'machine_failure'
drop_cols = [target_col, 'udi', 'product_id', 'twf', 'hdf', 'pwf', 'osf', 'rnf']
drop_cols = [col for col in drop_cols if col in df.columns]

y = df[target_col].astype(int)
X = df.drop(columns=drop_cols)

# One-hot encoding for type
X = pd.get_dummies(X, columns=['type'], drop_first=True)

print(f"Features: {X.shape[1]}")
print(f"Target distribution:\n{y.value_counts()}")

# Stratified Split: Train-Val-Test (60-20-20)
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval, random_state=42
)

# Feature Scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)

print(f"\nTrain: {X_train.shape}, Val: {X_val.shape}, Test: {X_test.shape}")
print(f"Train Failure Rate: {y_train.mean():.2%}")
print(f"Val Failure Rate: {y_val.mean():.2%}")
print(f"Test Failure Rate: {y_test.mean():.2%}")

# 4️⃣ Modeling: From Baseline to Ensemble

## CRISP-DM Phase 4: Modeling

**Strategy:**
1. **Baseline**: Logistic regression (simple, interpretable)
2. **Decision tree**: Interpretable, can show thresholds
3. **Random forest**: Best performance expected
4. **Comparison**: Which model for which purpose?

### Model 1: Logistic Regression (Baseline)

In [None]:
# Logistic regression with class_weight='balanced'
from sklearn.linear_model import LogisticRegression

lr_model = LogisticRegression(class_weight='balanced', max_iter=1000, random_state=42)
lr_model.fit(X_train_scaled, y_train)

# Predictions
y_val_pred_lr = lr_model.predict(X_val_scaled)
y_val_prob_lr = lr_model.predict_proba(X_val_scaled)[:, 1]

print("Logistic Regression - Validation Set:")
print(classification_report(y_val, y_val_pred_lr, target_names=['OK', 'Failure']))
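
The `class_weight='balanced'` option used above weights each class by `n_samples / (n_classes * class_count)`. The sketch below works this out for the class counts of the full AI4I dataset; the stratified training split has the same ratio, so the weights there are nearly identical. The variable names are illustrative.

```python
# Class counts of the full AI4I dataset (339 failures in 10,000 rows)
n_samples = 10_000
class_counts = {"OK": 9_661, "Failure": 339}
n_classes = len(class_counts)

# "balanced" heuristic: weight = n_samples / (n_classes * class_count)
weights = {name: n_samples / (n_classes * count)
           for name, count in class_counts.items()}

for name, weight in weights.items():
    print(f"{name:8s} weight = {weight:.3f}")

ratio = weights["Failure"] / weights["OK"]
print(f"One failure counts as much as {ratio:.1f} OK samples in the loss")
```

The weight ratio equals the inverse class ratio, so both classes contribute the same total weight to the training loss. This typically raises recall on failures at the expense of precision.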

### Model 2: Decision Tree (Interpretable!)

**Review:**
- Gini index: a measure of impurity
- Split on the best feature
- Limit `max_depth` to counter overfitting
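
As a reminder of how the Gini index drives a split, here is a small worked sketch. The node and child counts are hypothetical, and `gini_impurity` and `weighted_gini` are helper names introduced for this example only.

```python
def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_k^2) for a node with the given class counts."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

def weighted_gini(left_counts, right_counts):
    """Impurity after a split: child impurities weighted by child size."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n_total = n_left + n_right
    return (n_left / n_total) * gini_impurity(left_counts) \
         + (n_right / n_total) * gini_impurity(right_counts)

# Hypothetical node with 90 OK and 10 Failure samples
parent = [90, 10]
# Hypothetical split, e.g. on a torque threshold
left, right = [88, 2], [2, 8]

print(f"Parent impurity:      {gini_impurity(parent):.3f}")
print(f"Impurity after split: {weighted_gini(left, right):.3f}")
```

A tree learner evaluates many candidate thresholds in this way and keeps the one with the largest impurity reduction. A pure node has impurity 0, and a 50/50 node has the maximum of 0.5 for two classes.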

In [None]:
# Decision tree (simple, at most 4-6 levels to stay interpretable)
from sklearn.tree import DecisionTreeClassifier, plot_tree

dt_model = DecisionTreeClassifier(max_depth=6, class_weight='balanced', random_state=42)
dt_model.fit(X_train, y_train)  # No scaling needed: tree splits are invariant to feature scaling

y_val_pred_dt = dt_model.predict(X_val)

print("Decision Tree - Validation Set:")
print(classification_report(y_val, y_val_pred_dt, target_names=['OK', 'Failure']))

# Visualization (compact)
plt.figure(figsize=(20, 10))
plot_tree(dt_model, filled=True, feature_names=X_train.columns, 
          class_names=['OK', 'Failure'], max_depth=3, fontsize=10)
plt.title('Decision Tree (truncated at depth 3)')
plt.show()

# Feature Importance
feature_importance_dt = pd.DataFrame({
    'feature': X_train.columns,
    'importance': dt_model.feature_importances_
}).sort_values('importance', ascending=False)

print("\nTop 10 most important features (decision tree):")
print(feature_importance_dt.head(10))

### Model 3: Random Forest (Best Performance!)

**Review:**
- Ensemble of many decision trees
- Bagging: each tree sees a random sample
- More robust against overfitting
- Feature importance aggregated over all trees

In [None]:
# Random Forest
rf_model = RandomForestClassifier(n_estimators=100, max_depth=10, 
                                   class_weight='balanced', random_state=42, n_jobs=-1)
rf_model.fit(X_train, y_train)

y_val_pred_rf = rf_model.predict(X_val)
y_val_prob_rf = rf_model.predict_proba(X_val)[:, 1]

print("Random Forest - Validation Set:")
print(classification_report(y_val, y_val_pred_rf, target_names=['OK', 'Failure']))

# Feature Importance
feature_importance_rf = pd.DataFrame({
    'feature': X_train.columns,
    'importance': rf_model.feature_importances_
}).sort_values('importance', ascending=False)

# Plot Top 15
plt.figure(figsize=(10, 8))
top_features = feature_importance_rf.head(15)
plt.barh(top_features['feature'], top_features['importance'])
plt.xlabel('Importance')
plt.title('Top 15 Feature Importances (Random Forest)')
plt.gca().invert_yaxis()
plt.tight_layout()
plt.show()

print("\nTop 10 Features:")
print(feature_importance_rf.head(10))

# 5️⃣ Evaluation: Test Set Performance

## CRISP-DM Phase 5: Evaluation

**Goals:**
- Evaluate the final model on the test set
- Analyze the confusion matrix
- Compute precision, recall, F1
- Plot the ROC curve
- Cost-sensitive evaluation

### Test Set Evaluation (Random Forest)

In [None]:
# Final test set predictions with the random forest
# Use X_test (DataFrame) rather than X_test_scaled (NumPy array) for the random forest
y_test_pred = rf_model.predict(X_test)
y_test_proba = rf_model.predict_proba(X_test)[:, 1]

# Classification report
print("📊 CLASSIFICATION REPORT - TEST SET")
print("="*50)
print(classification_report(y_test, y_test_pred, target_names=['OK', 'Failure'], zero_division=0))

# Confusion matrix
cm_test = confusion_matrix(y_test, y_test_pred)
print("\n📈 CONFUSION MATRIX")
print("="*50)
print(cm_test)
print(f"\nTrue Negatives (TN): {cm_test[0,0]}")
print(f"False Positives (FP): {cm_test[0,1]}")
print(f"False Negatives (FN): {cm_test[1,0]}")
print(f"True Positives (TP): {cm_test[1,1]}")

In [None]:
# Confusion Matrix Visualization
fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(cm_test, annot=True, fmt='d', cmap='Blues', 
            xticklabels=['OK', 'Failure'], 
            yticklabels=['OK', 'Failure'],
            cbar_kws={'label': 'Count'})
plt.xlabel('Predicted', fontsize=12, fontweight='bold')
plt.ylabel('Actual', fontsize=12, fontweight='bold')
plt.title('Confusion Matrix - Random Forest (Test Set)', fontsize=14, fontweight='bold')
plt.tight_layout()
plt.show()

# Compute the key metrics by hand
tn, fp, fn, tp = cm_test.ravel()
precision = tp / (tp + fp) if (tp + fp) > 0 else 0
recall = tp / (tp + fn) if (tp + fn) > 0 else 0
f1 = 2 * (precision * recall) / (precision + recall) if (precision + recall) > 0 else 0
specificity = tn / (tn + fp) if (tn + fp) > 0 else 0

print(f"\n🎯 KEY METRICS (Test Set)")
print("="*50)
print(f"Precision (Failure): {precision:.3f}")
print(f"Recall (Failure):    {recall:.3f}")
print(f"F1-Score (Failure):  {f1:.3f}")
print(f"Specificity (OK):    {specificity:.3f}")

### ROC Curve und AUC

**Concept:**
- ROC = Receiver Operating Characteristic
- Trade-off between true positive rate (recall) and false positive rate
- AUC = Area Under the Curve (0.5 = random, 1.0 = perfect)
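
AUC also has a ranking interpretation: it is the probability that a randomly chosen failure receives a higher score than a randomly chosen OK sample. The sketch below computes it by brute force over all pairs; `auc_by_ranking` is a hypothetical helper for illustration, and the four toy scores are made up.

```python
def auc_by_ranking(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correct = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correct += 1.0
            elif pos_score == neg_score:
                correct += 0.5
    return correct / (len(positives) * len(negatives))

# Toy example: 2 failures, 2 OK samples
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_by_ranking(y_true, scores))  # 3 of 4 pairs are ranked correctly → 0.75
```

Because only the ordering of the scores matters, AUC does not depend on the decision threshold. That is why it cannot replace the cost-based threshold choice later in this notebook.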

In [None]:
from sklearn.metrics import RocCurveDisplay, roc_auc_score

# Plot the ROC curves
fig, ax = plt.subplots(figsize=(10, 8))

# Random forest ROC (with the X_test DataFrame)
RocCurveDisplay.from_estimator(rf_model, X_test, y_test, 
                                name='Random Forest', ax=ax, color='green')

# Logistic regression ROC (with X_test_scaled)
RocCurveDisplay.from_estimator(lr_model, X_test_scaled, y_test, 
                                name='Logistic Regression', ax=ax, color='blue')

# Decision tree ROC (with the X_test DataFrame)
RocCurveDisplay.from_estimator(dt_model, X_test, y_test, 
                                name='Decision Tree', ax=ax, color='red')

# Diagonal (random classifier)
ax.plot([0, 1], [0, 1], 'k--', label='Random Classifier (AUC=0.5)')

ax.set_xlabel('False Positive Rate', fontsize=12, fontweight='bold')
ax.set_ylabel('True Positive Rate (Recall)', fontsize=12, fontweight='bold')
ax.set_title('ROC Curves - Model Comparison', fontsize=14, fontweight='bold')
ax.legend(loc='lower right')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# AUC Scores
print("\n📊 AUC SCORES (Test Set)")
print("="*50)
print(f"Random Forest:        {roc_auc_score(y_test, rf_model.predict_proba(X_test)[:, 1]):.4f}")
print(f"Logistic Regression:  {roc_auc_score(y_test, lr_model.predict_proba(X_test_scaled)[:, 1]):.4f}")
print(f"Decision Tree:        {roc_auc_score(y_test, dt_model.predict_proba(X_test)[:, 1]):.4f}")

### Precision-Recall Curve

**Why does this matter for imbalanced data?**
- The classes are strongly imbalanced (96.6% OK vs. 3.4% failure)
- ROC can look too optimistic, because the false positive rate is computed against the large OK class
- Precision-recall focuses on the minority class (failures)
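
A short calculation makes the point concrete. For a given recall (TPR), false positive rate (FPR) and prevalence, precision follows directly from the definitions. The operating points below are hypothetical and chosen only to illustrate the effect at the 3.4% failure prevalence of this dataset.

```python
def precision_from_rates(tpr, fpr, prevalence):
    """Precision implied by recall (TPR), FPR and class prevalence."""
    true_pos = tpr * prevalence            # share of all samples that are TP
    false_pos = fpr * (1.0 - prevalence)   # share of all samples that are FP
    if true_pos + false_pos == 0:
        raise ValueError("no positive predictions, precision is undefined")
    return true_pos / (true_pos + false_pos)

prevalence = 0.034  # failure share in AI4I
for fpr in (0.01, 0.05, 0.10):
    precision = precision_from_rates(tpr=0.90, fpr=fpr, prevalence=prevalence)
    print(f"Recall 90%, FPR {fpr:.0%} → precision {precision:.1%}")
```

A point with 90% recall and 5% FPR sits close to the top-left corner of a ROC plot, yet fewer than four in ten alarms would be real failures. The precision-recall curve shows this directly.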

In [None]:
from sklearn.metrics import PrecisionRecallDisplay, average_precision_score

# Precision-recall curves
fig, ax = plt.subplots(figsize=(10, 8))

# Random forest (with the X_test DataFrame)
PrecisionRecallDisplay.from_estimator(rf_model, X_test, y_test, 
                                       name='Random Forest', ax=ax, color='green')

# Logistic regression (with X_test_scaled)
PrecisionRecallDisplay.from_estimator(lr_model, X_test_scaled, y_test, 
                                       name='Logistic Regression', ax=ax, color='blue')

# Decision tree (with the X_test DataFrame)
PrecisionRecallDisplay.from_estimator(dt_model, X_test, y_test, 
                                       name='Decision Tree', ax=ax, color='red')

# Baseline (proportion of failures)
baseline = y_test.sum() / len(y_test)
ax.axhline(y=baseline, color='k', linestyle='--', label=f'Baseline (Prevalence={baseline:.3f})')

ax.set_xlabel('Recall', fontsize=12, fontweight='bold')
ax.set_ylabel('Precision', fontsize=12, fontweight='bold')
ax.set_title('Precision-Recall Curves - Model Comparison', fontsize=14, fontweight='bold')
ax.legend(loc='upper right')
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Average Precision Scores
print("\n📊 AVERAGE PRECISION SCORES (Test Set)")
print("="*50)
print(f"Random Forest:        {average_precision_score(y_test, rf_model.predict_proba(X_test)[:, 1]):.4f}")
print(f"Logistic Regression:  {average_precision_score(y_test, lr_model.predict_proba(X_test_scaled)[:, 1]):.4f}")
print(f"Decision Tree:        {average_precision_score(y_test, dt_model.predict_proba(X_test)[:, 1]):.4f}")

### Cost-Sensitive Evaluation

**Business context:**
- False negative (FN) = machine fails unexpectedly → €10,000 cost
- False positive (FP) = unnecessary maintenance → €300 cost
- **Cost ratio: FN:FP ≈ 33:1**

**Computing the total cost:**

In [None]:
# Cost-sensitive evaluation
COST_FN = 10000  # cost of an unexpected failure
COST_FP = 300    # cost of unnecessary maintenance

def calculate_costs(y_true, y_pred):
    """Compute the total cost from the confusion matrix."""
    cm = confusion_matrix(y_true, y_pred)
    tn, fp, fn, tp = cm.ravel()
    
    total_cost = (fn * COST_FN) + (fp * COST_FP)
    
    return {
        'TP': tp, 'FP': fp, 'FN': fn, 'TN': tn,
        'Cost_FN': fn * COST_FN,
        'Cost_FP': fp * COST_FP,
        'Total_Cost': total_cost
    }

# Compute the costs for all models
models = {
    'Logistic Regression': lr_model.predict(X_test_scaled),
    'Decision Tree': dt_model.predict(X_test),
    'Random Forest': rf_model.predict(X_test)
}

print("💰 COST-SENSITIVE EVALUATION (Test Set)")
print("="*70)
print(f"Cost per FN (unexpected failure):     {COST_FN:,}€")
print(f"Cost per FP (unnecessary maintenance): {COST_FP:,}€")
print(f"Cost Ratio FN:FP = {COST_FN//COST_FP}:1")
print("="*70)

costs_summary = []
for model_name, y_pred in models.items():
    costs = calculate_costs(y_test, y_pred)
    costs_summary.append({
        'Model': model_name,
        'FN': costs['FN'],
        'FP': costs['FP'],
        'Cost_FN': costs['Cost_FN'],
        'Cost_FP': costs['Cost_FP'],
        'Total_Cost': costs['Total_Cost']
    })
    
    print(f"\n{model_name}:")
    print(f"  FN: {costs['FN']:3d} × {COST_FN:,}€ = {costs['Cost_FN']:>10,}€")
    print(f"  FP: {costs['FP']:3d} × {COST_FP:,}€     = {costs['Cost_FP']:>10,}€")
    print(f"  {'TOTAL COST':40s} = {costs['Total_Cost']:>10,}€")

# Identify the best model
best_model = min(costs_summary, key=lambda x: x['Total_Cost'])
print("\n" + "="*70)
print(f"🏆 BEST MODEL (lowest cost): {best_model['Model']}")
print(f"   Total cost: {best_model['Total_Cost']:,}€")
print("="*70)

### Threshold Tuning

**Problem:** The default threshold of 0.5 is rarely optimal for imbalanced data  
**Solution:** Tune the threshold against the business costs, using the validation set so that the test set stays untouched for the final estimate

In [None]:
# Threshold tuning for the random forest (on the validation set)
thresholds = np.arange(0.1, 0.9, 0.05)
costs_by_threshold = []

for threshold in thresholds:
    y_pred_threshold = (y_val_prob_rf >= threshold).astype(int)
    costs = calculate_costs(y_val, y_pred_threshold)
    costs_by_threshold.append({
        'threshold': threshold,
        'fn': costs['FN'],
        'fp': costs['FP'],
        'total_cost': costs['Total_Cost']
    })

# Plot: cost vs. threshold
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))

# Left plot: FN and FP counts
ax1.plot([c['threshold'] for c in costs_by_threshold], 
         [c['fn'] for c in costs_by_threshold], 
         'o-', label='False Negatives', color='red', linewidth=2)
ax1.plot([c['threshold'] for c in costs_by_threshold], 
         [c['fp'] for c in costs_by_threshold], 
         's-', label='False Positives', color='orange', linewidth=2)
ax1.set_xlabel('Decision Threshold', fontsize=12, fontweight='bold')
ax1.set_ylabel('Count', fontsize=12, fontweight='bold')
ax1.set_title('FN and FP vs. Threshold', fontsize=14, fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Right plot: total cost
ax2.plot([c['threshold'] for c in costs_by_threshold], 
         [c['total_cost'] for c in costs_by_threshold], 
         'D-', color='darkred', linewidth=2, markersize=6)
optimal = min(costs_by_threshold, key=lambda x: x['total_cost'])
ax2.axvline(x=optimal['threshold'], color='green', linestyle='--', 
            label=f'Optimal Threshold={optimal["threshold"]:.2f}')
ax2.axvline(x=0.5, color='blue', linestyle='--', alpha=0.5, label='Default Threshold=0.5')
ax2.set_xlabel('Decision Threshold', fontsize=12, fontweight='bold')
ax2.set_ylabel('Total Cost (€)', fontsize=12, fontweight='bold')
ax2.set_title('Total Cost vs. Threshold', fontsize=14, fontweight='bold')
ax2.legend()
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n🎯 OPTIMAL THRESHOLD")
print("="*70)
print(f"Optimal Threshold: {optimal['threshold']:.2f}")
print(f"Total Cost:        {optimal['total_cost']:,}€")
print(f"False Negatives:   {optimal['fn']}")
print(f"False Positives:   {optimal['fp']}")
print("\nVs. Default Threshold 0.5:")
default = next(c for c in costs_by_threshold if abs(c['threshold'] - 0.5) < 0.01)
print(f"Default Cost:      {default['total_cost']:,}€")
print(f"Cost Reduction:    {default['total_cost'] - optimal['total_cost']:,}€ ({(1 - optimal['total_cost']/default['total_cost'])*100:.1f}%)")
print("="*70)
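
The grid search has a closed-form counterpart. If the predicted probability `p` were perfectly calibrated, and if only FN and FP carry a cost as in the cost model above, then scheduling maintenance pays off as soon as `p * COST_FN > (1 - p) * COST_FP`, which gives the break-even probability `COST_FP / (COST_FP + COST_FN)`. The sketch below checks this; `expected_cost` is a hypothetical helper. Calibration is an assumption here: a random forest trained with `class_weight='balanced'` is generally not calibrated, so the empirical optimum can differ.

```python
COST_FN = 10_000  # cost of an unexpected failure (EUR)
COST_FP = 300     # cost of unnecessary maintenance (EUR)

def expected_cost(p_failure, schedule_maintenance):
    """Expected cost of one decision, given a calibrated failure probability."""
    if schedule_maintenance:
        # Maintenance is wasted whenever the machine would not have failed
        return (1.0 - p_failure) * COST_FP
    # Without maintenance, every failure is paid in full
    return p_failure * COST_FN

break_even = COST_FP / (COST_FP + COST_FN)
print(f"Break-even probability: {break_even:.4f}")

for p in (0.01, break_even, 0.10):
    cost_act = expected_cost(p, schedule_maintenance=True)
    cost_wait = expected_cost(p, schedule_maintenance=False)
    decision = "maintain" if cost_act < cost_wait else "wait"
    print(f"p={p:.4f}: maintain={cost_act:7.2f}€  wait={cost_wait:7.2f}€  → {decision}")
```

With a cost ratio of about 33:1 the break-even probability is roughly 0.03, well below the smallest value (0.1) in the threshold grid above. If the search settles on the lower edge of the grid, extending the grid toward smaller thresholds would be a natural next step.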

# 6️⃣ Deployment: From Model to Production

## CRISP-DM Phase 6: Deployment

**Deployment scenarios in production:**

### 1. Batch Prediction (Offline)
- Evaluate all machines daily or weekly
- Create maintenance schedules
- Integration with the MES (Manufacturing Execution System)

### 2. Real-Time API (Online)
- REST API for real-time predictions
- Integration with SCADA systems
- Immediate notifications for critical conditions

### 3. Edge Computing
- Model runs directly on the machine controller
- Offline operation possible
- Minimal latency

---

### Model Persistence (Saving the Model)

In [None]:
import joblib
import json
from datetime import datetime

# Extract the feature names
feature_cols = X_train.columns.tolist()

# Collect everything that belongs to the model
model_artifacts = {
    'model': rf_model,
    'scaler': scaler,
    'feature_names': feature_cols,
    'optimal_threshold': optimal['threshold'],
    'metadata': {
        'train_date': datetime.now().isoformat(),
        'model_type': 'RandomForestClassifier',
        'n_estimators': 100,
        'max_depth': 10,
        'test_auc': roc_auc_score(y_test, y_test_proba),
        'test_avg_precision': average_precision_score(y_test, y_test_proba),
        'optimal_cost': optimal['total_cost']
    }
}

# Save (in a production environment)
# joblib.dump(model_artifacts, 'ai4i_predictive_maintenance_model.pkl')

print("✅ MODEL ARTIFACTS")
print("="*70)
print("The following components would be saved:")
print("  - Random Forest Model")
print("  - StandardScaler for feature scaling (used by the logistic regression only)")
print(f"  - Feature Names ({len(feature_cols)} features) for input validation")
print(f"  - Optimal Threshold: {optimal['threshold']:.2f}")
print(f"  - Metadata: AUC={model_artifacts['metadata']['test_auc']:.4f}")
print("="*70)

### Example: Prediction Function for Deployment

In [None]:
def predict_machine_failure(machine_data):
    """
    Production-ready prediction function
    
    Input: dictionary with machine data
    Output: prediction + probability + recommendation
    """
    # Input validation
    required_features = ['air_temperature_k', 'process_temperature_k', 
                         'rotational_speed_rpm', 'torque_nm', 'tool_wear_min']
    
    if not all(feat in machine_data for feat in required_features):
        raise ValueError(f"Missing features. Required: {required_features}")
    
    # Feature engineering (must match the training code exactly)
    power_w = (machine_data['torque_nm'] * machine_data['rotational_speed_rpm'] * 
               2 * np.pi / 60)
    temp_diff_k = (machine_data['process_temperature_k'] - 
                   machine_data['air_temperature_k'])
    strain = machine_data['torque_nm'] * machine_data['tool_wear_min']
    
    # Build the input row with the same harmonized column names as in training
    row = {feat: machine_data[feat] for feat in required_features}
    row.update({'power_w': power_w, 'temp_diff_k': temp_diff_k, 'strain': strain})
    
    # One-hot encode Type by hand: get_dummies on a single row cannot reproduce
    # the training columns (type_L, type_M; H is the dropped baseline)
    machine_type = machine_data.get('Type', 'M')  # Default = medium quality
    row[f'type_{machine_type}'] = 1
    
    # Align with the training columns; dummy columns not set above become 0
    X_input_encoded = pd.DataFrame([row]).reindex(columns=feature_cols, fill_value=0)
    
    # Prediction (the random forest was trained WITHOUT scaling!)
    # Important: pass a DataFrame so that the feature names are preserved
    proba = rf_model.predict_proba(X_input_encoded)[0, 1]
    prediction = int(proba >= optimal['threshold'])
    
    # Generate the recommendation
    if prediction == 1:
        if proba >= 0.8:
            recommendation = "🚨 CRITICAL: Immediate maintenance required!"
        elif proba >= 0.6:
            recommendation = "⚠️ WARNING: Schedule maintenance within 24 h"
        else:
            recommendation = "⚠️ CAUTION: Maintenance recommended within one week"
    else:
        if proba >= 0.3:
            recommendation = "ℹ️ INFO: Observe the machine, monitor the trend"
        else:
            recommendation = "✅ OK: Machine in normal operation"
    
    return {
        'prediction': 'FAILURE' if prediction == 1 else 'OK',
        'failure_probability': float(proba),
        'threshold': float(optimal['threshold']),
        'recommendation': recommendation,
        'engineered_features': {
            'power_w': float(power_w),
            'temp_diff_k': float(temp_diff_k),
            'strain': float(strain)
        }
    }

# Test with example data
test_machine = {
    'Type': 'L',  # Low quality
    'air_temperature_k': 298.1,
    'process_temperature_k': 308.6,
    'rotational_speed_rpm': 1551,
    'torque_nm': 42.8,
    'tool_wear_min': 220
}

result = predict_machine_failure(test_machine)
print("\n🔮 PREDICTION EXAMPLE")
print("="*70)
print(json.dumps(result, indent=2, ensure_ascii=False))
print("="*70)

### Monitoring & Retraining

**Production considerations:**

1. **Model monitoring:**
   - Track the prediction distribution over time
   - Detect concept drift
   - Monitor feature distributions

2. **Performance tracking:**
   - Actual vs. predicted failures
   - FN/FP rates in live operation
   - Track total costs

3. **Retraining strategy:**
   - Monthly retraining with new data
   - A/B testing of model versions
   - Automated pipeline (MLOps)
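
Feature distribution monitoring can start very simply. The sketch below uses the Population Stability Index (PSI), one common drift score that is swapped in here for illustration and is not part of the notebook's pipeline. The torque readings are simulated, and `population_stability_index` is a hypothetical helper.

```python
import math
import random

def population_stability_index(reference, current, n_bins=10):
    """PSI between two samples, with equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    if width == 0:
        raise ValueError("reference sample has no spread")

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range fall into the edge bins
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins
        return [max(c / len(values), 1e-6) for c in counts]

    ref_shares, cur_shares = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))

rng = random.Random(0)
# Simulated torque readings in Nm: training period vs. two live periods
train_torque = [rng.gauss(40, 10) for _ in range(5000)]
live_stable = [rng.gauss(40, 10) for _ in range(5000)]
live_shifted = [rng.gauss(45, 10) for _ in range(5000)]  # mean drifted by 5 Nm

psi_stable = population_stability_index(train_torque, live_stable)
psi_shifted = population_stability_index(train_torque, live_shifted)
print(f"PSI, stable period:  {psi_stable:.3f}")
print(f"PSI, shifted period: {psi_shifted:.3f}")
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but such cut-offs are conventions and should be checked against the cost of a missed drift. Note that PSI on input features detects data drift; concept drift (a changed relationship between features and failures) additionally requires tracking the live FN/FP rates listed above.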

# 🎓 Summary & Review of Learning Objectives

---

## ✅ What Have We Learned?

### 1️⃣ Business Understanding
- ✅ ROI calculation for ML projects
- ✅ Cost-benefit analysis (€420k cost → €114k net benefit in year 1)
- ✅ Stakeholder requirements (MES, ERP, SCADA integration)

### 2️⃣ Data Understanding
- ✅ Exploratory data analysis (EDA)
- ✅ Recognizing class imbalance (96.6% vs. 3.4%)
- ✅ Analyzing feature distributions
- ✅ Correlation analysis

### 3️⃣ Data Preparation
- ✅ Feature engineering (power_w, temp_diff_k, strain)
- ✅ One-hot encoding for categorical features
- ✅ Feature scaling (StandardScaler)
- ✅ Stratified train-validation-test split

### 4️⃣ Modeling
- ✅ Baseline model (logistic regression)
- ✅ Decision trees with interpretability
- ✅ Random forest for performance
- ✅ class_weight='balanced' for imbalance

### 5️⃣ Evaluation
- ✅ Understanding the confusion matrix (TP, FP, FN, TN)
- ✅ Computing precision, recall, F1-score
- ✅ ROC curves and AUC
- ✅ Precision-recall curves for imbalanced data
- ✅ **Cost-sensitive evaluation** (FN:FP ≈ 33:1)
- ✅ **Threshold tuning** for business optimization

### 6️⃣ Deployment
- ✅ Model persistence (joblib)
- ✅ Production-ready prediction functions
- ✅ Deployment scenarios (batch, API, edge)
- ✅ Monitoring & retraining strategy

---

## 🔗 Connection to the Semester as a Whole

**Concepts from previous lectures:**
- Lectures 01-02: Python basics, NumPy, pandas → data processing
- Lectures 03-04: Data visualization → EDA with Matplotlib/Seaborn
- Lectures 05-06: CRISP-DM methodology → complete workflow
- Lectures 07-08: Supervised learning → classification algorithms
- Lectures 09-10: Feature engineering → deriving new features
- Lectures 11-12: Model evaluation → metrics & validation
- Lectures 13-14: Imbalanced data → class_weight, SMOTE
- Lectures 15-16: Cost-sensitive learning → business-driven metrics

**Smart Operations Management (Lecture 28):**
- Predictive maintenance as a core concept
- Integration into MES/ERP/SCADA systems
- OEE (Overall Equipment Effectiveness) optimization
- TCO (Total Cost of Ownership) reduction

---

## 💡 Key Takeaways

1. **ML is not purely a tech problem** → business value is decisive
2. **Default metrics can be misleading** → cost-sensitive evaluation
3. **A threshold of 0.5 is rarely optimal** → business-driven tuning
4. **Deployment ≠ end of the project** → monitoring & retraining
5. **Interpretability matters** → especially in critical applications

---

## 📚 Exam Preparation

**Exam-relevant concepts:**
- ✅ Computing & interpreting the confusion matrix
- ✅ Precision/recall/F1 formulas
- ✅ Handling class imbalance (techniques)
- ✅ Feature engineering strategies
- ✅ Train-test split strategies (stratified!)
- ✅ ROI calculation for ML projects
- ✅ Cost-sensitive learning
- ✅ Applying the CRISP-DM phases

---

## ❓ Discussion Questions

1. **Ethics:** What happens if the model fails to predict a critical failure (FN)?
2. **Costs:** How would the optimal threshold change if FP costs rise?
3. **Drift:** How can concept drift be detected in production environments?
4. **Interpretability:** Why is feature importance important for maintenance teams?
5. **Scaling:** How would the system scale to 10,000 machines?

---

## 🚀 Further Topics

- **Explainable AI (XAI):** SHAP, LIME for model explanations
- **AutoML:** Automated feature engineering & hyperparameter tuning
- **Deep learning:** LSTM for time-series-based predictive maintenance
- **Federated learning:** Decentralized training across several factories
- **Digital twins:** Integration of simulation models