# 📊 Module 7: Model Evaluation & Interpretation

This notebook shows how to evaluate a Machine Learning classification model comprehensively: it computes the standard metrics (accuracy, precision, recall, F1-score) and visualizes the results with a confusion matrix and a ROC curve with its AUC.

In [None]:
# 📥 1. Import the Required Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, roc_curve, auc

import warnings
warnings.filterwarnings('ignore')

In [None]:
# 📊 2. Load the Dataset, Train the Model, and Predict
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = pd.Series(data.target)

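# Split into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Logistic Regression model as the baseline.
# max_iter is raised because the default (100) typically does not converge on these unscaled features.
model = LogisticRegression(max_iter=10000)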
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:")
print(classification_report(y_test, y_pred))

In [None]:
# 🔲 3. Visualize the Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6,4))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=data.target_names, yticklabels=data.target_names)  # 0 = malignant, 1 = benign
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.show()

In [None]:
# 📈 4. Plot the ROC Curve and Compute the AUC
y_prob = model.predict_proba(X_test)[:, 1]  # predicted probability of the positive class (label 1)
fpr, tpr, thresholds = roc_curve(y_test, y_prob)
roc_auc = auc(fpr, tpr)

plt.figure(figsize=(6,4))
plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC)')
plt.legend(loc="lower right")
plt.show()

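## Interpreting the Evaluation Results

- **Accuracy:** The fraction of predictions that are correct. Treat it with caution on imbalanced data, where always predicting the majority class can already score high.
- **Precision and Recall:** Precision is the share of predicted positives that are truly positive; recall is the share of actual positives that the model finds. Prefer them over accuracy when false positives or false negatives carry serious consequences.
- **F1-Score:** The harmonic mean of precision and recall; it is high only when both are high.
- **Confusion Matrix:** Shows the counts of correct and incorrect predictions for each class, which makes it clear what kind of error the model makes.
- **ROC Curve and AUC:** Measure how well the model separates the positive and negative classes across all decision thresholds. An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation.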
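
The points above can be checked by hand on a small example. The sketch below is illustrative only: the labels, predictions, and scores are made-up toy data (not results from the breast cancer model), and the variable names are assumptions introduced here. It computes precision, recall, and F1 directly from the confusion matrix counts, compares accuracy with a majority-class predictor, and verifies that AUC equals the fraction of positive/negative pairs that the scores rank correctly.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

# Hypothetical imbalanced labels: 90 negatives (0) followed by 10 positives (1).
y_true = np.array([0] * 90 + [1] * 10)

# A trivial predictor that always outputs the majority class.
y_majority = np.zeros_like(y_true)

# A hypothetical model: 5 false alarms among the negatives, 8 of 10 positives found.
y_model = np.array([1] * 5 + [0] * 85 + [1] * 8 + [0] * 2)

# For binary labels, ravel() returns the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_model).ravel()

# Metrics computed by hand from the four counts.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"majority: accuracy={accuracy_score(y_true, y_majority):.2f}, "
      f"recall={recall_score(y_true, y_majority):.2f}")
print(f"model:    accuracy={accuracy_score(y_true, y_model):.2f}, "
      f"precision={precision:.2f}, recall={recall:.2f}, f1={f1:.2f}")

# The hand-computed values agree with scikit-learn's implementations.
assert np.isclose(precision, precision_score(y_true, y_model))
assert np.isclose(recall, recall_score(y_true, y_model))
assert np.isclose(f1, f1_score(y_true, y_model))

# AUC as a ranking measure: the share of (positive, negative) pairs in which
# the positive example receives the higher score (ties would count as half).
y_small = np.array([0, 0, 0, 1, 1])
scores = np.array([0.10, 0.40, 0.35, 0.80, 0.30])  # made-up scores
pos = scores[y_small == 1]
neg = scores[y_small == 0]
pairwise_auc = np.mean([(p > n) + 0.5 * (p == n) for p in pos for n in neg])
assert np.isclose(pairwise_auc, roc_auc_score(y_small, scores))
```

The majority-class predictor reaches high accuracy while finding none of the positives, which is why accuracy alone can mislead on imbalanced data.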