# Day 06 — Interpretability basics

This notebook explores **feature importance** and **permutation importance** for a simple classifier.

**Concepts covered**
- Global feature importance from tree models
- Permutation importance for model-agnostic checks
- Comparing the two rankings to see whether they agree


In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

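# Load the iris data; every column except "species" is used as a feature.
iris = pd.read_csv("data/iris.csv")
X = iris.drop(columns=["species"])
y = iris["species"]

# Hold out 20% for evaluation, stratified so both splits keep the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
# A fixed random_state makes the forest, and its importances, reproducible.
model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)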


### Built-in feature importance

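This ranking is the forest's impurity-based importance: how much each feature's splits reduced impurity, averaged over all trees and normalized to sum to 1. It is computed from the training data only, so it shows what the model relied on while fitting, not how much each feature helps on unseen data, and it can favor features with many distinct values.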


In [None]:
feature_importance = pd.Series(model.feature_importances_, index=X.columns)
feature_importance.sort_values(ascending=False)
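
A single importance number per feature hides how much the individual trees disagree. The sketch below is one way to look at that spread: it collects the per-tree importances from `estimators_` and reports their mean and standard deviation. The helper name `mdi_with_spread` is hypothetical, and the cell uses scikit-learn's bundled iris data as an assumed stand-in for `data/iris.csv` so that it runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def mdi_with_spread(forest, feature_names):
    """Mean and std of impurity-based importance across the forest's trees."""
    # One row per tree, one column per feature.
    per_tree = np.array([tree.feature_importances_ for tree in forest.estimators_])
    table = pd.DataFrame(
        {"mean": per_tree.mean(axis=0), "std": per_tree.std(axis=0)},
        index=feature_names,
    )
    return table.sort_values("mean", ascending=False)


# Assumption: scikit-learn's bundled iris data stands in for data/iris.csv.
demo = load_iris(as_frame=True)
demo_forest = RandomForestClassifier(n_estimators=200, random_state=42)
demo_forest.fit(demo.data, demo.target)

mdi_table = mdi_with_spread(demo_forest, demo.data.columns)
mdi_table
```

A large `std` relative to `mean` suggests the ranking for that feature depends heavily on which trees happen to be in the forest. Inside this notebook, `mdi_with_spread(model, X.columns)` should give the same kind of table for the model fitted above.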


### Permutation importance

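Permutation importance shuffles one column at a time and measures how much the model’s score drops. In this notebook it is computed on the held-out test set, and the score is the classifier's default, accuracy. A feature whose shuffling barely changes the score contributes little to predictions on that data.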
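
To make the shuffling step concrete, here is a minimal hand-rolled sketch of the idea. It is not scikit-learn's implementation: the function name `manual_permutation_importance` and its parameters are hypothetical, and the bundled iris data is an assumed stand-in for `data/iris.csv` so the cell is self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def manual_permutation_importance(fitted_model, X_eval, y_eval, n_repeats=10, seed=0):
    """Mean drop in fitted_model.score when each column is shuffled on its own."""
    rng = np.random.default_rng(seed)
    baseline = fitted_model.score(X_eval, y_eval)
    drops = {}
    for column in X_eval.columns:
        scores = []
        for _ in range(n_repeats):
            shuffled = X_eval.copy()
            # Break the link between this column and the target; leave the rest intact.
            shuffled[column] = rng.permutation(shuffled[column].to_numpy())
            scores.append(fitted_model.score(shuffled, y_eval))
        drops[column] = baseline - np.mean(scores)
    return pd.Series(drops)


# Assumption: scikit-learn's bundled iris data stands in for data/iris.csv.
sketch_data = load_iris(as_frame=True)
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    sketch_data.data, sketch_data.target,
    test_size=0.2, random_state=42, stratify=sketch_data.target,
)
sketch_forest = RandomForestClassifier(n_estimators=200, random_state=42)
sketch_forest.fit(Xs_train, ys_train)

manual_drops = manual_permutation_importance(sketch_forest, Xs_test, ys_test)
manual_drops.sort_values(ascending=False)
```

The numbers will not match `permutation_importance` exactly because the random shuffles differ, but the ordering of features should usually be similar. Drops can be slightly negative when a shuffle happens to help by chance.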


In [None]:
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=42)
perm_importance = pd.Series(perm.importances_mean, index=X.columns)
perm_importance.sort_values(ascending=False)
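
One way to compare the two signals is to put them side by side with their ranks, along with the spread of the permutation scores across repeats (`importances_std`). The sketch below does that. The helper `compare_importances` and its column names are hypothetical, and the bundled iris data is again an assumed stand-in for `data/iris.csv`.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def compare_importances(fitted_forest, X_eval, y_eval, n_repeats=10, seed=42):
    """Impurity-based and permutation importance in one table, with ranks."""
    perm_result = permutation_importance(
        fitted_forest, X_eval, y_eval, n_repeats=n_repeats, random_state=seed
    )
    table = pd.DataFrame(
        {
            "impurity": fitted_forest.feature_importances_,
            "perm_mean": perm_result.importances_mean,
            "perm_std": perm_result.importances_std,
        },
        index=X_eval.columns,
    )
    # Rank 1 is the most important feature; tied features share the lowest rank.
    table["impurity_rank"] = table["impurity"].rank(ascending=False, method="min").astype(int)
    table["perm_rank"] = table["perm_mean"].rank(ascending=False, method="min").astype(int)
    return table.sort_values("impurity_rank")


# Assumption: scikit-learn's bundled iris data stands in for data/iris.csv.
cmp_data = load_iris(as_frame=True)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    cmp_data.data, cmp_data.target,
    test_size=0.2, random_state=42, stratify=cmp_data.target,
)
cmp_forest = RandomForestClassifier(n_estimators=200, random_state=42)
cmp_forest.fit(Xc_train, yc_train)

comparison = compare_importances(cmp_forest, Xc_test, yc_test)
comparison
```

Features that rank high on both columns are the safer ones to call important. Where the ranks disagree, or where `perm_std` is large relative to `perm_mean`, the signal is less stable and worth a closer look. In this notebook, `compare_importances(model, X_test, y_test)` should produce the same table for the model fitted above.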
