
# 🔍 Model Interpretability & Explainability

This notebook provides **code templates and checklists** for **understanding how ML models make decisions** using explainability techniques.

### 🔹 What’s Covered:
- Feature importance analysis
- SHAP (SHapley Additive Explanations)
- LIME (Local Interpretable Model-Agnostic Explanations)
- Partial dependence plots


In [None]:

# Ensure required libraries are installed (Uncomment if necessary)
# !pip install pandas numpy scikit-learn shap lime matplotlib seaborn



## 🌳 Feature Importance with Tree-Based Models

✅ Use the impurity-based **feature importance scores** that tree ensembles expose via `feature_importances_`.  
✅ Identify which features contribute most to model predictions.  


In [None]:

import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load dataset (Replace with actual data)
df = pd.read_csv("your_dataset.csv")

# Define features and target
X = df.drop(columns=['target'])
y = df['target']

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a Random Forest model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Get feature importances
importances = model.feature_importances_
feature_names = X.columns

# Convert to DataFrame for easier visualization
importance_df = pd.DataFrame({'Feature': feature_names, 'Importance': importances}).sort_values(by="Importance", ascending=False)
print(importance_df)
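
The cell above depends on `your_dataset.csv`, so it cannot run as-is. The sketch below is a self-contained variant on synthetic data. It also adds permutation importance on the held-out split as a cross-check, because impurity-based scores are computed from the training data only. The synthetic dataset, the `feature_i` column names, and the `_demo` variable names are assumptions made for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for your_dataset.csv (assumption: 5 numeric features, binary target)
X_arr, y_arr = make_classification(
    n_samples=500, n_features=5, n_informative=2, n_redundant=0,
    shuffle=False, random_state=42,
)
feature_names = [f"feature_{i}" for i in range(X_arr.shape[1])]
X_demo = pd.DataFrame(X_arr, columns=feature_names)
y_demo = pd.Series(y_arr, name="target")

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_tr, y_tr)

# Impurity-based importances: derived from the training data, normalized to sum to 1
impurity = pd.Series(rf.feature_importances_, index=feature_names)

# Permutation importances: mean drop in held-out accuracy when one column is shuffled
perm = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=42)

comparison = pd.DataFrame({
    "Impurity": impurity,
    "Permutation": pd.Series(perm.importances_mean, index=feature_names),
}).sort_values(by="Permutation", ascending=False)
print(comparison)
```

If the two columns rank features very differently, that is usually worth investigating before trusting either ranking.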



## 🛠️ SHAP (SHapley Additive Explanations)

✅ Use **SHAP values** to explain individual predictions.  
✅ Identify how each feature **pushes predictions higher or lower**.  


In [None]:

import shap

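# Create SHAP explainer (X_train serves as the background data)
explainer = shap.Explainer(model, X_train)
shap_values = explainer(X_test)

# For a classifier, the SHAP values carry one slice per class
# (samples x features x classes). Assuming a binary target,
# plot the contributions toward the positive class (index 1).
shap.summary_plot(shap_values[:, :, 1], X_test)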
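
To make "pushes predictions higher or lower" concrete, here is a minimal brute-force sketch of exact Shapley values using only the standard library. It is not how the `shap` package computes them (that relies on model-specific algorithms and background data), and `toy_model`, the instance, and the all-zero baseline are hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical scoring function standing in for a trained model."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[2]


def shapley_values(predict, instance, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i when added to coalition s
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

For this toy function the contributions come out to 3, 2, and 1 (up to floating-point rounding). The interaction term is split evenly between features 0 and 2, and the contributions sum to `toy_model(instance) - toy_model(baseline)`, which is the additivity property SHAP builds on. The loop enumerates every coalition, so it is only practical for a handful of features.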



## 🔬 LIME (Local Interpretable Model-Agnostic Explanations)

✅ Use **LIME** to explain individual predictions.  
✅ Works well with black-box models like deep learning & ensembles.  


In [None]:

from lime.lime_tabular import LimeTabularExplainer

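# Create LIME explainer (class_names assumes a binary target)
lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X_train.columns),
    class_names=['Class 0', 'Class 1'],
    mode="classification",
)

# LIME passes NumPy arrays to the model; restore column names to match the fitted model
predict_fn = lambda data: model.predict_proba(pd.DataFrame(data, columns=X_train.columns))

# Explain a single prediction
idx = 0  # Index of sample to explain
exp = lime_explainer.explain_instance(X_test.iloc[idx].values, predict_fn)
exp.show_in_notebook()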
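
The idea behind LIME can be sketched in a few lines of NumPy: perturb the instance, query the black box, weight the samples by proximity, and fit a weighted linear model. This is a simplified illustration, not the `lime` package's implementation. `black_box`, the Gaussian perturbation scale, and the kernel width are assumptions made for this example.

```python
import numpy as np


def black_box(X):
    """Hypothetical non-linear model standing in for model.predict_proba."""
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] - 2.0 * X[:, 1] ** 2)))


def local_surrogate(predict, instance, n_samples=2000, scale=0.3,
                    kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around one instance."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise
    samples = instance + rng.normal(0.0, scale, size=(n_samples, instance.size))
    # 2. Query the black-box model on the perturbed samples
    preds = predict(samples)
    # 3. Weight each sample by its closeness to the instance
    dists = np.linalg.norm(samples - instance, axis=1)
    weights = np.exp(-(dists ** 2) / kernel_width ** 2)
    # 4. Weighted least squares: scale rows by sqrt(weight), then solve
    design = np.column_stack([np.ones(n_samples), samples - instance])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], preds * sw, rcond=None)
    return coef[0], coef[1:]


instance = np.array([0.5, 1.0])
intercept, slopes = local_surrogate(black_box, instance)
print("local intercept:", intercept)
print("local slopes:", slopes)
```

The signs of the slopes show the local direction of each feature's effect. Here feature 0 raises the score and feature 1 lowers it near this instance. Changing `seed` shifts the numbers slightly, which is why LIME explanations can vary between runs.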



## 📊 Partial Dependence Plots (PDP)

✅ Show how a **single feature influences model predictions** on average, with the other features averaged out.  
✅ Helps identify **non-linear relationships** between features & targets.  


In [None]:

from sklearn.inspection import PartialDependenceDisplay
import matplotlib.pyplot as plt

# Generate partial dependence plots for the first two features
# (PartialDependenceDisplay.from_estimator requires scikit-learn >= 1.0)
fig, ax = plt.subplots(figsize=(8, 6))
PartialDependenceDisplay.from_estimator(model, X_train, features=[0, 1], ax=ax)
plt.show()
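
What a partial dependence curve measures can be shown by computing one by hand: for each grid value, overwrite the chosen feature in every row and average the predictions. The sketch below assumes a hypothetical `toy_predict` function and random data. It illustrates the definition and is not scikit-learn's implementation.

```python
import numpy as np


def toy_predict(X):
    """Hypothetical model: quadratic in feature 0, linear in feature 1."""
    return X[:, 0] ** 2 + 0.5 * X[:, 1]


def partial_dependence_1d(predict, X, feature, grid):
    """Average prediction when `feature` is forced to each grid value."""
    averages = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v  # overwrite the feature for every row
        averages.append(predict(X_mod).mean())
    return np.array(averages)


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 2))
grid = np.linspace(-2.0, 2.0, 5)
pd_curve = partial_dependence_1d(toy_predict, X_demo, feature=0, grid=grid)
print(dict(zip(grid, pd_curve)))
```

Because `toy_predict` is quadratic in feature 0, the curve is a parabola in the grid values, shifted by the average contribution of feature 1. That is the kind of non-linear relationship a PDP is meant to reveal.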



## ✅ Best Practices & Common Pitfalls

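- **Choose the right tool**: SHAP attributions are additive (they sum to the prediction minus a baseline), whereas LIME fits a local surrogate on random samples, so its explanations can change between runs. LIME is often cheaper when you only need to explain a few predictions from a black-box model.  
- **Consider computational cost**: SHAP can be slow for large datasets, especially with model-agnostic explainers. Consider explaining a representative sample of rows.  
- **Compare results**: Feature importance, SHAP, and LIME measure different things (global impurity reduction versus per-prediction attributions), so they may rank features differently.  
- **Check for bias**: If one feature dominates, check whether it leaks the target or acts as a proxy for a sensitive attribute before trusting the model.  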
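
One way to act on the "compare results" point is to put two sets of scores side by side and measure how well their rankings agree. The sketch below uses made-up scores and feature names purely for illustration. In practice the columns would come from `feature_importances_` and from the mean absolute SHAP value per feature.

```python
import pandas as pd

# Hypothetical scores, for illustration only
scores = pd.DataFrame(
    {
        "impurity_importance": [0.40, 0.30, 0.20, 0.10],
        "mean_abs_shap": [0.25, 0.35, 0.05, 0.15],
    },
    index=["age", "income", "tenure", "region"],
)

# Rank 1 = most important under each method
ranks = scores.rank(ascending=False)

# Spearman rank correlation, computed as the Pearson correlation of the ranks
agreement = ranks["impurity_importance"].corr(ranks["mean_abs_shap"])

# Features whose rank differs between the two methods
disagreements = ranks[ranks["impurity_importance"] != ranks["mean_abs_shap"]]

print(ranks)
print("rank agreement:", agreement)
```

An agreement near 1 means both methods tell the same story. Lower values point to features worth a closer look.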
