# Complete Visualization Demo: Classifier Model Interpreter

This notebook demonstrates all key visualization capabilities and explains how this package improves upon native SHAP visualizations.

## What Makes This Package Better Than Native SHAP?

### Key Advantages:

1. **Interactive Plotly visualizations** instead of static matplotlib
2. **Prediction surface visualizations** (contour & 3D) - NOT available in SHAP
3. **Conditional dependence plots** - makes interactions visually obvious
4. **Specialized categorical visualizations** - box plots instead of messy scatter
5. **Simple, integrated API** - one Interpreter class for everything
6. **Business-friendly outputs** - probabilities and clear visuals

## Setup

In [1]:
import sys
from pathlib import Path

parent_dir = Path.cwd().parent
if str(parent_dir) not in sys.path:
    sys.path.insert(0, str(parent_dir))

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBClassifier
from src.core import Interpreter
import warnings
warnings.filterwarnings('ignore')

print("✓ Setup complete")

✓ Setup complete


In [2]:
# Load and prepare data
data_path = Path.cwd().parent / 'data' / 'customer_conversion.csv'
df = pd.read_csv(data_path)

# Label encode categoricals
cat_cols = ['occupation', 'country', 'referral_source', 'device_type']
X = df.drop(['customer_id', 'converted'], axis=1).copy()
y = df['converted'].values

label_encoders = {}
for col in cat_cols:
    if col in X.columns:
        le = LabelEncoder()
        X[col] = le.fit_transform(X[col].astype(str))
        label_encoders[col] = le

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Train model
model = XGBClassifier(
    n_estimators=200,
    max_depth=6,
    learning_rate=0.1,
    random_state=42,
    eval_metric='logloss'
)
model.fit(X_train, y_train)

print(f"Test accuracy: {model.score(X_test, y_test):.3f}")
print(f"Dataset: {df.shape[0]} rows, conversion rate: {df['converted'].mean():.1%}")

Test accuracy: 0.565
Dataset: 5000 rows, conversion rate: 53.1%


In [3]:
# Initialize Interpreter
interp = Interpreter(model, X_test, y_test, config='detailed_analysis')

# Create label mappings for categorical visualizations
occupation_labels = {i: label for i, label in enumerate(label_encoders['occupation'].classes_)}
device_labels = {i: label for i, label in enumerate(label_encoders['device_type'].classes_)}
country_labels = {i: label for i, label in enumerate(label_encoders['country'].classes_)}

print("✓ Interpreter ready")
print("\nLabel mappings:")
print(f"  Occupation: {occupation_labels}")
print(f"  Device: {device_labels}")

✓ Interpreter ready

Label mappings:
  Occupation: {0: 'professional', 1: 'retired', 2: 'student', 3: 'unemployed'}
  Device: {0: 'desktop', 1: 'mobile', 2: 'tablet'}


---
## 1. Global Feature Importance

**What it shows:** Which features matter most overall

**vs Native SHAP:** Interactive Plotly (hover, zoom) vs static matplotlib

In [4]:
fig = interp.plot_global_importance(top_n=15)
fig.show()
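
The notebook does not show how `plot_global_importance` ranks features. A common convention for SHAP-based global importance is the mean absolute SHAP value per feature, and the sketch below assumes that convention. The function name `global_importance` and the toy numbers are illustrative only and are not part of the package.

```python
import numpy as np

def global_importance(shap_values, feature_names, top_n=15):
    """Rank features by mean absolute SHAP value, largest first."""
    shap_values = np.asarray(shap_values, dtype=float)
    mean_abs = np.abs(shap_values).mean(axis=0)      # one score per feature
    order = np.argsort(mean_abs)[::-1][:top_n]       # indices, descending
    return [(feature_names[i], float(mean_abs[i])) for i in order]

# Toy SHAP matrix: 4 samples x 3 features (made-up numbers).
toy_shap = np.array([
    [0.2, -0.1, 0.0],
    [-0.4, 0.1, 0.0],
    [0.3, -0.3, 0.1],
    [-0.1, 0.1, -0.1],
])
ranking = global_importance(toy_shap, ["discount_offered", "age", "email_opens"])
for name, score in ranking:
    print(f"{name:20s} {score:.3f}")
```

Taking the absolute value first matters: a feature that pushes some predictions up and others down would otherwise average out to roughly zero.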

---
## 2. Beeswarm Plot

**What it shows:** Distribution of SHAP values for each feature
- Color shows feature value (red=high, blue=low)
- Spread shows impact variability

**vs Native SHAP:** Similar, but ours is interactive Plotly

In [5]:
fig = interp.plot_beeswarm(top_n=15)
fig.show()

---
## 3. Feature Dependence Plots (Numeric)

**What it shows:** How feature values affect predictions
- X-axis: Feature value
- Y-axis: SHAP value (impact)
- Color: Auto-detected interaction feature

**vs Native SHAP:** Interactive, with the legend positioned so it does not overlap the plot

In [6]:
fig = interp.plot_dependence('previous_courses')
fig.show()

In [7]:
# Control color feature explicitly
fig = interp.plot_dependence('discount_offered', interaction_feature='occupation')
fig.show()

In [8]:
fig = interp.plot_dependence('engagement_score')
fig.show()
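
How the colour feature is auto-detected is not shown in this notebook. One plausible heuristic, sketched below as an assumption rather than the package's actual method, is to split the samples into bins along the main feature and pick the other feature that correlates most strongly with the main feature's SHAP values inside those bins. The function `pick_color_feature` and the synthetic data are hypothetical.

```python
import numpy as np

def pick_color_feature(X, shap_main, main_idx, n_bins=5):
    """Return the index of the feature that best explains the vertical
    spread of the main feature's SHAP values within bins of that feature."""
    X = np.asarray(X, dtype=float)
    shap_main = np.asarray(shap_main, dtype=float)
    # Equal-count bins along the main feature.
    order = np.argsort(X[:, main_idx], kind="stable")
    bins = np.array_split(order, n_bins)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        if j == main_idx:
            continue
        for idx in bins:
            xs, ss = X[idx, j], shap_main[idx]
            # Skip bins where a correlation is undefined.
            if len(idx) < 3 or xs.std() == 0 or ss.std() == 0:
                continue
            scores[j] += abs(np.corrcoef(xs, ss)[0, 1])
    scores[main_idx] = -np.inf   # never pick the main feature itself
    return int(np.argmax(scores))

# Synthetic example: the effect of feature 0 depends on feature 1 only.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(400, 3))
shap_toy = X_toy[:, 0] * (0.5 + 0.8 * X_toy[:, 1])
color_idx = pick_color_feature(X_toy, shap_toy, main_idx=0)
print("suggested colour feature index:", color_idx)
```

Passing `interaction_feature` explicitly, as in cell 7, bypasses any such heuristic and is the safer choice when a specific business question is being asked.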

---
## 4. Categorical Dependence Plots

**What it shows:** SHAP distributions for each category (box plots)

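**vs Native SHAP:** Native SHAP draws a categorical feature as a scatter, so points pile up in vertical strips at each encoded value. Box plots summarize each category's SHAP distribution instead, which makes the categories easier to compare.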

**UNIQUE FEATURE** - Specialized visualization for categorical features

In [9]:
fig = interp.plot_dependence_categorical('occupation', value_labels=occupation_labels)
fig.show()

print("\n📊 Business insight:")
print("  - Students: positive SHAP (more likely to convert)")
print("  - Unemployed: negative SHAP (less likely to convert)")


📊 Business insight:
  - Students: positive SHAP (more likely to convert)
  - Unemployed: negative SHAP (less likely to convert)


In [10]:
fig = interp.plot_dependence_categorical('device_type', value_labels=device_labels)
fig.show()

In [11]:
fig = interp.plot_dependence_categorical('country', value_labels=country_labels)
fig.show()
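
The statistics behind a per-category box plot can be sketched as follows. This is an illustration of the idea, not the package's code: `box_stats_by_category` is a hypothetical helper, the SHAP numbers are made up, and the feature is assumed to be integer label-encoded as in the setup cell.

```python
import numpy as np

def box_stats_by_category(codes, shap_values, value_labels=None):
    """Quartiles and counts of SHAP values for each encoded category."""
    codes = np.asarray(codes)
    shap_values = np.asarray(shap_values, dtype=float)
    stats = {}
    for code in np.unique(codes):
        vals = shap_values[codes == code]
        q1, median, q3 = np.percentile(vals, [25, 50, 75])
        # Fall back to the raw code when no readable label is supplied.
        label = value_labels.get(int(code), str(code)) if value_labels else str(code)
        stats[label] = {"q1": float(q1), "median": float(median),
                        "q3": float(q3), "n": int(vals.size)}
    return stats

labels = {0: "professional", 1: "retired", 2: "student", 3: "unemployed"}
codes_toy = [0, 0, 0, 2, 2, 2, 3, 3, 3]
shap_cat_toy = [0.00, 0.01, 0.02, 0.01, 0.02, 0.03, -0.04, -0.03, -0.02]
stats = box_stats_by_category(codes_toy, shap_cat_toy, labels)
for label, s in stats.items():
    print(f"{label:14s} median={s['median']:+.3f} n={s['n']}")
```

Reporting `n` alongside the quartiles is worthwhile because a box built from a handful of rows looks just as confident as one built from hundreds.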

---
## 5. Interaction Detection

**What it shows:** Which feature pairs interact most

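**vs Native SHAP:** Returns a ranked DataFrame of feature pairs, whereas native SHAP returns a raw interaction array of shape (samples, features, features) that you must aggregate yourself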

In [12]:
interactions = interp.detect_interactions(top_n=15, method='shap_variance')
print("Top 15 Feature Interactions:")
interactions

Top 15 Feature Interactions:


| | feature_1 | feature_2 | interaction_strength |
|---|---|---|---|
| 127 | previous_courses | signup_month | 0.04134 |
| 128 | previous_courses | signup_day_of_week | 0.030933 |
| 86 | discount_offered | previous_courses | 0.027605 |
| 81 | discount_offered | age | 0.026745 |
| 90 | discount_offered | days_since_signup | 0.020076 |
| 82 | discount_offered | occupation | 0.01886 |
| 89 | discount_offered | signup_day_of_week | 0.017845 |
| 87 | discount_offered | account_age_days | 0.01756 |
| 32 | time_on_site_mins | email_opens | 0.017235 |
| 15 | engagement_score | days_since_signup | 0.016088 |
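
The `shap_variance` scoring itself is not shown in this notebook. The sketch below covers only the last step, under the assumption that some symmetric feature-by-feature strength matrix is already available: flatten the upper triangle into one row per pair and sort. `rank_interactions` and the toy matrix are hypothetical.

```python
import numpy as np
import pandas as pd

def rank_interactions(strength, feature_names, top_n=15):
    """Turn a symmetric strength matrix into a ranked table of pairs."""
    strength = np.asarray(strength, dtype=float)
    # k=1 skips the diagonal, so each unordered pair appears exactly once.
    rows, cols = np.triu_indices(len(feature_names), k=1)
    pairs = pd.DataFrame({
        "feature_1": [feature_names[i] for i in rows],
        "feature_2": [feature_names[j] for j in cols],
        "interaction_strength": strength[rows, cols],
    })
    return pairs.sort_values("interaction_strength", ascending=False).head(top_n)

names_toy = ["discount_offered", "age", "occupation"]
strength_toy = np.array([
    [0.00, 0.20, 0.05],
    [0.20, 0.00, 0.30],
    [0.05, 0.30, 0.00],
])
top = rank_interactions(strength_toy, names_toy, top_n=2)
print(top)
```

Sorting keeps each row's original position as its index, which is one possible explanation for the non-sequential index values in the output above. Calling `reset_index(drop=True)` on the result would give a clean 0..n-1 index.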


---
## 6. Interaction Analysis

**What it shows:** Detailed statistics about a specific interaction

In [13]:
analysis = interp.analyze_interaction('discount_offered', 'occupation')

print("="*60)
print("INTERACTION ANALYSIS: discount_offered × occupation")
print("="*60)
print(f"\nSHAP correlation: {analysis['correlation_shap_values']:.3f}")

print("\nConditional Analysis (discount SHAP by occupation):")
for group, stats in analysis['conditional_analysis'].items():
    occ_label = occupation_labels.get(int(float(group)), group)
    print(f"  {occ_label:20s} Mean SHAP: {stats['mean_shap']:7.4f}  (n={stats['count']})")

INTERACTION ANALYSIS: discount_offered × occupation

SHAP correlation: 0.000

Conditional Analysis (discount SHAP by occupation):
  professional         Mean SHAP:  0.0027  (n=509)
  student              Mean SHAP:  0.0235  (n=241)
  unemployed           Mean SHAP: -0.0318  (n=100)
  retired              Mean SHAP:  0.0024  (n=150)
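
A correlation of 0.000 next to clearly different group means can look contradictory. Assuming `correlation_shap_values` is a Pearson correlation between the two features' SHAP columns (the notebook does not confirm this), the two statistics measure different things and can disagree. The toy arrays below are constructed by hand to show that case and are not taken from the dataset.

```python
import numpy as np

# Encoded group of the conditioning feature for six toy rows.
groups = np.array([0, 0, 1, 1, 2, 2])
# SHAP values of the main feature: group means are 0.00, +0.02, -0.02.
shap_a = np.array([0.01, -0.01, 0.03, 0.01, -0.01, -0.03])
# SHAP values of the conditioning feature: constant within each group.
shap_b = np.array([0.05, 0.05, -0.02, -0.02, -0.02, -0.02])

# Linear association between the two SHAP columns.
corr = float(np.corrcoef(shap_a, shap_b)[0, 1])
# Mean SHAP of the main feature within each group.
group_means = {int(g): float(shap_a[groups == g].mean()) for g in np.unique(groups)}

print(f"correlation: {corr:.3f}")
for g, m in group_means.items():
    print(f"group {g}: mean SHAP {m:+.3f}")
```

Here the correlation is zero up to floating-point error while the group means differ in sign, so the conditional table is the more informative output when the conditioning feature is categorical.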


---
## 7. Traditional Interaction Visualizations

In [14]:
# Scatter plot: shows SHAP values (can be abstract)
fig = interp.plot_interaction_scatter('discount_offered', 'occupation')
fig.show()

print("\n⚠️ Limitation: Shows SHAP values (abstract)")
print("   Better alternatives below: contour & conditional dependence plots")


⚠️ Limitation: Shows SHAP values (abstract)
   Better alternatives below: contour & conditional dependence plots


In [15]:
# Heatmap: average SHAP across feature combinations (FIXED - now shows all discount values)
fig = interp.plot_interaction_heatmap('discount_offered', 'occupation', bins=8)
fig.show()

print("\n✓ FIX: Now shows ALL discount values (0, 10, 20, 30)")
print("   Previously only showed 0 due to binning bug")


✓ FIX: Now shows ALL discount values (0, 10, 20, 30)
   Previously only showed 0 due to binning bug


In [16]:
# Interaction matrix
fig = interp.plot_interaction_matrix(method='correlation')
fig.show()

---
# NEW VISUALIZATIONS (Unique to This Package!)

## 8. Blocky Heatmaps - Prediction Surfaces

**What it shows:** Predicted probability across two features

**Automatic discrete handling:**
- Features with <10 unique values → uses only actual values on axes
- Categorical features → can show actual category names (e.g., "student" instead of "2")
- Continuous features → uses grid points with consistent tick marks
- **All features show blocky heatmap (no interpolation smoothing)**

**Why it's better:**
- Shows actual predictions (Y), not SHAP values
- Blocky appearance = clearer interpretation
- No meaningless tick marks (e.g., no "15" when only 0,10,20,30 exist)
- Consistent visual style across all features
- Business stakeholders understand probabilities

**NATIVE SHAP DOESN'T HAVE THIS!**

In [17]:
# Blocky heatmap: discount (discrete) × engagement (continuous)
fig = interp.plot_interaction_contour('discount_offered', 'engagement_score', n_grid=50)
fig.show()

print("\n📊 Notice:")
print("  - X-axis shows ONLY actual discount values: 0, 10, 20, 30")
print("  - No meaningless ticks like 5, 15, 25")
print("  - Y-axis (engagement) uses 50 grid points across its range")
print("  - Blocky heatmap for clear cell-by-cell interpretation")


📊 Notice:
  - X-axis shows ONLY actual discount values: 0, 10, 20, 30
  - No meaningless ticks like 5, 15, 25
  - Y-axis (engagement) uses 50 grid points across its range
  - Blocky heatmap for clear cell-by-cell interpretation


In [18]:
# Categorical labels: discount × occupation (with actual names!)
fig = interp.plot_interaction_contour(
    'discount_offered', 
    'occupation',
    value_labels_2=occupation_labels  # Maps 0→'professional', 1→'retired', etc.
)
fig.show()

print("\n📊 Categorical labeling:")
print("  - Y-axis shows actual occupation names, not encoded numbers")
print("  - 'student', 'professional', etc. instead of 0, 1, 2, 3")
print("  - Much clearer for business stakeholders!")
print("  - Shows blocky heatmap for both discrete features")


📊 Categorical labeling:
  - Y-axis shows actual occupation names, not encoded numbers
  - 'student', 'professional', etc. instead of 0, 1, 2, 3
  - Much clearer for business stakeholders!
  - Shows blocky heatmap for both discrete features


In [19]:
# Continuous features: blocky heatmap with grid points
fig = interp.plot_interaction_contour('age', 'engagement_score', n_grid=50)
fig.show()

print("\n📊 Continuous features:")
print("  - Both age and engagement are continuous")
print("  - Uses 50x50 grid of prediction points")
print("  - Blocky heatmap (no smoothing) for consistent interpretation")
print("  - Tick marks show evenly distributed values across range")


📊 Continuous features:
  - Both age and engagement are continuous
  - Uses 50x50 grid of prediction points
  - Blocky heatmap (no smoothing) for consistent interpretation
  - Tick marks show evenly distributed values across range
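
The grid behind such a heatmap can be sketched as below. Several details are assumptions rather than the package's confirmed behaviour: features not on the axes are held at their median (averaging over the data, as in partial dependence, is another option), the threshold of fewer than 10 unique values follows the description above, and `axis_values`, `prediction_surface`, and the stand-in model `toy_predict_positive` are hypothetical names.

```python
import numpy as np

def axis_values(column, n_grid=50, max_discrete=10):
    """Actual values for discrete features, an even grid otherwise."""
    unique = np.unique(column)
    if unique.size < max_discrete:
        return unique
    return np.linspace(column.min(), column.max(), n_grid)

def prediction_surface(predict_positive, X, i, j, n_grid=50):
    """Predicted probability over features i (x-axis) and j (y-axis)."""
    X = np.asarray(X, dtype=float)
    xs = axis_values(X[:, i], n_grid)
    ys = axis_values(X[:, j], n_grid)
    baseline = np.median(X, axis=0)            # assumed fill for other features
    grid = np.tile(baseline, (xs.size * ys.size, 1))
    xx, yy = np.meshgrid(xs, ys)               # both have shape (len(ys), len(xs))
    grid[:, i] = xx.ravel()
    grid[:, j] = yy.ravel()
    z = predict_positive(grid).reshape(ys.size, xs.size)
    return xs, ys, z

def toy_predict_positive(rows):
    """Stand-in for a trained model: a fixed logistic function."""
    logit = 0.03 * rows[:, 0] + 0.5 * rows[:, 1] - 1.0
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
X_surface = np.column_stack([
    rng.choice([0, 10, 20, 30], size=200),     # discrete, like discount_offered
    rng.uniform(0, 5, size=200),               # continuous, like engagement_score
    rng.normal(size=200),                      # a feature not on either axis
])
xs, ys, z = prediction_surface(toy_predict_positive, X_surface, 0, 1, n_grid=50)
print(xs, z.shape)
```

With a scikit-learn-style classifier, the callable could wrap `predict_proba` and return the positive-class column. Drawing `z` as a heatmap without interpolation gives the blocky look, one cell per grid point.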


---
## 9. 3D Surface Plots

**What it shows:** 3D visualization of prediction landscape
- Z-axis = predicted probability
- Peaks = high conversion areas
- Valleys = low conversion areas

**Why it's better:**
- Great for presentations
- Interactive (rotate, zoom)
- Intuitive for non-technical audiences

**NATIVE SHAP DOESN'T HAVE THIS!**

In [20]:
fig = interp.plot_interaction_surface_3d('discount_offered', 'engagement_score', n_grid=30)
fig.show()

print("\n📊 Interactive 3D - try rotating the plot!")


📊 Interactive 3D - try rotating the plot!


In [21]:
fig = interp.plot_interaction_surface_3d('previous_courses', 'engagement_score')
fig.show()

---
## 10. Conditional Dependence Plots

**What it shows:** How a feature's effect varies by another feature
- Each line = different group/bin
- Parallel lines = no interaction
- Diverging or crossing lines = interaction (the wider the gap in slopes, the stronger)

**Why it's better:**
- Makes interactions visually obvious
- Answers business questions directly
- Better than scatter for categorical conditioning

**NATIVE SHAP DOESN'T HAVE THIS!**

In [22]:
# Conditional dependence with categorical labels
fig = interp.plot_conditional_dependence(
    'discount_offered', 
    'occupation',
    value_labels=occupation_labels  # Show actual occupation names in legend
)
fig.show()

print("\n📊 Business question: Does discount effectiveness vary by occupation?")
print("   If lines diverge → YES, target different occupations differently")
print("   If parallel → NO, same discount strategy for all")
print("\n✓ Legend shows 'student', 'professional', etc. instead of 0, 1, 2, 3")


📊 Business question: Does discount effectiveness vary by occupation?
   If lines diverge → YES, target different occupations differently
   If parallel → NO, same discount strategy for all

✓ Legend shows 'student', 'professional', etc. instead of 0, 1, 2, 3


In [23]:
fig = interp.plot_conditional_dependence('engagement_score', 'discount_offered')
fig.show()

print("\n📊 Question: Does engagement value depend on discount level?")


📊 Question: Does engagement value depend on discount level?


In [24]:
fig = interp.plot_conditional_dependence('time_on_site_mins', 'device_type', n_bins=3)
fig.show()
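
The "parallel versus diverging" reading can be made concrete with a small sketch. It assumes each line is the mean of some plotted quantity (a predicted probability or a SHAP value; the notebook does not say which) at each value of the main feature, computed separately per group. `conditional_curves`, `slope_spread`, and the numbers are hypothetical.

```python
import pandas as pd

def conditional_curves(x, group, y):
    """Mean of y at each x value, one column per group."""
    frame = pd.DataFrame({"x": x, "group": group, "y": y})
    return frame.pivot_table(index="x", columns="group", values="y", aggfunc="mean")

def slope_spread(curves):
    """Range of end-to-end slopes across groups; 0 means parallel lines."""
    rise = curves.iloc[-1] - curves.iloc[0]
    run = curves.index[-1] - curves.index[0]
    slopes = rise / run
    return float(slopes.max() - slopes.min())

# Made-up example: discounts help students more than professionals.
x_toy = [0, 10, 20, 30] * 2
group_toy = ["student"] * 4 + ["professional"] * 4
y_toy = [0.50, 0.56, 0.62, 0.68, 0.50, 0.51, 0.52, 0.53]

curves = conditional_curves(x_toy, group_toy, y_toy)
spread = slope_spread(curves)
print(curves)
print(f"slope spread: {spread:.4f}")
```

A spread near zero corresponds to parallel lines. For a continuous main feature the x values would need to be binned first, and an end-to-end slope is a crude summary that misses lines which cross and then return.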

---
# Summary: Why This Package Beats Native SHAP

## Unique Features (Not in SHAP)

1. ✅ **Blocky heatmaps** - Prediction probability surfaces with smart discrete handling
   - Auto-detects discrete features (<10 unique values)
   - Shows only actual values on axes (e.g., 0,10,20,30 not 0,5,10,15...)
   - Categorical labels (e.g., "student" not "2")
2. ✅ **3D surface plots** - Interactive prediction landscapes
3. ✅ **Conditional dependence** - Makes interactions visually obvious
4. ✅ **Categorical box plots** - Better than scatter for categoricals

## Better Implementation

5. ✅ **Interactive Plotly** - vs static matplotlib (hover, zoom, export)
6. ✅ **Simple API** - One Interpreter class vs multiple imports
7. ✅ **Preset configs** - Quick setup vs manual parameters
8. ✅ **Clean outputs** - DataFrames vs complex objects

## Business Value

- Shows **predictions** (Y) not just SHAP values
- Visualizations non-technical stakeholders understand
- Directly answers business questions
- Great for presentations and reports
- Blocky heatmaps make discrete features crystal clear

## When to Use This Package

- Business presentations
- Finding optimal feature combinations
- Understanding interactions clearly
- Interactive dashboards
- Stakeholder communication
- Rapid model interpretation
- Working with categorical or discrete features

## When to Use Native SHAP

- Academic papers, where static matplotlib figures are the norm
- Force plots for single predictions
- Advanced TreeSHAP features
- Non-tree models (we focus on tree models)

---

**Best approach:** Use both! This package for business insights, native SHAP for advanced features.