# Task 6: Interpretability & Insights

## Objective
This notebook interprets the best-performing model using:
1. SHAP (SHapley Additive exPlanations) for global and local interpretability
2. LIME (Local Interpretable Model-agnostic Explanations)
3. Feature importance analysis
4. Partial dependence plots
5. Business insights and actionable recommendations

---

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import shap
import lime
import lime.lime_tabular
import warnings
import joblib

from sklearn.inspection import permutation_importance, partial_dependence

warnings.filterwarnings('ignore')
shap.initjs()
print('Libraries imported successfully!')

## 1. Load Best Model and Data

In [None]:
# Load the best-performing model saved in Notebook 5 (example: XGBoost)
# From MLflow instead: best_model = mlflow.<framework>.load_model('runs:/<run_id>/model')
from pathlib import Path

model_path = Path('../models/best_model.pkl')
if model_path.exists():
    best_model = joblib.load(model_path)
    print(f'Model loaded: {type(best_model).__name__}')
else:
    print(f'Model file not found at {model_path}; save the best model in Notebook 5 first')

In [None]:
# Load the feature-engineered data (full dataset, not only the test split)
df = pd.read_pickle('../data/interim/bank_with_features.pkl')
print(f'Data loaded: {df.shape}')

## 2. Feature Importance Analysis

In [None]:
# For tree-based models, get feature importance
def plot_feature_importance(model, feature_names, top_n=20):
    if hasattr(model, 'feature_importances_'):
        importances = model.feature_importances_
        # Slicing caps the selection at the number of available features,
        # so use the actual count when drawing the bars
        indices = np.argsort(importances)[::-1][:top_n]
        n_shown = len(indices)
        
        plt.figure(figsize=(10, 8))
        plt.barh(range(n_shown), importances[indices])
        plt.yticks(range(n_shown), [feature_names[i] for i in indices])
        plt.xlabel('Feature Importance')
        plt.title(f'Top {n_shown} Feature Importances')
        plt.gca().invert_yaxis()
        plt.tight_layout()
        plt.savefig('../reports/figures/feature_importance.png', dpi=300, bbox_inches='tight')
        plt.show()
        
        return pd.DataFrame({'Feature': [feature_names[i] for i in indices],
                           'Importance': importances[indices]})
    else:
        print('Model does not have feature_importances_ attribute')
        return None

print('Feature importance plotting function defined')

## 3. SHAP Values - Global Explanations

In [None]:
# Create SHAP explainer
print('Creating SHAP explainer...')

# For tree-based models:
# explainer = shap.TreeExplainer(model)
# For any model:
# explainer = shap.KernelExplainer(model.predict_proba, X_train_sample)

print('Example: explainer = shap.TreeExplainer(best_model)')
print('         shap_values = explainer.shap_values(X_test)')
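
To make concrete what the explainer computes, here is a minimal sketch of exact Shapley values for a single row. It is not how `shap.TreeExplainer` works internally (that uses a tree-specific algorithm); it enumerates every feature subset, which is only feasible for a handful of features. `toy_model`, `exact_shapley`, and the synthetic background data are hypothetical names and values introduced for illustration.

```python
from itertools import combinations
from math import factorial

import numpy as np


def toy_model(X):
    """Hypothetical scoring function with an interaction term (not the notebook's model)."""
    X = np.atleast_2d(X)
    return 2.0 * X[:, 0] + X[:, 1] * X[:, 2]


def exact_shapley(predict, x, background):
    """Exact Shapley values for one row; absent features are averaged over the background."""
    n = len(x)

    def value(subset):
        # Fix the features in `subset` to the instance's values, keep the
        # background values for the rest, then average the predictions
        Z = background.copy()
        idx = list(subset)
        if idx:
            Z[:, idx] = x[idx]
        return predict(Z).mean()

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi, value(())


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))  # synthetic reference data
x = np.array([1.0, 2.0, -1.0])          # instance to explain

phi, base_value = exact_shapley(toy_model, x, background)
prediction = toy_model(x)[0]
print('Shapley values:', np.round(phi, 3))
print('base value + sum of Shapley values:', round(float(base_value + phi.sum()), 6))
print('model prediction:', round(float(prediction), 6))
```

The last two printed numbers agree: the base value plus the per-feature contributions reconstructs the prediction, which is the additivity property the SHAP plots in this notebook rely on.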

In [None]:
# SHAP Summary Plot
def plot_shap_summary(shap_values, X_test, feature_names):
    plt.figure(figsize=(12, 8))
    shap.summary_plot(shap_values, X_test, feature_names=feature_names, show=False)
    plt.tight_layout()
    plt.savefig('../reports/figures/shap_summary_plot.png', dpi=300, bbox_inches='tight')
    plt.show()

print('SHAP summary plot function defined')

In [None]:
# SHAP Bar Plot (mean absolute SHAP values)
def plot_shap_bar(shap_values, X_test, feature_names):
    plt.figure(figsize=(10, 8))
    shap.summary_plot(shap_values, X_test, plot_type='bar', 
                     feature_names=feature_names, show=False)
    plt.tight_layout()
    plt.savefig('../reports/figures/shap_bar_plot.png', dpi=300, bbox_inches='tight')
    plt.show()

print('SHAP bar plot function defined')
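
The bar plot ranks features by their mean absolute SHAP value. A minimal sketch of that aggregation, using a small hand-written SHAP matrix (the numbers and the names `feature_a` to `feature_c` are illustrative assumptions, not results from this project):

```python
import numpy as np

# Hypothetical SHAP matrix: 4 instances x 3 features (illustrative numbers only)
feature_names = ['feature_a', 'feature_b', 'feature_c']
shap_matrix = np.array([
    [0.30, -0.10, 0.02],
    [-0.25, 0.05, 0.01],
    [0.40, -0.20, -0.03],
    [-0.05, 0.15, 0.02],
])

# Global importance = average magnitude of each feature's contribution
mean_abs = np.abs(shap_matrix).mean(axis=0)
# A plain mean lets positive and negative contributions cancel out
signed_mean = shap_matrix.mean(axis=0)

order = np.argsort(mean_abs)[::-1]
for i in order:
    print(f'{feature_names[i]:<10} mean|SHAP|={mean_abs[i]:.3f}  mean SHAP={signed_mean[i]:+.3f}')
```

Taking absolute values first matters: a feature that pushes some predictions up and others down can have a signed mean near zero while still being highly influential.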

## 4. SHAP Values - Local Explanations

In [None]:
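# Explain individual predictions
def explain_prediction(explainer, X_instance, feature_names, instance_id, class_idx=1):
    # X_instance must be 2-D with one row, e.g. X_test[[i]] or X_test.iloc[[i]]
    shap_values_instance = explainer.shap_values(X_instance)
    base_value = explainer.expected_value

    # Classifiers can return per-class SHAP values (a list or a 3-D array,
    # depending on the SHAP version); keep only the class of interest
    if isinstance(shap_values_instance, list):
        values = np.asarray(shap_values_instance[class_idx])[0]
    else:
        values = np.asarray(shap_values_instance)[0]
        if values.ndim == 2:
            values = values[:, class_idx]
    if np.ndim(base_value) > 0:
        base_value = np.ravel(base_value)
        base_value = base_value[class_idx] if base_value.size > 1 else base_value[0]

    plt.figure(figsize=(10, 6))
    shap.waterfall_plot(shap.Explanation(values=values,
                                         base_values=base_value,
                                         data=np.asarray(X_instance)[0],
                                         feature_names=feature_names),
                       show=False)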
    plt.title(f'SHAP Waterfall Plot - Instance {instance_id}')
    plt.tight_layout()
    plt.savefig(f'../reports/figures/shap_waterfall_instance_{instance_id}.png', 
                dpi=300, bbox_inches='tight')
    plt.show()

print('Local explanation function defined')

## 5. LIME Explanations

In [None]:
# LIME explainer
def explain_with_lime(model, X_train, X_instance, feature_names, class_names, instance_id):
    explainer = lime.lime_tabular.LimeTabularExplainer(
        np.asarray(X_train),  # LIME expects a NumPy array, not a DataFrame
        feature_names=feature_names,
        class_names=class_names,
        mode='classification'
    )
    
    explanation = explainer.explain_instance(
        np.asarray(X_instance)[0],  # first (only) row as a 1-D array
        model.predict_proba,
        num_features=10
    )
    
    # Save explanation
    explanation.save_to_file(f'../reports/figures/lime_explanation_instance_{instance_id}.html')
    
    # Plot
    fig = explanation.as_pyplot_figure()
    plt.tight_layout()
    plt.savefig(f'../reports/figures/lime_plot_instance_{instance_id}.png', 
                dpi=300, bbox_inches='tight')
    plt.show()
    
    return explanation

print('LIME explanation function defined')
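
The idea behind LIME can be sketched in a few lines: perturb the instance, weight the perturbed points by how close they are to it, and fit a weighted linear model to the black-box outputs. The sketch below is a simplified illustration in plain NumPy, not the `lime` library's implementation (which, among other things, discretizes features and selects a subset of them). `black_box`, `local_surrogate`, and all parameter values are hypothetical.

```python
import numpy as np


def black_box(X):
    """Hypothetical nonlinear probability model standing in for predict_proba[:, 1]."""
    X = np.atleast_2d(X)
    return 1.0 / (1.0 + np.exp(-(3.0 * X[:, 0] - X[:, 1] ** 2)))


def local_surrogate(predict, x, n_samples=2000, scale=0.5, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around x; returns (intercept, coefficients)."""
    rng = np.random.default_rng(seed)
    # 1. Perturb the instance with Gaussian noise
    Z = x + rng.normal(scale=scale, size=(n_samples, x.size))
    y = predict(Z)
    # 2. Weight each sample by its proximity to the instance (exponential kernel)
    dist = np.linalg.norm(Z - x, axis=1)
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 3. Weighted least squares on centred inputs: y ~ intercept + coef . (z - x)
    A = np.column_stack([np.ones(n_samples), Z - x])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return beta[0], beta[1:]


x = np.array([0.2, 1.0, 0.0])  # instance to explain; the third feature is unused by black_box
intercept, coef = local_surrogate(black_box, x)
for name, c in zip(['x0', 'x1', 'x2'], coef):
    print(f'{name}: local weight {c:+.3f}')
```

The coefficients describe the model's behaviour only in the neighbourhood of `x`; a different instance can yield different signs and magnitudes, which is why LIME explanations are local.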

## 6. Permutation Importance

In [None]:
# Permutation importance
def calculate_permutation_importance(model, X_test, y_test, feature_names):
    result = permutation_importance(model, X_test, y_test, 
                                   n_repeats=10, random_state=42, n_jobs=-1)
    
    # Sort by importance
    sorted_idx = result.importances_mean.argsort()[::-1][:20]
    
    plt.figure(figsize=(10, 8))
    plt.barh(range(len(sorted_idx)), result.importances_mean[sorted_idx])
    plt.yticks(range(len(sorted_idx)), [feature_names[i] for i in sorted_idx])
    plt.xlabel('Permutation Importance')
    plt.title(f'Top {len(sorted_idx)} Features by Permutation Importance')
    plt.gca().invert_yaxis()
    plt.tight_layout()
    plt.savefig('../reports/figures/permutation_importance.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    return result

print('Permutation importance function defined')
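
As a rough sketch of what `permutation_importance` measures: shuffle one column at a time and record how much the score drops. The version below uses accuracy and plain NumPy; `permutation_importance_sketch`, `rule_model`, and the synthetic data are hypothetical stand-ins, and scikit-learn's implementation offers more (custom scorers, parallelism).

```python
import numpy as np


def permutation_importance_sketch(predict, X, y, n_repeats=10, seed=42):
    """Mean and std of the accuracy drop when each column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean(predict(X) == y)
    drops = np.zeros((X.shape[1], n_repeats))
    for j in range(X.shape[1]):
        for r in range(n_repeats):
            X_perm = X.copy()
            # Shuffling breaks the link between feature j and the target
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            drops[j, r] = baseline - np.mean(predict(X_perm) == y)
    return drops.mean(axis=1), drops.std(axis=1)


def rule_model(X):
    """Hypothetical classifier that only looks at the first feature."""
    return (X[:, 0] > 0).astype(int)


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))   # synthetic features
y = (X[:, 0] > 0).astype(int)   # target depends on feature 0 only

importance_mean, importance_std = permutation_importance_sketch(rule_model, X, y)
for j, (m, s) in enumerate(zip(importance_mean, importance_std)):
    print(f'feature {j}: accuracy drop {m:.3f} +/- {s:.3f}')
```

Features the model ignores show a drop of exactly zero here, while the one feature it uses loses roughly half its accuracy. On imbalanced targets, accuracy can understate importance, so a different metric may be a better choice.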

## 7. Partial Dependence Plots

In [None]:
# Partial dependence plots
def plot_partial_dependence(model, X, features, feature_names):
    from sklearn.inspection import PartialDependenceDisplay
    
    fig, ax = plt.subplots(figsize=(15, 10))
    display = PartialDependenceDisplay.from_estimator(
        model, X, features, feature_names=feature_names,
        ax=ax, n_cols=3, grid_resolution=50
    )
    plt.tight_layout()
    plt.savefig('../reports/figures/partial_dependence_plots.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    # Return the display so the underlying axes and values can be reused
    return display

print('Partial dependence plot function defined')
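
A partial dependence curve can be computed by hand: for each grid value, overwrite one feature in every row, predict, and average. The sketch below assumes a NumPy feature matrix and uses a hypothetical `toy_predict` function; unlike scikit-learn's default, which spans the 5th to 95th percentiles, the grid here runs from the column minimum to the maximum.

```python
import numpy as np


def partial_dependence_1d(predict, X, feature, grid_resolution=50):
    """Average prediction as one feature is swept over a grid of values."""
    grid = np.linspace(X[:, feature].min(), X[:, feature].max(), grid_resolution)
    averaged = np.empty(grid_resolution)
    for k, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value          # force the feature to the grid value
        averaged[k] = predict(X_mod).mean()  # marginalize over the other features
    return grid, averaged


def toy_predict(X):
    """Hypothetical model: quadratic in feature 0, linear in feature 1."""
    return X[:, 0] ** 2 + X[:, 1]


rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))  # synthetic data

grid, pd_values = partial_dependence_1d(toy_predict, X, feature=0)
print('grid range:', round(float(grid[0]), 3), 'to', round(float(grid[-1]), 3))
print('minimum of the curve is near x0 =', round(float(grid[np.argmin(pd_values)]), 3))
```

For this additive toy model the curve is the quadratic shifted by the mean of the other feature. With strongly correlated features, forcing a feature to a grid value can create unrealistic rows, so such curves should be read with care.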

## 8. Business Insights

### Key Findings:

1. **Most Influential Features**:
   - List top 5-10 features based on SHAP/importance analysis
   - Interpret their business meaning

2. **Feature Interactions**:
   - Identify important feature combinations
   - Explain synergistic effects

3. **Client Segmentation**:
   - High-value segments more likely to subscribe
   - Low-propensity segments to deprioritize

4. **Optimal Contact Strategy**:
   - Best timing for contacts
   - Optimal frequency
   - Preferred communication channels

5. **Economic Context Impact**:
   - Effect of economic indicators
   - Influence of market conditions

## 9. Actionable Recommendations

### Marketing Strategy Recommendations:

1. **Target Prioritization**:
   - Focus on clients with high predicted probability
   - Characteristics of high-conversion clients

2. **Contact Optimization**:
   - Optimal number of contacts
   - Best times to call (day/month)
   - Personalize approach based on client profile

3. **Resource Allocation**:
   - Allocate more resources to high-potential segments
   - Reduce efforts on low-conversion segments

4. **Product Design**:
   - Tailor term deposit products to different segments
   - Pricing strategies based on client characteristics

5. **Risk Management**:
   - Monitor economic indicators
   - Adjust campaigns based on market conditions

## Summary

This notebook covered model interpretability through:

✅ Feature importance analysis  
✅ SHAP global and local explanations  
✅ LIME for individual predictions  
✅ Permutation importance  
✅ Partial dependence plots  
✅ Business insights derived from model  
✅ Actionable recommendations for marketing  

---

**Proceed to Notebook 7 for Critical Reflection**