<h1>
<center>Interpretability</center>
</h1>

<font size="3"> 
This notebook explores interpretability of machine learning models using white box (ridge regression) and black box (random forest regression) approaches on the diabetes dataset. It showcases local and global interpretability techniques, including feature importance analysis and SHAP values, highlighting instances where the black box model outperforms the white box model in prediction accuracy.
</font>

## General Setup

<font size="3"> 
Package imports and system configuration.
</font>

In [None]:
import pandas as pd
import numpy as np
from sklearn import datasets
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import mean_absolute_error
from ipywidgets import interactive
import shap

<h2>
<center>Main Functionality</center>
</h2>

## Data Preprocessing

<font size="3"> 
Split features and target.
</font>

In [None]:
def splits_x_y(df):
    x = df.data
    y = df.target
    return x, y

<font size="3"> 
Split train-test sets.
</font>

In [None]:
def split_test_train(df):
    x, y = splits_x_y(df)
    return train_test_split(x, y, test_size=0.25, random_state=42)
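
<font size="3"> 
As a rough illustration of what <code>test_size=0.25</code> means in practice, the sketch below splits placeholder arrays with the same shape as the diabetes data (442 rows, 10 features). The zero-filled arrays are stand-ins, not the real dataset.
</font>

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays shaped like the diabetes data (442 samples, 10 features).
x = np.zeros((442, 10))
y = np.zeros(442)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

# The test share is rounded up, so 442 rows give 111 test and 331 train rows.
print(x_train.shape, x_test.shape)
```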

<font size="3"> 
Feature scaling (min-max, fitted on the training set only).
</font>

In [None]:
def scaller(train, test):
    # Fit the min-max scaler on the training set only, then reuse it on the test set
    min_max_scaler = preprocessing.MinMaxScaler()
    scaled_train_data = min_max_scaler.fit_transform(train)
    scaled_test_data = min_max_scaler.transform(test)
    return scaled_train_data, scaled_test_data
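
<font size="3"> 
A minimal sketch of what min-max scaling does, written with plain NumPy on made-up numbers. The minimum and maximum come from the training set only, so scaled test values can fall outside the [0, 1] range.
</font>

```python
import numpy as np

# Toy data (hypothetical values, two features).
train = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])
test = np.array([[2.0, 40.0], [6.0, 5.0]])

# Statistics are learned from the training set only.
col_min = train.min(axis=0)
col_max = train.max(axis=0)

scaled_train = (train - col_min) / (col_max - col_min)
scaled_test = (test - col_min) / (col_max - col_min)

print(scaled_train)  # every column spans exactly [0, 1]
print(scaled_test)   # values may be below 0 or above 1
```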

<font size="3"> 
Data preprocessing pipeline.
</font>

In [None]:
def get_data_pipeline():
    diabetes = datasets.load_diabetes()
    feature_names = diabetes['feature_names']
    x_train, x_test, y_train, y_test =  split_test_train(diabetes)
    x_train, x_test = scaller(x_train, x_test)       
    return x_train, x_test, y_train, y_test, feature_names

## Models Training

<font size="3"> 
The aim of the function below is to train a white box model and evaluate its performance.
<br>
<br>
The function performs the following steps:
<ol>
<li>Fit the model using the training data.</li>
<li>Predict the target values for both the training and test data.</li>
<li>Print the mean absolute error (MAE) performance metrics for the training and test sets.</li>
<li>Return the trained model and the predicted test values.</li>
</ol>
</font>

In [None]:
def train_white_box_model(model,x_train,x_test,y_train,y_test):
    model.fit(x_train, y_train)
    predicted_train = model.predict(x_train)
    predicted_test = model.predict(x_test)
    print("Ridge Regression Model Performance:")
    print("MAE in Train Set", round(mean_absolute_error(y_train, predicted_train), 3))
    print("MAE in Test Set", round(mean_absolute_error(y_test, predicted_test), 3))
    return model, predicted_test
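
<font size="3"> 
For reference, the MAE printed above is the average of the absolute differences between true and predicted values. The sketch below computes it by hand on made-up numbers.
</font>

```python
import numpy as np

# Hypothetical true and predicted targets.
y_true = np.array([100.0, 150.0, 200.0])
y_pred = np.array([110.0, 140.0, 230.0])

# Absolute error per sample: 10, 10 and 30.
absolute_errors = np.abs(y_true - y_pred)

# MAE is their mean: 50 / 3.
mae = absolute_errors.mean()
print(round(mae, 3))
```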

<font size="3"> 
The aim of the function below is to train a black box model (Random Forest Regression) with optimal hyperparameters and evaluate its performance.
<br>
<br>
The function performs the following steps:
<ol>
<li>Perform a grid search using cross-validation to find the best hyperparameters based on negative mean squared error scoring.</li>
<li>Fit the optimal model to the training data and predict the target values for both the training and test data.</li>
<li>Print the mean absolute error (MAE) performance metrics for the training and test sets.</li>
<li>Return the trained model and the predicted test values.</li>
</ol>
</font>

In [None]:
def train_black_box_model(param_grid, x_train, x_test, y_train, y_test):
    model = RandomForestRegressor(random_state=42)
    grid_search = GridSearchCV(model, param_grid, cv=5, scoring='neg_mean_squared_error')
    grid_search.fit(x_train, y_train)
    print("RandomForestRegressor Best Hyperparameters:", grid_search.best_params_)

    # Fit a fresh forest with the best hyperparameters found by the grid search
    optimal_model = RandomForestRegressor(**grid_search.best_params_, random_state=42)
    optimal_model.fit(x_train, y_train)
    predicted_train = optimal_model.predict(x_train)
    predicted_test = optimal_model.predict(x_test)
    print("RandomForest Regression Model Performance:")
    print("MAE in Train Set", round(mean_absolute_error(y_train, predicted_train), 3))
    print("MAE in Test Set", round(mean_absolute_error(y_test, predicted_test), 3))
    return optimal_model, predicted_test
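
<font size="3"> 
A small sketch of the grid search step on synthetic data, assuming a deliberately tiny grid so it runs quickly. With <code>scoring='neg_mean_squared_error'</code> the scores are negated errors, so the best score is the one closest to zero. The dataset and grid here are illustrative and not the ones used in this notebook.
</font>

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data (hypothetical, for illustration only).
x, y = make_regression(n_samples=120, n_features=5, noise=10.0, random_state=0)

# Tiny illustrative grid: 2 x 2 = 4 candidate settings.
param_grid = {"n_estimators": [10, 20], "max_depth": [2, 4]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
search.fit(x, y)

# best_estimator_ is already refitted on all of x, y with the best settings.
best_model = search.best_estimator_
print("Best parameters:", search.best_params_)
print("Cross-validated MSE:", -search.best_score_)
```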

## Interpretability

<font size="3"> 
The aim of the function below is to perform global interpretation of a white box model (Ridge Regression) by analyzing the feature weights.
<br>
<br>
The function performs the following steps:
<ol>
<li>Extract the feature weights from the linear regression model.</li>
<li>Remove rows where weights are zero.</li>
<li>Print the number of features included in the interpretation.</li>
<li>Display a bar plot of feature weights.</li>
<li>Print the intercept (bias) value of the linear model.</li>
</ol>
</font>

In [None]:
def global_interpretation_white_box(lin_model, feature_names):
    weights = lin_model.coef_
    model_weights = pd.DataFrame({'features': list(feature_names), 'weights': list(weights)})
    model_weights = model_weights.reindex(model_weights['weights'].abs().sort_values(ascending=False).index) # Sort by absolute value
    model_weights = model_weights[(model_weights["weights"] != 0)]
    print("Number of features:", len(model_weights.values))
    print("\nModel Weights : \n", model_weights)
    plt.figure(num=None, figsize=(8, 6), dpi=100, facecolor='w', edgecolor='k')
    sns.barplot(x="weights", y="features", data=model_weights)
    plt.title("Intercept (Bias): " + str(lin_model.intercept_), loc='right')
    plt.xticks(rotation=90)
    plt.show()
    return model_weights
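
<font size="3"> 
Comparing weights across features only makes sense when the features share a common scale, which is why the data is scaled before training. The sketch below uses ordinary least squares (not ridge) on a made-up feature to show that changing a feature's unit changes its weight by the same factor, even though the model's predictions are identical.
</font>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature in metres and a noise-free linear target.
height_m = rng.uniform(1.5, 2.0, size=50)
y = 3.0 * height_m + 1.0

def fit_slope(feature, target):
    """Least-squares fit of target = slope * feature + intercept; returns the slope."""
    design = np.column_stack([feature, np.ones_like(feature)])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0]

slope_m = fit_slope(height_m, y)             # feature in metres
slope_mm = fit_slope(height_m * 1000.0, y)   # same feature in millimetres

# The weight shrinks by the factor the feature grew by.
print(slope_m, slope_mm)
```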

<font size="3"> 
The aim of the function below is to perform global interpretation of a black box model (Random Forest Regression) using SHAP values.
<br>
<br>
The function performs the following steps:
<ol>
<li>Initialize a SHAP TreeExplainer object with the black box model and feature names.</li>
<li>Compute the SHAP values for the test data.</li>
<li>Print the feature weights (SHAP values).</li>
<li>Return the DataFrame, SHAP library, explainer object, and SHAP values.</li>
</ol>
</font>

In [None]:
def global_interpretation_black_box(model, x_test, feature_names):
    explainer = shap.TreeExplainer(model, feature_names=list(feature_names))
    shap_values = explainer.shap_values(x_test)
    # Aggregate over all test instances: mean absolute SHAP value per feature
    mean_abs_shap_values = np.abs(shap_values).mean(axis=0)
    # Create a DataFrame with feature names and corresponding aggregated SHAP values
    shap_df = pd.DataFrame({'Feature': feature_names, 'Mean |SHAP Value|': mean_abs_shap_values})
    shap_df = shap_df.sort_values(by='Mean |SHAP Value|', ascending=False)
    # Print the feature weights
    print("Feature Weights:\n",shap_df)
    return shap_df, shap, explainer, shap_values
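
<font size="3"> 
One common way to turn per-instance SHAP values into a global ranking is the mean absolute SHAP value per feature. The sketch below uses a small made-up SHAP matrix (rows are instances, columns are features) and shows why the absolute value matters: signed values can cancel out across instances.
</font>

```python
import numpy as np

feature_names = ["bmi", "s5", "bp"]

# Hypothetical SHAP values: one row per instance, one column per feature.
shap_matrix = np.array([
    [10.0, -4.0, 1.0],
    [-12.0, 6.0, -1.0],
    [8.0, -5.0, 1.0],
])

# Signed means partly cancel and understate a feature's influence.
signed_mean = shap_matrix.mean(axis=0)

# Mean absolute SHAP value per feature: a global importance score.
global_importance = np.abs(shap_matrix).mean(axis=0)

# Rank features from most to least important.
ranking = [feature_names[i] for i in np.argsort(-global_importance)]
print(ranking, global_importance)
```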

<font size="3"> 
The aim of the function below is to use the test set to perform local interpretation of a white box model (Ridge Regression) for a specific instance.
<br>
<br>
The function performs the following steps:
<ol>
<li>Retrieve the selected instance from the test data.</li>
<li>Compute the weighted sum of features for the instance using the model's coefficients.</li>
<li>Compute the overall result by adding the weighted sum to the model's intercept (bias).</li>
<li>Print the number of features considered.</li>
<li>Visualize the feature weights using a bar plot.</li>
</ol>
</font>

In [None]:
def local_interpretation_white_box(lin_model, x_test, y_test, predicted_test, instance, feature_names):
    selected_instance = x_test[instance]
    print("Original Value:", y_test[instance], ", Predicted Value:", round(predicted_test[instance],2))
    # coef_ is 1-D for a single-target model; ravel() also covers a (1, n_features) array
    weights = np.ravel(lin_model.coef_)
    # Per-feature contribution to the prediction: weight * feature value
    contributions = weights * selected_instance
    summation = float(contributions.sum())
    bias = float(np.ravel(lin_model.intercept_)[0])
    result = summation + bias
    print("Sum(weights * instance):", round(summation,2), "+ Intercept (Bias):", round(bias,2), "=", round(result,2))

    model_weights = pd.DataFrame({'features': list(feature_names), 'weights*values': list(contributions)})
    model_weights = model_weights.reindex(model_weights['weights*values'].abs().sort_values(ascending=False).index) # Sort by absolute value
    model_weights = model_weights[(model_weights["weights*values"] != 0)]
    print("Number of features:", len(model_weights.values))
    print("\nModel Weights : \n", model_weights)
    plt.figure(num=None, figsize=(8, 6), dpi=100, facecolor='w', edgecolor='k')
    sns.barplot(x="weights*values", y="features", data=model_weights)
    plt.xticks(rotation=90)
    plt.show()
    return model_weights
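
<font size="3"> 
The local explanation of a linear model is an exact decomposition: the prediction equals the intercept plus the sum of each weight times its feature value. A minimal sketch with made-up weights and a made-up instance:
</font>

```python
import numpy as np

# Hypothetical linear model and instance (three features).
weights = np.array([2.0, -1.0, 0.5])
intercept = 10.0
instance = np.array([0.5, 0.2, 1.0])

# Contribution of each feature to this one prediction: 1.0, -0.2 and 0.5.
contributions = weights * instance

# Intercept plus all contributions reproduces the model output.
prediction = intercept + contributions.sum()
print(contributions, prediction)
```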

<font size="3"> 
The aim of the function below is to use the test set to perform local interpretation of a black box model (Random Forest Regression) for a specific instance.
<br>
<br>
The function performs the following steps:
<ol>
<li>Create a SHAP TreeExplainer object for the given model and feature names.</li>
<li>Compute the SHAP values for the specified instance using the explainer.</li>
<li>Print the feature weights (SHAP values).</li>
<li>Return the DataFrame of feature weights, along with the SHAP objects (explainer and SHAP values).</li>
</ol>
</font>

In [None]:
def local_interpretation_black_box(model, x_test, instance, feature_names):
    explainer = shap.TreeExplainer(model, feature_names=list(feature_names))
    shap_values = explainer.shap_values(x_test[instance:instance+1])
    # Reshape the SHAP values to match the expected shape
    reshape_shap_values = shap_values[0].reshape(-1)
    # Create a DataFrame with feature names and corresponding SHAP values
    shap_df = pd.DataFrame({'Feature': feature_names, 'SHAP Value': reshape_shap_values})
    shap_df = shap_df.sort_values(by='SHAP Value', key=lambda x: abs(x), ascending=False)
    # Print the feature weights
    print("Feature Weights:\n",shap_df)
    return shap_df, shap, explainer, shap_values
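
<font size="3"> 
To give some intuition for what a SHAP value is, the sketch below computes exact Shapley values by brute force for a tiny hypothetical model, using a single all-zero baseline as the reference point. This is not how <code>TreeExplainer</code> works internally (it uses an algorithm specialised for trees), but it shows the additivity property: the attributions sum to the difference between the prediction and the baseline output.
</font>

```python
from itertools import permutations
from math import factorial

def model(x):
    # Hypothetical model with an interaction between features 1 and 2.
    return 2.0 * x[0] + x[1] * x[2]

def shapley_values(f, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(instance)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        previous_output = f(current)
        for i in order:
            # Switch feature i from its baseline value to its actual value.
            current[i] = instance[i]
            new_output = f(current)
            phi[i] += new_output - previous_output
            previous_output = new_output
    return [value / factorial(n) for value in phi]

instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, instance, baseline)

# Feature 0 gets its own linear effect; features 1 and 2 share the interaction equally.
print(phi)
print(sum(phi), model(instance) - model(baseline))
```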

<h2>
<center>Pipeline Execution</center>
</h2>

<font size="3"> 
Global variables.
</font>

In [None]:
rf_param_grid = {'n_estimators': [200, 220, 250], 'max_depth': [1, 5, 10], 'min_samples_split': [2, 5, 10]}
white_box_model = Ridge(solver ='sag', tol = 2, random_state=42)
instance = 10
shap.initjs()
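
<font size="3"> 
As a quick sanity check on the cost of the search, the sketch below counts the hyperparameter combinations in the grid above and the number of cross-validation fits they imply, assuming the 5 folds used in the training function.
</font>

```python
from itertools import product

rf_param_grid = {
    'n_estimators': [200, 220, 250],
    'max_depth': [1, 5, 10],
    'min_samples_split': [2, 5, 10],
}

# Every combination of one value per hyperparameter.
combinations = list(product(*rf_param_grid.values()))
n_candidates = len(combinations)

# Each candidate is fitted once per cross-validation fold.
cv_folds = 5
n_cv_fits = n_candidates * cv_folds

print(n_candidates, "candidates,", n_cv_fits, "cross-validation fits")
```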

### Train Models

In [None]:
x_train, x_test, y_train, y_test, feature_names = get_data_pipeline()
print('Train and Evaluate white-box model!')
white_box_model, white_box_predicted_test = train_white_box_model(white_box_model,x_train,x_test,y_train,y_test)
print('\nTrain and Evaluate black-box model!')
black_box_model, black_box_predicted_test = train_black_box_model(rf_param_grid,x_train,x_test,y_train,y_test)

### Global Interpretability

<font size="3"> 
White-box Model (Ridge Regression)
</font>

In [None]:
print('White box global weights!')
white_box_global_weights = global_interpretation_white_box(white_box_model, feature_names)

<font size="3"> 
Black-box Model (Random Forest)
</font>

In [None]:
print('Black box global weights!')
shap_global_df,shap_global,explainer_global,shap_values_global = global_interpretation_black_box(black_box_model, 
                                                           x_test, feature_names)
shap_global.summary_plot(shap_values_global, x_test, feature_names=list(feature_names))

### Local Interpretability (test sample 10)

<font size="3"> 
White-box Model (Ridge Regression)
</font>

In [None]:
print('White box local weights!')
white_box_local_weights = local_interpretation_white_box(white_box_model,x_test,y_test,white_box_predicted_test,instance,feature_names)

<font size="3"> 
Black-box Model (Random Forest)
</font>

In [None]:
print('Black box local weights!')
shap_local_df,shap_local,explainer_local,shap_values_local = local_interpretation_black_box(black_box_model, 
                                                           x_test, instance, feature_names)
shap_local.force_plot(explainer_local.expected_value, shap_values_local[0], 
                      x_test[instance:instance+1], feature_names=list(feature_names))

<h2>
<center>Interpretability Analysis and Performance Comparison</center>
</h2>

<font size="3"> 
<b>Performance Comparison:</b>
The models are compared on Mean Absolute Error (MAE). The white-box model (Ridge Regression) achieves an MAE of <b>48.424</b> on the test set, while the black-box model (Random Forest Regression) achieves a lower MAE of <b>43.752</b>. The black-box model outperforms the white-box model, with lower MAE on both the training and test sets.

<b>Global Interpretability:</b>
For global interpretability, the feature weights are examined. Surprisingly, both models highlight the same top two important features, namely BMI and s5. However, the white-box model assigns higher weights to these features, indicating their stronger influence on predictions.

<b>Local Interpretability:</b>
Analyzing the feature weights for specific instances at the local level, the white-box model consistently assigns higher weights to features s1, s3, and s2 across multiple instances. For example, in one instance with an original value of 94.0 and a predicted value of 133.89, the weights of s1, s3, and s2 contribute significantly to the prediction with values of 6.62, 6.25, and 5.59, respectively. In contrast, the black-box model exhibits more varied feature importance across instances, with varying weights for features such as BMI, s5, and s3.

<b>Comparison of Interpretability:</b>
The comparison of interpretability between the models reveals interesting nuances. While they share common important features at the global level, the white-box model provides more consistent and predictable feature importance at the local level, whereas the black-box model exhibits greater variability.

<b>Conclusion:</b>
The analysis underscores the trade-off between interpretability and performance. Although the black-box model outperforms the white-box model in terms of predictive accuracy, the white-box model offers more consistent interpretability both globally and locally. Understanding this trade-off is crucial for choosing the appropriate model based on the specific requirements of the task at hand.
</font>