# Model Explainability with SHAP: A Guide for Those Who Are Serious About ML
![](images/pexels.jpg)
<figcaption style="text-align: center;">
    <strong>
        Photo by 
        <a href='https://www.pexels.com/@iriser?utm_content=attributionCopyText&utm_medium=referral&utm_source=pexels'>Irina Iriser</a>
        on 
        <a href='https://www.pexels.com/photo/blue-and-red-jellyfish-artwork-1086583/?utm_content=attributionCopyText&utm_medium=referral&utm_source=pexels'>Pexels</a>
    </strong>
</figcaption>

# Setup

In [1]:
import logging
import time
import warnings

import catboost as cb
import datatable as dt
import joblib
import lightgbm as lgbm
import matplotlib.pyplot as plt
import numpy as np
import optuna
import pandas as pd
import seaborn as sns
import xgboost as xgb
from optuna.samplers import TPESampler
from sklearn.compose import (
    ColumnTransformer,
    make_column_selector,
    make_column_transformer,
)
from sklearn.impute import SimpleImputer
from sklearn.metrics import log_loss, mean_squared_error
from sklearn.model_selection import (
    KFold,
    StratifiedKFold,
    cross_validate,
    train_test_split,
)
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

logging.basicConfig(
    format="%(asctime)s - %(message)s", datefmt="%d-%b-%y %H:%M:%S", level=logging.INFO
)
optuna.logging.set_verbosity(optuna.logging.WARNING)
warnings.filterwarnings("ignore")
pd.set_option("float_format", "{:.5f}".format)

In [11]:
# For regression
diamonds = sns.load_dataset("diamonds")

X, y = diamonds.drop("price", axis=1), diamonds[["price"]].values.flatten()

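# Encode categorical features as ordinal codes.
# Assign with X[cats] rather than X.loc[:, cats]: recent pandas versions set
# values in place with .loc and refuse to write new values into category columns.
oe = OrdinalEncoder()
cats = X.select_dtypes(exclude=np.number).columns.tolist()
X[cats] = oe.fit_transform(X[cats])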

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.15, random_state=1121218
)

In [10]:
# For classification - HIDE
diamonds = sns.load_dataset("diamonds")

X, y = diamonds.drop("cut", axis=1), diamonds[["cut"]].values.flatten()

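# Encode categorical features as ordinal codes.
# Assign with X[cats] rather than X.loc[:, cats]: recent pandas versions set
# values in place with .loc and refuse to write new values into category columns.
oe = OrdinalEncoder()
cats = X.select_dtypes(exclude=np.number).columns.tolist()
X[cats] = oe.fit_transform(X[cats])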

X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.15, random_state=1121218
)

# Motivation

Today, you can't just walk up to your boss and say, "Here is my best model. Let's put it into production and be happy!" It doesn't work that way anymore. Companies are cautious about adopting AI solutions because of their "black box" nature. They demand explainability.

If ML specialists are building tools to understand and explain the models *they* created, then the concerns and suspicions of others about AI solutions are entirely justified. One of those tools, introduced a few years ago, is SHAP. It can break down the mechanics of any machine learning model, including deep neural nets, by attributing each prediction to the features that drove it, in a way that technical and non-technical audiences alike can follow.

In this guide, we will learn exactly how SHAP works and how you can use it in your own work.