---

# Interpretable Machine Learning: Shapley Values
#### United Lunch & Learn: June 6, 2019

_Author: Carleton Smith_

Many of the examples and sources cited in this tutorial came from these two excellent podcast episodes by [_Linear Digressions_](http://lineardigressions.com/):

- ["Game Theory for model interpretability: Shapley Values"](http://lineardigressions.com/episodes/2018/5/6/game-theory-for-model-interpretability-shapley-values)
- ["SHAP: Shapley Values in Machine Learning"](http://lineardigressions.com/episodes/2018/5/13/shap-shapley-values-in-machine-learning)

---

<a id='top'></a>
## Tutorial Outline

- [Install Packages](#install)
- [Imports](#imports)
- [Brief Introduction](#intro)
- [Why Do We Care About Interpretable Machine Learning?](#interpret)
- [What Are Shapley Values?](#shapley-values)
- [Acquire data](#acquire)
- [Quick Preprocessing/EDA](#eda-preprocessing)
- [Preprocessing Pipeline](#pipeline)
- [Create a Model: RandomForest](#model)
- [Shapley Values in Action](#in-action)
    - [Visualize Results](#visualize)

---
<a id='install'></a>
## Install Packages

In [2]:
import sys
!conda install -yc conda-forge --prefix {sys.prefix} shap

Collecting package metadata: ...working... done
Solving environment: ...working... 
  - anaconda::ca-certificates-2018.03.07-0, anaconda::certifi-2018.11.29-py37_0, anaconda::openssl-1.1.1a-he774522_0
  - anaconda::ca-certificates-2018.03.07-0, anaconda::openssl-1.1.1a-he774522_0, defaults::certifi-2018.11.29-py37_0
  - anaconda::certifi-2018.11.29-py37_0, anaconda::openssl-1.1.1a-he774522_0, defaults::ca-certificates-2018.03.07-0
  - anaconda::openssl-1.1.1a-he774522_0, defaults::ca-certificates-2018.03.07-0, defaults::certifi-2018.11.29-py37_0
  - anaconda::certifi-2018.11.29-py37_0, defaults::ca-certificates-2018.03.07-0, defaults::openssl-1.1.1a-he774522_0
  - defaults::ca-certificates-2018.03.07-0, defaults::certifi-2018.11.29-py37_0, defaults::openssl-1.1.1a-he774522_0
  - anaconda::ca-certificates-2018.03.07-0, anaconda::certifi-2018.11.29-py37_0, defaults::openssl-1.1.1a-he774522_0
  - anaconda::ca-certificates-2018.03.07-0, defaults::certifi-2018.11.29-py37_0, defaults::openss

<a id='imports'></a>
---
## Import Packages

In [3]:
import os
import shap
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler, OneHotEncoder, OrdinalEncoder, LabelEncoder
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.preprocessing import FunctionTransformer

plt.rcParams['figure.figsize'] = (6, 4)
plt.rcParams['font.size'] = 12
plt.style.use("fivethirtyeight")

[Back to top](#top)

<a id='intro'></a>
---
## Brief Introduction

Before demonstrating how to use Shapley Values for machine learning, let's discuss what they are in the first place. This explanation is based on these two papers:

1. [A Unified Approach to Interpreting Model Predictions](http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf)
2. [Consistent Individualized Feature Attribution for Tree Ensembles](https://arxiv.org/pdf/1802.03888.pdf)

#### TL;DR

Shapley Values originated in game theory and are named after famed mathematician and Nobel Prize winner, [Lloyd Shapley](https://en.wikipedia.org/wiki/Lloyd_Shapley). The purpose of Shapley Values from a game theory perspective is to solve the problem of assigning appropriate credit to individual players in a cooperative game. In recent years, machine learning researchers have adopted Shapley Values to assign "credit" to features for predictions produced by a complex model.


[Back to top](#top)

<a id='interpret'></a>
## Why do we care about interpretable machine learning?

There are many reasons.

- Establish trust in the model
- Better understand underlying processes
- Provide insights in how to improve the model
- Ethics and fairness

In addition to Shapley Values, several methods already exist on the market for this purpose:

- LIME
- DeepLIFT
- Layer-Wise Relevance Propagation

What sets Shapley Values apart (according to the [the first paper listed above](http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf)) is that Shapely values are the only one of these method that satisfy all three of the following feature importance quality requirements:

1. **Local Accuracy** - a simple model explaining a complex one around a particular point should produce the same output given the same inputs
2. **Missingness** - if a feature is missing in the input space, it should not appear in the feature attribution
3. **Consistency** - if a feature is increasing in it's contribution to the outcome, then that feature should increase in it's importance

What ends up happening with many of the other methods is that you can come up with counter-examples where you cannot acheive all 3 of these properties at the same time. Shapley Values are the only solution among these that do.

[Back to top](#top)

<a id='shapley-values'></a>
## What are Shapley Values?

### Game Theory Context Example: A Soccer Team

- 11 players make a team
- Well defined positions
- Imagine scores range 0-100
- Also imagine you have a stadium full of soccer players available

**How important are each one of the 11 players to the overall performance of the team?**

The problem is that you have 1 outcome (the score) and 11 individuals making contributions.

You start with 1 player as the whole team. This is probably not a great team, but the individual might be very good and contribute a `+10` by themself.

Then you add in another player, who is a `+9` by themself, but when you add them in, the combined score becomes `21` because of a synergistic effect.

Then you add a third player, who is a `+4`. This player alone is not great, but just happens to play well with the other two, so that player contributes a `+8` to the team score.

Then you add a fourth player (`+12`), but this player gets into fights with the first two, so only contributes a `+2` to the total score.

Then you add a fifth player. And this player is a goalie, which is a very unique position, so this becomes extremely valuable.

Now suppose we add a sixth player, who is also a goalie. Well, at this point the team already has a goalie, so it doesn't add much value to add another one, even if this player is very good. The point here is that the value a player adds to the overall outcome depends on who is already there, in addition to the order they are added.

We can go on, but you get the point. What we're doing is composing all of these coalitions of teams and measuring the change in contribution as we add in players, all while keeping track of who is already on the team and in which order they are added.

The credit attribution becomes complex and large multiplication problem, but the calculation is somewhat straightfoward:

    Add up all of the contributions created by each player over every possible coalition composition, then divide by number of scenarios you have and that's the Shapley Value for that specific player.


Check out [this link](https://clearcode.cc/blog/game-theory-attribution/) to see the math performed with a simple marketing example.

### Machine Learning Context


**Tie back to ML:**    
Swap the soccer score for a model's prediction for a particular case. And instead of creating coalitions of soccer players, we are concerned with which features play the largest role in a particular prediction. This is what Shapley Values do in a machine learning context. Let's grab some data and see it in action.

[Back to top](#top)

<a id='acquire'></a>
## Acquire Data

This is Census data from the _UCI Machine Learning Repository_: https://archive.ics.uci.edu/ml/datasets/adult

In [4]:
adult = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data',
                    na_values=' ?',
                    header=None)

If the above line hangs, then uncomment the line below

In [5]:
# adult = pd.read_csv('./datasets/adult.data.txt', header=None, na_values=' ?')

[Back to top](#top)

<a id='eda-preprocessing'></a>
## Quick Preprocessing/EDA

- Add column headers
- Understand dataset
    - how many rows/columns?
    - what does a row represent?
    - what is our target variable?
- Check for missing values
- Check data types
- Check for unbalanced target variable

In [6]:
adult.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14
0,39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
1,50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
2,38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
3,53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,<=50K
4,28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,<=50K


### Add Column Headers

**FEATURES**

1. `age`: continuous.
2. `workclass`: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
3. `fnlwgt`: continuous.
4. `education`: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
5. `education-num`: continuous.
6. `marital-status`: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
7. `occupation`: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
8. `race`: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
9. `sex`: Female, Male.
10. `capital-gain`: continuous.
11. `capital-loss`: continuousm
12. `hours-per-week`: continuous.
13. `native-country`: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.

In [7]:
features = [
    'age',
    'workclass',
    'fnlwgt',
    'education',
    'education_num',
    'marital_status',
    'occupation',
    'relationship',
    'race',
    'sex',
    'capital_gain',
    'capital_loss',
    'hours_per_week',
    'native_country',
    'income',
]

In [8]:
# assign column names
adult.columns = features
adult.head()

Unnamed: 0,age,workclass,fnlwgt,education,education_num,marital_status,occupation,relationship,race,sex,capital_gain,capital_loss,hours_per_week,native_country,income
0,39,State-gov,77516,Bachelors,13,Never-married,Adm-clerical,Not-in-family,White,Male,2174,0,40,United-States,<=50K
1,50,Self-emp-not-inc,83311,Bachelors,13,Married-civ-spouse,Exec-managerial,Husband,White,Male,0,0,13,United-States,<=50K
2,38,Private,215646,HS-grad,9,Divorced,Handlers-cleaners,Not-in-family,White,Male,0,0,40,United-States,<=50K
3,53,Private,234721,11th,7,Married-civ-spouse,Handlers-cleaners,Husband,Black,Male,0,0,40,United-States,<=50K
4,28,Private,338409,Bachelors,13,Married-civ-spouse,Prof-specialty,Wife,Black,Female,0,0,40,Cuba,<=50K


In [9]:
# how many rows a columns?
adult.shape

(32561, 15)

In [10]:
# any missing values?
adult.isnull().sum()

age                  0
workclass         1836
fnlwgt               0
education            0
education_num        0
marital_status       0
occupation        1843
relationship         0
race                 0
sex                  0
capital_gain         0
capital_loss         0
hours_per_week       0
native_country     583
income               0
dtype: int64

In [11]:
# what are the data types?
adult.dtypes

age                int64
workclass         object
fnlwgt             int64
education         object
education_num      int64
marital_status    object
occupation        object
relationship      object
race              object
sex               object
capital_gain       int64
capital_loss       int64
hours_per_week     int64
native_country    object
income            object
dtype: object

In [12]:
# what is the distribution of our target variable?
adult['income'].value_counts()

 <=50K    24720
 >50K      7841
Name: income, dtype: int64

[Back to top](#top)

<a id='pipeline'></a>
## Preprocessing

In the interest of time, I packaged these preprocessing steps into `Pipelines`.

**PREPROCESSING STEPS**
1. Separate target variable from features - sklearn requires this.
2. Peform a train-test split - Always do this before manipulating dataset
3. With training data:
    - **SEPARATE** numeric columns from categorical ones
    - **NUMERIC DF** preprocessing:
        - Replace nan values
        - Standardize features
   
    - **CATEGORICAL DF** preprocessing:
        - Replace nan values
        - Create dummy variables
    - **CONCATENATE** numeric and categorical DF
    - **ENCODE** target variable
<br>
<br>
4. Package these steps into a `Pipeline`

In [13]:
# make a list of numeric and categorical column names
num_cols = [col for col in adult.columns if adult[col].dtype != 'object']
cat_cols = [col for col in adult.columns if col not in num_cols + ['income']]

# separate features from target variable
X = adult.drop('income', axis=1)
y = adult['income']

# perform a train-test split.... why? stratify on y?
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.3,
    stratify=y,
    random_state=24
)

def feature_extractor(df):
    return df.drop('income', axis=1)


def categorical_extractor(df):
    return df.select_dtypes(include=['object'])


def numeric_extractor(df):
    return df.select_dtypes(exclude=['object'])

# create custom transformers
cat_transformer = FunctionTransformer(categorical_extractor, validate=False)
num_transformer = FunctionTransformer(numeric_extractor, validate=False)

# make numeric pipe
num_pipe = Pipeline([
    ('numeric_transformer', num_transformer),
    ('num_im', SimpleImputer(strategy='median')),
    ('StandardScaler', StandardScaler())
])

# make categorical pipe
cat_pipe = Pipeline([
    ('cat_transformer', cat_transformer),
    ('cat_im', SimpleImputer(strategy='most_frequent')),
    ('OrdinalEncoder', OrdinalEncoder())
])


# make FeatureUnion
feat_union = FeatureUnion([
    ('num_pipe', num_pipe),
    ('cat_pipe', cat_pipe)
])

# make final feature pipe
feature_pipe = Pipeline([
    ('feat_union', feat_union)
])

#### Use this pipeline to _fit_ and _transform_ `X_train`

In [14]:
# fit and transform training data
X_train_prepared = pd.DataFrame(
    feature_pipe.fit(X_train).transform(X_train),
    index=X_train.index,
    columns=X_train.columns)
X_train_prepared.head()

Unnamed: 0,age,workclass,fnlwgt,education,education_num,marital_status,occupation,relationship,race,sex,capital_gain,capital_loss,hours_per_week,native_country
10348,-0.924812,0.012344,1.133671,-0.146749,-0.214716,-0.843279,3.0,9.0,4.0,7.0,1.0,4.0,1.0,38.0
11062,1.712486,2.162427,-1.198581,-0.146749,-0.214716,-1.572986,3.0,1.0,5.0,7.0,1.0,2.0,0.0,38.0
25734,-1.144586,2.329859,-0.421164,-0.146749,-0.214716,0.616136,3.0,11.0,4.0,11.0,1.0,4.0,1.0,38.0
401,-0.778295,-0.76374,-0.032455,-0.146749,-0.214716,-0.032493,3.0,15.0,4.0,7.0,1.0,4.0,0.0,38.0
28063,1.12642,1.600147,1.133671,0.844477,-0.214716,1.183686,4.0,9.0,2.0,3.0,0.0,4.0,1.0,38.0


In [15]:
# transform testing data
X_test_prepared = pd.DataFrame(
    feature_pipe.transform(X_test),
    index=X_test.index,
    columns=X_test.columns)
X_test_prepared.head()

Unnamed: 0,age,workclass,fnlwgt,education,education_num,marital_status,occupation,relationship,race,sex,capital_gain,capital_loss,hours_per_week,native_country
2093,0.906645,-0.129961,1.133671,-0.146749,-0.214716,0.535058,3.0,9.0,2.0,11.0,0.0,4.0,1.0,38.0
29473,-1.437619,1.329928,-1.198581,-0.146749,-0.214716,-0.032493,3.0,1.0,5.0,7.0,3.0,4.0,0.0,38.0
14123,0.174062,1.430095,-2.753415,-0.146749,-0.214716,-0.032493,3.0,4.0,2.0,9.0,0.0,4.0,1.0,25.0
10193,0.906645,-1.536521,-0.421164,-0.146749,-0.214716,-0.032493,0.0,11.0,2.0,12.0,0.0,4.0,1.0,38.0
18789,0.320579,1.045784,-0.032455,-0.146749,-0.214716,-0.032493,3.0,15.0,2.0,0.0,0.0,4.0,1.0,38.0


#### Use `LabelEncoder` to transform the `income` to be numeric

In [16]:
y_train[:5]

10348     <=50K
11062     <=50K
25734     <=50K
401       <=50K
28063      >50K
Name: income, dtype: object

In [17]:
# fit and transform y_train
le = LabelEncoder()
y_train_encoded = pd.Series(le.fit_transform(y_train), index=y_train.index)
y_train_encoded[:5]

10348    0
11062    0
25734    0
401      0
28063    1
dtype: int32

In [18]:
# transform y_test
y_test_encoded = pd.Series(le.transform(y_test), index=y_test.index)
y_test_encoded[:5]

2093     1
29473    0
14123    0
10193    0
18789    0
dtype: int32

#### Calculate Baseline

In [19]:
y_test_encoded.value_counts()[0] / y_test_encoded.value_counts().sum()

0.7592384072064694

[Back to top](#top)

<a id='model'></a>
## Create a Model: `RandomForest`

In [20]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

In [21]:
rf = RandomForestClassifier(
    n_estimators=10,
    class_weight='balanced',
    oob_score=True
)
rf.fit(X_train_prepared, y_train_encoded)

  warn("Some inputs do not have OOB scores. "
  predictions[k].sum(axis=1)[:, np.newaxis])


RandomForestClassifier(bootstrap=True, class_weight='balanced',
            criterion='gini', max_depth=None, max_features='auto',
            max_leaf_nodes=None, min_impurity_decrease=0.0,
            min_impurity_split=None, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=None, oob_score=True,
            random_state=None, verbose=0, warm_start=False)

In [22]:
# make predictions for training and test:
y_pred_train = rf.predict(X_train_prepared)
y_pred_test = rf.predict(X_test_prepared)

In [23]:
print('CLASSIFICATION METRICS FOR TRAINING: \n')
print(classification_report(y_train_encoded, y_pred_train))
print('#########################################################\n')

print('CLASSIFICATION METRICS FOR TESTING: \n')
print(classification_report(y_test_encoded, y_pred_test))

CLASSIFICATION METRICS FOR TRAINING: 

              precision    recall  f1-score   support

           0       0.99      1.00      0.99     17303
           1       0.99      0.95      0.97      5489

   micro avg       0.99      0.99      0.99     22792
   macro avg       0.99      0.98      0.98     22792
weighted avg       0.99      0.99      0.99     22792

#########################################################

CLASSIFICATION METRICS FOR TESTING: 

              precision    recall  f1-score   support

           0       0.88      0.94      0.91      7417
           1       0.74      0.59      0.66      2352

   micro avg       0.85      0.85      0.85      9769
   macro avg       0.81      0.76      0.78      9769
weighted avg       0.84      0.85      0.85      9769



In [24]:
accuracy_score(y_test_encoded, y_pred_test)

0.8516736615825571

In [25]:
from sklearn.model_selection import cross_val_score

In [26]:
cross_val_score(rf, X_train_prepared, y_train_encoded, scoring='recall', cv=5)

  warn("Some inputs do not have OOB scores. "
  predictions[k].sum(axis=1)[:, np.newaxis])
  warn("Some inputs do not have OOB scores. "
  predictions[k].sum(axis=1)[:, np.newaxis])
  warn("Some inputs do not have OOB scores. "
  predictions[k].sum(axis=1)[:, np.newaxis])
  warn("Some inputs do not have OOB scores. "
  predictions[k].sum(axis=1)[:, np.newaxis])
  warn("Some inputs do not have OOB scores. "
  predictions[k].sum(axis=1)[:, np.newaxis])


array([0.58105647, 0.5582878 , 0.54735883, 0.57468124, 0.57247037])

#### Print Top 10 Features

In [27]:
feat_imp_lst = list(zip(X_train_prepared.columns, rf.feature_importances_))
feat_lst = sorted(feat_imp_lst, key=lambda x: x[1], reverse=True)
for tup in feat_lst[:10]:
    print(tup)

('capital_gain', 0.15940464900671467)
('age', 0.15164577096686196)
('workclass', 0.14449902004210102)
('race', 0.10138573141536011)
('fnlwgt', 0.09526376074916464)
('education', 0.08345244990686272)
('marital_status', 0.07644069066980426)
('sex', 0.060222118956933815)
('occupation', 0.033573442539666355)
('relationship', 0.032884909124658954)


[Back to top](#top)

<a id='in-action'></a>
## Shapley Values in Action

The `shap` package includes a C++ optimized implementation for several popular Python tree models.

In [28]:
# explainer = shap.TreeExplainer(rf)
# shap_values = explainer.shap_values(X_train_prepared)

In [29]:
import pickle

In [152]:
# with open('rf-explainer.pickle', 'wb') as f:
#     pickle.dump(explainer, f, pickle.HIGHEST_PROTOCOL)
# with open('rf-shap-values.pickle', 'wb') as s:
#     pickle.dump(shap_values, s, pickle.HIGHEST_PROTOCOL)

In [154]:
with open('rf-explainer.pickle', 'rb') as f:
    explainer2 = pickle.load(f)
with open('rf-shap-values.pickle', 'rb') as s:
    shap_values2 = pickle.load(s)

[Back to top](#top)

<a id='visualize'></a>
### Plot the Shapley Values for the first observation in `X_train_prepared`

In [93]:
# load JS visualization code to notebook
shap.initjs()

# show 
display(X_train_prepared.iloc[0])
print()
print("Actual Label:\t\t{}".format(y_train_encoded[0]))
print("Predicted Label:\t{}".format(y_pred_train[0]))

# plot the explanation of each feature for the first prediction
shap.force_plot(explainer2.expected_value[0], shap_values[1][0, :], X_train_prepared.iloc[0, :])

age               -0.924812
workclass          0.012344
fnlwgt             1.133671
education         -0.146749
education_num     -0.214716
marital_status    -0.843279
occupation         3.000000
relationship       9.000000
race               4.000000
sex                7.000000
capital_gain       1.000000
capital_loss       4.000000
hours_per_week     1.000000
native_country    38.000000
Name: 10348, dtype: float64


Actual Label:		0
Predicted Label:	0


This is great, but we have 32,561 observations. We can visualize the entire data set as a distribution:

#### Look at explanations for 100 Class 0 Predictions

In [147]:
# make constants for which class is of interest
EXP_VALUE = 0
PRED_CLASS = 0
NUM_OBS = 100

# grab first 100 observations predicted class 1
data_to_plot = X_train_prepared.reset_index(drop=True)[y_pred_train == PRED_CLASS].iloc[:NUM_OBS, :]

# grab row indexes for these instances
row_idx = data_to_plot.index

# create the plot
shap.force_plot(
    explainer2.expected_value[EXP_VALUE],
    shap_values[PRED_CLASS][row_idx],
    data_to_plot
)

#### Look at explanations for Class 1 Predictions

In [151]:
# make constants for which class is of interest
EXP_VALUE = 1
PRED_CLASS = 1
NUM_OBS = 100

# grab first 100 observations predicted class 1
data_to_plot = X_train_prepared.reset_index(drop=True)[y_pred_train == PRED_CLASS].iloc[:NUM_OBS, :]

# grab row indexes for these instances
row_idx = data_to_plot.index

# create the plot
shap.force_plot(
    explainer2.expected_value[EXP_VALUE],
    shap_values[PRED_CLASS][row_idx],
    data_to_plot
)

[Back to top](#top)

Scratch Work

In [None]:
from sklearn.model_selection import RandomizedSearchCV

In [None]:
param_lst = {
    "n_estimators": np.arange(10, 105, 10),
    "max_depth": [None, 30, 10, 5],
    "class_weight": ['balanced'],
    "oob_score": [True],
}

In [None]:
rs = RandomizedSearchCV(rf, param_distributions=param_lst, verbose=2)

In [None]:
rs.fit(X_train_prepared, y_train_encoded)

In [None]:
rs.best_estimator_