## Permutation Importance
With this insight, the process is as follows:

1. *Get a trained model*.
2. *Shuffle the values in a single column and make predictions using the resulting dataset. Use these predictions and the true target values to calculate how much the loss function suffered from shuffling. That performance deterioration measures the importance of the variable you just shuffled.*
3. *Return the data to the original order (undoing the shuffle from step 2). Now repeat step 2 with the next column in the dataset, until you have calculated the importance of each column.*
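The three steps above can be sketched in plain Python. This is a minimal illustration, not the eli5 implementation: the helper names (`accuracy`, `permutation_importance`), the toy model, and the toy data are all assumptions made for the example, and accuracy stands in for whatever loss or score you actually care about.

```python
import random

def accuracy(model, rows, targets):
    """Fraction of rows for which the model's prediction matches the target."""
    hits = sum(model(row) == t for row, t in zip(rows, targets))
    return hits / len(rows)

def permutation_importance(model, rows, targets, n_repeats=5, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, targets)       # step 1: score the trained model
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            # step 2: shuffle one column, leaving every other column untouched
            shuffled_col = [row[col] for row in rows]
            rng.shuffle(shuffled_col)
            shuffled_rows = [row[:col] + [v] + row[col + 1:]
                             for row, v in zip(rows, shuffled_col)]
            drops.append(baseline - accuracy(model, shuffled_rows, targets))
        # step 3: `rows` itself was never modified, so the next column starts clean
        importances.append(sum(drops) / n_repeats)
    return importances

# Hypothetical "trained" model: uses column 0 only and ignores column 1.
def toy_model(row):
    return row[0] > 0

rows = [[x, (x * 7) % 5] for x in range(-10, 10)]
targets = [row[0] > 0 for row in rows]

importances = permutation_importance(toy_model, rows, targets)
print(importances)
```

Because the toy model never reads column 1, shuffling it cannot change any prediction, so its importance is zero; shuffling column 0 breaks the link to the target and the score drops.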

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

data = pd.read_csv('FIFA_statistics.csv')
y = (data['Man of the Match'] == "Yes")  # Convert from string "Yes"/"No" to binary
feature_names = [i for i in data.columns if data[i].dtype in [np.int64]]
X = data[feature_names]
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=1)
my_model = RandomForestClassifier(n_estimators=100,
                                  random_state=0).fit(train_X, train_y)

In [2]:
!pip install eli5



In [3]:
import eli5
from eli5.sklearn import PermutationImportance

perm = PermutationImportance(my_model, random_state=1).fit(val_X, val_y)
eli5.show_weights(perm, feature_names = val_X.columns.tolist())

| Weight | Feature |
|---|---|
| 0.1750 ± 0.0848 | Goal Scored |
| 0.0500 ± 0.0637 | Distance Covered (Kms) |
| 0.0437 ± 0.0637 | Yellow Card |
| 0.0187 ± 0.0500 | Off-Target |
| 0.0187 ± 0.0637 | Free Kicks |
| 0.0187 ± 0.0637 | Fouls Committed |
| 0.0125 ± 0.0637 | Pass Accuracy % |
| 0.0125 ± 0.0306 | Blocked |
| 0.0063 ± 0.0612 | Saves |
| 0.0063 ± 0.0250 | Ball Possession % |


## Partial Plots
While feature importance shows which variables most affect predictions, partial dependence plots show how a feature affects predictions.
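To make the idea concrete, here is a rough sketch of how a one-feature partial dependence curve could be computed by hand: force the chosen column to each grid value for every row, predict, and average. The function `partial_dependence`, the model `toy_predict`, and the small dataset are hypothetical stand-ins, not part of pdpbox or of the notebook's data.

```python
def partial_dependence(predict, rows, col, grid):
    """Average prediction when column `col` is forced to each value in `grid`."""
    curve = []
    for value in grid:
        # Overwrite the chosen column in a copy of every row; keep the rest as observed
        modified = [row[:col] + [value] + row[col + 1:] for row in rows]
        preds = [predict(row) for row in modified]
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical model: the predicted probability rises with goals (column 0), capped at 1.
def toy_predict(row):
    return min(1.0, 0.2 + 0.3 * row[0])

# Hypothetical rows: [goals scored, distance covered]
rows = [[0, 95], [1, 110], [2, 102], [3, 99]]

curve = partial_dependence(toy_predict, rows, col=0, grid=[0, 1, 2, 3])
print(curve)
```

Plotting `curve` against the grid gives the partial dependence plot for that feature; the second column has no effect here only because the toy model ignores it.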

In [4]:
from sklearn.tree import DecisionTreeClassifier
tree_model = DecisionTreeClassifier(random_state=0, max_depth=5, min_samples_split=5).fit(train_X, train_y)

In [5]:
# from sklearn import tree
# import graphviz


# tree_graph = tree.export_graphviz(tree_model,out_file=None,feature_names=feature_names)
# graphviz.Source(tree_graph)

In [7]:
# from matplotlib import pyplot as plt
# from pdpbox import pdp, get_dataset, info_plots

# # Create the data that we will plot
# pdp_goals = pdp.pdp_isolate(model=tree_model, dataset=val_X, model_features=feature_names, feature='Goal Scored')

# # plot it
# pdp.pdp_plot(pdp_goals, 'Goal Scored')
# plt.show()



In [8]:
# feature_to_plot = 'Distance Covered (Kms)'
# pdp_dist = pdp.pdp_isolate(model=tree_model, dataset=val_X, model_features=feature_names, feature=feature_to_plot)

# pdp.pdp_plot(pdp_dist, feature_to_plot)
# plt.show()



In [9]:
# Build Random Forest model
# rf_model = RandomForestClassifier(random_state=0).fit(train_X, train_y)

# pdp_dist = pdp.pdp_isolate(model=rf_model, dataset=val_X, model_features=feature_names, feature=feature_to_plot)

# pdp.pdp_plot(pdp_dist, feature_to_plot)
# plt.show()



In [10]:
# Similar to the previous PDP plot, except we use pdp_interact instead of pdp_isolate and pdp_interact_plot instead of pdp_plot
# features_to_plot = ['Goal Scored', 'Distance Covered (Kms)']
# inter1  =  pdp.pdp_interact(model=tree_model, dataset=val_X, model_features=feature_names, features=features_to_plot)

# pdp.pdp_interact_plot(pdp_interact_out=inter1, feature_names=features_to_plot, plot_type='contour')
# plt.show()



## SHAP Values
SHAP Values (an acronym from SHapley Additive exPlanations) break down a prediction to show the impact of each feature.
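The breakdown can be illustrated with an exact Shapley computation on a tiny model. This is a sketch under stated assumptions, not how the shap library works internally (it uses much faster, model-specific algorithms): the function `shapley_values`, the linear `toy_predict`, and the background rows are hypothetical, and "removing" a feature is approximated by substituting values from the background rows.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values for the prediction at `x`, plus the base value."""
    n = len(x)

    def value(subset):
        # Mean prediction with features in `subset` taken from x,
        # and all other features taken from each background row.
        total = 0.0
        for b in background:
            mixed = [x[i] if i in subset else b[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi, value(set())

# Hypothetical linear model over [goals scored, distance covered]
def toy_predict(row):
    return 0.1 + 0.2 * row[0] + 0.01 * row[1]

x = [2, 10]
background = [[0, 0], [1, 20]]

phi, base_value = shapley_values(toy_predict, x, background)
print(phi, base_value, toy_predict(x))
```

The per-feature values always sum to the difference between the prediction for `x` and the base value (the average prediction over the background rows), which is the additive property the name refers to.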

In [12]:
row_to_show = 5
data_for_prediction = val_X.iloc[row_to_show]
data_for_prediction_array = data_for_prediction.values.reshape(1,-1)
my_model.predict_proba(data_for_prediction_array)



array([[0.29, 0.71]])

In [13]:
!pip install shap

Collecting shap
  Downloading shap-0.40.0-cp39-cp39-win_amd64.whl (432 kB)
Collecting slicer==0.0.7
  Downloading slicer-0.0.7-py3-none-any.whl (14 kB)
Collecting cloudpickle
  Using cached cloudpickle-2.0.0-py3-none-any.whl (25 kB)
Collecting numba
  Downloading numba-0.54.1-cp39-cp39-win_amd64.whl (2.3 MB)
Collecting tqdm>4.25.0
  Using cached tqdm-4.62.3-py2.py3-none-any.whl (76 kB)
Collecting llvmlite<0.38,>=0.37.0rc1
  Downloading llvmlite-0.37.0-cp39-cp39-win_amd64.whl (17.0 MB)
Collecting numpy
  Downloading numpy-1.20.3-cp39-cp39-win_amd64.whl (13.7 MB)
Installing collected packages: numpy, llvmlite, tqdm, slicer, numba, cloudpickle, shap
  Attempting uninstall: numpy
    Found existing installation: numpy 1.21.2
    Uninstalling numpy-1.21.2:
      Successfully uninstalled numpy-1.21.2
Successfully installed cloudpickle-2.0.0 llvmlite-0.37.0 numba-0.54.1 numpy-1.20.3 shap-0.40.0 slicer-0.0.7 tqdm-4.62.3


In [15]:
# import shap
# explainer = shap.TreeExplainer(my_model)
# shap_values = explainer.shap_values(data_for_prediction)

In [16]:
# shap.initjs()
# shap.force_plot(explainer.expected_value[1], shap_values[1], data_for_prediction)

In [17]:
# # use Kernel SHAP to explain test set predictions
# k_explainer = shap.KernelExplainer(my_model.predict_proba, train_X)
# k_shap_values = k_explainer.shap_values(data_for_prediction)
# shap.force_plot(k_explainer.expected_value[1], k_shap_values[1], data_for_prediction)