Before starting the SHAP analysis, you need to install the following:
1. Python
2. Anaconda Navigator
3. The required libraries

To install the libraries:

Open "Anaconda Prompt" as Administrator and run the following commands:

Command 1: pip install xgboost

Command 2: pip install shap
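
After running the install commands, it can help to confirm that every package the notebook imports is actually available. The following is a minimal sketch using only the standard library; the `REQUIRED` list and the helper name `find_missing` are assumptions made for this check, not part of the original notebook.

```python
import importlib.util

# Import names of the packages used in this notebook.
# Note that scikit-learn is imported under the name "sklearn".
REQUIRED = ["numpy", "pandas", "shap", "matplotlib", "seaborn", "xgboost", "sklearn"]

def find_missing(packages):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]

missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

Any name reported as missing can be installed from the Anaconda Prompt with `pip install`, using the package name (for `sklearn`, the package name is `scikit-learn`).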

# Importing Relevant Libraries

In [None]:
import numpy as np
import pandas as pd
import shap
import matplotlib.pyplot as plt
import seaborn as sns
from xgboost import XGBRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import mean_squared_error

# Loading and previewing the data

1. We read the data from a .csv file.
2. If you don't have a .csv file with your data, create one first.
3. Place the .csv file in the same folder as your Python file.
4. In our case, that folder is "Downloads\Untitled Folder".
5. "SHAP.csv" is the name of the .csv file created in step 2.
6. The file name used in the code must match the name of the .csv file exactly.

In [None]:
df = pd.read_csv("SHAP.csv")
df.head()
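
If `pd.read_csv` fails with a "file not found" error, the usual cause is that the .csv file is not in the folder the notebook is running from. The sketch below checks this before loading; the helper name `csv_in_working_dir` is an assumption introduced here for illustration.

```python
from pathlib import Path

def csv_in_working_dir(filename):
    """Return True if `filename` exists in the current working directory."""
    return (Path.cwd() / filename).is_file()

# "SHAP.csv" is the file name used in this notebook.
if not csv_in_working_dir("SHAP.csv"):
    print("SHAP.csv was not found in", Path.cwd())
```

The printed path shows where the file should be placed.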

# Viewing summary of data

In [None]:
df.info()

# Declaring "X" and "Y" values

Make sure the variable names in the code match the column names in the .csv file exactly.

In [None]:
X = df[['A','B','C','D','E','F','G','H']]

y = df['Model Values']
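
A name mismatch between the code and the .csv file raises a `KeyError` when the columns are selected. One way to catch it early is to compare the expected names with the file's header, as in this sketch. The helper `missing_columns` and the example header are hypothetical; the header deliberately contains a misspelled target column.

```python
def missing_columns(available, required):
    """Return the names in `required` that are absent from `available`, in order."""
    available = set(available)
    return [name for name in required if name not in available]

# Column names this notebook expects in the .csv file.
required = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'Model Values']

# Hypothetical header with a mismatch ("Model Value" instead of "Model Values").
example_header = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'Model Value']
print(missing_columns(example_header, required))  # → ['Model Values']
```

In the notebook itself, the same check would be `missing_columns(df.columns, required)`, which should return an empty list.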

# Splitting the data into train and test data

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=True, random_state=0)

# Building the model with Random Forest Regressor

In [None]:
model = RandomForestRegressor(max_depth=6, random_state=0, n_estimators=10)
model.fit(X_train, y_train)

# Generating Predictions

In [None]:
y_pred = model.predict(X_test)

# Evaluating Performance

In [None]:
# Root mean squared error (RMSE): the square root of the MSE
rmse = mean_squared_error(y_test, y_pred) ** 0.5
rmse
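
The value computed above is the square root of the mean squared error, that is, the root mean squared error (RMSE), which is expressed in the same units as the target. The sketch below reproduces the arithmetic by hand on a few made-up numbers; the helper name `rmse_by_hand` and the demo values are assumptions chosen only to make the steps easy to follow.

```python
import math

def rmse_by_hand(y_true, y_pred):
    """Square root of the mean of the squared prediction errors."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical values, not taken from SHAP.csv.
y_true_demo = [3.0, 5.0, 7.0]
y_pred_demo = [2.0, 5.0, 9.0]

# Squared errors are 1, 0 and 4; their mean is 5/3; the RMSE is its square root.
print(rmse_by_hand(y_true_demo, y_pred_demo))
```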

# Explaining the model's predictions using SHAP

In [None]:
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_train)

# Visualizing the first prediction's explanation

This plot shows how each feature contributes to pushing the model output from the base value (the average model output over the training dataset) to the model output for this prediction.
Features pushing the prediction higher are shown in red, and those pushing it lower are shown in blue.

So E, G, and H push the prediction higher, while D, B, A, and F push it lower.

The base value of "Model Values" is 579.89.

In [None]:
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[0,:], X_train.iloc[0,:])
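
The force plot relies on the additive property of SHAP values: the base value plus the sum of a prediction's SHAP values equals the model output for that prediction. The sketch below illustrates this on a small toy problem by computing Shapley values through brute-force enumeration of feature orderings. Everything in it (`toy_model`, the background rows, the instance) is hypothetical, and this enumeration is not how `TreeExplainer` works internally; it uses a much faster tree-specific algorithm.

```python
from itertools import permutations

def toy_model(row):
    """Hypothetical model with one additive term and one interaction term."""
    a, b, c = row
    return 2.0 * a + b * c

def coalition_value(model, instance, background, known):
    """Average output when only the features in `known` take the instance's values."""
    total = 0.0
    for bg_row in background:
        mixed = [instance[i] if i in known else bg_row[i] for i in range(len(instance))]
        total += model(mixed)
    return total / len(background)

def exact_shapley(model, instance, background):
    """Average each feature's marginal contribution over all feature orderings."""
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        known = set()
        previous = coalition_value(model, instance, background, known)
        for feature in order:
            known.add(feature)
            current = coalition_value(model, instance, background, known)
            phi[feature] += current - previous
            previous = current
    return [value / len(orderings) for value in phi]

# Hypothetical background data and the instance to explain.
background = [[0.0, 1.0, 1.0], [2.0, 3.0, 1.0], [4.0, 2.0, 4.0]]
instance = [3.0, 2.0, 5.0]

base_value = coalition_value(toy_model, instance, background, set())
phi = exact_shapley(toy_model, instance, background)
prediction = toy_model(instance)

print("base value:", base_value)
print("SHAP values:", phi)
print("base + sum:", base_value + sum(phi), "prediction:", prediction)
```

Features with a positive value correspond to the red arrows in the force plot (pushing the output above the base value), and negative values correspond to the blue arrows.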

# Visualizing the training set predictions

The following plot is interactive. Move the mouse over it to see the individual values.

In [None]:
shap.force_plot(explainer.expected_value, shap_values, X_train)

# Standard way of plotting feature importance

To visualize the SHAP feature importance for a trained random forest model, the features are sorted in decreasing order of importance and plotted accordingly. The resulting plot displays the mean absolute Shapley values.

In this figure, the variable "H" is the most important feature.

This code automatically saves the figure, under the file name given in the code, in the current working directory (normally the folder containing the notebook).

In [None]:
shap_values = shap.TreeExplainer(model).shap_values(X_train)
shap.summary_plot(shap_values, X_train, show=False, plot_type="bar")
plt.savefig('Parameter Importance.JPG', dpi=300, bbox_inches='tight')
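
The bar heights in this plot are the mean absolute SHAP values per feature. The sketch below shows that calculation on a small made-up matrix; `mean_abs_importance` and the demo values are hypothetical and are not taken from the model above.

```python
def mean_abs_importance(shap_rows, feature_names):
    """Rank features by the mean absolute value of their SHAP column."""
    n_rows = len(shap_rows)
    scores = {
        name: sum(abs(row[j]) for row in shap_rows) / n_rows
        for j, name in enumerate(feature_names)
    }
    # Sort from most to least important, as in the bar plot.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical SHAP values for three samples (rows) and three features (columns).
demo_shap = [
    [1.0, -4.0, 0.5],
    [-2.0, 2.0, 0.5],
    [3.0, -3.0, -0.5],
]
ranking = mean_abs_importance(demo_shap, ['A', 'B', 'C'])
print(ranking)  # → [('B', 3.0), ('A', 2.0), ('C', 0.5)]
```

With the notebook's own variables, `np.abs(shap_values).mean(axis=0)` gives the same per-feature scores, assuming `shap_values` is a two-dimensional array with one row per sample and one column per feature.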

# Summary plot combining feature importance with feature effects

The summary plot combines feature importance with feature effects. Each point on the summary plot is a Shapley value for a feature and an instance. The position on the y-axis is determined by the feature and on the x-axis by the Shapley value.

The features are ordered according to their importance.

This code automatically saves the figure, under the file name given in the code, in the current working directory (normally the folder containing the notebook).

In [None]:
shap.summary_plot(shap_values, X_train, show=False)
plt.savefig('Parameter Influence.JPG', dpi=300, bbox_inches='tight')

# Dependence plots

A SHAP dependence plot visualizes the relationship between a feature and the model's predicted outcome. 

It helps identify linear or non-linear relationships and interactions with other features.

The function automatically selects the variable that has the strongest interaction with the chosen variable and uses it to color the points, providing insight into how their combined effect influences the predicted outcome.

This code automatically saves the figure, under the file name given in the code, in the current working directory (normally the folder containing the notebook).

In [None]:
shap.dependence_plot('D', shap_values, X_train, show=False)
plt.savefig('D Dependence.JPG', dpi=300, bbox_inches='tight')

In [None]:
shap.dependence_plot('E', shap_values, X_train, show=False)
plt.savefig('E Dependence.JPG', dpi=300, bbox_inches='tight')