## Task 3 - Model Explainability
Model explainability is crucial for understanding, trusting, and debugging machine learning models. You will use SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to interpret the models you built for fraud detection.

### Using SHAP for Explainability
SHAP values provide a unified measure of feature importance, explaining the contribution of each feature to the prediction.

- Installing SHAP: `pip install shap`
- Explaining a Model with SHAP
- SHAP Plots
  - Summary Plot: Provides an overview of the most important features.
  - Force Plot: Visualizes the contribution of features for a single prediction.
  - Dependence Plot: Shows the relationship between a feature and the model output.
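
To make "the contribution of each feature to the prediction" concrete, here is a minimal sketch that computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over every feature ordering. The function `toy_score`, the feature names, and the all-zero baseline are hypothetical and chosen only for illustration; the `shap` library uses a background dataset and much faster algorithms rather than this brute-force loop.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)            # start from the baseline input
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_score(x):
    """Hypothetical fraud score with an interaction term (not the notebook's model)."""
    return (0.1
            + 0.002 * x["purchase_value"]
            + 0.3 * x["new_device"]
            + 0.001 * x["purchase_value"] * x["new_device"])


baseline = {"purchase_value": 0.0, "new_device": 0.0}   # assumed reference input
instance = {"purchase_value": 100.0, "new_device": 1.0}

phi = shapley_values(toy_score, instance, baseline)
print(phi)
# The contributions add up to the gap between this prediction and the baseline prediction.
print(sum(phi.values()), toy_score(instance) - toy_score(baseline))
```

The last line illustrates the additivity property that the force plot visualizes: base value plus the per-feature contributions equals the model output for that row.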
### Using LIME for Explainability
LIME explains individual predictions by approximating the model locally with an interpretable model.

- Installing LIME: `pip install lime`
- Explaining a Model with LIME
- LIME Plots
  - Feature Importance Plot: Shows the most influential features for a specific prediction.
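
The idea of "approximating the model locally with an interpretable model" can be sketched without the `lime` package: perturb the instance, weight each perturbed sample by its proximity to the instance, and fit a weighted linear model to the black-box outputs. Everything below is an illustrative assumption (the Gaussian perturbations, the exponential kernel, the parameter defaults, and the function `toy_model`); the real library additionally scales and optionally discretizes features.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def local_surrogate(predict, instance, n_samples=2000, scale=0.5,
                    kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around `instance`.

    Returns (intercept, weights); features are centered on the instance,
    so the intercept is the surrogate's prediction at the instance itself.
    """
    rng = random.Random(seed)
    d = len(instance)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]   # accumulates X^T W X
    xtwy = [0.0] * (d + 1)                           # accumulates X^T W y
    for _ in range(n_samples):
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        dist2 = sum((s - v) ** 2 for s, v in zip(sample, instance))
        weight = math.exp(-dist2 / kernel_width ** 2)  # closer samples count more
        row = [1.0] + [s - v for s, v in zip(sample, instance)]
        y = predict(sample)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    coefs = solve_linear_system(xtwx, xtwy)
    return coefs[0], coefs[1:]


def toy_model(x):
    """Hypothetical nonlinear black box standing in for a fraud classifier."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x[0] - x[1] ** 2)))


intercept, weights = local_surrogate(toy_model, [0.5, 1.0])
print(intercept, weights)
```

The signs and magnitudes of `weights` play the role of the bars in a LIME feature importance plot: they describe how the prediction moves near this one instance, not globally.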

Installing SHAP

In [3]:
pip install shap

Note: you may need to restart the kernel to use updated packages.


In [11]:
import shap
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from data_analysis_and_preprocessing import load_data
from sklearn.model_selection import train_test_split
import pandas as pd

In [8]:
Fraud_data=load_data(r'C:\Users\ASUS VIVO\Desktop\e-commerce\Improved-detection-of-fraud-cases-for-e-commerce-and-bank-transactions\data\data\Fraud_data_1.csv')
creditcard=load_data(r'C:\Users\ASUS VIVO\Desktop\e-commerce\Improved-detection-of-fraud-cases-for-e-commerce-and-bank-transactions\data\data\creditcard_final.csv')




In [12]:
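# Avoid naming the variable `list`, which would shadow the built-in type
datetime_columns = ['signup_time', 'purchase_time']
for column in datetime_columns:
    Fraud_data[column] = pd.to_datetime(Fraud_data[column])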

In [13]:
X_Fraud_data=Fraud_data.drop(['class','signup_time','purchase_time','device_id'],axis=1)
Y_Fraud_data=Fraud_data['class']
X_creditcard=creditcard.drop('Class',axis=1)
Y_creditcard=creditcard['Class']

In [14]:

X_train_fraud, X_test_fraud, y_train_fraud, y_test_fraud = train_test_split(X_Fraud_data, Y_Fraud_data, test_size=0.2, random_state=42)

X_train_creditcard, X_test_creditcard, y_train_creditcard, y_test_creditcard = train_test_split(X_creditcard, Y_creditcard, test_size=0.2, random_state=42)

In [None]:
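# class_weight='balanced' compensates for the rarity of fraud cases;
# random_state makes the fitted forest reproducible, matching the split above
model_fraud = RandomForestClassifier(class_weight='balanced', random_state=42)
model_fraud.fit(X_train_fraud, y_train_fraud)

# TreeExplainer is SHAP's explainer for tree ensembles such as random forests
explainer = shap.TreeExplainer(model_fraud)
shap_values = explainer.shap_values(X_train_fraud)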

Summary Plot: Provides an overview of the most important features.

In [None]:
shap.summary_plot(shap_values,X_train_fraud)
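
As a rough illustration of how a global ranking can be derived from per-row SHAP values, the sketch below averages the absolute SHAP value of each feature, which is the quantity the bar variant of the summary plot displays. The matrix `toy_shap` and the feature names are made-up numbers for illustration, not output from the fraud model.

```python
def mean_abs_importance(shap_matrix, feature_names):
    """Rank features by mean absolute SHAP value across rows (largest first)."""
    n_rows = len(shap_matrix)
    scores = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values: one row per transaction, one column per feature
toy_shap = [
    [0.30, -0.05, 0.01],
    [-0.20, 0.10, 0.00],
    [0.25, -0.15, -0.02],
]
features = ["purchase_value", "age", "browser"]  # assumed names

ranking = mean_abs_importance(toy_shap, features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value matters: a feature that pushes some predictions up and others down would otherwise cancel out and look unimportant.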

Force Plot: Visualizes the contribution of features for a single prediction.

In [None]:
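# Explain the first training row for class 1 (fraud); assumes shap_values is a per-class list, as shap_values[1] below does
shap.force_plot(explainer.expected_value[1], shap_values[1][0, :], X_train_fraud.iloc[0, :], matplotlib=True)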

Dependence Plot: Shows the relationship between a feature and the model output.

In [None]:
# Replace 'feature name' with the name of a column in X_train_fraud
shap.dependence_plot('feature name', shap_values[1], X_train_fraud)