# Explainable fraud detection model

In this example we develop a small fraud detection model for credit card transactions based on XGBoost, export it to TorchScript using Hummingbird (https://github.com/microsoft/hummingbird), and run Shapley Value Sampling explanations on it (see https://captum.ai/api/shapley_value_sampling.html for reference) from a TorchScript script.

We load both the original model and the explainability script in RedisAI and trigger them in a DAG.

## Data

For this example we use a dataset of transactions made by credit cards in September 2013 by European cardholders. 
The dataset presents transactions that occurred in two days, with 492 frauds out of 284,807 transactions.

The dataset is available at https://www.kaggle.com/mlg-ulb/creditcardfraud. For anonymity purposes, the features are 28 PCA features (V1 to V28), along with transaction Time and Amount.
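
To make the class imbalance concrete, the short calculation below derives the fraud rate from the two figures quoted above. It also shows the accuracy that a trivial classifier would reach by labelling every transaction as legitimate. The variable names are illustrative only.

```python
# Figures quoted in the dataset description above.
n_frauds = 492
n_transactions = 284_807

fraud_rate = n_frauds / n_transactions
# Accuracy of a trivial classifier that labels every transaction as legitimate.
trivial_accuracy = 1 - fraud_rate

print(f"Fraud rate: {fraud_rate:.4%}")
print(f"Accuracy of always predicting 'not fraud': {trivial_accuracy:.4%}")
```

This is why accuracy alone says little about a model trained on this dataset.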

__In order to run this notebook please download the `creditcard.csv` file from Kaggle and place it in the `data/` directory.__

Once the file is in place, we start by importing Pandas and reading the data. We create a dataframe of covariates and a dataframe of targets.

In [1]:
import pandas as pd
import numpy as np

df = pd.read_csv('data/creditcard.csv')

In [2]:
X = df.drop(['Class'], axis=1)
Y = df['Class']

## Model

We start off by randomly splitting train and test datasets.

In [3]:
from sklearn.model_selection import train_test_split

seed = 7
test_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)

Next we use XGBoost to classify the transactions. Note that we convert the arguments to `fit` to NumPy arrays.

In [4]:
from xgboost import XGBClassifier

model = XGBClassifier(label_encoder=False)
model.fit(X_train.to_numpy(), y_train.to_numpy())



Parameters: { "label_encoder" } might not be used.

  This may not be accurate due to some parameters are only used in language bindings but
  passed down to XGBoost core.  Or some parameters are not used but slip through this
  verification. Please open an issue if you find above cases.




XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints='',
              label_encoder=False, learning_rate=0.300000012, max_delta_step=0,
              max_depth=6, min_child_weight=1, missing=nan,
              monotone_constraints='()', n_estimators=100, n_jobs=16,
              num_parallel_tree=1, random_state=0, reg_alpha=0, reg_lambda=1,
              scale_pos_weight=1, subsample=1, tree_method='exact',
              validate_parameters=1, verbosity=None)

We now obtain predictions on the test dataset. `predict` already returns class labels, so the rounding below leaves them unchanged; it would only matter if the model returned probabilities.

In [5]:
y_pred = model.predict(X_test.to_numpy())
predictions = [round(value) for value in y_pred]

We evaluate the accuracy of our model on the test set. This is only illustrative: the dataset is heavily imbalanced, so accuracy is not a fair measure of performance here.

In [6]:
from sklearn.metrics import accuracy_score, confusion_matrix

accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))

Accuracy: 99.96%


Looking at the confusion matrix gives a clearer picture.

In [7]:
confusion_matrix(y_test, predictions)

array([[93813,     8],
       [   28,   138]])
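
The confusion matrix above can be turned into precision and recall for the fraud class, which are more informative than accuracy on imbalanced data. The sketch below copies the four counts from the output above and assumes scikit-learn's layout (rows are true classes, columns are predicted classes).

```python
# Confusion matrix printed above, in scikit-learn's layout:
# rows are true classes, columns are predicted classes.
tn, fp = 93813, 8
fn, tp = 28, 138

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # share of flagged transactions that are truly fraud
recall = tp / (tp + fn)     # share of frauds that the model catches
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
```

The 28 missed frauds barely affect accuracy, but they are clearly visible in the recall.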

We are interested in exploring the cases of fraud, so we extract them from the test set.

In [8]:
X_test_fraud = X_test[y_test == 1].to_numpy()

We check how many of these fraud cases the model classifies correctly.

In [9]:
model.predict(X_test_fraud) == 1

array([ True,  True,  True,  True,  True,  True,  True, False,  True,
       False,  True,  True,  True, False,  True,  True, False,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
       False,  True,  True,  True,  True, False,  True, False,  True,
       False,  True,  True,  True,  True, False, False,  True,  True,
        True,  True,  True, False,  True, False,  True, False, False,
        True,  True,  True,  True,  True,  True,  True, False,  True,
        True,  True,  True,  True,  True,  True,  True, False, False,
        True,  True, False,  True,  True,  True,  True,  True,  True,
        True,  True, False,  True,  True,  True,  True,  True,  True,
        True,  True,  True, False, False,  True,  True,  True,  True,
        True, False, False,  True,  True,  True, False,  True,  True,
        True,  True,  True,  True,  True,  True,  True, False,  True,
        True,  True, False,  True,  True,  True,  True, False,  True,
        True,  True,

## Exporting to TorchScript with Hummingbird

From the project page (https://github.com/microsoft/hummingbird):

> Hummingbird is a library for compiling trained traditional ML models into tensor computations. Hummingbird allows users to seamlessly leverage neural network frameworks (such as PyTorch) to accelerate traditional ML models.

Hummingbird can take scikit-learn, XGBoost or LightGBM models and export them to PyTorch, TorchScript, ONNX and TVM. This works very well for running ML models on RedisAI and taking advantage of vectorized CPU instructions or GPUs.

We choose to convert the boosted tree to tensor computations using the `gemm` implementation.
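
To give an intuition for what a GEMM-style tree evaluation looks like, here is a toy sketch of the general idea: a decision tree is encoded as a few small matrices, and prediction becomes three matrix products with element-wise comparisons in between. This is not Hummingbird's actual code. The tree, the matrix names `A` to `E`, and the plain-Python `matmul` helper are all assumptions made for illustration; a real implementation would use tensor operations.

```python
def matmul(a, b):
    """Plain-Python matrix product, used here in place of a tensor library."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Toy tree (hypothetical, for illustration only):
#   node0: x[0] < 0.5 ? go to node1 : leaf2
#   node1: x[1] < 2.0 ? leaf0       : leaf1
A = [[1, 0],      # feature 0 is tested by node0
     [0, 1]]      # feature 1 is tested by node1
B = [0.5, 2.0]    # threshold of each internal node
C = [[1, 1, -1],  # node0: leaf0 and leaf1 on the left, leaf2 on the right
     [1, -1, 0]]  # node1: leaf0 on the left, leaf1 on the right
D = [2, 1, 0]     # number of "left" ancestors on the path to each leaf
E = [[0, 1],      # leaf0 -> class 1
     [1, 0],      # leaf1 -> class 0
     [1, 0]]      # leaf2 -> class 0

def predict(X):
    # Step 1: evaluate every node condition for every row at once.
    T = [[int(v < b) for v, b in zip(row, B)] for row in matmul(X, A)]
    # Step 2: a leaf is selected when all its ancestor conditions match its path.
    L = [[int(v == d) for v, d in zip(row, D)] for row in matmul(T, C)]
    # Step 3: map the selected leaf to one-hot class scores and take the argmax.
    scores = matmul(L, E)
    return [row.index(max(row)) for row in scores]

X = [[0.2, 1.0], [0.2, 3.0], [0.9, 1.0]]
print(predict(X))  # [1, 0, 0]
```

Because every step is a dense matrix operation with no data-dependent branching, this formulation maps naturally onto vectorized CPU instructions and GPUs.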

In [10]:
from hummingbird.ml import convert, load

In [11]:
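import torch

extra_config = {"tree_implementation": "gemm"}

# Convert the XGBoost model to TorchScript; test_input is used to trace the model.
hummingbird_model = convert(model, 'torchscript', test_input=X_test_fraud, extra_config=extra_config)
# Save the underlying TorchScript module so that it can be loaded into RedisAI.
torch.jit.save(hummingbird_model.model, "models/fraud_detection_model.pt")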

At this point, `hummingbird_model` is an object wrapping a TorchScript model (`hummingbird_model.model`), which the cell above saved to `models/fraud_detection_model.pt`.

We can verify everything works by loading the model and running a prediction. The model outputs a tuple containing the predicted classes and the output probabilities.

In [12]:
loaded_model = torch.jit.load("models/fraud_detection_model.pt")

X_test_fraud_tensor = torch.from_numpy(X_test_fraud)

loaded_output_classes, loaded_output_probs = loaded_model(X_test_fraud_tensor)

We can now compare against the original output from the XGBoost model.

In [13]:
xgboost_output_classes = torch.from_numpy(model.predict(X_test_fraud))

torch.equal(loaded_output_classes, xgboost_output_classes)

True

## Explainer Script

The script `torch_shapley.py` is a TorchScript script designed specifically to run on RedisAI. It uses the RedisAI extension for TorchScript, which allows a script to run any model stored in RedisAI. Let's go over the details:

In RedisAI, each entry point (a function in the script) should have the signature:
`function_name(tensors: List[Tensor], keys: List[str], args: List[str]):`
In our case the entry point is `shapley_sample(tensors: List[Tensor], keys: List[str], args: List[str]):` and the parameters are:
```
Tensors:
    tensors[0] - x: input tensor to the model
    tensors[1] - baselines (optional): reference values that replace each feature
        when it is ablated; if no baselines are provided, they are set to all zeros

Keys:
    keys[0] - model_key: Redis key under which the model is stored as a RedisAI model

Args:
    args[0] - n_samples: number of random feature permutations performed
    args[1] - number_of_outputs: number of model outputs
    args[2] - output_tensor_index: index of the tested output tensor
    args[3] - target (optional): output indices for which Shapley Value Sampling is
        computed; if the model returns a single scalar, target can be None
```

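The script draws `n_samples` random permutations of the input features. For each permutation it adds the features one at a time in the permuted order, starting from the baseline, and records how much the model output changes each time; a feature's attribution is the average of its changes across permutations. This requires running the model repeatedly on inputs in which only a subset of the features comes from the real input.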
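
The procedure can be sketched in a few lines of plain Python. This is a minimal illustration of Shapley Value Sampling for a single input and a scalar-valued model, not the code of the RedisAI script. The function name `shapley_value_sampling`, its `seed` parameter, and the linear toy model are assumptions introduced for the example.

```python
import random

def shapley_value_sampling(model, x, baseline, n_samples, seed=0):
    """Estimate per-feature attributions for a single input `x`."""
    rng = random.Random(seed)
    n_features = len(x)
    attributions = [0.0] * n_features
    for _ in range(n_samples):
        order = list(range(n_features))
        rng.shuffle(order)            # one random feature permutation
        current = list(baseline)      # start with every feature ablated
        previous_output = model(current)
        for i in order:
            current[i] = x[i]         # reveal feature i
            output = model(current)
            attributions[i] += output - previous_output
            previous_output = output
    return [a / n_samples for a in attributions]

# Hypothetical linear model, for which the exact Shapley value of feature i
# is weights[i] * (x[i] - baseline[i]).
weights = [2.0, -1.0, 0.5]

def linear_model(features):
    return sum(w * f for w, f in zip(weights, features))

x = [1.0, 3.0, 4.0]
baseline = [0.0, 0.0, 0.0]
estimate = shapley_value_sampling(linear_model, x, baseline, n_samples=20)
print(estimate)
```

Each permutation costs one model call per feature plus one for the baseline. Within a permutation the attributions always sum to the difference between the model output on `x` and on the baseline, so the averaged estimate does too.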


## Serving model and explainer in RedisAI

At this point we can load the exported model into RedisAI and serve it from there. We also load the `torch_shapley.py` script, which computes Shapley values for a model from within RedisAI. After making sure RedisAI is running, we initialize the client.

In [14]:
import redisai

rai = redisai.Client()

We read the model and the script.

In [15]:
with open("models/fraud_detection_model.pt", "rb") as f:
    fraud_detection_model_blob = f.read()

with open("torch_shapley.py", "rb") as f:
    shapely_script = f.read()

We load both model and script into RedisAI.

In [16]:
rai.modelstore("fraud_detection_model", "TORCH", "CPU", fraud_detection_model_blob)
rai.scriptstore("shapley_script", device='CPU', script=shapely_script, entry_points=["shapley_sample"])

'OK'

All set, it's now test time. We reuse the `X_test_fraud` NumPy array we created previously: we store it as a tensor, run the Shapley script, and retrieve the explanations as an array.

In [17]:
rai.tensorset("fraud_input", X_test_fraud, dtype="float")

rai.scriptexecute("shapley_script", "shapley_sample", inputs = ["fraud_input"], keys = ["fraud_detection_model"], args = ["20", "2", "0"], outputs=["fraud_explanations"])

rai_expl = rai.tensorget("fraud_explanations")

winning_feature_redisai = np.argmax(rai_expl[0], axis=0)

print("Winning feature: %d" % winning_feature_redisai)

Winning feature: 14


Alternatively, we can set up a RedisAI DAG and run everything in one go.

In [18]:
dag = rai.dag(routing="fraud_detection_model")
dag.tensorset("fraud_input", X_test_fraud, dtype="float")
dag.modelexecute("fraud_detection_model", "fraud_input", ["fraud_pred", "fraud_prob"])
dag.scriptexecute("shapley_script", "shapley_sample", inputs = ["fraud_input"], keys = ["fraud_detection_model"], args = ["20", "2", "0"], outputs=["fraud_explanations"])
dag.tensorget("fraud_pred")
dag.tensorget("fraud_explanations")

<redisai.dag.Dag at 0x7f28845b6eb0>

We now request a DAG execution, which will produce the desired outputs. The input tensor is set inside the DAG, so there is no need to set it separately.

In [19]:
# rai.tensorset("fraud_input", X_test_fraud, dtype="float")

_, _, _, dag_pred, dag_expl = dag.execute()

In [20]:
dag_pred

array([1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1,
       1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1])

We can now check that the winning feature matches what we computed earlier for the first sample in the test batch.

In [21]:
winning_feature_redisai_dag = np.argmax(dag_expl[0])

print("Winning feature: %d" % winning_feature_redisai_dag)

Winning feature: 14


In [22]:
dag_expl[0]

array([ 0.  ,  0.  ,  0.05,  0.1 ,  0.15,  0.  ,  0.  ,  0.  ,  0.  ,
       -0.05,  0.05,  0.  ,  0.1 ,  0.  ,  0.6 ,  0.  ,  0.  ,  0.05,
       -0.05,  0.  ,  0.05, -0.05,  0.  , -0.1 ,  0.1 ,  0.  ,  0.  ,
       -0.05,  0.  ,  0.05])

## Compare to Captum
Captum provides the reference implementation of Shapley Value Sampling, so we now compare our implementation against it:

In [23]:
from captum.attr import ShapleyValueSampling
import torch
def forward_func(*x):
    return hummingbird_model.model(*x)[0]

gs = ShapleyValueSampling(forward_func)
shapley_values = gs.attribute(inputs=torch.from_numpy(X_test_fraud), baselines = torch.zeros(X_test_fraud.shape), target=None, n_samples=20)


In [24]:
np.argmax(shapley_values.numpy(), axis=1)


array([14, 14, 14, 14, 14, 12, 14,  0, 14,  0, 14, 14, 14,  0, 14, 14,  0,
       14, 14, 12, 14,  4, 14, 14, 14, 14, 14,  0, 14, 14, 14, 14,  0, 14,
        0, 14,  0, 14, 14, 14, 14, 14,  0, 14, 14, 14, 14, 14,  0, 14, 14,
       14,  7, 14, 14, 14, 14, 12, 14, 14, 14,  0, 14, 14, 14, 14, 14, 14,
        4, 14, 14, 14, 14, 14,  0, 14, 14, 14, 14, 14, 14, 14, 14,  0, 14,
       14, 14, 14, 14,  4, 14, 14, 14,  0,  0, 14, 12, 14, 14, 14,  0,  0,
       14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,  4, 14,  0, 14, 14, 14,
        0, 14, 14, 14, 14,  0, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
       14, 14, 14, 14, 14, 14, 14, 14, 14, 17, 14, 14, 14, 14,  4, 14, 14,
       14, 14, 14, 14, 14, 14, 14,  0, 14, 14, 14, 14, 14])

In [25]:
np.argmax(dag_expl, axis=1)


array([14, 14, 14, 14, 14, 14, 14,  0, 14,  0, 14, 14, 14,  0, 14, 14,  0,
       14, 14, 14, 14, 12, 14, 14, 12, 14, 14,  0, 14, 14, 14, 14,  0, 14,
        0, 14,  0, 14, 14, 14, 14,  7,  0, 14, 14, 14, 14, 14,  0,  7,  7,
       14, 23, 10, 14, 14, 12, 14, 10, 14, 14,  0, 14, 14, 12, 14, 14, 14,
       14, 14, 10, 10, 14, 14,  0, 14, 12, 14, 14, 14, 14, 14, 14,  0, 14,
       14, 14, 14, 14, 12, 14, 14, 14,  0,  0, 14,  4, 14, 14, 14,  0,  0,
       14, 14, 12, 10, 14, 14, 14, 10, 14, 14, 14, 17, 14,  0, 14, 14, 14,
        0, 14, 14, 14, 14,  0, 14, 10, 14, 14, 14, 14, 14, 14, 14, 10, 14,
       14, 14, 14, 14, 14, 14, 12, 14, 14, 12, 14, 14, 10,  4, 12, 14, 14,
       14, 14, 14, 14, 10, 14, 14,  0, 14, 14, 14, 14, 14])
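
Both results are Monte Carlo estimates over 20 random permutations, so some disagreement between them is expected. One simple way to quantify it is the fraction of samples for which the two implementations pick the same top feature. The sketch below uses a hypothetical helper, `agreement_rate`, and, to stay self-contained, only the first ten entries of each output above; in the notebook one would pass the two full `np.argmax(...)` arrays instead.

```python
def agreement_rate(a, b):
    """Fraction of positions where two equally long label sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

# First ten entries of the two outputs above (Captum, then RedisAI).
captum_top = [14, 14, 14, 14, 14, 12, 14, 0, 14, 0]
redisai_top = [14, 14, 14, 14, 14, 14, 14, 0, 14, 0]

print(agreement_rate(captum_top, redisai_top))
```

Increasing `n_samples` on both sides should reduce the sampling noise and bring the two sets of attributions closer together, at the cost of more model evaluations.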