# Neural Network Inference for Fraud Detection Using FHE (simple)

**NOTE:**

This notebook closely follows the Neural Network Fraud Detection example in Notebook 02, but it is much simpler and uses a single line of FHE code. Unlike Notebook 02, you do not have to call the Optimizer, specify parameters, or encode and encrypt the data yourself; you call a single FHE command through the pyhelayers extension API.

This example demonstrates the *pyhelayersext API*, which integrates with the Keras library and replaces the Keras predictions with an FHE implementation. The FHE configuration details are taken from the `fhe.json` configuration file, which contains FHE parameters that the user can tune (e.g., batch size and security level).
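
The idea of swapping a model's prediction function for another backend can be sketched in plain Python. The `ToyModel` class and the `replace` helper below are hypothetical stand-ins written for this illustration; they are not the pyhelayers implementation and do not perform any encryption. They only show how a bound `predict` method can be wrapped and reassigned so that calling code stays unchanged.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in for a trained Keras model."""

    def predict(self, x):
        # Toy "inference": sum the features of each sample.
        return np.asarray(x, dtype=float).sum(axis=1, keepdims=True)


def replace(predict_fn, tag="alt"):
    """Hypothetical helper that wraps an existing predict function.

    A real FHE extension would run the inference on encrypted data.
    This wrapper only records each call and delegates to the original.
    """
    calls = []

    def wrapped(x):
        calls.append(tag)
        return predict_fn(x)

    wrapped.calls = calls
    return wrapped


model = ToyModel()
samples = [[1.0, 2.0], [3.0, 4.0]]

plain = model.predict(samples)

# Reassign the method on the instance; callers keep using model.predict.
model.predict = replace(model.predict, tag="fhe-stand-in")
swapped = model.predict(samples)
```

Because the wrapper keeps the original call signature, the rest of a notebook can call `model.predict(...)` exactly as before.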

#### This demo uses the Credit Card Fraud Detection dataset, originally taken from: https://www.kaggle.com/mlg-ulb/creditcardfraud 
This dataset contains actual anonymized transactions made by credit card holders in September 2013, each labeled as fraudulent or genuine. See the references at the bottom of the page.

<br>

#### Step 1. Import different machine learning libraries

In [None]:
import os
import utils 

utils.verify_memory()

##### For reproducibility
seed_value= 1
os.environ['PYTHONHASHSEED']=str(seed_value)
import random
random.seed(seed_value)
import numpy as np
np.random.seed(seed_value)
import tensorflow as tf
tf.random.set_seed(seed_value)
#####
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.models import Sequential
import h5py
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
from sklearn import metrics
#####
# Make the local helper modules importable (utils is already imported above)
import sys
path_to_utils = '.'
sys.path.append(path_to_utils)
# import activations
sys.path.append(os.path.join(path_to_utils, 'data_gen'))
from activations import SquareActivation

epochs = 3
batch_size = 32 
optimizer = Adam
lr = 0.01
print("misc. init complete")

PATH = os.path.join(utils.get_data_sets_dir(), 'net_fraud')

<br>

#### Step 2. Read the data set

In [None]:
df = pd.read_csv(os.path.join(utils.get_data_sets_dir(path_to_utils), 'net_fraud', 'creditcard.csv'))

print(f'Reading {df.shape[0]} samples')

X = df.loc[:, df.columns.tolist()[1:30]].values
Y = df.loc[:, 'Class'].values
print(f'number of features: {X.shape[1]}')

X = preprocessing.normalize(X)

x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, stratify=Y, random_state=0)
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.33, random_state=0)
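
With its default arguments, `preprocessing.normalize` rescales each sample (row) to unit L2 norm rather than standardizing each feature column. The NumPy sketch below reproduces that row-wise behavior on a tiny made-up array; `l2_normalize_rows` and `sample` are illustrative names that do not appear in the notebook.

```python
import numpy as np


def l2_normalize_rows(x):
    """Scale every row to unit Euclidean length; all-zero rows stay zero."""
    x = np.asarray(x, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid dividing by zero
    return x / norms


sample = np.array([[3.0, 4.0],
                   [0.0, 0.0],
                   [1.0, 1.0]])
normalized = l2_normalize_rows(sample)
row_norms = np.linalg.norm(normalized, axis=1)
```

Each non-zero row of `normalized` has length 1, so only the direction of a sample's feature vector is kept, not its magnitude.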


<br>

#### Step 3. Replicate the smallest class for balancing 

In [None]:
def replicate_smallest_class(x, y, class_id):
    """Append 5 extra copies of the samples labeled class_id, then shuffle."""
    y_fraud_list = y[y == class_id]
    x_fraud_list = x[y == class_id]

    for _ in range(5):
        copy_fraudlist = np.copy(x_fraud_list)
        y_fraud_copy = np.copy(y_fraud_list)
        x = np.concatenate((x, copy_fraudlist))
        y = np.concatenate((y, y_fraud_copy))

    # Shuffle samples and labels with the same permutation
    permut = np.random.permutation(x.shape[0])
    x = x[permut]
    y = y[permut]

    return x, y

x_train, y_train = replicate_smallest_class(x_train, y_train, class_id=1)

nb_train_samples = (x_train.shape[0] // batch_size) * batch_size
x_train = x_train[:nb_train_samples]
y_train = y_train[:nb_train_samples]


print("After replicating items from the smaller class:")
print(f'x_train: {x_train.shape}')
print(f'x_val: {x_val.shape}')
print(f'x_test: {x_test.shape}')
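
The effect of the replication and of the truncation to a multiple of the batch size can be checked on a small synthetic example. The array sizes below are made up for illustration and are not taken from the dataset. Five extra copies raise the minority count sixfold, which reduces the imbalance but does not remove it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 95 samples of class 0 and 5 samples of class 1.
y = np.array([0] * 95 + [1] * 5)
x = rng.normal(size=(100, 3))

# Append 5 extra copies of the minority class, as in the step above.
copies = 5
x_min, y_min = x[y == 1], y[y == 1]
x_bal = np.concatenate([x] + [x_min] * copies)
y_bal = np.concatenate([y] + [y_min] * copies)
minority_after = int((y_bal == 1).sum())

# Shuffle samples and labels with the same permutation.
perm = rng.permutation(x_bal.shape[0])
x_bal, y_bal = x_bal[perm], y_bal[perm]

# Drop the tail so the sample count is a multiple of the batch size.
batch = 32
usable = (x_bal.shape[0] // batch) * batch
x_bal, y_bal = x_bal[:usable], y_bal[:usable]
```

Here 125 samples remain after replication, and the truncation keeps 96 of them (three full batches of 32).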

<br>

#### Step 4. Reshape labels

In [None]:
y_train = y_train.reshape(y_train.shape[0], -1)
y_val = y_val.reshape(y_val.shape[0], -1)
y_test = y_test.reshape(y_test.shape[0], -1)

print("Training data ready")
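
The reshape turns a flat label vector of shape `(n,)` into a column of shape `(n, 1)`, which matches the single output unit of the network defined later. A minimal check with made-up labels:

```python
import numpy as np

labels = np.array([0, 1, 0, 0])

# -1 lets NumPy infer the second dimension, which is 1 for a flat vector.
column = labels.reshape(labels.shape[0], -1)
```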

<br>

#### Step 5. Save the data set

In [None]:
def save_data_set(x, y, data_type, s=''):
    """Write x and y to separate HDF5 files under PATH."""
    print(f"Saving x_{data_type} of shape {x.shape}")
    # The context managers close each file even if writing fails
    with h5py.File(os.path.join(PATH, f'x_{data_type}{s}.h5'), 'w') as xf:
        xf.create_dataset(f'x_{data_type}', data=x)

    with h5py.File(os.path.join(PATH, f'y_{data_type}{s}.h5'), 'w') as yf:
        yf.create_dataset(f'y_{data_type}', data=y)
    
save_data_set(x_test, y_test, data_type='test')

<br>

#### Step 6. Fraud detection network

In [None]:
model = Sequential()

model.add(Dense(20, input_shape=(x_train.shape[1],)))
model.add(SquareActivation())
model.add(Dense(5))
model.add(SquareActivation())
model.add(Dense(1))
model.add(SquareActivation())

model.compile(loss='binary_crossentropy',
              optimizer=optimizer(learning_rate=lr),
              metrics=['accuracy'])

model.summary()
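
The network above can be mirrored by a short NumPy forward pass. This is a sketch under one assumption: that `SquareActivation` computes the elementwise square of its input (the `activations` module is not shown in this notebook). The weights are random placeholders, not trained values. Polynomial activations such as the square are a common choice for FHE inference because they need only additions and multiplications.

```python
import numpy as np


def square(z):
    """Assumed behavior of SquareActivation: elementwise z**2."""
    return z * z


def forward(x, params):
    """Apply Dense -> square for every (weights, bias) pair in params."""
    a = x
    for w, b in params:
        a = square(a @ w + b)
    return a


rng = np.random.default_rng(1)
sizes = [29, 20, 5, 1]  # 29 input features, then Dense(20), Dense(5), Dense(1)
params = [
    (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

x_demo = rng.normal(size=(4, 29))
out = forward(x_demo, params)
n_params = sum(w.size + b.size for w, b in params)
```

Because the last layer is also squared, every output is non-negative, but it is not bounded above by 1 the way a sigmoid output would be.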

<br>

#### Step 7. Train the neural network model

In [None]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=2,
          validation_data=(x_val, y_val),
          shuffle=True)
score = model.evaluate(x_test, y_test, verbose=0)

print(f'Test loss: {score[0]:.3f}')
print(f'Test accuracy: {score[1] * 100:.3f}%')

<br>

#### Step 8. Define the batch size

In [None]:
# Keep only the first batch_size test samples for the FHE prediction
batch_size = 4096
x_test = x_test[:batch_size, :]
y_test = y_test[:batch_size, :]


After the replacement below, `model.predict` no longer runs the regular Keras code; it runs code that predicts on encrypted data. The confusion matrix on the test set is computed from these predictions.

<br>

#### Step 9. Perform FHE prediction
The first line of the cell below is the only line of code that deals with FHE! All you need to add is the pyhelayers extension.

In [None]:
model.predict = __import__('pyhelayers.ext').ext.replace(model.predict, config_file='./fhe.json')
utils.start_timer()
y_pred_vals = model.predict(x_test)
y_pred = (y_pred_vals > 0.5).astype(np.int32)
utils.end_timer("NN prediction")
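
A useful sanity check, not part of the original notebook, is to compare scores from the replaced `predict` with plaintext scores and confirm that the thresholded labels agree. The sketch below uses made-up scores and a small simulated error; it assumes the encrypted computation introduces only small numerical deviations, which should be verified for the actual configuration.

```python
import numpy as np

plain_scores = np.array([[0.02], [0.49], [0.51], [0.97]])

# Simulated deviation standing in for the encrypted computation (assumption).
noisy_scores = plain_scores + np.array([[1e-4], [-1e-4], [1e-4], [-1e-4]])

# Same thresholding as in the step above.
plain_labels = (plain_scores > 0.5).astype(np.int32)
noisy_labels = (noisy_scores > 0.5).astype(np.int32)

max_abs_err = float(np.max(np.abs(noisy_scores - plain_scores)))
labels_agree = bool(np.array_equal(plain_labels, noisy_labels))
```

Scores close to the 0.5 threshold are the ones most likely to flip under a small deviation, so they deserve a closer look if the labels disagree.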

<br>

#### Step 10. Test the results

In [None]:
# Note: the ROC curve here is computed from the thresholded labels in y_pred
fpr, tpr, thresholds = metrics.roc_curve(y_test, y_pred)
cm = metrics.confusion_matrix(y_test, y_pred)
print(f"AUC Score: {metrics.auc(fpr, tpr):.3f}")
print("Classification report:")
print(metrics.classification_report(y_test, y_pred))
print("Confusion Matrix:")
print(cm)
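
To make the reported numbers easier to read, the sketch below rebuilds a confusion matrix by hand in NumPy, using the same layout as `metrics.confusion_matrix` (rows are true classes, columns are predicted classes). The labels are made up for illustration. It also shows that when the ROC curve is built from hard 0/1 predictions, as in the cell above, the curve has a single interior point and the area under it equals the mean of the true positive rate and the true negative rate.

```python
import numpy as np


def confusion_counts(y_true, y_hat):
    """Return [[TN, FP], [FN, TP]] for binary labels."""
    y_true = np.asarray(y_true).ravel()
    y_hat = np.asarray(y_hat).ravel()
    tn = int(np.sum((y_true == 0) & (y_hat == 0)))
    fp = int(np.sum((y_true == 0) & (y_hat == 1)))
    fn = int(np.sum((y_true == 1) & (y_hat == 0)))
    tp = int(np.sum((y_true == 1) & (y_hat == 1)))
    return np.array([[tn, fp], [fn, tp]])


y_true_demo = np.array([0, 0, 0, 0, 1, 1, 1, 0])
y_hat_demo = np.array([0, 0, 1, 0, 1, 0, 1, 0])

cm_demo = confusion_counts(y_true_demo, y_hat_demo)
tn, fp, fn, tp = (int(v) for v in cm_demo.ravel())

precision = tp / (tp + fp)
recall = tp / (tp + fn)        # true positive rate
specificity = tn / (tn + fp)   # true negative rate

# Area under a ROC curve with one interior point (hard predictions).
auc_hard = (recall + specificity) / 2
```

Passing the raw scores instead of the thresholded labels to a ROC computation would use every distinct score as a threshold and generally gives a different AUC.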

<br>

References:

<sub><sup> 1.	Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015 </sup></sub>

<sub><sup> 2.	Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Aël; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications, 41, 10, 4915-4928, 2014, Pergamon </sup></sub>

<sub><sup> 3.	Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems, 29, 8, 3784-3797, 2018, IEEE </sup></sub>

<sub><sup> 4.	Dal Pozzolo, Andrea. Adaptive Machine learning for credit card fraud detection, ULB MLG PhD thesis (supervised by G. Bontempi) </sup></sub>

<sub><sup> 5.	Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion, 41, 182-194, 2018, Elsevier </sup></sub>

<sub><sup> 6.	Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5, 4, 285-300, 2018, Springer International Publishing </sup></sub>

<sub><sup> 7.	Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019 </sup></sub>

<sub><sup> 8.	Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi. Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection, Information Sciences, 2019 </sup></sub>

<sub><sup> 9.	Yann-Aël Le Borgne, Gianluca Bontempi. Machine Learning for Credit Card Fraud Detection - Practical Handbook </sup></sub>