## "Detection of Anomalies in Financial Transactions using Deep Autoencoder Networks"

This GPU Technology Conference (GTC) 2018 lab was developed by Mr. X and Mr. Y.

## 01. Environment Verification

#### 01.1 Python Verification

Before we begin, let's verify that Python is working on your system. To do this, give the cell below focus by clicking on it, then press Shift-Enter or click the play button in the toolbar above. If all goes well, you should see some output returned below the grey cell.

In [4]:
print("The answer should be forty-two: " + str(40+2))

The answer should be forty-two: 42


#### 01.2 CUDNN / GPU Verification

In [27]:
# print CUDNN backend version
from datetime import datetime
import torch

now = datetime.utcnow().strftime("%Y%m%d-%H:%M:%S")
print('[LOG TRAIN {}] CUDNN backend version: {}'.format(now, torch.backends.cudnn.version()))

[PT LOG TRAIN 20180124-15:17:49] CUDNN backend version: None


Let's execute the cell below to display information about the GPUs running on the server.

In [5]:
!nvidia-smi

/bin/sh: nvidia-smi: command not found
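
The `command not found` message above indicates that `nvidia-smi` is not on this machine's `PATH`, which usually means the NVIDIA driver tools are not installed. As a minimal sketch (the helper name `gpu_report` is hypothetical and not part of the lab), the check can be made explicit with the standard library before the tool is called:

```python
import shutil
import subprocess

def gpu_report():
    """Return the nvidia-smi output, or a short notice if the tool is missing."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return "nvidia-smi not found on PATH"
    # run the tool and capture stdout and stderr as text
    result = subprocess.run([exe], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout

print(gpu_report())
```

On a machine without the tool, this prints the notice instead of a shell error.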


#### 01.3 Import Python Libraries

In [11]:
# importing utilities
import os
from datetime import datetime
from IPython.display import Image

# importing pytorch libraries
import torch
from torch import nn
from torch import autograd
from torch.utils.data import DataLoader

# importing data science libraries
import pandas as pd
import random as rd
import numpy as np

In [28]:
# print current PyTorch version
now = datetime.utcnow().strftime("%Y%m%d-%H:%M:%S")
print('[LOG TRAIN {}] PyTorch version: {}'.format(now, torch.__version__))

[LOG TRAIN 20180124-15:19:31] PyTorch version: 0.3.0.post4


## 02. Lab Overview

ToDo -- Timur and Marco

<img align="middle" style="max-width: 550px; height: auto" src="images/accounting.png">

## 03. Autoencoder Neural Networks

#### 03.1 Introduction to Autoencoder Neural Networks

<img align="middle" style="max-width: 600px; height: auto" src="images/autoencoder.png">

#### 03.2 Implementing the Encoder Network

In [21]:
class encoder(nn.Module):

    def __init__(self):

        super(encoder, self).__init__()

        self.dropout = nn.Dropout(p=0.0, inplace=True)

        self.encoder_L1 = nn.Linear(401, 512, bias=True)
        nn.init.xavier_uniform(self.encoder_L1.weight)
        self.encoder_R1 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L2 = nn.Linear(512, 256, bias=True)
        nn.init.xavier_uniform(self.encoder_L2.weight)
        self.encoder_R2 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L3 = nn.Linear(256, 128, bias=True)
        nn.init.xavier_uniform(self.encoder_L3.weight)
        self.encoder_R3 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L4 = nn.Linear(128, 64, bias=True)
        nn.init.xavier_uniform(self.encoder_L4.weight)
        self.encoder_R4 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L5 = nn.Linear(64, 32, bias=True)
        nn.init.xavier_uniform(self.encoder_L5.weight)
        self.encoder_R5 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L6 = nn.Linear(32, 16, bias=True)
        nn.init.xavier_uniform(self.encoder_L6.weight)
        self.encoder_R6 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L7 = nn.Linear(16, 8, bias=True)
        nn.init.xavier_uniform(self.encoder_L7.weight)
        self.encoder_R7 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L8 = nn.Linear(8, 4, bias=True)
        nn.init.xavier_uniform(self.encoder_L8.weight)
        self.encoder_R8 = nn.LeakyReLU(negative_slope= 0.4, inplace=True)

        self.encoder_L9 = nn.Linear(4, 3, bias=True)
        nn.init.xavier_uniform(self.encoder_L9.weight)
        self.encoder_R9 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

    def forward(self, x):

        x = self.encoder_R1(self.dropout(self.encoder_L1(x)))
        x = self.encoder_R2(self.dropout(self.encoder_L2(x)))
        x = self.encoder_R3(self.dropout(self.encoder_L3(x)))
        x = self.encoder_R4(self.dropout(self.encoder_L4(x)))
        x = self.encoder_R5(self.dropout(self.encoder_L5(x)))
        x = self.encoder_R6(self.dropout(self.encoder_L6(x)))
        x = self.encoder_R7(self.dropout(self.encoder_L7(x)))
        x = self.encoder_R8(self.dropout(self.encoder_L8(x)))
        x = self.encoder_R9(self.encoder_L9(x))

        return x

#### 03.3 Implementing the Decoder Network

In [22]:
class decoder(nn.Module):

    def __init__(self):

        super(decoder, self).__init__()

        self.dropout = nn.Dropout(p=0.0, inplace=True)

        self.decoder_L1 = nn.Linear(3, 4, bias=True)
        nn.init.xavier_uniform(self.decoder_L1.weight)
        self.decoder_R1 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L2 = nn.Linear(4, 8, bias=True)
        nn.init.xavier_uniform(self.decoder_L2.weight)
        self.decoder_R2 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L3 = nn.Linear(8, 16, bias=True)
        nn.init.xavier_uniform(self.decoder_L3.weight)
        self.decoder_R3 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L4 = nn.Linear(16, 32, bias=True)
        nn.init.xavier_uniform(self.decoder_L4.weight)
        self.decoder_R4 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L5 = nn.Linear(32, 64, bias=True)
        nn.init.xavier_uniform(self.decoder_L5.weight)
        self.decoder_R5 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L6 = nn.Linear(64, 128, bias=True)
        nn.init.xavier_uniform(self.decoder_L6.weight)
        self.decoder_R6 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L7 = nn.Linear(128, 256, bias=True)
        nn.init.xavier_uniform(self.decoder_L7.weight)
        self.decoder_R7 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L8 = nn.Linear(256, 512, bias=True)
        nn.init.xavier_uniform(self.decoder_L8.weight)
        self.decoder_R8 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

        self.decoder_L9 = nn.Linear(512, 401, bias=True)
        nn.init.xavier_uniform(self.decoder_L9.weight)
        self.decoder_R9 = nn.LeakyReLU(negative_slope=0.4, inplace=True)

    def forward(self, x):

        x = self.decoder_R1(self.dropout(self.decoder_L1(x)))
        x = self.decoder_R2(self.dropout(self.decoder_L2(x)))
        x = self.decoder_R3(self.dropout(self.decoder_L3(x)))
        x = self.decoder_R4(self.dropout(self.decoder_L4(x)))
        x = self.decoder_R5(self.dropout(self.decoder_L5(x)))
        x = self.decoder_R6(self.dropout(self.decoder_L6(x)))
        x = self.decoder_R7(self.dropout(self.decoder_L7(x)))
        x = self.decoder_R8(self.dropout(self.decoder_L8(x)))
        x = self.decoder_R9(self.decoder_L9(x))
        
        return x

In [23]:
# init the encoder and decoder architectures
# (note: this rebinds the class names to model instances)
encoder = encoder()
decoder = decoder()

# push the models to the GPU if cuDNN is available
if torch.backends.cudnn.version() is not None:
    encoder = encoder.cuda()
    decoder = decoder.cuda()
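
The cell above moves the models to the GPU when cuDNN reports a version. A minimal sketch of the same idea follows, assuming a PyTorch release of 0.4 or later (newer than the 0.3.0 used in this lab), where `torch.device` and `Module.to` are available. The small `toy_model` is a hypothetical stand-in for the encoder, not the lab's architecture:

```python
import torch
from torch import nn

# pick the GPU when CUDA is usable, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# hypothetical stand-in for the encoder: 401 input features -> 3 latent units
toy_model = nn.Sequential(
    nn.Linear(401, 16),
    nn.LeakyReLU(negative_slope=0.4),
    nn.Linear(16, 3),
).to(device)

# inputs must live on the same device as the model
batch = torch.rand(5, 401, device=device)
latent = toy_model(batch)

print(device, tuple(latent.shape))
```

Keeping a single `device` object avoids repeating the availability check each time a tensor or model is created.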

In [25]:
# print the initialized architectures
now = datetime.utcnow().strftime("%Y%m%d-%H:%M:%S")
print('[LOG {}] encoder architecture:\n\n{}\n'.format(now, encoder))
print('[LOG {}] decoder architecture:\n\n{}\n'.format(now, decoder))

[LOG 20180124-15:12:42] encoder architecture:

encoder(
  (dropout): Dropout(p=0.0, inplace)
  (encoder_L1): Linear(in_features=401, out_features=512)
  (encoder_R1): LeakyReLU(0.4, inplace)
  (encoder_L2): Linear(in_features=512, out_features=256)
  (encoder_R2): LeakyReLU(0.4, inplace)
  (encoder_L3): Linear(in_features=256, out_features=128)
  (encoder_R3): LeakyReLU(0.4, inplace)
  (encoder_L4): Linear(in_features=128, out_features=64)
  (encoder_R4): LeakyReLU(0.4, inplace)
  (encoder_L5): Linear(in_features=64, out_features=32)
  (encoder_R5): LeakyReLU(0.4, inplace)
  (encoder_L6): Linear(in_features=32, out_features=16)
  (encoder_R6): LeakyReLU(0.4, inplace)
  (encoder_L7): Linear(in_features=16, out_features=8)
  (encoder_R7): LeakyReLU(0.4, inplace)
  (encoder_L8): Linear(in_features=8, out_features=4)
  (encoder_R8): LeakyReLU(0.4, inplace)
  (encoder_L9): Linear(in_features=4, out_features=3)
  (encoder_R9): LeakyReLU(0.4, inplace)
)

[LOG 20180124-15:12:42] decoder archit
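
The layer sizes printed above determine how many trainable parameters each network has: a `Linear` layer with `n_in` inputs and `n_out` outputs holds `n_in * n_out` weights plus `n_out` biases. As a small worked example (the helper `count_linear_params` is hypothetical), the totals for the encoder and decoder defined in sections 03.2 and 03.3 can be computed in plain Python:

```python
def count_linear_params(sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

# layer widths taken from the encoder definition; the decoder mirrors them
encoder_sizes = [401, 512, 256, 128, 64, 32, 16, 8, 4, 3]
decoder_sizes = encoder_sizes[::-1]

encoder_params = count_linear_params(encoder_sizes)
decoder_params = count_linear_params(decoder_sizes)

print(encoder_params)  # 381099
print(decoder_params)  # 381497
```

The two totals differ only in the bias terms, since the weight matrices have mirrored shapes.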

## 04. Financial Fraud Detection Dataset

ToDo -- Timur

In [29]:
# import original and encoded transactions
ori_dataset = pd.read_csv("./data/transactions.csv", sep=",", header=0, encoding="utf-8")
enc_dataset = pd.read_csv("./data/enc_transactions.csv", sep=",", header=0, encoding="utf-8").astype(float)

now = datetime.utcnow().strftime("%Y%m%d-%H:%M:%S")
print("[LOG {}] encoded transactions of shape [{}/{}] imported".format(str(now), str(enc_dataset.shape[0]), str(enc_dataset.shape[1])))

[PT LOG 20180124-15:24:27] encoded transactions of shape [307457/401] imported


In [31]:
features = ["AccountID_Key", "CurrencyCode_Key", "TaxCode_Key", "CompanyKey_Key", "ShipToCountry_Key", "ShipFromCountry_Key"]
ranges = [0, 62+1, 121+1, 183+1, 234+1, 349+1, 400+1]

# init training results
columns = ["timestamp", "node", "seed", "architecture", "epoch", "rec_loss", "roc_auc", "anomalies", "normalies", "anomalies_s", "normalies_s", "max_threshold", "max_tpr_s", "max_fpr_s", "precision", "recall", "f1_score", "fpr", "tpr", "thresholds"]
evaluations = ["err_" + str(element) for element in range(0, (len(features))+1)]
columns.extend(evaluations)
evaluations = ["ano_c1_" + str(element) for element in range(0, (len(features))+1)]
columns.extend(evaluations)
evaluations = ["ano_c2_" + str(element) for element in range(0, (len(features))+1)]
columns.extend(evaluations)
evaluation_results = pd.DataFrame(columns=columns)

# convert to pytorch tensor - not CUDA-enabled
# note: mini_batch_size is set in the training cell of section 05 and must be defined before this cell runs
torch_dataset = torch.from_numpy(enc_dataset.values).float()
dataloader = DataLoader(torch_dataset, batch_size=mini_batch_size, shuffle=False, num_workers=0)
# set num_workers to zero to retrieve deterministic results

# determine if CUDA is available at compute node
# note: use_cuda is a user-set flag that is not defined in this notebook
if (torch.backends.cudnn.version() is not None) and use_cuda:

    dataloader = DataLoader(torch_dataset.cuda(), batch_size=mini_batch_size, shuffle=False)
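
The `features` and `ranges` lists above appear to describe how the 401 encoded columns are grouped. Assuming each consecutive pair in `ranges` marks the start (inclusive) and end (exclusive) column of one categorical attribute, the column slices can be derived as follows. This reading is an assumption rather than something the lab states, and `feature_slices` is a hypothetical name:

```python
features = ["AccountID_Key", "CurrencyCode_Key", "TaxCode_Key",
            "CompanyKey_Key", "ShipToCountry_Key", "ShipFromCountry_Key"]
ranges = [0, 62+1, 121+1, 183+1, 234+1, 349+1, 400+1]

# assumption: ranges[i]:ranges[i+1] are the encoded columns of features[i]
feature_slices = {
    name: slice(start, stop)
    for name, start, stop in zip(features, ranges, ranges[1:])
}

for name, cols in feature_slices.items():
    print("{:<20} columns {:>3}-{:>3} ({} wide)".format(
        name, cols.start, cols.stop - 1, cols.stop - cols.start))

# the widths add up to the 401 columns of the encoded dataset
total_width = sum(s.stop - s.start for s in feature_slices.values())
```

Under this assumption, a slice such as `feature_slices["TaxCode_Key"]` could be used to select the columns of a single attribute from the encoded dataset.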

## 05. Network Training

ToDo - Timur and Marco

In [36]:
# init deterministic seed
seed_value = 1234 #4444 #3333 #2222 #1111 #1234
rd.seed(seed_value) # set random seed
np.random.seed(seed_value) # set numpy seed
torch.manual_seed(seed_value) # set pytorch seed CPU
torch.cuda.manual_seed(seed_value) # set pytorch seed GPU

# init training parameters
num_epochs = 1
mini_batch_size = 128
learning_rate = 1e-3

# define optimization criterion and optimizer
criterion = nn.BCEWithLogitsLoss()
encoder_optimizer = torch.optim.Adam(encoder.parameters(), lr=learning_rate)
decoder_optimizer = torch.optim.Adam(decoder.parameters(), lr=learning_rate)

# train autoencoder model
for epoch in range(num_epochs):

    # init mini batch counter
    mini_batch_count = 0

    # determine if CUDA is available at compute node
    if (torch.backends.cudnn.version() is not None) and use_cuda:

        # move all networks / models to the GPU
        encoder.cuda()
        decoder.cuda()

    # set networks in training mode (apply dropout when needed)
    encoder.train()
    decoder.train()

    for mini_batch_data in dataloader:

        # increase mini batch counter
        mini_batch_count += 1

        # convert mini batch to torch variable
        mini_batch_torch = autograd.Variable(mini_batch_data)

        # =================== forward pass =====================

        # run forward pass
        z_representation = encoder(mini_batch_torch) # encode mini-batch data
        mini_batch_reconstruction = decoder(z_representation) # decode mini-batch data

        # determine reconstruction loss
        reconstruction_loss = criterion(mini_batch_reconstruction, mini_batch_torch)

        # =================== backward pass ====================

        # reset graph gradients
        decoder_optimizer.zero_grad()
        encoder_optimizer.zero_grad()

        # run backward pass
        reconstruction_loss.backward()

        # update network parameters
        decoder_optimizer.step()
        encoder_optimizer.step()

        # =================== log ==============================

        if mini_batch_count % 100 == 0:

            # print mini batch reconstruction results
            now = datetime.utcnow().strftime("%Y%m%d-%H:%M:%S")
            print('[LOG TRAIN {}] epoch: [{:04}/{:04}], batch: {:04}, loss: {:.10f}'.format(now, epoch + 1, num_epochs, mini_batch_count, reconstruction_loss.data[0]))

# save trained encoder model file to disk
# note: exp_timestamp is not defined in this notebook and must be set beforehand
encoder_model_name = "{}_encoder_model.pth".format(exp_timestamp)
torch.save(encoder.state_dict(), os.path.join("models", encoder_model_name))

# save trained decoder model file to disk
decoder_model_name = "{}_decoder_model.pth".format(exp_timestamp)
torch.save(decoder.state_dict(), os.path.join("models", decoder_model_name))

[LOG TRAIN 20180124-15:56:39] epoch: [0001/0001], batch: 0100, loss: 0.0026184090
[LOG TRAIN 20180124-15:56:41] epoch: [0001/0001], batch: 0200, loss: 0.0029546609
[LOG TRAIN 20180124-15:56:43] epoch: [0001/0001], batch: 0300, loss: 0.0017882250
[LOG TRAIN 20180124-15:56:45] epoch: [0001/0001], batch: 0400, loss: 0.0011323976
[LOG TRAIN 20180124-15:56:47] epoch: [0001/0001], batch: 0500, loss: 0.0014175044
[LOG TRAIN 20180124-15:56:50] epoch: [0001/0001], batch: 0600, loss: 0.0020317647
[LOG TRAIN 20180124-15:56:53] epoch: [0001/0001], batch: 0700, loss: 0.0029863003
[LOG TRAIN 20180124-15:56:57] epoch: [0001/0001], batch: 0800, loss: 0.0012323590
[LOG TRAIN 20180124-15:57:00] epoch: [0001/0001], batch: 0900, loss: 0.0038964848
[LOG TRAIN 20180124-15:57:03] epoch: [0001/0001], batch: 1000, loss: 0.0015858796
[LOG TRAIN 20180124-15:57:06] epoch: [0001/0001], batch: 1100, loss: 0.0025408012
[LOG TRAIN 20180124-15:57:09] epoch: [0001/0001], batch: 1200, loss: 0.0017881792
[LOG TRAIN 20180
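
The log above prints one line every 100 mini-batches. As a quick check of how many mini-batches and log lines to expect per epoch, the sketch below uses the 307457 rows reported in section 04 and the `mini_batch_size` of 128 from the training cell (the other variable names are illustrative):

```python
import math

num_rows = 307457        # rows in enc_dataset, as reported on import
mini_batch_size = 128    # value set in the training cell
log_every = 100          # logging interval used in the training loop

batches_per_epoch = math.ceil(num_rows / mini_batch_size)
last_batch_rows = num_rows - (batches_per_epoch - 1) * mini_batch_size
log_lines_per_epoch = batches_per_epoch // log_every

print(batches_per_epoch, last_batch_rows, log_lines_per_epoch)  # 2403 1 24
```

The final mini-batch contains a single transaction because `DataLoader` keeps the incomplete last batch by default (`drop_last=False`).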

## 06. Result Evaluation

In [38]:
# load the trained parameters (state dicts) back into the models
encoder.load_state_dict(torch.load(os.path.join("models", encoder_model_name)))
decoder.load_state_dict(torch.load(os.path.join("models", decoder_model_name)))

In [41]:
# wrap the full dataset in a torch variable
data = autograd.Variable(torch_dataset)

# evaluate trained models
reconstruction = decoder(encoder(data)) # reconstruct the data with the autoencoder

# determine reconstruction loss
reconstruction_loss = criterion(reconstruction, data)

# =================== log evaluation measures ========================

# print reconstruction results
now = datetime.utcnow().strftime("%Y%m%d-%H:%M:%S")
print('[LOG TRAIN {}] reconstruction loss: {:.10f}'.format(now, reconstruction_loss.data[0]))

[LOG TRAIN 20180124-16:12:30] reconstruction loss: 0.0159976669
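
The value above is the output of `nn.BCEWithLogitsLoss`, which applies a sigmoid to each reconstructed value and, by default, averages the binary cross-entropy over all elements. Below is a minimal pure-Python sketch of that computation for a few made-up logits and targets (the numbers and the function name `bce_with_logits` are illustrative only):

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy of sigmoid(logits) against targets."""
    total = 0.0
    for x, y in zip(logits, targets):
        # numerically stable form of -[y*log(s) + (1-y)*log(1-s)], s = sigmoid(x)
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

logits = [4.0, -3.0, 0.5]    # hypothetical decoder outputs
targets = [1.0, 0.0, 1.0]    # hypothetical one-hot encoded inputs

loss = bce_with_logits(logits, targets)
print(round(loss, 4))
```

Confident, correct logits (large positive for a target of 1, large negative for a target of 0) contribute almost nothing to the mean, while uncertain or wrong ones dominate it.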


## 07. Optional Exercises

ToDo - Timur and Marco

## 08. Lab Summary

ToDo - Timur and Marco

## 09. Next Steps

ToDo - Timur and Marco