# MonLAD: Money Laundering Agents Detection in Transaction Streams
We propose MonLAD, a novel approach for detecting money laundering agent accounts in a transaction stream. Because it relies on statistical features maintained over the stream, it can answer a detection query quickly at any time.

In [None]:
import os
import sys

# Point this at your local checkout of spartan2 if the package is not installed.
sys.path.insert(0, "/Users/sunxiaobing/4-code/spartan2")

import numpy as np
import spartan as st

# Prepare data and set parameters
- deltaUp: minimum threshold for an effective fan-in.

- deltaDown: minimum threshold for an effective fan-out.

- epsilon: a small residual (epsilon > 0) that nullifies fraudsters' attempts to evade detection by keeping a low balance.

- has_edge: whether each input record is a transaction edge, i.e. it contains both the source node and the destination node.
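
To make the role of the three thresholds more concrete, here is a toy sketch of one plausible reading of them: an account is interesting when it receives at least `deltaUp`, sends out at least `deltaDown`, and keeps a residual no larger than `epsilon`. This is an illustration under that assumption, not the spartan/MonLAD implementation; the function `looks_zeroed_out` and its arguments are hypothetical names.

```python
# Toy illustration only; NOT the spartan/MonLAD implementation.
# All names below are hypothetical.

def looks_zeroed_out(fan_in, fan_out, delta_up, delta_down, epsilon):
    """Return True when an account received and forwarded large sums
    while keeping only a small residual (assumed interpretation)."""
    effective_in = fan_in >= delta_up        # large enough incoming total
    effective_out = fan_out >= delta_down    # large enough outgoing total
    residual = fan_in - fan_out              # what stays on the account
    return effective_in and effective_out and abs(residual) <= epsilon


# Large in, large out, small residual.
case_a = looks_zeroed_out(50_000, 49_500, 10_000, 10_000, 1_000)
# Large in, but most of the money stays on the account.
case_b = looks_zeroed_out(50_000, 20_000, 10_000, 10_000, 1_000)
# Small residual, but the amounts are below the fan-in threshold.
case_c = looks_zeroed_out(5_000, 4_900, 10_000, 10_000, 1_000)
print(case_a, case_b, case_c)
```

Under this reading, a fraudster who leaves a small balance behind instead of emptying the account exactly is still caught as long as the residual stays within `epsilon`.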

**Data**: Because bank data are private, we only provide a demo dataset to help users try the method.

**Input format**: We support two input formats, handled by different files (i.e., ZeroOutCore.py and ZeroOutCoreCFD.py).
- `(source_id, destination_id, timestamp, weight)` corresponds to `has_edge=True`
- `(account_id, transaction_type, weight)` corresponds to `has_edge=False`

**Note**: If you choose the second format, you may need to change `source_type` and `des_type` to match the transaction-type names used in your data. In this example, PRIJEM represents a transfer in and VYDAJ represents a transfer out.
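
Before setting `source_type` and `des_type`, it can help to check which transaction-type labels actually occur in your file. The sketch below counts the distinct values in the type column using only the standard library. The helper `count_transaction_types`, the column position `type_col=1`, and the sample rows are assumptions for illustration; adjust them to your own file layout.

```python
import csv
import io
from collections import Counter


def count_transaction_types(lines, type_col=1, sep=","):
    """Count the distinct values found in the transaction-type column."""
    reader = csv.reader(lines, delimiter=sep)
    # Skip rows that are too short to contain the type column.
    return Counter(row[type_col] for row in reader if len(row) > type_col)


# Hypothetical sample in (account_id, transaction_type, weight) form.
# For a real file, pass open("your_file.csv", newline="") instead.
sample = io.StringIO(
    "1,PRIJEM,1000.0\n"
    "1,VYDAJ,900.0\n"
    "2,PRIJEM,500.0\n"
)
type_counts = count_transaction_types(sample)
print(type_counts)
```

The labels that show up here are the values to pass as `source_type` (transfer out) and `des_type` (transfer in).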


In [None]:
f = open("inputData/cfd.csv", "r")
tensor_stream = st.TensorStream(
    f,
    col_idx=[0, 1, 2, 3],
    col_types=[int, int, str, float],
    sep=',',
    mappers={},
    hasvalue=True,
)

In [None]:
deltaUp = deltaDown = delta = 10000
epsilon = 10000
param_dict = {
    'deltaUp': deltaUp,
    'deltaDown': deltaDown,
    'epsilon': epsilon,
    'window': 1,
    'stride': 1,
    'ts_idx': 1,
    'has_edge': False,
    'source_type': 'VYDAJ',
    'des_type': 'PRIJEM',
}

# Run as a model

In [None]:
monlad = st.MonLAD(tensor_stream, **param_dict)

### Get statistical features

In [None]:
count_df = monlad.run()

In [None]:
save_path = None # './result/'
if save_path:
    if not os.path.exists(save_path):
        os.makedirs(save_path)
    count_fileName = os.path.join(save_path, 'CFD_count' + str(delta // 1000) + 'k.csv')
    print('Output count.csv to: ', count_fileName)
    count_df.to_csv(count_fileName, index=False)

### Anomaly detection
- For normal data, the recommended settings are `alpha = 0.98`, `k = 1.5`, and `p` between 0.9 and 0.99.

In [None]:
# recommend: alpha=0.98, k=1.5, p=0.99
# 1: part1; 2: part3; 3: part2-1; 4:part2-2
anomalous_acc = monlad.anomaly_detection(detect_part=[1, 2, 3, 4], alpha=0.5, k=1, p=0.8, outpath=None)