Multi-GNN

This repository contains the models and adaptations needed to run Multi-GNN for Anti-Money Laundering. It provides four Graph Neural Network model classes (GIN, GAT, PNA, RGCN) and the model adaptations described below, as used for financial crime detection in Egressy et al. Note that this repository focuses solely on the Anti-Money Laundering use case. It was created for the experiments in Provably Powerful Graph Neural Networks for Directed Multigraphs [AAAI 2024] and Realistic Synthetic Financial Transactions for Anti-Money Laundering Models [NeurIPS 2023].

Setup

To use the repository, you first need to install the conda environment via

conda env create -f env.yml

The data needed for the experiments is available on Kaggle. To use it with the provided training scripts, first pre-process the downloaded transaction files (e.g. HI-Small_Trans.csv):

python format_kaggle_files.py /path/to/kaggle-files/HI-Small_Trans.csv

Then update the file paths in data_config.json: set the aml_data path to the location where you stored the formatted_transactions.csv file generated by the pre-processing step.
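The path can also be updated programmatically. The following is a minimal sketch, not part of the repository: the helper names `set_config_value` and `update_aml_data_path` are hypothetical, and because the layout of data_config.json is not described here, the sketch searches for an `aml_data` key at any nesting level instead of assuming where it lives. Whether the entry should point at the CSV file or at its containing folder is also not stated here, so the value is passed through unchanged.

```python
import json
from pathlib import Path


def set_config_value(node, key, value):
    """Set `key` to `value` wherever it appears in a nested JSON structure.

    Returns the number of entries that were updated.
    """
    updated = 0
    if isinstance(node, dict):
        for k in node:
            if k == key:
                node[k] = value
                updated += 1
            else:
                updated += set_config_value(node[k], key, value)
    elif isinstance(node, list):
        for item in node:
            updated += set_config_value(item, key, value)
    return updated


def update_aml_data_path(config_file, data_location):
    """Rewrite the aml_data entry of a data_config.json-style file in place."""
    config_path = Path(config_file)
    config = json.loads(config_path.read_text())
    if set_config_value(config, "aml_data", str(data_location)) == 0:
        raise KeyError("no 'aml_data' entry found in " + str(config_path))
    config_path.write_text(json.dumps(config, indent=4) + "\n")
    return config
```

A call such as `update_aml_data_path("data_config.json", "/path/to/formatted-data")` would then leave all other entries of the file untouched.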

Usage

To run the experiments, run the main.py script and specify the arguments you want to use. There are two required arguments, --data and --model. For the --data argument, store each dataset in its own folder and pass the folder name, e.g. --data Small_HI. The --model argument must be set to one of the available model classes: gin, gat, rgcn or pna. Thus, to run a standard GNN, you would run, e.g.:

python main.py --data Small_HI --model gin
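To run the same baseline for every model class, the command lines can be generated rather than typed by hand. This is a small sketch under the assumption that only --data and --model are needed for a baseline run, as stated above; the helper name `baseline_commands` is hypothetical. It only prints the commands and does not launch anything.

```python
import shlex

MODELS = ["gin", "gat", "rgcn", "pna"]  # model classes listed in this README


def baseline_commands(data_folder, models=MODELS):
    """Return one `python main.py ...` argument list per model class."""
    return [["python", "main.py", "--data", data_folder, "--model", m] for m in models]


commands = baseline_commands("Small_HI")
for cmd in commands:
    # Print instead of executing; pass `cmd` to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```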

Then you can add different adaptations to the models by selecting the respective arguments from:

Argument	Description
`--emlps`	Edge updates via MLPs
`--reverse_mp`	Reverse Message Passing
`--ego`	Ego IDs for the center nodes
`--ports`	Port Numberings for edges

Thus, to run Multi-GIN with edge updates, i.e. GIN with all four adaptations enabled, you would run the following command:

python main.py --data Small_HI --model gin --emlps --reverse_mp --ego --ports
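The command-line interface described above can be pictured with a short argparse sketch. This is an illustration only and not the parser used in main.py: it assumes that the four adaptation flags are plain on/off switches (consistent with the command above, where they take no values), and the help strings are copied from the table.

```python
import argparse

# Adaptation flags and descriptions, taken from the table in this README.
ADAPTATIONS = [
    ("--emlps", "Edge updates via MLPs"),
    ("--reverse_mp", "Reverse Message Passing"),
    ("--ego", "Ego IDs for the center nodes"),
    ("--ports", "Port Numberings for edges"),
]


def build_parser():
    """Hypothetical parser mirroring the documented arguments."""
    parser = argparse.ArgumentParser(description="Multi-GNN runner (sketch)")
    parser.add_argument("--data", required=True, help="dataset folder name, e.g. Small_HI")
    parser.add_argument("--model", required=True, choices=["gin", "gat", "rgcn", "pna"])
    for flag, text in ADAPTATIONS:
        # Assumption: each adaptation is a boolean switch that defaults to off.
        parser.add_argument(flag, action="store_true", help=text)
    return parser


args = build_parser().parse_args(
    ["--data", "Small_HI", "--model", "gin", "--emlps", "--reverse_mp", "--ego", "--ports"]
)
enabled = [name for name in ("emlps", "reverse_mp", "ego", "ports") if getattr(args, name)]
print(args.model, enabled)
```

Omitting all four switches corresponds to the standard GNN run shown earlier, and any subset can be combined.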

Additional functionalities

Several further arguments enable additional functionality:

Argument	Description
`--tqdm`	Displays a progress bar during training and inference.
`--save_model`	Saves the best model to the specified `model_to_save` path in the `data_config.json` file. Requires the argument `--unique_name` to be specified.
`--finetune`	Loads a previously trained model (with name given by `--unique_name` and stored in `model_to_load` path in the `data_config.json`) to be finetuned.
`--inference`	Loads a previously trained model (with name given by `--unique_name` and stored in `model_to_load` path in the `data_config.json`) to do inference only.
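Since these three flags all refer to a model identified by `--unique_name`, a quick check before launching a long run can catch a forgotten name. The sketch below is hypothetical and independent of the repository's code. Only `--save_model` is explicitly documented as requiring `--unique_name`; treating `--finetune` and `--inference` the same way is an assumption based on their descriptions. The run name `gin_run` is made up for illustration.

```python
from argparse import Namespace

# Flags that, per the table above, refer to a model named via --unique_name.
FLAGS_NEEDING_NAME = ("save_model", "finetune", "inference")


def missing_unique_name(args):
    """Return the flags that were set although --unique_name is absent."""
    if getattr(args, "unique_name", None):
        return []
    return [flag for flag in FLAGS_NEEDING_NAME if getattr(args, flag, False)]


ok = Namespace(save_model=True, finetune=False, inference=False, unique_name="gin_run")
bad = Namespace(save_model=True, finetune=False, inference=True, unique_name=None)
print(missing_unique_name(ok))   # []
print(missing_unique_name(bad))  # ['save_model', 'inference']
```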

Licence

Apache License Version 2.0, January 2004
