This is the repository for our paper "Multi-Scale Adaptive Neighborhood Awareness Transformer for Graph Fraud Detection". In this work, we propose the Multi-scale Neighborhood Awareness Transformer (MANDATE), which alleviates the inherent inductive bias of GNNs and addresses the differing homophily distributions in graph fraud detection (GFD) tasks. Specifically, we design a multi-scale positional encoding strategy that encodes positional information at various distances from the central node; incorporating it into the self-attention mechanism significantly enhances the model's global modeling ability. Meanwhile, we design separate embedding strategies for homophilic and heterophilic connections, which mitigates the difference in homophily distribution between benign and fraudulent nodes. Moreover, we design an embedding fusion strategy for multi-relation graphs, which alleviates the distribution bias caused by different relations.
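As an illustration of the multi-scale positional encoding idea, the sketch below computes, for each node, which other nodes are first reached at exactly 1, 2, ..., k hops. This is a minimal, assumed interpretation of "positional information of various distances from the central node" using a dense adjacency matrix; the function name and representation are illustrative, not the repository's implementation.

```python
import numpy as np

def khop_positional_encoding(adj, khop=3):
    """Sketch: per-node multi-scale reachability over 1..khop hops.

    adj : dense (N, N) binary adjacency matrix.
    Returns a (khop, N, N) array whose slice k marks, for each central
    node (row), the nodes first reached at exactly k+1 hops.
    """
    n = adj.shape[0]
    reach = np.eye(n, dtype=bool)      # nodes already reached (0 hops)
    frontier = np.eye(n, dtype=bool)   # nodes reached at the last hop
    encodings = []
    for _ in range(khop):
        # nodes reachable in one more step from the current frontier
        nxt = (frontier @ adj) > 0
        new = nxt & ~reach             # first reached at this hop
        encodings.append(new.astype(np.float32))
        reach |= new
        frontier = new
    return np.stack(encodings)
```

For example, on a path graph 0-1-2-3, node 0 sees node 1 at hop 1, node 2 at hop 2, and node 3 at hop 3; in MANDATE these per-hop views would then be fed to the position-encoding MLP (see `mlp_hz` and `khop` in the config).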
The overall architecture of MANDATE is shown above. For simplicity, we set the number of relations to 2 in the illustration.
First, download the datasets and place them under the dataset folder. We provide one example to illustrate how to use the code. All training configuration parameters are provided in the YAML files in the config folder. For instance, to run MANDATE on the YelpChi dataset (parameters can be modified in the corresponding YAML file):
python train.py --dataset YelpChi
We use YelpChi as an example to describe the configuration parameters. The remaining parameters do not need to be modified; they are kept only for record-keeping.
model:
  mlp_hz: 100        # hidden size of the position-encoding MLP
  attention_hz: 128  # hidden size of the attention layers
  khop: 3            # position-encoding range (number of hops)
  mlp_dp: 0.5        # dropout for the position-encoding MLP
  cls_dp: 0.4        # dropout for the classifier
  lambd: 0.03        # weight of the orthogonality loss
data:
  views: 4           # number of relations
  dataset: "YelpChi" # dataset name
training:
  epochs: 2000
  learning_rate: 0.001
  weight_decay: 0.003
  use_lap_eig_loss: True
  labeled_rate: 0.2
  mode: "supervised"
...
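A training script would typically parse these settings with PyYAML. The snippet below is a sketch of such parsing on an inline excerpt of the config; a real run would read config/YelpChi.yaml from disk instead, and this is not necessarily how train.py itself loads the file.

```python
import yaml  # PyYAML

# Inline excerpt of the example configuration shown above.
cfg_text = """
model:
  khop: 3
  lambd: 0.03
data:
  views: 4
  dataset: "YelpChi"
training:
  epochs: 2000
  use_lap_eig_loss: True
"""

cfg = yaml.safe_load(cfg_text)
epochs = cfg["training"]["epochs"]   # 2000
num_relations = cfg["data"]["views"] # 4, one per relation in the graph
```

Note that YAML parses `True` into a Python boolean and unquoted numbers into ints/floats, so the values can be used directly without casting.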
