This repository contains the source code for our paper "Adversarial Inter-Group Link Injection Degrades the Fairness of Graph Neural Networks" accepted by ICDM'22.
This repository is built with PyTorch. The primary dependencies are listed below. You can install all necessary libraries by running
pip install -r ./requirements.txt
- Python == 3.9
- torch == 1.9.0
- cuda == 11.5
- numpy == 1.20.3
- scipy == 1.5.4
- networkx == 2.6.2
- dgl == 0.7.0
- deeprobust == 0.2.2
- scikit-learn == 0.24.2
We ran our experiments on three real-world datasets, i.e., Pokec_z, Pokec_n, and DBLP. All datasets are included in the './datasets' folder.
The main script for running our attack strategy against state-of-the-art GNN models is './src/train.py'.
Script 1: Evaluate fairness and node classification performance of GCN on the clean graph:
python train.py --dataset pokec_z --model gcn --attack_type none
Script 2: Evaluate fairness and node classification performance of GCN on the graph perturbed by our FA-GNN at a 5% perturbation rate. The attack is controlled by --direction (the node type of one attack subset) and --strategy (same/different labels/sensitive attributes):
python train.py --dataset pokec_z --model gcn --attack_type fair_attack --direction y1s1 --strategy DD --ptb_rate 0.05
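Conceptually, the attack selects an anchor subset by --direction (e.g. y1s1: label 1, sensitive attribute 1) and injects links to a target subset chosen by --strategy (e.g. DD: Different label, Different sensitive attribute). The sketch below illustrates this selection logic in plain numpy; the function name and signature are hypothetical, not the repo's implementation:

```python
import numpy as np

def fair_attack_sketch(labels, sens, n_edges, direction=(1, 1), strategy="DD", seed=0):
    """Conceptual sketch of FA-GNN-style link injection (NOT the repo's code).

    direction: (y, s) of the anchor subset, e.g. (1, 1) for --direction y1s1.
    strategy:  two letters over (label, sens); "D" picks the complementary
               group, "S" the same group, so "DD" links anchors to nodes
               with a different label AND a different sensitive attribute.
    Returns an (n_edges, 2) array of node-index pairs to inject.
    """
    rng = np.random.default_rng(seed)
    y0, s0 = direction
    src_pool = np.where((labels == y0) & (sens == s0))[0]
    want_y = (labels != y0) if strategy[0] == "D" else (labels == y0)
    want_s = (sens != s0) if strategy[1] == "D" else (sens == s0)
    dst_pool = np.where(want_y & want_s)[0]
    src = rng.choice(src_pool, size=n_edges)  # sampled with replacement
    dst = rng.choice(dst_pool, size=n_edges)
    return np.stack([src, dst], axis=1)

# Toy example: 8 nodes with binary labels and a binary sensitive attribute.
labels = np.array([0, 0, 1, 1, 0, 1, 0, 1])
sens   = np.array([0, 1, 0, 1, 1, 1, 0, 0])
edges = fair_attack_sketch(labels, sens, n_edges=3, direction=(1, 1), strategy="DD")
print(edges.shape)  # (3, 2)
```

With --direction y1s1 and --strategy DD, sources are drawn from nodes with (y=1, s=1) and destinations from nodes with (y=0, s=0), i.e., inter-group links across both label and sensitive attribute.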
Script 3: Evaluate fairness and node classification performance of GAT on the graph perturbed by our FA-GNN at a 5% perturbation rate. The attack is controlled by --direction (the node type of one attack subset) and --strategy (same/different labels/sensitive attributes). (GAT uses hyperparameters different from the defaults only for Pokec_z and Pokec_n.)
python train.py --dataset pokec_z --model gat --attack_type fair_attack --direction y1s1 --strategy DD --ptb_rate 0.05 --dropout 0.8 --in_drop 0.338 --attn_drop 0.57 --negative_slope 0.15 --weight_decay 0.019 --num_layers 1 --num_heads 2
The main script outputs the fairness and classification metrics for each of the 5 random seeds, as well as their mean and std values; the results are also saved to a .csv file in the './results/' folder.
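The per-seed aggregation can be reproduced from the saved CSV. The snippet below sketches this with Python's standard library; the column names are assumptions for illustration, not the exact headers train.py writes:

```python
import csv, io, statistics

# Hypothetical results rows as train.py might write them (column names assumed).
csv_text = """seed,accuracy,parity,equality
1,0.68,0.08,0.07
2,0.67,0.09,0.06
3,0.69,0.07,0.08
"""
rows = list(csv.DictReader(io.StringIO(csv_text)))
for metric in ("accuracy", "parity", "equality"):
    vals = [float(r[metric]) for r in rows]
    # Report mean and sample standard deviation across seeds.
    print(f"{metric}: mean={statistics.mean(vals):.4f} std={statistics.stdev(vals):.4f}")
```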
We perform the same experiments on the synthetic datasets described in Section 3.4. We start with a heterophilic graph: the edge density is
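A heterophilic synthetic graph of this kind can be sketched with networkx's stochastic block model, using one block per (label, sensitive attribute) combination; the block sizes and edge densities below are illustrative placeholders, not the paper's values:

```python
import networkx as nx

# Four blocks: (label, sens) in {0,1}^2. Sizes and densities are
# illustrative placeholders, not the paper's exact parameters.
sizes = [100, 100, 100, 100]
p_intra, p_inter = 0.005, 0.02   # heterophilic: inter-block denser than intra-block
probs = [[p_inter] * 4 for _ in range(4)]
for i in range(4):
    probs[i][i] = p_intra
G = nx.stochastic_block_model(sizes, probs, seed=42)
print(G.number_of_nodes())  # 400
```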
Figure A.1 below shows the statistical parity difference and the error rates of this simulation and supports our hypotheses. In this setup, EE only shows an increase in
Figure A.1 Fairness attacks on heterophilic synthetic graph. We show the statistical parity difference (
On random graphs, we observe broadly similar results, except that error rates do not increase with same-class linking (ED/EE); we omit the detailed results here.
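The statistical parity difference reported in these experiments follows the standard definition: the absolute gap in positive-prediction rates between the two sensitive groups. A minimal numpy implementation:

```python
import numpy as np

def statistical_parity_difference(pred, sens):
    """|P(y_hat = 1 | s = 0) - P(y_hat = 1 | s = 1)| for binary pred and sens."""
    pred, sens = np.asarray(pred), np.asarray(sens)
    return abs(pred[sens == 0].mean() - pred[sens == 1].mean())

# Toy example: group s=0 gets positive predictions 75% of the time, s=1 only 25%.
pred = np.array([1, 1, 0, 1, 0, 0, 1, 0])
sens = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(statistical_parity_difference(pred, sens))  # 0.5
```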
- GCN, GAT, and GraphSAGE: we adopt the implementations provided in the DGL package (https://github.com/dmlc/dgl/).
- FairGNN: we adopt the implementation provided by the authors (https://github.com/EnyanDai/FairGNN).
- NIFTY: we adopt the implementation provided by the authors (https://github.com/chirag126/nifty).
- Random and DICE: we adopt the implementation provided in the deeprobust package (https://github.com/DSE-MSU/DeepRobust).
- PR-BCD: we use the implementation provided by the authors (https://github.com/sigeisler/robustness_of_gnns_at_scale).
- The hyperparameters $\alpha$ and $\beta$ in FairGNN, which balance the loss terms, are set to 2 and 0.1, respectively.
- For victim GNN model training, we randomly split the labeled nodes into 50%/25%/25% training, validation, and test sets.
- We train all GNN models on both clean and attacked networks with a learning rate of 0.001.
- For GAT on the Pokec datasets, we use a dropout of 0.8 and a weight decay of 1.9e-2; for all other models, we use a dropout of 0.6 and a weight decay of 5e-4.
- For the surrogate GCN, we use two hidden layers, the first with 16 dimensions, and train the model for 500 epochs with a learning rate of 0.01, weight decay of 5e-4, and dropout of 0.5.
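The surrogate's forward pass follows the standard GCN propagation rule. The numpy sketch below illustrates it with a 16-dimensional hidden layer as above; this is a conceptual sketch, not the repo's PyTorch implementation:

```python
import numpy as np

def gcn_forward(adj, feats, w1, w2):
    """Two-layer GCN propagation  A_hat @ ReLU(A_hat @ X @ W1) @ W2,
    with A_hat = D^{-1/2} (A + I) D^{-1/2}. Numpy sketch, not the repo's code."""
    a_hat = adj + np.eye(adj.shape[0])                 # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(1)))  # symmetric normalization
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt
    h = np.maximum(a_norm @ feats @ w1, 0.0)           # 16-dim hidden layer + ReLU
    return a_norm @ h @ w2                             # class logits

rng = np.random.default_rng(0)
n, d, hidden, classes = 5, 8, 16, 2
adj = np.zeros((n, n)); adj[0, 1] = adj[1, 0] = 1; adj[2, 3] = adj[3, 2] = 1
feats = rng.normal(size=(n, d))
logits = gcn_forward(adj, feats, rng.normal(size=(d, hidden)), rng.normal(size=(hidden, classes)))
print(logits.shape)  # (5, 2)
```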