[WSDM'23] GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection

yixinliu233/G-OOD-D

GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection

This is the source code for the WSDM'23 paper "GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection".

Requirements

This code requires the following:

  • Python==3.9
  • PyTorch==1.11.0
  • PyTorch Geometric==2.0.4
  • Numpy==1.21.2
  • Scikit-learn==1.0.2
  • OGB==1.3.3
  • NetworkX==2.7.1
  • FAISS-GPU==1.7.2
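
To confirm that an environment matches the pinned versions above, a small check along these lines may help. It is only a sketch: the distribution names (for example `torch-geometric` and `faiss-gpu`) are assumptions about how the packages were installed, and the helper names `installed_version` and `find_mismatches` are hypothetical, not part of this repository.

```python
from importlib import metadata

# Pinned versions copied from the requirements list above.
# The distribution (pip) names are assumptions; adjust them to your install.
REQUIRED = {
    "torch": "1.11.0",
    "torch-geometric": "2.0.4",
    "numpy": "1.21.2",
    "scikit-learn": "1.0.2",
    "ogb": "1.3.3",
    "networkx": "2.7.1",
    "faiss-gpu": "1.7.2",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(required, lookup=installed_version):
    """Map each package that deviates from its pin to a (wanted, found) tuple."""
    mismatches = {}
    for name, wanted in required.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu113" when comparing versions.
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the file prints one line per package that is missing or differs from the pin, and prints nothing when everything matches.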

Usage

Run the script that corresponds to the experiment and dataset you want. For example:

  • Run out-of-distribution detection on the BZR (ID) and COX2 (OOD) datasets:
    bash script/oodd_BZR+COX2.sh
  • Run anomaly detection on the PROTEINS_full dataset:
    bash script/ad_PROTEINS_full.sh
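
When launching several experiments from Python, the two commands above can be wrapped as in the sketch below. The naming pattern (`oodd_<ID>+<OOD>.sh` and `ad_<dataset>.sh`) is inferred from these two examples only, so it is an assumption; check the `script/` folder for the files that actually exist. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def oodd_script(id_dataset, ood_dataset, script_dir="script"):
    """Path of the OOD detection script for an ID/OOD pair (naming pattern assumed)."""
    return Path(script_dir) / f"oodd_{id_dataset}+{ood_dataset}.sh"


def ad_script(dataset, script_dir="script"):
    """Path of the anomaly detection script for a dataset (naming pattern assumed)."""
    return Path(script_dir) / f"ad_{dataset}.sh"


def run_script(path):
    """Run a script with bash from the repository root, failing early if it is absent."""
    if not path.is_file():
        raise FileNotFoundError(f"No such script: {path}")
    return subprocess.run(["bash", str(path)], check=True)
```

For example, `run_script(oodd_script("BZR", "COX2"))` is meant to be equivalent to the first command above.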

Statistics of the Graph-level OOD Detection Benchmark

The statistics of each dataset pair in our benchmark are as follows.

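| No. | ID dataset | ID # Graphs (Train/Test) | ID # Nodes (avg.) | ID # Edges (avg.) | OOD dataset | OOD # Graphs (Test) | OOD # Nodes (avg.) | OOD # Edges (avg.) |
|-----|------------|--------------------------|-------------------|-------------------|-------------|---------------------|--------------------|--------------------|
| 1   | BZR        | 364/41                   | 35.8              | 38.4              | COX2        | 41                  | 41.2               | 43.5               |
| 2   | PTC-MR     | 309/35                   | 14.3              | 14.7              | MUTAG       | 35                  | 17.9               | 19.8               |
| 3   | AIDS       | 1,800/200                | 15.7              | 16.2              | DHFR        | 200                 | 42.4               | 44.5               |
| 4   | ENZYMES    | 540/60                   | 32.6              | 62.1              | PROTEIN     | 60                  | 39.1               | 72.8               |
| 5   | IMDB-B     | 1,350/150                | 19.8              | 96.5              | IMDB-M      | 150                 | 13.0               | 65.9               |
| 6   | Tox21      | 7,047/784                | 18.6              | 19.3              | SIDER       | 784                 | 33.6               | 35.4               |
| 7   | FreeSolv   | 577/65                   | 8.7               | 8.4               | ToxCast     | 65                  | 18.8               | 19.3               |
| 8   | BBBP       | 1,835/204                | 24.1              | 26.0              | BACE        | 204                 | 34.1               | 36.9               |
| 9   | ClinTox    | 1,329/148                | 26.2              | 27.9              | LIPO        | 148                 | 27.0               | 29.5               |
| 10  | Esol       | 1,015/113                | 13.3              | 13.7              | MUV         | 113                 | 24.2               | 26.3               |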

Statistics of the Graph-level Anomaly Detection Datasets

The statistics of each dataset in the anomaly detection experiments are as follows.

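| Dataset       | # Graphs (Train/Test) | # Nodes (avg.) | # Edges (avg.) |
|---------------|-----------------------|----------------|----------------|
| PROTEINS-full | 360/223               | 39.1           | 72.8           |
| ENZYMES       | 400/120               | 32.6           | 62.1           |
| AIDS          | 1280/400              | 15.7           | 16.2           |
| DHFR          | 368/152               | 42.4           | 44.5           |
| BZR           | 69/81                 | 35.8           | 38.4           |
| COX2          | 81/94                 | 41.2           | 43.5           |
| DD            | 390/236               | 284.3          | 715.7          |
| NCI1          | 1646/822              | 29.8           | 32.3           |
| IMDB-B        | 400/200               | 19.8           | 96.5           |
| REDDIT-B      | 800/400               | 429.6          | 497.8          |
| COLLAB        | 1920/1000             | 74.5           | 2457.8         |
| HSE           | 423/267               | 16.9           | 17.2           |
| MMP           | 6170/238              | 17.6           | 18.0           |
| p53           | 8088/269              | 17.9           | 18.3           |
| PPAR-gamma    | 219/267               | 17.4           | 17.7           |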

Implementation Details

Hyper-parameters

For efficiency, we set the structural encoding dimensions $d_s^{(rw)}$ and $d_s^{(dg)}$ to $16$. The encoders are 5-layer GINs with a hidden dimension of $16$. The projected embeddings have the same dimension as the node embeddings. The batch size is selected from $16$ to $128$ according to the graph sizes in each dataset. The number of clusters $K$ and the self-adaptiveness parameter $\alpha$ are selected through grid search over $\{2, 3, 5, 10, 15, 20, 30\}$ and $\{0, 0.2, 0.4, 0.6, 0.8, 1.0\}$, respectively. The model is trained with the Adam optimizer at a learning rate of $0.0001$ until convergence.
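
The grid search described here covers 7 × 6 = 42 combinations of $K$ and $\alpha$. A minimal sketch of how such a grid could be enumerated is given below. The configuration keys (`rw_dim`, `num_clusters`, and so on) and the `evaluate` callback are hypothetical placeholders, not names from this repository, and the selection metric is left to the caller because it is not specified here.

```python
from itertools import product

# Search ranges quoted in the paragraph above.
K_VALUES = [2, 3, 5, 10, 15, 20, 30]
ALPHA_VALUES = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

# Fixed settings quoted above; the key names are illustrative assumptions.
FIXED = {"rw_dim": 16, "dg_dim": 16, "num_layers": 5, "hidden_dim": 16, "lr": 1e-4}


def build_grid(ks=K_VALUES, alphas=ALPHA_VALUES):
    """Return one config dict per (K, alpha) combination."""
    return [dict(FIXED, num_clusters=k, alpha=a) for k, a in product(ks, alphas)]


def grid_search(evaluate, configs):
    """Return the (config, score) pair with the highest score.

    `evaluate` is a user-supplied callable that trains and scores one config;
    it stands in for a real training run and is not part of this repository.
    """
    best_config, best_score = None, float("-inf")
    for config in configs:
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In practice `evaluate` would launch one training run per configuration and return a validation score, so the 42 runs dominate the cost; the loop itself is trivial.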

Computing Infrastructures

We conduct the experiments on a Linux server with an Intel Xeon Gold 6226R CPU and two Tesla V100S GPUs. We implement our method with PyTorch 1.11.0 and PyTorch Geometric 2.0.4.

Cite

If you compare with, build on, or use aspects of this work, please cite the following:

@inproceedings{liu2023goodd,
  title={GOOD-D: On Unsupervised Graph Out-Of-Distribution Detection},
  author={Liu, Yixin and Ding, Kaize and Liu, Huan and Pan, Shirui},
  booktitle={Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining},
  year={2023}
}
