This repository adds a data preprocessing script to the Deep Graph Infomax (DGI) code.
Deep Graph Infomax (Veličković et al., ICLR 2019): https://arxiv.org/abs/1809.10341
Here we provide an implementation of Deep Graph Infomax (DGI) in PyTorch, along with a minimal execution example (on the Cora dataset). The repository is organised as follows:
- data/ contains the necessary dataset files for Cora;
- models/ contains the implementation of the DGI pipeline (dgi.py) and our logistic regressor (logreg.py);
- layers/ contains the implementation of a GCN layer (gcn.py), the averaging readout (readout.py), and the bilinear discriminator (discriminator.py);
- utils/ contains the necessary processing subroutines (process.py).
Finally, execute.py puts all of the above together and may be used to execute a full training run on Cora.
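To make the roles of the readout and the discriminator concrete, here is a minimal, dependency-free sketch of the two operations as the DGI paper describes them: a summary vector obtained as the sigmoid of the mean node embedding, and a bilinear score sigmoid(h^T W s) for each node. This is not the code in readout.py or discriminator.py, which are written in PyTorch; the function names and the toy values are assumptions made for illustration only.

```python
import math


def sigmoid(x):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def average_readout(embeddings):
    """Summarise node embeddings as sigmoid(mean over nodes), per dimension."""
    n = len(embeddings)
    dim = len(embeddings[0])
    return [sigmoid(sum(h[d] for h in embeddings) / n) for d in range(dim)]


def bilinear_score(h, weight, summary):
    """Score one node embedding h against the summary as sigmoid(h^T W s)."""
    # First compute W s, then take the dot product with h.
    ws = [sum(w * s for w, s in zip(row, summary)) for row in weight]
    return sigmoid(sum(h_i * ws_i for h_i, ws_i in zip(h, ws)))


# Toy inputs (hypothetical): three nodes with 2-dimensional embeddings
# and an identity matrix standing in for the learned weight W.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]

summary = average_readout(embeddings)
scores = [bilinear_score(h, weight, summary) for h in embeddings]
```

In training, scores for nodes of the real graph would be pushed towards 1 and scores for nodes of a corrupted graph towards 0; that part is omitted here.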
This project adds preprocess.py, which contains the preprocessing steps for the Cora dataset.
In the data folder, we have moved the data used by DGI to the cora_dgi folder and added the original Cora dataset in the cora_ori folder. We have also put the dataset used in Planetoid in the cora_nonorm folder.
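As an illustration of what reading the original Cora data can involve, the sketch below parses two in-memory samples. It assumes the commonly distributed layout of the original dataset: a content file whose rows are `<paper_id> <binary word flags...> <class_label>` and a citation file whose rows are `<cited_paper_id> <citing_paper_id>`, both whitespace-separated. Whether the files in cora_ori follow exactly this layout, and whether preprocess.py works this way, is an assumption; the function names and sample rows are hypothetical.

```python
def parse_content(lines):
    """Parse '<paper_id> <word flags...> <label>' rows into ids, features, labels."""
    ids, features, labels = [], [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        ids.append(parts[0])
        features.append([int(v) for v in parts[1:-1]])
        labels.append(parts[-1])
    return ids, features, labels


def parse_cites(lines, ids):
    """Build an undirected adjacency mapping from '<cited_id> <citing_id>' rows."""
    index = {pid: i for i, pid in enumerate(ids)}
    neighbours = {i: set() for i in range(len(ids))}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip malformed or blank rows
        cited, citing = parts
        # Ignore edges that mention papers missing from the content file.
        if cited in index and citing in index and cited != citing:
            a, b = index[cited], index[citing]
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


# Toy rows (hypothetical), not taken from the real dataset.
content = [
    "p1\t0\t1\t0\tNeural_Networks",
    "p2\t1\t0\t0\tRule_Learning",
    "p3\t0\t0\t1\tTheory",
]
cites = ["p1\tp2", "p2\tp3", "p3\tp9"]

ids, features, labels = parse_content(content)
neighbours = parse_cites(cites, ids)
```

Treating citations as undirected edges is a design choice made here for simplicity; a directed variant would only add `b` to `neighbours[a]`.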
Preprocessing is run with python preprocess.py, which produces data in the format used by DGI.
**The script converts the original Cora data into the formatted data used in Planetoid, but the result is still not identical to the data used in DGI, so some work remains, possibly normalization.**
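The note above mentions normalization as possible remaining work. One candidate for bag-of-words features is row normalization, where each node's feature vector is divided by its sum. The sketch below shows that operation on plain Python lists; whether this step alone reproduces the data used in DGI has not been verified here, and the function name and sample values are hypothetical.

```python
def row_normalize(features):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    normalized = []
    for row in features:
        total = float(sum(row))
        if total == 0.0:
            # Avoid division by zero for nodes with no active features.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([v / total for v in row])
    return normalized


# Toy feature matrix (hypothetical): three nodes, four word flags.
features = [[1, 0, 1, 0], [0, 0, 0, 0], [2, 1, 1, 0]]
normalized = row_normalize(features)
```

Comparing the output of such a step against the files in cora_dgi would be one way to check whether normalization accounts for the remaining difference.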
If you make use of DGI in your research, please cite the following in your manuscript:
@inproceedings{
velickovic2018deep,
title="{Deep Graph Infomax}",
author={Petar Veli{\v{c}}kovi{\'{c}} and William Fedus and William L. Hamilton and Pietro Li{\`{o}} and Yoshua Bengio and R Devon Hjelm},
booktitle={International Conference on Learning Representations},
year={2019},
url={https://openreview.net/forum?id=rklz9iAcKQ},
}
License: MIT