An early version of this code had a bug that caused it to process undirected graphs incorrectly. The bug is fixed in the current version.
- data
- algos
- eval
- embds
See an example in data/wiki/
- attrs.txt
- edgelist.txt
- labels.txt
- full.mat
Download edgelists & labels from here. Download mat files from here.
In embds/wiki/
- Split training, testing and negative edge sets for link prediction
$ cd eval/
$ python splitTrainTest.py --action split --data wiki --ratio=0.3
$ python splitTrainTest.py --action select --data wiki --ratio=0.3
Then convert edgelist.train.txt to train.mat, saved in '-v7.3' format with four variables: 'A' (adjacency matrix), 'P' (transition matrix), 'Dout' (out-degree array) and 'Din' (in-degree array). 'A' and 'P' are sparse matrices.
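As a rough illustration of what the four variables hold, here is a minimal pure-Python sketch. It is not the repository's conversion code. It assumes 0-indexed integer node ids, unweighted edges, and a row-normalised transition matrix (P[u, v] = A[u, v] / Dout[u]); the function name `build_graph_arrays` and the dict-based sparse representation are hypothetical.

```python
def build_graph_arrays(edges, n, directed=True):
    """Return (A, P, Dout, Din) for a graph with n nodes.

    A and P are kept sparse as {(row, col): value} dicts.
    Assumption: P is the row-normalised adjacency matrix.
    """
    A = {}
    for u, v in edges:
        A[(u, v)] = 1.0
        if not directed:
            # An undirected edge is stored in both directions.
            A[(v, u)] = 1.0

    Dout = [0.0] * n  # out-degree of each node
    Din = [0.0] * n   # in-degree of each node
    for (u, v), w in A.items():
        Dout[u] += w
        Din[v] += w

    # Every row index u in A has at least one out-edge, so Dout[u] > 0 here.
    P = {(u, v): w / Dout[u] for (u, v), w in A.items()}
    return A, P, Dout, Din


# Toy directed graph with 3 nodes.
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
A, P, Dout, Din = build_graph_arrays(edges, n=3)
```

Note that `scipy.io.savemat` writes MATLAB 5 (or 4) files rather than v7.3, so the final save is probably easiest from MATLAB itself, for example with `save('train.mat', 'A', 'P', 'Dout', 'Din', '-v7.3')`.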
- Generate node pairs for graph reconstruction
$ cd eval/
$ python gen_nodepairs.py --data wiki --ratio=0.01
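The internals of gen_nodepairs.py are not shown here. As a sketch of the general idea, and assuming `--ratio` means the fraction of all ordered node pairs to sample, uniform sampling without replacement could look like the following. The function name `sample_node_pairs` and the `seed` parameter are hypothetical.

```python
import random


def sample_node_pairs(n, ratio, seed=0):
    """Sample distinct ordered pairs (u, v), u != v, uniformly at random.

    Assumption: ratio is a fraction of the n * (n - 1) ordered pairs.
    """
    rng = random.Random(seed)
    total = n * (n - 1)
    k = int(round(total * ratio))

    pairs = []
    # Draw k distinct indices in [0, total) and decode each to a pair,
    # which avoids materialising every possible pair in memory.
    for idx in rng.sample(range(total), k):
        u, r = divmod(idx, n - 1)
        v = r if r < u else r + 1  # skip the diagonal entry (u, u)
        pairs.append((u, v))
    return pairs


pairs = sample_node_pairs(n=100, ratio=0.01)
```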
See readme.md in algos/nrp/
$ cd eval/
$ python eval_linkpred.py --algo nrp --data wiki --d 128
$ python graphreconstruct_util.py --algo nrp --data wiki --d 128
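For orientation, embedding-based link prediction is commonly scored by ranking held-out edges against sampled negative pairs and reporting AUC. The sketch below illustrates that metric only. It is not taken from eval_linkpred.py, and it assumes a single embedding vector per node scored by inner product; the toy embeddings and the helper names `dot` and `auc` are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) score pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Toy 2-dimensional embeddings for four nodes (illustrative values only).
emb = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0], 3: [0.1, 0.9]}
test_edges = [(0, 1), (2, 3)]      # held-out true edges
negative_edges = [(0, 2), (1, 3)]  # sampled non-edges

pos = [dot(emb[u], emb[v]) for u, v in test_edges]
neg = [dot(emb[u], emb[v]) for u, v in negative_edges]
score = auc(pos, neg)
```

The pairwise loop is quadratic in the number of scored pairs, so it is only suitable for small examples; a rank-based computation would be the usual choice at scale.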
@article{yang2020homogeneous,
title={Homogeneous network embedding for massive graphs via reweighted personalized PageRank},
author={Yang, Renchi and Shi, Jieming and Xiao, Xiaokui and Yang, Yin and Bhowmick, Sourav S},
journal={Proceedings of the VLDB Endowment},
volume={13},
number={5},
pages={670--683},
year={2020},
publisher={VLDB Endowment}
}