struc2vec: Learning Node Representations from Structural Identity

Structural identity is a concept of symmetry in which network nodes are identified according to the network structure and their relationship to other nodes. The struc2vec paper proposes a novel and flexible framework for learning latent representations of this structural identity. This example reproduces the struc2vec algorithm in PGL.
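To make "structural identity" concrete, here is a minimal sketch of the kind of structural distance struc2vec builds on: compare the sorted degree sequences of the nodes found at each hop distance from two nodes, using dynamic time warping with the cost max(a, b) / min(a, b) - 1. This is a plain-Python illustration, not the code in this folder; the helper names (`ring_degree_sequences`, `dtw`, `structural_distance`) and the toy path graph are assumptions made for the example, and the real implementation relies on fastdtw for speed.

```python
def ring_degree_sequences(adj, node, max_hops):
    """Sorted degree sequence of the nodes at exactly k hops, for k = 0..max_hops."""
    seen = {node}
    frontier = [node]
    rings = []
    for _ in range(max_hops + 1):
        if not frontier:
            break
        rings.append(sorted(len(adj[v]) for v in frontier))
        next_frontier = []
        for v in frontier:
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    next_frontier.append(u)
        frontier = next_frontier
    return rings


def dtw(a, b):
    """Plain O(len(a) * len(b)) dynamic time warping.

    Assumes every degree is at least 1 (no isolated nodes), so the
    ratio cost never divides by zero.
    """
    inf = float("inf")
    table = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    table[0][0] = 0.0
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            cost = max(x, y) / min(x, y) - 1.0
            table[i][j] = cost + min(
                table[i - 1][j], table[i][j - 1], table[i - 1][j - 1]
            )
    return table[-1][-1]


def structural_distance(adj, u, v, max_hops=2):
    """Sum the per-hop DTW distances over the hops that exist for both nodes."""
    rings_u = ring_degree_sequences(adj, u, max_hops)
    rings_v = ring_degree_sequences(adj, v, max_hops)
    return sum(dtw(a, b) for a, b in zip(rings_u, rings_v))


# Toy path graph 0 - 1 - 2 - 3 - 4 (illustrative only).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}

d_endpoints = structural_distance(path, 0, 4)  # both endpoints play the same role
d_mixed = structural_distance(path, 0, 2)      # endpoint vs. middle node
print(d_endpoints, d_mixed)
```

The two endpoints of the path are far apart in the graph but get distance zero because their neighborhoods look the same, which is the sense in which the learned embeddings are meant to capture structural role rather than proximity.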

Dataset

The paper uses an air-traffic network to validate the struc2vec algorithm. Each edge in the dataset indicates that a flight exists between two airports. The connections between airports are used to predict each airport's level of activity. The following dataset is used to validate the accuracy of the algorithm. The data was collected from the Bureau of Transportation Statistics from January to October, 2016. The network has 1,190 nodes and 13,599 edges (diameter is 8). Link

  • usa-airports.edgelist
  • labels-usa-airports.txt
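As a rough sketch of how the two files could be read, the snippet below parses an edge list and a label file into plain Python structures. The file layout is an assumption: one whitespace-separated `src dst` pair per line in the edge list, and one `node label` pair per line (optionally preceded by a header row) in the label file. The in-memory sample data is made up for illustration and is not taken from the airport dataset.

```python
import io
from collections import defaultdict


def read_edgelist(handle):
    """Build an undirected adjacency map from 'src dst' lines."""
    adj = defaultdict(set)
    for line in handle:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        u, v = int(parts[0]), int(parts[1])
        adj[u].add(v)
        adj[v].add(u)
    return adj


def read_labels(handle):
    """Map node id -> class label from 'node label' lines."""
    labels = {}
    for line in handle:
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank lines and a possible header row
        labels[int(parts[0])] = int(parts[1])
    return labels


# Made-up sample content standing in for the real files.
sample_edges = io.StringIO("0 1\n1 2\n2 0\n2 3\n")
sample_labels = io.StringIO("node label\n0 1\n1 1\n2 0\n3 2\n")

adj = read_edgelist(sample_edges)
labels = read_labels(sample_labels)
num_edges = sum(len(neighbors) for neighbors in adj.values()) // 2
print(len(adj), num_edges, labels)
```

To run it on the real data, the two `io.StringIO` objects would be replaced with `open("data/usa-airports.edgelist")` and `open("data/labels-usa-airports.txt")`, provided the files follow the assumed layout.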

Dependencies

To use the struc2vec model in PGL, please additionally install gensim, pathos, and fastdtw.

  • paddlepaddle>=1.6
  • pgl
  • gensim
  • pathos
  • fastdtw

How to use

For example, to train and validate the struc2vec model on the American airport dataset:

python struc2vec.py --edge_file data/usa-airports.edgelist --label_file data/labels-usa-airports.txt --train True --valid True --opt2 True
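The command above passes boolean options as explicit values (`--train True`). How struc2vec.py parses them is not shown here; the sketch below is only one common way to handle such flags with `argparse`, using a hypothetical `str2bool` helper. A converter like this matters because `type=bool` would treat the string "False" as true.

```python
import argparse


def str2bool(value):
    """Convert strings such as 'True' or 'false' to a real bool (hypothetical helper)."""
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "y", "1"):
        return True
    if text in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


# Flag names follow the table in this README; the defaults are assumptions.
parser = argparse.ArgumentParser()
parser.add_argument("--train", type=str2bool, default=False)
parser.add_argument("--valid", type=str2bool, default=False)
parser.add_argument("--opt1", type=str2bool, default=False)
parser.add_argument("--opt2", type=str2bool, default=False)

# Parse an explicit argument list so the sketch runs without a command line.
args = parser.parse_args(["--train", "True", "--valid", "False", "--opt2", "True"])
print(args.train, args.valid, args.opt1, args.opt2)
```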

Hyperparameters

| Args | Meaning |
| --- | --- |
| edge_file | Input file name for edges |
| label_file | Input file name for node labels |
| emb_file | File name for node embeddings |
| walk_depth | The number of steps for the random walk |
| opt1 | The flag to enable optimization 1 to reduce time cost |
| opt2 | The flag to enable optimization 2 to reduce time cost |
| w2v_emb_size | The dimension of the output word2vec embedding |
| w2v_window_size | The context length of word2vec |
| w2v_epoch | The number of epochs to train the model |
| train | The flag to run the struc2vec algorithm to get the w2v embedding |
| valid | The flag to use the w2v embedding to validate the classification result |
| num_class | The number of classes in the classification model to be trained |

Experiment results

| Dataset | Model | Metric | PGL Result | Paper repo Result |
| --- | --- | --- | --- | --- |
| American airport dataset | Struc2vec without time cost optimization | ACC | 0.6483 | 0.6340 |
| American airport dataset | Struc2vec with optimization 1 | ACC | 0.6466 | 0.6242 |
| American airport dataset | Struc2vec with optimization 2 | ACC | 0.6252 | 0.6241 |
| American airport dataset | Struc2vec with optimization 1 & 2 | ACC | 0.6226 | 0.6083 |