Skip to content

ZhijunLiu95/MTSN

 
 

Repository files navigation

Motif-Preserving Dynamic Network Embedding via Temporal Shift Module

This README is also based on DySAT.

This repository contains a TensorFlow implementation of TIME - Motif-Preserving Dynamic Network Embedding via Temporal Shift Module. We use PGD to get the motif matrix of the network TIME is an unsupervised temporal graph embedding model to learn node embeddings in dynamic time-evolving attributed graphs.

Requirements:

Recent versions of TensorFlow (<= 1.14), numpy, scipy, sklearn, and networkx (<= 1.11) are required. The code has been tested under Python 2.7. The required packages can be installed using the following command:

$ pip install -r requirements.txt

To guarantee that you have the right package versions, you can use Anaconda to set up a virtual environment and install the dependencies from requirements.txt.

Input Format

In order to use your own data, you have to provide:

  • graphs: list of networkx graphs (or multigraphs) for each time step, saved as .npz files. Have a look at the load_graphs() and load_feats() functions in utils/preprocess.py for an example.

  • features: list of N x D feature matrices (N is the number of nodes and D is the number of features per node) in scipy sparse format) -- optional.

Repository Organization

  • data/ contains the necessary input file(s) for each dataset after pre-processing.
  • raw_data/ contains data pre-processing jupyter notebooks for reference.
  • models/ contains the implementation of model - TIME.
  • utils/ contains:
    • preprocessing subroutines (preprocess.py, utilities.py, random_walk.py);
    • minibatch iterators (minibatch.py, incremental_minibatch.py);
  • eval/ contains evaluation scripts that use simple logistic regression classifiers for link prediction based on the learnt node embeddings.

Running the code

The code can be run by executing python run_script.py. The default values of all parameters are set in the script file and can be specified as command line arguments. The most important arguments are min_time and max_time that specify the range of time steps to train the model. This script calls multiple instances of train.py (or train_incremental.py) with time steps in this range (both ends included).

For example, if min_time is 2 and max_time is 3, two instances of the model are trained, where the first one trains on the G1, while the second instance trains on G1 and G2. In case of link prediction, the evaluation is performed on the links in G2 for the first instance, and the links of G3 for the second.

The other hyper-parameters of the model are specified in run_script.py (along with detailed descriptions) and may need to be appropriately tuned for different datasets. Parameters not mentioned in the paper have NO effect in this Implementation.

Logging Directory

For logging, the model flag should be provided to specify the variant/version of the experimented model (initially set to default), in addition to choosing base_model as TIME.

A logging directory log_dir is then created at ./logs/<base_model>_<model>/, overwriting any existing files that might conflict.

The output of the model, log files and evaluation results (on link prediction) will be stored in subdirectories of log_dir, with date-wise logged files, along with the set of hyper-parameters and settings used in the experiment.

The learnt embeddings will be stored in numpy formatted files at subdirectory output/ and the results of downstream evaluation tasks will be stored in a subdirectory csv/, within log_dir.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 61.1%
  • Jupyter Notebook 38.9%