
A Modular Framework for Unsupervised Graph Representation Learning

Methods for unsupervised representation learning on graphs can be described in terms of modules:

  • Graph encoders
  • Representations
  • Scoring functions
  • Loss functions
  • Sampling strategies

By identifying these modules, we can reproduce existing methods:

Variational Graph Autoencoders (Kipf and Welling, 2016):

encoder = GCNEncoder(dataset.num_features, hidden_dims=[256, 128])
representation = GaussianVariational()
loss = bceloss
sampling = FirstNeighborSampling

Graph Autoencoders (Kipf and Welling, 2016):

encoder = GCNEncoder(dataset.num_features, hidden_dims=[256, 128])
representation = EuclideanInnerProduct()
loss = bceloss
sampling = FirstNeighborSampling

Deep Graph Infomax (Veličković et al., 2018):

encoder = GCNEncoder(dataset.num_features, hidden_dims=[256, 128])
representation = EuclideanBilinear()
loss = bceloss
sampling = GraphCorruptionSampling

Graph2Gauss (Bojchevski and Günnemann, 2017):

encoder = MLPEncoder(dataset.num_features, hidden_dims=[256, 128])
representation = Gaussian()
loss = square_exponential
sampling = RankedSampling

We can also use this framework to create new methods. For example, we can simplify Graph2Gauss by replacing its Gaussian representation with a plain Euclidean distance:

encoder = MLPEncoder(dataset.num_features, hidden_dims=[256, 128])
representation = EuclideanDistance()
loss = square_exponential
sampling = RankedSampling

Under this framework, all these methods can be trained and evaluated with the same procedure:

method = EmbeddingMethod(encoder, representation, loss, sampling)
embeddings, results = train(dataset, method)
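To make the interplay of the modules concrete, here is a minimal sketch of how such a wrapper could compose them during training. The attribute names, the sample/score/loss flow, and the data.x / data.edge_index fields (PyTorch Geometric conventions) are assumptions for illustration, not the repository's actual implementation:

class EmbeddingMethod:
    # Bundles the modules that define an unsupervised embedding method.
    def __init__(self, encoder, representation, loss, sampling):
        self.encoder = encoder                # maps node features + edges to embeddings
        self.representation = representation  # scores (pairs of) embeddings
        self.loss = loss                      # turns scores into a training objective
        self.sampling = sampling              # draws positive/negative examples

    def compute_loss(self, data):
        # data.x / data.edge_index follow PyTorch Geometric naming (an assumption)
        z = self.encoder(data.x, data.edge_index)
        pos, neg = self.sampling.sample(data)
        pos_scores = self.representation.score(z, pos)
        neg_scores = self.representation.score(z, neg)
        return self.loss(pos_scores, neg_scores)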

Installation

Create a conda environment with all the requirements (edit environment.yml if you want to change the name of the environment):

conda env create -f environment.yml

Activate the environment:

source activate graphlearn

We use Sacred to run and log all the experiments. To list the configuration variables and their default values, run:

python train.py print_config
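
Since these are Sacred experiments, any configuration variable can also be overridden on the command line using Sacred's standard with syntax, for example (the dataset value here is only illustrative; check print_config for the valid options):

python train.py link_pred_experiments with dataset_str='citeseer'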

Two commands are available: link_pred_experiments and node_class_experiments.
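
The node classification experiments follow the same pattern; for example, to run them with the default configuration:

python train.py node_class_experiments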

Running the experiments

The default settings train our best method (EB-GAE) on the link prediction task with the Cora dataset:

python train.py link_pred_experiments

Other methods can be evaluated as well:

GAE

python train.py link_pred_experiments \
    with dataset_str='cora' \
    encoder_str='gcn' \
    repr_str='euclidean_inner' \
    loss_str='bce_loss' \
    sampling_str='first_neighbors'

DGI

python train.py link_pred_experiments \
    with dataset_str='cora' \
    encoder_str='gcn' \
    repr_str='euclidean_infomax' \
    loss_str='bce_loss' \
    sampling_str='graph_corruption'

Graph2Gauss

python train.py link_pred_experiments \
    with dataset_str='cora' \
    encoder_str='mlp' \
    repr_str='gaussian' \
    loss_str='square_exponential_loss' \
    sampling_str='ranked'
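
VGAE

The VGAE configuration from the first snippet should be reachable the same way. The repr_str value below is a guess extrapolated from the naming pattern of the other options and the GaussianVariational class; verify the exact string with print_config:

# repr_str below is assumed, not confirmed; check `python train.py print_config`
python train.py link_pred_experiments \
    with dataset_str='cora' \
    encoder_str='gcn' \
    repr_str='gaussian_variational' \
    loss_str='bce_loss' \
    sampling_str='first_neighbors'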
