**CS 598 Final Project Implementation**

Ratchahan Sujithan (rls7)


Smarak Pattnaik (smarakp2)

**Graph Convolution Definition**

Define a single Graph Convolution Layer as outlined in the paper, with:
- constructor (__init__)
- reset_parameters
- forward
- object representation function (__repr__)

**For function `reset_parameters()`:**

Section 5.2 - "We initialize weights using the initialization described in Glorot & Bengio (2010) and
accordingly (row-)normalize input feature vector"

In the paper "Understanding the difficulty of training deep feedforward neural networks" by Xavier Glorot and Yoshua Bengio:

"We initialized the biases to be 0 and the weights Wij at
each layer with the following commonly used heuristic:

$W_{ij} \sim U\left[ -\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}} \right]$

where U[−a, a] is the uniform distribution in the interval
(−a, a) and n is the size of the previous layer (the number
of columns of W)."
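
As a minimal sketch of the heuristic quoted above, the snippet below draws a weight matrix uniformly from $(-1/\sqrt{n}, 1/\sqrt{n})$ and sets the biases to 0. The helper name `uniform_heuristic_init` and the use of NumPy are assumptions made for illustration; the notebook itself uses `torch`. Note that `reset_parameters` in the class below computes its bound from `self.weight.size(1)`, which is the output dimension of the `(in_features, out_features)` weight, so it only approximates the quoted rule.

```python
import numpy as np

def uniform_heuristic_init(n_in, n_out, rng):
    """Draw an (n_in, n_out) weight matrix from U(-1/sqrt(n_in), 1/sqrt(n_in)).

    Hypothetical helper for illustration; n_in plays the role of n
    (the size of the previous layer) in the quoted heuristic.
    """
    bound = 1.0 / np.sqrt(n_in)
    weights = rng.uniform(-bound, bound, size=(n_in, n_out))
    biases = np.zeros(n_out)  # the quoted heuristic sets biases to 0
    return weights, biases, bound

# 1433 input features (Cora) and 16 hidden units, as used later in this notebook.
rng = np.random.default_rng(0)
W, b, bound = uniform_heuristic_init(n_in=1433, n_out=16, rng=rng)
```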

**For function `forward()`:**

Section 2 - "We consider a multi-layer Graph Convolutional
Network (GCN) with the following layer-wise propagation rule:"

$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H^{(l)}W^{(l)}\right)$

Section 3.2 - "In practice, we make use of TensorFlow (Abadi et al., 2015) for an efficient GPU-based implementation of Eq. 9 using sparse-dense matrix multiplications."
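
To make the propagation rule concrete, here is a small dense NumPy sketch of a single layer on a toy graph. It is not the sparse implementation used in this notebook: the function name `gcn_layer`, the toy matrices, and the choice of ReLU for $\sigma$ are assumptions made for illustration.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One propagation step: ReLU(D~^{-1/2} (A + I) D~^{-1/2} H W)."""
    A_tilde = A + np.eye(A.shape[0])           # add self-loops
    d = A_tilde.sum(axis=1)                    # degrees of A_tilde
    D_inv_sqrt = np.diag(d ** -0.5)            # D~^{-1/2}; d >= 1 thanks to self-loops
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_hat @ H @ W, 0.0)      # sigma = ReLU (assumed here)

# Toy path graph 0 - 1 - 2 with 2 input features and 2 output features.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.array([[1., 0.],
              [0., 1.],
              [1., 1.]])
W = np.array([[1., -1.],
              [0.5, 2.]])
H_next = gcn_layer(A, H, W)
```

With an edgeless graph the normalized adjacency reduces to the identity, so the layer falls back to an ordinary dense layer applied to each node independently.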


In [1]:
import math

import torch

from torch.nn.parameter import Parameter
from torch.nn.modules.module import Module


class GraphConvolution(Module):
    """
    Simple GCN layer, similar to https://arxiv.org/abs/1609.02907
    """

    def __init__(self, in_features, out_features, bias=True):
        super(GraphConvolution, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        # constructor loads and creates weight Parameter object
        self.weight = Parameter(torch.FloatTensor(in_features, out_features))
        # load and create bias Parameter object
        if bias:
            self.bias = Parameter(torch.FloatTensor(out_features))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

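    # From Section 5.2 - "We initialize weights using the initialization
    # described in Glorot & Bengio (2010) and accordingly (row-)normalize input feature vector"
    def reset_parameters(self):
        # Note: weight has shape (in_features, out_features), so size(1) is the
        # output dimension, whereas n in the quoted heuristic is the input size.
        stdv = 1. / math.sqrt(self.weight.size(1))
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            # Biases are drawn from the same range here (the quote sets them to 0).
            self.bias.data.uniform_(-stdv, stdv)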

    # From Section 3.2 - "In practice, we make use of TensorFlow (Abadi et al., 2015)
    # for an efficient GPU-based implementation of Eq. 9 using sparse-dense matrix multiplications."
    def forward(self, input, adj):
        # matrix multiplication between input and weight
        support = torch.mm(input, self.weight)
        # sparse matrix multiplication between adj and the output of the previous product
        output = torch.spmm(adj, support)
        if self.bias is not None:
            return output + self.bias
        else:
            return output

    def __repr__(self):
        return self.__class__.__name__ + ' (' \
               + str(self.in_features) + ' -> ' \
               + str(self.out_features) + ')'

**GCN Model Definition**

Define the constructor and forward function for the full GCN as outlined in the paper using the previously defined Graph Convolution Layer.

We follow the equation below, given in the paper:

$Z = f(X, A) = \mathrm{softmax}\left(\hat{A}\,\mathrm{ReLU}\left(\hat{A} X W^{(0)}\right) W^{(1)}\right)$

Specifically, the order of operations is as follows:

- Renormalization of the incoming adjacency matrix to prevent exploding/vanishing gradients. The adjacency matrix (with self-loops added) is normalized by a diagonal degree matrix whose diagonal entries are the row sums of the adjacency matrix. In other words, each row of the adjacency matrix is divided by the degree of the corresponding node in the graph. This step is performed once in `load_data`, before the model is called.
- Pass to the first GCN layer
- Pass through ReLU activation function
- Pass through dropout layer
- Pass to the second GCN layer
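- Pass through `log_softmax` (paired with the negative log-likelihood loss used in training, this is equivalent to the softmax with cross-entropy in the equation above)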
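
The steps above can be sketched end to end in dense NumPy. This is an illustrative sketch only: the helper names (`row_normalize`, `log_softmax`, `gcn_forward`) and the random toy inputs are assumptions, biases are omitted, and dropout is shown only as a comment because it is inactive at evaluation time.

```python
import numpy as np

def row_normalize(M):
    """Divide each row by its sum; rows summing to 0 are left as zeros."""
    rowsum = M.sum(axis=1, keepdims=True)
    return np.divide(M, rowsum, out=np.zeros_like(M), where=rowsum != 0)

def log_softmax(Z):
    """Numerically stable log-softmax along each row."""
    Z = Z - Z.max(axis=1, keepdims=True)
    return Z - np.log(np.exp(Z).sum(axis=1, keepdims=True))

def gcn_forward(A, X, W0, W1):
    A_hat = row_normalize(A + np.eye(A.shape[0]))  # normalized adjacency with self-loops
    H = np.maximum(A_hat @ X @ W0, 0.0)            # first GCN layer + ReLU
    # Dropout would be applied to H here during training;
    # it is the identity at evaluation time.
    return log_softmax(A_hat @ H @ W1)             # second GCN layer + log_softmax

rng = np.random.default_rng(0)
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])     # toy path graph with 4 nodes
X = rng.random((4, 5))               # 5 input features per node
W0 = rng.standard_normal((5, 3))     # 3 hidden units
W1 = rng.standard_normal((3, 2))     # 2 classes
log_probs = gcn_forward(A, X, W0, W1)
```

Exponentiating each row of `log_probs` gives a probability distribution over the classes for that node.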

In [2]:
import torch.nn as nn
import torch.nn.functional as F

class GCN(nn.Module):
    def __init__(self, nfeat, nhid, nclass, dropout):
        super(GCN, self).__init__()

        self.gc1 = GraphConvolution(nfeat, nhid)
        self.gc2 = GraphConvolution(nhid, nclass)
        self.dropout = dropout

    def forward(self, x, adj):
        x = F.relu(self.gc1(x, adj))
        x = F.dropout(x, self.dropout, training=self.training)
        x = self.gc2(x, adj)
        return F.log_softmax(x, dim=1)

**Utils Functions**

Utility functions used by the training module.

The original `load_data` function was written to handle only the Cora dataset. The Cora dataset contains a .cites file ("the citation graph of the corpus") and a .content file (each line contains a "unique string ID of the paper followed by binary values indicating whether each word in the vocabulary is present (indicated by 1) or absent (indicated by 0) in the paper"). The data source that the original author provides (http://www.cs.umd.edu/~sen/lbc-proj/LBC.html) does not appear to have been maintained. Unlike the Cora dataset, the Citeseer dataset contains text in the .cites file instead of just node IDs, which breaks the original `load_data()` function. In addition, the Pubmed data appears to have been taken down, and we were unable to find a Pubmed dataset online in the same format as the Cora dataset.

The DGL library provides the dataset classes `dgl.data.citation_graph.CitationGraphDataset`, `dgl.data.PubmedGraphDataset`, and `dgl.data.CiteseerGraphDataset`, which load each citation graph as a `Dataset` object. We rewrote the `load_data` function to accept these `Dataset` objects and return the `adj`, `features`, `labels`, `idx_train`, `idx_val`, and `idx_test` values. The function first takes the graph from the dataset and reads the `features`, `labels`, and split masks from the graph's `ndata` attribute. We get the adjacency matrix of the graph as a dense tensor, convert it into a NumPy array, and then turn it into a SciPy sparse matrix in compressed sparse row (CSR) format. We add self-loops and normalize the adjacency matrix following Section 2.2 of the paper, using the row-normalized form implemented in `normalize`, and row-normalize the features as well. Next, we convert the training, validation, and test masks, all of which are conveniently provided by the `Dataset` objects, into index tensors. Lastly, we convert the data to PyTorch tensors and return the results.
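
As a small illustration of the mask-to-index step described above, the sketch below converts boolean split masks into index arrays with `np.where`. The six-node masks are hypothetical stand-ins for the `train_mask`, `val_mask`, and `test_mask` tensors that the DGL datasets expose.

```python
import numpy as np

# Hypothetical boolean masks for a 6-node graph (True marks membership).
train_mask = np.array([True, True, False, False, False, False])
val_mask = np.array([False, False, True, False, False, False])
test_mask = np.array([False, False, False, True, True, False])

# np.where on a 1-D boolean array returns a tuple holding one index array.
idx_train = np.where(train_mask)[0]
idx_val = np.where(val_mask)[0]
idx_test = np.where(test_mask)[0]

# The three splits should not share any node.
assert not (train_mask & val_mask).any()
assert not (train_mask & test_mask).any()
assert not (val_mask & test_mask).any()
```

Nodes that belong to none of the masks (node 5 here) still take part in message passing; they simply do not contribute to any loss or accuracy computation.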

From Section 2.2 - "Repeated application of this operator can therefore lead to numerical instabilities and exploding/vanishing gradients when used in a deep neural network model. To alleviate this problem, we introduce the following renormalization trick":

$I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \rightarrow \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}, \text{ with }
\tilde{A} = A + I_N \text{ and } \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$
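
The snippet below is a minimal sketch contrasting the symmetric form in the equation above with the row-normalized form $\tilde{D}^{-1}\tilde{A}$ that the `normalize` function in this notebook computes. The three-node star graph and the variable names are assumptions chosen for illustration.

```python
import numpy as np

A = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])   # star graph: node 0 is connected to nodes 1 and 2
A_tilde = A + np.eye(3)        # A~ = A + I_N
deg = A_tilde.sum(axis=1)      # D~_ii = sum_j A~_ij

# Symmetric renormalization from the paper: D~^{-1/2} A~ D~^{-1/2}
sym = np.diag(deg ** -0.5) @ A_tilde @ np.diag(deg ** -0.5)
# Row normalization as implemented by `normalize`: D~^{-1} A~
row = np.diag(1.0 / deg) @ A_tilde

sym_is_symmetric = np.allclose(sym, sym.T)
row_sums_to_one = np.allclose(row.sum(axis=1), 1.0)
row_is_symmetric = np.allclose(row, row.T)
```

The symmetric form keeps the matrix symmetric, while the row form turns each row into a weighted average over a node's neighborhood and is in general not symmetric.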

In [3]:
import numpy as np
import scipy.sparse as sp
import torch


def encode_onehot(labels):
    classes = set(labels)
    classes_dict = {c: np.identity(len(classes))[i, :] for i, c in
                    enumerate(classes)}
    labels_onehot = np.array(list(map(classes_dict.get, labels)),
                             dtype=np.int32)
    return labels_onehot
  
# The original load_data function
#def load_data(path="../data/cora/", dataset="cora"):
#    """Load citation network dataset (cora only for now)"""
#    print('Loading {} dataset...'.format(dataset))

#    idx_features_labels = np.genfromtxt("{}{}.content".format(path, dataset),
#                                        dtype=np.dtype(str))
#    features = sp.csr_matrix(idx_features_labels[:, 1:-1], dtype=np.float32)
#    labels = encode_onehot(idx_features_labels[:, -1])
#
#    # build graph
#    idx = np.array(idx_features_labels[:, 0], dtype=np.int32)
#    idx_map = {j: i for i, j in enumerate(idx)}
#    edges_unordered = np.genfromtxt("{}{}.cites".format(path, dataset),
#                                    dtype=np.int32)
#    edges = np.array(list(map(idx_map.get, edges_unordered.flatten())),
#                     dtype=np.int32).reshape(edges_unordered.shape)
#    adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
#                        shape=(labels.shape[0], labels.shape[0]),
#                        dtype=np.float32)
#
#    # build symmetric adjacency matrix
#    adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)
#
#    features = normalize(features)
#    adj = normalize(adj + sp.eye(adj.shape[0]))
#
#    idx_train = range(140)
#    idx_val = range(200, 500)
#    idx_test = range(500, 1500)
#
#    features = torch.FloatTensor(np.array(features.todense()))
#    labels = torch.LongTensor(np.where(labels)[1])
#    adj = sparse_mx_to_torch_sparse_tensor(adj)
#
#    idx_train = torch.LongTensor(idx_train)
#    idx_val = torch.LongTensor(idx_val)
#    idx_test = torch.LongTensor(idx_test)
#
#    return adj, features, labels, idx_train, idx_val, idx_test

def load_data(dataset):
    # Extract graph, features, and labels from the dataset
    graph = dataset[0]
    features = graph.ndata['feat']
    labels = graph.ndata['label']
    train_mask = graph.ndata['train_mask']
    val_mask = graph.ndata['val_mask']
    test_mask = graph.ndata['test_mask']

    # Get the adjacency matrix of the graph as a dense tensor and convert this 
    # dense tensor into a numpy array
    adj = graph.adjacency_matrix().to_dense().numpy()

    # Convert to a scipy sparse matrix (CSR format)
    adj = sp.csr_matrix(adj)

    # normalize adjacency matrix and features as outlined in section 2.2
    adj = normalize(adj + sp.eye(adj.shape[0]))
    features = normalize(features)

    # Get training, validation, and test set indices
    idx_train = torch.LongTensor(np.where(train_mask)[0])
    idx_val = torch.LongTensor(np.where(val_mask)[0])
    idx_test = torch.LongTensor(np.where(test_mask)[0])

    # Convert data to PyTorch tensors
    features = torch.FloatTensor(features)
    labels = torch.LongTensor(labels)
    adj = sparse_mx_to_torch_sparse_tensor(adj)

    return adj, features, labels, idx_train, idx_val, idx_test

# From section 2.2 - "Repeated application of this operator can therefore lead 
# to numerical instabilities and exploding/vanishing gradients when used in a deep 
# neural network model. To alleviate this problem, we introduce the (...) renormalization trick"
def normalize(mx):
    """Row-normalize sparse matrix"""
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)
    return mx


def accuracy(output, labels):
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double()
    correct = correct.sum()
    return correct / len(labels)


def sparse_mx_to_torch_sparse_tensor(sparse_mx):
    """Convert a scipy sparse matrix to a torch sparse tensor."""
    sparse_mx = sparse_mx.tocoo().astype(np.float32)
    indices = torch.from_numpy(
        np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
    values = torch.from_numpy(sparse_mx.data)
    shape = torch.Size(sparse_mx.shape)
    # torch.sparse_coo_tensor replaces the deprecated torch.sparse.FloatTensor constructor
    return torch.sparse_coo_tensor(indices, values, shape)

**Training Module**

We added the parameters `model`, `adj`, `labels`, `features`, `idx_train`, `idx_val`, and `optimizer` to the `train()` function, since we now need to support different model instances for different datasets (as opposed to the original code, which only needed to support the Cora dataset). Similarly, we added the parameters `model`, `adj`, `labels`, `features`, and `idx_test` to the `test()` function. Our train and test functions no longer depend on global definitions of these values, so we can more easily train and test different instances of the model.

In [4]:
!pip install dgl



In [5]:
from __future__ import division
from __future__ import print_function

import time
import argparse
import numpy as np

import torch
import torch.nn.functional as F
import torch.optim as optim
import dgl

eval_validation_separately = True

def train(epoch, model, adj, labels, features, idx_train, idx_val, optimizer):
    t = time.time()
    model.train()
    optimizer.zero_grad()
    output = model(features, adj)
    loss_train = F.nll_loss(output[idx_train], labels[idx_train])
    acc_train = accuracy(output[idx_train], labels[idx_train])
    loss_train.backward()
    optimizer.step()

    if eval_validation_separately:
        # Evaluate validation set performance separately,
        # deactivates dropout during validation run.
        model.eval()
        output = model(features, adj)

    loss_val = F.nll_loss(output[idx_val], labels[idx_val])
    acc_val = accuracy(output[idx_val], labels[idx_val])
    #print('Epoch: {:04d}'.format(epoch+1),
    #      'loss_train: {:.4f}'.format(loss_train.item()),
    #      'acc_train: {:.4f}'.format(acc_train.item()),
    #      'loss_val: {:.4f}'.format(loss_val.item()),
    #      'acc_val: {:.4f}'.format(acc_val.item()),
    #      'time: {:.4f}s'.format(time.time() - t))


def test(model, adj, labels, features, idx_test):
    model.eval()
    output = model(features, adj)
    loss_test = F.nll_loss(output[idx_test], labels[idx_test])
    acc_test = accuracy(output[idx_test], labels[idx_test])
    print("Test set results:",
          "loss= {:.4f}".format(loss_test.item()),
          "accuracy= {:.4f}".format(acc_test.item()))


Define the function `train_and_test_model`, which takes the dataset as a parameter. This function calls the `load_data` function, defines the model and optimizer, and then calls the `train` and `test` functions. This makes it easy to create model instances and to train and test them on each dataset.

In [6]:
def train_and_test_model(dataset):
  np.random.seed(42)
  torch.manual_seed(42)
  torch.cuda.manual_seed(42)
  

  # Load data
  adj, features, labels, idx_train, idx_val, idx_test = load_data(dataset)

  # Model and optimizer
  model = GCN(nfeat=features.shape[1],
              nhid=16,
              nclass=labels.max().item() + 1,
              dropout=0.5)
  optimizer = optim.Adam(model.parameters(),
                        lr=0.01, weight_decay=5e-4)
  
  print(torch.cuda.is_available())
  model.cuda()
  features = features.cuda()
  adj = adj.cuda()
  labels = labels.cuda()
  idx_train = idx_train.cuda()
  idx_val = idx_val.cuda()
  idx_test = idx_test.cuda()

  # Train model
  t_total = time.time()
  print("TRAIN")
  for epoch in range(200):
      train(epoch, model, adj, labels, features, idx_train, idx_val, optimizer)
  print("Optimization Finished!")
  print("Total time elapsed: {:.4f}s".format(time.time() - t_total))

  # Testing
  print("TEST")
  test(model, adj, labels, features, idx_test)

Run the model on the three citation network datasets (Citeseer, Cora, and Pubmed).

In [7]:
citeseer_dataset =  dgl.data.CiteseerGraphDataset()
train_and_test_model(citeseer_dataset)

  NumNodes: 3327
  NumEdges: 9228
  NumFeats: 3703
  NumClasses: 6
  NumTrainingSamples: 120
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.


  r_inv = np.power(rowsum, -1).flatten()


True
TRAIN
Optimization Finished!
Total time elapsed: 3.4275s
TEST
Test set results: loss= 1.0236 accuracy= 0.7050


In [8]:
cora_dataset =  dgl.data.CoraGraphDataset()
train_and_test_model(cora_dataset)

  NumNodes: 2708
  NumEdges: 10556
  NumFeats: 1433
  NumClasses: 7
  NumTrainingSamples: 140
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 1.0107s
TEST
Test set results: loss= 0.7089 accuracy= 0.8160


In [9]:
pubmed_dataset =  dgl.data.PubmedGraphDataset()
train_and_test_model(pubmed_dataset)

  NumNodes: 19717
  NumEdges: 88651
  NumFeats: 500
  NumClasses: 3
  NumTrainingSamples: 60
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 1.1611s
TEST
Test set results: loss= 0.5586 accuracy= 0.7850


**Ablation 1 - Renormalization Removal**

In Section 2.2 of the paper, the authors use a renormalization trick to prevent exploding/vanishing gradients. The authors note that this is an important technique to maximize the accuracy of the classification results. The adjacency matrix is normalized by a diagonal degree matrix whose diagonal entries are the row sums of the adjacency matrix. In other words, the adjacency matrix is normalized by the degree of each corresponding node in the graph. We remove this normalization and observe how it impacts the performance of the model.

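Remove the normalization from the `load_data` function and name the new function `load_data_no_normalization`. Self-loops are still added to the adjacency matrix, but neither the adjacency matrix nor the features are normalized. Call this function from `train_and_test_model_no_normalization`.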
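
To illustrate why the unnormalized operator can be numerically troublesome, the sketch below repeatedly applies both the raw and the row-normalized adjacency matrix to a vector of ones. The random 50-node graph, the edge probability of 0.1, and the five aggregation rounds are assumptions chosen only for illustration; they are not taken from the datasets used in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
# Hypothetical random undirected graph, used only for illustration.
upper = np.triu(rng.random((n, n)) < 0.1, k=1).astype(float)
A_tilde = upper + upper.T + np.eye(n)                   # symmetric adjacency with self-loops
A_norm = A_tilde / A_tilde.sum(axis=1, keepdims=True)   # row-normalized version

x_raw = np.ones(n)
x_norm = np.ones(n)
for _ in range(5):                # five rounds of neighbor aggregation
    x_raw = A_tilde @ x_raw       # sums over neighbors: grows with node degree
    x_norm = A_norm @ x_norm      # averages over neighbors: scale is preserved

max_raw = np.abs(x_raw).max()
max_norm = np.abs(x_norm).max()
print(f"max magnitude without normalization: {max_raw:.1f}")
print(f"max magnitude with row normalization: {max_norm:.1f}")
```

Because each row of the normalized matrix sums to 1, the vector of ones is left unchanged, whereas the unnormalized sum grows with every round.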

In [10]:
def load_data_no_normalization(dataset):
    # Extract graph, features, and labels from the dataset
    graph = dataset[0]
    features = graph.ndata['feat']
    labels = graph.ndata['label']
    train_mask = graph.ndata['train_mask']
    val_mask = graph.ndata['val_mask']
    test_mask = graph.ndata['test_mask']

    # Get the adjacency matrix of the graph as a dense tensor and convert this 
    # dense tensor into a numpy array
    adj = graph.adjacency_matrix().to_dense().numpy()

    # Convert to a scipy sparse matrix (CSR format)
    adj = sp.csr_matrix(adj)

    ## REMOVE THE NORMALIZATION
    # normalize adjacency matrix and features as outlined in section 2.2
    #adj = normalize(adj + sp.eye(adj.shape[0]))
    #features = normalize(features)
    adj = adj + sp.eye(adj.shape[0])

    # Get training, validation, and test set indices
    idx_train = torch.LongTensor(np.where(train_mask)[0])
    idx_val = torch.LongTensor(np.where(val_mask)[0])
    idx_test = torch.LongTensor(np.where(test_mask)[0])

    # Convert data to PyTorch tensors
    features = torch.FloatTensor(features)
    labels = torch.LongTensor(labels)
    adj = sparse_mx_to_torch_sparse_tensor(adj)

    return adj, features, labels, idx_train, idx_val, idx_test

In [11]:
def train_and_test_model_no_normalization(dataset):
  np.random.seed(42)
  torch.manual_seed(42)
  torch.cuda.manual_seed(42)
  

  # Load data
  adj, features, labels, idx_train, idx_val, idx_test = load_data_no_normalization(dataset)

  # Model and optimizer
  model = GCN(nfeat=features.shape[1],
              nhid=16,
              nclass=labels.max().item() + 1,
              dropout=0.5)
  optimizer = optim.Adam(model.parameters(),
                        lr=0.01, weight_decay=5e-4)
  
  print(torch.cuda.is_available())
  model.cuda()
  features = features.cuda()
  adj = adj.cuda()
  labels = labels.cuda()
  idx_train = idx_train.cuda()
  idx_val = idx_val.cuda()
  idx_test = idx_test.cuda()

  # Train model
  t_total = time.time()
  print("TRAIN")
  for epoch in range(200):
      train(epoch, model, adj, labels, features, idx_train, idx_val, optimizer)
  print("Optimization Finished!")
  print("Total time elapsed: {:.4f}s".format(time.time() - t_total))

  # Testing
  print("TEST")
  test(model, adj, labels, features, idx_test)

In [12]:
citeseer_dataset =  dgl.data.CiteseerGraphDataset()
train_and_test_model_no_normalization(citeseer_dataset)

  NumNodes: 3327
  NumEdges: 9228
  NumFeats: 3703
  NumClasses: 6
  NumTrainingSamples: 120
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 1.0170s
TEST
Test set results: loss= 1.4504 accuracy= 0.6590


In [13]:
cora_dataset =  dgl.data.CoraGraphDataset()
train_and_test_model_no_normalization(cora_dataset)

  NumNodes: 2708
  NumEdges: 10556
  NumFeats: 1433
  NumClasses: 7
  NumTrainingSamples: 140
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 1.0568s
TEST
Test set results: loss= 0.9736 accuracy= 0.7720


In [14]:
pubmed_dataset =  dgl.data.PubmedGraphDataset()
train_and_test_model_no_normalization(pubmed_dataset)

  NumNodes: 19717
  NumEdges: 88651
  NumFeats: 500
  NumClasses: 3
  NumTrainingSamples: 60
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 1.1446s
TEST
Test set results: loss= 1.6247 accuracy= 0.7720


**Ablation 2 - Sparse Matrix Multiplication**


The core claim of the paper is that the Graph Convolutional Network (GCN) provides an efficient framework for semi-supervised learning on graph data. The authors use sparse-dense matrix multiplication to minimize computational complexity and memory usage. For this ablation study, we replace the sparse-dense matrix multiplication with a dense-dense matrix multiplication and check whether the results support this core argument.


We define a new class named `GraphConvolutionDense`. This class inherits from the `GraphConvolution` class and overrides the `forward` function. In the new `forward` function, we replace the sparse-dense matrix multiplication with a dense-dense matrix multiplication. Specifically, we first calculate the product of the input matrix and the weight matrix. We then make sure the adjacency matrix is a dense tensor (converting it with `to_dense()`) and perform a dense-dense matrix multiplication between this dense adjacency matrix and the product calculated in the previous step.


Finally, we created a new `load_data_dense` function and a `normalize_dense` function. The sparse adjacency matrix of the graph is converted to a dense matrix, which we then convert into a NumPy array. We apply renormalization using our `normalize_dense` function and keep the adjacency matrix dense when converting it to a PyTorch tensor. The rest of this function remains the same as the original `load_data` function.
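
A rough back-of-the-envelope estimate shows why the dense representation is expected to matter most for the largest graph. The sketch below compares the memory of a dense float32 adjacency matrix with a COO sparse layout, using the `NumNodes` and `NumEdges` values printed by the DGL loaders in this notebook. The helper names and the assumptions of int64 indices and one stored entry per reported edge plus one per self-loop are ours.

```python
def dense_bytes(num_nodes, bytes_per_value=4):
    """Memory for a dense float32 N x N adjacency matrix."""
    return num_nodes * num_nodes * bytes_per_value

def coo_bytes(num_nonzeros, bytes_per_value=4, bytes_per_index=8):
    """Memory for a COO sparse matrix: one value and two int64 indices per entry."""
    return num_nonzeros * (bytes_per_value + 2 * bytes_per_index)

# (NumNodes, NumEdges) as printed by the DGL loaders in this notebook.
datasets = {
    "citeseer": (3327, 9228),
    "cora": (2708, 10556),
    "pubmed": (19717, 88651),
}

estimates = {}
for name, (num_nodes, num_edges) in datasets.items():
    # Assumption: one stored entry per reported edge, plus one per self-loop.
    nnz = num_edges + num_nodes
    estimates[name] = (dense_bytes(num_nodes), coo_bytes(nnz))
    dense_mb = estimates[name][0] / 1e6
    sparse_mb = estimates[name][1] / 1e6
    print(f"{name:9s} dense ~ {dense_mb:9.1f} MB   sparse ~ {sparse_mb:6.2f} MB")
```

The dense cost grows quadratically with the number of nodes, while the sparse cost grows with the number of edges, so the gap widens sharply for Pubmed.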

In [15]:
class GraphConvolutionDense(GraphConvolution):
    # Section 3.2 of the paper uses sparse-dense matrix multiplications for Eq. 9;
    # this ablation replaces them with dense-dense matrix multiplications.
    def forward(self, input, adj):
        # matrix multiplication between input and weight
        support = torch.mm(input, self.weight)

        # convert the adj matrix into a dense matrix
        adj_dense = adj.to_dense()
        # use dense-dense matrix multiplication instead of sparse-dense
        output = torch.mm(adj_dense, support)

        if self.bias is not None:
            return output + self.bias
        else:
            return output

In [16]:
import torch.nn as nn
import torch.nn.functional as F

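# Note: this redefines (shadows) the GCN class from In [2]; models created
# after this cell use the dense layers.
class GCN(nn.Module):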
    def __init__(self, nfeat, nhid, nclass, dropout):
        super(GCN, self).__init__()

        self.gc1 = GraphConvolutionDense(nfeat, nhid)
        self.gc2 = GraphConvolutionDense(nhid, nclass)
        self.dropout = dropout

    def forward(self, x, adj):
        x = F.relu(self.gc1(x, adj))
        x = F.dropout(x, self.dropout, training=self.training)
        x = self.gc2(x, adj)
        return F.log_softmax(x, dim=1)

Data loading, training, and testing

In [17]:
import numpy as np
import torch
import scipy.sparse as sp

def load_data_dense(dataset):
    # Extract graph, features, and labels from the dataset
    graph = dataset[0]
    features = graph.ndata['feat']
    labels = graph.ndata['label']
    train_mask = graph.ndata['train_mask']
    val_mask = graph.ndata['val_mask']
    test_mask = graph.ndata['test_mask']

    # Get the adjacency matrix of the graph as a dense tensor and convert this 
    # dense tensor into a numpy array
    adj = graph.adjacency_matrix().to_dense().numpy()

    # Normalize adjacency matrix and features as outlined in section 2.2
    adj = normalize_dense(adj + np.eye(adj.shape[0]))
    features = normalize_dense(features)

    # Get training, validation, and test set indices
    idx_train = torch.LongTensor(np.where(train_mask)[0])
    idx_val = torch.LongTensor(np.where(val_mask)[0])
    idx_test = torch.LongTensor(np.where(test_mask)[0])

    # Convert data to PyTorch tensors
    features = torch.FloatTensor(features)
    labels = torch.LongTensor(labels)
    adj = torch.FloatTensor(adj)

    return adj, features, labels, idx_train, idx_val, idx_test

def normalize_dense(mx):
    """Row-normalize dense matrix."""
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = np.diag(r_inv)
    mx = r_mat_inv.dot(mx)
    return mx

In [18]:
def train_and_test_dense(dataset):
  np.random.seed(42)
  torch.manual_seed(42)
  torch.cuda.manual_seed(42)
  

  # Load data
  adj, features, labels, idx_train, idx_val, idx_test = load_data_dense(dataset)

  # Model and optimizer
  model = GCN(nfeat=features.shape[1],
              nhid=16,
              nclass=labels.max().item() + 1,
              dropout=0.5)
  optimizer = optim.Adam(model.parameters(),
                        lr=0.01, weight_decay=5e-4)
  
  print(torch.cuda.is_available())
  model.cuda()
  features = features.cuda()
  adj = adj.cuda()
  labels = labels.cuda()
  idx_train = idx_train.cuda()
  idx_val = idx_val.cuda()
  idx_test = idx_test.cuda()

  # Train model
  t_total = time.time()
  print("TRAIN")
  for epoch in range(200):
      train(epoch, model, adj, labels, features, idx_train, idx_val, optimizer)
  print("Optimization Finished!")
  print("Total time elapsed: {:.4f}s".format(time.time() - t_total))

  # Testing
  print("TEST")
  test(model, adj, labels, features, idx_test)

In [22]:
citeseer_dataset =  dgl.data.CiteseerGraphDataset()
train_and_test_dense(citeseer_dataset)

  NumNodes: 3327
  NumEdges: 9228
  NumFeats: 3703
  NumClasses: 6
  NumTrainingSamples: 120
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.


  r_inv = np.power(rowsum, -1).flatten()


True
TRAIN
Optimization Finished!
Total time elapsed: 0.6280s
TEST
Test set results: loss= 1.0236 accuracy= 0.7050


In [23]:
cora_dataset =  dgl.data.CoraGraphDataset()
train_and_test_dense(cora_dataset)

  NumNodes: 2708
  NumEdges: 10556
  NumFeats: 1433
  NumClasses: 7
  NumTrainingSamples: 140
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 0.4722s
TEST
Test set results: loss= 0.7089 accuracy= 0.8160


In [24]:
pubmed_dataset =  dgl.data.PubmedGraphDataset()
train_and_test_dense(pubmed_dataset)

  NumNodes: 19717
  NumEdges: 88651
  NumFeats: 500
  NumClasses: 3
  NumTrainingSamples: 60
  NumValidationSamples: 500
  NumTestSamples: 1000
Done loading data from cached files.
True
TRAIN
Optimization Finished!
Total time elapsed: 10.6824s
TEST
Test set results: loss= 0.5586 accuracy= 0.7850
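
The logged results above can be collected into a small summary. The sketch below copies the test accuracies and training times from the outputs in this notebook and computes the accuracy drop without normalization and the dense-to-sparse time ratio. These are single runs with a fixed seed, and the timings are uncontrolled wall-clock measurements; in particular, the first Citeseer run may include one-time GPU setup cost (an assumption on our part), so the ratios should be read as indicative only.

```python
# (test accuracy, training time in seconds) copied from the logs above.
results = {
    "citeseer": {"baseline": (0.7050, 3.4275),
                 "no_norm": (0.6590, 1.0170),
                 "dense": (0.7050, 0.6280)},
    "cora": {"baseline": (0.8160, 1.0107),
             "no_norm": (0.7720, 1.0568),
             "dense": (0.8160, 0.4722)},
    "pubmed": {"baseline": (0.7850, 1.1611),
               "no_norm": (0.7720, 1.1446),
               "dense": (0.7850, 10.6824)},
}

summary = {}
for name, runs in results.items():
    base_acc, base_time = runs["baseline"]
    acc_drop = base_acc - runs["no_norm"][0]   # accuracy lost without normalization
    slowdown = runs["dense"][1] / base_time    # dense time relative to sparse time
    summary[name] = (acc_drop, slowdown)
    print(f"{name:9s} acc drop without normalization: {acc_drop:.3f}   "
          f"dense/sparse time: {slowdown:.2f}x")
```

In these runs, removing normalization lowers test accuracy on all three datasets, the dense variant reproduces the baseline accuracy exactly, and the dense variant is markedly slower only on Pubmed, the largest graph.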
