# Build Your Own Graph Structure Learning Algorithm

In this notebook, we implement a simple graph structure learning (GSL) algorithm using OpenGSL.

OpenGSL provides multiple choices for each component.

These components can be freely chosen and assembled into a **GraphLearner**, as shown below.

<img src="../docs/source/img/package.jpg" alt="Description" width="900">


## Load Data

In this notebook we run experiments only on *cora*. You can try other datasets on your own.

In [1]:
from opengsl.data.dataset import Dataset
dataset = Dataset("cora", n_splits=1)
train_mask = dataset.train_masks[0]
val_mask = dataset.val_masks[0]
test_mask = dataset.test_masks[0]

----Data statistics------'
                #Nodes 2708
                #Edges 5278
                #Classes 7
----Split statistics of 1 splits------'
                #Train samples 140
                #Val samples 500
                #Test samples 1000


## Build Model

It's easy to implement a simple GSL algorithm using our provided components.

Let's **choose the basic components and build the graphlearner**.

We use the **GCNDiagEncoder** as in *["Graph-Revised Convolutional Network"](https://arxiv.org/abs/1911.07123)*, followed by **Cosine** (metric), **KNN** (transform), **Interpolate** (fuse). Then a **GraphLearner** is built with these components.

In [2]:
import torch
from opengsl.module.encoder import GCNEncoder, GCNDiagEncoder
from opengsl.module import GraphLearner
from opengsl.module.transform import KNN
from opengsl.module.metric import Cosine
from opengsl.module.fuse import Interpolate
from opengsl.utils import set_seed

device = torch.device('cuda')
set_seed(42)
encoder = GCNDiagEncoder(2, dataset.dim_feats)
metric = Cosine()
postprocess = [KNN(150)]
fuse = Interpolate(1, 1)
# build the graphlearner
graphlearner = GraphLearner(encoder=encoder, metric=metric, postprocess=postprocess, fuse=fuse).to(device)
# define gnn model
gnn = GCNEncoder(dataset.dim_feats, nhid=64, nclass=dataset.n_classes, n_layers=2, dropout=0.5).to(device)

To learn a new structure, pass the node features and the original adjacency matrix to the graph learner: `new_adj = graphlearner(x, adj)`.
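To make the data flow concrete, here is a minimal, dependency-free sketch of what a metric, KNN, and interpolate pipeline computes on a tiny dense graph. It is not OpenGSL's implementation. It assumes that the KNN step keeps the `k` largest similarities in each row (self-similarity included) and that interpolation is a weighted sum of the original and learned adjacency matrices. It also skips the encoder and compares raw features. The names `cosine_similarity`, `learn_structure`, `w_orig`, and `w_new` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def learn_structure(x, adj, k=2, w_orig=1.0, w_new=1.0):
    """Hypothetical stand-in for a graph learner: metric -> KNN -> interpolate."""
    n = len(x)
    # Metric: pairwise cosine similarity between node features.
    sim = [[cosine_similarity(x[i], x[j]) for j in range(n)] for i in range(n)]
    # Transform (assumed KNN behaviour): keep the k largest entries per row.
    knn = [[0.0] * n for _ in range(n)]
    for i in range(n):
        top = sorted(range(n), key=lambda j: sim[i][j], reverse=True)[:k]
        for j in top:
            knn[i][j] = sim[i][j]
    # Fuse (assumed interpolation): weighted sum of original and learned graphs.
    return [[w_orig * adj[i][j] + w_new * knn[i][j] for j in range(n)]
            for i in range(n)]


# Nodes 0 and 1 share identical features but are not connected originally.
x = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
adj = [[0.0, 0.0, 1.0],
       [0.0, 0.0, 0.0],
       [1.0, 0.0, 0.0]]
new_adj = learn_structure(x, adj, k=2)
for row in new_adj:
    print(row)
```

In this toy example the learned graph gains an edge between nodes 0 and 1, whose features are identical, while the original edge between nodes 0 and 2 is kept by the fusion step.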

## Training and Evaluation

With the defined graphlearner and gnn, you can use them in a common training and evaluation process.
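The training loop in this section passes every adjacency matrix through `normalize` before it reaches a model. The sketch below shows symmetric normalization, $D^{-1/2} A D^{-1/2}$ with optional self-loops, which is the usual choice for GCN. That `normalize` computes exactly this by default is an assumption, and `sym_normalize` is a hypothetical pure-Python helper, not part of OpenGSL.

```python
import math


def sym_normalize(adj, add_loop=True):
    """Symmetric normalization D^{-1/2} (A [+ I]) D^{-1/2} of a dense adjacency matrix."""
    n = len(adj)
    # Optionally add self-loops (the identity matrix) before normalizing.
    a = [[adj[i][j] + (1.0 if add_loop and i == j else 0.0) for j in range(n)]
         for i in range(n)]
    # Degree of each node is its row sum; isolated nodes get a zero scale factor.
    deg = [sum(row) for row in a]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[inv_sqrt[i] * a[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


adj = [[0.0, 1.0],
       [1.0, 0.0]]
norm_adj = sym_normalize(adj)                          # with self-loops
norm_adj_no_loop = sym_normalize(adj, add_loop=False)  # without self-loops
print(norm_adj)
print(norm_adj_no_loop)
```

With `add_loop=False`, the identity is not added, so only the existing entries of the matrix are rescaled.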

In [18]:
import time
from copy import deepcopy
from opengsl.module.functional import normalize
from opengsl.utils import accuracy
import torch.nn.functional as F

n_epochs = 100
lr = 1e-2
wd = 5e-4
best_valid = 0
gsl_weights = None
gnn_weights = None
start_time = time.time()
optim = torch.optim.Adam([{'params': gnn.parameters()}, {'params': graphlearner.parameters()}], lr=lr, weight_decay=wd)

for epoch in range(n_epochs):
    improve = ''
    t0 = time.time()
    gnn.train()
    graphlearner.train()
    optim.zero_grad()

    # forward and backward
    adj = graphlearner(dataset.feats, normalize(dataset.adj))
    output = gnn(dataset.feats, normalize(adj, add_loop=False))

    loss_train = F.cross_entropy(output[train_mask], dataset.labels[train_mask])
    acc_train = accuracy(dataset.labels[train_mask].cpu().numpy(), output[train_mask].detach().cpu().numpy())
    loss_train.backward()
    optim.step()

    # Evaluate
    gnn.eval()
    graphlearner.eval()
    with torch.no_grad():
        adj = graphlearner(dataset.feats, normalize(dataset.adj))
        output = gnn(dataset.feats, normalize(adj, add_loop=False))
        loss_val = F.cross_entropy(output[val_mask], dataset.labels[val_mask])
        acc_val = accuracy(dataset.labels[val_mask].cpu().numpy(), output[val_mask].detach().cpu().numpy())

    # save
    if acc_val > best_valid:
        improve = '*'
        gsl_weights = deepcopy(graphlearner.state_dict())
        gnn_weights = deepcopy(gnn.state_dict())
        total_time = time.time() - start_time
        best_val_loss = loss_val
        best_valid = acc_val
        best_adj = adj.detach().clone()

    # debug
    print("Epoch {:05d} | Time(s) {:.4f} | Loss(train) {:.4f} | Acc(train) {:.4f} | Loss(val) {:.4f} | Acc(val) {:.4f} | {}".format(
        epoch+1, time.time() - t0, loss_train.item(), acc_train, loss_val, acc_val, improve))

print('Optimization Finished!')
print('Time(s): {:.4f}'.format(total_time))
# test
graphlearner.load_state_dict(gsl_weights)
gnn.load_state_dict(gnn_weights)
with torch.no_grad():
    adj = graphlearner(dataset.feats, normalize(dataset.adj))
    output = gnn(dataset.feats, normalize(adj, add_loop=False))
    loss_test = F.cross_entropy(output[test_mask], dataset.labels[test_mask])
    acc_test = accuracy(dataset.labels[test_mask].cpu().numpy(), output[test_mask].detach().cpu().numpy())

print("Loss(test) {:.4f} | Acc(test) {:.4f}".format(loss_test.item(), acc_test))



Epoch 00001 | Time(s) 0.5434 | Loss(train) 1.9515 | Acc(train) 0.1429 | Loss(val) 1.8794 | Acc(val) 0.3740 | *
Epoch 00002 | Time(s) 0.1974 | Loss(train) 1.9129 | Acc(train) 0.2357 | Loss(val) 1.8233 | Acc(val) 0.6220 | *
Epoch 00003 | Time(s) 0.2000 | Loss(train) 1.8381 | Acc(train) 0.5357 | Loss(val) 1.7645 | Acc(val) 0.7100 | *
Epoch 00004 | Time(s) 0.1989 | Loss(train) 1.7641 | Acc(train) 0.7143 | Loss(val) 1.6986 | Acc(val) 0.5900 | 
Epoch 00005 | Time(s) 0.1990 | Loss(train) 1.6764 | Acc(train) 0.6500 | Loss(val) 1.6161 | Acc(val) 0.5900 | 
Epoch 00006 | Time(s) 0.1972 | Loss(train) 1.5728 | Acc(train) 0.6571 | Loss(val) 1.5193 | Acc(val) 0.6140 | 
Epoch 00007 | Time(s) 0.2003 | Loss(train) 1.4543 | Acc(train) 0.6571 | Loss(val) 1.4120 | Acc(val) 0.6520 | 
Epoch 00008 | Time(s) 0.2011 | Loss(train) 1.3379 | Acc(train) 0.7357 | Loss(val) 1.2974 | Acc(val) 0.6880 | 
Epoch 00009 | Time(s) 0.1961 | Loss(train) 1.2028 | Acc(train) 0.7643 | Loss(val) 1.1808 | Acc(val) 0.7060 | 
Epoch 0

It seems that our GSL algorithm helps improve the accuracy of GCN to 82.8, which is better than the usual 81+.
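The `*` marker in the log above flags epochs where the validation accuracy strictly improved, which is when the loop snapshots the weights. The sketch below replays that selection rule on the nine validation accuracies visible in the log. The `state` dictionary is a hypothetical stand-in for a model's `state_dict()`.

```python
from copy import deepcopy

# Validation accuracies of epochs 1-9, copied from the log above.
val_accs = [0.3740, 0.6220, 0.7100, 0.5900, 0.5900, 0.6140, 0.6520, 0.6880, 0.7060]

best_valid = 0.0
best_state = None
improved_epochs = []
state = {"step": 0}  # hypothetical stand-in for model.state_dict()

for epoch, acc_val in enumerate(val_accs, start=1):
    state["step"] = epoch  # pretend a training step changed the weights in place
    if acc_val > best_valid:  # strict improvement, as in the loop above
        best_valid = acc_val
        # Copy the snapshot so that later in-place updates cannot overwrite it.
        best_state = deepcopy(state)
        improved_epochs.append(epoch)

print(improved_epochs, best_valid, best_state)
```

The copy matters because the live state keeps changing during training. Without `deepcopy`, the saved snapshot would silently follow the latest weights instead of the best ones.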

## Training and Evaluation Using Our Pipeline

We recommend using our provided pipeline with **GSLSolver** and **ExpManager** to simplify the above process. Only *set_method* needs to be customized in this pipeline.

In [5]:
from opengsl.module.solver import GSLSolver
from opengsl import ExpManager
import argparse

class MyGSL(GSLSolver):
    def set_method(self):
        encoder = GCNDiagEncoder(2, dataset.dim_feats)
        metric = Cosine()
        postprocess = [KNN(150)]
        fuse = Interpolate(1, 1)
        # build the graphlearner
        self.graphlearner = GraphLearner(encoder=encoder, metric=metric, postprocess=postprocess, fuse=fuse).to(device)
        # define gnn model
        self.model = GCNEncoder(dataset.dim_feats, nhid=64, nclass=dataset.n_classes, n_layers=2, dropout=0.5).to(device)
        self.optim = torch.optim.Adam([{'params': self.model.parameters()}, {'params': self.graphlearner.parameters()}], lr=self.conf.training['lr'], weight_decay=self.conf.training['weight_decay'])

conf = {'model': {'n_hidden': 64, 'n_layer': 2},
    'training': {'lr': 1e-2,
    'weight_decay': 5e-4,
    'n_epochs': 100,
    'patience': None,
    'criterion': 'metric'},
    'dataset': {'feat_norm': False, 'sparse': True},
    'analysis': {'flag': False, 'save_graph': False}}
mygsl = MyGSL(argparse.Namespace(**conf), dataset)
exp = ExpManager(solver=mygsl)
exp.run(n_runs=3, debug=True)

Exp 0/3
Epoch 00001 | Time(s) 0.4687 | Loss(train) 1.9483 | Acc(train) 0.1429 | Loss(val) 1.9186 | Acc(val) 0.3100 | *
Epoch 00002 | Time(s) 0.2022 | Loss(train) 1.9172 | Acc(train) 0.3714 | Loss(val) 1.8581 | Acc(val) 0.4660 | *
Epoch 00003 | Time(s) 0.2005 | Loss(train) 1.8505 | Acc(train) 0.5071 | Loss(val) 1.7884 | Acc(val) 0.5720 | *
Epoch 00004 | Time(s) 0.2004 | Loss(train) 1.7720 | Acc(train) 0.6571 | Loss(val) 1.7089 | Acc(val) 0.5860 | *
Epoch 00005 | Time(s) 0.2036 | Loss(train) 1.6840 | Acc(train) 0.6643 | Loss(val) 1.6176 | Acc(val) 0.6060 | *
Epoch 00006 | Time(s) 0.2013 | Loss(train) 1.5817 | Acc(train) 0.6571 | Loss(val) 1.5162 | Acc(val) 0.6440 | *
Epoch 00007 | Time(s) 0.2004 | Loss(train) 1.4627 | Acc(train) 0.6857 | Loss(val) 1.4058 | Acc(val) 0.6720 | *
Epoch 00008 | Time(s) 0.2011 | Loss(train) 1.3437 | Acc(train) 0.7000 | Loss(val) 1.2912 | Acc(val) 0.7060 | *
Epoch 00009 | Time(s) 0.2026 | Loss(train) 1.1950 | Acc(train) 0.7429 | Loss(val) 1.1743 | Acc(val) 0.73

(82.96666666666665, 0.3055050463303937)

Now you have learned how to build a simple GSL algorithm using the components provided in OpenGSL. Feel free to try other components and datasets.