# Q1

Graph C is the one most likely to have been generated by G(n, p) with n = 5 and p = 0.8.

In a random graph G(n, p), n is the number of nodes. Since n = 5, graph B can be eliminated.

p is the probability that each possible edge exists. The expected number of edges is p · C(n, 2) = pn(n-1)/2, which here is 0.8 × 10 = 8.
The remaining graphs have the following numbers of edges:

A: 3

C: 10

D: 1

C, which is a complete graph, has the edge count closest to the expected 8, so C is the answer.
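
As a quick check of the arithmetic above, the sketch below computes the expected edge count for G(5, 0.8) and the probability of drawing one specific labelled graph with m edges, p^m (1-p)^(C(n,2)-m). The edge counts for A, C and D are the ones listed above; the function names are only illustrative.

```python
from math import comb

def expected_edges(n, p):
    """Expected number of edges in G(n, p): p * C(n, 2)."""
    return p * comb(n, 2)

def graph_probability(n, p, m):
    """Probability that G(n, p) yields one specific labelled graph with m edges."""
    return p ** m * (1 - p) ** (comb(n, 2) - m)

n, p = 5, 0.8
edge_counts = {"A": 3, "C": 10, "D": 1}  # edge counts listed above

print("expected edges:", expected_edges(n, p))
likelihood = {name: graph_probability(n, p, m) for name, m in edge_counts.items()}
for name, prob in likelihood.items():
    print(name, prob)

most_likely = max(likelihood, key=likelihood.get)
print("most likely:", most_likely)
```

Both views point to C: its edge count is the closest to the expected value, and as a single labelled graph it also has the highest probability among the three candidates.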

# Q2

## Loading Dataset

In [1]:
! pip install dgl



In [2]:
# Import libraries needed
import pickle as pkl
import networkx as nx
import dgl
import numpy as np
import torch
import torch.optim as optim
import torch.nn as nn
import torch.nn.functional as F
from dgl.nn import GINConv
import pandas as pd

Using backend: pytorch


In [3]:
# Mount the google drive to load the dataset
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
# Open the file in a context manager so that it is closed after loading
with open('/content/drive/My Drive/HKUST/MSBD5008/datasets/hw_dataset.pkl', 'rb') as f:
    dataset = pkl.load(f)

## Data Preparation

In [5]:
# Initialize variables for graph
nodes = dataset['nodes']
source_nodes = dataset['source_nodes']
target_nodes = dataset['target_nodes']

In [6]:
# Construct a directed NetworkX graph
nx_G = nx.DiGraph()
nx_G.add_nodes_from(nodes)

# Add every (source, target) pair as an edge with weight 1.0
nx_G.add_weighted_edges_from(
    (s, t, 1.0) for s, t in zip(source_nodes, target_nodes)
)

In [7]:
# Set up DGL graph from NetworkX
dgl_G = dgl.from_networkx(nx_G)
print('we have %d nodes.' % dgl_G.number_of_nodes())
print('we have %d edges.' % dgl_G.number_of_edges())

we have 3327 nodes.
we have 12431 edges.


In [8]:
# Add self loop to the graph
dgl_G = dgl.add_self_loop(dgl_G)
print('we have %d nodes.' % dgl_G.number_of_nodes())
print('we have %d edges.' % dgl_G.number_of_edges())

we have 3327 nodes.
we have 15758 edges.
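
The edge count grows by exactly the number of nodes, because one self-loop is added per node (12431 + 3327 = 15758). The sketch below shows the same bookkeeping on a toy edge list; `add_self_loops` is a hypothetical helper written for illustration, not DGL's implementation.

```python
def add_self_loops(num_nodes, edges):
    """Return the edge list with one (i, i) edge appended for every node."""
    return list(edges) + [(i, i) for i in range(num_nodes)]

toy_edges = [(0, 1), (1, 2), (2, 0)]
with_loops = add_self_loops(3, toy_edges)
print(len(toy_edges), "->", len(with_loops))

# Same arithmetic for the dataset above: edges + nodes
print(12431 + 3327)
```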


In [9]:
# Initialize other variables in dataset
labels = dataset['labels']
num_classes = dataset['num_classes']
features = dataset['features']
train_mask = dataset['train_mask']
val_mask = dataset['val_mask']

In [10]:
features.shape

(3327, 3703)

## Train and Test

### Function: Train the model

In [11]:
import warnings
warnings.filterwarnings("ignore")

def train(model, features, labels):
    # Convert the inputs once, outside the epoch loop
    features = torch.as_tensor(features, dtype=torch.float).to(device)
    labels = torch.as_tensor(labels, dtype=torch.long).to(device)

    model.train()
    for epoch in range(101):
        logits = model(features)
        preds = F.log_softmax(logits, 1)
        # The loss is computed on the training nodes only
        loss = F.nll_loss(preds[train_mask], labels[train_mask])

        # Uses the global `optimizer` that is defined before each call
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if epoch % 10 == 0:
            print("Epoch {}: Loss {}".format(epoch, loss.item()))
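
To make the loss concrete: `log_softmax` followed by `nll_loss` on the masked rows is the cross-entropy averaged over the training nodes only. Below is a minimal pure-Python sketch of that computation; the logits, labels and mask are made-up toy values, and the helper names are my own.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def masked_nll(logits, labels, mask):
    """Mean negative log-likelihood over the rows where mask is True."""
    losses = [
        -log_softmax(row)[y]
        for row, y, keep in zip(logits, labels, mask)
        if keep
    ]
    return sum(losses) / len(losses)

logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
labels = [0, 2, 1]
train_mask = [True, True, False]  # the third node is held out of the loss

toy_loss = masked_nll(logits, labels, train_mask)
print(toy_loss)
```

A uniform prediction over C classes gives a loss of ln C, which is a useful reference point when reading the training logs.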

### Function: Test the model

In [12]:
def test(model, features, labels, mask):
    # Set evaluation mode
    model.eval()
    with torch.no_grad():
        features = torch.as_tensor(features, dtype=torch.float).to(device)
        labels = torch.as_tensor(labels, dtype=torch.long).to(device)

        logits = model(features)
        # Predicted class = arg-max logit, evaluated on the masked nodes only
        predict_y = logits[mask].max(1)[1]
        accuracy = torch.eq(predict_y, labels[mask]).float().mean()

    # Move to the CPU so that accuracy.numpy() also works when running on a GPU
    return accuracy.cpu()
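
The accuracy above is the fraction of masked nodes whose arg-max logit matches the label. A dependency-free sketch of the same computation on toy values (the data and the helper name are made up for illustration):

```python
def masked_accuracy(logits, labels, mask):
    """Fraction of masked nodes whose arg-max logit equals the label."""
    hits = [
        max(range(len(row)), key=row.__getitem__) == y
        for row, y, keep in zip(logits, labels, mask)
        if keep
    ]
    return sum(hits) / len(hits)

logits = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
labels = [1, 1, 1, 0]
val_mask = [True, True, False, True]  # the third node is not evaluated

toy_acc = masked_accuracy(logits, labels, val_mask)
print(toy_acc)
```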

## As required in the assignment, I will find the best model by varying:
1.   Number of GINConv layers
2.   Dimension of hidden features (same for different layers)
3.   Activation function
4.   Aggregation type (sum, max or mean)

**There are 2 more variables I would like to test: optimizer and learning rate.**

I will run the experiments one at a time and hold the other settings fixed as controlled variables.

The setting with the best result in each experiment is carried into the next
experiment, so that the best model can be found step by step.

The network class names follow this format:

*network{no. of GINConv layers}_{hidden_dim}_activation_agg*

Elements that belong to experiments not yet run are omitted from the name.
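
The one-at-a-time procedure described above can be sketched as a small greedy search. This is only an illustration: `greedy_search` and `toy_evaluate` are hypothetical names, and `toy_evaluate` is a stand-in for "train a model and return its validation accuracy", not a real training run.

```python
def greedy_search(search_space, defaults, evaluate):
    """Tune one hyperparameter at a time, keeping the best value found so far.

    search_space: dict mapping name -> list of candidates (tuned in dict order)
    defaults:     dict with a starting value for every name
    evaluate:     callable(config dict) -> validation score (higher is better)
    """
    best = dict(defaults)
    for name, candidates in search_space.items():
        scores = {value: evaluate({**best, name: value}) for value in candidates}
        best[name] = max(scores, key=scores.get)
    return best

# Hypothetical scoring function used only to make the sketch runnable.
def toy_evaluate(config):
    score = -abs(config["layers"] - 4) - abs(config["hidden_dim"] - 60) / 100
    return score + (0.5 if config["agg"] == "sum" else 0.0)

space = {
    "layers": [2, 3, 4, 5],
    "hidden_dim": [10, 30, 60, 100],
    "agg": ["sum", "max", "mean"],
}
start = {"layers": 2, "hidden_dim": 30, "agg": "sum"}

best_config = greedy_search(space, start, toy_evaluate)
print(best_config)
```

A search of this kind is cheap, but it ignores interactions between settings, so the outcome can depend on the order in which the factors are tuned.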

## Experiment 1: number of GINConv layers (default: 30 hidden features, ReLU, sum)

In [56]:
layer = [2, 3, 4, 5]
layer_acc = []

In [57]:
# Use the GPU if one is available
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

dgl_G = dgl_G.to(device)

### 2 conv layers 

In [58]:
class myGraphNetwork2(nn.Module):
    def __init__(self, g, in_dim, hidden_dim, out_dim, agg):
        super(myGraphNetwork2, self).__init__()
        self.g = g 
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.relu(self.layer1(self.g, h))
        h = self.layer2(self.g, h)
        return h
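
For intuition, here is a dependency-free sketch of the update that a `GINConv` layer with `'sum'` aggregation performs, as I understand it: each node adds up the features of its in-neighbours, adds (1 + ε) times its own features, and passes the result through the supplied function. I assume DGL's default ε = 0 here; `gin_layer` and the toy graph are illustrative only, not DGL code.

```python
def gin_layer(in_neighbors, h, apply_func, eps=0.0):
    """One GIN update with sum aggregation:
    h_i' = apply_func((1 + eps) * h_i + sum of h_j over in-neighbours j of i).
    """
    out = []
    for i, h_i in enumerate(h):
        agg = [0.0] * len(h_i)
        for j in in_neighbors[i]:
            agg = [a + x for a, x in zip(agg, h[j])]
        combined = [(1 + eps) * x + a for x, a in zip(h_i, agg)]
        out.append(apply_func(combined))
    return out

# Toy directed graph: in_neighbors[i] lists the nodes with an edge into i.
in_neighbors = {0: [1, 2], 1: [0], 2: []}
h = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

def identity(v):
    """Stand-in for the nn.Linear module passed to GINConv."""
    return v

h_next = gin_layer(in_neighbors, h, identity)
print(h_next)
```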

In [59]:
model = myGraphNetwork2(dgl_G, 3703, 30, num_classes, 'sum')

In [60]:
model = model.to(device)

In [61]:
learning_rate = 0.01
optimizer = optim.SGD(model.parameters(), lr=learning_rate)

In [62]:
train(model, features, labels)

Epoch 0: Loss 1.7886096239089966
Epoch 10: Loss 1.777369499206543
Epoch 20: Loss 1.7690876722335815
Epoch 30: Loss 1.7618142366409302
Epoch 40: Loss 1.7551450729370117
Epoch 50: Loss 1.7488175630569458
Epoch 60: Loss 1.742705225944519
Epoch 70: Loss 1.7367238998413086
Epoch 80: Loss 1.7308404445648193
Epoch 90: Loss 1.7250334024429321
Epoch 100: Loss 1.7192938327789307


In [63]:
accuracy = test(model, features, labels, val_mask)
layer_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.2407


### 3 conv layers 

In [64]:
class myGraphNetwork3(nn.Module):
    def __init__(self, g, in_dim, hidden_dim, out_dim, agg):
        super(myGraphNetwork3, self).__init__()
        self.g = g 
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.relu(self.layer1(self.g, h))
        h = F.relu(self.layer2(self.g, h))
        h = self.layer3(self.g, h)
        return h

In [65]:
# Use the GPU if one is available
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

dgl_G = dgl_G.to(device)

In [66]:
model2 = myGraphNetwork3(dgl_G, 3703, 30, num_classes, 'sum')

In [67]:
model2 = model2.to(device)

In [68]:
learning_rate = 0.01
optimizer = optim.SGD(model2.parameters(), lr=learning_rate)

In [69]:
train(model2, features, labels)

Epoch 0: Loss 1.8441182374954224
Epoch 10: Loss 1.7286224365234375
Epoch 20: Loss 1.6908974647521973
Epoch 30: Loss 1.6603295803070068
Epoch 40: Loss 1.6327534914016724
Epoch 50: Loss 1.6059900522232056
Epoch 60: Loss 1.5793778896331787
Epoch 70: Loss 1.5524905920028687
Epoch 80: Loss 1.5254836082458496
Epoch 90: Loss 1.4975732564926147
Epoch 100: Loss 1.46857488155365


In [70]:
accuracy = test(model2, features, labels, val_mask)
layer_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.5535


### 4 conv layers 

In [71]:
class myGraphNetwork4(nn.Module):
    def __init__(self, g, in_dim, hidden_dim, out_dim, agg):
        super(myGraphNetwork4, self).__init__()
        self.g = g 
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.relu(self.layer1(self.g, h))
        h = F.relu(self.layer2(self.g, h))
        h = F.relu(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

In [72]:
# Use the GPU if one is available
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

dgl_G = dgl_G.to(device)

In [73]:
model3 = myGraphNetwork4(dgl_G, 3703, 30, num_classes, 'sum')

In [74]:
model3 = model3.to(device)

In [75]:
learning_rate = 0.01
optimizer = optim.SGD(model3.parameters(), lr=learning_rate)

In [76]:
train(model3, features, labels)

Epoch 0: Loss 2.2097952365875244
Epoch 10: Loss 1.675114631652832
Epoch 20: Loss 1.6110427379608154
Epoch 30: Loss 1.6107839345932007
Epoch 40: Loss 1.5293054580688477
Epoch 50: Loss 1.4695382118225098
Epoch 60: Loss 1.4225623607635498
Epoch 70: Loss 1.3745285272598267
Epoch 80: Loss 1.32945716381073
Epoch 90: Loss 1.2935636043548584
Epoch 100: Loss 1.260201096534729


In [77]:
accuracy = test(model3, features, labels, val_mask)
layer_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.572


### 5 conv layers 

In [78]:
class myGraphNetwork5(nn.Module):
    def __init__(self, g, in_dim, hidden_dim, out_dim, agg):
        super(myGraphNetwork5, self).__init__()
        self.g = g 
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer5 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.relu(self.layer1(self.g, h))
        h = F.relu(self.layer2(self.g, h))
        h = F.relu(self.layer3(self.g, h))
        h = F.relu(self.layer4(self.g, h))
        h = self.layer5(self.g, h)
        return h

In [79]:
# Use the GPU if one is available
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

dgl_G = dgl_G.to(device)

In [80]:
model4 = myGraphNetwork5(dgl_G, 3703, 30, num_classes, 'sum')

In [81]:
model4 = model4.to(device)

In [82]:
learning_rate = 0.01
optimizer = optim.SGD(model4.parameters(), lr=learning_rate)

In [83]:
train(model4, features, labels)

Epoch 0: Loss 6.389787673950195
Epoch 10: Loss 1.7525861263275146
Epoch 20: Loss 1.6905186176300049
Epoch 30: Loss 1.6812682151794434
Epoch 40: Loss 1.6759556531906128
Epoch 50: Loss 1.6723583936691284
Epoch 60: Loss 1.6695871353149414
Epoch 70: Loss 1.667223572731018
Epoch 80: Loss 1.6654809713363647
Epoch 90: Loss 1.662615418434143
Epoch 100: Loss 1.6569314002990723


In [84]:
accuracy = test(model4, features, labels, val_mask)
layer_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.3086


### Summary of Experiment 1

In [85]:
layer_acc

[array(0.24074075, dtype=float32),
 array(0.55349797, dtype=float32),
 array(0.5720165, dtype=float32),
 array(0.30864197, dtype=float32)]

In [86]:
df_e1 = pd.DataFrame()
df_e1['Number of GINConv Layers'] = layer
df_e1['Validation accuracy'] = layer_acc
df_e1

| | Number of GINConv Layers | Validation accuracy |
|---|---|---|
| 0 | 2 | 0.24074075 |
| 1 | 3 | 0.55349797 |
| 2 | 4 | 0.5720165 |
| 3 | 5 | 0.30864197 |


## Experiment 2: dimension of hidden features (default: ReLU, sum)

### Steps of 10

Because training with more than 100 hidden features takes too long, I cap the search at 100.

In [89]:
hidden_dim_acc_10 = []
hidden_dim_10 = np.arange(10, 110, 10)
hidden_dim_10

array([ 10,  20,  30,  40,  50,  60,  70,  80,  90, 100])

In [90]:
for n_dim in hidden_dim_10:
    model5 = myGraphNetwork4(dgl_G, 3703, n_dim, num_classes, 'sum')
    model5 = model5.to(device)
    learning_rate = 0.01
    optimizer = optim.SGD(model5.parameters(), lr=learning_rate)
    train(model5, features, labels)
    accuracy = test(model5, features, labels, val_mask)
    hidden_dim_acc_10.append(accuracy.numpy())
    print("Testing Acc {:.4}".format(accuracy))

Epoch 0: Loss 2.086578607559204
Epoch 10: Loss 1.7954685688018799
Epoch 20: Loss 1.7550817728042603
Epoch 30: Loss 1.7408947944641113
Epoch 40: Loss 1.727906346321106
Epoch 50: Loss 1.7210755348205566
Epoch 60: Loss 1.7112070322036743
Epoch 70: Loss 1.7057372331619263
Epoch 80: Loss 1.6945446729660034
Epoch 90: Loss 1.6848255395889282
Epoch 100: Loss 1.6728558540344238
Testing Acc 0.3704
Epoch 0: Loss 2.4284722805023193
Epoch 10: Loss 1.685064435005188
Epoch 20: Loss 1.6022300720214844
Epoch 30: Loss 1.5541025400161743
Epoch 40: Loss 1.5480533838272095
Epoch 50: Loss 1.4816999435424805
Epoch 60: Loss 1.4534236192703247
Epoch 70: Loss 1.4323256015777588
Epoch 80: Loss 1.385993480682373
Epoch 90: Loss 1.3548885583877563
Epoch 100: Loss 1.322959065437317
Testing Acc 0.4753
Epoch 0: Loss 3.6505801677703857
Epoch 10: Loss 1.635565161705017
Epoch 20: Loss 1.5738003253936768
Epoch 30: Loss 1.5240769386291504
Epoch 40: Loss 1.4744329452514648
Epoch 50: Loss 1.4301813840866089
Epoch 60: Loss 1.

In [91]:
df_e2_10 = pd.DataFrame()
df_e2_10['Dimensions of hidden features'] = hidden_dim_10
df_e2_10['Validation accuracy'] = hidden_dim_acc_10
df_e2_10

| | Dimensions of hidden features | Validation accuracy |
|---|---|---|
| 0 | 10 | 0.37037036 |
| 1 | 20 | 0.47530866 |
| 2 | 30 | 0.5967078 |
| 3 | 40 | 0.58436215 |
| 4 | 50 | 0.54320985 |
| 5 | 60 | 0.6995885 |
| 6 | 70 | 0.7201646 |
| 7 | 80 | 0.6851852 |
| 8 | 90 | 0.5576132 |
| 9 | 100 | 0.63580245 |


Between 10 and 100, a hidden dimension of 70 gives the highest validation accuracy. We can now discard the other values and search more finely around 70. To find the best dimension of hidden features, I repeat the experiment for 66-75.

(After a few trial runs, I found that the best dimension varies randomly between 50 and 70 from run to run. I therefore take the result of the last run as the reference.)
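
Since single runs vary this much, one way to make the comparison between dimensions more reliable would be to repeat each setting with several seeds and compare the means. This is not what the notebook does. The sketch below only illustrates the idea: `repeat_runs` is a hypothetical helper and `noisy_run` is a made-up stand-in for "train with this seed and return the validation accuracy".

```python
import random
import statistics

def repeat_runs(run_once, seeds):
    """Call run_once(seed) for every seed and summarise the scores."""
    scores = [run_once(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)

# Made-up stand-in for a real training run with a fixed seed.
def noisy_run(seed):
    rng = random.Random(seed)
    return 0.65 + rng.uniform(-0.05, 0.05)

mean_acc, std_acc = repeat_runs(noisy_run, seeds=range(5))
print("mean {:.3f}, std {:.3f}".format(mean_acc, std_acc))
```

In the notebook, `run_once` would call `torch.manual_seed(seed)` before building and training the model.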

### Summary of Experiment 2: The best dimension

In [92]:
hidden_dim_acc_1 = []
hidden_dim_1 = np.arange(66, 76)
hidden_dim_1

array([66, 67, 68, 69, 70, 71, 72, 73, 74, 75])

In [93]:
for n_dim in hidden_dim_1:
    model6 = myGraphNetwork4(dgl_G, 3703, n_dim, num_classes, 'sum')
    model6 = model6.to(device)
    learning_rate = 0.01
    optimizer = optim.SGD(model6.parameters(), lr=learning_rate)
    train(model6, features, labels)
    accuracy = test(model6, features, labels, val_mask)
    hidden_dim_acc_1.append(accuracy.numpy())
    print("Testing Acc {:.4}".format(accuracy))

Epoch 0: Loss 2.1366052627563477
Epoch 10: Loss 1.5892966985702515
Epoch 20: Loss 1.53057062625885
Epoch 30: Loss 1.4744327068328857
Epoch 40: Loss 1.4234000444412231
Epoch 50: Loss 1.3832409381866455
Epoch 60: Loss 1.3374643325805664
Epoch 70: Loss 1.277886152267456
Epoch 80: Loss 1.2355821132659912
Epoch 90: Loss 1.1862876415252686
Epoch 100: Loss 1.1394858360290527
Testing Acc 0.7099
Epoch 0: Loss 1.8524013757705688
Epoch 10: Loss 1.5952509641647339
Epoch 20: Loss 1.529133915901184
Epoch 30: Loss 1.4693626165390015
Epoch 40: Loss 1.4118313789367676
Epoch 50: Loss 1.456276774406433
Epoch 60: Loss 1.3361773490905762
Epoch 70: Loss 1.3200523853302002
Epoch 80: Loss 1.3123337030410767
Epoch 90: Loss 1.2619355916976929
Epoch 100: Loss 1.2227413654327393
Testing Acc 0.6728
Epoch 0: Loss 1.8442506790161133
Epoch 10: Loss 1.6256158351898193
Epoch 20: Loss 1.5730265378952026
Epoch 30: Loss 1.5210325717926025
Epoch 40: Loss 1.4637242555618286
Epoch 50: Loss 1.4566714763641357
Epoch 60: Loss 1

In [94]:
df_e2_1 = pd.DataFrame()
df_e2_1['Dimensions of hidden features'] = hidden_dim_1
df_e2_1['Validation accuracy'] = hidden_dim_acc_1
df_e2_1

| | Dimensions of hidden features | Validation accuracy |
|---|---|---|
| 0 | 66 | 0.70987654 |
| 1 | 67 | 0.6728395 |
| 2 | 68 | 0.6687243 |
| 3 | 69 | 0.70987654 |
| 4 | 70 | 0.5205761 |
| 5 | 71 | 0.63580245 |
| 6 | 72 | 0.6872428 |
| 7 | 73 | 0.6152263 |
| 8 | 74 | 0.6152263 |
| 9 | 75 | 0.2983539 |


In this experiment, 66 and 69 tie for the highest validation accuracy (0.7099). I take 66 as the best dimension.

## Experiment 3: activation function (default: sum)

In [98]:
act_func = ['ReLU', 'Sigmoid', 'Tanh', 'Mish', 'Leaky ReLU']
act_func_acc = [df_e2_1['Validation accuracy'].max()]

### Sigmoid

In [99]:
class myGraphNetwork4_66_sigmoid(nn.Module):
    def __init__(self, g, in_dim, out_dim, agg):
        super(myGraphNetwork4_66_sigmoid, self).__init__()
        self.g = g 
        hidden_dim = 66
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        # torch.sigmoid replaces the deprecated F.sigmoid
        h = torch.sigmoid(self.layer1(self.g, h))
        h = torch.sigmoid(self.layer2(self.g, h))
        h = torch.sigmoid(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

In [100]:
model7 = myGraphNetwork4_66_sigmoid(dgl_G, 3703, num_classes, 'sum')
model7 = model7.to(device)

In [101]:
learning_rate = 0.01
optimizer = optim.SGD(model7.parameters(), lr=learning_rate)

In [102]:
train(model7, features, labels)

Epoch 0: Loss 4.13609504699707
Epoch 10: Loss 2.0423004627227783
Epoch 20: Loss 1.9104504585266113
Epoch 30: Loss 1.9287943840026855
Epoch 40: Loss 1.8578470945358276
Epoch 50: Loss 1.8487169742584229
Epoch 60: Loss 1.824794054031372
Epoch 70: Loss 1.7986470460891724
Epoch 80: Loss 1.7705390453338623
Epoch 90: Loss 1.7557330131530762
Epoch 100: Loss 1.7432841062545776


In [103]:
accuracy = test(model7, features, labels, val_mask)
act_func_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.2922


### Tanh

In [104]:
class myGraphNetwork4_66_tanh(nn.Module):
    def __init__(self, g, in_dim, out_dim, agg):
        super(myGraphNetwork4_66_tanh, self).__init__()
        self.g = g 
        hidden_dim = 66
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        # torch.tanh replaces the deprecated F.tanh
        h = torch.tanh(self.layer1(self.g, h))
        h = torch.tanh(self.layer2(self.g, h))
        h = torch.tanh(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

In [105]:
model8 = myGraphNetwork4_66_tanh(dgl_G, 3703, num_classes, 'sum')
model8 = model8.to(device)

In [106]:
learning_rate = 0.01
optimizer = optim.SGD(model8.parameters(), lr=learning_rate)

In [107]:
train(model8, features, labels)

Epoch 0: Loss 2.839672565460205
Epoch 10: Loss 3.4934349060058594
Epoch 20: Loss 2.0141804218292236
Epoch 30: Loss 1.4090794324874878
Epoch 40: Loss 1.3391573429107666
Epoch 50: Loss 1.2272076606750488
Epoch 60: Loss 1.1888102293014526
Epoch 70: Loss 1.1186376810073853
Epoch 80: Loss 1.057219386100769
Epoch 90: Loss 1.0392130613327026
Epoch 100: Loss 0.9817972183227539


In [108]:
accuracy = test(model8, features, labels, val_mask)
act_func_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.679


### Mish

In [109]:
class myGraphNetwork4_66_mish(nn.Module):
    def __init__(self, g, in_dim, out_dim, agg):
        super(myGraphNetwork4_66_mish, self).__init__()
        self.g = g 
        hidden_dim = 66
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.mish(self.layer1(self.g, h))
        h = F.mish(self.layer2(self.g, h))
        h = F.mish(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

In [110]:
model9 = myGraphNetwork4_66_mish(dgl_G, 3703, num_classes, 'sum')
model9 = model9.to(device)

In [111]:
learning_rate = 0.01
optimizer = optim.SGD(model9.parameters(), lr=learning_rate)

In [112]:
train(model9, features, labels)

Epoch 0: Loss 1.8074688911437988
Epoch 10: Loss 1.6055008172988892
Epoch 20: Loss 1.565070390701294
Epoch 30: Loss 1.5329062938690186
Epoch 40: Loss 1.5038478374481201
Epoch 50: Loss 1.4759892225265503
Epoch 60: Loss 1.44847571849823
Epoch 70: Loss 1.420963168144226
Epoch 80: Loss 1.3934073448181152
Epoch 90: Loss 1.3658418655395508
Epoch 100: Loss 1.3383949995040894


In [113]:
accuracy = test(model9, features, labels, val_mask)
act_func_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.535


### Leaky ReLU

In [116]:
class myGraphNetwork4_66_leaky(nn.Module):
    def __init__(self, g, in_dim, out_dim, agg):
        super(myGraphNetwork4_66_leaky, self).__init__()
        self.g = g 
        hidden_dim = 66
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.leaky_relu(self.layer1(self.g, h))
        h = F.leaky_relu(self.layer2(self.g, h))
        h = F.leaky_relu(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

In [117]:
model10 = myGraphNetwork4_66_leaky(dgl_G, 3703, num_classes, 'sum')
model10 = model10.to(device)

In [118]:
learning_rate = 0.01
optimizer = optim.SGD(model10.parameters(), lr=learning_rate)

In [119]:
train(model10, features, labels)

Epoch 0: Loss 2.2368438243865967
Epoch 10: Loss 1.6162976026535034
Epoch 20: Loss 1.5594018697738647
Epoch 30: Loss 1.4994723796844482
Epoch 40: Loss 1.4398455619812012
Epoch 50: Loss 1.400891900062561
Epoch 60: Loss 1.3364086151123047
Epoch 70: Loss 1.297406792640686
Epoch 80: Loss 1.2799105644226074
Epoch 90: Loss 1.2136507034301758
Epoch 100: Loss 1.3982957601547241


In [120]:
accuracy = test(model10, features, labels, val_mask)
act_func_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.6173


### Summary of Experiment 3

In [121]:
df_e3 = pd.DataFrame()
df_e3['Activation function'] = act_func
df_e3['Validation accuracy'] = act_func_acc
df_e3

| | Activation function | Validation accuracy |
|---|---|---|
| 0 | ReLU | 0.70987654 |
| 1 | Sigmoid | 0.29218107 |
| 2 | Tanh | 0.67901236 |
| 3 | Mish | 0.5349794 |
| 4 | Leaky ReLU | 0.61728394 |


## Experiment 4: aggregation type

In [122]:
agg = ['sum', 'max', 'mean']
agg_acc = [df_e3['Validation accuracy'].max()]

In [123]:
class myGraphNetwork4_66_relu(nn.Module):
    def __init__(self, g, in_dim, out_dim, agg):
        super(myGraphNetwork4_66_relu, self).__init__()
        self.g = g 
        hidden_dim = 66
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.relu(self.layer1(self.g, h))
        h = F.relu(self.layer2(self.g, h))
        h = F.relu(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

### max

In [124]:
model11 = myGraphNetwork4_66_relu(dgl_G, 3703, num_classes, 'max')
model11 = model11.to(device)

In [125]:
learning_rate = 0.01
optimizer = optim.SGD(model11.parameters(), lr=learning_rate)

In [126]:
train(model11, features, labels)

Epoch 0: Loss 1.7739717960357666
Epoch 10: Loss 1.7720820903778076
Epoch 20: Loss 1.770378589630127
Epoch 30: Loss 1.7688381671905518
Epoch 40: Loss 1.7674531936645508
Epoch 50: Loss 1.7662018537521362
Epoch 60: Loss 1.7650715112686157
Epoch 70: Loss 1.7640458345413208
Epoch 80: Loss 1.7631157636642456
Epoch 90: Loss 1.7622754573822021
Epoch 100: Loss 1.7615162134170532


In [127]:
accuracy = test(model11, features, labels, val_mask)
agg_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.2099


### mean

In [128]:
model12 = myGraphNetwork4_66_relu(dgl_G, 3703, num_classes, 'mean')
model12 = model12.to(device)

In [129]:
learning_rate = 0.01
optimizer = optim.SGD(model12.parameters(), lr=learning_rate)

In [130]:
train(model12, features, labels)

Epoch 0: Loss 1.7890679836273193
Epoch 10: Loss 1.7858184576034546
Epoch 20: Loss 1.7829492092132568
Epoch 30: Loss 1.7803783416748047
Epoch 40: Loss 1.7780842781066895
Epoch 50: Loss 1.7760207653045654
Epoch 60: Loss 1.7741501331329346
Epoch 70: Loss 1.7724167108535767
Epoch 80: Loss 1.7709460258483887
Epoch 90: Loss 1.769783854484558
Epoch 100: Loss 1.7687232494354248


In [131]:
accuracy = test(model12, features, labels, val_mask)
agg_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.2099


### Summary of Experiment 4

In [132]:
df_e4 = pd.DataFrame()
df_e4['Aggregation type'] = agg
df_e4['Validation accuracy'] = agg_acc
df_e4

| | Aggregation type | Validation accuracy |
|---|---|---|
| 0 | sum | 0.70987654 |
| 1 | max | 0.20987654 |
| 2 | mean | 0.20987654 |
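
One known difference between the three aggregators is that sum can tell apart neighbourhoods that contain the same features in different quantities, while mean and max cannot. The toy sketch below illustrates only this property (the helper name is my own); it does not establish why sum wins here. The max and mean runs above also barely reduce the loss in 101 epochs, so the gap may partly reflect slower optimisation at this learning rate.

```python
def aggregate(values, how):
    """Reduce a multiset of scalar neighbour features with the given aggregator."""
    if how == "sum":
        return sum(values)
    if how == "mean":
        return sum(values) / len(values)
    if how == "max":
        return max(values)
    raise ValueError("unknown aggregator: " + how)

# Two different neighbourhoods: one neighbour vs. three identical neighbours.
small, large = [1.0], [1.0, 1.0, 1.0]

for how in ("sum", "mean", "max"):
    print(how, aggregate(small, how), aggregate(large, how))
```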


## Extra - Experiment 5: optimizer

In [190]:
optim_lst = ['SGD', 'SGD with momentum', 'AdaGrad', 'RMSProp', 'Adam']
optim_acc = [df_e4['Validation accuracy'].max()]

In [191]:
class myGraphNetwork4_66_relu_sum(nn.Module):
    def __init__(self, g, in_dim, out_dim):
        super(myGraphNetwork4_66_relu_sum, self).__init__()
        self.g = g 
        # NOTE: 58 differs from the 66 in the class name and in Experiment 2
        hidden_dim = 58
        agg = 'sum'
        self.layer1 = GINConv(nn.Linear(in_dim, hidden_dim), agg)
        self.layer2 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer3 = GINConv(nn.Linear(hidden_dim, hidden_dim), agg)
        self.layer4 = GINConv(nn.Linear(hidden_dim, out_dim), agg)
        
    def forward(self, h):
        h = F.relu(self.layer1(self.g, h))
        h = F.relu(self.layer2(self.g, h))
        h = F.relu(self.layer3(self.g, h))
        h = self.layer4(self.g, h)
        return h

### SGD with momentum

In [192]:
model13 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model13 = model13.to(device)

In [193]:
learning_rate = 0.01
optimizer = optim.SGD(model13.parameters(), lr=learning_rate, momentum=0.9)

In [194]:
train(model13, features, labels)

Epoch 0: Loss 1.7924765348434448
Epoch 10: Loss 1.9238611459732056
Epoch 20: Loss 1.552559733390808
Epoch 30: Loss 1.311308741569519
Epoch 40: Loss 1.1021270751953125
Epoch 50: Loss 0.9836669564247131
Epoch 60: Loss 1.3756468296051025
Epoch 70: Loss 1.4082181453704834
Epoch 80: Loss 1.196549415588379
Epoch 90: Loss 0.9569140672683716
Epoch 100: Loss 0.8223423957824707


In [195]:
accuracy = test(model13, features, labels, val_mask)
optim_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.7407


### AdaGrad

In [196]:
model14 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model14 = model14.to(device)

In [197]:
learning_rate = 0.01
optimizer = optim.Adagrad(model14.parameters(), lr=learning_rate)

In [198]:
train(model14, features, labels)

Epoch 0: Loss 1.9704004526138306
Epoch 10: Loss 1.2413208484649658
Epoch 20: Loss 0.902840793132782
Epoch 30: Loss 0.7086089253425598
Epoch 40: Loss 0.38454127311706543
Epoch 50: Loss 0.25842246413230896
Epoch 60: Loss 0.1769508570432663
Epoch 70: Loss 0.13542187213897705
Epoch 80: Loss 0.12466207891702652
Epoch 90: Loss 0.09204290062189102
Epoch 100: Loss 0.08778762817382812


In [199]:
accuracy = test(model14, features, labels, val_mask)
optim_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.6975


### RMSprop

In [200]:
model15 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model15 = model15.to(device)

In [201]:
learning_rate = 0.01
optimizer = optim.RMSprop(model15.parameters(), lr=learning_rate)

In [202]:
train(model15, features, labels)

Epoch 0: Loss 2.1085548400878906
Epoch 10: Loss 1.7039128541946411
Epoch 20: Loss 1.6387487649917603
Epoch 30: Loss 1.5753117799758911
Epoch 40: Loss 1.4984203577041626
Epoch 50: Loss 1.481108546257019
Epoch 60: Loss 1.7379919290542603
Epoch 70: Loss 1.4482723474502563
Epoch 80: Loss 1.324009895324707
Epoch 90: Loss 1.4063860177993774
Epoch 100: Loss 1.106605052947998


In [203]:
accuracy = test(model15, features, labels, val_mask)
optim_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.5556


### Adam

In [204]:
model16 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model16 = model16.to(device)

In [205]:
learning_rate = 0.01
optimizer = optim.Adam(model16.parameters(), lr=learning_rate)

In [206]:
train(model16, features, labels)

Epoch 0: Loss 1.8946750164031982
Epoch 10: Loss 1.7216644287109375
Epoch 20: Loss 1.2966715097427368
Epoch 30: Loss 1.0323402881622314
Epoch 40: Loss 0.5646226406097412
Epoch 50: Loss 0.28871235251426697
Epoch 60: Loss 0.11546384543180466
Epoch 70: Loss 0.049794431775808334
Epoch 80: Loss 0.02474713698029518
Epoch 90: Loss 0.015269091352820396
Epoch 100: Loss 0.012337084859609604


In [207]:
accuracy = test(model16, features, labels, val_mask)
optim_acc.append(accuracy.numpy())
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.6811


### Summary of Experiment 5

In [208]:
df_e5 = pd.DataFrame()
df_e5['Optimizer'] = optim_lst
df_e5['Validation accuracy'] = optim_acc
df_e5

| | Optimizer | Validation accuracy |
|---|---|---|
| 0 | SGD | 0.70987654 |
| 1 | SGD with momentum | 0.7407407 |
| 2 | AdaGrad | 0.69753087 |
| 3 | RMSProp | 0.5555556 |
| 4 | Adam | 0.68106997 |
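
For reference, this is the update behind `momentum=0.9` as I understand PyTorch's SGD with its default settings: a velocity accumulates past gradients and the parameter moves along that velocity. The sketch applies it to a toy one-dimensional quadratic, not to the network; `sgd_momentum` is an illustrative name.

```python
def sgd_momentum(grad, x0, lr, momentum, steps):
    """Minimise a 1-D function given its gradient, using
    v <- momentum * v + grad(x);  x <- x - lr * v.
    """
    x, v = x0, 0.0
    for _ in range(steps):
        v = momentum * v + grad(x)
        x = x - lr * v
    return x

def grad(x):
    """Gradient of f(x) = (x - 3)^2, whose minimum is at x = 3."""
    return 2.0 * (x - 3.0)

plain = sgd_momentum(grad, x0=0.0, lr=0.01, momentum=0.0, steps=100)
heavy = sgd_momentum(grad, x0=0.0, lr=0.01, momentum=0.9, steps=100)
print(plain, heavy)
```

With the same learning rate and number of steps, the momentum run ends closer to the minimum on this toy problem, which matches the faster loss decrease seen in the momentum log above.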


## Extra - Experiment 6: learning rate

### Powers of 10

In [211]:
lr_lst_10 = [0.1, 0.01, 0.001, 0.0001, 0.00001]
lr_acc_10 = []

In [212]:
for value in lr_lst_10:
    model17 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
    model17 = model17.to(device)
    optimizer = optim.SGD(model17.parameters(), lr=value, momentum=0.9)
    train(model17, features, labels)
    accuracy = test(model17, features, labels, val_mask)
    lr_acc_10.append(accuracy.numpy())
    print("Testing Acc {:.4}".format(accuracy))

Epoch 0: Loss 2.05486798286438
Epoch 10: Loss 1.9975560903549194
Epoch 20: Loss 1.998895287513733
Epoch 30: Loss 1.7801066637039185
Epoch 40: Loss 1.7694134712219238
Epoch 50: Loss 1.7647286653518677
Epoch 60: Loss 1.75750732421875
Epoch 70: Loss 1.7576502561569214
Epoch 80: Loss 1.7568002939224243
Epoch 90: Loss 1.7567548751831055
Epoch 100: Loss 1.756699562072754
Testing Acc 0.2099
Epoch 0: Loss 2.374875783920288
Epoch 10: Loss 2.10815691947937
Epoch 20: Loss 1.605316400527954
Epoch 30: Loss 1.4654438495635986
Epoch 40: Loss 1.2687491178512573
Epoch 50: Loss 1.1149394512176514
Epoch 60: Loss 0.9208343625068665
Epoch 70: Loss 0.8930579423904419
Epoch 80: Loss 0.7473723888397217
Epoch 90: Loss 0.5972433686256409
Epoch 100: Loss 0.5447007417678833
Testing Acc 0.7202
Epoch 0: Loss 2.147435188293457
Epoch 10: Loss 1.655158519744873
Epoch 20: Loss 1.5986064672470093
Epoch 30: Loss 1.5336440801620483
Epoch 40: Loss 1.4798041582107544
Epoch 50: Loss 1.4279979467391968
Epoch 60: Loss 1.376491

In [213]:
df_e6_10 = pd.DataFrame()
df_e6_10['Learning rate'] = lr_lst_10
df_e6_10['Validation accuracy'] = lr_acc_10
df_e6_10

| | Learning rate | Validation accuracy |
|---|---|---|
| 0 | 0.1 | 0.20987654 |
| 1 | 0.01 | 0.7201646 |
| 2 | 0.001 | 0.6522634 |
| 3 | 0.0001 | 0.39711934 |
| 4 | 1e-05 | 0.28806585 |
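
The coarse results suggest refining the search around the best power of 10. A small sketch of that coarse-to-fine step, using the accuracies from the table above; `fine_grid` is a hypothetical helper, and rounding is used to avoid floating-point drift in the grid values.

```python
def fine_grid(center, step, below, above, digits=3):
    """Evenly spaced values around center: `below` steps under it, `above` over it."""
    return [round(center + k * step, digits) for k in range(-below, above + 1)]

# Validation accuracies from the coarse search above
coarse = {
    0.1: 0.20987654,
    0.01: 0.7201646,
    0.001: 0.6522634,
    0.0001: 0.39711934,
    0.00001: 0.28806585,
}
best_coarse = max(coarse, key=coarse.get)
grid = fine_grid(best_coarse, step=0.001, below=4, above=5)

print(best_coarse)
print(grid)
```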


### Summary of Experiment 6

In [228]:
lr_lst = [0.006, 0.007, 0.008, 0.009, 0.01, 0.011, 0.012, 0.013, 0.014, 0.015]
lr_acc = []

In [229]:
for value in lr_lst:
    model18 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
    model18 = model18.to(device)
    optimizer = optim.SGD(model18.parameters(), lr=value, momentum=0.9)
    train(model18, features, labels)
    accuracy = test(model18, features, labels, val_mask)
    lr_acc.append(accuracy.numpy())
    print("Testing Acc {:.4}".format(accuracy))

Epoch 0: Loss 1.8444477319717407
Epoch 10: Loss 1.5645265579223633
Epoch 20: Loss 1.3306379318237305
Epoch 30: Loss 1.132584571838379
Epoch 40: Loss 1.4066513776779175
Epoch 50: Loss 1.6774499416351318
Epoch 60: Loss 1.3029446601867676
Epoch 70: Loss 1.0802726745605469
Epoch 80: Loss 0.933870792388916
Epoch 90: Loss 1.0357813835144043
Epoch 100: Loss 0.8742549419403076
Testing Acc 0.7284
Epoch 0: Loss 2.057295083999634
Epoch 10: Loss 1.5349503755569458
Epoch 20: Loss 1.3401957750320435
Epoch 30: Loss 1.30058753490448
Epoch 40: Loss 1.8948993682861328
Epoch 50: Loss 1.437394380569458
Epoch 60: Loss 1.151200532913208
Epoch 70: Loss 0.9924880862236023
Epoch 80: Loss 0.8526432514190674
Epoch 90: Loss 0.7443626523017883
Epoch 100: Loss 0.6352582573890686
Testing Acc 0.7469
Epoch 0: Loss 2.6548244953155518
Epoch 10: Loss 1.8621271848678589
Epoch 20: Loss 1.6969033479690552
Epoch 30: Loss 1.4588892459869385
Epoch 40: Loss 1.271472454071045
Epoch 50: Loss 1.1687383651733398
Epoch 60: Loss 1.32

In [230]:
df_e6 = pd.DataFrame()
df_e6['Learning rate'] = lr_lst
df_e6['Validation accuracy'] = lr_acc
df_e6

| | Learning rate | Validation accuracy |
|---|---|---|
| 0 | 0.006 | 0.72839504 |
| 1 | 0.007 | 0.74691355 |
| 2 | 0.008 | 0.74485594 |
| 3 | 0.009 | 0.74485594 |
| 4 | 0.01 | 0.6111111 |
| 5 | 0.011 | 0.73045266 |
| 6 | 0.012 | 0.6234568 |
| 7 | 0.013 | 0.6707819 |
| 8 | 0.014 | 0.66049385 |
| 9 | 0.015 | 0.5720165 |
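
The best learning rate can be read off the table programmatically. The following is a minimal sketch in plain Python, with the values copied from the table; the helper name `best_learning_rate` is hypothetical and not part of the notebook.

```python
# Values copied from the table above (validation accuracy per learning rate).
lr_lst = [0.006, 0.007, 0.008, 0.009, 0.01, 0.011, 0.012, 0.013, 0.014, 0.015]
lr_acc = [0.72839504, 0.74691355, 0.74485594, 0.74485594, 0.6111111,
          0.73045266, 0.6234568, 0.6707819, 0.66049385, 0.5720165]


def best_learning_rate(rates, accuracies):
    """Return (rate, accuracy) for the highest accuracy; ties go to the first entry."""
    best_idx = max(range(len(accuracies)), key=accuracies.__getitem__)
    return rates[best_idx], accuracies[best_idx]


best_lr, best_acc = best_learning_rate(lr_lst, lr_acc)
print("best lr = {}, validation accuracy = {:.4f}".format(best_lr, best_acc))
# prints: best lr = 0.007, validation accuracy = 0.7469
```

Note that a learning rate of 0.01 scored about 0.72 in the coarse sweep and about 0.61 here, so differences of this size between single runs should be read with caution.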


### The model with the best learning rate

In [245]:
model19 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model19 = model19.to(device)
optimizer = optim.SGD(model19.parameters(), lr=0.007, momentum=0.9)
train(model19, features, labels)
accuracy = test(model19, features, labels, val_mask)
print("Testing Acc {:.4}".format(accuracy))

Epoch 0: Loss 1.7858325242996216
Epoch 10: Loss 1.5501834154129028
Epoch 20: Loss 1.3458166122436523
Epoch 30: Loss 1.2724180221557617
Epoch 40: Loss 1.342422366142273
Epoch 50: Loss 2.224426031112671
Epoch 60: Loss 1.3587607145309448
Epoch 70: Loss 1.1013319492340088
Epoch 80: Loss 1.5636093616485596
Epoch 90: Loss 0.966677188873291
Epoch 100: Loss 0.7579297423362732
Testing Acc 0.7654


## Save & Load Model

### Function: Save checkpoint

In [232]:
def save_checkpoint(best_model_path, model):
    state = {'state_dict': model.state_dict()}
    torch.save(state, best_model_path)
    print('model saved to %s' % best_model_path)

### Function: Load checkpoint

In [233]:
def load_checkpoint(best_model_path, model):
    state = torch.load(best_model_path)
    model.load_state_dict(state['state_dict'])
    print('model loaded from %s' % best_model_path)

### Function: Train and save model

In [234]:
def train_save(model, features, labels, name):
    # Note: this function uses the global `optimizer` defined before the call.
    # Convert the inputs once, outside the loop (accepts arrays and tensors)
    features = torch.as_tensor(features, dtype=torch.float)
    labels = torch.as_tensor(labels, dtype=torch.long)

    model.train()
    for epoch in range(100):
        logits = model(features)
        preds = F.log_softmax(logits, 1)
        loss = F.nll_loss(preds, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if epoch % 10 == 0:
            print("Epoch {}: Loss {}".format(epoch, loss.item()))

    # Save model at the last epoch
    save_checkpoint(name, model)

### Save the best model

The best model: model19

Parameters:

GINConv layers = 4

Dimensions of hidden features = 66

Activation function = ReLU

Aggregation type = sum

Optimizer = SGD with momentum 0.9

Learning rate = 0.007

In [246]:
model19 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model19 = model19.to(device)
optimizer = optim.SGD(model19.parameters(), lr=0.007, momentum=0.9)
train_save(model19, features, labels, "best_model.pth")

Epoch 0: Loss 1.877258539199829
Epoch 10: Loss 1.6734429597854614
Epoch 20: Loss 1.513291358947754
Epoch 30: Loss 1.4266899824142456
Epoch 40: Loss 1.350911259651184
Epoch 50: Loss 1.2881391048431396
Epoch 60: Loss 1.3823097944259644
Epoch 70: Loss 1.3270572423934937
Epoch 80: Loss 1.2522754669189453
Epoch 90: Loss 1.1710561513900757
model saved to best_model.pth


In [247]:
test(model19, features, labels, dataset['val_mask'])

tensor(0.7840)

In [248]:
load_checkpoint("best_model.pth", model19)

model loaded from best_model.pth


In [249]:
test(model19, features, labels, dataset['val_mask'])

tensor(0.7840)
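
Because the checkpoint above is loaded back into the same `model19` that was just trained, the identical accuracy is expected either way. A stricter check is to load the checkpoint into a freshly initialised model and compare outputs. The sketch below illustrates this with a small `nn.Linear` as a hypothetical stand-in for the notebook's network; the file name `toy_checkpoint.pth` and the variable names are assumptions for illustration only.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the notebook's GNN: any nn.Module behaves the same way.
trained = nn.Linear(4, 3)
fresh = nn.Linear(4, 3)  # separate random initialisation

x = torch.randn(5, 4)
differs_before = not torch.allclose(trained(x), fresh(x))

# Save and restore using the same {'state_dict': ...} layout as save_checkpoint
path = os.path.join(tempfile.mkdtemp(), "toy_checkpoint.pth")
torch.save({'state_dict': trained.state_dict()}, path)
state = torch.load(path)
fresh.load_state_dict(state['state_dict'])

# After loading, the fresh model should reproduce the trained model's outputs
matches_after = torch.allclose(trained(x), fresh(x))
print(differs_before, matches_after)
```

Applied to the notebook, this would mean constructing a new `myGraphNetwork4_66_relu_sum` instance, calling `load_checkpoint` on it, and then running `test`.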

## Checkpoint for marker

In [None]:
# Please input the test_mask in this cell
test_mask = dataset['val_mask']

In [None]:
features = dataset['features']
labels = dataset['labels']

accuracy = test(model19, features, labels, test_mask)
print("Testing Acc {:.4}".format(accuracy))

## Best model with random features

### Generate random feature matrix

In [250]:
np.random.seed(5008)
rand_features = np.random.rand(features.shape[0], features.shape[1])
print(rand_features)
print(rand_features.shape)

[[0.25433381 0.76973336 0.66314725 ... 0.79413662 0.38325593 0.04904157]
 [0.78520706 0.16452047 0.67895884 ... 0.62678496 0.98376768 0.83075035]
 [0.77473306 0.68723945 0.33769582 ... 0.79266182 0.01727338 0.1213659 ]
 ...
 [0.78565393 0.65095739 0.72731653 ... 0.37491551 0.53464057 0.61191728]
 [0.37093873 0.1871571  0.03733613 ... 0.08522268 0.51496229 0.57879899]
 [0.03107564 0.04972252 0.48485482 ... 0.07298347 0.17834103 0.90934589]]
(3327, 3703)


In [251]:
# Check the shape
rand_features.shape == features.shape

True

### Retrain the best model

In [252]:
model19 = myGraphNetwork4_66_relu_sum(dgl_G, 3703, num_classes)
model19 = model19.to(device)
optimizer = optim.SGD(model19.parameters(), lr=0.007, momentum=0.9)
train_save(model19, rand_features, labels, "best_model_random.pth")

Epoch 0: Loss 97.3411865234375
Epoch 10: Loss 1.8097028732299805
Epoch 20: Loss 119734312.0
Epoch 30: Loss 1.8146690130233765
Epoch 40: Loss 1.8129675388336182
Epoch 50: Loss 1.8066030740737915
Epoch 60: Loss 1.7995305061340332
Epoch 70: Loss 1.7932021617889404
Epoch 80: Loss 1.7879424095153809
Epoch 90: Loss 1.7836921215057373
model saved to best_model_random.pth


In [253]:
accuracy = test(model19, rand_features, labels, dataset['val_mask'])
print("Testing Acc {:.4}".format(accuracy))

Testing Acc 0.2099
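
As a point of reference for the loss plateau (about 1.78) and the accuracy above, a model that assigns equal probability to every class has a negative log-likelihood of ln K and a chance accuracy of 1/K for K equally frequent classes. The sketch below prints both for a few illustrative values of K; the helper `uniform_nll` is hypothetical, and the listed class counts are not taken from the dataset (the notebook's actual value is `num_classes`).

```python
import math


def uniform_nll(num_classes):
    """Negative log-likelihood of predicting probability 1/num_classes for every class."""
    return -math.log(1.0 / num_classes)


for k in (2, 4, 6, 8):  # illustrative class counts, not read from the dataset
    print("K = {}: uniform-guess loss = {:.4f}, chance accuracy = {:.4f}".format(
        k, uniform_nll(k), 1.0 / k))
```

Comparing the final loss and accuracy with the row that matches `num_classes` indicates how close the model trained on random features is to an uninformed guess.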


### Comparison and comment

The two training runs give the following validation accuracies:

Given features: 78.4%

Random features: 20.99%

Throughout Q2, I ran 6 experiments, adjusting the model's parameters to find the configuration with the highest node classification accuracy. The nodes and edges of the graph were built from the given dataset.

With the given features, the model achieves ~78% validation accuracy, which means it fits the data quite well. When the features are randomized, the same model achieves only ~21% validation accuracy. The graph and the labels are identical in both runs, so the drop comes entirely from replacing the features. **This implies that the structure of the network provided is highly related to the node features**.

For example, this network may represent connections between people: nodes are the people, and features are their demographic information, such as sex, age, and income. It is reasonable that people of a similar age are more likely to be connected, and the same holds for income. **The features are factors that affect whether edges exist between nodes**. This explains the large difference in validation accuracy between training the same model on the given features and on random features.
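
To put the ~21% figure in context, it can help to compare it with the accuracy of always predicting the most frequent class. A model whose predictions have collapsed to a single class scores exactly that baseline. The following is a minimal sketch with hypothetical toy labels; `majority_baseline`, `accuracy`, `toy_labels` and `collapsed_preds` are illustrative names, not part of the notebook.

```python
from collections import Counter


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


def accuracy(preds, labels):
    """Fraction of positions where the prediction equals the label."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy validation labels and a predictor that always outputs class 1
toy_labels = [0, 1, 1, 2, 1, 0, 2, 1, 3, 1]
collapsed_preds = [1] * len(toy_labels)

print(majority_baseline(toy_labels))          # 0.5
print(accuracy(collapsed_preds, toy_labels))  # 0.5
```

Running the same comparison on the validation labels selected by `val_mask` would show whether the random-feature model does any better than this baseline.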