## SGC model using torch geometric with Cora-Planetoid

In [34]:
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.nn import SGConv
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import torch.nn.functional as F

### IMPORT DATA

In [35]:
path = "Cora"
dataset = Planetoid(path, "Cora")
data = dataset[0]  # tensor representation of the graph
print("Cora: ", data)

Cora:  Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], train_mask=[2708], val_mask=[2708], test_mask=[2708])


Type: Citation network
Nodes: Papers (documents)
Edges: Citations between papers
Node Features: Word frequency vectors extracted from the text of each paper
Labels: Each paper belongs to one of seven categories:
Case_Based
Genetic_Algorithms
Neural_Networks
Probabilistic_Methods
Reinforcement_Learning
Rule_Learning
Theory

x holds the node features, one row per node.
y holds the label of each node, so it is a single vector of length 2708.
We have 2708 nodes, 1433 features per node, and 10556 edges.
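
As a rough illustration of the edge_index layout (the toy graph and the helper name `to_edge_index` are invented for this sketch and are not part of the notebook): edges are stored as two parallel rows of source and target node indices. Assuming an undirected graph lists every link once per direction, the 10556 entries reported for Cora would correspond to 5278 undirected links.

```python
# Toy undirected graph: 4 papers, 3 citation links (hypothetical data).
undirected_edges = [(0, 1), (1, 2), (2, 3)]


def to_edge_index(edges):
    """Return [sources, targets] with each undirected edge stored in both directions."""
    sources, targets = [], []
    for u, v in edges:
        sources += [u, v]
        targets += [v, u]
    return [sources, targets]


edge_index = to_edge_index(undirected_edges)
print(edge_index)
print("shape:", (len(edge_index), len(edge_index[0])))
```

The second dimension is twice the number of undirected links, which matches the [2, E] shape printed for data.edge_index.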

In [36]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [37]:
print(data.y)
unique_classes = torch.unique(data.y)
print(unique_classes)
# Get the number of classes
num_classes = unique_classes.size(0)

print("Number of classes:", num_classes)
print("Number of classes:", dataset.num_classes)

tensor([3, 4, 4,  ..., 3, 3, 3])
tensor([0, 1, 2, 3, 4, 5, 6])
Number of classes: 7
Number of classes: 7


In [38]:
print(data.edge_index.shape)

torch.Size([2, 10556])


### CONSTRUCT THE MODEL

In [39]:
SGC_model = SGConv(in_channels=data.num_features,  # number of input features per node
                   out_channels=dataset.num_classes,  # dimension of the embedding
                   K=1,  # number of propagation (averaging) steps
                   cached=True  # cache the propagated features after the first call instead of recomputing them
                   )

# out_channels depends on the kind of embedding you want and the dimensionality of the embedded space
# -> here we go from 1433 dimensions down to the number of classes
# for classification, the natural choice of out_channels is the number of classes a node can be assigned to
# here we have 7 classes

If we want to map our data into an embedding space, we can simply pass it through this model and take the output.

The model averages the node features over each node's neighborhood using the edge data, then applies a linear layer.
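
The averaging step can be sketched from scratch on a toy graph. This is a minimal sketch, assuming the usual SGC propagation with self-loops and symmetric normalization, S = D^-1/2 (A + I) D^-1/2, applied K times. The graph, the features, and the function name `propagate` are hypothetical, and the learned linear layer is left out.

```python
import math

# Toy graph (hypothetical): a path 0 - 1 - 2, with 2 features per node.
edges = [(0, 1), (1, 2)]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
n = len(X)

# Adjacency matrix with self-loops: A + I.
A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
for u, v in edges:
    A[u][v] = A[v][u] = 1.0

# Symmetric normalization: S = D^-1/2 (A + I) D^-1/2.
deg = [sum(row) for row in A]
S = [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def propagate(S, X, K):
    """Multiply X by S a total of K times (the parameter-free part of SGC)."""
    for _ in range(K):
        X = [[sum(S[i][k] * X[k][f] for k in range(len(X)))
              for f in range(len(X[0]))] for i in range(len(X))]
    return X


X_smoothed = propagate(S, X, K=2)
for row in X_smoothed:
    print([round(v, 3) for v in row])
```

Each row of the result is a weighted average of the features of nodes within K hops, so the shape stays [number of nodes, number of features] until the linear layer reduces it.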

### GET EMBEDDING

Because we are skipping the training step, the raw embedding comes from averaging the node features with the edge information and projecting them with the randomly initialized weights.

We use SGC_model to run a forward pass on the input features (data.x) and the edge information (data.edge_index), which yields the node embeddings.

In [40]:
print("Shape of the original data: ", data.x.shape)
print("Shape of the raw embedding data: ", SGC_model(data.x, data.edge_index).shape)

Shape of the original data:  torch.Size([2708, 1433])
Shape of the raw embedding data:  torch.Size([2708, 7])


### CONSTRUCT THE MODEL AGAIN FOR NODE CLASSIFICATION

We want to use the same model, but now we optimize its parameters (weight and bias) to get a good classification result.

We repeat the construction of the model, this time explicitly for classification.

Here we construct the SGC as a neural network module, so we could add more layers and backpropagate through the parameters given a loss function.

In [44]:
class SGCNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = SGConv(
            in_channels=dataset.num_features,
            out_channels=dataset.num_classes,
            K=2,
            cached=True,
        )

    def forward(self):
        x = self.conv1(data.x, data.edge_index)  # apply the convolution to the data
        return F.log_softmax(x, dim=1)


INIT: When defining a neural network class in PyTorch, we declare the layers in the init function and then use the forward method to define how data flows through those layers.

FORWARD: The forward pass is the same call we used for the embedding -> SGC_model(data.x, data.edge_index)
We pass the data, with its known features and edges, into our model (conv1) and get the output.

OUTPUT: The output is a logarithmic softmax because we are constructing this class specifically for classification. The output has 7 values per node (or whatever the number of classes is), and the softmax turns them into probabilities over the labels, which is why we set the output size to the number of classes.
If we just wanted an embedding representation, we could set the output size to whatever embedding dimension we want.
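
To make the log-softmax output concrete, here is a small hand-written version for a single node. It is only a sketch: the scores are made up, and the function is a plain-Python stand-in for what F.log_softmax computes along the class dimension.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax of one node's class scores."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


# Hypothetical raw scores for one node over 7 classes.
scores = [0.5, -1.2, 0.1, 2.3, 0.0, -0.7, 1.1]
log_probs = log_softmax(scores)
probs = [math.exp(lp) for lp in log_probs]

print("sum of probabilities:", round(sum(probs), 6))
print("predicted class:", log_probs.index(max(log_probs)))
```

Exponentiating the log-probabilities gives values that sum to 1, and the index of the largest one is the predicted label.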

In [45]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
SGC_model, data = SGCNet().to(device), data.to(device)
optimizer = torch.optim.Adam(SGC_model.parameters(), lr=0.2, weight_decay=0.005)

Here we don't need a loader or batch optimization, because simplified graph convolution is very fast compared to Node2Vec. Node2Vec is a probabilistic approach that requires an iterative procedure, whereas SGC is just a simple matrix multiplication that computes an average.

## What are the Learnable Parameters?

In [47]:
# at first SGC_model was a bare SGConv, so the parameter was just lin.weight; after wrapping it in SGCNet() it became conv1.lin.weight
for name, parameter in SGC_model.named_parameters():
    print("Parameter {}".format(name))
    print("Shape: ", parameter.shape)

Parameter conv1.lin.weight
Shape:  torch.Size([7, 1433])
Parameter conv1.lin.bias
Shape:  torch.Size([7])
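
From the two shapes printed above, the total number of learnable values can be worked out directly. The variable names below are just for this arithmetic sketch.

```python
in_channels, out_channels = 1433, 7  # sizes taken from the shapes printed above

weight_params = out_channels * in_channels  # conv1.lin.weight has shape [7, 1433]
bias_params = out_channels                  # conv1.lin.bias has shape [7]
total_params = weight_params + bias_params

print(total_params)  # 10038
```

So the whole classifier is a single linear map, which is part of why training is so quick.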


## TRAIN FUNCTION

### What do we do in this function?
1. get the data
2. pass the data to the model that we have constructed
3. get the corresponding log-probabilities, which are the model output
4. compare those outputs with the true labels
5. optimize the model based on that comparison
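
Step 4 can be sketched by hand. The helper below is a plain-Python stand-in for applying F.nll_loss to the masked rows; the name `nll_loss`, the numbers, and the mask are all hypothetical.

```python
import math


def nll_loss(log_probs, labels, mask):
    """Mean negative log-likelihood over the nodes selected by mask."""
    picked = [-log_probs[i][labels[i]] for i in range(len(labels)) if mask[i]]
    return sum(picked) / len(picked)


# Hypothetical log-probabilities for 3 nodes and 2 classes.
log_probs = [[math.log(0.9), math.log(0.1)],
             [math.log(0.2), math.log(0.8)],
             [math.log(0.5), math.log(0.5)]]
labels = [0, 1, 0]
train_mask = [True, True, False]  # node 2 is not used for training

loss = nll_loss(log_probs, labels, train_mask)
print(round(loss, 4))
```

Only the masked nodes contribute to the loss, so the labels of validation and test nodes never influence the gradient.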


In [51]:
def train():
    SGC_model.train()  # set model.training to True
    optimizer.zero_grad()  # reset the gradients
    predicted_y = SGC_model()  # model output: log-softmax probabilities for every node
    true_y = data.y  # true labels
    # measure the error between true y and predicted y on the training nodes only
    # equivalent one-liner: F.nll_loss(SGC_model()[data.train_mask], data.y[data.train_mask]).backward()
    losses = F.nll_loss(predicted_y[data.train_mask], true_y[data.train_mask])
    losses.backward()  # backpropagation
    optimizer.step()  # update the parameters to reduce the loss


## TEST FUNCTION

In [50]:
@torch.no_grad()  # no gradients are needed for evaluation
def test():
    SGC_model.eval()  # set model.training to False
    logits = SGC_model()  # log-probabilities for all nodes
    accs = []
    for _, mask in data('train_mask', 'val_mask', 'test_mask'):
        pred = logits[mask].argmax(dim=1)  # turn log-probabilities into predicted class indices, e.g. 3
        acc = pred.eq(data.y[mask]).sum().item() / mask.sum().item()
        accs.append(acc)
    return accs
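
The per-mask accuracy computed in the test function can be illustrated on toy values. The predictions, labels, masks, and the helper name `masked_accuracy` below are all made up for this sketch.

```python
def masked_accuracy(pred, labels, mask):
    """Fraction of correct predictions among the nodes selected by mask."""
    hits = [p == y for p, y, m in zip(pred, labels, mask) if m]
    return sum(hits) / len(hits)


# Hypothetical predictions and labels for 6 nodes.
pred = [3, 4, 4, 0, 1, 3]
labels = [3, 4, 2, 0, 1, 5]
masks = {
    "train": [True, True, False, False, False, False],
    "val": [False, False, True, True, False, False],
    "test": [False, False, False, False, True, True],
}

accs = {name: masked_accuracy(pred, labels, mask) for name, mask in masks.items()}
print(accs)
```

Each split is scored only on its own nodes, even though the model produces predictions for the whole graph in one pass.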

## PUTTING IT ALL TOGETHER

In [52]:
best_val_acc = test_acc = 0
for epoch in range(1, 101):
    train()
    train_acc, val_acc, tmp_test_acc = test()
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        test_acc = tmp_test_acc
    print(f'Epoch: {epoch:03d}, Train: {train_acc:.4f}, '
          f'Val: {best_val_acc:.4f}, Test: {test_acc:.4f}')

Epoch: 001, Train: 0.9500, Val: 0.6180, Test: 0.6440
Epoch: 002, Train: 0.9500, Val: 0.7580, Test: 0.7790
Epoch: 003, Train: 0.9786, Val: 0.7660, Test: 0.7790
Epoch: 004, Train: 0.9857, Val: 0.7660, Test: 0.7790
Epoch: 005, Train: 0.9714, Val: 0.7660, Test: 0.7790
Epoch: 006, Train: 0.9786, Val: 0.7660, Test: 0.7790
Epoch: 007, Train: 1.0000, Val: 0.7660, Test: 0.7790
Epoch: 008, Train: 1.0000, Val: 0.7780, Test: 0.8110
Epoch: 009, Train: 1.0000, Val: 0.7880, Test: 0.8150
Epoch: 010, Train: 0.9929, Val: 0.7880, Test: 0.8150
Epoch: 011, Train: 0.9857, Val: 0.7880, Test: 0.8150
Epoch: 012, Train: 0.9857, Val: 0.7880, Test: 0.8150
Epoch: 013, Train: 0.9929, Val: 0.7880, Test: 0.8150
Epoch: 014, Train: 1.0000, Val: 0.7880, Test: 0.8150
Epoch: 015, Train: 1.0000, Val: 0.7880, Test: 0.8150
Epoch: 016, Train: 1.0000, Val: 0.7880, Test: 0.8150
Epoch: 017, Train: 0.9929, Val: 0.7880, Test: 0.8150
Epoch: 018, Train: 0.9929, Val: 0.7880, Test: 0.8150
Epoch: 019, Train: 1.0000, Val: 0.7900, Test: 

MUCH FASTER!
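
The training loop reports the test accuracy from the epoch with the best validation accuracy, not the best test accuracy seen. A minimal sketch of that selection rule, with a made-up history and a hypothetical helper name `select_by_validation`:

```python
def select_by_validation(history):
    """Return (best validation accuracy, test accuracy at that same epoch)."""
    best_val, test_at_best = 0.0, 0.0
    for val_acc, test_acc in history:
        if val_acc > best_val:  # only a strict improvement updates the record
            best_val, test_at_best = val_acc, test_acc
    return best_val, test_at_best


# Hypothetical (validation, test) accuracies per epoch.
history = [(0.60, 0.62), (0.75, 0.78), (0.74, 0.80), (0.78, 0.79)]
best_val, test_at_best = select_by_validation(history)
print(best_val, test_at_best)
```

In this toy history the highest test accuracy occurs at an epoch where validation did not improve, so it is not the number reported. This keeps the test set out of model selection.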

A more comprehensive implementation of:
https://github.com/pyg-team/pytorch_geometric/blob/master/examples/sgc.py