
## Installing Spektral

In [None]:
!pip install spektral

## Importing Libraries

In [33]:
import numpy as np
from spektral.datasets import TUDataset
from spektral.transforms import Degree, GCNFilter
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Dropout
from spektral.layers import GCNConv, GlobalSumPool
from spektral.data import BatchLoader

## Exploring the Dataset

In [5]:
dataset = TUDataset("PROTEINS")

Downloading PROTEINS dataset.


100%|█████████████████████████████████████████| 447k/447k [00:00<00:00, 945kB/s]


Successfully loaded PROTEINS.


In [6]:
dataset
# this dataset has 1113 distinct graphs

TUDataset(n_graphs=1113)

In [34]:
dataset.n_labels

2

Exploring a single graph

In [24]:
graph = dataset[0]
graph

Graph(n_nodes=108, n_node_features=4, n_edge_features=None, n_labels=2)

Graph(n_nodes=42, n_node_features=4, n_edge_features=None, n_labels=2)

In [11]:
# Adjacency Matrix
graph.a

<42x42 sparse matrix of type '<class 'numpy.float64'>'
	with 162 stored elements in Compressed Sparse Row format>

In [13]:
# Node Feature Matrix
graph.x

array([[23.,  1.,  0.,  0.],
       [10.,  1.,  0.,  0.],
       [25.,  1.,  0.,  0.],
       [ 7.,  1.,  0.,  0.],
       [12.,  1.,  0.,  0.],
       [11.,  1.,  0.,  0.],
       [ 5.,  1.,  0.,  0.],
       [ 7.,  1.,  0.,  0.],
       [ 9.,  1.,  0.,  0.],
       [ 3.,  1.,  0.,  0.],
       [ 6.,  1.,  0.,  0.],
       [22.,  1.,  0.,  0.],
       [ 8.,  1.,  0.,  0.],
       [26.,  1.,  0.,  0.],
       [ 7.,  1.,  0.,  0.],
       [12.,  1.,  0.,  0.],
       [11.,  1.,  0.,  0.],
       [ 5.,  1.,  0.,  0.],
       [ 7.,  1.,  0.,  0.],
       [ 8.,  1.,  0.,  0.],
       [ 3.,  1.,  0.,  0.],
       [ 6.,  1.,  0.,  0.],
       [ 3.,  0.,  1.,  0.],
       [ 9.,  0.,  1.,  0.],
       [10.,  0.,  1.,  0.],
       [ 7.,  0.,  1.,  0.],
       [10.,  0.,  1.,  0.],
       [ 8.,  0.,  1.,  0.],
       [ 5.,  0.,  1.,  0.],
       [ 4.,  0.,  1.,  0.],
       [ 3.,  0.,  1.,  0.],
       [ 3.,  0.,  1.,  0.],
       [ 3.,  0.,  1.,  0.],
       [ 9.,  0.,  1.,  0.],
       [10.,  

In [16]:
# Edge Feature Matrix. No Edge Features for this graph
graph.e

In [17]:
# Label vector (one-hot encoded class of the graph)
graph.y

array([1., 0.])

Datasets also provide methods for applying transforms to each datum:

`apply(transform)` - modifies the dataset in-place, by applying the transform to each graph

`map(transform)` - returns a list obtained by applying the transform to each graph

`filter(function)` - removes from the dataset any graph for which `function(graph)` is `False`. This is also an in-place operation.
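
The difference between the in-place methods and `map` can be sketched without Spektral. `ToyDataset` below is a hypothetical stand-in that only mimics the behaviour described above (including the `reduce` argument of `map` that is used later in this notebook); it is not Spektral's implementation.

```python
class ToyDataset:
    """Hypothetical stand-in mimicking apply / map / filter as described above."""

    def __init__(self, graphs):
        self.graphs = list(graphs)

    def apply(self, transform):
        # In-place: every stored graph is replaced by its transformed version.
        self.graphs = [transform(g) for g in self.graphs]

    def map(self, transform, reduce=None):
        # Not in-place: returns the transformed values, optionally reduced.
        out = [transform(g) for g in self.graphs]
        return reduce(out) if reduce is not None else out

    def filter(self, function):
        # In-place: keeps only graphs for which function(graph) is truthy.
        self.graphs = [g for g in self.graphs if function(g)]


# Each toy "graph" is just a dict holding a node count (illustrative data).
toy = ToyDataset([{"n_nodes": 42}, {"n_nodes": 620}, {"n_nodes": 108}])

node_counts = toy.map(lambda g: g["n_nodes"])          # stored graphs unchanged
largest = toy.map(lambda g: g["n_nodes"], reduce=max)  # a single reduced value
toy.filter(lambda g: g["n_nodes"] < 500)               # drops the 620-node graph
toy.apply(lambda g: {**g, "seen": True})               # rewrites what is stored
```

After these calls only two toy graphs remain, and both carry the extra `seen` key.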

In [9]:
# Keep only graphs with fewer than 500 nodes (in-place)
dataset.filter(lambda g: g.n_nodes < 500)
dataset

TUDataset(n_graphs=1111)

## Preprocessing the Dataset for the GNN 

In [26]:
# Find the maximum node degree across all graphs in the dataset
max_degree = int(dataset.map(lambda g: g.a.sum(-1).max(), reduce=max))

In [27]:
max_degree 

12
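
The expression `g.a.sum(-1).max()` relies on the fact that the row sums of a binary adjacency matrix are the node degrees. A minimal NumPy sketch on two made-up toy graphs (dense matrices are assumed here for readability; the dataset stores sparse ones):

```python
import numpy as np

# Two small undirected toy graphs as dense adjacency matrices (illustrative data).
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]], dtype=float)
star = np.array([[0, 1, 1, 1],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0],
                 [1, 0, 0, 0]], dtype=float)

def graph_max_degree(a):
    # Row sums of a binary adjacency matrix give each node's degree.
    return a.sum(-1).max()

# Reduce the per-graph maxima to a single value, as reduce=max does above.
toy_max_degree = int(max(graph_max_degree(a) for a in (path, star)))
print(toy_max_degree)
```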

Augmenting each graph's node features with a one-hot encoding of every node's degree (a vector of length `max_degree + 1`)

In [31]:
dataset.apply(Degree(max_degree))
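
What this transform does to a single graph can be approximated in plain NumPy. The helper `append_one_hot_degree` is a hypothetical name used for illustration, and the sketch assumes a dense, binary adjacency matrix; it is not Spektral's own code.

```python
import numpy as np

def append_one_hot_degree(x, a, max_degree):
    # Node degrees are the row sums of the adjacency matrix.
    degrees = a.sum(-1).astype(int)
    # Row i of the identity matrix is the one-hot vector for degree i.
    one_hot = np.eye(max_degree + 1)[degrees]
    # Concatenate the encoding to the existing node features.
    return np.concatenate([x, one_hot], axis=-1)

# Toy path graph 0 - 1 - 2 with two features per node (illustrative data).
a = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
x = np.array([[5., 1.],
              [7., 0.],
              [9., 1.]])

x_aug = append_one_hot_degree(x, a, max_degree=3)
print(x_aug.shape)
```

With `max_degree = 12`, the same operation turns the 4 original node features of this dataset into 4 + 13 = 17 features per node.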

In [30]:
dataset[0].x

array([[23.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [10.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [25.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [ 7.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [12.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [11.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [ 5.,  1.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [ 7.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [ 9.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,
         0.,  0.,  0.,  0.],
       [ 3.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,
         

Preprocessing the adjacency matrices for the graph convolutional layer

In [32]:
dataset.apply(GCNFilter())
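
`GCNFilter` replaces each adjacency matrix with the normalized form used by GCN layers: self-loops are added and the result is scaled symmetrically by the node degrees. A minimal dense NumPy sketch of that normalization follows; the function name `gcn_normalize` is made up for this example, and symmetric normalization is assumed.

```python
import numpy as np

def gcn_normalize(a):
    # Add self-loops so every node also aggregates its own features.
    a_hat = a + np.eye(a.shape[0])
    # Degrees of the self-looped graph (always >= 1, so the division is safe).
    deg = a_hat.sum(-1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}.
    return d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]

# Toy path graph 0 - 1 - 2 (illustrative data).
a = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

a_norm = gcn_normalize(a)
print(np.round(a_norm, 3))
```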

## Creating the GNN

In [38]:
class GNN(Model):
  """GCN --> Dropout --> Global Sum Pool --> Dense (softmax)."""

  def __init__(self, n_hidden, n_labels):
    super().__init__()
    self.graph_conv = GCNConv(n_hidden)
    self.pool = GlobalSumPool()
    self.dropout = Dropout(0.5)
    self.dense = Dense(n_labels, activation='softmax')

  def call(self, inputs):
    # inputs is the pair (node features, normalized adjacency) from the loader
    x = self.graph_conv(inputs)  # node-level embeddings
    x = self.dropout(x)          # only active during training
    x = self.pool(x)             # sum over nodes: one vector per graph
    return self.dense(x)         # class probabilities per graph
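
To make the data flow concrete, here is a rough NumPy sketch of the same forward pass for a single graph at inference time. It assumes a linear graph convolution (no activation is passed to `GCNConv` above), treats dropout as the identity, and uses random weights and an identity matrix as placeholders, so the numbers themselves are meaningless.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def forward(x, a_norm, w_conv, b_conv, w_out, b_out):
    h = a_norm @ x @ w_conv + b_conv        # graph convolution: aggregate, then project
    pooled = h.sum(axis=0)                  # global sum pool: one vector per graph
    return softmax(pooled @ w_out + b_out)  # dense layer with softmax

rng = np.random.default_rng(0)
n_nodes, n_features, n_hidden, n_labels = 5, 17, 32, 2

x = rng.normal(size=(n_nodes, n_features))  # placeholder node features
a_norm = np.eye(n_nodes)                    # placeholder for a normalized adjacency
w_conv = rng.normal(scale=0.1, size=(n_features, n_hidden))
b_conv = np.zeros(n_hidden)
w_out = rng.normal(scale=0.1, size=(n_hidden, n_labels))
b_out = np.zeros(n_labels)

probs = forward(x, a_norm, w_conv, b_conv, w_out, b_out)
print(probs.shape, probs.sum())
```

Because the pooling step sums over nodes, the output has a fixed size regardless of how many nodes the graph has.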

In [39]:
model = GNN(32, dataset.n_labels)  # 32 hidden channels, one output unit per class

In [44]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

## Training the GNN 

In [45]:
loader = BatchLoader(dataset, batch_size=32)
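
Graphs in a batch have different numbers of nodes, so `BatchLoader` zero-pads them to the size of the largest graph in the batch before stacking. The idea can be sketched as below; `pad_batch` is a hypothetical helper written for this illustration and assumes dense adjacency matrices.

```python
import numpy as np

def pad_batch(xs, adjs):
    # Pad every graph to the node count of the largest graph in the batch.
    n_max = max(x.shape[0] for x in xs)
    n_feat = xs[0].shape[1]
    x_batch = np.zeros((len(xs), n_max, n_feat))
    a_batch = np.zeros((len(xs), n_max, n_max))
    for i, (x, a) in enumerate(zip(xs, adjs)):
        n = x.shape[0]
        x_batch[i, :n] = x      # real nodes first, zero rows after
        a_batch[i, :n, :n] = a  # padded nodes have no edges
    return x_batch, a_batch

# Two toy graphs with 2 and 4 nodes and 3 features per node (illustrative data).
xs = [np.ones((2, 3)), np.ones((4, 3))]
adjs = [np.ones((2, 2)), np.ones((4, 4))]

x_batch, a_batch = pad_batch(xs, adjs)
print(x_batch.shape, a_batch.shape)
```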

In [46]:
model.fit(loader.load(), steps_per_epoch=loader.steps_per_epoch, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f93c1a0f190>