# Hands-on tutorial on Graph Representation Learning

In this lecture, we will go through the following topics:
1. Graph-structured data in Python
2. GNN package - PyTorch Geometric introduction
3. Representation in Graphs - node2vec, GCN
4. GNN with downstream tasks
5. GNN for text classification


In [99]:
import torch
import os.path as osp
import torch.nn as nn
from torch_geometric.datasets import Planetoid
from torch_geometric.transforms import NormalizeFeatures

## Obtaining node embedding with Graph Neural Network
For a GNN layer, Message Passing is defined as the process of taking the node features of the neighbours, transforming them, and "passing" them to the node that aggregates them (the target node). This process is repeated, in parallel, for all nodes in the graph. In that way, all neighbourhoods are examined by the end of this step.

### Node2vec vs. GNN
Previously, we learned how to obtain node embeddings with Node2vec. However, Node2vec learns each embedding as a row of an embedding layer indexed by node ID, **without considering the node features**. A GNN, on the contrary, learns node embeddings via the so-called **Message Passing** function or **Neighbor Aggregation** operation, which obtains a node's embedding by aggregating information from its neighboring nodes.

### A simple GNN
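Let's first build a basic GNN to demonstrate message passing.
The embedding $z^k_i$ of node $i$ after $k$ rounds of message passing is the sum of its neighbours' embeddings from the previous round, where $N(i)$ denotes the set of neighbours of node $i$: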

\begin{equation}
z^k_i = \sum_{j\in N(i)} z^{k-1}_j
\end{equation}

![](https://i.imgur.com/77VBsTd.png)
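The update rule above can be sketched without any library. This is a minimal plain-Python illustration, not how PyTorch Geometric implements it; the helper name `sum_aggregate` and the `(source, target)` edge-list layout are assumptions made for this sketch.

```python
def sum_aggregate(features, edges):
    """One round of sum message passing: z_i = sum of z_j over all edges j -> i."""
    dim = len(features[0])
    out = [[0] * dim for _ in features]
    for src, dst in edges:
        # the source node j "sends" its features to the target node i
        for d in range(dim):
            out[dst][d] += features[src][d]
    return out


features = [[0, 1, 2], [1, 4, 16], [2, 4, 8]]
# undirected edges 0-1 and 1-2, written as directed (source, target) pairs
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]

result = sum_aggregate(features, edges)
print(result)
```

Node 1 has two neighbours, so it receives the sum of the features of nodes 0 and 2, while nodes 0 and 2 each receive only the features of node 1.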

In [100]:
from torch_geometric.nn import MessagePassing

In [101]:
class SimpleGNN(MessagePassing):
    def __init__(self,aggr="sum"):
        super().__init__(aggr=aggr)

    def forward(self, x, edge_index):
        return self.propagate(edge_index,x=x)
    
    def message(self, x_j):
        # x_j holds the features of the source (sender) node of every edge, shape [E, in_channels]
        print("x_j (sender):\n", x_j)
        return x_j

### Bidirectional edges

In [102]:
from torch_geometric.utils import to_undirected
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]])
edge_index = [[0,1],[2,1]]
edge_index = torch.tensor(edge_index).T
edge_index = to_undirected(edge_index)

print(f"Feature : \n{x}")
print(f"Edge_index : \n{edge_index}")

Feature : 
tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])
Edge_index : 
tensor([[0, 1, 1, 2],
        [1, 0, 2, 1]])


In [103]:
model = SimpleGNN()
model
model(x,edge_index)

x_j (sender):
 tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 1,  4, 16],
        [ 2,  4,  8]])


tensor([[ 1,  4, 16],
        [ 2,  5, 10],
        [ 1,  4, 16]])

### Adding self-connection


In [104]:
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]])
edge_index = [[0,1],[2,1]] + [[0,0],[1,1],[2,2]]
edge_index = torch.tensor(edge_index).T

print(f"Feature : \n{x}")
print(f"Edge_index : \n{edge_index}")

Feature : 
tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])
Edge_index : 
tensor([[0, 2, 0, 1, 2],
        [1, 1, 0, 1, 2]])


In [105]:
model = SimpleGNN()
model(x,edge_index)

x_j (sender):
 tensor([[ 0,  1,  2],
        [ 2,  4,  8],
        [ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])


tensor([[ 0,  1,  2],
        [ 3,  9, 26],
        [ 2,  4,  8]])
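
Because the edges 0→1 and 2→1 are directed here, only node 1 receives messages from its neighbours; nodes 0 and 2 receive nothing but their own self-loop message. Writing out the sum for each node (with $x_i$ denoting the input feature of node $i$) reproduces the output above:

\begin{equation}
\begin{aligned}
z_0 &= x_0 = (0, 1, 2) \\
z_1 &= x_0 + x_1 + x_2 = (3, 9, 26) \\
z_2 &= x_2 = (2, 4, 8)
\end{aligned}
\end{equation}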

### More common usage: bidirectional edges with self-connections

In [106]:
from torch_geometric.utils import add_self_loops
# bidirectional edges with self-connection
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]])
edge_index = [[0,1],[2,1]]
print(f"Feature : \n{x}")
print(f"Edge_index : \n{torch.tensor(edge_index).T}")
print()

edge_index = torch.tensor(edge_index).T
edge_index = to_undirected(edge_index)
print(f"After to_undirected : \n{edge_index}")
edge_index, _ = add_self_loops(edge_index)
print(f"After add_self_loops : \n{edge_index}")
print()

model = SimpleGNN()
model(x,edge_index)

Feature : 
tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])
Edge_index : 
tensor([[0, 2],
        [1, 1]])

After to_undirected : 
tensor([[0, 1, 1, 2],
        [1, 0, 2, 1]])
After add_self_loops : 
tensor([[0, 1, 1, 2, 0, 1, 2],
        [1, 0, 2, 1, 0, 1, 2]])

x_j (sender):
 tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 1,  4, 16],
        [ 2,  4,  8],
        [ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])


tensor([[ 1,  5, 18],
        [ 3,  9, 26],
        [ 3,  8, 24]])
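
Conceptually, the two utilities used above add the reverse of every edge and then one self-loop per node. The sketch below mimics that on a plain list of `(source, target)` pairs; the helper names `to_undirected_pairs` and `add_self_loop_pairs` are hypothetical, and the real PyTorch Geometric functions operate on `[2, E]` tensors and support options not shown here.

```python
def to_undirected_pairs(edges):
    """Add the reverse of every edge, drop duplicates, and sort for a stable order."""
    both = set(edges) | {(dst, src) for src, dst in edges}
    return sorted(both)


def add_self_loop_pairs(edges, num_nodes):
    """Append one self-loop (i, i) for every node."""
    return list(edges) + [(i, i) for i in range(num_nodes)]


edges = [(0, 1), (2, 1)]
undirected = to_undirected_pairs(edges)
with_loops = add_self_loop_pairs(undirected, num_nodes=3)

print(undirected)
print(with_loops)
```

Reading the pairs column-wise gives the same source and target rows as the `edge_index` tensors printed above.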

### Different pooling functions
During the message passing stage, you can choose among several pooling (aggregation) functions to control how the information from neighbouring nodes is combined.
**Sum, mean and max pooling** are the most common and widely used operations.
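
The three reductions can be compared in a few lines of plain Python. This is only a sketch under stated assumptions: the helper `aggregate` is hypothetical, edges are `(source, target)` pairs, and every node is assumed to receive at least one message (which holds once self-loops are added).

```python
def aggregate(features, edges, reduce):
    """Collect the incoming messages of each node, then reduce them per feature column."""
    inbox = {i: [] for i in range(len(features))}
    for src, dst in edges:
        inbox[dst].append(features[src])

    reducers = {
        "sum": sum,
        "mean": lambda column: sum(column) / len(column),
        "max": max,
    }
    fn = reducers[reduce]
    # zip(*messages) groups the messages column by column
    return [[fn(column) for column in zip(*inbox[i])] for i in range(len(features))]


features = [[0, 1, 2], [1, 4, 16], [2, 4, 8]]
# undirected edges 0-1 and 1-2 plus one self-loop per node
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (0, 0), (1, 1), (2, 2)]

for name in ("sum", "mean", "max"):
    print(name, aggregate(features, edges, name))
```

Sum keeps information about how many neighbours a node has, mean normalises that away, and max keeps only the largest value per feature.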

In [107]:
# SimpleGNN that forwards keyword arguments (e.g. aggr) to MessagePassing
class SimpleGNN(MessagePassing):
    def __init__(self,**kwargs):
        super().__init__(**kwargs)

    def forward(self, x, edge_index):
        return self.propagate(edge_index,x=x)

In [108]:
# bidirectional edges with self-connection
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]])
edge_index = [[0,1],[2,1]]
print(f"Feature : \n{x}")
edge_index = torch.tensor(edge_index).T
edge_index = to_undirected(edge_index)
edge_index, _ = add_self_loops(edge_index)
print(f"Edge_index : \n{edge_index}")
print()


Feature : 
tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])
Edge_index : 
tensor([[0, 1, 1, 2, 0, 1, 2],
        [1, 0, 2, 1, 0, 1, 2]])



In [109]:
# sum pooling
model = SimpleGNN(aggr="sum")
print("Results of sum pooling...")
print(model(x,edge_index))

Results of sum pooling...
tensor([[ 1,  5, 18],
        [ 3,  9, 26],
        [ 3,  8, 24]])


In [110]:
# mean pooling
model = SimpleGNN(aggr="mean")
print("Results of mean pooling...")
print(model(x,edge_index))

Results of mean pooling...
tensor([[ 0.5000,  2.5000,  9.0000],
        [ 1.0000,  3.0000,  8.6667],
        [ 1.5000,  4.0000, 12.0000]])


In [111]:
# max pooling
model = SimpleGNN(aggr="max")
print("Results of max pooling...")
print(model(x,edge_index))

Results of max pooling...
tensor([[ 1,  4, 16],
        [ 2,  4, 16],
        [ 2,  4, 16]])


### The most powerful technique of GNNs: exploring multi-hop neighbors by stacking GNN layers

In [112]:
# bidirectional edges with self-connection
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]])
edge_index = [[0,1],[2,1]]
print(f"Feature : \n{x}")
edge_index = torch.tensor(edge_index).T
edge_index = to_undirected(edge_index)
edge_index, _ = add_self_loops(edge_index)
print(f"Edge_index : \n{edge_index}")
print()

model = SimpleGNN(aggr="sum")
# first-hop neighbor
print("Embedding after first message passing...")
first_embedding = model(x,edge_index)
print(first_embedding)

# second-hop neighbor
print("Embedding after second message passing...")
second_embedding = model(first_embedding,edge_index)
print(second_embedding)

Feature : 
tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])
Edge_index : 
tensor([[0, 1, 1, 2, 0, 1, 2],
        [1, 0, 2, 1, 0, 1, 2]])

Embedding after first message passing...
tensor([[ 1,  5, 18],
        [ 3,  9, 26],
        [ 3,  8, 24]])
Embedding after second message passing...
tensor([[ 4, 14, 44],
        [ 7, 22, 68],
        [ 6, 17, 50]])
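
With sum aggregation and self-loops, one round of message passing is the same as multiplying the feature matrix by the adjacency matrix with self-loops, so two rounds correspond to its square. The plain-Python sketch below checks this on the example graph; the helper `matmul` and the name `a_hat` (adjacency matrix plus identity) are introduced here for illustration only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# adjacency matrix of the path graph 0 - 1 - 2 with self-loops added
a_hat = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
x = [[0, 1, 2], [1, 4, 16], [2, 4, 8]]

one_hop = matmul(a_hat, x)        # first message passing
two_hop = matmul(a_hat, one_hop)  # second message passing
a_hat_sq = matmul(a_hat, a_hat)   # which nodes reach which within two hops

print(one_hop)
print(two_hop)
print(a_hat_sq)
```

The entry `a_hat_sq[0][2]` is non-zero even though nodes 0 and 2 are not directly connected, which shows that after two layers node 0 also receives information from its two-hop neighbour.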


In [113]:
# another way to implement multi-hop message passing
class SimpleGNN(MessagePassing):
    def __init__(self,n_layers=1,**kwargs):
        super().__init__(**kwargs)
        self.n_layers = n_layers

    def forward(self, x, edge_index):
        for _ in range(self.n_layers):
            x = self.propagate(edge_index,x=x)
        return x

In [114]:
# bidirectional edges with self-connection
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]])
edge_index = [[0,1],[2,1]]
print(f"Feature : \n{x}")
edge_index = torch.tensor(edge_index).T
edge_index = to_undirected(edge_index)
edge_index, _ = add_self_loops(edge_index)
print(f"Edge_index : \n{edge_index}")
print()

# first-hop neighbor
print("Embedding after first message passing...")
model = SimpleGNN(n_layers=1,aggr="sum")
first_embedding = model(x,edge_index)
print(first_embedding)

# second-hop neighbor
print("Embedding after second message passing...")
model = SimpleGNN(n_layers=2,aggr="sum")
second_embedding = model(x,edge_index)
print(second_embedding)

Feature : 
tensor([[ 0,  1,  2],
        [ 1,  4, 16],
        [ 2,  4,  8]])
Edge_index : 
tensor([[0, 1, 1, 2, 0, 1, 2],
        [1, 0, 2, 1, 0, 1, 2]])

Embedding after first message passing...
tensor([[ 1,  5, 18],
        [ 3,  9, 26],
        [ 3,  8, 24]])
Embedding after second message passing...
tensor([[ 4, 14, 44],
        [ 7, 22, 68],
        [ 6, 17, 50]])


### Neural Networks in GNN

In [115]:
# message passing followed by a learnable linear layer and ReLU at every hop
class CoolGNN(MessagePassing):
    def __init__(self,n_layers=1,**kwargs):
        super().__init__(**kwargs)
        self.n_layers = n_layers
        self.nn_layers = nn.ModuleList([nn.Linear(3,3) for _ in range(self.n_layers)])

    def forward(self, x, edge_index):
        for i in range(self.n_layers):
            x = self.nn_layers[i](self.propagate(edge_index,x=x))
            x = torch.relu(x)
        return x

In [116]:
# bidirectional edges with self-connection
x = torch.tensor([[0,1,2],[1,4,16],[2,4,8]]).float()
edge_index = [[0,1],[2,1]]
print(f"Feature : \n{x}")
edge_index = torch.tensor(edge_index).T
edge_index = to_undirected(edge_index)
edge_index, _ = add_self_loops(edge_index)
print(f"Edge_index : \n{edge_index}")
print()

# first-hop neighbor
print("Embedding after first message passing...")
model = CoolGNN(n_layers=1,aggr="sum")
first_embedding = model(x,edge_index)
print(first_embedding)

# second-hop neighbor
print("Embedding after second message passing...")
model = CoolGNN(n_layers=2,aggr="sum")
second_embedding = model(x,edge_index)
print(second_embedding)

Feature : 
tensor([[ 0.,  1.,  2.],
        [ 1.,  4., 16.],
        [ 2.,  4.,  8.]])
Edge_index : 
tensor([[0, 1, 1, 2, 0, 1, 2],
        [1, 0, 2, 1, 0, 1, 2]])

Embedding after first message passing...
tensor([[ 3.6738,  6.2097,  8.1594],
        [ 5.7058,  7.5282, 11.9736],
        [ 4.9478,  6.9464, 10.9615]], grad_fn=<ReluBackward0>)
Embedding after second message passing...
tensor([[2.7559, 0.0000, 1.9131],
        [4.4094, 0.0000, 2.6781],
        [3.1373, 0.0000, 2.0895]], grad_fn=<ReluBackward0>)
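
Written as an update rule, each layer of `CoolGNN` applies a linear map and a ReLU to the aggregated messages. In the equation below, $W^k$ and $b^k$ are notation introduced here for the weight and bias of the $k$-th `nn.Linear(3,3)` layer, and $N(i)$ includes node $i$ itself because self-loops were added:

\begin{equation}
z^k_i = \mathrm{ReLU}\left(W^k \sum_{j\in N(i)} z^{k-1}_j + b^k\right)
\end{equation}

Because the linear layers are randomly initialised and no seed is fixed in the cell above, the printed values change from run to run.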


In [117]:
# using off-the-shelf GNN models
from torch_geometric.nn import GCNConv
model = GCNConv(in_channels=3,out_channels=16)
embedding = model(x,edge_index)
print(embedding)

tensor([[-2.4967,  1.2497, -2.8771,  2.6529, -0.0846, -0.2688,  2.7973,  1.8156,
         -0.1035,  5.0228,  3.1054,  0.1204, -3.3846, -0.0241, -2.5805, -1.5108],
        [-3.1185,  2.0154, -3.4947,  3.4127, -0.0634, -0.3174,  4.0272,  2.2510,
          0.3824,  6.8852,  3.6210,  0.1992, -4.1637, -0.1134, -3.8709, -2.4533],
        [-3.4031,  2.2559, -4.0111,  3.7493,  0.0490, -0.3046,  4.4874,  2.5416,
          0.4344,  7.6852,  3.9839,  0.2340, -4.7661, -0.0743, -4.3400, -2.7084]],
       grad_fn=<AddBackward0>)
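
Unlike the plain sum used so far, a GCN layer weights each message by a symmetric degree normalisation, 1 / sqrt(deg(i) * deg(j)), before a learnable linear transformation is applied. The sketch below illustrates only that normalised aggregation step in plain Python; the helper name `gcn_normalized_aggregate` is hypothetical, the learnable weight matrix and bias are omitted, and self-loops are assumed to be already present in the edge list.

```python
import math


def gcn_normalized_aggregate(features, edges, num_nodes):
    """Sum incoming messages, each scaled by 1 / sqrt(deg(source) * deg(target))."""
    # degree of each node, counted on the graph that already contains self-loops
    deg = [0] * num_nodes
    for _, dst in edges:
        deg[dst] += 1

    dim = len(features[0])
    out = [[0.0] * dim for _ in range(num_nodes)]
    for src, dst in edges:
        weight = 1.0 / math.sqrt(deg[src] * deg[dst])
        for d in range(dim):
            out[dst][d] += weight * features[src][d]
    return out


features = [[0, 1, 2], [1, 4, 16], [2, 4, 8]]
# undirected edges 0-1 and 1-2 plus one self-loop per node
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (0, 0), (1, 1), (2, 2)]

normalized = gcn_normalized_aggregate(features, edges, num_nodes=3)
for row in normalized:
    print([round(value, 4) for value in row])
```

Compared with the unnormalised sum, messages that involve high-degree nodes are scaled down, which keeps the magnitude of the embeddings from growing with the number of neighbours.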


---