> RGNN

ref : [official](https://pytorch-geometric.readthedocs.io/en/latest/get_started/colabs.html#official-examples), [vlog](https://medium.com/watcha/gnn-%EC%86%8C%EA%B0%9C-%EA%B8%B0%EC%B4%88%EB%B6%80%ED%84%B0-%EB%85%BC%EB%AC%B8%EA%B9%8C%EC%A7%80-96567b783479), [IEEE](https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4700287)

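My own summary: a learning process in which nodes pass messages to one another and update their states, repeating the update of the state x until a convergence condition is satisfied.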

# Recurrent Graph Neural Network

*Based on* [Banach Fixed Point Thm.](https://en.wikipedia.org/wiki/Banach_fixed-point_theorem)

::: {.callout-tip}
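Let $(X,d)$ be a complete metric space.

Let $T : X \to X$ be a contraction mapping with contraction constant^[an upper bound on how much $T$ may scale distances] (Lipschitz continuity) $k \in [0,1)$.

That is, $d(T(x),T(y)) \le k\, d(x,y)$ for all $x, y \in X$. Then:

1. $T$ has a unique fixed point $\bar{x} \in X$.

2. For any $x \in X$, $\lim_{n \to \infty} T^n (x) = \bar{x}$.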
:::
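
The theorem can be checked numerically with a minimal sketch. The map `T(x) = 0.5 * x + 1` and the helper `fixed_point_iterate` are illustrative assumptions, not part of the references above; `T` is a contraction on the real line with $k = 0.5$ and fixed point $\bar{x} = 2$.

```python
def fixed_point_iterate(T, x0, eps=1e-10, max_iter=1000):
    """Apply T repeatedly until two successive iterates differ by less than eps."""
    x = x0
    for n in range(1, max_iter + 1):
        x_next = T(x)
        if abs(x_next - x) < eps:
            return x_next, n
        x = x_next
    return x, max_iter


def T(x):
    # |T(x) - T(y)| = 0.5 * |x - y|, so the contraction constant is k = 0.5
    return 0.5 * x + 1.0


# Different starting points should all end up at the same fixed point.
results = [fixed_point_iterate(T, start) for start in (-10.0, 0.0, 100.0)]
for x_bar, steps in results:
    print(x_bar, steps)
```

Every starting point reaches the same limit, which is the behavior the recurrent GNN relies on when it iterates node states to convergence.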


$$x_{v}^{t+1} = f_w(l_v , l_{co(v)}, x^t_{ne(v)}, l_{ne(v)})$$

- $l_v$: the features of node $v$
- $l_{co(v)}$: the features of the edges connected to $v$
- $x^t_{ne(v)}$: the states at step $t$ of the nodes neighboring $v$
- $l_{ne(v)}$: the features of the nodes neighboring $v$

$$o^t_v = g_w(x^t_v, l_v)$$
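
As a rough illustration of the two equations, the sketch below iterates a simplified transition function on a toy graph until the states stop changing, then applies a linear readout. Everything in it is an assumption made for the example: the 4-node path graph, the weight matrices `W_l`, `W_x`, `W_o`, the choice of `tanh`, and the omission of edge features. `W_x` is rescaled so that the update is a contraction, which is what guarantees convergence here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy graph: 4 nodes on a path 0-1-2-3 (dense adjacency matrix).
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
labels = rng.normal(size=(4, 3))      # l_v: one feature vector per node
state_dim = 2

# Hypothetical parameters of f_w. Rescaling W_x so that
# ||A||_2 * ||W_x||_2 = 0.5 < 1 makes the update a contraction in x,
# because tanh is 1-Lipschitz.
W_l = rng.normal(size=(3, state_dim))
W_x = rng.normal(size=(state_dim, state_dim))
W_x *= 0.5 / (np.linalg.norm(A, 2) * np.linalg.norm(W_x, 2))


def f_w(x):
    # x_v^{t+1} = tanh(l_v W_l + sum over neighbors u of x_u^t W_x)
    return np.tanh(labels @ W_l + A @ x @ W_x)


x = np.zeros((4, state_dim))
for t in range(200):
    x_new = f_w(x)
    converged = np.linalg.norm(x_new - x) < 1e-8
    x = x_new
    if converged:
        break

# o_v = g_w(x_v, l_v), here a single linear layer on [x_v, l_v]
W_o = rng.normal(size=(state_dim + 3, 2))
o = np.concatenate([x, labels], axis=1) @ W_o
print(t, o.shape)
```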

# Tutorial 9: Recurrent GNNs


In this tutorial we will implement an approximation of the Graph Neural Network Model (without enforcing a contraction map) and analyze the `GatedGraphConv` layer of PyTorch Geometric.

## Import

In [1]:
import os.path as osp
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch_geometric.transforms as T
import torch_geometric
from torch_geometric.datasets import Planetoid, TUDataset
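from torch_geometric.loader import DataLoader  # torch_geometric.data.DataLoader is deprecated in PyG 2.x
from torch_geometric.nn.inits import uniform
from torch.nn import Parameter as Param
from torch import Tensor
from torch_geometric.nn.conv import MessagePassing
try:
    from torch_sparse import matmul  # only used by message_and_aggregate (SparseTensor input)
except ImportError:
    matmul = None  # torch_sparse is optional; the edge_index path used below does not need it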

In [2]:
torch.manual_seed(42)

<torch._C.Generator at 0x7f64b9c6d470>

In [3]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device = "cpu"  # override: run on the CPU even when a GPU is available

In [4]:
dataset = 'Cora'
transform = T.Compose([
    T.RandomNodeSplit('train_rest', num_val=500, num_test=500),
    T.TargetIndegree(),
])
path = osp.join('data', dataset)
dataset = Planetoid(path, dataset, transform=transform)
data = dataset[0]

In [6]:
dataset = 'Cora'
path = osp.join('data', dataset)
dataset = Planetoid(path, dataset, transform=T.NormalizeFeatures())
data = dataset[0]
data = data.to(device)

$$x_{v}^{t+1} = f_w(l_v , l_{co(v)}, x^t_{ne(v)}, l_{ne(v)})$$

$$o^t_v = g_w(x^t_v, l_v)$$

## Multi Layer Perceptron

The MLP class is used to instantiate the transition and output functions as simple feed-forward networks.

In [7]:
class MLP(nn.Module):
    def __init__(self, input_dim, hid_dims, out_dim):
        super(MLP, self).__init__()

        self.mlp = nn.Sequential()
        dims = [input_dim] + hid_dims + [out_dim]
        for i in range(len(dims)-1):
            self.mlp.add_module('lay_{}'.format(i),nn.Linear(in_features=dims[i], out_features=dims[i+1]))
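            # Add a Tanh after every layer except the last one
            if i+2 < len(dims):
                self.mlp.add_module('act_{}'.format(i), nn.Tanh())

    def reset_parameters(self):
        # Xavier-initialize the weight of every linear layer (biases keep their defaults)
        for i, l in enumerate(self.mlp):
            if isinstance(l, nn.Linear):
                nn.init.xavier_normal_(l.weight)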

    def forward(self, x):
        return self.mlp(x)



## Graph Neural Network MessagePassing

The GNNM class puts together the state propagation and the readout of the nodes' states.

In [8]:
class GNNM(MessagePassing):
    def __init__(self, n_nodes, out_channels, features_dim, hid_dims, num_layers = 50, eps=1e-3, aggr = 'add',
                 bias = True, **kwargs):
        super(GNNM, self).__init__(aggr=aggr, **kwargs)

        self.node_states = Param(torch.zeros((n_nodes, features_dim)), requires_grad=False)
        self.out_channels = out_channels
        self.eps = eps
        self.num_layers = num_layers
        
        self.transition = MLP(features_dim, hid_dims, features_dim)
        self.readout = MLP(features_dim, hid_dims, out_channels)
        
        self.reset_parameters()
        print(self.transition)
        print(self.readout)

    def reset_parameters(self):
        self.transition.reset_parameters()
        self.readout.reset_parameters()
        
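    def forward(self):
        # Note: the graph is read from the global `data` object
        edge_index = data.edge_index
        edge_weight = data.edge_attr
        node_states = self.node_states
        for i in range(self.num_layers):
            # Sum the (optionally weighted) states of each node's neighbors
            m = self.propagate(edge_index, x=node_states, edge_weight=edge_weight,
                               size=None)
            # Transition function f_w: aggregated messages -> new states
            new_states = self.transition(m)
            # Stop early once every node's state has moved by less than eps
            with torch.no_grad():
                distance = torch.norm(new_states - node_states, dim=1)
                convergence = distance < self.eps
            node_states = new_states
            if convergence.all():
                break

        # Readout function g_w: final states -> class scores
        out = self.readout(node_states)

        return F.log_softmax(out, dim=-1)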

    def message(self, x_j, edge_weight):
        return x_j if edge_weight is None else edge_weight.view(-1, 1) * x_j

    def message_and_aggregate(self, adj_t, x) :
        return matmul(adj_t, x, reduce=self.aggr)

    def __repr__(self):
        return '{}({}, num_layers={})'.format(self.__class__.__name__,
                                              self.out_channels,
                                              self.num_layers)
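
What `message` and `aggr='add'` compute together can be sketched without PyTorch Geometric. The snippet below is only an illustration under stated assumptions: the 3-node graph, its edge weights and the node states are made up, and NumPy stands in for the tensor operations. It follows the default `source_to_target` flow, where row 0 of `edge_index` holds the source nodes `j` and row 1 the target nodes `i`.

```python
import numpy as np

# Hypothetical toy inputs: 3 nodes, 4 directed edges (source -> target).
edge_index = np.array([[0, 1, 2, 2],    # source nodes j
                       [1, 0, 0, 1]])   # target nodes i
edge_weight = np.array([1.0, 0.5, 2.0, 1.0])
x = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

src, dst = edge_index
# message(): one message per edge, edge_weight * x_j
messages = edge_weight[:, None] * x[src]
# aggr='add': sum the incoming messages of each target node
out = np.zeros_like(x)
np.add.at(out, dst, messages)
print(out)
```

Node 2 has no incoming edge, so its aggregated message stays at zero.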

In [9]:
model = GNNM(data.num_nodes, dataset.num_classes, 32, [64,64,64,64,64], eps=0.01).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

MLP(
  (mlp): Sequential(
    (lay_0): Linear(in_features=32, out_features=64, bias=True)
    (act_0): Tanh()
    (lay_1): Linear(in_features=64, out_features=64, bias=True)
    (act_1): Tanh()
    (lay_2): Linear(in_features=64, out_features=64, bias=True)
    (act_2): Tanh()
    (lay_3): Linear(in_features=64, out_features=64, bias=True)
    (act_3): Tanh()
    (lay_4): Linear(in_features=64, out_features=64, bias=True)
    (act_4): Tanh()
    (lay_5): Linear(in_features=64, out_features=32, bias=True)
  )
)
MLP(
  (mlp): Sequential(
    (lay_0): Linear(in_features=32, out_features=64, bias=True)
    (act_0): Tanh()
    (lay_1): Linear(in_features=64, out_features=64, bias=True)
    (act_1): Tanh()
    (lay_2): Linear(in_features=64, out_features=64, bias=True)
    (act_2): Tanh()
    (lay_3): Linear(in_features=64, out_features=64, bias=True)
    (act_3): Tanh()
    (lay_4): Linear(in_features=64, out_features=64, bias=True)
    (act_4): Tanh()
    (lay_5): Linear(in_features=64, ou

In [10]:
test_dataset = dataset[:len(dataset) // 10]
train_dataset = dataset[len(dataset) // 10:]
test_loader = DataLoader(test_dataset)
train_loader = DataLoader(train_dataset)



In [11]:
def train():
    model.train()
    optimizer.zero_grad()
    loss_fn(model()[data.train_mask], data.y[data.train_mask]).backward()
    optimizer.step()


def test():
    model.eval()
    logits, accs = model(), []
    for _, mask in data('train_mask', 'val_mask', 'test_mask'):
        pred = logits[mask].max(1)[1]
        acc = pred.eq(data.y[mask]).sum().item() / mask.sum().item()
        accs.append(acc)
    return accs


for epoch in range(1, 51):
    train()
    accs = test()
    train_acc = accs[0]
    val_acc = accs[1]
    test_acc = accs[2]
    print('Epoch: {:03d}, Train Acc: {:.5f}, '
          'Val Acc: {:.5f}, Test Acc: {:.5f}'.format(epoch, train_acc,
                                                       val_acc, test_acc))

Epoch: 001, Train Acc: 0.12857, Val Acc: 0.14200, Test Acc: 0.13700
Epoch: 002, Train Acc: 0.13571, Val Acc: 0.15800, Test Acc: 0.14200
Epoch: 003, Train Acc: 0.08571, Val Acc: 0.09000, Test Acc: 0.06600
Epoch: 004, Train Acc: 0.14286, Val Acc: 0.20800, Test Acc: 0.20800
Epoch: 005, Train Acc: 0.15714, Val Acc: 0.27400, Test Acc: 0.28600
Epoch: 006, Train Acc: 0.11429, Val Acc: 0.24000, Test Acc: 0.24000
Epoch: 007, Train Acc: 0.15000, Val Acc: 0.17800, Test Acc: 0.18000
Epoch: 008, Train Acc: 0.17143, Val Acc: 0.12400, Test Acc: 0.11800
Epoch: 009, Train Acc: 0.17143, Val Acc: 0.07600, Test Acc: 0.07900
Epoch: 010, Train Acc: 0.14286, Val Acc: 0.05800, Test Acc: 0.06700
Epoch: 011, Train Acc: 0.15714, Val Acc: 0.10200, Test Acc: 0.09900
Epoch: 012, Train Acc: 0.18571, Val Acc: 0.12000, Test Acc: 0.11300
Epoch: 013, Train Acc: 0.17857, Val Acc: 0.11600, Test Acc: 0.11400
Epoch: 014, Train Acc: 0.18571, Val Acc: 0.12800, Test Acc: 0.11900
Epoch: 015, Train Acc: 0.22143, Val Acc: 0.10800

## Gated Graph Neural Network

In [12]:
class GatedGraphConv(MessagePassing):
    
    def __init__(self, out_channels, num_layers, aggr = 'add',
                 bias = True, **kwargs):
        super(GatedGraphConv, self).__init__(aggr=aggr, **kwargs)

        self.out_channels = out_channels
        self.num_layers = num_layers

        self.weight = Param(Tensor(num_layers, out_channels, out_channels))
        self.rnn = torch.nn.GRUCell(out_channels, out_channels, bias=bias)

        self.reset_parameters()

    def reset_parameters(self):
        uniform(self.out_channels, self.weight)
        self.rnn.reset_parameters()

    def forward(self, data):
        """Run `num_layers` rounds of gated message passing over `data`."""
        x = data.x
        edge_index = data.edge_index
        edge_weight = data.edge_attr
        if x.size(-1) > self.out_channels:
            raise ValueError('The number of input channels is not allowed to '
                             'be larger than the number of output channels')

        # Zero-pad the input features up to out_channels
        if x.size(-1) < self.out_channels:
            zero = x.new_zeros(x.size(0), self.out_channels - x.size(-1))
            x = torch.cat([x, zero], dim=1)

        for i in range(self.num_layers):
            # Layer-specific linear map, then a (weighted) sum over neighbors
            m = torch.matmul(x, self.weight[i])
            m = self.propagate(edge_index, x=m, edge_weight=edge_weight,
                               size=None)
            # GRU update: the message is the input, the previous state is the hidden state
            x = self.rnn(m, x)

        return x

    def message(self, x_j, edge_weight):
        return x_j if edge_weight is None else edge_weight.view(-1, 1) * x_j

    def message_and_aggregate(self, adj_t, x):
        return matmul(adj_t, x, reduce=self.aggr)

    def __repr__(self):
        return '{}({}, num_layers={})'.format(self.__class__.__name__,
                                              self.out_channels,
                                              self.num_layers)

class GGNN(torch.nn.Module):
    def __init__(self):
        super(GGNN, self).__init__()
        
        self.conv = GatedGraphConv(1433, 3)
        self.mlp = MLP(1433, [32,32,32], dataset.num_classes)
        
    def forward(self):
        x = self.conv(data)
        x = self.mlp(x)
        return F.log_softmax(x, dim=-1)
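
Read off from the `forward` method of `GatedGraphConv` above, one round of the layer can be written as follows. The notation is an assumption chosen to match the earlier equations: $h_v^{(l)}$ is the state of node $v$ after $l$ rounds, $W^{(l)}$ is `self.weight[l]`, and $e_{u,v}$ is the edge weight, taken as $1$ when `edge_attr` is `None`.

$$h_v^{(0)} = [\, x_v \,\|\, \mathbf{0} \,]$$

$$m_v^{(l+1)} = \sum_{u \in ne(v)} e_{u,v} \, h_u^{(l)} W^{(l)}$$

$$h_v^{(l+1)} = \mathrm{GRU}\left(m_v^{(l+1)},\, h_v^{(l)}\right)$$

Unlike the fixed-point iteration of the original model, the number of rounds is fixed to `num_layers`, and each round has its own weight matrix.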


In [13]:
device = "cpu"
model = GGNN().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()

In [14]:
test_dataset = dataset[:len(dataset) // 10]
train_dataset = dataset[len(dataset) // 10:]
test_loader = DataLoader(test_dataset)
train_loader = DataLoader(train_dataset)

In [15]:
def train():
    model.train()
    optimizer.zero_grad()
    loss_fn(model()[data.train_mask], data.y[data.train_mask]).backward()
    optimizer.step()


def test():
    model.eval()
    logits, accs = model(), []
    for _, mask in data('train_mask', 'val_mask', 'test_mask'):
        pred = logits[mask].max(1)[1]
        acc = pred.eq(data.y[mask]).sum().item() / mask.sum().item()
        accs.append(acc)
    return accs


for epoch in range(1, 51):
    train()
    accs = test()
    train_acc = accs[0]
    val_acc = accs[1]
    test_acc = accs[2]
    print('Epoch: {:03d}, Train Acc: {:.5f}, '
          'Val Acc: {:.5f}, Test Acc: {:.5f}'.format(epoch, train_acc,
                                                       val_acc, test_acc))

Epoch: 001, Train Acc: 0.15000, Val Acc: 0.16400, Test Acc: 0.16700
Epoch: 002, Train Acc: 0.14286, Val Acc: 0.12200, Test Acc: 0.13000
Epoch: 003, Train Acc: 0.32857, Val Acc: 0.22600, Test Acc: 0.23500
Epoch: 004, Train Acc: 0.37143, Val Acc: 0.27600, Test Acc: 0.27800
Epoch: 005, Train Acc: 0.41429, Val Acc: 0.29200, Test Acc: 0.31600
Epoch: 006, Train Acc: 0.50000, Val Acc: 0.33600, Test Acc: 0.36400
Epoch: 007, Train Acc: 0.52857, Val Acc: 0.33200, Test Acc: 0.34000
Epoch: 008, Train Acc: 0.54286, Val Acc: 0.34000, Test Acc: 0.35400
Epoch: 009, Train Acc: 0.55714, Val Acc: 0.34800, Test Acc: 0.36000
Epoch: 010, Train Acc: 0.56429, Val Acc: 0.37200, Test Acc: 0.37300
Epoch: 011, Train Acc: 0.61429, Val Acc: 0.42800, Test Acc: 0.41300
Epoch: 012, Train Acc: 0.64286, Val Acc: 0.45000, Test Acc: 0.44700
Epoch: 013, Train Acc: 0.79286, Val Acc: 0.56000, Test Acc: 0.56200
Epoch: 014, Train Acc: 0.82857, Val Acc: 0.58600, Test Acc: 0.58300
Epoch: 015, Train Acc: 0.77857, Val Acc: 0.57200

## Additional review

In [16]:
MessagePassing?

Init signature:
MessagePassing(
    aggr: Union[str, List[str], torch_geometric.nn.aggr.base.Aggregation, NoneType] = 'add',
    *,
    aggr_kwargs: Union[Dict[str, Any], NoneType] = None,
    flow: str = 'source_to_target',
    node_dim: int = -2,
    decomposed_layers

matmul

In [25]:
import numpy as np

# Create two matrices
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

# Multiply the two matrices with matmul
C = np.matmul(A, B)

# Print the result
print(C)
# [[1*5+2*7  1*6+2*8]
#  [3*5+4*7  3*6+4*8]]

[[19 22]
 [43 50]]
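
The reason `matmul` matters for message passing is that multiplying an adjacency matrix by the node-feature matrix sums the features of each node's neighbors, which is what `message_and_aggregate` does with a sparse adjacency. The sketch below checks this on a made-up 3-node graph; the dense `adj` matrix and the explicit loop are illustrative assumptions, not how PyTorch Geometric stores or processes graphs.

```python
import numpy as np

# Hypothetical graph: adj[i, j] = 1 means node j sends a message to node i.
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [0, 0, 0]])
x = np.array([[1, 0],
              [0, 1],
              [1, 1]])

# Aggregation as a single matrix product
via_matmul = np.matmul(adj, x)

# The same aggregation written as an explicit sum over neighbors
via_loop = np.zeros_like(x)
for i in range(adj.shape[0]):
    for j in range(adj.shape[1]):
        if adj[i, j]:
            via_loop[i] += x[j]

print(via_matmul)
print((via_matmul == via_loop).all())
```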
