## Graph AutoEncoders (GAE) & Variational Graph AutoEncoders (VGAE)

https://www.youtube.com/watch?v=qA6U4nIK62E
Reference paper: https://arxiv.org/pdf/1611.07308.pdf

In [None]:
import os
import torch
print(torch.__version__)  # the PyG wheel index below should match this torch/CUDA build
!pip install torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric -f https://data.pyg.org/whl/torch-1.12.0+cpu.html

1.12.1+cu113


In [None]:
import torch
from torch_geometric.datasets import Planetoid
import torch_geometric.transforms as T
from torch_geometric.nn import GCNConv
from torch_geometric.utils import train_test_split_edges

# 1. Graph AutoEncoders (GAE)



**CLASS GAE(encoder, decoder=None)**

The Graph Auto-Encoder model from the "Variational Graph Auto-Encoders" paper, built from user-defined encoder and decoder models.

PARAMETERS
- encoder (Module) - The encoder module.
- decoder (Module, optional) - The decoder module. If set to **None**, it defaults to **torch_geometric.nn.models.InnerProductDecoder**.
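
To make the default decoder concrete: an inner-product decoder scores a candidate edge (i, j) as sigmoid(z_i · z_j), so nodes with similar embeddings get a high edge probability. The sketch below re-implements that idea in plain PyTorch. The function name `inner_product_scores` and the toy embeddings are illustrative assumptions, not part of the PyG API.

```python
import torch

def inner_product_scores(z, edge_index):
    """Score each edge (i, j) as sigmoid(z_i . z_j)."""
    src, dst = edge_index  # first row: source nodes, second row: target nodes
    return torch.sigmoid((z[src] * z[dst]).sum(dim=1))

# toy embeddings (hypothetical): nodes 0 and 1 are alike, node 2 points the other way
z = torch.tensor([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
edge_index = torch.tensor([[0, 0], [1, 2]])  # candidate edges 0->1 and 0->2
scores = inner_product_scores(z, edge_index)
print(scores)  # the first score is above 0.5, the second below 0.5
```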



1.1 Load Data

In [None]:
dataset = Planetoid("data", "CiteSeer", transform=T.NormalizeFeatures())  # first argument: root folder for the download
dataset.data

Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.x
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.tx
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.allx
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.y
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.ty
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.ally
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.graph
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.citeseer.test.index
Processing...
Done!


Data(x=[3327, 3703], edge_index=[2, 9104], y=[3327], train_mask=[3327], val_mask=[3327], test_mask=[3327])

In [None]:
data = dataset[0]
# clear the train, validation and test node masks (not needed for link prediction)
data.train_mask = data.val_mask = data.test_mask = None
data

Data(x=[3327, 3703], edge_index=[2, 9104], y=[3327])

In [None]:
# split the edges into train/validation/test sets and sample negative edges for validation and test
data = train_test_split_edges(data)



In [None]:
data

Data(x=[3327, 3703], y=[3327], val_pos_edge_index=[2, 227], test_pos_edge_index=[2, 455], train_pos_edge_index=[2, 7740], train_neg_adj_mask=[3327, 3327], val_neg_edge_index=[2, 227], test_neg_edge_index=[2, 455])
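
The edge counts in this output can be reproduced by hand. The sketch below assumes the split used the default ratios of `train_test_split_edges` (`val_ratio=0.05`, `test_ratio=0.1`), that every edge is stored in both directions, and that there are no self-loops. Under those assumptions the split works on undirected edges and the training edges are made bidirectional again afterwards.

```python
import math

num_directed_edges = 9104          # edge_index=[2, 9104] before the split
val_ratio, test_ratio = 0.05, 0.1  # assumed default ratios

num_undirected = num_directed_edges // 2               # each edge counted once
n_val = math.floor(val_ratio * num_undirected)         # validation positives
n_test = math.floor(test_ratio * num_undirected)       # test positives
n_train_directed = 2 * (num_undirected - n_val - n_test)  # stored in both directions

print(n_val, n_test, n_train_directed)  # prints 227 455 7740
```

These values match `val_pos_edge_index=[2, 227]`, `test_pos_edge_index=[2, 455]` and `train_pos_edge_index=[2, 7740]` above.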

1.2 Define the Encoder

In [None]:
class GCNEncoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super(GCNEncoder, self).__init__()
        # graph convolutional network
        self.conv1 = GCNConv(in_channels, 2 * out_channels, cached=True) # cached only for transductive learning
        self.conv2 = GCNConv(2 * out_channels, out_channels, cached=True) # cached only for transductive learning

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv2(x, edge_index)
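
For intuition about what each `GCNConv` layer in the encoder computes, here is a minimal dense sketch of one GCN propagation step, D^-1/2 (A + I) D^-1/2 X W. It is an illustration only: `gcn_layer_sketch` and the toy graph are hypothetical, the bias term is omitted, and the real layer works on sparse `edge_index` input.

```python
import torch

def gcn_layer_sketch(x, adj, weight):
    """One dense GCN propagation step: D^-1/2 (A + I) D^-1/2 X W."""
    adj_hat = adj + torch.eye(adj.size(0))       # add self-loops
    deg_inv_sqrt = adj_hat.sum(dim=1).pow(-0.5)  # diagonal of D^-1/2
    norm_adj = deg_inv_sqrt[:, None] * adj_hat * deg_inv_sqrt[None, :]
    return norm_adj @ x @ weight

# toy path graph 0 - 1 - 2 with one-hot node features
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])
x = torch.eye(3)
weight = torch.ones(3, 2)  # hypothetical weights: 3 input features, 2 output channels
out = gcn_layer_sketch(x, adj, weight)
print(out.shape)
```

The encoder above stacks two such steps with a ReLU in between, so each node embedding mixes information from its two-hop neighbourhood.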

In [None]:
data.x

tensor([[0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.],
        [0., 0., 0.,  ..., 0., 0., 0.]])

1.3 Define the Autoencoder

In [None]:
from torch_geometric.nn import GAE

In [None]:
# parameters
out_channels = 2
num_features = dataset.num_features
epochs = 100

# model
model = GAE(GCNEncoder(num_features, out_channels))

# move to GPU (if available)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
x = data.x.to(device)
train_pos_edge_index = data.train_pos_edge_index.to(device)

# initialize the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

In [None]:
model

GAE(
  (encoder): GCNEncoder(
    (conv1): GCNConv(3703, 4)
    (conv2): GCNConv(4, 2)
  )
  (decoder): InnerProductDecoder()
)

In [None]:
def train():
    model.train()
    optimizer.zero_grad()
    z = model.encode(x, train_pos_edge_index)
    loss = model.recon_loss(z, train_pos_edge_index)
    #if args.variational:
    #   loss = loss + (1 / data.num_nodes) * model.kl_loss()
    loss.backward()
    optimizer.step()
    return float(loss)


def test(pos_edge_index, neg_edge_index):
    model.eval()
    with torch.no_grad():
        z = model.encode(x, train_pos_edge_index)
    return model.test(z, pos_edge_index, neg_edge_index)
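
The reconstruction loss used in `train()` is a binary cross-entropy over edges: observed (positive) edges should get a score near 1 and non-edges (negatives) a score near 0. The sketch below shows that idea in plain PyTorch. It is an approximation under stated assumptions: `recon_loss_sketch` is a hypothetical name, negatives are passed in explicitly here, whereas `model.recon_loss` samples them itself when none are given.

```python
import torch

EPS = 1e-15  # small constant that keeps log() finite

def recon_loss_sketch(z, pos_edge_index, neg_edge_index):
    def scores(edge_index):
        src, dst = edge_index
        return torch.sigmoid((z[src] * z[dst]).sum(dim=1))

    pos_loss = -torch.log(scores(pos_edge_index) + EPS).mean()      # push positives toward 1
    neg_loss = -torch.log(1 - scores(neg_edge_index) + EPS).mean()  # push negatives toward 0
    return pos_loss + neg_loss

# toy embeddings (hypothetical): nodes 0 and 1 are alike, node 2 is different
z = torch.tensor([[2.0, 0.0], [2.0, 0.0], [-2.0, 0.0]])
pos = torch.tensor([[0], [1]])  # an observed edge between similar nodes
neg = torch.tensor([[0], [2]])  # a non-edge between dissimilar nodes

good = recon_loss_sketch(z, pos, neg)
bad = recon_loss_sketch(z, neg, pos)  # labels swapped on purpose
print(float(good), float(bad))        # the swapped case gives a much larger loss
```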

In [None]:
for epoch in range(1, epochs + 1):
    loss = train()

    auc, ap = test(data.test_pos_edge_index, data.test_neg_edge_index)
    print('Epoch: {:03d}, AUC: {:.4f}, AP: {:.4f}'.format(epoch, auc, ap))

Epoch: 001, AUC: 0.6242, AP: 0.6535
Epoch: 002, AUC: 0.6663, AP: 0.7081
Epoch: 003, AUC: 0.6633, AP: 0.7040
Epoch: 004, AUC: 0.6683, AP: 0.7069
Epoch: 005, AUC: 0.6730, AP: 0.7114
Epoch: 006, AUC: 0.6762, AP: 0.7152
Epoch: 007, AUC: 0.6782, AP: 0.7190
Epoch: 008, AUC: 0.6809, AP: 0.7231
Epoch: 009, AUC: 0.6816, AP: 0.7243
Epoch: 010, AUC: 0.6817, AP: 0.7246
Epoch: 011, AUC: 0.6813, AP: 0.7244
Epoch: 012, AUC: 0.6812, AP: 0.7250
Epoch: 013, AUC: 0.6814, AP: 0.7264
Epoch: 014, AUC: 0.6822, AP: 0.7287
Epoch: 015, AUC: 0.6828, AP: 0.7308
Epoch: 016, AUC: 0.6838, AP: 0.7329
Epoch: 017, AUC: 0.6841, AP: 0.7344
Epoch: 018, AUC: 0.6846, AP: 0.7362
Epoch: 019, AUC: 0.6848, AP: 0.7378
Epoch: 020, AUC: 0.6851, AP: 0.7397
Epoch: 021, AUC: 0.6854, AP: 0.7414
Epoch: 022, AUC: 0.6861, AP: 0.7437
Epoch: 023, AUC: 0.6864, AP: 0.7453
Epoch: 024, AUC: 0.6871, AP: 0.7472
Epoch: 025, AUC: 0.6882, AP: 0.7491
Epoch: 026, AUC: 0.6886, AP: 0.7504
Epoch: 027, AUC: 0.6897, AP: 0.7521
Epoch: 028, AUC: 0.6907, AP:
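
As a reminder of what the AUC column measures: it is the probability that a randomly chosen positive (held-out) edge gets a higher score than a randomly chosen negative edge, so 0.5 is chance level and 1.0 is a perfect ranking. The pure-Python sketch below computes it by brute force on made-up scores. `pairwise_auc` is a hypothetical helper for illustration, not the routine that `model.test` calls.

```python
def pairwise_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# made-up decoder scores for three positive and three negative edges
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
auc = pairwise_auc(pos_scores, neg_scores)
print(auc)  # 8 of the 9 pairs are ranked correctly
```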

In [None]:
Z = model.encode(x, train_pos_edge_index)
Z

tensor([[ 0.8494, -0.4407],
        [-0.3673, -1.1666],
        [ 0.3398,  1.1432],
        ...,
        [ 0.4383, -1.2611],
        [ 0.9726,  0.3775],
        [ 1.4920,  0.6976]], grad_fn=<AddBackward0>)

Printed one line per epoch, are the AUC and AP values easy to read and compare?

1.4 Use Tensorboard

In [None]:
from torch.utils.tensorboard import SummaryWriter

In [None]:
# parameters
out_channels = 2
num_features = dataset.num_features
epochs = 100

# model
model = GAE(GCNEncoder(num_features, out_channels))

# move to GPU (if available)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
x = data.x.to(device)
train_pos_edge_index = data.train_pos_edge_index.to(device)

# initialize the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

1.5 Log Metrics with TensorBoard

In [None]:
writer = SummaryWriter('runs/GAE1_experiment_'+'2d_100_epochs')

In [None]:
for epoch in range(1, epochs + 1):
    loss = train()
    auc, ap = test(data.test_pos_edge_index, data.test_neg_edge_index)
    print('Epoch: {:03d}, AUC: {:.4f}, AP: {:.4f}'.format(epoch, auc, ap))
    
    
    # the metrics are computed on the held-out test edges, so tag them accordingly
    writer.add_scalar('auc test', auc, epoch)  # new line
    writer.add_scalar('ap test', ap, epoch)    # new line

Epoch: 001, AUC: 0.6516, AP: 0.6974
Epoch: 002, AUC: 0.6589, AP: 0.7021
Epoch: 003, AUC: 0.6611, AP: 0.7021
Epoch: 004, AUC: 0.6650, AP: 0.7050
Epoch: 005, AUC: 0.6690, AP: 0.7096
Epoch: 006, AUC: 0.6729, AP: 0.7140
Epoch: 007, AUC: 0.6743, AP: 0.7162
Epoch: 008, AUC: 0.6753, AP: 0.7181
Epoch: 009, AUC: 0.6762, AP: 0.7212
Epoch: 010, AUC: 0.6770, AP: 0.7240
Epoch: 011, AUC: 0.6777, AP: 0.7275
Epoch: 012, AUC: 0.6783, AP: 0.7313
Epoch: 013, AUC: 0.6790, AP: 0.7350
Epoch: 014, AUC: 0.6792, AP: 0.7383
Epoch: 015, AUC: 0.6781, AP: 0.7405
Epoch: 016, AUC: 0.6782, AP: 0.7429
Epoch: 017, AUC: 0.6777, AP: 0.7442
Epoch: 018, AUC: 0.6774, AP: 0.7456
Epoch: 019, AUC: 0.6772, AP: 0.7464
Epoch: 020, AUC: 0.6768, AP: 0.7466
Epoch: 021, AUC: 0.6768, AP: 0.7474
Epoch: 022, AUC: 0.6771, AP: 0.7482
Epoch: 023, AUC: 0.6777, AP: 0.7494
Epoch: 024, AUC: 0.6790, AP: 0.7507
Epoch: 025, AUC: 0.6813, AP: 0.7522
Epoch: 026, AUC: 0.6854, AP: 0.7546
Epoch: 027, AUC: 0.6909, AP: 0.7571
Epoch: 028, AUC: 0.6995, AP:

# 2. Variational Graph AutoEncoders (VGAE)

In [None]:
from torch_geometric.nn import VGAE

In [None]:
dataset = Planetoid("data", "CiteSeer", transform=T.NormalizeFeatures())  # first argument: root folder for the download
data = dataset[0]
data.train_mask = data.val_mask = data.test_mask = data.y = None
data = train_test_split_edges(data)


class VariationalGCNEncoder(torch.nn.Module):
    def __init__(self, in_channels, out_channels):
        super(VariationalGCNEncoder, self).__init__()
        self.conv1 = GCNConv(in_channels, 2 * out_channels, cached=True) # cached only for transductive learning
        self.conv_mu = GCNConv(2 * out_channels, out_channels, cached=True)
        self.conv_logstd = GCNConv(2 * out_channels, out_channels, cached=True)

    def forward(self, x, edge_index):
        x = self.conv1(x, edge_index).relu()
        return self.conv_mu(x, edge_index), self.conv_logstd(x, edge_index)
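
The encoder returns a mean and a log standard deviation per node. The latent code is then drawn with the reparameterization trick, z = mu + eps * exp(logstd) with eps ~ N(0, I), which keeps the sampling step differentiable. The sketch below mirrors that behaviour in plain PyTorch under the assumption that sampling happens only in training mode and that evaluation uses the mean. `reparametrize_sketch` is a hypothetical name.

```python
import torch

def reparametrize_sketch(mu, logstd, training=True):
    """Sample z = mu + eps * exp(logstd) while training; return mu otherwise."""
    if training:
        return mu + torch.randn_like(logstd) * torch.exp(logstd)
    return mu

torch.manual_seed(0)
mu = torch.zeros(1000, 2)
logstd = torch.full((1000, 2), -1.0)  # standard deviation exp(-1), about 0.37

z_train = reparametrize_sketch(mu, logstd, training=True)
z_eval = reparametrize_sketch(mu, logstd, training=False)
print(z_train.std().item())     # close to exp(-1)
print(torch.equal(z_eval, mu))  # evaluation is deterministic
```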



In [None]:
out_channels = 2
num_features = dataset.num_features
epochs = 300


model = VGAE(VariationalGCNEncoder(num_features, out_channels))  # new line

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)
x = data.x.to(device)
train_pos_edge_index = data.train_pos_edge_index.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

In [None]:
model

VGAE(
  (encoder): VariationalGCNEncoder(
    (conv1): GCNConv(3703, 4)
    (conv_mu): GCNConv(4, 2)
    (conv_logstd): GCNConv(4, 2)
  )
  (decoder): InnerProductDecoder()
)

In [None]:
def train():
    model.train()
    optimizer.zero_grad()
    z = model.encode(x, train_pos_edge_index)
    loss = model.recon_loss(z, train_pos_edge_index)
    
    loss = loss + (1 / data.num_nodes) * model.kl_loss()  # new line
    loss.backward()
    optimizer.step()
    return float(loss)


def test(pos_edge_index, neg_edge_index):
    model.eval()
    with torch.no_grad():
        z = model.encode(x, train_pos_edge_index)
    return model.test(z, pos_edge_index, neg_edge_index)
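
The extra term in `train()` is the KL divergence between each node's Gaussian posterior N(mu, exp(logstd)^2) and the standard normal prior. It regularizes the latent space, and the `1 / data.num_nodes` factor keeps it small relative to the reconstruction loss. The sketch below writes out the closed-form expression in plain PyTorch. `kl_loss_sketch` is a hypothetical name, and the averaging over nodes is an assumption about how the term is reduced.

```python
import torch

def kl_loss_sketch(mu, logstd):
    """KL(N(mu, exp(logstd)^2) || N(0, I)), summed over dimensions, averaged over nodes."""
    return -0.5 * torch.mean(
        torch.sum(1 + 2 * logstd - mu ** 2 - logstd.exp() ** 2, dim=1))

mu = torch.zeros(4, 2)
logstd = torch.zeros(4, 2)                      # standard deviation 1
kl_zero = kl_loss_sketch(mu, logstd)            # posterior equals the prior
kl_shifted = kl_loss_sketch(mu + 1.0, logstd)   # mean moved away from the prior
print(float(kl_zero), float(kl_shifted))
```

The divergence is zero when the posterior matches the prior and grows as the mean or the variance drifts away from it.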

In [None]:
writer = SummaryWriter('runs/VGAE_experiment_'+'2d_300_epochs')  # run name matches epochs = 300

for epoch in range(1, epochs + 1):
    loss = train()
    auc, ap = test(data.test_pos_edge_index, data.test_neg_edge_index)
    print('Epoch: {:03d}, AUC: {:.4f}, AP: {:.4f}'.format(epoch, auc, ap))
    
    
    # the metrics are computed on the held-out test edges, so tag them accordingly
    writer.add_scalar('auc test', auc, epoch)  # new line
    writer.add_scalar('ap test', ap, epoch)    # new line

Epoch: 001, AUC: 0.6081, AP: 0.6439
Epoch: 002, AUC: 0.6156, AP: 0.6465
Epoch: 003, AUC: 0.6173, AP: 0.6459
Epoch: 004, AUC: 0.6177, AP: 0.6459
Epoch: 005, AUC: 0.6179, AP: 0.6458
Epoch: 006, AUC: 0.6182, AP: 0.6458
Epoch: 007, AUC: 0.6184, AP: 0.6459
Epoch: 008, AUC: 0.6186, AP: 0.6461
Epoch: 009, AUC: 0.6190, AP: 0.6462
Epoch: 010, AUC: 0.6193, AP: 0.6465
Epoch: 011, AUC: 0.6196, AP: 0.6470
Epoch: 012, AUC: 0.6199, AP: 0.6475
Epoch: 013, AUC: 0.6201, AP: 0.6479
Epoch: 014, AUC: 0.6204, AP: 0.6484
Epoch: 015, AUC: 0.6205, AP: 0.6488
Epoch: 016, AUC: 0.6210, AP: 0.6495
Epoch: 017, AUC: 0.6216, AP: 0.6504
Epoch: 018, AUC: 0.6218, AP: 0.6510
Epoch: 019, AUC: 0.6222, AP: 0.6520
Epoch: 020, AUC: 0.6228, AP: 0.6530
Epoch: 021, AUC: 0.6232, AP: 0.6540
Epoch: 022, AUC: 0.6234, AP: 0.6544
Epoch: 023, AUC: 0.6236, AP: 0.6550
Epoch: 024, AUC: 0.6240, AP: 0.6557
Epoch: 025, AUC: 0.6241, AP: 0.6560
Epoch: 026, AUC: 0.6242, AP: 0.6565
Epoch: 027, AUC: 0.6246, AP: 0.6571
Epoch: 028, AUC: 0.6247, AP:

