# Graph Neural Networks
## What are Graph Neural Networks (GNNs)?

In [2]:
# Import the basics
import os
import torch
import torch_geometric as tg
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
%matplotlib inline

In [3]:
# Let's verify what device we are working with
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("You are using device: %s" % device)

You are using device: cuda


Graph Neural Networks are a class of "geometric deep learning" models built on pairwise message passing. Their architecture typically combines three types of layers. From [Wikipedia](https://en.wikipedia.org/wiki/Graph_neural_network):
1. Permutation equivariant: a permutation equivariant layer maps a representation of a graph into an updated representation of the same graph. In the literature, permutation equivariant layers are implemented via **pairwise message passing between graph nodes**. Intuitively, in a message passing layer, nodes update their representations by aggregating the messages received from their immediate neighbours. As such, each message passing layer increases the receptive field of the GNN by one hop.
2. Local pooling: a local pooling layer coarsens the graph via downsampling. Local pooling is used to increase the receptive field of a GNN, in a similar fashion to pooling layers in convolutional neural networks. Examples include k-nearest neighbours pooling, top-k pooling, and self-attention pooling.
3. Global pooling: a global pooling layer, also known as a readout layer, provides a fixed-size representation of the whole graph. The global pooling layer must be permutation invariant, such that permutations in the ordering of graph nodes and edges do not alter the final output. Examples include element-wise sum, mean or maximum.
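
To make the permutation-invariance requirement of the readout layer concrete, here is a minimal pure-Python sketch. The node features and the helper name `global_pool` are made up for illustration and are not part of PyTorch Geometric. Reordering the nodes leaves the sum, mean, and max readouts unchanged.

```python
# Hypothetical toy example: 3 nodes, each with a 2-dimensional feature vector.
node_features = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]

def global_pool(features, mode="sum"):
    """Element-wise readout over all nodes; returns one fixed-size vector."""
    columns = list(zip(*features))  # group values by feature dimension
    if mode == "sum":
        return [sum(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    if mode == "max":
        return [max(col) for col in columns]
    raise ValueError(f"unknown mode: {mode}")

# Permuting the node order does not change any of the readouts.
permuted = [node_features[2], node_features[0], node_features[1]]
for mode in ("sum", "mean", "max"):
    assert global_pool(node_features, mode) == global_pool(permuted, mode)

print(global_pool(node_features, "sum"))  # [9.0, 6.0]
```

The output size depends only on the feature dimension, not on the number of nodes, which is what lets a readout feed graphs of different sizes into the same downstream classifier.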

### What is message passing?
From [Wikipedia](https://en.wikipedia.org/wiki/Graph_neural_network#Message_passing_layers):
<br>
![img](./img/notebook/messagePassing.png)
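
As a rough illustration of the idea, the sketch below runs mean-aggregation message passing on a small path graph in plain Python. The graph, the scalar features, and the function name `message_passing_step` are assumptions made for this example; real GNN layers use vector features and learned weights. Only node 0 starts with a non-zero value, so each step shows the signal spreading one hop further.

```python
# Hypothetical path graph 0 - 1 - 2 - 3, stored as an adjacency list.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

# Only node 0 carries a non-zero signal, so we can watch it spread.
h = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}

def message_passing_step(h, neighbours):
    """One layer: each node averages its neighbours' values (the messages)
    and adds the result to its own representation (the update)."""
    new_h = {}
    for node, nbrs in neighbours.items():
        aggregated = sum(h[n] for n in nbrs) / len(nbrs)
        new_h[node] = h[node] + aggregated
    return new_h

h1 = message_passing_step(h, neighbours)   # signal reaches node 1 (1 hop)
h2 = message_passing_step(h1, neighbours)  # signal reaches node 2 (2 hops)

print(h1)  # {0: 1.0, 1: 0.5, 2: 0.0, 3: 0.0}
print(h2)  # {0: 1.5, 1: 1.0, 2: 0.25, 3: 0.0}
```

After two steps node 3 is still at zero because it sits three hops from node 0, which matches the statement that each message passing layer extends the receptive field by one hop.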

# Data
Heterogeneous graphs are a natural fit for recommendation systems: users and items are distinct node types, and interactions are the edges between them. Let's examine a dataset from PyTorch Geometric to understand the basics of the data.

### Datasets:
"AmazonBook" - A subset of the AmazonBook rating dataset from the "LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation" paper.

In [108]:
from torch_geometric.datasets import AmazonBook
dataset = AmazonBook(root="./data/AmazonBook")

print(f"Dataset: {dataset}")
print(f"Number of graphs: {len(dataset)}")
print(f"Number of features: {dataset.num_features}")

data = dataset[0]
data = data.to(device)  # move all tensors to the selected device

print(f"data = Dataset[0]: {data}")

print(f"Number of features of data: {data.num_features}")
print(f"Number of nodes of data: {data.num_nodes}")
print(f"Number of edges of data: {data.num_edges}")
print(f"data is directed?: {data.is_directed()}")
print(f"data has isolated nodes?: {data.has_isolated_nodes()}")
print(f"data node types: {data.node_types}")
print(f"data edge types: {data.edge_types}")


Dataset: AmazonBook()
Number of graphs: 1
Number of features: {'user': 0, 'book': 0}
data = Dataset[0]: HeteroData(
  user={ num_nodes=52643 },
  book={ num_nodes=91599 },
  (user, rates, book)={
    edge_index=[2, 2380730],
    edge_label_index=[2, 603378],
  },
  (book, rated_by, user)={ edge_index=[2, 2380730] }
)
Number of features of data: {'user': 0, 'book': 0}
Number of nodes of data: 144242
Number of edges of data: 4761460
data is directed?: False
data has isolated nodes?: False
data node types: ['user', 'book']
data edge types: [('user', 'rates', 'book'), ('book', 'rated_by', 'user')]
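
The counts printed above are consistent with every `rates` edge having a mirrored `rated_by` edge: 2 × 2,380,730 = 4,761,460 edges, and 52,643 + 91,599 = 144,242 nodes. The sketch below shows the same layout on a toy example, with each `edge_index` stored as two rows (source IDs, then target IDs). The IDs are invented for illustration.

```python
# Hypothetical toy edge_index in COO layout: row 0 = user IDs, row 1 = book IDs.
rates = [[0, 0, 1, 2],
         [5, 7, 5, 6]]

# The reverse edge type swaps the source and target rows.
rated_by = [rates[1], rates[0]]

edge_types = {
    ("user", "rates", "book"): rates,
    ("book", "rated_by", "user"): rated_by,
}
total_edges = sum(len(edge_index[0]) for edge_index in edge_types.values())
print(total_edges)  # 8

# Same arithmetic with the counts printed above.
assert 2 * 2_380_730 == 4_761_460
assert 52_643 + 91_599 == 144_242
```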


## Data pre-processing

Helpful resources:
- Link Prediction on MovieLens.ipynb - https://colab.research.google.com/drive/1xpzn1Nvai1ygd_P5Yambc_oe4VBPK_ZT?usp=sharing#scrollTo=JMGYv83WzSRr
- Neural Graph Collaborative Filtering - https://dl.acm.org/doi/pdf/10.1145/3331184.3331267

The experimental setting is similar to that of *Neural Graph Collaborative Filtering*, which is also used in *LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation*:
> Amazon-book: Amazon-review is a widely used dataset for
product recommendation. We select Amazon-book from the
collection. Similarly, we use the 10-core setting to ensure that each
user and item have at least ten interactions.

> For each dataset, we randomly select 80% of historical
interactions of each user to constitute the training set, and treat
the remaining as the test set. From the training set, we randomly
select 10% of interactions as validation set to tune hyper-parameters.
For each observed user-item interaction, we treat it as a positive
instance, and then conduct the negative sampling strategy to pair
it with one negative item that the user did not consume before.
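
The protocol quoted above can be sketched in plain Python as follows. This is only an illustration on made-up interactions: the helper names `split_user_items` and `sample_negative`, the fixed seed, and the rounding choices are assumptions, not details taken from either paper.

```python
import random

def split_user_items(items, rng, train_frac=0.8, val_frac=0.1):
    """Split one user's items into train / validation / test lists.
    The validation set is carved out of the training portion."""
    items = list(items)
    rng.shuffle(items)
    n_train_full = round(train_frac * len(items))
    train_full, test = items[:n_train_full], items[n_train_full:]
    n_val = round(val_frac * len(train_full))
    val, train = train_full[:n_val], train_full[n_val:]
    return train, val, test

def sample_negative(user_items, num_items, rng):
    """Draw one item ID the user has not interacted with."""
    while True:
        candidate = rng.randrange(num_items)
        if candidate not in user_items:
            return candidate

rng = random.Random(0)
num_items = 50
# Hypothetical interaction history: user ID -> set of item IDs (10 each).
history = {0: set(range(0, 10)), 1: set(range(20, 30))}

for user, items in history.items():
    train, val, test = split_user_items(sorted(items), rng)
    # Pair every positive training item with one sampled negative item.
    triples = [(user, pos, sample_negative(items, num_items, rng)) for pos in train]
    print(user, len(train), len(val), len(test), triples[:2])
```

With ten interactions per user, this yields 7 training, 1 validation, and 2 test items. Note that the split is done per user, so every user appears in each subset.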

In [124]:
import torch_geometric.transforms as T

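# Hold out 10% of edges for validation and 20% for testing; each listed edge type is split independently.
link_split = T.RandomLinkSplit(is_undirected=True, num_val=0.1, num_test=0.2, edge_types=data.edge_types)
transform = T.Compose([T.ToDevice(device), link_split])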
train_data, val_data, test_data = transform(data)

In [130]:
train_data

HeteroData(
  user={ num_nodes=52643 },
  book={ num_nodes=91599 },
  (user, rates, book)={
    edge_index=[2, 1666511],
    edge_label_index=[2, 3333022],
    edge_label=[3333022],
  },
  (book, rated_by, user)={
    edge_index=[2, 1666511],
    edge_label=[3333022],
    edge_label_index=[2, 3333022],
  }
)

In [126]:
val_data

HeteroData(
  user={ num_nodes=52643 },
  book={ num_nodes=91599 },
  (user, rates, book)={
    edge_index=[2, 1666511],
    edge_label_index=[2, 476146],
    edge_label=[476146],
  },
  (book, rated_by, user)={
    edge_index=[2, 1666511],
    edge_label=[476146],
    edge_label_index=[2, 476146],
  }
)

In [128]:
test_data

HeteroData(
  user={ num_nodes=52643 },
  book={ num_nodes=91599 },
  (user, rates, book)={
    edge_index=[2, 1904584],
    edge_label_index=[2, 952292],
    edge_label=[952292],
  },
  (book, rated_by, user)={
    edge_index=[2, 1904584],
    edge_label=[952292],
    edge_label_index=[2, 952292],
  }
)
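
The tensor shapes in the three outputs above can be reproduced by hand. The sketch below assumes a 70/10/20 edge split and one sampled negative per positive supervision edge (a 1:1 ratio, which matches the printed shapes). The variable names are illustrative.

```python
num_edges = 2_380_730             # (user, rates, book) edges before splitting

num_val = num_edges * 10 // 100   # 10% held out for validation
num_test = num_edges * 20 // 100  # 20% held out for testing
num_train = num_edges - num_val - num_test

# Edges available for message passing (edge_index) in each split.
train_mp = num_train
val_mp = num_train                # validation sees only the training edges
test_mp = num_train + num_val     # test additionally sees the validation edges

# Supervision edges (edge_label): positives plus an equal number of negatives.
train_labels = 2 * num_train
val_labels = 2 * num_val
test_labels = 2 * num_test

print(train_mp, val_mp, test_mp)              # 1666511 1666511 1904584
print(train_labels, val_labels, test_labels)  # 3333022 476146 952292
```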