# Graph Neural Networks (GNNs)

So far, we have learned how to store and analyse information in numerical, time-series, and image formats. In some circumstances, however, we face problems where objects are defined in terms of their connections to other things. In such cases, it is more useful to store information in a **graph** format, which can be imagined as a set of objects and the connections between them.

Consequently, researchers have developed a special type of neural network for analysing graph data, called **graph neural networks**, or **GNNs** for short.

The following sections cover the general principles behind graphs as a data storage format, as well as GNNs.

### Graphs

Let's start with some basic definitions.

![graph](https://miro.medium.com/max/200/0*NuUDVCcsRDbnQFP8.png)

A graph (sample above) represents the relations (edges or links) between a collection of points (nodes or vertices). These structural parts allow us to store the following attributes:

- **Vertex** attributes: node characteristics, number of neighbours.
- **Edge** attributes: edge identity, edge weight.
- **Global** attributes: number of nodes in the graph, shortest path length, etc.

In addition, we can specify the directionality of edges (directed or undirected).

![directions](https://distill.pub/2021/gnn-intro/directed_undirected.e4b1689d.png)
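As a minimal sketch of the definitions above, the snippet below stores node, edge, and global attributes in a small container and shows how directionality changes which neighbours are reachable. The `Graph` class, its field names, and the toy data are hypothetical and chosen only for illustration; they do not come from any particular library.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """A tiny container for node, edge, and global attributes (illustrative only)."""
    node_attrs: dict   # node id -> dict of node attributes
    edges: list        # list of (source, target, dict of edge attributes)
    directed: bool = False
    global_attrs: dict = field(default_factory=dict)

    def neighbours(self, node):
        """Return the nodes reachable from `node` in one step."""
        result = []
        for src, dst, _ in self.edges:
            if src == node:
                result.append(dst)
            elif not self.directed and dst == node:
                # An undirected edge can be traversed in both directions.
                result.append(src)
        return result


# Made-up toy data: three nodes connected in a chain a - b - c.
nodes = {"a": {"label": "A"}, "b": {"label": "B"}, "c": {"label": "C"}}
edges = [("a", "b", {"weight": 1.0}), ("b", "c", {"weight": 0.5})]

undirected = Graph(nodes, edges, directed=False, global_attrs={"num_nodes": len(nodes)})
directed = Graph(nodes, edges, directed=True, global_attrs={"num_nodes": len(nodes)})

print(undirected.neighbours("b"))  # ['a', 'c']
print(directed.neighbours("b"))    # ['c']
```

With the same edge list, node `b` has two neighbours in the undirected graph but only one in the directed graph, because the edge from `a` to `b` cannot be followed backwards.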

### Graph examples

At this point, we have defined graphs in a very abstract manner; however, this flexibility allows graphs to represent many different data structures and kinds of complex data.

#### Image representation

Commonly, we think of images as rectangular grids stored as multidimensional arrays. Graphs suggest a different way of representing images: each pixel becomes a node that is connected to adjacent pixels via edges. Each non-border pixel has 8 neighbours, and the information stored at each node is a 3-dimensional vector representing the RGB value of the pixel.
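To make the pixel-as-node idea concrete, here is a small sketch that converts a grid of RGB triples into node features and an undirected edge list. The function name `image_to_graph`, the row-by-row node numbering, and the 3 x 3 test image are assumptions made for this example.

```python
def image_to_graph(image):
    """Turn an H x W grid of RGB triples into node features and an edge list.

    Node ids are assigned row by row: pixel (r, c) becomes node r * W + c.
    """
    height, width = len(image), len(image[0])
    node_features = [image[r][c] for r in range(height) for c in range(width)]
    edges = set()
    for r in range(height):
        for c in range(width):
            # Visit the (up to) 8 surrounding pixels.
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if (dr, dc) != (0, 0) and 0 <= nr < height and 0 <= nc < width:
                        a, b = r * width + c, nr * width + nc
                        edges.add((min(a, b), max(a, b)))  # undirected: store once
    return node_features, sorted(edges)


# A made-up 3 x 3 image in which every pixel has the same colour.
image = [[(255, 0, 0)] * 3 for _ in range(3)]
features, edges = image_to_graph(image)

centre = 4  # the only non-border pixel of a 3 x 3 image
degree = sum(1 for a, b in edges if centre in (a, b))
print(len(features), len(edges), degree)  # 9 20 8
```

The centre pixel ends up with exactly 8 neighbours, while border pixels have fewer.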

#### Text representation

In the previous tutorials, we learned how to analyse text as a sequential data format. This sequential nature also makes text easy to express as a graph: we can associate an index with each character, word, or token and represent text as a sequence of these indices. Connecting each token to the one that follows it yields a simple directed graph.
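A minimal sketch of this idea, assuming whitespace tokenisation and a vocabulary built on the fly (both are simplifications chosen for the example, and `text_to_graph` is a hypothetical helper name):

```python
def text_to_graph(text):
    """Map whitespace-separated tokens to indices and link each token to the next."""
    tokens = text.split()
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))  # first occurrence fixes the index
    node_indices = [vocab[token] for token in tokens]
    # One directed edge from each position to the position that follows it.
    edges = [(k, k + 1) for k in range(len(tokens) - 1)]
    return node_indices, edges, vocab


indices, edges, vocab = text_to_graph("the cat saw the dog")
print(indices)  # [0, 1, 2, 0, 3]
print(edges)    # [(0, 1), (1, 2), (2, 3), (3, 4)]
```

Here each position in the sentence is a node whose feature is the token index, and every node except the last has exactly one outgoing edge.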

#### Heterogeneous data

In addition to more traditional data formats, graphs can also represent heterogeneously structured data. In such cases, the number of neighbours of each node is variable (it was fixed for images and text).

- **Molecule representation**. In molecules, atoms interact with each other, forming bonds of different strengths and lengths. Consequently, it is useful to describe the 3D structure of a molecule as a graph where nodes are atoms and edges are bonds.

- **Social networks**. Social network analysis aims to study patterns in the collective behaviour of people, institutions, or organizations. We can model a group of people by representing individuals as nodes and their relationships as edges.
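To illustrate the variable neighbour count mentioned above, the following sketch encodes formaldehyde (H2C=O) as a graph, with atoms as nodes and bonds as edges. Using the bond order as the only edge attribute is an assumption made for brevity; a real molecular dataset would typically carry more features.

```python
from collections import defaultdict

# Formaldehyde: a carbon atom double-bonded to oxygen and single-bonded to two hydrogens.
atoms = ["C", "O", "H", "H"]                # node attributes (element symbols)
bonds = [(0, 1, 2), (0, 2, 1), (0, 3, 1)]   # (atom i, atom j, bond order)

neighbours = defaultdict(list)
for i, j, _order in bonds:
    # Chemical bonds are undirected, so register both directions.
    neighbours[i].append(j)
    neighbours[j].append(i)

degrees = {i: len(neighbours[i]) for i in range(len(atoms))}
for i, symbol in enumerate(atoms):
    print(symbol, degrees[i])
```

The carbon atom has three neighbours while every other atom has one, which is exactly the kind of irregular structure a fixed grid cannot express.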


### Types of problems

There are three general types of prediction tasks on graphs: **graph-level**, **node-level**, and **edge-level**.

- In a **graph-level** task, we predict a property of the entire graph based on its components.

- In a **node-level** task, we predict the properties of individual nodes within a graph.

- In an **edge-level** task, we predict the relationships between the nodes.


### Graph neural networks

Throughout the tutorial, we will be using multiple GNN architectures. For now, however, let's analyse the simplest possible GNN implementation, which uses a multilayer perceptron (MLP) for node analysis.

#### Data formatting

So far, we have looked at examples of how real-world data can be represented in a graph format. However, this format still needs to be simplified into a structure that can be easily read and analysed by our model.

For this, we are going to represent graphs as **adjacency lists**. An adjacency list describes the connectivity of the graph: its $k$-th entry is a tuple $(i, j)$ recording that edge $e_k$ connects nodes $n_i$ and $n_j$.

![adjacency list](https://i.imgur.com/JwA2sxn.png)
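As a rough illustration of the adjacency list format, the sketch below stores a small made-up graph as a list of tuples and, for comparison, expands it into a dense adjacency matrix. The helper `to_adjacency_matrix` and the example edges are hypothetical.

```python
# Adjacency list: entry k holds the endpoints (i, j) of edge e_k.
adjacency_list = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_nodes = 4


def to_adjacency_matrix(edges, num_nodes):
    """Expand an undirected adjacency list into a dense num_nodes x num_nodes matrix."""
    matrix = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        matrix[i][j] = 1
        matrix[j][i] = 1  # undirected: mirror the entry
    return matrix


matrix = to_adjacency_matrix(adjacency_list, num_nodes)
for row in matrix:
    print(row)

# The list stores one tuple per edge; the matrix stores one cell per node pair.
print(len(adjacency_list), num_nodes * num_nodes)  # 4 16
```

For sparse graphs the adjacency list is much more compact than the matrix, since it only records the edges that actually exist.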

#### GNN model

A GNN model performs optimizable transformations on all attributes of the graph while preserving graph symmetries. The vanilla GNN model we are going to discuss takes a graph (in adjacency list format) as input and transforms its node, edge, and global embeddings without changing the connectivity of the input graph.

The simple GNN model applies a separate multilayer perceptron (MLP) to each component of a graph, thus forming a **GNN layer**. In other words, we pass each node vector, each edge vector, and the global vector of the entire graph through an MLP and get back their learned vector representations.

![simple GNN](https://distill.pub/2021/gnn-intro/arch_independent.0efb8ae7.png)
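The layer described above can be sketched in plain Python as follows. This is only an untrained forward pass under several assumptions: a two-layer MLP with ReLU, random initial weights, and a dictionary layout (`nodes`, `edges`, `global`, `adjacency`) invented for this example. A practical implementation would use a deep learning framework and learn the weights.

```python
import random


def make_mlp(in_dim, hidden_dim, out_dim, rng):
    """Create randomly initialised weights for a two-layer perceptron."""
    def matrix(rows, cols):
        return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]
    return {"w1": matrix(hidden_dim, in_dim), "b1": [0.0] * hidden_dim,
            "w2": matrix(out_dim, hidden_dim), "b2": [0.0] * out_dim}


def linear(weights, bias, x):
    """Compute weights @ x + bias for a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def mlp(params, x):
    hidden = [max(0.0, h) for h in linear(params["w1"], params["b1"], x)]  # ReLU
    return linear(params["w2"], params["b2"], hidden)


def gnn_layer(graph, params):
    """Apply one MLP per attribute type; the connectivity is passed through untouched."""
    return {
        "nodes": [mlp(params["node"], v) for v in graph["nodes"]],
        "edges": [mlp(params["edge"], e) for e in graph["edges"]],
        "global": mlp(params["global"], graph["global"]),
        "adjacency": graph["adjacency"],
    }


rng = random.Random(0)
graph = {
    "nodes": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # one 2-d vector per node
    "edges": [[0.5], [2.0]],                         # one 1-d vector per edge
    "global": [3.0],                                 # one vector for the whole graph
    "adjacency": [(0, 1), (1, 2)],
}
params = {
    "node": make_mlp(2, 8, 4, rng),
    "edge": make_mlp(1, 8, 4, rng),
    "global": make_mlp(1, 8, 4, rng),
}
out = gnn_layer(graph, params)
print(len(out["nodes"]), len(out["nodes"][0]))   # 3 4
print(out["adjacency"] == graph["adjacency"])    # True
```

Note that the same node MLP is shared by every node (and likewise for edges), so the layer works for graphs of any size, and the adjacency list leaves the layer unchanged.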

#### Making predictions

So far, we have learned how to construct a simple GNN layer. However, we have not yet discussed how to use it to make a prediction.

For simplicity, let's assume our task is binary classification of nodes. If the graph already contains node information, it might seem sufficient to apply a linear classifier to each node embedding.

Even though this might work in simple cases, the approach cannot be generalised to all situations. For instance, we might have information stored on the edges but none on the nodes. In such a case, node classification using a linear classifier on each node embedding would not be possible.

This brings us to the process of **pooling**, which consists of two steps:
- The embeddings of the items to be pooled are gathered and concatenated into a matrix.
- The gathered embeddings are aggregated (commonly by summing).

![prediction](https://distill.pub/2021/gnn-intro/Overall.e3af58ab.png)
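The pooling steps can be sketched as below for the case where only edge embeddings are available and we want a prediction per node. The function names, the toy embeddings, and the classifier weights are all made up for illustration; in a trained model the weights would be learned.

```python
import math


def pool_edges_to_nodes(edge_embeddings, adjacency, num_nodes):
    """For every node, gather the embeddings of its incident edges and sum them."""
    dim = len(edge_embeddings[0])
    pooled = [[0.0] * dim for _ in range(num_nodes)]
    for (i, j), embedding in zip(adjacency, edge_embeddings):
        for node in (i, j):
            pooled[node] = [p + e for p, e in zip(pooled[node], embedding)]
    return pooled


def classify(embedding, weights, bias):
    """Linear classifier followed by a sigmoid, giving a probability for class 1."""
    logit = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


adjacency = [(0, 1), (1, 2)]                  # edge k connects nodes i and j
edge_embeddings = [[1.0, 2.0], [3.0, -1.0]]   # one made-up embedding per edge

pooled = pool_edges_to_nodes(edge_embeddings, adjacency, num_nodes=3)
print(pooled)  # [[1.0, 2.0], [4.0, 1.0], [3.0, -1.0]]

weights, bias = [0.5, -0.5], 0.0              # hypothetical, untrained parameters
probabilities = [classify(p, weights, bias) for p in pooled]
print(probabilities)
```

Node 1 touches both edges, so its pooled vector is the sum of both edge embeddings, whereas nodes 0 and 2 each inherit the embedding of their single edge.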


This becomes the basis of the GNN model architecture. As you can imagine, however, we will need to implement some changes in the model architecture to get more accurate results.

#### Improvements

We are not going to analyse improvements to our vanilla GNN in much depth; however, it is still useful to know the possible techniques.

- **Message passing**. In short, message passing describes a process in which neighboring nodes or edges exchange information to update each other's embeddings. Each node gathers the embeddings of its neighboring nodes, passes them through an aggregation function, and feeds the result to a neural network that produces the updated embedding.

- **Edge representations**. In addition to applying message passing between nodes, we can also apply it to neighboring edges, so that edge information is included in our node classification.

- **Global representations**. A global representation lets a node take into account the embeddings of distant nodes that might hold useful information for it. To achieve this, we can create a new **master node** that is connected to all other nodes and provides a *third-person view* of the whole graph.
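A single message-passing step for nodes, as described in the list above, might look like the sketch below. Summation is used as the aggregation function, and the `update` argument stands in for the neural network; here it is a plain ReLU placeholder rather than a trained model, and all names and values are assumptions made for the example.

```python
def message_passing_step(node_embeddings, adjacency, update):
    """Sum the embeddings of each node's neighbours, then apply `update` to the result."""
    num_nodes = len(node_embeddings)
    dim = len(node_embeddings[0])
    neighbours = [[] for _ in range(num_nodes)]
    for i, j in adjacency:
        # Undirected graph: each endpoint is a neighbour of the other.
        neighbours[i].append(j)
        neighbours[j].append(i)

    updated = []
    for node in range(num_nodes):
        aggregated = [0.0] * dim
        for neighbour in neighbours[node]:
            aggregated = [a + x for a, x in zip(aggregated, node_embeddings[neighbour])]
        updated.append(update(aggregated))
    return updated


def relu(vector):
    """Placeholder for the learned update network."""
    return [max(0.0, x) for x in vector]


embeddings = [[1.0], [2.0], [4.0]]   # made-up 1-d embeddings for nodes 0, 1, 2
adjacency = [(0, 1), (1, 2)]         # chain 0 - 1 - 2

step1 = message_passing_step(embeddings, adjacency, relu)
print(step1)  # [[2.0], [5.0], [2.0]]
```

After one step each node only knows about its direct neighbours; stacking several steps lets information travel further across the graph.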