## Introduction by example

Some questions you should be able to answer:
- What does `contiguous()` do?
- How do you load a dataset and randomly split it?
- Suppose inspecting a mini-batch `batch` prints the following:
```
batch
>>> DataBatch(batch=[1082], edge_index=[2, 4066], x=[1082, 21], y=[32])
```
What does each number mean?
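
The `contiguous()` question can be explored with plain tensors. The sketch below assumes the edges are first written as a list of (source, target) pairs, which is an illustrative choice and not something the text above prescribes; the name `edge_pairs` is hypothetical.

```python
import torch

# Edges written as (source, target) pairs, one pair per row: shape [E, 2].
edge_pairs = torch.tensor([[0, 1],
                           [1, 0],
                           [1, 2],
                           [2, 1]], dtype=torch.long)

# Transposing gives the [2, E] layout, but only as a strided view of the
# same memory, so the values are no longer stored row by row.
transposed = edge_pairs.t()
print(transposed.is_contiguous())

# contiguous() copies the values into a fresh row-major buffer.
edge_index = transposed.contiguous()
print(edge_index.is_contiguous())

# The values are unchanged; only the memory layout differs.
print(torch.equal(edge_index, transposed))
```

Operations that require a specific memory layout, such as `view`, are the usual reason for calling `contiguous()` after a transpose.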

## Understanding the Message Passing Model

In [None]:
import torch
from torch_geometric.data import Data

edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)

data = Data(x=x, edge_index=edge_index)

Answer the following questions about the code in the next cell:
- What does `.contiguous()` do to `edge_index`?
- What are `row` and `col` in the following code?
- What does `degree` do?
- Why do we use `deg = degree(col, x.size(0), dtype=x.dtype)` and not `deg = degree(row, x.size(0), dtype=x.dtype)`?
- What do `deg_inv_sqrt[col]` and `deg_inv_sqrt[row]` select?
- What does `norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]` compute?
- How would you reverse `norm.view(-1, 1)`? Note that `view` does not modify `norm` in place.
- If $i = 0$, what is `x_j` in the `message` function? If `self.lin` were the identity map, can you write `x_j` out explicitly?
- If you wanted to define an `update` function as a `torch.nn.Linear` layer, what shape would the layer need? Implement it.

In [31]:
import torch
from torch_geometric.nn import MessagePassing
from torch_geometric.utils import add_self_loops, degree

class GCNConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='add')  # "Add" aggregation (Step 5).
        self.lin = torch.nn.Linear(in_channels, out_channels)

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        # Step 1: Add self-loops to the adjacency matrix.
        edge_index, _ = add_self_loops(edge_index, num_nodes=x.size(0))

        # Step 2: Linearly transform node feature matrix.
        x = self.lin(x)

        # Step 3: Compute normalization.
        row, col = edge_index
        deg = degree(col, x.size(0), dtype=x.dtype)
        deg_inv_sqrt = deg.pow(-0.5)
        deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0
        norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

        # Step 4-5: Start propagating messages.
        return self.propagate(edge_index, x=x, norm=norm)

    def message(self, x_j, norm):
        # x_j has shape [E, out_channels]

        # Step 4: Normalize node features.
        return norm.view(-1, 1) * x_j
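
To check your answers about Step 3, the normalization can be traced by hand on the three-node toy graph from the first cell. This is a minimal sketch using plain `torch` only: the self-loops are appended manually and `torch.bincount` is used as a stand-in for `degree`, on the assumption that `degree` counts how often each node index occurs. It also assumes the default `source_to_target` flow, where `row` holds source nodes and `col` holds target nodes.

```python
import torch

# Toy graph from the earlier cell: undirected edges 0-1 and 1-2.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
num_nodes = 3

# Step 1 by hand: append one self-loop (i, i) per node.
loops = torch.arange(num_nodes).repeat(2, 1)
edge_index = torch.cat([edge_index, loops], dim=1)

row, col = edge_index  # row = source nodes j, col = target nodes i

# Stand-in for degree(col, num_nodes): count how often each node is a target.
deg = torch.bincount(col, minlength=num_nodes).to(torch.float)

deg_inv_sqrt = deg.pow(-0.5)
deg_inv_sqrt[deg_inv_sqrt == float('inf')] = 0

# Indexing with row/col looks up one value per edge, so norm has shape [E].
norm = deg_inv_sqrt[row] * deg_inv_sqrt[col]

# view(-1, 1) returns a new [E, 1] tensor for broadcasting; norm keeps shape [E].
norm_col = norm.view(-1, 1)
flat_again = norm_col.view(-1)  # one way to undo it; norm_col.squeeze(1) also works

print(deg)
print(norm.shape, norm_col.shape, flat_again.shape)
```

With self-loops included, nodes 0 and 2 each receive two messages and node 1 receives three, so the edge 0 → 1 gets the weight $1/\sqrt{2 \cdot 3}$.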

Answer these questions about the following code:
- If $i = 0$, what are `x_i` and `x_j - x_i`?
- What does `torch.cat([x_i, x_j - x_i], dim=1)` do? Why `dim=1`?
- What would the inherited `message` function do if we had not overridden it?

In [32]:
import torch
from torch.nn import Sequential as Seq, Linear, ReLU
from torch_geometric.nn import MessagePassing

class EdgeConv(MessagePassing):
    def __init__(self, in_channels, out_channels):
        super().__init__(aggr='max')  # "Max" aggregation.
        self.mlp = Seq(Linear(2 * in_channels, out_channels),
                       ReLU(),
                       Linear(out_channels, out_channels))

    def forward(self, x, edge_index):
        # x has shape [N, in_channels]
        # edge_index has shape [2, E]

        return self.propagate(edge_index, x=x)

    def message(self, x_i, x_j):
        # x_i has shape [E, in_channels]: features of the target (central) node i of each edge in edge_index.
        # x_j has shape [E, in_channels]: features of the source (neighboring) node j of each edge in edge_index.

        tmp = torch.cat([x_i, x_j - x_i], dim=1)  # tmp has shape [E, 2 * in_channels]
        return self.mlp(tmp)
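
The inputs to `message` can be reproduced with plain indexing. The following is a sketch, assuming the default `source_to_target` flow, in which `x_j` is gathered from `edge_index[0]` and `x_i` from `edge_index[1]`. It reuses the toy features and edges from the first cell; the names `source` and `target` are introduced here for illustration only.

```python
import torch

x = torch.tensor([[-1.0], [0.0], [1.0]])        # node features, shape [N=3, 1]
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])        # shape [2, E=4]

source, target = edge_index
x_j = x[source]   # features of the sending (neighboring) node of each edge
x_i = x[target]   # features of the receiving (central) node of each edge

# Concatenate along the feature dimension: [E, 1] and [E, 1] -> [E, 2].
# dim=0 would instead stack the rows and give shape [2 * E, 1].
tmp = torch.cat([x_i, x_j - x_i], dim=1)
print(tmp)

# Rows belonging to edges that point at node 0 (here only the edge 1 -> 0).
print(tmp[target == 0])
```

Each row of `tmp` describes one edge, which is why the MLP in `EdgeConv` expects `2 * in_channels` input features.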

## Creating your own InMemoryDataset

Create an in-memory dataset from a list of NetworkX graphs.
Note that this should be equivalent to
```
data_list = [Data(...), ..., Data(...)]
loader = DataLoader(data_list, batch_size=32)
```
but is more flexible: the processed data is cached on disk under `root`, and a `transform` can be applied on access.

### Answer

In [None]:
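import torch
from torch_geometric.data import InMemoryDataset

class MyDataset(InMemoryDataset):
    def __init__(self, root, data_list, transform=None):
        # Store the list first: super().__init__() may call process(), which reads it.
        self.data_list = data_list
        super().__init__(root, transform)
        # Load the collated data and the slices that delimit each graph.
        self.data, self.slices = torch.load(self.processed_paths[0])

    @property
    def processed_file_names(self):
        return 'data.pt'

    def process(self):
        # Collate all graphs into one large data object plus slice indices, then cache it.
        torch.save(self.collate(self.data_list), self.processed_paths[0])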

What does `self.processed_paths` do?
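
As a hint, the behaviour can be imitated in a few lines of plain Python. This is a simplified stand-in, not the library's code: it assumes the processed files live in a `processed` subfolder of `root`, and the function name `processed_paths_sketch` is hypothetical.

```python
import os.path as osp

def processed_paths_sketch(root, processed_file_names):
    """Simplified stand-in for Dataset.processed_paths (not the library code)."""
    # A single file name may be given as a bare string, as in MyDataset.
    if isinstance(processed_file_names, str):
        processed_file_names = [processed_file_names]
    processed_dir = osp.join(root, 'processed')
    # One absolute-or-relative path per processed file name.
    return [osp.join(processed_dir, name) for name in processed_file_names]

paths = processed_paths_sketch('my_root', 'data.pt')
print(paths[0])
```

This is why `MyDataset` indexes the result with `[0]`: the property yields a list of paths even when there is only one file.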