# Hyperbolic Learning in Action: Practice

In this notebook, we are going to train, evaluate, and compare three Convolutional Neural Networks (CNNs):

1. an ordinary, fully Euclidean one;
2. one with the last layer in hyperbolic space;
3. a fully hyperbolic network.

We will use:

- the CIFAR-10 and CIFAR-100 datasets: the first is chosen for its simplicity, the second because it exhibits *hierarchical* structure;
- the hyperbolic learning library `HypLL` for the hyperbolic layers, due to its ease of use.

We will also visualize the learned data representations in both Euclidean and hyperbolic space.

## Setup

**If you are on Colab or Kaggle, enable GPU acceleration:**
- Colab:
    1. Click on the dropdown arrow on the right of the menu bar above the notebook, next to "Connect".
    2. Select "Change runtime type".
    3. Choose "T4 GPU" under "Hardware accelerator".
- Kaggle:
    1. Expand the section "Session options" on the right menu sidebar.
    2. Select "GPU P100" under "Accelerator".

### Environment

Check whether the notebook is already running inside the code repository.

In [None]:
import os

path_parts = os.getcwd().split(os.sep)
repository_path = ""
try:
    repository_index = path_parts.index("hyperbolic-learning-tutorial-code")
    repository_path = os.sep.join(path_parts[: repository_index + 1])
except ValueError:
    pass

Get the repository if needed.

In [None]:
if repository_path == "":
    !git clone https://github.com/Digital-Dermatology/hyperbolic-learning-tutorial-code.git
    %cd hyperbolic-learning-tutorial-code
    repository_path = "hyperbolic-learning-tutorial-code"
else:
    %cd {repository_path}

Install requirements.

In [None]:
!pip install --upgrade pip && pip install -r requirements.txt

Add the project's root to the Python path for custom functions.

In [None]:
import sys

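# The previous cell changed into the repository root, so resolve paths from there.
project_root = os.getcwd()
sys.path.append(project_root)
sys.path.append(os.path.join(project_root, "src"))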

Set the `torch` device and seeds for reproducibility.

In [None]:
import torch
from src.utils.torch_utils import get_available_device, set_seeds

device = torch.device(get_available_device())
set_seeds(42)

### Data

Get the datasets.

Since this is a demonstration without hyperparameter tuning, one split for training and one for evaluation (i.e. testing) are enough; no separate validation split is needed.

In [None]:
import torchvision

transform = torchvision.transforms.Compose(
    [
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(
            mean=(0.5, 0.5, 0.5),
            std=(0.5, 0.5, 0.5),
        ),
    ]
)
train_dataset = torchvision.datasets.CIFAR10(
    root="data", train=True, download=True, transform=transform
)
test_dataset = torchvision.datasets.CIFAR10(
    root="data", train=False, download=True, transform=transform
)

classes = train_dataset.classes
assert test_dataset.classes == classes
num_classes = len(classes)
print(f"Classes in the dataset: {classes}")

Prepare the data loaders.

The batch size and the number of workers may be adjusted as needed.

In [None]:
batch_size = 128
num_workers = 0

train_dataloader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers
)
test_dataloader = torch.utils.data.DataLoader(
    test_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers
)
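As a rough guide to what changing `batch_size` does, the sketch below counts the iterations per epoch for the standard CIFAR-10 split sizes (50,000 training and 10,000 test images). The helper `batches_per_epoch` is a made-up name for illustration, and it assumes the `DataLoader` default `drop_last=False`, so the last, smaller batch is kept.

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of batches yielded per epoch when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


train_batches = batches_per_epoch(50_000, 128)
test_batches = batches_per_epoch(10_000, 128)
print(train_batches, test_batches)  # 391 79
```

These are the totals that the `tqdm` progress bar should report during training.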

## Euclidean Network

### Architecture

Start with a simple Euclidean convolutional network.

To compare with hyperbolic networks without too much pain:

- it has neither batch normalization nor skip connections;
- fully connected layers are used at the end instead of e.g. global pooling;
- no transfer learning is used.

Right before classification, the representation is reduced to 2 dimensions so that the embeddings can be plotted directly.
This bottleneck will lead to poor performance in Euclidean space, but it should be less of a problem for hyperbolic networks.
The constraint can be relaxed if a dimensionality reduction method such as PCA, t-SNE, or UMAP is used for plotting instead.

In [None]:
from torch.nn import Conv2d, Flatten, Linear, MaxPool2d, ReLU, Sequential

last_channels = 3
conv_channels = (32, 64, 128)
fc_channels = (128, 32, 2)
image_size = (32, 32)
pool_kernel_size = 2
pool_stride = 2
conv_kernel_size = 3

pool = MaxPool2d(kernel_size=pool_kernel_size, stride=pool_stride)
activation = ReLU()
current_image_size = torch.tensor(image_size)
layers = []
for channels in conv_channels:
    layers.append(
        Conv2d(in_channels=last_channels, out_channels=channels, kernel_size=conv_kernel_size)
    )
    current_image_size -= conv_kernel_size - 1
    layers.append(activation)
    layers.append(pool)
    current_image_size //= pool_stride
    last_channels = channels
layers.append(Flatten())
last_channels *= int(current_image_size.prod())  # keep a plain int for Linear's in_features
for channels in fc_channels:
    layers.append(
        Linear(in_features=last_channels, out_features=channels)
    )
    layers.append(activation)
    last_channels = channels
layers = layers[:-1]  # remove the last activation
layers.append(Linear(in_features=last_channels, out_features=len(classes)))
euclidean_network = Sequential(*layers)
euclidean_network = euclidean_network.to(device)
euclidean_network
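The size bookkeeping in the loop above can be traced by hand. The following sketch redoes the arithmetic in plain Python, assuming unpadded, stride-1 convolutions with a 3x3 kernel and 2x2 max pooling with stride 2, as configured in the cell. The helper names are illustrative only.

```python
def conv_output_size(size: int, kernel_size: int) -> int:
    """Unpadded, stride-1 convolution: each side shrinks by kernel_size - 1."""
    return size - (kernel_size - 1)


def pool_output_size(size: int, stride: int) -> int:
    """2x2 max pooling with stride 2 halves each side, rounding down."""
    return size // stride


size = 32
trace = [size]
for _ in (32, 64, 128):  # one convolution + pooling step per entry in conv_channels
    size = conv_output_size(size, kernel_size=3)
    trace.append(size)
    size = pool_output_size(size, stride=2)
    trace.append(size)

flattened_features = 128 * size * size
print(trace)               # [32, 30, 15, 13, 6, 4, 2]
print(flattened_features)  # 512
```

The first fully connected layer should therefore receive 512 input features.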

In [None]:
with torch.no_grad():
    for data, labels in test_dataloader:
        outputs = euclidean_network(data.to(device))
        break

In [None]:
outputs.shape

In [None]:
from torchinfo import summary
summary(euclidean_network)

### Evaluation

Define the metrics for evaluation.

In [None]:
from torchmetrics import MetricCollection
from torchmetrics.classification import MulticlassAccuracy, MulticlassMatthewsCorrCoef

metrics = MetricCollection(
    [
        MulticlassAccuracy(num_classes=num_classes),
        MulticlassMatthewsCorrCoef(num_classes=num_classes),
    ]
)
metrics = metrics.to(device)
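To see why the Matthews correlation coefficient (MCC) is reported next to accuracy, here is a minimal pure-Python sketch of both metrics computed from label lists. It is not the `torchmetrics` implementation; returning 0 when the denominator vanishes is a convention assumed here.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """Multiclass MCC computed from per-class label and prediction counts."""
    total = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    cov_true_pred = correct * total - sum(
        true_counts[k] * pred_counts[k] for k in true_counts
    )
    cov_true = total * total - sum(n * n for n in true_counts.values())
    cov_pred = total * total - sum(n * n for n in pred_counts.values())
    denominator = math.sqrt(cov_true * cov_pred)
    return cov_true_pred / denominator if denominator > 0 else 0.0


y_true = [0, 0, 1, 1, 2, 2]
always_zero = [0] * len(y_true)  # a degenerate model that predicts a single class
print(accuracy(y_true, always_zero), matthews_corrcoef(y_true, always_zero))
print(accuracy(y_true, y_true), matthews_corrcoef(y_true, y_true))
```

A model that always predicts one class still gets an accuracy of 1/3 in this toy example, while its MCC is 0; a perfect model scores 1 on both.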

Evaluate before training.

In [None]:
def print_metrics(metrics: MetricCollection, prefix: str = "") -> None:
    print(
        prefix,
        {k.replace("Multiclass", ""): v.item() for k, v in metrics.compute().items()},
    )

In [None]:
metrics.reset()
with torch.no_grad():
    for data, labels in test_dataloader:
        outputs = euclidean_network(data.to(device))
        metrics(outputs, labels.to(device))
print_metrics(metrics, "Metrics before training:")

### Training

In [None]:
from torch.optim import Adam
from tqdm import tqdm
criterion = torch.nn.CrossEntropyLoss()
criterion.to(device)
optimizer = Adam(euclidean_network.parameters(), lr=1e-3)
num_epochs = 10

for epoch in range(num_epochs):
    print(f"Epoch {epoch + 1} of {num_epochs}")
    metrics.reset()
    for data, labels in tqdm(train_dataloader):
        optimizer.zero_grad()
        outputs = euclidean_network(data.to(device))
        labels = labels.to(device)
        metrics(outputs, labels)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
    print_metrics(metrics, "Train: ")
    metrics.reset()
    with torch.no_grad():
        for data, labels in test_dataloader:
            outputs = euclidean_network(data.to(device))
            metrics(outputs, labels.to(device))
    print_metrics(metrics, "Test: ")

### Visualization

Get the embeddings to visualize them.

In [None]:
embeddings, predictions, labels = [], [], []
with torch.no_grad():
    for data, labels_batch in test_dataloader:
        embeddings_batch = euclidean_network[:-1](data.to(device))
        predictions_batch = euclidean_network[-1](embeddings_batch).argmax(dim=-1)
        embeddings.append(embeddings_batch)
        predictions.append(predictions_batch)
        labels.append(labels_batch)
embeddings = torch.cat(embeddings, dim=0)
predictions = torch.cat(predictions, dim=0)
labels = torch.cat(labels, dim=0)

Save the plot to HTML to avoid overloading the notebook.

In [None]:
import pandas as pd
import plotly.express as px
df = pd.DataFrame(embeddings.cpu().numpy())
df["prediction"] = predictions.cpu().numpy()
df["label"] = [classes[i] for i in labels.cpu().numpy()]
fig = px.scatter(data_frame=df, x=0, y=1, color="label")
fig.write_html("euclidean.html")

## Last Hyperbolic Layer

Exercise 1:

Now it is time to roll up your sleeves. Enjoy hacking!

1. Define the hyperbolic manifold using [`hypll.manifolds.poincare_ball.PoincareBall`](https://hyperbolic-learning-library.readthedocs.io/en/latest/_autosummary/hypll.manifolds.poincare_ball.manifold.html) with a trainable curvature parameter [`hypll.manifolds.poincare_ball.Curvature`](https://hyperbolic-learning-library.readthedocs.io/en/latest/_autosummary/hypll.manifolds.poincare_ball.curvature.html).

In [None]:
manifold = ...
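To build intuition for what the manifold and its curvature do, independently of `HypLL`, here is a minimal pure-Python sketch of the geodesic distance on the Poincaré ball. It assumes the common convention that the ball has curvature `-c` with `c > 0`, so its radius is `1 / sqrt(c)`. The function name is illustrative and is not part of the library.

```python
import math


def poincare_distance(x, y, c=1.0):
    """Geodesic distance on the Poincare ball of curvature -c (c > 0)."""
    sq_norm_x = sum(v * v for v in x)
    sq_norm_y = sum(v * v for v in y)
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    arg = 1 + 2 * c * sq_diff / ((1 - c * sq_norm_x) * (1 - c * sq_norm_y))
    return math.acosh(arg) / math.sqrt(c)


origin = (0.0, 0.0)
for radius in (0.5, 0.9, 0.99):
    # Distances from the origin grow without bound as the point nears the boundary.
    print(radius, poincare_distance(origin, (radius, 0.0)))
```

With `c = 1`, the distance from the origin to a point at Euclidean radius `r` equals `2 * atanh(r)`, which is why points close to the boundary are very far apart in hyperbolic terms.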

2. Starting with the Euclidean network, just before classification, lift the representation to hyperbolic space by constructing a [`hypll.tensors.TangentTensor`](https://hyperbolic-learning-library.readthedocs.io/en/latest/_autosummary/hypll.tensors.tangent_tensor.html) and using `PoincareBall`'s exponential map.
    - Hint: you may also use the convenience layer [`src.layers.to_manifold.ToManifold`](https://github.com/Digital-Dermatology/hyperbolic-learning-tutorial-code/blob/main/src/layers/to_manifold.py) or take inspiration from it.
3. Obtain the logits by replacing the linear classification layer of the Euclidean network with the calculation of the distances from (learned) hyperbolic hyperplanes.
    - This operation, known as Hyperbolic Multinomial Logistic Regression, is implemented in [`src.layers.hmlr.HMLR`](https://github.com/Digital-Dermatology/hyperbolic-learning-tutorial-code/blob/main/src/layers/hmlr.py); feel free to use it directly or as a guide.

In [None]:
last_hyperbolic_network = ...
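The math behind steps 2 and 3 can be sketched without any library. The snippet below is a conceptual illustration in plain Python, not the code of `HypLL` or of `src.layers.hmlr.HMLR`: it assumes a ball of curvature `-c`, uses the exponential map at the origin for the lift, and uses a commonly used formulation of the hyperplane logit with offset `p` and normal `a`. Whether the repository layer uses exactly this parameterization is an assumption.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def expmap0(v, c=1.0):
    """Exponential map at the origin: tangent vector -> point inside the ball."""
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        return list(v)
    scale = math.tanh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * a for a in v]


def mobius_add(x, y, c=1.0):
    """Mobius addition, the hyperbolic counterpart of vector addition."""
    xy, x2, y2 = dot(x, y), dot(x, x), dot(y, y)
    denom = 1 + 2 * c * xy + c * c * x2 * y2
    return [
        ((1 + 2 * c * xy + c * y2) * a + (1 - c * x2) * b) / denom
        for a, b in zip(x, y)
    ]


def hmlr_logit(x, p, a, c=1.0):
    """Signed, rescaled distance from x to the hyperplane through p with normal a."""
    z = mobius_add([-v for v in p], x, c)
    a_norm = math.sqrt(dot(a, a))
    conformal_factor = 2 / (1 - c * dot(p, p))
    arg = 2 * math.sqrt(c) * dot(z, a) / ((1 - c * dot(z, z)) * a_norm)
    return conformal_factor * a_norm / math.sqrt(c) * math.asinh(arg)


embedding = expmap0([3.0, 4.0])  # Euclidean norm 5 before the lift
print(math.sqrt(dot(embedding, embedding)))  # tanh(5), just below 1
print(hmlr_logit(embedding, p=[0.0, 0.0], a=[1.0, 0.0]))   # positive side
print(hmlr_logit(embedding, p=[0.0, 0.0], a=[-1.0, 0.0]))  # negative side
```

However large the Euclidean feature is, the lifted point stays strictly inside the ball, and the sign of each logit tells on which side of the class hyperplane the embedding lies.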

4. Replace the Adam optimizer with Riemannian Adam from [`hypll.optim.RiemannianAdam`](https://hyperbolic-learning-library.readthedocs.io/en/latest/_autosummary/hypll.optim.adam.html).

In [None]:
riemannian_optimizer = ...
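Why a Riemannian optimizer at all? For parameters that live on the Poincaré ball, the Euclidean gradient has to be rescaled by the inverse of the metric before taking a step. The sketch below shows only this rescaling, assuming curvature `-c`; it is a conceptual illustration and makes no claim about how `RiemannianAdam` is implemented internally.

```python
def conformal_factor(x, c=1.0):
    """Conformal factor of the Poincare ball metric at point x."""
    return 2.0 / (1.0 - c * sum(v * v for v in x))


def riemannian_gradient(x, euclidean_grad, c=1.0):
    """Rescale a Euclidean gradient by the inverse metric at x."""
    scale = 1.0 / conformal_factor(x, c) ** 2
    return [scale * g for g in euclidean_grad]


print(riemannian_gradient([0.0, 0.0], [1.0, 2.0]))  # [0.25, 0.5]
print(riemannian_gradient([0.9, 0.0], [1.0, 2.0]))  # much smaller near the boundary
```

Near the boundary the same Euclidean gradient produces a much smaller update, which helps keep parameters inside the ball.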

5. Train the network for 10 epochs.

In [None]:
for epoch in range(num_epochs):
    ...

6. Visualize the embeddings with their labels.

In [None]:
embeddings, predictions, labels = [], [], []
with torch.no_grad():
    for data, labels_batch in test_dataloader:
        embeddings_batch = ...
        predictions_batch = ...

In [None]:
df = ...

7. Compare the training time, final performance, and representations with the Euclidean ones!

## Fully Hyperbolic Network

Exercise 2:

1. Define the hyperbolic manifold as in Exercise 1.
2. Immediately after getting data from the `DataLoader`, lift it to the `PoincareBall` as in the previous exercise.
3. Build a fully hyperbolic backbone using the layers `HLinear`, `HConvolution2d`, `HMaxPool2d`, and `HReLU` from `hypll.nn`.
4. Add the classification layer at the end using `src.layers.hmlr.HMLR`.

## Optional: CIFAR-100

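If you got this far, well done!

Repeat the exercises with CIFAR-100 by swapping `torchvision.datasets.CIFAR10` for `torchvision.datasets.CIFAR100` in the data cell. Its 100 classes are grouped into 20 superclasses, so it has a more hierarchical structure, and this is where hyperbolic learning is expected to show clearer benefits.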
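The usual argument for why hierarchies fit hyperbolic space is a counting one: the number of nodes in a tree grows exponentially with depth, and so does the room available at a given radius in hyperbolic space, whereas in Euclidean space it grows only linearly (in 2D). A small sketch with illustrative helper names, assuming curvature -1:

```python
import math


def euclidean_circumference(radius: float) -> float:
    return 2 * math.pi * radius


def hyperbolic_circumference(radius: float) -> float:
    """Circumference of a circle in the hyperbolic plane of curvature -1."""
    return 2 * math.pi * math.sinh(radius)


def nodes_at_depth(branching: int, depth: int) -> int:
    """Number of nodes at a given depth of a complete tree."""
    return branching ** depth


for r in range(1, 6):
    ratio = hyperbolic_circumference(r) / euclidean_circumference(r)
    print(r, nodes_at_depth(2, r), round(ratio, 2))
```

The ratio between the two circumferences grows quickly with the radius, which is what lets a low-dimensional hyperbolic embedding spread out the leaves of a class hierarchy.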