# [Solved] Lab 4 bis: **Invariance** in a *shallow FCN* under data augmentation

Advanced Topics in Machine Learning -- Fall 2023, UniTS

<a target="_blank" href="https://colab.research.google.com/github/ganselmif/adv-ml-units/blob/main/solutions/AdvML_UniTS_2023_Lab_04bis_FCN_Invariance_Solved.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"/></a>

#### High-level overview

In this *Lab*, we will understand how the effect of *data augmentation* (whose effect on learned **weights** has been analyzed in the previous lab) translates to the **representation** learned by the model.

Specifically, we define *representation* the (ordered) set of activations of a *neural network* model, which is dependent on the input, and can be seen as the way the model *sees* the data as a result of the learning process.

To accomplish this goal, we will:

- Load the weights resulting from the training of the model described in the previous lab;
- Learn how to extract the activations of a given layer of the model, in response to a given input;
- Evaluate such activations on mutually-rotated versions of the same input, and compare such activations to assess their *invariance*  with respect to the transformation.

#### Preliminary: adapt and re-run the previous notebook

Before starting to delve into this lab, you should:
- Go back to the previous *Lab* notebook;
- Add the (single line of) code required to save the model weights after training;
- Re-run the notebook, to make sure that the weights are saved correctly;


In [1]:
import torch as th
import torch.nn as nn
import torch.nn.functional as F

from torch.utils.data import DataLoader

from torchvision import datasets
from torchvision import transforms

#### (Re-)definition of the model

Define the exact same model you used in the previous lab, and instantiate it.

In [2]:
# Model definition
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc = nn.Linear(28 * 28, 10)

    def forward(self, x):
        x = x.flatten(start_dim=1)
        x = self.fc(x)
        x = F.log_softmax(x, dim=1)  # More numerically stable than softmax
        return x


# Model instantiation
model = MyModel()

#### Weights loading

Load into the instance of your model the weights you just saved from the adapted notebook.

In [3]:
_ = model.load_state_dict(th.load("./models/rotation_invariant_slfcn.pth"))
model.eval()

MyModel(
  (fc): Linear(in_features=784, out_features=10, bias=True)
)

#### Data preparation

To test for a given transformation invariance, you should have pairs of (test) data obtained from the same image: one original, and one transformed.

**Hint**: if you want to offload the task to already implemented `torchvision.transforms`, notice (to your advantage) that -- since we are just testing the model -- the dataset needs not to be in shuffled order!


In [4]:
# Hyperparameters
BATCH_SIZE = 1024

In [5]:
# Defining transforms
augmentation = transforms.RandomAffine(degrees=(0, 180), translate=None, scale=None)
to_tensor = transforms.ToTensor()
normalization = transforms.Normalize(mean=0.1307, std=0.3081)

# Defining testing data-sets/loaders
test_dataset = datasets.MNIST(
    root="./data",
    train=False,
    # Original: no augmentation
    transform=transforms.Compose([to_tensor, normalization]),
    download=True,
)
test_dataset_rot = datasets.MNIST(
    root="./data",
    train=False,
    # Transformed: augmented
    transform=transforms.Compose([augmentation, to_tensor, normalization]),
    download=True,
)

test_loader = DataLoader(dataset=test_dataset, batch_size=BATCH_SIZE, shuffle=False)
test_loader_rot = DataLoader(dataset=test_dataset, batch_size=BATCH_SIZE, shuffle=False)

#### Activation extraction

Write a function that extracts the activations of a given layer of the model, in response to a given input. Try to remain as generic as possible, since you may need to re-use it in the future.

**Hint**: Look up in the documentation the purpose and features of `hook`s. If you are in trouble, just ask!


In [6]:
# Definition of a hook that outputs layer-specific activations
def get_activations(_x, _model, _name):
    activations = {}

    def get_activation_hook(name):
        def hook(_model, _input, _output):
            _ = _model, _input
            activations[name] = _output.detach()

        return hook

    layer = getattr(_model, _name)
    layer.register_forward_hook(get_activation_hook("name"))
    _ = _model(_x)
    activation = activations["name"]
    return activation

#### Invariance evaluation

Recall the definition of *invariance* of (the result of) function $f$ with respect to transformation $g(\cdot\;; \alpha)$ parametrized by $\alpha$:

$$f(g(x; \alpha))=f(x)\;\;\;\; \forall\alpha$$

With the function and data just defined, compare the activations of the model on the original and transformed versions of the same image. Comment on the results.


In [7]:
differences = []
differences_control = []

control_img = None

for i, (images, label) in enumerate(test_loader):
    # The order is the same due to the fact that both dataloaders are not shuffled!
    (images_rot, label_rot) = test_loader_rot.__iter__().__next__()

    if i == 0:
        control_img = th.randn_like(images)

    act = get_activations(images, model, "fc")
    act_rot = get_activations(images_rot, model, "fc")
    act_control = get_activations(control_img, model, "fc")

    # When the non-augmented dataset is over, the rest of the batch is eventually filled with augmentations
    # This check (and fix) avoids dimension mismatches
    if act.shape[0] != act_rot.shape[0]:
        act_rot = act_rot[: act.shape[0]]
        act_control = act_control[: act.shape[0]]

    differences.append(act - act_rot)
    differences_control.append(act - act_control)

    control_img = images_rot

avg_norm = th.linalg.norm(th.cat(differences, dim=0), dim=1).mean()
avg_norm_control = th.linalg.norm(th.cat(differences_control, dim=0), dim=1).mean()

print(f"Average Euclidean norm of the difference: {avg_norm.item()}")
print(f"Average Euclidean norm of the difference, control: {avg_norm_control.item()}")

Average Euclidean norm of the difference: 11.152886390686035
Average Euclidean norm of the difference, control: 12.165810585021973
