# Homework 2

Reconstruct in PyTorch the first experiment in [Learning representations by back-propagating errors](https://www.nature.com/articles/323533a0) using the learning rule in eq. 8 (gradient descent without momentum). Stay as close as possible to the original protocol, except for the learning rule.
  - Read the paper if you have not done so yet (don’t worry if you don’t understand the other experiments in detail)
  - Create the data, the model and everything else that is needed (do not use dataloaders if you don’t yet know how they work)
  - Train the model
  - Inspect the weights you obtained and check whether they provide a solution to the problem
  - Compare your solution with the one reported in the paper

We will reproduce the experiment in Fig. 1 of the paper.
The goal is to detect mirror symmetry in the input vectors. Since the network has 6 input units, each input vector has 6 elements, each either $0$ or $1$, which gives $2^6 = 64$ possible input vectors.
Since the network has two hidden units, the model has one hidden layer with two nodes, each with its own bias, followed by a single output unit.
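The data set described above can also be built without explicit loops. The following is a minimal sketch, assuming that "mirror symmetry" means a vector equals itself read backwards; the name `is_symmetric` is introduced here for illustration only.

```python
from itertools import product

import torch

# All 2^6 = 64 binary vectors of length 6, one per row.
X = torch.tensor(list(product([0, 1], repeat=6)), dtype=torch.float32)

# A row is mirror-symmetric when it equals itself with the columns reversed.
is_symmetric = (X == X.flip(dims=[1])).all(dim=1)

# Targets as a column vector of 0/1 floats, shape (64, 1).
y = is_symmetric.float().unsqueeze(1)

print(X.shape, y.shape, int(y.sum().item()))
```

Only the first three elements of a symmetric vector are free, so $2^3 = 8$ of the 64 vectors are labelled $1$.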

In [29]:
import torch
from scripts.train_utils import AverageMeter

from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

from itertools import product

In [30]:
# generate all 64 binary vectors of length 6, one per row
X = torch.tensor(list(product([0, 1], repeat=6)), dtype=torch.float32)

In [31]:
length = X.shape[0]  # 64 input vectors
y = torch.zeros((length, 1))
for j in range(length):
    # a vector is mirror-symmetric if each element equals its mirror image
    if all(X[j][i] == X[j][5 - i] for i in range(3)):
        y[j] = 1
        print(X[j])

y.T

tensor([0., 0., 0., 0., 0., 0.])
tensor([0., 0., 1., 1., 0., 0.])
tensor([0., 1., 0., 0., 1., 0.])
tensor([0., 1., 1., 1., 1., 0.])
tensor([1., 0., 0., 0., 0., 1.])
tensor([1., 0., 1., 1., 0., 1.])
tensor([1., 1., 0., 0., 1., 1.])
tensor([1., 1., 1., 1., 1., 1.])


tensor([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,
         1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 1., 0., 0.,
         0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])

In [32]:
train = TensorDataset(X, y)
trainloader = DataLoader(train, batch_size=length, shuffle=False)
epsilon = 0.1  # learning rate

In [33]:
class MLP(torch.nn.Module):
    
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Linear(in_features=6, out_features=2, bias=True)
        self.layer2 = torch.nn.Linear(in_features=2, out_features=1, bias=True)
        
    def forward(self, X):
        out = self.layer1(X)
        out = torch.sigmoid(out)
        out = self.layer2(out)
        out = torch.sigmoid(out)
        return out
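To stay close to the original protocol, the initial weights may also matter. As far as I recall, the caption of Fig. 1 reports initial weights drawn uniformly between -0.3 and 0.3, which should be checked against the paper; `torch.nn.Linear` uses a different default range that depends on the number of input features. The sketch below uses a stand-in `net` with the same 6-2-1 layout rather than the `MLP` class, and the seed is arbitrary.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only to make the sketch reproducible

# Stand-in for the 6-2-1 sigmoid network (same layer sizes as the model above).
net = torch.nn.Sequential(
    torch.nn.Linear(6, 2),
    torch.nn.Sigmoid(),
    torch.nn.Linear(2, 1),
    torch.nn.Sigmoid(),
)

# Assumption: weights and biases start uniform in [-0.3, 0.3].
with torch.no_grad():
    for p in net.parameters():
        p.uniform_(-0.3, 0.3)

num_params = sum(p.numel() for p in net.parameters())
print(num_params)
```

The same loop over `parameters()` would work on an instance of the `MLP` class.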

Training the model:

In [34]:
num_epochs = 1000

model = MLP()
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=epsilon)
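One detail worth checking against the paper is the scale of the loss. The paper defines the total error as $E = \frac{1}{2}\sum_c \sum_j (y_{j,c} - d_{j,c})^2$, summed over all cases, while `torch.nn.MSELoss()` averages by default. The sketch below compares the two on hypothetical outputs; `summed_squared_error` is a helper introduced here, not part of PyTorch.

```python
import torch


def summed_squared_error(y_hat, y):
    """Total error E = 1/2 * sum over cases and output units of (y_hat - y)^2."""
    return 0.5 * ((y_hat - y) ** 2).sum()


# Hypothetical outputs and targets: 64 cases, one output unit.
y_hat = torch.full((64, 1), 0.5)
y = torch.zeros((64, 1))

total = summed_squared_error(y_hat, y)
mean = torch.nn.MSELoss()(y_hat, y)

# For 64 cases and one output, the summed error is 32 times the mean error.
print(total.item(), mean.item(), (total / mean).item())
```

With 64 cases, gradients of the mean loss are 32 times smaller than those of $E$, so the same `epsilon` corresponds to a much smaller effective step when `MSELoss` is used.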

In [35]:
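# custom accuracy
def accuracy(y_hat, y):
    '''
    y_hat is the model output - a Tensor of shape (n x 1) with sigmoid activations in (0, 1)
    y is the ground truth - a Tensor of shape (n x 1) with values 0 or 1
    '''
    # with a single output unit, argmax over dim=1 is always 0, so threshold at 0.5 instead
    classes_prediction = (y_hat >= 0.5).float()
    match_ground_truth = classes_prediction == y # tensor of booleans, shape (n x 1)
    correct_matches = match_ground_truth.sum()
    return (correct_matches / y_hat.shape[0]).item()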
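Tensor shapes need care when computing accuracy for a single sigmoid output. The sketch below uses hypothetical tensors (56 zeros and 8 ones as targets, constant outputs of 0.1) to show that comparing a `(64,)` tensor with `(64, 1)` targets broadcasts to a `(64, 64)` matrix, while thresholding keeps the comparison per sample.

```python
import torch

# Hypothetical targets: 8 ones and 56 zeros, shape (64, 1), as in the symmetry task.
y = torch.zeros((64, 1))
y[:8] = 1
# Hypothetical single-column model outputs, all below 0.5.
y_hat = torch.full((64, 1), 0.1)

# argmax over a single column is always 0 and has shape (64,).
pred_argmax = y_hat.argmax(dim=1)
# Comparing (64,) with (64, 1) broadcasts to a (64, 64) boolean matrix.
broadcast_match = pred_argmax == y
ratio_argmax = (broadcast_match.sum() / y_hat.shape[0]).item()

# Thresholding keeps the (64, 1) shape, so the comparison is per sample.
pred_threshold = (y_hat >= 0.5).float()
ratio_threshold = ((pred_threshold == y).sum() / y_hat.shape[0]).item()

print(tuple(broadcast_match.shape), ratio_argmax, ratio_threshold)
```

In the broadcast case each of the 56 zero targets contributes a full row of 64 matches, so the ratio is 56 × 64 / 64 = 56.0 rather than a fraction; the thresholded version gives 56 / 64 = 0.875 for these tensors.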

In [36]:
def train_epoch(model, dataloader, loss_fn, optimizer, loss_meter, accuracy_meter):
    for X, y in dataloader:
        # 1. reset the gradients previously accumulated by the optimizer
        optimizer.zero_grad()
        # 2. get the predictions from the current state of the model
        #    this is the forward pass
        y_hat = model(X)
        # 3. calculate the loss on the current mini-batch
        loss = loss_fn(y_hat, y)
        # 4. execute the backward pass given the current loss
        loss.backward()
        # 5. update the value of the params
        optimizer.step()
        # 6. calculate the accuracy for this mini-batch
        acc = accuracy(y_hat, y)
        # 7. update the loss and accuracy AverageMeter
        loss_meter.update(val=loss.item(), n=X.shape[0])
        accuracy_meter.update(val=acc, n=X.shape[0])

def train_model(model, dataloader, loss_fn, optimizer, num_epochs):
    model.train()
    for epoch in range(num_epochs):
        loss_meter = AverageMeter()
        accuracy_meter = AverageMeter()
        train_epoch(model, dataloader, loss_fn, optimizer, loss_meter, accuracy_meter)
        print(f"Epoch {epoch+1} completed. Loss - total: {loss_meter.sum} - average: {loss_meter.avg}; Accuracy: {accuracy_meter.avg}")
    return loss_meter.sum, accuracy_meter.avg
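For comparison, the update in eq. 8, $\Delta w = -\varepsilon \, \partial E / \partial w$, can be written out by hand without a dataloader or optimizer. This is a minimal sketch under stated assumptions: the whole data set is used as one batch, the error is the summed squared error rather than `MSELoss`, `net` is a stand-in 6-2-1 network with PyTorch's default initialisation, and `num_sweeps = 200` and the seed are arbitrary choices for a short run.

```python
from itertools import product

import torch

torch.manual_seed(0)  # arbitrary seed for the sketch

# Full data set: 64 binary vectors and their mirror-symmetry labels.
X = torch.tensor(list(product([0, 1], repeat=6)), dtype=torch.float32)
y = (X == X.flip(dims=[1])).all(dim=1).float().unsqueeze(1)

# Stand-in 6-2-1 sigmoid network.
net = torch.nn.Sequential(
    torch.nn.Linear(6, 2), torch.nn.Sigmoid(),
    torch.nn.Linear(2, 1), torch.nn.Sigmoid(),
)

epsilon = 0.1     # learning rate
num_sweeps = 200  # hypothetical short run

losses = []
for sweep in range(num_sweeps):
    y_hat = net(X)                         # forward pass on all 64 cases
    loss = 0.5 * ((y_hat - y) ** 2).sum()  # summed squared error (assumption)
    net.zero_grad()                        # clear gradients from the previous sweep
    loss.backward()                        # accumulate dE/dw over the whole sweep
    with torch.no_grad():
        for p in net.parameters():
            p -= epsilon * p.grad          # eq. 8: plain gradient descent step
    losses.append(loss.item())

print(losses[0], losses[-1])
```

A decreasing loss alone does not show that the symmetry problem is solved; the weights and the thresholded predictions still need to be inspected.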

In [37]:
loss, acc = train_model(model, trainloader, loss_fn, optimizer, num_epochs)
print(f"Training completed - final accuracy {acc} and loss {loss}")

88072967529 - average: 0.10939668864011765; Accuracy: 56.0
Epoch 802 completed. Loss - total: 7.001384258270264 - average: 0.10939662903547287; Accuracy: 56.0
Epoch 803 completed. Loss - total: 7.001380920410156 - average: 0.10939657688140869; Accuracy: 56.0
Epoch 804 completed. Loss - total: 7.001378059387207 - average: 0.10939653217792511; Accuracy: 56.0
Epoch 805 completed. Loss - total: 7.001375198364258 - average: 0.10939648747444153; Accuracy: 56.0
Epoch 806 completed. Loss - total: 7.001372337341309 - average: 0.10939644277095795; Accuracy: 56.0
Epoch 807 completed. Loss - total: 7.001368999481201 - average: 0.10939639061689377; Accuracy: 56.0
Epoch 808 completed. Loss - total: 7.00136661529541 - average: 0.10939635336399078; Accuracy: 56.0
Epoch 809 completed. Loss - total: 7.001363277435303 - average: 0.1093963012099266; Accuracy: 56.0
Epoch 810 completed. Loss - total: 7.001359939575195 - average: 0.10939624905586243; Accuracy: 56.0
Epoch 811 completed. Loss - total: 7.001356

In [38]:
model.state_dict()


OrderedDict([('layer1.weight',
              tensor([[ 0.3007, -0.4176, -0.0744,  0.2343,  0.0907, -0.0131],
                      [-0.2388,  0.1766, -0.0717,  0.4165,  0.2885, -0.2163]])),
             ('layer1.bias', tensor([0.2537, 0.1453])),
             ('layer2.weight', tensor([[-0.1945, -0.5203]])),
             ('layer2.bias', tensor([-1.5240]))])
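To check whether these weights solve the problem, they can be evaluated on all 64 inputs and tested for the property the paper highlights in its solution: for each hidden unit, weights that are symmetric about the middle of the input vector are equal in magnitude and opposite in sign. The sketch below copies the values from the printed `state_dict`; `residual` is a name introduced here, and it would be close to zero for a solution of that form.

```python
from itertools import product

import torch

# Weights copied from the state_dict printed above.
W1 = torch.tensor([[ 0.3007, -0.4176, -0.0744,  0.2343,  0.0907, -0.0131],
                   [-0.2388,  0.1766, -0.0717,  0.4165,  0.2885, -0.2163]])
b1 = torch.tensor([0.2537, 0.1453])
W2 = torch.tensor([[-0.1945, -0.5203]])
b2 = torch.tensor([-1.5240])

# Sum of each weight with its mirror partner; near zero if they are opposite.
residual = W1[:, :3] + W1[:, 3:].flip(dims=[1])

# Evaluate the network on the full data set.
X = torch.tensor(list(product([0, 1], repeat=6)), dtype=torch.float32)
y = (X == X.flip(dims=[1])).all(dim=1).float().unsqueeze(1)
hidden = torch.sigmoid(X @ W1.T + b1)
out = torch.sigmoid(hidden @ W2.T + b2)

pred = (out >= 0.5).float()
acc = (pred == y).float().mean().item()

print(residual)
print(out.max().item(), acc)
```

For these weights the residual is far from zero and every output stays below 0.5, so the network labels all 64 vectors as non-symmetric. That gives 56 of 64 correct, which is the majority-class baseline and not a solution to the symmetry problem.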