In [1]:
import torch
from torch import nn
import torch.nn.functional as F
import torch.utils.data as data_utils
from itertools import product

## Homework 2 - Mirror symmetry detection
*Lorenzo Basile*

The aim of this notebook is to build and train a $2$-layer perceptron that detects mirror symmetry in sequences of $6$ bits, as described in [this](https://www.iro.umontreal.ca/~vincentp/ift3395/lectures/backprop_old.pdf) paper.  
The architecture is very simple: the $6$ input bits are fed into a hidden linear layer of $2$ neurons, whose outputs are then processed by a single output neuron. All three neurons in the network have a bias and a sigmoid activation.

In [2]:
class NN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(in_features=6, out_features=2, bias=True)
        self.layer2 = nn.Linear(in_features=2, out_features=1, bias=True)
    def forward(self, x):
        out = torch.sigmoid(self.layer1(x))
        out = torch.sigmoid(self.layer2(out))
        return out
net=NN()
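
As a quick sanity check on the architecture, the sketch below rebuilds the same 6-2-1 layout with `nn.Sequential` and counts its parameters. The name `sketch_net` is a hypothetical stand-in introduced only for this illustration; it is not the `net` object used in the rest of the notebook.

```python
import torch
from torch import nn

# Hypothetical stand-in with the same 6-2-1 layout and sigmoid activations.
sketch_net = nn.Sequential(
    nn.Linear(6, 2), nn.Sigmoid(),
    nn.Linear(2, 1), nn.Sigmoid(),
)

# 6*2 weights + 2 biases in the hidden layer, 2*1 weights + 1 bias in the output layer.
n_params = sum(p.numel() for p in sketch_net.parameters())

# A batch of 4 all-zero sequences gives one sigmoid output per sequence.
out = sketch_net(torch.zeros(4, 6))
print(n_params, tuple(out.shape))
```

The network is tiny: every trainable value fits in the four tensors printed in the cells below.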

To be consistent with the original paper, all parameters are initialized uniformly in $[-0.3, 0.3]$.

In [3]:
for p in net.parameters():
    torch.nn.init.uniform_(p, a=-0.3, b=0.3)
    print(p)

Parameter containing:
tensor([[-0.2928, -0.0865,  0.0185, -0.0434, -0.2614,  0.0473],
        [-0.2341, -0.0445, -0.2855, -0.1742, -0.1498, -0.1514]],
       requires_grad=True)
Parameter containing:
tensor([0.2702, 0.0345], requires_grad=True)
Parameter containing:
tensor([[ 0.1845, -0.0547]], requires_grad=True)
Parameter containing:
tensor([-0.1955], requires_grad=True)
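
The printed values change at every run because the initialization is random. A minimal sketch of how the draw could be made reproducible, assuming a fixed seed is acceptable for the experiment (the seed value `0` and the helper name `init_uniform` are arbitrary choices made here, not part of the original notebook):

```python
import torch
from torch import nn

def init_uniform(module, seed, bound=0.3):
    # Seed the global generator, then redraw every parameter from U(-bound, bound).
    torch.manual_seed(seed)
    for p in module.parameters():
        nn.init.uniform_(p, a=-bound, b=bound)
    return [p.detach().clone() for p in module.parameters()]

layer = nn.Linear(6, 2)
first = init_uniform(layer, seed=0)
second = init_uniform(layer, seed=0)

# The same seed yields the same draw, and every value stays within the bounds.
same_draw = all(torch.equal(a, b) for a, b in zip(first, second))
within_bounds = all(bool((t.abs() <= 0.3).all()) for t in first)
print(same_draw, within_bounds)
```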


The following function defines mirror symmetry: a sequence (a 1-D `torch.Tensor`) is mirrored if it is equal to its own reversal.

In [4]:
def mirrored(seq):
    # 1.0 if the sequence equals its reversal along dimension 0, else 0.0
    return float(torch.equal(seq, torch.flip(seq, [0])))

Data and labels are generated here: each data point is assigned label `1` if it has mirror symmetry, `0` otherwise.

In [5]:
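# all 2**6 = 64 binary sequences of length 6, one per row
x = torch.tensor(list(product(range(2), repeat=6)), dtype=torch.float32)
# one label per sequence, stored as a column vector to match the network output
y = torch.zeros((len(x), 1))
for i in range(len(x)):
    y[i] = mirrored(x[i])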
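
The same labels can also be computed without an explicit loop. The sketch below is one possible vectorized formulation; `x_all`, `y_all` and `n_mirror` are names introduced only for this illustration.

```python
import torch
from itertools import product

# All 64 binary sequences of length 6, one per row.
x_all = torch.tensor(list(product(range(2), repeat=6)), dtype=torch.float32)

# A row is mirror-symmetric when it equals its own reversal along dimension 1.
y_all = (x_all == x_all.flip(1)).all(dim=1, keepdim=True).float()

# Only the first 3 bits are free in a mirrored sequence, so 2**3 rows are positive.
n_mirror = int(y_all.sum().item())
print(tuple(y_all.shape), n_mirror)
```

The dataset is therefore strongly imbalanced, with 8 positive and 56 negative samples.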

To make training easier, a `DataLoader` object is created. `batch_size` is set to $64$, so that a single mini-batch contains the whole dataset ($2^6 = 64$ sequences); this removes the stochasticity of the SGD algorithm.

In [6]:
train = data_utils.TensorDataset(x, y)
train_loader = data_utils.DataLoader(train, batch_size=64, shuffle=False)
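
To illustrate the full-batch setting, the sketch below builds a loader of the same shape on placeholder tensors (`xs` and `ys` are dummy all-zero data, used here only to check sizes) and confirms that one epoch consists of a single batch.

```python
import torch
import torch.utils.data as data_utils

# Placeholder data with the same shapes as the real inputs and labels.
xs = torch.zeros(64, 6)
ys = torch.zeros(64, 1)

loader = data_utils.DataLoader(
    data_utils.TensorDataset(xs, ys), batch_size=64, shuffle=False
)

# With batch_size equal to the dataset size, each epoch yields exactly one batch.
n_batches = len(loader)
first_inputs, first_labels = next(iter(loader))
print(n_batches, tuple(first_inputs.shape), tuple(first_labels.shape))
```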

Training is performed with the same settings presented in the paper, using `lr=0.1`, `momentum=0.9` and `MSELoss`.

In [7]:
optimizer = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9)
loss = torch.nn.MSELoss()
for epoch in range(10000):
    # the loader yields a single batch holding all 64 samples
    for seq, label in train_loader:
        out = net(seq)
        l = loss(out, label)
        optimizer.zero_grad()
        l.backward()
        optimizer.step()
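
One detail that may matter when comparing with a reference setup: `MSELoss` averages the squared error over all elements by default (`reduction='mean'`). If the reference minimizes a summed squared error instead (an assumption to be checked against the paper, not something verified here), the same `lr` corresponds to a much smaller effective step. The sketch below only demonstrates the scaling between the two reductions, on random placeholder data and a hypothetical stand-in `model`.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in network and random placeholder data (64 samples, 1 output each).
model = nn.Sequential(nn.Linear(6, 2), nn.Sigmoid(), nn.Linear(2, 1), nn.Sigmoid())
inputs = torch.rand(64, 6)
targets = torch.randint(0, 2, (64, 1)).float()

def weight_grad(reduction):
    # Gradient of the first-layer weights under the given MSELoss reduction.
    model.zero_grad()
    nn.MSELoss(reduction=reduction)(model(inputs), targets).backward()
    return model[0].weight.grad.clone()

g_mean = weight_grad("mean")
g_sum = weight_grad("sum")

# With 64 target elements, the summed loss produces gradients 64 times larger.
ratio_ok = torch.allclose(g_sum, 64 * g_mean, rtol=1e-4, atol=1e-6)
print(ratio_ok)
```

Whether this scaling explains the result reported below is left open; the sketch does not retrain the network.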

Despite running for $10000$ epochs, this training loop is not sufficient to reach high accuracy: all $8$ mirror-symmetric samples are misclassified.

In [8]:
print("Misclassified samples: ", torch.sum(torch.abs((net(x)>0.5).float()-y)).item())

Misclassified samples:  8.0
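
A single error count does not say which class is being missed. The sketch below splits errors into false negatives and false positives; `error_breakdown` is a hypothetical helper, and the predictor used here is a constant "never mirrored" baseline rather than the trained network.

```python
import torch
from itertools import product

def error_breakdown(pred, target):
    # pred and target are 0/1 (or boolean) tensors of the same shape.
    pred, target = pred.bool(), target.bool()
    false_neg = int((~pred & target).sum())   # mirrored, predicted not mirrored
    false_pos = int((pred & ~target).sum())   # not mirrored, predicted mirrored
    return false_neg, false_pos

x_all = torch.tensor(list(product(range(2), repeat=6)), dtype=torch.float32)
y_all = (x_all == x_all.flip(1)).all(dim=1, keepdim=True)

# Baseline that labels every sequence as not mirrored.
always_zero = torch.zeros_like(y_all)
fn, fp = error_breakdown(always_zero, y_all)
print(fn, fp)
```

A predictor that always outputs `0` makes exactly 8 errors on this dataset, all of them on mirrored sequences, which is the same error count reported above.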


Accordingly, the learned parameters do not show the symmetry properties described in the paper.

In [9]:
for p in net.parameters():
    print(p)

Parameter containing:
tensor([[-0.1932, -0.1070,  0.0657, -0.0578, -0.1712,  0.0142],
        [ 0.0114, -0.0198, -0.1391, -0.0313,  0.0106, -0.1275]],
       requires_grad=True)
Parameter containing:
tensor([0.2684, 0.1342], requires_grad=True)
Parameter containing:
tensor([[-0.3596, -0.4916]], requires_grad=True)
Parameter containing:
tensor([-1.5183], requires_grad=True)
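
The lack of symmetry can be quantified. The sketch below assumes the property in question is that hidden-layer weights at mirrored input positions are equal in magnitude and opposite in sign; the helper `antisymmetry_gap` and the `ideal` weights are made up for illustration, while `trained` copies the first-layer weights printed above.

```python
import torch

def antisymmetry_gap(weight):
    # Largest |w[:, i] + w[:, n-1-i]|; zero when mirrored weights cancel exactly.
    return (weight + weight.flip(1)).abs().max().item()

# First-layer weights copied from the printed output above.
trained = torch.tensor([
    [-0.1932, -0.1070,  0.0657, -0.0578, -0.1712,  0.0142],
    [ 0.0114, -0.0198, -0.1391, -0.0313,  0.0106, -0.1275],
])

# Made-up example of a perfectly antisymmetric weight row.
ideal = torch.tensor([[1.0, -2.0, 4.0, -4.0, 2.0, -1.0]])

gap_trained = antisymmetry_gap(trained)
gap_ideal = antisymmetry_gap(ideal)
print(gap_trained, gap_ideal)
```

For the trained weights the gap is of the same order as the weights themselves, so no approximate antisymmetry has emerged.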
