## A.I. Assignment 5

## Learning Goals

By the end of this lab, you should be able to:
* Work more comfortably with tensors in PyTorch
* Create a simple multilayer perceptron model with PyTorch
* Visualise the parameters


### Task

Build a fully connected feed-forward network that adds two bits. Determine a proper architecture for this network (what dataset do you use for this problem? how many layers? how many neurons in each layer? what is the activation function? what is the loss function? etc.)

Create at least 3 such networks and compare their performance (how accurate are they? how fast do they train to reach an accuracy of 1?)

For the best one, display the weights of each layer.
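
Adding two bits is the half-adder function: the high output bit is the carry (the AND of the inputs) and the low output bit is the sum (the XOR of the inputs). As a minimal sketch of the truth table the network has to learn, in plain Python (the helper name `add_bits` is made up for illustration and is not part of the assignment):

```python
def add_bits(a, b):
    """Return (carry, sum) for two input bits a and b."""
    total = a + b
    return (total >> 1) & 1, total & 1  # high bit, low bit of the sum


# Enumerate all four input combinations and their two-bit results.
truth_table = [((a, b), add_bits(a, b)) for a in (0, 1) for b in (0, 1)]
for bits, result in truth_table:
    print(bits, "->", result)
# (0, 0) -> (0, 0)
# (0, 1) -> (0, 1)
# (1, 0) -> (0, 1)
# (1, 1) -> (1, 0)
```

Because the low bit is an XOR, which is not linearly separable, a network without a hidden layer cannot represent this mapping.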


In [1]:
import torch
import torch.nn as nn
import torch.optim as optim

In [2]:
class SimpleNet(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.fc1 = nn.Linear(2, hidden_size)  # Input to hidden layer
        self.fc2 = nn.Linear(hidden_size, 2)  # Hidden to output layer

    def forward(self, x):
        x = torch.relu(self.fc1(x))  # ReLU activation on hidden layer
        x = torch.sigmoid(self.fc2(x))  # Sigmoid activation on output layer
        return x
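
When comparing architectures it can help to know how many trainable parameters each one has. The sketch below repeats the `SimpleNet` definition so that it runs on its own, then counts parameters with `numel()`. The helper name `count_parameters` and the variable names are assumptions for illustration. For a hidden size h, the first layer holds 2h weights and h biases and the second layer holds 2h weights and 2 biases, so the total is 5h + 2.

```python
import torch
import torch.nn as nn


class SimpleNet(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.fc1 = nn.Linear(2, hidden_size)  # Input to hidden layer
        self.fc2 = nn.Linear(hidden_size, 2)  # Hidden to output layer

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        return torch.sigmoid(self.fc2(x))


def count_parameters(model):
    """Hypothetical helper: total number of trainable scalars in the model."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


param_counts = {h: count_parameters(SimpleNet(h)) for h in (2, 4, 8)}
print(param_counts)  # {2: 12, 4: 22, 8: 42}

# A batch of four 2-bit inputs yields four 2-bit outputs, each value in [0, 1].
batch_output = SimpleNet(4)(torch.zeros(4, 2))
print(batch_output.shape)  # torch.Size([4, 2])
```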

In [4]:
# Create the dataset for bit addition
inputs = torch.tensor([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=torch.float32)
# The outputs are the binary representations of the sums: 0, 1, 1, 2
targets = torch.tensor([[0, 0], [0, 1], [0, 1], [1, 0]], dtype=torch.float32)

In [6]:
print(inputs, '\n', targets)

tensor([[0., 0.],
        [0., 1.],
        [1., 0.],
        [1., 1.]]) 
 tensor([[0., 0.],
        [0., 1.],
        [0., 1.],
        [1., 0.]])


In [8]:
hidden_sizes = [2, 4, 8]
nets = [SimpleNet(h) for h in hidden_sizes]

criterions = [nn.BCELoss() for _ in hidden_sizes]
optimizers = [optim.SGD(net.parameters(), lr=0.1) for net in nets]

In [9]:
for net, optimizer, criterion in zip(nets, optimizers, criterions):
    for epoch in range(1000):  # Fixed training budget of 1000 epochs per network
        # One SGD step per sample (batch size 1); `x` avoids shadowing the built-in `input`
        for x, target in zip(inputs, targets):
            optimizer.zero_grad()
            output = net(x)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()

    # After training, evaluate the accuracy
    with torch.no_grad():
        predictions = net(inputs)  # Get predictions for all inputs
        predicted_classes = predictions.round()  # Round to get binary results
        correct = (predicted_classes == targets).all(dim=1).sum().item()  # A sample counts as correct only if both output bits match
        accuracy = correct / len(targets)
        print(f'Network with {net.fc1.out_features} hidden neurons, Accuracy: {accuracy}')

    # Print the weights of every network that reaches perfect accuracy
    if accuracy == 1:
        print(f'Weights for best network (Hidden Size: {net.fc1.out_features}):')
        print('Input to Hidden layer weights:', net.fc1.weight.data)
        print('Hidden to Output layer weights:', net.fc2.weight.data)

Network with 2 hidden neurons, Accuracy: 0.75
Network with 4 hidden neurons, Accuracy: 1.0
Weights for best network (Hidden Size: 4):
Input to Hidden layer weights: tensor([[ 2.7770,  1.7384],
        [ 2.9621, -2.9627],
        [ 0.0780,  0.4133],
        [-2.4697,  2.6182]])
Hidden to Output layer weights: tensor([[ 3.4243, -2.2748, -0.4655, -1.3612],
        [-1.3613,  3.4848, -0.4568,  3.2856]])
Network with 8 hidden neurons, Accuracy: 1.0
Weights for best network (Hidden Size: 8):
Input to Hidden layer weights: tensor([[ 1.6887,  3.0466],
        [ 1.5347, -0.7049],
        [-2.2679, -1.6738],
        [-0.1763,  0.2534],
        [ 0.5267, -0.5622],
        [-0.7716, -1.1482],
        [ 0.3277, -0.9243],
        [-2.9787,  2.9780]])
Hidden to Output layer weights: tensor([[ 2.6711e+00, -1.5718e-03, -1.4445e+00,  3.2675e-01,  2.9000e-01,
         -2.3540e+00, -1.1211e+00, -1.8241e+00],
        [-2.7431e+00,  1.6880e+00, -3.2240e+00, -3.0509e-01,  1.2041e-01,
          8.2744e-02,  8