
## Homework

1. Complete the Python implementation of the backpropagation exercise in the **Backpropagation** section above (cell `# try it in Python as homework!`)
    - Compute $y$ in PyTorch **using only PyTorch methods and routines**
    - Calculate the gradient
    - Check the values of the gradient and verify that they agree with the manual calculations
2. Given the multilayer perceptron defined during the exercises from lab 1:
    - Create 10 random datapoints (with any function you wish, it can be `rand`, `randn`...) and feed them into the network
    - Given the output, calculate the Cross-Entropy loss with respect to the ground truth $[1,2,3,4,1,2,3,4,1,2]$ (classes from 1 to 4). Cross-Entropy loss:
        
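        $$ CE(\mathbf{y}, \hat{\mathbf{y}}) = - \sum_{i=1}^{10} \mathbf{y}_i \cdot \log(\hat{\mathbf{y}}_i)$$
        
        where $\mathbf{y}_i$ is the one-hot encoding of the label of the $i$-th datapoint, $\hat{\mathbf{y}}_i$ is the corresponding network output, and the logarithm is applied element-wise. For instance, $\mathbf{y}_1 = [1,0,0,0]$.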
        **_Note: there is an extremely handy PyTorch function for getting a one-hot encoding out of a vector, so don't try anything fancy._**
    - Backpropagate the error along the network and inspect the gradient of the parameters connecting the input layer and the first hidden layer.
3. Execute the Python script `utils/randomized_backpropagation_formula.py`. This creates a formula $f(\mathbf{x})$ with randomized operators and values. Create the computational graph from this formula, do the forward pass by hand, then calculate $\nabla f(\mathbf{x})$ by hand using the backward gradient computation. Repeat the calculation in PyTorch to check the correctness of your results. _Note: The formula created by this script is linked to your name and surname, which you have to enter first_. The solution to this exercise _should_ be submitted as a scan or good-quality picture of a piece of paper (or you can do it on a touch screen and submit the image), but other formats are acceptable as well.


## Ex 1

_From the lab lesson_

**Backpropagation**

Let us suppose we have the following calculation

$\mathbf{x} = [1,~2,~-1,~3,~5]$

$y = f(\mathbf{x}) = \log\{[\exp (x_1 * x_2 )]^2 + \sin (x_3 + x_4 + x_5) \cdot x_5\}$


Find

$\nabla f(\mathbf{x})$

In [8]:
import torch
x = torch.tensor([1.,2., -1., 3., 5.], requires_grad=True)
y = torch.log(torch.exp(x[0]*x[1])**2 + torch.sin(x[2]+x[3]+x[4])*x[4])

y.backward()

print("gradient",x.grad)
print("y", y)

gradient tensor([3.7730, 1.8865, 0.0651, 0.0651, 0.0765])
y tensor(4.0584, grad_fn=<LogBackward0>)


**The analytical calculation of the gradient is in the separate attachment.**
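
As a complement to the handwritten derivation, the closed-form gradient can also be compared with autograd directly. The sketch below is only an illustration: the names `s`, `e`, `u` and `manual_grad` are introduced here and are not part of the original notebook. It uses $[\exp(x_1 x_2)]^2 = \exp(2 x_1 x_2)$ and the chain rule on the logarithm.

```python
import torch

x = torch.tensor([1., 2., -1., 3., 5.], requires_grad=True)
y = torch.log(torch.exp(x[0] * x[1]) ** 2 + torch.sin(x[2] + x[3] + x[4]) * x[4])
y.backward()

# Closed-form gradient obtained with the chain rule (names are illustrative).
with torch.no_grad():
    x1, x2, x3, x4, x5 = x
    s = x3 + x4 + x5                  # argument of the sine
    e = torch.exp(2 * x1 * x2)        # [exp(x1 * x2)]^2
    u = e + torch.sin(s) * x5         # argument of the logarithm
    manual_grad = torch.stack([
        2 * x2 * e,                          # d u / d x1
        2 * x1 * e,                          # d u / d x2
        torch.cos(s) * x5,                   # d u / d x3
        torch.cos(s) * x5,                   # d u / d x4
        torch.cos(s) * x5 + torch.sin(s),    # d u / d x5
    ]) / u                                   # d log(u) / d u = 1 / u

# Compare the manual result with the gradient computed by autograd.
print(torch.allclose(x.grad, manual_grad, rtol=1e-4))
```

Each entry of `manual_grad` is the partial derivative of the logarithm's argument divided by that argument, which is the structure the manual calculation should reproduce.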

## Ex 2

In [2]:
# MLP class from the previous assignment
class MyMLP(torch.nn.Module):
    def __init__(self,bias = True):
        super().__init__()
        self.layers = torch.nn.Sequential(
            torch.nn.Linear(5,11,bias=bias),
            torch.nn.ReLU(),
            torch.nn.Linear(11,16,bias=bias),
            torch.nn.ReLU(),
            torch.nn.Linear(16,13,bias=bias),
            torch.nn.ReLU(),
            torch.nn.Linear(13,8,bias=bias),
            torch.nn.ReLU(),
            torch.nn.Linear(8,4,bias=bias),
            torch.nn.Softmax(dim=1)
        )
    def forward(self,x):
        return self.layers(x)

In [7]:
from torch.nn.functional import one_hot
n = 10
x = torch.rand([n,5])
# one_hot expects labels from 0 to num_classes-1, so the ground truth
# (classes 1 to 4) is shifted down by one before encoding.
# Calling one_hot on the unshifted labels with num_classes=4 raises an error,
# because class 4 is out of range.
labels = torch.tensor([1,2,3,4,1,2,3,4,1,2]) - 1
y = one_hot(labels, num_classes=4)
print(y)
model = MyMLP(bias = False)

tensor([[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
        [1, 0, 0, 0],
        [0, 1, 0, 0]])
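
The label shift can be illustrated in isolation. The following is a minimal sketch, assuming only `torch.nn.functional.one_hot`; the flag `raised` is a hypothetical helper used here to record whether the unshifted call fails.

```python
import torch
from torch.nn.functional import one_hot

labels = torch.tensor([1, 2, 3, 4, 1, 2, 3, 4, 1, 2])

# one_hot expects class indices in [0, num_classes - 1], so shift 1..4 to 0..3.
y = one_hot(labels - 1, num_classes=4)

# With the unshifted labels, class 4 is out of range for num_classes=4.
try:
    one_hot(labels, num_classes=4)
    raised = False
except RuntimeError:
    raised = True

print(y.shape, raised)
```

Passing `num_classes` explicitly also guards against the case where the highest class happens to be missing from the batch, in which case the inferred number of columns would be too small.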


In [13]:
def CrossEntropyLoss(y_model : torch.Tensor, y_true : torch.Tensor):
    assert y_model.shape[0] == y_true.shape[0], f"Mismatched number of examples got: y_model -> {y_model.shape[0]} y_true -> {y_true.shape[0]} "
    assert y_model.shape[1] == y_true.shape[1], f"Mismatched number of classes got: y_model -> {y_model.shape[1]} y_true -> {y_true.shape[1]} "

    return -torch.sum( y_true * torch.log(y_model))


In [14]:
y_model = model(x)
print(y_model)

tensor([[0.2496, 0.2492, 0.2528, 0.2484],
        [0.2493, 0.2490, 0.2537, 0.2480],
        [0.2498, 0.2496, 0.2513, 0.2493],
        [0.2502, 0.2487, 0.2534, 0.2477],
        [0.2493, 0.2490, 0.2535, 0.2482],
        [0.2497, 0.2492, 0.2528, 0.2482],
        [0.2502, 0.2491, 0.2524, 0.2483],
        [0.2500, 0.2488, 0.2532, 0.2480],
        [0.2496, 0.2491, 0.2524, 0.2489],
        [0.2502, 0.2493, 0.2515, 0.2490]], grad_fn=<SoftmaxBackward0>)


In [15]:
loss = CrossEntropyLoss(y_model,y)
print(f"Loss: {loss}")

Loss: 13.881470680236816
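
As a sanity check, the summed one-hot formulation can be compared with PyTorch's built-in losses. The sketch below is an illustration under stated assumptions: `logits` is a hypothetical tensor of pre-softmax scores (the network above only exposes softmax outputs), and `reduction="sum"` is chosen to match the sum in the formula.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(10, 4)          # hypothetical pre-softmax scores
labels = torch.tensor([1, 2, 3, 4, 1, 2, 3, 4, 1, 2]) - 1
y_true = F.one_hot(labels, num_classes=4)
probs = torch.softmax(logits, dim=1)

# Hand-written cross-entropy on probabilities, as in the formula.
manual = -torch.sum(y_true * torch.log(probs))

# Built-in equivalents: cross_entropy takes logits, nll_loss takes log-probabilities.
builtin = F.cross_entropy(logits, labels, reduction="sum")
via_nll = F.nll_loss(torch.log(probs), labels, reduction="sum")

print(manual.item(), builtin.item(), via_nll.item())
```

Note that `F.cross_entropy` applies the log-softmax itself, so feeding it the softmax outputs of the network instead of logits would not give the same value.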


In [16]:
#backward gradient calculation on the loss
loss.backward()


In [9]:
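# getting the first layer parameters (weights between the input and the first hidden layer)
# note: .grad is None until loss.backward() has been called on this model,
# so this cell must be executed after the backward pass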
first_layer = list(model.parameters())[0]
print(first_layer.grad)

None
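
A parameter's `.grad` attribute stays `None` until a backward pass has populated it, which is consistent with the `None` printed above if that cell ran before `loss.backward()`. The sketch below illustrates this on a smaller, hypothetical stand-in network (`net` is not the `MyMLP` from the notebook), with the same input size and number of classes.

```python
import torch

torch.manual_seed(0)

# Hypothetical two-layer stand-in for the MLP, without biases.
net = torch.nn.Sequential(
    torch.nn.Linear(5, 11, bias=False),
    torch.nn.ReLU(),
    torch.nn.Linear(11, 4, bias=False),
    torch.nn.Softmax(dim=1),
)
first_weight = list(net.parameters())[0]   # weights input -> first hidden layer

grad_before = first_weight.grad            # no backward pass yet

probs = net(torch.rand(10, 5))
labels = torch.tensor([1, 2, 3, 4, 1, 2, 3, 4, 1, 2]) - 1
# Summed cross-entropy: pick the probability of the true class for each row.
loss = -torch.log(probs[torch.arange(10), labels]).sum()
loss.backward()

grad_after = first_weight.grad             # now a tensor shaped like the weight

print(grad_before)
print(grad_after.shape)
```

The gradient has the same shape as the weight matrix it belongs to, so for a `Linear(5, 11)` layer it has 11 rows and 5 columns.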


## Ex 3

Formula given by the script for the input `Francesco Tomba`:

$f(\mathbf{x}) = \tan((\text{ReLU}(x_1 * x_2) - \text{atan}(x_3 - x_4)) / x_5)$

Values given by the script: 

$\mathbf{x} = [-2, 2, 3, -2, 1]$

**The first part (the calculation by hand) is in the attachment.**

In [11]:
x = torch.tensor([-2., 2., 3., -2., 1.], requires_grad=True)
# the whole difference is divided by x[4], as in the formula
y = torch.tan(((x[0]*x[1]).relu() - (x[2] - x[3]).atan()) / x[4])

y.backward()
print("y",y)
print("gradient",x.grad)

y tensor(-5.0000, grad_fn=<TanBackward0>)
gradient tensor([ 0.0000,  0.0000, -1.0000,  1.0000, 35.7084])
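
The backward pass can also be traced node by node in plain Python and compared with autograd. This is a minimal sketch that follows the formula $\tan((\text{ReLU}(x_1 * x_2) - \text{atan}(x_3 - x_4)) / x_5)$ as stated above; the intermediate node names (`a`, `b`, `c`, `d`, `e`, `g`) are illustrative and need not match the graph in the attachment.

```python
import math
import torch

x1, x2, x3, x4, x5 = -2.0, 2.0, 3.0, -2.0, 1.0

# Forward pass through the computational graph.
a = x1 * x2            # product node
b = max(a, 0.0)        # ReLU
c = x3 - x4            # difference node
d = math.atan(c)
e = b - d
g = e / x5
f = math.tan(g)

# Backward pass: propagate df/d(node) from the output to the inputs.
df_dg = 1.0 / math.cos(g) ** 2            # derivative of tan is 1 / cos^2
df_de = df_dg / x5
df_dx5 = df_dg * (-e / x5 ** 2)
df_db = df_de
df_dd = -df_de
df_dc = df_dd / (1.0 + c ** 2)            # derivative of atan
df_da = df_db * (1.0 if a > 0 else 0.0)   # ReLU passes gradient only if a > 0
manual_grad = [df_da * x2, df_da * x1, df_dc, -df_dc, df_dx5]

# Same calculation with autograd.
x = torch.tensor([x1, x2, x3, x4, x5], requires_grad=True)
y = torch.tan(((x[0] * x[1]).relu() - (x[2] - x[3]).atan()) / x[4])
y.backward()

print(torch.allclose(x.grad, torch.tensor(manual_grad), rtol=1e-4))
```

Because $x_1 x_2 < 0$ here, the ReLU blocks the gradient, which is why the first two components of the gradient are zero.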
