In [1]:
import torch

# Task 1
You are given the following neural network model parametrized by weight vector $\mathbf{w}$ and bias $b$. The model takes a vector $\mathbf{x}$ as input and outputs a scalar $y$.
$$
y = \sin{(\mathbf{w}^T\mathbf{x})} - b,
$$
where:
$$
\mathbf{x} = \begin{bmatrix} 2 & 1 \end{bmatrix}, \mathbf{w} = \begin{bmatrix} \frac{\pi}{2} & \pi \end{bmatrix}, b = 1, \hat{y} = 2.
$$

- 1) Draw the computational graph of the forward pass of this small neural network
- 2) Compute the forward pass with the initial weights $\mathbf{w}$ and the input feature vector $\mathbf{x}$
- 3) Calculate the gradient of the output $y$ with respect to the weights $\mathbf{w}$, i.e. $\frac{\partial y}{\partial \mathbf{w}}$
- 4) Use the $\text{L}_{2}$ loss (mean squared error) to compute the loss between the forward prediction $y$ and the label $\hat{y}$. Add the loss to the computational graph.
- 5) Use the chain rule to compute the gradient $\frac{\partial\mathcal{L}}{\partial\mathbf{w}}$ and update the weights with learning rate $\alpha = 0.5$

Do all these computations first with pen and paper and then implement them in Python code.
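
As a pen-and-paper reference for steps 2) to 5), one possible derivation is sketched here. It assumes the loss is $\mathcal{L} = (\hat{y} - y)^2$ and introduces the intermediate symbol $z = \mathbf{w}^T\mathbf{x}$ and the name $\mathbf{w}_{\text{new}}$, neither of which appears in the task statement.
$$
\begin{aligned}
z &= \mathbf{w}^T\mathbf{x} = \tfrac{\pi}{2}\cdot 2 + \pi\cdot 1 = 2\pi, \\
y &= \sin(z) - b = 0 - 1 = -1, \\
\frac{\partial y}{\partial \mathbf{w}} &= \cos(z)\,\mathbf{x} = \begin{bmatrix} 2 & 1 \end{bmatrix}, \\
\mathcal{L} &= (\hat{y} - y)^2 = (2 - (-1))^2 = 9, \\
\frac{\partial \mathcal{L}}{\partial \mathbf{w}} &= \frac{\partial \mathcal{L}}{\partial y}\,\frac{\partial y}{\partial \mathbf{w}} = -2(\hat{y} - y)\cos(z)\,\mathbf{x} = \begin{bmatrix} -12 & -6 \end{bmatrix}, \\
\mathbf{w}_{\text{new}} &= \mathbf{w} - \alpha\,\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = \begin{bmatrix} \frac{\pi}{2} + 6 & \pi + 3 \end{bmatrix} \approx \begin{bmatrix} 7.571 & 6.142 \end{bmatrix}.
\end{aligned}
$$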

In [10]:
def task1():
    ### Define initial parameters
    x = torch.tensor([2,1],dtype=torch.float32)
    w = torch.tensor([torch.pi/2, torch.pi],dtype=torch.float32,requires_grad=True)
    b = 1
    y_label = 2
    
    """ Note: Think about dimensions of initial parameters and order of operations """
    
    # model forward pass y = sin(w^T x) - b; w and x are 1-D, so w^T x is a dot product
    y = torch.sin(torch.dot(w, x)) - b
    
    """ Note: Beware of backward passes when calculating it for both y and L. You need to do it separately """
    
    # gradient of the output alone, dy/dw; retain_graph keeps the graph for the loss backward pass
    (dy_dw,) = torch.autograd.grad(y, w, retain_graph=True)
    
    # calculate loss and make backward pass; this stores dL/dw in w.grad
    L2 = (y_label - y)**2
    L2.backward()
    
    # Update weights with learning rate alpha
    alpha = 0.5
    with torch.inference_mode():
        w -= alpha * w.grad

    print(f"2): Feed-forward pass result is : {y.data}")
    print(f"3): Output gradients dy/dw are : {dy_dw}")
    print(f"4): L2 loss result is : {L2.data}")
    print(f"5): Loss gradients dL/dw are : {w.grad}")
    print(f"5): Updated weights are : {w.data}")

In [11]:
task1()

2): Feed-forward pass result is : -0.9999998211860657
3): Output gradients dy/dw are : tensor([2., 1.])
4): L2 loss result is : 8.999999046325684
5): Loss gradients dL/dw are : tensor([-12.0000,  -6.0000])
5): Updated weights are : tensor([7.5708, 6.1416])
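
The Task 1 numbers can also be cross-checked without autograd by writing the chain rule out explicitly. The sketch below uses only the standard library; the function name `task1_by_hand` and its argument names are made up for illustration, and the loss is assumed to be the squared error $(\hat{y} - y)^2$.

```python
import math

def task1_by_hand(x, w, b, y_label, alpha):
    """Forward pass, both gradients and one gradient-descent step via the chain rule."""
    z = sum(wi * xi for wi, xi in zip(w, x))    # z = w^T x
    y = math.sin(z) - b                         # forward pass
    dy_dw = [math.cos(z) * xi for xi in x]      # dy/dw = cos(z) * x
    loss = (y_label - y) ** 2                   # squared-error loss
    dL_dy = -2.0 * (y_label - y)                # dL/dy
    dL_dw = [dL_dy * g for g in dy_dw]          # chain rule: dL/dw = dL/dy * dy/dw
    w_new = [wi - alpha * g for wi, g in zip(w, dL_dw)]
    return y, dy_dw, loss, dL_dw, w_new

y, dy_dw, loss, dL_dw, w_new = task1_by_hand(
    x=[2.0, 1.0], w=[math.pi / 2, math.pi], b=1.0, y_label=2.0, alpha=0.5
)
print(y, dy_dw, loss, dL_dw, w_new)
```

Up to floating-point rounding, the values should agree with the PyTorch results: $y \approx -1$, $\partial y/\partial\mathbf{w} \approx [2, 1]$, $\mathcal{L} \approx 9$ and $\partial\mathcal{L}/\partial\mathbf{w} \approx [-12, -6]$.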


# Task 2
You are given an input feature map $\mathbf{x}$ and a convolutional kernel $\mathbf{w}$:
$$
\mathbf{x} = \begin{bmatrix} 1 & 0 & 2 \\ 2 & 1 & -1 \\ 0 & 0 & 2 \end{bmatrix}, \mathbf{w} = \begin{bmatrix} 1 & -1 \\ 0 & 2 \end{bmatrix}.
$$
Stride denotes the step length of the convolution; padding denotes symmetric zero-padding.

Compute the outputs of the following layers:
- 1) $\text{conv}(\mathbf{x}, \mathbf{w}, \text{stride}=1, \text{padding}=0)$
- 2) $\text{conv}(\mathbf{x}, \mathbf{w}, \text{stride}=3, \text{padding}=1)$
- 3) $\text{max}(\mathbf{x}, 2\times2)$, i.e. max pooling with a $2\times2$ window (stride 1 in the implementation below)

Do all these computations first with pen and paper and then implement them in Python code.
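
For the pen-and-paper part it helps to fix the output size first. The formulas below are the usual ones for a square input of side $n$, kernel side $k$, padding $p$ and stride $s$; the symbols $n, k, p, s, o$ and the name $\tilde{\mathbf{x}}$ for the zero-padded input are introduced here and are not part of the task statement. Note that conv in deep-learning libraries such as PyTorch is a cross-correlation, so the kernel is not flipped.
$$
o = \left\lfloor \frac{n + 2p - k}{s} \right\rfloor + 1, \qquad
y_{ij} = \sum_{a=0}^{k-1}\sum_{b=0}^{k-1} w_{ab}\,\tilde{x}_{\,is+a,\;js+b}.
$$
For layer 1) this gives $o = \lfloor (3 - 2)/1 \rfloor + 1 = 2$, and for layer 2) $o = \lfloor (3 + 2 - 2)/3 \rfloor + 1 = 2$, so both convolutions produce a $2\times2$ output. As an example, the top-left entry of layer 1) is $1\cdot1 + 0\cdot(-1) + 2\cdot0 + 1\cdot2 = 3$.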

In [18]:
def task2():
    x = torch.tensor([[1,0,2],[2,1,-1],[0,0,2]], dtype=torch.float32)
    w = torch.tensor([[1,-1],[0,2]],dtype=torch.float32)

    # Conv2d and MaxPool2d expect inputs of shape (batch, channels, height, width)
    x4d = x.unsqueeze(0).unsqueeze(0)
    # Conv2d weights have shape (out_channels, in_channels, kernel_height, kernel_width)
    w4d = w.unsqueeze(0).unsqueeze(0)

    # 1) 2x2 kernel, stride 1, padding 0
    conv1 = torch.nn.Conv2d(1,1,2,bias=False)
    conv1.weight = torch.nn.Parameter(w4d)
    y = conv1(x4d)

    # 2) 2x2 kernel, stride 3, padding 1
    conv2 = torch.nn.Conv2d(1,1,2,stride=3,padding=1,bias=False)
    conv2.weight = torch.nn.Parameter(w4d)
    y2 = conv2(x4d)

    # 3) 2x2 max pooling with stride 1
    max1 = torch.nn.MaxPool2d(2,stride=1)
    y3 = max1(x4d)

    print(f"1): Convolution result is : {y.data}")
    print(f"2): Convolution result is : {y2.data}")
    print(f"3): MaxPool result is : {y3.data}")

In [19]:
task2()

1): Convolution result is : tensor([[[[ 3., -4.],
          [ 1.,  6.]]]])
2): Convolution result is : tensor([[[[2., 0.],
          [0., 2.]]]])
3): MaxPool result is : tensor([[[[2., 2.],
          [2., 2.]]]])
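
To see what the layers compute internally, here is a minimal pure-Python sketch of the same three operations on nested lists. The helper names `pad2d`, `conv2d` and `maxpool2d` are hypothetical and written only for this illustration; the pooling stride of 1 is an assumption taken from the `MaxPool2d(2,1)` call in `task2`.

```python
def pad2d(x, p):
    """Surround a 2-D nested list with p rows and columns of zeros on every side."""
    width = len(x[0]) + 2 * p
    rows = [[0] * width for _ in range(p)]
    rows += [[0] * p + list(r) + [0] * p for r in x]
    rows += [[0] * width for _ in range(p)]
    return rows

def conv2d(x, w, stride=1, padding=0):
    """Cross-correlation (no kernel flip) with symmetric zero-padding."""
    x = pad2d(x, padding)
    kh, kw = len(w), len(w[0])
    out_h = (len(x) - kh) // stride + 1
    out_w = (len(x[0]) - kw) // stride + 1
    return [
        [
            sum(w[a][b] * x[i * stride + a][j * stride + b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def maxpool2d(x, k, stride):
    """Maximum over every k x k window, moving by `stride`."""
    out_h = (len(x) - k) // stride + 1
    out_w = (len(x[0]) - k) // stride + 1
    return [
        [
            max(x[i * stride + a][j * stride + b]
                for a in range(k) for b in range(k))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

x = [[1, 0, 2], [2, 1, -1], [0, 0, 2]]
w = [[1, -1], [0, 2]]
out1 = conv2d(x, w, stride=1, padding=0)
out2 = conv2d(x, w, stride=3, padding=1)
out3 = maxpool2d(x, k=2, stride=1)
print(out1)  # [[3, -4], [1, 6]]
print(out2)  # [[2, 0], [0, 2]]
print(out3)  # [[2, 2], [2, 2]]
```

With stride 3 and padding 1, each of the four windows covers only one corner entry of $\mathbf{x}$, which is why the second result contains just $2 w_{22}$-style and $1 w_{11}$-style terms.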


# Task 3
You are given a simple network model, which consists of one convolutional layer and one maxpool layer. Its structure is defined as follows:
$$
f(\mathbf{x}, \mathbf{w}) = \text{max}(\text{conv}(\mathbf{x}, \mathbf{w}, \text{stride}=1, \text{padding}=0), 1\times2),
$$
where:
$$
\mathbf{x} = \begin{bmatrix} 2 & 1 & 2 \end{bmatrix}, \mathbf{w} = \begin{bmatrix} 1 & 0 \end{bmatrix}.
$$
$\mathbf{x}$ is an input feature map, $\mathbf{w}$ is a convolutional kernel.

- 1) Draw a computational graph and compute the forward pass of this small neural network
- 2) Compute the gradient of the output with respect to the weights $\mathbf{w}$, i.e. $\frac{\partial f(\mathbf{x}, \mathbf{w})}{\partial \mathbf{w}}$
- 3) Update the weights with learning rate $\alpha = 0.5$

Do all these computations first with pen and paper and then implement them in Python code.
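
One way to organise the pen-and-paper derivation is sketched here. The intermediate symbols $c_1, c_2$ for the two convolution outputs and the name $\mathbf{w}_{\text{new}}$ are introduced only for this sketch. Since the task defines no loss, the update is assumed to be a plain gradient-descent step on the output $f$ itself.
$$
\begin{aligned}
c_1 &= w_1 x_1 + w_2 x_2 = 1\cdot 2 + 0\cdot 1 = 2, \\
c_2 &= w_1 x_2 + w_2 x_3 = 1\cdot 1 + 0\cdot 2 = 1, \\
f(\mathbf{x}, \mathbf{w}) &= \max(c_1, c_2) = c_1 = 2, \\
\frac{\partial f}{\partial \mathbf{w}} &= \frac{\partial c_1}{\partial \mathbf{w}} = \begin{bmatrix} x_1 & x_2 \end{bmatrix} = \begin{bmatrix} 2 & 1 \end{bmatrix}, \\
\mathbf{w}_{\text{new}} &= \mathbf{w} - \alpha\,\frac{\partial f}{\partial \mathbf{w}} = \begin{bmatrix} 1 & 0 \end{bmatrix} - 0.5\begin{bmatrix} 2 & 1 \end{bmatrix} = \begin{bmatrix} 0 & -0.5 \end{bmatrix}.
\end{aligned}
$$
The max operation passes gradient only to the input that attains the maximum, here $c_1$, so $c_2$ does not contribute.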

In [20]:
def task3():
    x = torch.tensor([2,1,2], dtype=torch.float32)
    w = torch.tensor([1,0], dtype=torch.float32)

    conv1 = torch.nn.Conv1d(1,1,2, bias=False)
    conv1.weight = torch.nn.Parameter(w.unsqueeze(0).unsqueeze(0))
    max1 = torch.nn.MaxPool1d(2,1)
    y = max1(conv1(x.unsqueeze(0).unsqueeze(0)))

    optimizer = torch.optim.SGD(conv1.parameters(),lr=0.5)
    # backward pass stores df/dw in conv1.weight.grad
    y.backward()
    # apply the SGD step w <- w - lr * grad; without it the weights keep their initial values
    optimizer.step()

    print(f"1): Convolution result is : {y.data}")
    print(f"2): Weight gradients are : {conv1.weight.grad}")
    print(f"3): Updated weights are : {conv1.weight.data}")

In [21]:
task3()

1): Convolution result is : tensor([[[2.]]])
2): Weight gradients are : tensor([[[2., 1.]]])
3): Updated weights are : tensor([[[ 0.0000, -0.5000]]])
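
The gradient routing through the max operation can be made explicit in a few lines of plain Python. This is a sketch under two assumptions: the $1\times2$ pooling window covers the whole convolution output (true for this input size), and the update is a plain gradient-descent step $\mathbf{w} \leftarrow \mathbf{w} - \alpha\,\partial f/\partial\mathbf{w}$. The function name `task3_by_hand` is made up for illustration.

```python
def task3_by_hand(x, w, alpha):
    """Conv1d (stride 1, no padding) followed by a max over the whole conv output."""
    k = len(w)
    # 1-D cross-correlation: one output per valid window position
    conv = [sum(w[a] * x[i + a] for a in range(k)) for i in range(len(x) - k + 1)]
    # index of the winning window; max pooling forwards only this value
    i_max = max(range(len(conv)), key=lambda i: conv[i])
    f = conv[i_max]
    # only the winning window receives gradient: df/dw_a = x[i_max + a]
    grad = [x[i_max + a] for a in range(k)]
    # gradient-descent step on the output f (assumption: no separate loss)
    w_new = [wi - alpha * g for wi, g in zip(w, grad)]
    return f, grad, w_new

f, grad, w_new = task3_by_hand(x=[2.0, 1.0, 2.0], w=[1.0, 0.0], alpha=0.5)
print(f, grad, w_new)  # 2.0 [2.0, 1.0] [0.0, -0.5]
```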
