# Deep Learning Course: Lab Exercises

In this lab exercise you will become familiar with the PyTorch library in order to solve deep learning problems. The goals of this assignment are as follows:

- become familiar with PyTorch Tensors

- understand how the forward pass and backpropagation work in neural networks


First time using a Jupyter Notebook or Google Colab? Check [this Jupyter Notebook 101](https://www.kaggle.com/code/jhoward/jupyter-notebook-101).
Throughout the course you will be asked to do more than apply the lecture material: check the documentation, look up what you want to do with your favorite search engine, or ask the TAs. The Deep Learning community is very open to new practitioners.

# Setup

For this exercise the only thing you need is this notebook.

You may use your own Python environment or use Google Colab. If you choose to use Google Colab, you can upload this notebook to your Google Drive and open it with Google Colab (right click on the file and choose "Open with" -> "Google Colab").

To set up the environment on your own machine, you need to install PyTorch. You can find the instructions [here](https://pytorch.org/get-started/locally/).

For more information about Python environments, you may take a look at [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html) or [virtualenv](https://virtualenv.pypa.io/en/latest/).


# Note

Apart from the Questions, there are instruction comments throughout the notebook, as well as comments inside the code cells beginning with two hashtags (##). In addition, `# *****START CODE` / `# *****END CODE` comments mark the start and end of the sections where you write your code. Take care not to delete these comments.

# Questions

# Q1 - PyTorch Tensors

a) Get familiar with PyTorch Tensors and construct different types of them. You may take a look at the [documentation](https://pytorch.org/docs/stable/tensors.html).

In [2]:
import torch

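##Construct a 5x3 matrix, uninitialized
# *****START CODE
# torch.empty allocates memory without initializing it, so the printed values
# are arbitrary and can differ between runs (torch.ones would initialize to 1)
x = torch.empty(5, 3)
# *****END CODE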
print(x)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])


In [3]:
##Construct a randomly initialized matrix from a normal distribution
# *****START CODE

x = torch.randn((5,3))
# *****END CODE
print(x)

tensor([[-2.1149, -0.3527,  1.4650],
        [ 0.6013, -0.2879, -0.0568],
        [ 0.7903, -1.3198, -0.7169],
        [-1.1922,  1.3885, -1.4138],
        [ 0.7308, -0.9422,  0.2039]])


In [4]:
##Construct a matrix filled with zeros and of dtype int64
# *****START CODE

x = torch.zeros((5,3),dtype= torch.int64)
# *****END CODE
print(x)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])


In [5]:
##Construct a tensor directly from data
# *****START CODE
data = [[1, 2], [3, 4]]
x = torch.tensor(data)
# *****END CODE
print(x)

tensor([[1, 2],
        [3, 4]])
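
The constructors used in Q1 can be compared side by side by inspecting the `shape` and `dtype` of each result. The following is a minimal sketch, not part of the assignment; the variable names `a`, `b`, `c`, `d` are chosen only for illustration, and the float result assumes PyTorch's default floating-point dtype has not been changed.

```python
import torch

# torch.empty allocates memory without initializing it, so its values are arbitrary
a = torch.empty(5, 3)
# torch.zeros with an explicit integer dtype
b = torch.zeros(5, 3, dtype=torch.int64)
# torch.tensor copies nested Python lists and infers the dtype from the data
c = torch.tensor([[1, 2], [3, 4]])          # Python ints   -> torch.int64
d = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # Python floats -> default float dtype

for name, t in [("a", a), ("b", b), ("c", c), ("d", d)]:
    print(name, tuple(t.shape), t.dtype)
```

Only the shape and dtype of `a` are meaningful here; its contents depend on whatever happened to be in memory.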


# Q2 - Backpropagation from scratch

- Create random input and output PyTorch tensors and train a simple network from scratch.

  Warning: You should NOT use any forward/backward commands from the PyTorch library.

In [6]:
import torch.nn.functional as F

## N is batch size: the number of samples passed to the network at once
## D_in is input dimension: the number of input features
## H is hidden dimension
## D_out is output dimension

torch.manual_seed(10086)
N, D_in, H, D_out = 64, 1000, 100, 10

## Create random input (x) and output (y) data
# *****START CODE
x = torch.randn(N,D_in)
y = torch.randn(N,D_out)
print('x', x.shape)
print('y', y.shape)
# *****END CODE
print('y')
    

x torch.Size([64, 1000])
y torch.Size([64, 10])
y


In [20]:
##Randomly initialize weights from a normal distribution, skipping bias
##Hint: You need 2 weight tensors; one for the raw input tensor (x) and one for the hidden dimension
# *****START CODE
w1 = torch.randn(D_in, H)
w2 = torch.randn(H,D_out )
# *****END CODE
print('w1', w1.shape)
print('w2', w2.shape)


w1 torch.Size([1000, 100])
w2 torch.Size([100, 10])


In [8]:
##define the learning rate
learning_rate = 1e-6


First, implement the forward pass. Try to compute the predicted y_pred value. You can take a look [here](https://pytorch.org/docs/stable/nn.functional.html#non-linear-activations-weighted-sum-nonlinearity) for more information about activation functions in PyTorch.

In [21]:
## Calculate the output of the hidden dimension
## Hint: make use of torch.matmul()
# *****START CODE
# Shapes: x is (N, D_in), w1 is (D_in, H), so h is (N, H)
h = x @ w1                # output of the hidden layer; same as torch.matmul(x, w1)
# *****END CODE
print('x', x.shape)
print('h', h.shape)

x torch.Size([64, 1000])
h torch.Size([64, 100])


In [22]:
## Pass the output of the hidden dimension to the ReLU activation function
# *****START CODE
h_relu =  F.relu(h)          # output of the ReLU function
# *****END CODE  
print(h_relu.shape)

torch.Size([64, 100])


In [23]:
## Calculate the final output of the network
# *****START CODE
y_pred = torch.matmul(h_relu, w2 )       # final output of the network
# *****END CODE
print('y_pred', y_pred.shape)

y_pred torch.Size([64, 10])
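
The three forward-pass cells above can be collected into one function. This is a hedged sketch rather than the expected solution: the function name `forward` and the seed are assumptions made for illustration, and ReLU is written as `clamp(min=0)`, which gives the same values as `F.relu`.

```python
import torch

def forward(x, w1, w2):
    """Two-layer network without biases: ReLU(x @ w1) @ w2."""
    h = x @ w1                 # (N, D_in) @ (D_in, H) -> (N, H)
    h_relu = h.clamp(min=0)    # ReLU as an element-wise max(h, 0)
    y_pred = h_relu @ w2       # (N, H) @ (H, D_out) -> (N, D_out)
    return h, h_relu, y_pred

torch.manual_seed(0)  # arbitrary seed, used only in this sketch
N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
w1 = torch.randn(D_in, H)
w2 = torch.randn(H, D_out)

h, h_relu, y_pred = forward(x, w1, w2)
print(h.shape, h_relu.shape, y_pred.shape)
```

Returning `h` and `h_relu` along with `y_pred` keeps the intermediate values available for a manual backward pass.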


Calculate the loss.

In [24]:
## Compute loss (mean squared error over all N * D_out entries)
loss = ((y_pred - y) ** 2).mean()
print(loss)

tensor(33264.1211)
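
The loss above averages over all `N * D_out` entries, while the backward-pass cells in this notebook use `2 * (y_pred - y)`, which is the gradient of the summed squared error. The two differ only by the constant factor `N * D_out`. The sketch below, with made-up random data and an arbitrary seed, illustrates that relation and checks the mean version against `F.mse_loss`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, used only in this sketch
y_pred = torch.randn(64, 10)
y = torch.randn(64, 10)

loss_mean = ((y_pred - y) ** 2).mean()
loss_sum = ((y_pred - y) ** 2).sum()

# The hand-written mean matches F.mse_loss with its default 'mean' reduction
print(torch.allclose(loss_mean, F.mse_loss(y_pred, y)))
# The sum is the mean multiplied by the number of entries, N * D_out
print(torch.allclose(loss_mean * y.numel(), loss_sum, rtol=1e-4))
```

Because the factor is constant, using the sum gradient with a learning rate of `lr` behaves like using the mean gradient with a learning rate of `lr * N * D_out`.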


Now, implement the backward pass.
To minimize the loss, you need the gradient of the loss with respect to each weight tensor, which you can derive using the chain rule of differentiation.
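
As a reference for the chain rule, here is one way to write the derivation, assuming the summed squared error $L = \sum (y_{pred} - y)^2$ and the network $h = x\,w_1$, $h_{relu} = \mathrm{ReLU}(h)$, $y_{pred} = h_{relu}\,w_2$. The symbol $\odot$ (element-wise product) and the indicator $\mathbb{1}[h > 0]$ are notation introduced here, not taken from the notebook.

$$
\frac{\partial L}{\partial y_{pred}} = 2\,(y_{pred} - y),
\qquad
\frac{\partial L}{\partial w_2} = h_{relu}^{\top}\,\frac{\partial L}{\partial y_{pred}}
$$

$$
\frac{\partial L}{\partial h_{relu}} = \frac{\partial L}{\partial y_{pred}}\,w_2^{\top},
\qquad
\frac{\partial L}{\partial h} = \frac{\partial L}{\partial h_{relu}} \odot \mathbb{1}[h > 0],
\qquad
\frac{\partial L}{\partial w_1} = x^{\top}\,\frac{\partial L}{\partial h}
$$

If the mean is used instead of the sum, every gradient is divided by $N \cdot D_{out}$. Treating the derivative of ReLU as 1 amounts to dropping the factor $\mathbb{1}[h > 0]$.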

In [28]:
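## Compute the gradient of the loss with respect to w2
# *****START CODE
# Note: 2 * (y_pred - y) is the gradient of the summed squared error. The mean
# loss computed above differs by a constant factor 1 / (N * D_out), which is
# effectively absorbed into the learning rate.
d_loss_d_y_pred = 2.0 * (y_pred - y)

d_y_pred_d_w2 = h_relu

print('d_loss_d_y_pred', d_loss_d_y_pred.shape)
print('d_y_pred_d_w2', d_y_pred_d_w2.shape)

d_loss_d_w2 = d_y_pred_d_w2.T @ d_loss_d_y_pred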

print('w2', w2.shape)
print('d_loss_d_w2', d_loss_d_w2.shape)

# *****END CODE


d_loss_d_y_pred torch.Size([64, 10])
d_y_pred_d_w2 torch.Size([64, 100])
w2 torch.Size([100, 10])
d_loss_d_w2 torch.Size([100, 10])
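
A hand-derived gradient can be sanity-checked without any autograd call by comparing it to central finite differences. The sketch below does this for `w2` under assumptions that are not part of the assignment: small hypothetical dimensions, `float64` precision, an arbitrary seed, and the summed squared error as the loss. The helper name `sum_sq_loss` is made up for this example.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, used only in this sketch
N, D_in, H, D_out = 4, 6, 5, 3  # small hypothetical sizes to keep the check fast
x = torch.randn(N, D_in, dtype=torch.float64)
y = torch.randn(N, D_out, dtype=torch.float64)
w1 = torch.randn(D_in, H, dtype=torch.float64)
w2 = torch.randn(H, D_out, dtype=torch.float64)

def sum_sq_loss(w2_value):
    """Summed squared error of the two-layer network for a given w2."""
    h_relu = (x @ w1).clamp(min=0)
    return ((h_relu @ w2_value - y) ** 2).sum()

# Analytic gradient, same formula as in the cell above
h_relu = (x @ w1).clamp(min=0)
y_pred = h_relu @ w2
grad_analytic = h_relu.T @ (2.0 * (y_pred - y))

# Numerical gradient: perturb one entry of w2 at a time
eps = 1e-6
grad_numeric = torch.zeros_like(w2)
for i in range(H):
    for j in range(D_out):
        step = torch.zeros_like(w2)
        step[i, j] = eps
        grad_numeric[i, j] = (sum_sq_loss(w2 + step) - sum_sq_loss(w2 - step)) / (2 * eps)

print((grad_analytic - grad_numeric).abs().max().item())
```

The printed maximum difference should be tiny compared with the gradient entries; a large value would point to an error in the analytic formula.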


In [29]:
## Compute the gradient of the loss with respect to w1
## (simplification: treat the derivative of ReLU as 1 everywhere; strictly it is 1 where h > 0 and 0 elsewhere)
# *****START CODE

d_loss_d_y_pred = 2.0 * (y_pred - y)
d_y_pred_d_h = w2
d_h_d_w1 = x

print('d_loss_d_y_pred', d_loss_d_y_pred.shape)
print('d_y_pred_d_h', d_y_pred_d_h.shape)
print('d_h_d_w1', d_h_d_w1.shape)

d_loss_d_w1 = d_loss_d_y_pred @ d_y_pred_d_h.t()
print('d_loss_d_w1 first step', d_loss_d_w1.shape)
d_loss_d_w1 = d_h_d_w1.t() @ d_loss_d_w1
print('d_loss_d_w1', d_loss_d_w1.shape)
print('w1', w1.shape)
# *****END CODE


d_loss_d_y_pred torch.Size([64, 10])
d_y_pred_d_h torch.Size([100, 10])
d_h_d_w1 torch.Size([64, 1000])
d_loss_d_w1 first step torch.Size([64, 100])
d_loss_d_w1 torch.Size([1000, 100])
w1 torch.Size([1000, 100])
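
The cell above follows the exercise's simplification that the derivative of ReLU is 1. For comparison, the sketch below computes both the simplified gradient and a version that masks out units where `h <= 0`, then checks the masked version against central finite differences. The small dimensions, `float64` precision, the seed, and the names `grad_simplified`, `grad_masked`, and `sum_sq_loss` are assumptions made for this illustration, and the loss is taken to be the summed squared error.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, used only in this sketch
N, D_in, H, D_out = 4, 6, 5, 3  # small hypothetical sizes to keep the check fast
x = torch.randn(N, D_in, dtype=torch.float64)
y = torch.randn(N, D_out, dtype=torch.float64)
w1 = torch.randn(D_in, H, dtype=torch.float64)
w2 = torch.randn(H, D_out, dtype=torch.float64)

def sum_sq_loss(w1_value):
    """Summed squared error of the two-layer network for a given w1."""
    return (((x @ w1_value).clamp(min=0) @ w2 - y) ** 2).sum()

h = x @ w1
y_pred = h.clamp(min=0) @ w2
d_loss_d_y_pred = 2.0 * (y_pred - y)
d_loss_d_h_relu = d_loss_d_y_pred @ w2.T

# Simplified version: ReLU derivative taken as 1 everywhere
grad_simplified = x.T @ d_loss_d_h_relu
# Masked version: ReLU derivative is 1 where h > 0 and 0 elsewhere
relu_mask = (h > 0).to(h.dtype)
grad_masked = x.T @ (d_loss_d_h_relu * relu_mask)

# Numerical gradient: perturb one entry of w1 at a time
eps = 1e-6
grad_numeric = torch.zeros_like(w1)
for i in range(D_in):
    for j in range(H):
        step = torch.zeros_like(w1)
        step[i, j] = eps
        grad_numeric[i, j] = (sum_sq_loss(w1 + step) - sum_sq_loss(w1 - step)) / (2 * eps)

print('masked vs numeric    ', (grad_masked - grad_numeric).abs().max().item())
print('simplified vs numeric', (grad_simplified - grad_numeric).abs().max().item())
```

The two versions coincide only when every hidden unit is active. Otherwise the simplified gradient is an approximation, which may be one reason a loss curve does not decrease monotonically.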


In [30]:
## Update weights
# *****START CODE
w1 = w1 - learning_rate * d_loss_d_w1
w2 = w2 - learning_rate * d_loss_d_w2
# *****END CODE

Repeat the above process for a number of epochs and notice how the value of the loss changes.

In [31]:
##specify the number of epochs
# *****START CODE
epochs = 100
# *****END CODE

for t in range(epochs):
  # *****START CODE

  # forward pass
  h = x @ w1
  h_relu = F.relu(h)
  y_pred = h_relu @ w2
  
  # compute the loss
  loss = ((y_pred - y) ** 2).mean()

  # compute the gradient wrt w2
  d_loss_d_y_pred = 2.0 * (y_pred - y)
  d_y_pred_d_w2 = h_relu
  d_loss_d_w2 = d_y_pred_d_w2.T @ d_loss_d_y_pred

  # compute the gradient wrt w1
  d_loss_d_y_pred = 2.0 * (y_pred - y)
  d_y_pred_d_h = w2
  d_h_d_w1 = x
  d_loss_d_w1 = d_loss_d_y_pred @ d_y_pred_d_h.t()
  d_loss_d_w1 = d_h_d_w1.t() @ d_loss_d_w1

  # update the weights
  w1 = w1 - learning_rate * d_loss_d_w1
  w2 = w2 - learning_rate * d_loss_d_w2

  print('Epoch', t, 'loss', loss.item())

  # *****END CODE




Epoch 0 loss 23794.025390625
Epoch 1 loss 19929.552734375
Epoch 2 loss 19164.29296875
Epoch 3 loss 20235.326171875
Epoch 4 loss 22296.8203125
Epoch 5 loss 24155.28515625
Epoch 6 loss 24867.455078125
Epoch 7 loss 23369.33203125
Epoch 8 loss 19976.48046875
Epoch 9 loss 15346.3408203125
Epoch 10 loss 10900.568359375
Epoch 11 loss 7265.74072265625
Epoch 12 loss 4724.6865234375
Epoch 13 loss 3056.13720703125
Epoch 14 loss 2020.9068603515625
Epoch 15 loss 1381.6683349609375
Epoch 16 loss 987.9893798828125
Epoch 17 loss 738.98291015625
Epoch 18 loss 576.8760375976562
Epoch 19 loss 466.63946533203125
Epoch 20 loss 388.38018798828125
Epoch 21 loss 330.255615234375
Epoch 22 loss 285.3699645996094
Epoch 23 loss 249.4702911376953
Epoch 24 loss 219.98355102539062
Epoch 25 loss 195.23509216308594
Epoch 26 loss 174.0675506591797
Epoch 27 loss 155.8298797607422
Epoch 28 loss 139.9313201904297
Epoch 29 loss 125.99918365478516
Epoch 30 loss 113.6941146850586
Epoch 31 loss 102.8029556274414
Epoch 32 loss