## PyTorch exercises

### Tensors

1. Make a tensor of size (2, 17)
2. Make a torch.FloatTensor of size (3, 1)
3. Make a torch.LongTensor of size (5, 2, 1)
  - fill the entire tensor with 7s
4. Make a torch.ByteTensor of size (5,)
  - fill the middle 3 indices with ones so that it reads [0, 1, 1, 1, 0]
5. Perform a matrix multiplication of two tensors of size (2, 4) and (4, 2). Then do it in-place.
6. Do element-wise multiplication of two randomly filled $(n_1,n_2,n_3)$ tensors. Then store the result in a NumPy array.

### Forward-prop/backward-prop
1. Create a Tensor that `requires_grad` of size (5, 5).
2. Sum the values in the Tensor.
3. Multiply the tensor by 2 and assign the result to a new Python variable (i.e. `x = result`)
4. Sum the variable's elements and assign the result to a new Python variable
5. Print the gradients of all the variables
6. Now perform a backward pass on the last variable (NOTE: for each new python variable that you define, call `.retain_grad()`)
7. Print all gradients again

### Deep feed-forward NNs
1. Look at Lab 3. In Exercise 12 there, you had to build an $L$-layer neural network with the following structure: *[LINEAR -> RELU]$\times$(L-1) -> LINEAR -> SIGMOID*. Reimplement the manual code in PyTorch.
2. Compare test accuracy using different optimizers: SGD, SGD with momentum, and Adam.
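
One possible starting point for these two exercises is sketched below. It is not the Lab 3 solution: the helper names `build_l_layer_net` and `make_optimizer`, the layer sizes, the learning rate, the momentum value, and the synthetic dataset are all assumptions made for illustration, and the Lab 3 data would replace `X_train`/`X_test` in practice.

```python
import torch
from torch import nn


def build_l_layer_net(layer_dims):
    """Stack [Linear -> ReLU] x (L-1), then Linear -> Sigmoid.

    layer_dims lists layer widths, input first (hypothetical example: [10, 8, 4, 1]).
    """
    layers = []
    # Hidden layers: each one is a Linear followed by a ReLU.
    for n_in, n_out in zip(layer_dims[:-2], layer_dims[1:-1]):
        layers += [nn.Linear(n_in, n_out), nn.ReLU()]
    # Output layer: Linear followed by Sigmoid.
    layers += [nn.Linear(layer_dims[-2], layer_dims[-1]), nn.Sigmoid()]
    return nn.Sequential(*layers)


def make_optimizer(name, params, lr=0.05):
    """Return one of the three optimizers to compare (lr and momentum are assumed values)."""
    if name == "sgd":
        return torch.optim.SGD(params, lr=lr)
    if name == "momentum":
        return torch.optim.SGD(params, lr=lr, momentum=0.9)
    if name == "adam":
        return torch.optim.Adam(params, lr=lr)
    raise ValueError(f"unknown optimizer: {name}")


# Synthetic binary classification data, standing in for the Lab 3 dataset.
torch.manual_seed(0)
X = torch.randn(300, 10)
y = (X.sum(dim=1, keepdim=True) > 0).float()
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

loss_fn = nn.BCELoss()
results = {}
for name in ["sgd", "momentum", "adam"]:
    model = build_l_layer_net([10, 8, 4, 1])
    optimizer = make_optimizer(name, model.parameters())
    for _ in range(100):  # full-batch training steps
        optimizer.zero_grad()
        loss = loss_fn(model(X_train), y_train)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        predictions = (model(X_test) > 0.5).float()
        results[name] = (predictions == y_test).float().mean().item()

for name, accuracy in results.items():
    print(f"{name}: test accuracy = {accuracy:.3f}")
```

Using `nn.Sequential` keeps the layer order visible in one place. Each optimizer gets a freshly initialized model, so the runs do not share weights, but a fair comparison would also average over several seeds.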

In [None]:
import torch
import numpy as np

# 1. Tensor of size (2, 17)
tensor_1 = torch.randn(2, 17)  # randn already returns a tensor; wrapping it in torch.tensor() triggers a copy warning





In [5]:
# 2. FloatTensor of size (3, 1); this legacy constructor leaves the values uninitialized
float_tensor = torch.FloatTensor(3, 1)


In [6]:
# 3. LongTensor of size (5, 2, 1) filled with 7s
long_tensor = torch.LongTensor(5, 2, 1).fill_(7)


In [7]:
# 4. ByteTensor of size (5,), then fill the middle 3 indices with ones -> [0, 1, 1, 1, 0]
byte_tensor = torch.zeros(5, dtype=torch.uint8)  # uint8 is the ByteTensor dtype
byte_tensor[1:4] = 1


In [None]:
# 5. Matrix multiplication (2, 4) x (4, 2), then in-place
A = torch.randn(2, 4)
B = torch.randn(4, 2)
matmul_result = torch.mm(A, B)
# The (2, 2) result cannot overwrite A or B, whose shapes differ, so write it into a pre-allocated tensor via out=
C = torch.empty(2, 2)
torch.mm(A, B, out=C)


In [8]:
# 6. Element-wise multiplication of random (n1, n2, n3) tensors → store as NumPy
n1, n2, n3 = 3, 4, 2
t1 = torch.randn(n1, n2, n3)
t2 = torch.randn(n1, n2, n3)
elementwise_product_np = (t1 * t2).numpy()
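
A point worth checking when converting to NumPy: for CPU tensors, `.numpy()` returns an array that shares memory with the tensor rather than a copy. The short sketch below illustrates this with made-up values. The variable names are hypothetical and not part of the exercise.

```python
import numpy as np
import torch

t = torch.ones(3)
shared = t.numpy()         # shares memory with t
copied = t.numpy().copy()  # independent copy of the data

t.add_(1)                  # in-place update of the tensor

print(shared)  # reflects the in-place update: [2. 2. 2.]
print(copied)  # unchanged: [1. 1. 1.]

# A tensor that tracks gradients must be detached before conversion.
g = torch.ones(3, requires_grad=True)
g_np = (g * 2).detach().numpy()
print(type(g_np))
```

In the exercise above the product `t1 * t2` is a temporary tensor, so the sharing is harmless there. It matters when the source tensor is modified later.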

In [9]:
# 1. Tensor with requires_grad
x = torch.randn(5, 5, requires_grad=True)
x.retain_grad()  # no-op here: x is a leaf tensor, so its .grad is populated automatically



In [10]:
# 2. Sum the values in x
x_sum = x.sum()
x_sum.retain_grad()


In [11]:
# 3. Multiply by 2 → assign to new variable
x_times_2 = x * 2
x_times_2.retain_grad()


In [12]:
# 4. Sum elements of new variable
final_sum = x_times_2.sum()
final_sum.retain_grad()


In [13]:
# 5. Print gradients before backward
print("Before backward:")
print("x.grad:", x.grad)
print("x_sum.grad:", x_sum.grad)
print("x_times_2.grad:", x_times_2.grad)
print("final_sum.grad:", final_sum.grad)


Before backward:
x.grad: None
x_sum.grad: None
x_times_2.grad: None
final_sum.grad: None


In [14]:
# 6. Perform backward pass
final_sum.backward()


In [15]:
# 7. Print gradients after backward
print("\nAfter backward:")
print("x.grad:\n", x.grad)
print("x_sum.grad:\n", x_sum.grad)
print("x_times_2.grad:\n", x_times_2.grad)
print("final_sum.grad:\n", final_sum.grad)


After backward:
x.grad:
 tensor([[2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.],
        [2., 2., 2., 2., 2.]])
x_sum.grad:
 None
x_times_2.grad:
 tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
final_sum.grad:
 tensor(1.)
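
The `None` for `x_sum.grad` follows from the graph structure: `x_sum` is not on the path from `x` to `final_sum`, so the backward pass never reaches it. The sketch below reproduces the setup with `torch.ones` instead of random values (an arbitrary choice, since the gradients do not depend on the values) and then calls `backward()` on `x_sum` as well, which also shows that gradients accumulate in `x.grad`.

```python
import torch

x = torch.ones(5, 5, requires_grad=True)
x_sum = x.sum()
x_sum.retain_grad()
final_sum = (x * 2).sum()

# First backward pass: only tensors on the path to final_sum get gradients.
final_sum.backward()
grad_after_first = x.grad.clone()
x_sum_grad_after_first = x_sum.grad  # still None: x_sum does not feed final_sum

# Second backward pass from x_sum: d(x_sum)/dx = 1 is added to the existing 2s.
x_sum.backward()

print(x_sum_grad_after_first)  # None
print(grad_after_first[0])     # tensor([2., 2., 2., 2., 2.])
print(x.grad[0])               # tensor([3., 3., 3., 3., 3.])
print(x_sum.grad)              # tensor(1.)
```

Because gradients accumulate, training loops typically reset them (for example with `optimizer.zero_grad()`) before each backward pass.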
