# Introduction to PyTorch

In this first chapter, we introduce the basic concepts of neural networks and deep learning using the PyTorch library.

# (1) Introduction to PyTorch

<img src="image/Screenshot 2021-01-25 152946.png">

## Neural networks

<img src="image/Screenshot 2021-01-25 153036.png">

## Why PyTorch?

<img src="image/Screenshot 2021-01-25 153132.png">

- "Pythonic" - easy to use
- Strong GPU support - models run fast
- Many algorithms are already implemented
- Automatic differentiation - more in the next lesson
- Similar to NumPy

## Matrix Multiplication

$\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \cdot \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & 12\end{bmatrix} = \begin{bmatrix} 58 & 64 \\ 139 & 154\end{bmatrix}$
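
The product above can be checked numerically. The following is a minimal sketch; the variable names `A`, `B` and `C` are chosen here for illustration and do not come from the original notes.

```python
import torch

# The 2x3 and 3x2 matrices from the equation above
A = torch.tensor([[1, 2, 3], [4, 5, 6]])
B = torch.tensor([[7, 8], [9, 10], [11, 12]])

# Entry (i, j) of the product is the dot product of row i of A and column j of B
C = torch.matmul(A, B)

print(C)        # values: [[58, 64], [139, 154]]
print(C.shape)  # torch.Size([2, 2])
```

The inner dimensions (3 and 3) must match, and the result takes the outer dimensions (2 by 2).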

## PyTorch compared to NumPy

```
import torch
torch.tensor([[2, 3, 5], [1, 2, 9]])
```

```
a = torch.rand(2, 2)
a.shape
```

```
import numpy as np
np.array([[2, 3, 5], [1, 2, 9]])
```

```
a = np.random.randn(3, 5)
a.shape
```

## Matrix operations

```
a = torch.rand((2, 2))
b = torch.rand((2, 2))
```

```
torch.matmul(a, b)
```

```
a = np.random.rand(2, 2)
b = np.random.rand(2, 2)
```

```
np.dot(a, b)
```

```
a * b
```

```
np.multiply(a, b)
```
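
Because the tensors above are random, the difference between the two kinds of multiplication is hard to see from the output. The sketch below uses small fixed values instead; the numbers are arbitrary and only serve as an illustration.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.tensor([[5., 6.], [7., 8.]])

# Matrix product: rows of a are dotted with columns of b
matrix_product = torch.matmul(a, b)

# Element-wise product: entries at matching positions are multiplied
elementwise_product = a * b

print(matrix_product)       # values: [[19, 22], [43, 50]]
print(elementwise_product)  # values: [[5, 12], [21, 32]]
```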

## Zeros and Ones

```
a_torch = torch.zeros(2, 2)
```

```
b_torch = torch.ones(2, 2)
```

```
c_torch = torch.eye(2)
```

```
a_numpy = np.zeros((2, 2))
```

```
b_numpy = np.ones((2, 2))
```

```
c_numpy = np.identity(2)
```

## PyTorch to NumPy and vice versa

```
d_torch = torch.from_numpy(c_numpy)
```

```
d = c_torch.numpy()
```
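
Two details of these conversions are easy to miss: the data type is carried over, and on the CPU the tensor and the array share the same memory. The sketch below illustrates both, reusing the names `c_numpy`, `d_torch`, `c_torch` and `d` from the snippets above.

```python
import numpy as np
import torch

c_numpy = np.identity(2)

# NumPy -> PyTorch: the tensor shares memory with the array and keeps its dtype
d_torch = torch.from_numpy(c_numpy)
print(d_torch.dtype)  # torch.float64, because np.identity defaults to float64

# An in-place change to the array is visible through the tensor
c_numpy[0, 0] = 5.0
print(d_torch[0, 0])  # value 5.0

# PyTorch -> NumPy: a CPU tensor converts back to an ndarray
c_torch = torch.eye(2)
d = c_torch.numpy()
print(type(d), d.dtype)  # numpy.ndarray with dtype float32
```

If an independent copy is needed, `torch.tensor(c_numpy)` copies the data instead of sharing it.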

# Summary
```
torch.matmul(a, b)  # multiplies torch tensors a and b

a * b               # element-wise multiplication of two torch tensors

torch.eye(n)        # creates an identity torch tensor with shape (n, n)

torch.zeros(n, m)   # creates a torch tensor of zeros with shape (n, m)

torch.ones(n, m)    # creates a torch tensor of ones with shape (n, m)

torch.rand(n, m)    # creates a random torch tensor with shape (n, m)

torch.tensor(l)     # creates a torch tensor based on list l
```

# Exercise I: Creating tensors in PyTorch

Random tensors are very important in neural networks. The parameters of a neural network are typically initialized with random weights (random tensors).

Let us start practicing building tensors with the PyTorch library. As you know, tensors are arrays with an arbitrary number of dimensions, corresponding to NumPy's ndarrays. You are going to create a random tensor of size 3 by 3 and assign it to the variable `your_first_tensor`. Then, print it. Finally, store its shape in the variable `tensor_size` and print that value.

### Instructions 


- Import the main PyTorch library.
- Create the variable `your_first_tensor` and set it to a random torch tensor of size 3 by 3.
- Calculate its shape (dimension sizes) and set it to variable `tensor_size`.
- Print the values of `your_first_tensor` and `tensor_size`.


```
# Import torch
import torch

# Create random tensor of size 3 by 3
your_first_tensor = torch.rand(3, 3)

# Calculate the shape of the tensor
tensor_size = your_first_tensor.shape

# Print the values of the tensor and its shape
print(your_first_tensor)
print(tensor_size)
```

# Exercise II: Matrix multiplication

Several types of matrices are especially useful in neural networks. Two of them are matrices of ones (where each entry is set to 1) and the identity matrix (where the diagonal entries are 1 and all other values are 0). The identity matrix is very important in linear algebra: any matrix multiplied by the identity matrix is simply the original matrix.

Let us experiment with these two types of matrices. You are going to build a matrix of ones with shape 3 by 3 called `tensor_of_ones` and an identity matrix of the same shape, called `identity_tensor`. We are going to see what happens when we multiply these two matrices, and what happens if we do an element-wise multiplication of them.

### Instructions


- Create a matrix of ones with shape 3 by 3, and assign it to the variable `tensor_of_ones`.
- Create an identity matrix with shape 3 by 3, and assign it to the variable `identity_tensor`.
- Do a matrix multiplication of `tensor_of_ones` with `identity_tensor` and print its value.
- Do an element-wise multiplication of `tensor_of_ones` with `identity_tensor` and print its value.


```
# Create a matrix of ones with shape 3 by 3
tensor_of_ones = torch.ones(3, 3)

# Create an identity matrix with shape 3 by 3
identity_tensor = torch.eye(3)

# Do a matrix multiplication of tensor_of_ones with identity_tensor
matrices_multiplied = torch.matmul(tensor_of_ones, identity_tensor)
print(matrices_multiplied)

# Do an element-wise multiplication of tensor_of_ones with identity_tensor
element_multiplication = tensor_of_ones * identity_tensor
print(element_multiplication)
```

# (2) Forward propagation

<img src="image/Screenshot 2021-01-26 153724.png">

## PyTorch implementation

<img src="image/Screenshot 2021-01-26 153836.png">

```
import torch

a = torch.Tensor([2])
b = torch.Tensor([-4])
c = torch.Tensor([-2])
d = torch.Tensor([2])
```

```
e = a + b
f = c * d
```

```
g = e * f
print(e, f, g)
```

# Exercise III: Forward pass

Let's build something that more closely resembles a neural network. The computational graph is given below. You are going to initialize 3 large random tensors and then perform the operations shown in the computational graph. The final operation is the mean of the tensor, given by `torch.mean(your_tensor)`.

<img src="image/graph_exercise.jpg">

### Instructions


- Initialize random tensors `x`, `y` and `z`, each having shape `(1000, 1000)`.
- Multiply `x` with `y`, putting the result in tensor `q`.
- Do an elementwise multiplication of tensor `z` with tensor `q`, putting the result in `f`.

```
# Initialize tensors x, y and z
x = torch.rand(1000, 1000)
y = torch.rand(1000, 1000)
z = torch.rand(1000, 1000)

# Multiply x with y
q = torch.matmul(x, y)

# Multiply elementwise z with q
f = z * q

# Compute the mean of f
mean_f = torch.mean(f)
print(mean_f)
```

# (3) Backpropagation by auto-differentiation

## Derivatives

<img src="image/Screenshot 2021-01-26 154814.png">

## Derivative Rules

| Interaction | Overall Change |
| :- | :- |
| Addition | $(f + g)' = f' + g'$ |
| Multiplication | $(f \cdot g)' = f \cdot g' + g \cdot f'$ |
| Powers | $(x^n)' = \frac{d}{dx}x^n = nx^{n-1}$ |
| Inverse | $(\frac{1}{x})' = -\frac{1}{x^2}$ |
| Division | $(\frac{f}{g})' = f' \cdot \frac{1}{g} - f \cdot \frac{g'}{g^2}$ |

$\frac{d}{dx}[(f(x))^n] = n(f(x))^{n-1} \cdot f'(x)$

$\frac{d}{dx}[f(g(x))] = f'(g(x))g'(x)$
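
These rules can be checked numerically with PyTorch's automatic differentiation. The sketch below applies the generalized power rule to $f(x) = (x^2 + 1)^3$ at $x = 2$; both the function and the point are arbitrary choices made for illustration.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# f(x) = (x^2 + 1)^3, so f'(x) = 3 * (x^2 + 1)^2 * 2x by the generalized power rule
f = (x ** 2 + 1) ** 3
f.backward()

# The same derivative computed by hand at x = 2: 3 * 25 * 4 = 300
by_hand = 3 * (2.0 ** 2 + 1) ** 2 * (2 * 2.0)

print(x.grad.item(), by_hand)
```

The value stored in `x.grad` should agree with the hand-computed result.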

## Derivative Example - Backward Pass

<img src="image/Screenshot 2021-01-26 162201.png">

## Backpropagation in PyTorch

```
import torch

x = torch.tensor(-3., requires_grad=True)
y = torch.tensor(5., requires_grad=True)
z = torch.tensor(-2., requires_grad=True)

q = x + y
f = q * z

f.backward()

print("Gradient of z is: " + str(z.grad))
print("Gradient of y is: " + str(y.grad))
print("Gradient of x is: " + str(x.grad))
```
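
The gradients this example computes can also be derived by hand with the chain rule, which gives a way to check the output. With $q = x + y$ and $f = q \cdot z$, and the values $x = -3$, $y = 5$, $z = -2$:

$\frac{\partial f}{\partial z} = q = x + y = 2$

$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial q} \cdot \frac{\partial q}{\partial x} = z \cdot 1 = -2$

$\frac{\partial f}{\partial y} = \frac{\partial f}{\partial q} \cdot \frac{\partial q}{\partial y} = z \cdot 1 = -2$

So the gradient of `z` is 2, and the gradients of `y` and `x` are both -2.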

# Exercise IV: Backpropagation by hand

<img src="image/der_example.jpg">

Given the computational graph above, we want to calculate the derivatives for the leaf nodes (`x`, `y` and `z`). To get you started, we have already calculated the results of the forward pass (in red) as well as the derivatives of `f` and `q`.

The rules for derivative computations have been given in the table below:

| Interaction | Overall Change |
| :- | :- |
| Addition | $(f + g)' = f' + g'$ |
| Multiplication | $(f \cdot g)' = f \cdot g' + g \cdot f'$ |
| Powers | $(x^n)' = \frac{d}{dx}x^n = nx^{n-1}$ |
| Inverse | $(\frac{1}{x})' = -\frac{1}{x^2}$ |
| Division | $(\frac{f}{g})' = f' \cdot \frac{1}{g} - f \cdot \frac{g'}{g^2}$ |

### Possible Answers

- The derivative of x is 5, the derivative of y is 5, the derivative of z is 1. (T)
- The derivative of x is 5, the derivative of y is 5, the derivative of z is 5. (F)
- The derivative of x is 8, the derivative of y is -3, the derivative of z is 0. (F)
- Derivatives are lame, integrals are cool. (F)

# Exercise V: Backpropagation using PyTorch

Here, you are going to use PyTorch's automatic differentiation to compute the derivatives of `x`, `y` and `z` from the previous exercise.

### Instructions

- Initialize tensors `x`, `y` and `z` to values 4, -3 and 5.
- Put the sum of tensors `x` and `y` in `q`, put the product of `q` and `z` in `f`.
- Calculate the derivatives of the computational graph.
- Print the gradients of the `x`, `y` and `z` tensors.


```
# Initialize x, y and z to values 4, -3 and 5
x = torch.tensor(4., requires_grad=True)
y = torch.tensor(-3., requires_grad=True)
z = torch.tensor(5., requires_grad=True)

# Set q to sum of x and y, set f to product of q with z
q = x + y
f = q * z

# Compute the derivatives
f.backward()

# Print the gradients
print("Gradient of x is: " + str(x.grad))
print("Gradient of y is: " + str(y.grad))
print("Gradient of z is: " + str(z.grad))
```

# Exercise VI: Calculating gradients in PyTorch

Remember the forward pass exercise? Now that you know how to calculate derivatives, let's take a step forward and calculate the gradients (derivatives of tensors) of the computational graph you built back then. We have already initialized three random tensors of shape `(1000, 1000)` for you, called `x`, `y` and `z`. First, we multiply tensors `x` and `y`, then we do an elementwise multiplication of their product with tensor `z`, and then we compute its mean. In the end, we compute the derivatives.

The main difference from the previous exercise is the scale of the tensors. While before, tensors `x`, `y` and `z` had just 1 number, now they each have 1 million numbers.

<img src="image/graph_exercise.jpg">

### Instructions

- Multiply tensors `x` and `y`, put the product in tensor `q`.
- Do an elementwise multiplication of tensors `z` with `q`.
- Calculate the gradients.


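```
# x, y and z are provided by the exercise; they are recreated here
# (with gradient tracking enabled) so that the cell runs on its own
x = torch.rand(1000, 1000, requires_grad=True)
y = torch.rand(1000, 1000, requires_grad=True)
z = torch.rand(1000, 1000, requires_grad=True)

# Multiply tensors x and y
q = torch.matmul(x, y)

# Elementwise multiply tensors z with q
f = z * q

mean_f = torch.mean(f)

# Calculate the gradients
mean_f.backward()
```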
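
After the backward pass, each leaf tensor holds a gradient with the same shape as the tensor itself. The sketch below checks this and compares the gradient of `z` with its closed form. It assumes a smaller size (`n = 100`, a value chosen here so that it runs quickly) and a fixed seed; neither comes from the exercise.

```python
import torch

torch.manual_seed(0)  # fixed seed so repeated runs use the same tensors
n = 100               # smaller than 1000 so the sketch runs quickly

x = torch.rand(n, n, requires_grad=True)
y = torch.rand(n, n, requires_grad=True)
z = torch.rand(n, n, requires_grad=True)

q = torch.matmul(x, y)
f = z * q
mean_f = torch.mean(f)
mean_f.backward()

# Each leaf gets a gradient with the same shape as the tensor itself
print(x.grad.shape, y.grad.shape, z.grad.shape)

# mean_f = sum(z * q) / n^2, so the gradient with respect to z is q / n^2
expected_z_grad = q.detach() / z.numel()
print(torch.allclose(z.grad, expected_z_grad))
```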

# (4) Introduction to Neural Networks

## Other classifiers

- k-Nearest Neighbour
- Logistic/Linear Regression
- Random Forests
- Gradient Boosted Trees
- Support Vector Machines
- ...

## Fully connected neural networks

<img src="image/Screenshot 2021-01-26 153724.png">

```
import torch

input_layer = torch.rand(10)

w1 = torch.rand(10, 20)
w2 = torch.rand(20, 20)
w3 = torch.rand(20, 4)
h1 = torch.matmul(input_layer, w1)
h2 = torch.matmul(h1, w2)
output_layer = torch.matmul(h2, w3)
print(output_layer)
```

## Building a neural network - PyTorch style

```
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 20)
        self.output = nn.Linear(20, 4)
    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.output(x)
        return x
```

```
input_layer = torch.rand(10)
net = Net()
result = net(input_layer)
```
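
To see what the `nn.Linear` layers contain, the parameters and output shapes can be inspected. The sketch below repeats the `Net` class so that it runs on its own; the batch size of 32 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 20)
        self.fc2 = nn.Linear(20, 20)
        self.output = nn.Linear(20, 4)

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        return self.output(x)

net = Net()

# nn.Linear stores its weight as (out_features, in_features) and adds a bias vector
for name, param in net.named_parameters():
    print(name, tuple(param.shape))

# A single sample with 10 features gives 4 outputs
single = net(torch.rand(10))

# A batch of 32 samples is processed in one call; the last dimension must still be 10
batch = net(torch.rand(32, 10))
print(single.shape, batch.shape)  # torch.Size([4]) torch.Size([32, 4])

# (10*20 + 20) + (20*20 + 20) + (20*4 + 4) = 724 trainable values
n_params = sum(p.numel() for p in net.parameters())
print(n_params)
```

Unlike the hand-written version with `torch.rand` weights, each `nn.Linear` layer also includes a bias term, and its parameters are registered on the module so that they can be trained later.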

# Exercise VII: Your first neural network

You are going to build a neural network in PyTorch the hard way. Your input will be images of size `(28, 28)`, that is, images containing `784` pixels. Your network will contain an `input_layer` (provided for you), a hidden layer with `200` units, and an output layer with `10` classes. You are going to create the weights and then do the matrix multiplications to get the results from the network.

### Instructions

- Initialize two weight matrices with random numbers, called `weight_1` and `weight_2`.
- Set the result of `input_layer` times `weight_1` to `hidden_1`. Set the result of `hidden_1` times `weight_2` to `output_layer`.


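```
# input_layer is provided by the exercise; a random stand-in with
# 784 values is assumed here so that the cell runs on its own
input_layer = torch.rand(784)

# Initialize the weights of the neural network
weight_1 = torch.rand(784, 200)
weight_2 = torch.rand(200, 10)

# Multiply input_layer with weight_1
hidden_1 = torch.matmul(input_layer, weight_1)

# Multiply hidden_1 with weight_2
output_layer = torch.matmul(hidden_1, weight_2)
print(output_layer)
```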

# Exercise VIII: Your first PyTorch neural network

You are going to build the same neural network you built in the previous exercise, but now using the PyTorch way. As a reminder, you have 784 units in the input layer, 200 hidden units and 10 units for the output layer.

### Instructions

- Instantiate two linear layers calling them `self.fc1` and `self.fc2`. Determine their correct dimensions.
- Implement the `.forward()` method, using the two layers you defined and returning `x`.


```
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        # Instantiate the two linear layers
        self.fc1 = nn.Linear(784, 200)
        self.fc2 = nn.Linear(200, 10)

    def forward(self, x):

        # Use the instantiated layers and return x
        x = self.fc1(x)
        x = self.fc2(x)
        return x
```
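
The network can then be applied to an image once the image is flattened into a vector of 784 values. The sketch below assumes a random tensor named `image` as a hypothetical stand-in for a real 28 by 28 picture, and repeats the class so that it runs on its own.

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 200)
        self.fc2 = nn.Linear(200, 10)

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        return x

# Hypothetical stand-in for one 28 x 28 grayscale image
image = torch.rand(28, 28)

# Flatten the image to a vector of 784 values before the first linear layer
input_layer = image.view(-1)

net = Net()
output_layer = net(input_layer)

print(input_layer.shape)   # torch.Size([784])
print(output_layer.shape)  # torch.Size([10])
```

The output holds one value for each of the 10 classes.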