# PyTorch & Deep Learning

![pytorch image](https://miro.medium.com/max/919/1*Z4L6D1RiQauGmB3TGK_wJg.gif)

PyTorch was released in 2017 as an open-source project by Facebook's AI Research team. It is an extremely popular framework for building and training neural nets. PyTorch works with tensors and makes it simple to move them to GPUs for the faster processing that training neural networks needs.

PyTorch also provides a much-loved module, `autograd`, that automatically calculates gradients (for backprop), so no more having to work out lots of partial derivatives by hand. Yay!
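
As a minimal sketch of what automatic gradients look like in practice, the snippet below differentiates $y = x^2 + 3x$ at $x = 2$ using `torch.autograd`. The function and variable names are illustrative and are not used elsewhere in this notebook.

```python
import torch

# requires_grad=True asks PyTorch to track the operations applied to x
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x

# backward() computes dy/dx and stores the result in x.grad
y.backward()
print(x.grad)  # dy/dx = 2x + 3, which is 7 at x = 2
```

The same mechanism scales to every weight in a network, which is what makes backprop painless.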

## PyTorch Neural Nets and Tensors

As a recap, we calculate the output of a network unit from its inputs $x_i$, weights $w_i$, bias $b$ and activation function $f$ by:

$ y = f( \sum_{i} w_{i}x_{i} + b ) $

### But what are tensors?

You can think of neural network calcs as a bunch of linear algebra calcs on tensors. Tensors are a generalisation of matrices:
- 1D tensor -> vector
- 2D tensor -> matrix
- 3D tensor -> array with three indices (for example, an RGB colour image)

Tensors are fundamental **data structures** in PyTorch.
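
A quick way to see these dimensions is to build a few tensors and inspect `ndim` and `shape`. This is only a sketch; the variable names and values are made up for illustration.

```python
import torch

scalar = torch.tensor(3.0)                 # 0D tensor: a single number
vector = torch.tensor([1.0, 2.0, 3.0])     # 1D tensor
matrix = torch.tensor([[1.0, 2.0],
                       [3.0, 4.0]])        # 2D tensor
image = torch.zeros(3, 4, 4)               # 3D tensor, e.g. channels x height x width

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("image", image)]:
    print(name, t.ndim, tuple(t.shape))
```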

## Let's build a simple Neural Network using PyTorch

In [2]:
import torch 
torch.__version__

'1.7.1'

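We will begin by defining our activation function, followed by creating the tensors for our features, weights and bias. We will use `torch.randn`, which fills a tensor of the given size with values from a normal distribution with mean 0 and standard deviation 1. Note that the weights below are created with `torch.rand_like`, which matches the shape of `features` but draws its values from a uniform distribution on [0, 1).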

In [5]:
torch.manual_seed(7)

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

features = torch.randn(size=(1,5))
weights  = torch.rand_like(features)
bias = torch.randn(size=(1,1))

Now, let's do a simple feed-forward pass, using the equation for $y$ given above. The first step is always to calculate the dot product between our features (input layer) and our weights.

In [23]:
# summ = 0
# for i in range(features.shape[1]):
#     summ += (features[0][i]* weights[0][i])
# summ = summ + bias
# summ = sigmoid(summ)
# print(summ)

y = sigmoid(torch.sum(features * weights) + bias)
print(y)

tensor([[0.6140]])
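
As a side note, PyTorch also ships a built-in `torch.sigmoid`. The sketch below checks that it agrees with the hand-written `sigmoid` on a few arbitrary values (the tensor `z` is illustrative, not part of the network).

```python
import torch

def sigmoid(x):
    # same hand-written activation as defined earlier
    return 1 / (1 + torch.exp(-x))

z = torch.tensor([[-2.0, 0.0, 2.0]])  # arbitrary illustrative inputs
print(sigmoid(z))
print(torch.sigmoid(z))
print(torch.allclose(sigmoid(z), torch.sigmoid(z)))
```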


Let's instead calculate it using the built-in matrix multiplication function, `torch.mm`.

In [6]:
torch.mm(features, weights)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x5 and 1x5)

As you'd expect, we have a shape mismatch. To multiply two matrices, their shapes must be compatible: the number of columns of the first tensor must equal the number of rows of the second. There are three ways to change a tensor's shape:
- `matr.reshape(a,b)`: returns a tensor with the same data as `matr`, but with size (a,b). Sometimes this is a view of the same memory, and sometimes it copies the data to a new part of memory.
- `matr.resize_(a,b)`: returns the same tensor with a new shape. If the new shape has fewer elements, some are dropped from the tensor; if it has more, the new elements are uninitialised. The trailing underscore shows that it happens in-place.
- `matr.view(a,b)`: returns a new tensor that shares its data with `matr`, but with the new size (a,b). It never copies, and raises an error if the requested shape does not fit the tensor's number of elements or memory layout.
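
The differences between the three are easier to see on a small tensor. This is a minimal sketch with an illustrative tensor `matr`, not the features or weights used in this notebook.

```python
import torch

matr = torch.arange(6.0).reshape(1, 6)   # illustrative tensor: [[0., 1., 2., 3., 4., 5.]]

viewed = matr.view(2, 3)       # shares memory with matr
reshaped = matr.reshape(3, 2)  # a view when possible, otherwise a copy

# Writing through the view changes the original, because the data is shared
viewed[0, 0] = 100.0
print(matr[0, 0])              # tensor(100.)

# view needs a compatible memory layout; a transposed tensor is non-contiguous
t = viewed.t()                 # shape (3, 2), non-contiguous
try:
    t.view(6)
except RuntimeError as err:
    print("view failed:", type(err).__name__)
print(t.reshape(6).shape)      # reshape falls back to copying when it has to

# resize_ changes the shape of matr itself, in place
matr.resize_(3, 2)
print(matr.shape)
```

In practice `view` is a good default when you want a guarantee that no copy is made, and `reshape` when you just want the new shape to work.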

In [24]:
y = sigmoid(torch.mm(features,weights.view(5,1)) + bias)
y

tensor([[0.6140]])
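
The matrix product and the earlier element-wise sum are the same calculation written two ways. The sketch below checks this on freshly generated tensors; the seed is illustrative and the values are not meant to reproduce the outputs shown in this notebook.

```python
import torch

torch.manual_seed(7)  # illustrative seed; exact values are not asserted here
features = torch.randn(size=(1, 5))
weights = torch.rand_like(features)

elementwise = torch.sum(features * weights)        # 0D tensor
via_mm = torch.mm(features, weights.view(5, 1))    # shape (1, 1)
via_matmul = features @ weights.T                  # shape (1, 1), operator form

print(torch.allclose(elementwise, via_mm.squeeze()))
print(torch.allclose(via_mm, via_matmul))
```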

Now we will enhance our network by adding a hidden layer. This adds a little complexity, but luckily linear algebra and PyTorch abstract most of it away.

Our neural network will have:
- input layer size (1,3) -> features
- hidden layer -> 2 nodes
- output layer -> 1 node 

Hence we will calculate:

$ y = f_{2}( f_{1}( \vec{x} W_{1} + b_{1} ) W_{2} + b_{2} ) $

In [26]:
features = torch.randn(size=(1,3))
print('Our feature input: ', features)
# define network
input_nodes = features.shape[1] #3
hidden_nodes = 2
output_nodes = 1
print(f'We have {input_nodes} input nodes, {hidden_nodes} hidden nodes, and {output_nodes} output nodes.')
# define weights
w_0_1 = torch.randn(size=(input_nodes, hidden_nodes))
w_1_2 = torch.randn(size=(hidden_nodes, output_nodes))
# define bias
b_0_1 = torch.randn(size=(1,hidden_nodes))
b_1_2 = torch.randn(size=(1, output_nodes))

Our feature input:  tensor([[ 1.2026, -0.0063, -0.2413]])
We have 3 input nodes, 2 hidden nodes, and 1 output nodes.


A very simple network output can be calculated by:

In [27]:
h1 = sigmoid(torch.mm(features, w_0_1) + b_0_1)
y = sigmoid(torch.mm(h1, w_1_2) + b_1_2)
y

tensor([[0.6247]])
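
The same pattern generalises to any number of layers by looping over (weights, bias) pairs. This is a minimal sketch: the `forward` helper and the `sizes` list are hypothetical names introduced here, and the random values will differ from the outputs shown in this notebook.

```python
import torch

def sigmoid(x):
    return 1 / (1 + torch.exp(-x))

def forward(x, layers):
    """Apply each (weights, bias) pair in turn, with a sigmoid after every layer."""
    out = x
    for w, b in layers:
        out = sigmoid(torch.mm(out, w) + b)
    return out

# Illustrative sizes matching the network above: 3 -> 2 -> 1
sizes = [3, 2, 1]
layers = [(torch.randn(n_in, n_out), torch.randn(1, n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x = torch.randn(1, 3)
y = forward(x, layers)
print(y.shape)  # torch.Size([1, 1])
```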

## Man, is this really how you can build large networks?
I know, you're probably thinking this is such a tedious way to build neural networks. If we are doing the linear algebra by hand for every layer, what happens when we start building neural networks with 100s of layers?

Well, do not worry. PyTorch provides a much easier way to build large neural networks.
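
As a rough preview, and assuming the tooling meant here is the `torch.nn` module, the same 3 -> 2 -> 1 network can be declared in a few lines. This is a sketch only; the layer sizes mirror the example above and the weights are initialised by PyTorch, so the output will not match the earlier values.

```python
import torch
from torch import nn

# Same 3 -> 2 -> 1 architecture; nn.Linear creates and stores its own weights and bias
model = nn.Sequential(
    nn.Linear(3, 2),
    nn.Sigmoid(),
    nn.Linear(2, 1),
    nn.Sigmoid(),
)

x = torch.randn(1, 3)
y = model(x)
print(y.shape)  # torch.Size([1, 1])
```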

In [29]:
import numpy as np 
import matplotlib.pyplot as plt 