# Deep Learning: Reference Sheet
***
![Subset Diagram](https://github.com/Tomjohnsonellis/strive-work/blob/main/deep-learning/img/subsets.png?raw=true)
<br>***Artificial Intelligence:*** Programs with the ability to learn and reason like humans.
<br>***Machine Learning:*** Algorithms with the ability to "learn" without being explicitly programmed. Mathematical techniques such as linear regression and backpropagation count here.
<br>***Deep Learning:*** A subset of machine learning. Essentially, when a neural network has 3+ layers, we call it a deep learning model.
***
## Common Neural Network Terms
- Activation Function
<br>*What?* Applied to the output of a neuron to determine how "activated" it is, for example [ReLU](https://deepai.org/machine-learning-glossary-and-terms/rectified-linear-units) or [Sigmoid](https://deepai.org/machine-learning-glossary-and-terms/sigmoid-function).
<br>![Activation Functions](https://github.com/Tomjohnsonellis/strive-work/blob/main/deep-learning/img/activations.png?raw=true)
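<br>*Why?* These show numerically which neurons are being used for the current inputs. They also add non-linearity, without which stacked layers would collapse into a single linear function. Fundamental in training networks!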
<br>*[How?](https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html)*
```python
# Typically used in a sequential model on a layer's outputs and then passed to the next layer
torch.nn.ReLU(inplace=False)
```
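As a rough sketch of the comment above, an activation can sit between two linear layers in an `nn.Sequential` model. The layer sizes (4, 8, 2) and the random batch are assumptions made up for illustration.

```python
import torch
from torch import nn

# Hypothetical layer sizes, chosen only for illustration
model = nn.Sequential(
    nn.Linear(4, 8),  # layer producing 8 raw outputs
    nn.ReLU(),        # activation applied to those outputs
    nn.Linear(8, 2),  # next layer receives the activated values
)

batch = torch.randn(3, 4)            # 3 samples, 4 features each
hidden = model[1](model[0](batch))   # values straight after the activation
output = model(batch)                # full forward pass

print((hidden >= 0).all().item())    # ReLU never outputs negative values
print(output.shape)
```

Swapping `nn.ReLU()` for `nn.Sigmoid()` should work the same way, since both are modules applied element-wise.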
- [Softmax](https://deepai.org/machine-learning-glossary-and-terms/softmax-layer)
<br>*What?* Takes a raw output vector and converts it to numbers that add up to 1.
<br>![Softmax Equation](https://github.com/Tomjohnsonellis/strive-work/blob/main/deep-learning/img/softmax.png?raw=true)
<br>*Why?* The raw outputs from a NN will probably be arbitrary numbers, so this function converts them into values that can be read as probabilities.
<br>*[How?](https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html)*
```python
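# Functional form: pass the tensor and the dimension directly
torch.nn.functional.softmax(outputs_of_a_neural_network, dim=dimension_to_softmax)
# Or, as a module: the constructor takes only dim, the tensor is passed when calling it
softmax = torch.nn.Softmax(dim=dimension_to_softmax)
softmaxed_outputs = softmax(outputs_of_a_neural_network)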
```
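A small worked example may help here. The logits below are made-up numbers, and the manual line is just the softmax formula written out, `exp(x_i) / sum_j exp(x_j)`, to compare against PyTorch's result.

```python
import torch

logits = torch.tensor([[2.0, 1.0, 0.1]])  # made-up raw outputs for one sample
probs = torch.softmax(logits, dim=1)      # softmax across the class dimension

# Same calculation by hand: exp(x_i) / sum_j exp(x_j)
manual = logits.exp() / logits.exp().sum(dim=1, keepdim=True)

print(probs)             # every value lies between 0 and 1
print(probs.sum(dim=1))  # the values in each row add up to 1
print(torch.allclose(probs, manual))
```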
- Dropout
<br>*What?* Randomly deactivates a fraction of neurons when training a NN. A different random set of neurons is dropped at each step.
<br>![Dropout](https://github.com/Tomjohnsonellis/strive-work/blob/main/deep-learning/img/dropout.gif?raw=true)
<br>*Why?* This helps prevent models from overfitting to the training data. Without dropout, optimisations (like gradient descent) are applied to the entire network at each step. This can lead to neurons fixing the mistakes of other neurons, resulting in complicated co-adaptations which don't generalise well.
<br>*[How?](https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html)*
<br>[>>Example Model<<](https://wandb.ai/authors/ayusht/reports/Dropout-in-PyTorch-An-Example--VmlldzoxNTgwOTE)
```python
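# p is a probability between 0 and 1 (e.g. 0.25 drops roughly a quarter of the neurons), not a percentage
self.dropout = nn.Dropout(p=percentage_of_neurons_to_drop)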
```
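The sketch below illustrates that dropout only acts in training mode, assuming a standalone `nn.Dropout` module and an arbitrary `p=0.5`. In training mode PyTorch also scales the surviving values by `1 / (1 - p)`, so the expected total activation stays the same.

```python
import torch
from torch import nn

dropout = nn.Dropout(p=0.5)  # p=0.5 is an arbitrary choice for illustration
x = torch.ones(1000)

dropout.train()              # training mode: neurons are randomly zeroed
train_out = dropout(x)
dropout.eval()               # evaluation mode: dropout passes inputs through unchanged
eval_out = dropout(x)

# Survivors are scaled by 1 / (1 - p) = 2, so only the values 0 and 2 appear
print(train_out.unique())
print(torch.equal(eval_out, x))
```

In a full model, calling `model.train()` or `model.eval()` switches every dropout layer at once.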



- Batch Normalisation
<br>*What?* In addition to normalising the inputs to the network, we normalise the inputs to *each layer* using the mean and variance of the current batch's values.
<br>*Why?* Faster and more stable training - [Paper](https://arxiv.org/pdf/1502.03167.pdf) - In short, it helps combat Internal Covariate Shift: during training, the distribution of each layer's inputs keeps changing as the parameters of the earlier layers are updated, so later layers have to keep re-adapting.
<br>*How?* [2D or 3D inputs](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm1d.html) / [4D inputs](https://pytorch.org/docs/stable/generated/torch.nn.BatchNorm2d.html)
<br>[>>Article<<](https://www.machinecurve.com/index.php/2021/03/29/batch-normalization-with-pytorch/)
```python
# Typically used in a sequential model on the outputs of a layer, then passed to the activation function
nn.BatchNorm1d(number_of_outputs_from_layer)
```
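To see the normalisation itself, here is a minimal sketch with a made-up batch of 32 samples and 5 features. It assumes a freshly created `nn.BatchNorm1d` layer, which starts in training mode with its learnable scale at 1 and shift at 0.

```python
import torch
from torch import nn

torch.manual_seed(0)
batch = torch.randn(32, 5) * 10 + 3  # 32 samples, 5 features, deliberately off-centre
bn = nn.BatchNorm1d(5)               # tracks one mean/variance pair per feature
normalised = bn(batch)               # training mode: uses this batch's statistics

print(normalised.mean(dim=0))                 # close to 0 for each feature
print(normalised.var(dim=0, unbiased=False))  # close to 1 for each feature
```

In evaluation mode (`bn.eval()`) the layer uses its running averages instead of the current batch's statistics.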

- Loss Function
<br>*What?* A function to calculate the error of a model (*How* incorrect it is)
<br>*Why?* This is how we determine that a model is learning, you know, *the entire point of machine learning*.
<br>*How?*
```python
# Mean-Squared Error, used in regression https://pytorch.org/docs/stable/generated/torch.nn.MSELoss.html
score_criteria = nn.MSELoss()
loss = score_criteria(model_predictions, actual_values)
# Negative Log-Likelihood Loss, used in classification (remember to use LogSoftmax!) https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html
score_criteria = nn.NLLLoss()
loss = score_criteria(model_predictions, actual_values)
# Cross Entropy Loss, classification but doesn't need a softmax on the outputs https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html
score_criteria = nn.CrossEntropyLoss()
loss = score_criteria(model_predictions, actual_values)
```
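The relationship between the two classification losses can be checked directly. This sketch uses made-up logits and targets, and shows that `CrossEntropyLoss` on raw outputs matches `NLLLoss` on `LogSoftmax` outputs. A tiny MSE example is included for comparison.

```python
import torch
from torch import nn

logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])  # raw scores: 2 samples x 3 classes
targets = torch.tensor([0, 2])                              # correct class index per sample

ce = nn.CrossEntropyLoss()(logits, targets)                # takes raw scores
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), targets)  # takes log-probabilities
print(torch.allclose(ce, nll))

# MSE by hand: ((2.5 - 3.0)**2 + (0.0 - (-1.0))**2) / 2 = 0.625
mse = nn.MSELoss()(torch.tensor([2.5, 0.0]), torch.tensor([3.0, -1.0]))
print(mse.item())
```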
- Optimizer
<br>*What?* An algorithm that updates our model to reduce error
<br>*Why?* This is how we improve our models!
<br>*[How?](https://pytorch.org/docs/stable/optim.html)*
```python
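optimizer = torch.optim.SGD(model.parameters(), lr=a_learning_rate)
# Or another optimizer, e.g. Adam (the same arguments are assumed here)
optimizer = torch.optim.Adam(model.parameters(), lr=a_learning_rate)
# Used as part of the training loop
loss = score_criteria(model_predictions, actual_values)
loss.backward()        # compute gradients of the loss w.r.t. the parameters
optimizer.step()       # update the parameters using those gradients
optimizer.zero_grad()  # clear the gradients so they don't accumulate into the next step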
```
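Putting the pieces together, a complete loop might look like the sketch below. The toy data (`y = 3x + 1`), the single `nn.Linear` model, the learning rate and the step count are all assumptions chosen so the example runs quickly.

```python
import torch
from torch import nn

torch.manual_seed(0)
inputs = torch.linspace(-1, 1, 50).unsqueeze(1)  # shape (50, 1)
actual_values = 3 * inputs + 1                   # toy target: y = 3x + 1

model = nn.Linear(1, 1)                          # one weight and one bias to learn
score_criteria = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

initial_loss = score_criteria(model(inputs), actual_values).item()
for step in range(200):
    model_predictions = model(inputs)
    loss = score_criteria(model_predictions, actual_values)
    loss.backward()        # gradients of the loss w.r.t. weight and bias
    optimizer.step()       # nudge the parameters to reduce the loss
    optimizer.zero_grad()  # reset gradients before the next step
final_loss = score_criteria(model(inputs), actual_values).item()

print(initial_loss, final_loss)  # the loss should shrink considerably
```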

```python
import torch
x = torch.Tensor([ [1], [-1], [0], [-2.5], [4]])
f = torch.nn.ReLU()
print(x)
print(f(x))

print(x)
g = torch.nn.Dropout(p=0.25)
print(g(x))
```
Output:
```
tensor([[ 1.0000],
        [-1.0000],
        [ 0.0000],
        [-2.5000],
        [ 4.0000]])
tensor([[1.],
        [0.],
        [0.],
        [0.],
        [4.]])
tensor([[ 1.0000],
        [-1.0000],
        [ 0.0000],
        [-2.5000],
        [ 4.0000]])
tensor([[ 0.0000],
        [-1.3333],
        [ 0.0000],
        [-3.3333],
        [ 5.3333]])
```
