# 3a. Neural networks with PyTorch using nn.Module

The PyTorch `nn` module offers building blocks (layers, functions, containers) for constructing neural networks. They are described in detail in the [torch.nn documentation](https://pytorch.org/docs/stable/nn.html "torch.nn module documentation"). This notebook gives an overview of the module's capabilities.

In [None]:
import torch
import matplotlib.pyplot as plt
from torch import nn

import helper

%matplotlib inline
import seaborn as sns
sns.set(style="darkgrid")

## Neural network layers

Building a neural network from raw tensors alone would require a lot of work. To make this process easier, deep learning frameworks offer built-in building blocks for various architectures, and PyTorch is no exception.
The **torch.nn** module offers linear, convolutional, recurrent and many more types of layers that you can use in your future projects.

In this notebook we will focus on linear and dropout layers.


### Linear layers

Linear layers apply a linear transformation to the input:
<img src="https://latex.codecogs.com/gif.latex?y=xA^{T}&plus;b" title="y=xA^{T}+b" />

To create a linear layer in PyTorch, you need to specify:

 - **in_features** - size of each input sample,
 - **out_features** - size of each output sample,
 - **bias** - whether a bias term is added to the output (defaults to `True`).

In [None]:
linear_layer = torch.nn.Linear(in_features=100, out_features=20)
print(linear_layer)
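
The formula above can be checked directly against a layer's parameters. The following is a minimal sketch; the batch size of 8 and the fixed random seed are arbitrary choices for illustration and do not come from the notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the random weights are reproducible

linear_layer = nn.Linear(in_features=100, out_features=20)

# A batch of 8 samples with 100 features each (batch size chosen arbitrarily)
x = torch.randn(8, 100)
y = linear_layer(x)

print(linear_layer.weight.shape)  # torch.Size([20, 100])
print(linear_layer.bias.shape)    # torch.Size([20])
print(y.shape)                    # torch.Size([8, 20])

# Recompute y = x A^T + b by hand and compare with the layer output
manual = x @ linear_layer.weight.T + linear_layer.bias
print(torch.allclose(y, manual, atol=1e-5))
```

Note that the weight matrix is stored with shape `(out_features, in_features)`, which is why the formula uses its transpose.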

### Dropout layers

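Dropout layers help prevent overfitting by randomly zeroing some elements of their input during training. Each element is zeroed independently with probability **p**, and the remaining elements are scaled by 1/(1-p). In evaluation mode the layer passes its input through unchanged.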

A dropout layer in PyTorch takes the following parameters:

- **p** - probability of zeroing an element,
- **inplace** - whether the operation should be performed in-place.

In [None]:
dropout_layer = torch.nn.Dropout(p=0.3)
print(dropout_layer)
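
To see what the layer does in practice, the sketch below compares training and evaluation mode. The input of 1000 ones and the random seed are assumptions made for illustration only.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the run is reproducible

dropout_layer = nn.Dropout(p=0.3)
x = torch.ones(1000)

# Training mode (the default for a new module): elements are zeroed at random
dropout_layer.train()
train_out = dropout_layer(x)
kept = train_out[train_out != 0]
print((train_out == 0).float().mean())  # fraction of zeroed elements, roughly 0.3
print(kept.unique())                    # surviving elements are scaled by 1 / (1 - 0.3)

# Evaluation mode: dropout leaves the input unchanged
dropout_layer.eval()
eval_out = dropout_layer(x)
print(torch.equal(eval_out, x))
```

This is why it matters to call `train()` or `eval()` on a network before training or inference.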

## Activation functions

Activation functions are non-linear functions applied to a neuron's output that determine how strongly the neuron is activated. Some of them, such as sigmoid and tanh, also squash the outputs into a bounded range.

These functions need to be efficient since they are called for every neuron - often thousands or millions of times. At the same time they should not lead to vanishing or exploding gradients.
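
The vanishing gradient issue can be illustrated with autograd. This is a small sketch, with the input values 1 and 10 picked arbitrarily: the sigmoid gradient becomes very small for large inputs, while the ReLU gradient stays at 1 for positive inputs.

```python
import torch

x = torch.tensor([1.0, 10.0], requires_grad=True)

# Summing the outputs lets backward() return the derivative for each element
torch.sigmoid(x).sum().backward()
sigmoid_grad = x.grad.clone()

# Reset the accumulated gradient before the second backward pass
x.grad.zero_()
torch.relu(x).sum().backward()
relu_grad = x.grad.clone()

print(sigmoid_grad)  # about 0.197 at x=1, about 4.5e-05 at x=10
print(relu_grad)     # 1 for both positive inputs
```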

Some of the most popular activation functions are tanh, sigmoid and ReLU, presented below. For more information about their properties, refer to the article [7 Types of Activation Functions](https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/).


Tanh | Sigmoid | ReLU
:--: | :-----: | :--:
<img src="https://pytorch.org/docs/stable/_images/Tanh.png"/> | <img src="https://pytorch.org/docs/stable/_images/Sigmoid.png"/> | <img src="https://pytorch.org/docs/stable/_images/ReLU.png"/>

<p style='text-align: right;'> source: <a href="https://pytorch.org/docs/">PyTorch docs</a></p>


### PyTorch activation functions

Thankfully, you do not have to implement activation functions yourself. PyTorch and other frameworks offer built-in activation functions in their modules.

This section shows how to use the most popular ones.

In [None]:
# Prepare input tensor
input_tensor = torch.arange(-10.0,11.0)
print(input_tensor.shape)

Activation functions can be defined as layers:

In [None]:
# Calculate tanh output
tanh_activation = nn.Tanh()
tanh_out = tanh_activation(input_tensor)
print(tanh_out.shape)

Or applied directly like regular functions:

In [None]:
# Calculate tanh output
tanh_out = torch.tanh(input_tensor)
print(tanh_out.shape)
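
The two forms compute the same values. A quick check, as a minimal sketch that reuses the same input range as above:

```python
import torch
from torch import nn

input_tensor = torch.arange(-10.0, 11.0)

layer_out = nn.Tanh()(input_tensor)      # layer (module) form
function_out = torch.tanh(input_tensor)  # function form

print(torch.allclose(layer_out, function_out))
```

The layer form is convenient when the activation is stored as an attribute of a network, while the function form is handy for one-off calls.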

In [None]:
# Calculate sigmoid output
sigmoid_out = torch.sigmoid(input_tensor)
print(sigmoid_out.shape)

In [None]:
# Calculate ReLU output
relu_out = torch.relu(input_tensor)
print(relu_out.shape)

In [None]:
# Plot the results
plots = {'Tanh': tanh_out.numpy(), 'Sigmoid': sigmoid_out.numpy(), 'ReLU': relu_out.numpy()}
helper.plot_multiple(input_tensor.numpy(), plots)

## Loading data

Load the MNIST data and read the sizes of the training data and labels. You can also see what the training data looks like.

In [None]:
from torchvision import datasets, transforms
from workshop import data

# Transforms define which steps will be applied to each sample
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5],
                         std=[0.5]),
])

# Download and load the training data
trainset = datasets.MNIST(data.DATA_PATH, download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
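
With a mean and standard deviation of 0.5, the normalization step maps pixel values from [0, 1] to [-1, 1]. The sketch below reproduces the formula `(input - mean) / std` by hand on three made-up pixel values, so it does not need the dataset.

```python
import torch

# ToTensor() yields pixel values in [0, 1]; these three values stand in for pixels
pixels = torch.tensor([0.0, 0.5, 1.0])

mean, std = 0.5, 0.5
normalized = (pixels - mean) / std  # same per-channel formula that Normalize applies

print(normalized)  # tensor([-1., 0., 1.])
```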

In [None]:
dataiter = iter(trainloader)
images, labels = next(dataiter)
print(images.shape)
print(labels.shape)

helper.view_data(trainloader)
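
Linear layers expect each sample to be a flat vector, so image batches are usually flattened first. The sketch below uses a random tensor as a stand-in for one MNIST batch, assuming the usual shape of 64 grayscale images of 28x28 pixels. Whether the `helper` functions already flatten the input is not shown in this notebook, so treat this as an illustration only.

```python
import torch
from torch import nn

# Stand-in for one MNIST batch: 64 grayscale images of 28x28 pixels (assumed shape)
images = torch.randn(64, 1, 28, 28)

# Keep the batch dimension and collapse the remaining ones
flat = images.view(images.shape[0], -1)
print(flat.shape)  # torch.Size([64, 784])

# nn.Flatten() does the same job as a layer
flat_layer_out = nn.Flatten()(images)
print(flat_layer_out.shape)  # torch.Size([64, 784])
```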

## Exercise 1:

Define a network with the following configuration:

* one hidden linear layer with sigmoid activation function
* one output linear layer with softmax activation function
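
Softmax has not been shown earlier in this notebook, so here is a minimal sketch of how it behaves. The scores are made-up values for 2 samples and 3 classes; `dim=1` tells the layer to normalize across the class dimension.

```python
import torch
from torch import nn

# Hypothetical raw scores for 2 samples and 3 classes
scores = torch.tensor([[1.0, 2.0, 3.0],
                       [0.0, 0.0, 0.0]])

softmax = nn.Softmax(dim=1)  # normalize across the class dimension
probs = softmax(scores)

print(probs.sum(dim=1))  # each row sums to 1
print(probs[1])          # equal scores give equal probabilities
```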

In [None]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        
        # TODO: Define hidden layer
                
        # TODO: Define output layer

        # TODO: Define activation functions
        
    def forward(self, x):
        x = self.hidden(x)
        x = self.hidden_activation(x)
        x = self.output(x)
        x = self.output_activation(x)
        return x

In [None]:
network = Network()
helper.test_network(network, trainloader)

## Exercise 2:

Define a network with an architecture of your choice.

In [None]:
class Network(nn.Module):
    def __init__(self):
        super().__init__()
        
        # TODO: Define all network layers 
        # Remember to define at least:
        # - input to hidden (input size: )
        # - hidden to output (output size: 10)
        
        # TODO: Define all activation functions
        # - use softmax for last layer
        
    def forward(self, x):
        # TODO: Pass input tensor through all defined layers
        return x

In [None]:
network = Network()
helper.test_network(network, trainloader)

## References:

- [PyTorch NN module documentation](https://pytorch.org/docs/stable/nn.html)
- [7 Types of Activation Functions: How to Choose?](https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/)