# BUILD THE NEURAL NETWORK

This [tutorial](https://pytorch.org/tutorials/beginner/basics/buildmodel_tutorial.html) is also available as a webpage.

From `torch.nn` we get all the building blocks required for building a neural network.
It's probably best just to try it out, but basically a neural network in PyTorch is an object: an instance of a class that subclasses `nn.Module`.

In [2]:
import os
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

## Get Device for Training

We would very much like our neural networks to run on the GPU when one is available, because that is usually significantly faster. So that is what we will start by trying to accomplish:

In [3]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using {device} device')

Using cpu device
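
As a small sketch of how the `device` string is used afterwards: both the model and its input data have to live on the same device. The tiny `nn.Linear` layer below is a hypothetical stand-in and not part of the tutorial.

```python
import torch
from torch import nn

# Same device selection as in the cell above.
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# A tiny stand-in layer (hypothetical, only for this illustration).
layer = nn.Linear(4, 2).to(device)

# Create the input directly on the same device as the layer.
x = torch.rand(3, 4, device=device)

out = layer(x)
print(out.device.type)  # 'cuda' on a GPU machine, otherwise 'cpu'
print(out.shape)        # torch.Size([3, 2])
```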


## Define the Class

Then define the class (object) which is our neural network:

In [4]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        # nn.Flatten collapses every dimension except the first (the batch
        # dimension), so each 28x28 image becomes a vector of 784 values.
        self.flatten = nn.Flatten()
        
        # Then this is actually the network.
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28*28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    # Here is the function that takes the data through the network.
    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits


# Here we just look at how it looks (yes.. very philosophical)    
model = NeuralNetwork().to(device)
print(model)

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
)
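
As a sanity check on the layer sizes printed above, the number of trainable parameters can be counted in two ways. This is only a sketch: the `stack` variable rebuilds the same layers so the snippet runs on its own, and the names `counted` and `by_hand` are made up for the illustration.

```python
import torch
from torch import nn

# Same layer stack as in NeuralNetwork above, rebuilt so the snippet is self-contained.
stack = nn.Sequential(
    nn.Linear(28 * 28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)

# Count by asking PyTorch for every parameter tensor.
counted = sum(p.numel() for p in stack.parameters())

# Count by hand: weights (in * out) plus biases (out) for each Linear layer.
by_hand = (784 * 512 + 512) + (512 * 512 + 512) + (512 * 10 + 10)

print(counted, by_hand)  # both are 669706
```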


To use our neural network (model) we pass it the input data. Do not call `model.forward()` directly on the data: calling `model()` goes through `nn.Module.__call__`, which runs `forward` together with some background operations (such as registered hooks).

So we instead call `model()` on the data.

In [5]:
X = torch.rand(1, 28, 28, device=device)

# "Logits" are the raw, unnormalised scores that the network outputs,
# one per class.
logits = model(X)
pred_probab = nn.Softmax(dim=1)(logits)

# Softmax (applied above) rescales the logits to values in [0, 1] that sum to 1;
# argmax then picks the class with the highest probability.
y_pred = pred_probab.argmax(1)
print(f"Predicted class: {y_pred}")

Predicted class: tensor([6])
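
The difference between `model(X)` and `model.forward(X)` can be made visible with a forward hook. This is a minimal sketch on a hypothetical small `nn.Linear` layer; the hook function `log_call` and the list `calls` are made-up names for the illustration.

```python
import torch
from torch import nn

layer = nn.Linear(4, 2)  # small stand-in model (hypothetical)
calls = []               # records every time the hook fires

def log_call(module, inputs, output):
    # Forward hooks receive the module, its inputs (a tuple) and its output.
    calls.append(tuple(output.shape))

handle = layer.register_forward_hook(log_call)
x = torch.rand(3, 4)

layer(x)          # goes through nn.Module.__call__, so the hook fires
layer.forward(x)  # bypasses __call__, so the hook is skipped

print(len(calls))  # 1
handle.remove()    # detach the hook again
```

Only the call through `layer(x)` is recorded, which is one concrete reason to prefer `model()` over `model.forward()`.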


## Model Layers

What do the different layers of the model do?

### `nn.Flatten()`
As I *commented* in the code for the model, the flatten layer simply flattens our data. In this case that makes good sense, as the data is images: each 28x28 image becomes a vector of 784 values.

Here is an example of the shape of the data before and after flattening.

In [6]:
# start out by creating a VERY random batch of 3 images (28x28 pixels each)
input_image = torch.rand(3, 28, 28)
print("The size of the input image, before the 'flatten':")
print(input_image.size())

# create an instance of the flatten layer (under a shorter name)
flatten = nn.Flatten()
flat_image = flatten(input_image)
print("\nThe size of the image after 'flattening':")
print(flat_image.size())

The size of the input image, before the 'flatten':
torch.Size([3, 28, 28])

The size of the image after 'flattening':
torch.Size([3, 784])
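
To make explicit what the layer does, here is a small sketch comparing `nn.Flatten()` with a manual `reshape` that keeps the first (batch) dimension. The variable names are made up for the illustration.

```python
import torch
from torch import nn

batch = torch.rand(3, 28, 28)  # 3 images of 28x28 pixels

flat = nn.Flatten()(batch)                   # layer form
reshaped = batch.reshape(batch.size(0), -1)  # manual form

print(flat.shape)                   # torch.Size([3, 784])
print(torch.equal(flat, reshaped))  # True

# With start_dim=0 the batch dimension is flattened as well.
print(nn.Flatten(start_dim=0)(batch).shape)  # torch.Size([2352])
```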


### The ReLU and Linear layers

I think I understand roughly what the Rectified Linear Unit (ReLU) and Linear layers do, well enough that I do not need to describe them in detail here.
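
For reference, a minimal sketch of what the two layers compute, using hypothetical small sizes rather than the ones in the model: `nn.Linear` applies `x @ W^T + b`, and `nn.ReLU` replaces negative values with zero.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(in_features=4, out_features=3)
x = torch.rand(2, 4)

# nn.Linear computes x @ W^T + b with its stored weight and bias.
manual = x @ layer.weight.T + layer.bias
print(torch.allclose(layer(x), manual, atol=1e-6))  # True

# nn.ReLU keeps positive values and replaces negative ones with zero.
values = torch.tensor([-2.0, -0.5, 0.0, 1.5])
relu_out = nn.ReLU()(values)
print(relu_out.tolist())  # [0.0, 0.0, 0.0, 1.5]
```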

Alas, moving on...

### `nn.Softmax()`

The model returns raw values (logits) anywhere between negative and positive infinity. The Softmax function rescales them to the range [0, 1] so that they sum to 1 along the dimension given by `dim`. The results can then be read as the predicted probability of each class, and the class with the highest probability is the one the model predicts for that observation.
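
A small sketch of this, using made-up logits for 2 observations and 3 classes (the numbers are hypothetical, not output from the model above):

```python
import torch
from torch import nn

# Hypothetical logits for a batch of 2 observations and 3 classes.
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [-1.0, 0.0, 3.0]])

probs = nn.Softmax(dim=1)(logits)  # normalise across the class dimension

# Same result written out: exp(logit) / sum of exp(logits) per row.
manual = logits.exp() / logits.exp().sum(dim=1, keepdim=True)

print(torch.allclose(probs, manual))  # True
print(probs.sum(dim=1))               # each row sums to 1 (up to rounding)
print(probs.argmax(dim=1))            # tensor([0, 2])
```

Softmax preserves the ordering of the logits, so `argmax` returns the same class whether it is applied before or after the rescaling.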

## Model parameters

This probably won't be very useful at any point, but I might as well note it down.

After the model has been trained, how do we find the different weights it has learned?
Well.. look no further than this snippet of code (the model here is still untrained, so the values shown are the random initial ones):

In [7]:
print("Model structure: ", model, "\n\n")

for name, param in model.named_parameters():
    print(f"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \n")

Model structure:  NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=784, out_features=512, bias=True)
    (1): ReLU()
    (2): Linear(in_features=512, out_features=512, bias=True)
    (3): ReLU()
    (4): Linear(in_features=512, out_features=10, bias=True)
  )
) 


Layer: linear_relu_stack.0.weight | Size: torch.Size([512, 784]) | Values : tensor([[-0.0005, -0.0134, -0.0007,  ...,  0.0205, -0.0305, -0.0312],
        [ 0.0282,  0.0194, -0.0033,  ...,  0.0139,  0.0097,  0.0239]],
       grad_fn=<SliceBackward0>) 

Layer: linear_relu_stack.0.bias | Size: torch.Size([512]) | Values : tensor([0.0017, 0.0245], grad_fn=<SliceBackward0>) 

Layer: linear_relu_stack.2.weight | Size: torch.Size([512, 512]) | Values : tensor([[ 0.0137,  0.0308,  0.0328,  ..., -0.0093,  0.0179, -0.0079],
        [ 0.0141, -0.0300,  0.0300,  ...,  0.0326,  0.0406, -0.0295]],
       grad_fn=<SliceBackward0>) 

Layer: linear_relu_stack.2.bias | 