# Introduction

<p align="justify">We can now import the necessary packages and tools to check out how to create and configure some custom linear and convolutional neural networks with the tools I wrote.</p>

In [1]:
import torch
import numpy as np
from torchinfo import summary
from delphi.networks.LinearNets import SimpleLinearModel
from torch.utils.data import DataLoader, Subset, Dataset

## Creating a simple linear neural network

Let us first create a ```SimpleLinearModel``` network. In the code cell below we create a linear network that takes a vector of 784 values and provides us with 10 classification probabilities.

In [2]:
# For the simplest version of a linear neural network this is all you have to do:
model = SimpleLinearModel(784, 10)

# To see what the network is made of we can use the 'summary' function.
# If we pass an input size of 1-by-784 (batch, inputsize) to the function we get some
# additional information from 'summary' which tells us a bit about our network.
print(summary(model, (1, 784)))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleLinearModel                        [1, 10]                   --
├─Sequential: 1-1                        --                        10,336
│    └─Sequential: 2-1                   [1, 128]                  --
│    │    └─Linear: 3-1                  [1, 128]                  100,480
│    │    └─ReLU: 3-2                    [1, 128]                  --
├─Dropout: 1-2                           [1, 128]                  --
├─Sequential: 1-1                        --                        10,336
│    └─Sequential: 2-2                   [1, 64]                   --
│    │    └─Linear: 3-3                  [1, 64]                   8,256
│    │    └─ReLU: 3-4                    [1, 64]                   --
├─Dropout: 1-4                           [1, 64]                   --
├─Sequential: 1-1                        --                        10,336
│    └─Sequential: 2-3                   [1, 32]                 

What you see here is the default configuration of the ```SimpleLinearModel``` class. From the above output we can already gather some important information about the newly created network:

* It has 3 successive linear layers
    * with decreasing number of neurons per layer [128, 64, 32]
* Each layer is followed by a ReLU activation function
* Apparently after every linear layer Dropout is used
* The output is probably transformed by a Softmax function
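
Put together, the bullet points describe an architecture along the following lines. This is only a sketch in plain PyTorch: the dropout probability of 0.5, the final ```Softmax```, and the helper name ```build_linear_stack``` are assumptions for illustration and are not taken from the ```SimpleLinearModel``` source.

```python
import torch
from torch import nn


def build_linear_stack(input_vals, n_classes, lin_neurons=(128, 64, 32), p_drop=0.5):
    """Hypothetical stand-in for the default layout read off the summary."""
    layers = []
    in_features = input_vals
    for out_features in lin_neurons:
        # each hidden block: Linear -> ReLU -> Dropout
        layers += [nn.Linear(in_features, out_features), nn.ReLU(), nn.Dropout(p_drop)]
        in_features = out_features
    # final classification layer; the Softmax is an assumption (see the list above)
    layers += [nn.Linear(in_features, n_classes), nn.Softmax(dim=1)]
    return nn.Sequential(*layers)


sketch = build_linear_stack(784, 10)
sketch.eval()  # disable dropout for a deterministic forward pass
with torch.no_grad():
    probs = sketch(torch.rand(1, 784))
print(probs.shape)  # torch.Size([1, 10])
```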

We also see how many trainable parameters this network has. The ```summary``` function is a neat little tool from which you can also estimate how much memory your network requires. If you only have a limited amount of RAM or GPU memory available, this may also help you choose a batch size for the training process.
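
The parameter counts in the summary can be reproduced by hand: a linear layer with ```n_in``` inputs and ```n_out``` outputs holds ```n_in * n_out``` weights plus ```n_out``` biases. The short calculation below does this for the default layer sizes and converts the total into a rough memory figure, assuming 32-bit floats (4 bytes per parameter). The final 32-to-10 output layer is an assumption, since the summary above is cut off before it, and the helper name is made up for this example. The figure covers the parameters only, not activations, gradients, or optimizer state.

```python
def linear_param_counts(layer_sizes):
    """Return weights + biases for each consecutive pair of layer sizes."""
    return [n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


# input size, the three default hidden layers, and an assumed 10-class output layer
sizes = [784, 128, 64, 32, 10]
counts = linear_param_counts(sizes)
total = sum(counts)

print(counts)  # [100480, 8256, 2080, 330]
print(total)   # 111146
print(f"{total * 4 / 1024:.1f} KiB as float32")  # 434.2 KiB as float32
```

The first two entries match the ```Linear: 3-1``` and ```Linear: 3-3``` rows of the summary output.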

If you would like to customize your network, follow the steps below.

## Customizing the network

Let's say we would like more layers than the default settings provide, and we would also like to associate a custom train function with our network. Before we do that, we need to know which parameters are actually required to *build* the network. Each default network that I provide with my code comes with a ```classname._REQUIRED_PARAMS``` variable. The code cell below shows what the ```SimpleLinearModel``` requires at minimum.
    
```{warning}
If you do not set lin_neurons in your custom config the class will always resort to its default values!
```

In [3]:
print(SimpleLinearModel._REQUIRED_PARAMS)

['lin_neurons']


In the next code cell you can see that I created a function ```my_train``` which simply prints something five times. Furthermore, I changed the number of neurons per linear layer. Notice that there are two more layers in this configuration. The ```SimpleLinearModel``` class takes care of that for you.

After initializing the new model I again show the network summary as proof that something changed and I execute the ```model_configured.fit()``` function too to demonstrate that the network actually uses ```my_train``` instead of ```standard_train```.

In [4]:
def my_train(model, dataloader):
    r"""
    Notice that we have an additional variable 'dataloader' in this function.
    When you decide to write a custom training function you are currently required to
    have a dataloader variable of type torch.utils.data.DataLoader!
    
    """
    for i in range(5):
        print(model.config['my_variable'])

In [5]:
my_config = {
    'lin_neurons': [256, 128, 64, 32, 16],
    'my_variable': "Something for a custom train_fn",
}

model_configured = SimpleLinearModel(784, 10, my_config, my_train)
# show the model summary
print(summary(model_configured, (1,784)))

Layer (type:depth-idx)                   Output Shape              Param #
SimpleLinearModel                        [1, 10]                   --
├─Sequential: 1-1                        --                        43,760
│    └─Sequential: 2-1                   [1, 256]                  --
│    │    └─Linear: 3-1                  [1, 256]                  200,960
│    │    └─ReLU: 3-2                    [1, 256]                  --
├─Dropout: 1-2                           [1, 256]                  --
├─Sequential: 1-1                        --                        43,760
│    └─Sequential: 2-2                   [1, 128]                  --
│    │    └─Linear: 3-3                  [1, 128]                  32,896
│    │    └─ReLU: 3-4                    [1, 128]                  --
├─Dropout: 1-3                           [1, 128]                  --
├─Sequential: 1-1                        --                        43,760
│    └─Sequential: 2-3                   [1, 64]                

In [6]:
# the new config
print("The config:")
for k, v in model_configured.config.items():
    print(k, ":", v)

The config:
lin_neurons : [256, 128, 64, 32, 16]
my_variable : Something for a custom train_fn
input_vals : 784
n_classes : 10
train_fn : <function my_train at 0x7fae5267f820>


In [7]:
# the output of the custom train function
print("The output of the .fit function:")
model_configured.fit(DataLoader(Dataset()))

The output of the .fit function:
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn


Let us briefly demonstrate what happens if you provide a config to the ```SimpleLinearModel``` class without setting the ```lin_neurons``` key.

What I will do here is pretty much the same as above. I will use the same ```my_train``` function as well as the ```my_variable``` key (however with a different value) in the config dict. Take a close look at the output!

In [8]:
# create the model, this time just providing the dict definition directly in the constructor call
model_configured2 = SimpleLinearModel(784, 10, {'my_variable': 'Something different this time'}, my_train)

# show us what the config looks like
print("The config:")
for k, v in model_configured2.config.items():
    print(k, ":", v)

# call the .fit() function again. We should see something different from the first output now.
print("\nThe output of the .fit function:")
model_configured2.fit(DataLoader(Dataset()))

The config:
lin_neurons : [128, 64, 32]
my_variable : Something different this time
input_vals : 784
n_classes : 10
train_fn : <function my_train at 0x7fae5267f820>

The output of the .fit function:
Something different this time
Something different this time
Something different this time
Something different this time
Something different this time


So, I hope you noticed that the values in ```lin_neurons``` are now equal to those of the default ```SimpleLinearModel```.

What we saw here is that we can pretty much add as many variables to our configuration as we want. However, we can only change the network directly if we change the parameters that are required by the network (i.e., ```_REQUIRED_PARAMS```). 
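
One plausible way to implement this fallback behaviour is to start from a dictionary of defaults and let the user config overwrite it. The sketch below illustrates the idea in plain Python; ```DEFAULTS``` and ```resolve_config``` are hypothetical names, and the actual ```SimpleLinearModel``` code may work differently.

```python
# hypothetical defaults, mirroring the values printed in the outputs above
DEFAULTS = {'lin_neurons': [128, 64, 32]}


def resolve_config(user_config=None):
    """Merge a user config over the defaults; extra keys are passed through."""
    config = dict(DEFAULTS)            # start from the defaults
    config.update(user_config or {})   # user-supplied values take precedence
    return config


# 'lin_neurons' not set: the default value is kept, the extra key is added
print(resolve_config({'my_variable': 'Something different this time'}))
# {'lin_neurons': [128, 64, 32], 'my_variable': 'Something different this time'}

# 'lin_neurons' set: the user value replaces the default
print(resolve_config({'lin_neurons': [256, 128, 64, 32, 16]}))
# {'lin_neurons': [256, 128, 64, 32, 16]}
```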

Next we will take a look at how we can save and load the models.

## Saving and loading a model

Saving the model is quite straightforward. However, there are multiple ways for you to do it:

1. You call ```model.save('path/to/model_name')``` and the so-called *state_dict* (see: [What is a state_dict in PyTorch](https://pytorch.org/tutorials/recipes/recipes/what_is_state_dict.html)) of the network is saved to ```path/to/model_name/state_dict.pth``` alongside a ```config.yaml``` file.
2. You call ```model.save('path/to/model_name', save_full=True)``` and the entire model is saved to ```path/to/model_name/model.pth```.

We will now look at the different ways one can save and load the models with the code blocks below.

In [9]:
# 1. We save the state_dict of our configured model
model_configured.save('my_configured_model')

# Loading the model. Note that in this case you effectively create a "new" network and then fill its weights
# and biases with the state_dict you saved before. This requires you to import the SimpleLinearModel class!
loaded_model = SimpleLinearModel('my_configured_model')

# notice that you can still use the my_train function you supplied by calling the .fit method
loaded_model.fit(DataLoader(Dataset()))

Saving my_configured_model/state_dict.pth
Loading from config file my_configured_model/config.yaml
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn


In [10]:
# 2. We save the entirety of our configured model
model_configured.save('my_configured_model', save_full=True)

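# Using this loading approach does not require you to import the SimpleLinearModel class explicitly,
# for example in a different Python file. The delphi package must still be installed, though,
# because the saved file only stores a reference to the class.
# Note: PyTorch 2.6 and later need torch.load(..., weights_only=False) to load a full model.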
loaded_model = torch.load('my_configured_model/model.pth')

# same here, you can still use your custom train function
print('\nLoaded with torch.load')
loaded_model.fit(DataLoader(Dataset()))

Saving entire model my_configured_model/model.pth

Loaded with torch.load
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
Something for a custom train_fn
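
The two approaches correspond to two standard PyTorch patterns. The sketch below shows both on a small ```nn.Sequential``` written to a temporary directory. It does not use the ```save``` method from above, and the file names simply mirror the ones printed in the outputs.

```python
import os
import tempfile

import torch
from torch import nn

net = nn.Sequential(nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10))

with tempfile.TemporaryDirectory() as tmp_dir:
    # 1. state_dict only: weights and biases are stored, the architecture is not
    state_path = os.path.join(tmp_dir, 'state_dict.pth')
    torch.save(net.state_dict(), state_path)

    # to load, rebuild the same architecture first, then fill in the parameters
    rebuilt = nn.Sequential(nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 10))
    rebuilt.load_state_dict(torch.load(state_path))

    # 2. full model: the whole object is pickled
    model_path = os.path.join(tmp_dir, 'model.pth')
    torch.save(net, model_path)

    # weights_only=False is needed on recent PyTorch versions to unpickle a full model
    restored = torch.load(model_path, weights_only=False)

# all three networks should produce identical outputs for the same input
x = torch.rand(1, 784)
with torch.no_grad():
    same_outputs = torch.equal(net(x), rebuilt(x)) and torch.equal(net(x), restored(x))
print(same_outputs)  # True
```

Full-model files rely on pickle, so they should only be loaded from sources you trust.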


## Exercises

* Try to create a ```Simple2dCnnClassifier```. The class can be found in the ```_core.networks.ConvNets``` module 
* Find out what the required parameters of the ```Simple2dCnnClassifier``` are and make some changes to them
* Add your own custom training function to the network (without any actual training)
* Save and load your custom network

<p align="justify">That's it for the introduction. Next up we will see how you can use these neural nets to distinguish between the handwritten digits of the MNIST dataset.</p>