# Regression with a (very very simple) pytorch neural network

**Note:** to use this notebook in Google Colab, create a new cell with
the following line and run it.

``` shell
!pip install git+https://gitlab.in2p3.fr/jbarnier/ateliers_deep_learning.git
```

In [None]:
import numpy as np
import plotnine as pn
import torch
from sklearn import preprocessing
from torchinfo import summary

from adl.sklearn import skl_regression

pn.theme_set(pn.theme_minimal())

In the previous notebooks, we used gradient descent to solve simple
linear regression problems. In this notebook we introduce a way to do
the same thing but using a (very simple) neural network defined with
pytorch syntax.

We will reuse our fake data about temperature and ice cream sales.

In [None]:
temperature = [-1.5, 0.2, 3.4, 4.1, 7.8, 13.4, 18.0, 21.5, 32.0, 33.5]
icecream = [100.5, 110.2, 133.5, 141.2, 172.8, 225.1, 251.0, 278.9, 366.7, 369.9]

As seen previously, we scale the `temperature` values in order to
improve the training process.

In [None]:
temperature_s = preprocessing.scale(temperature, with_mean=True)

We then compute the “real” optimal slope, optimal intercept and minimal
loss with `scikit-learn`.

In [None]:
reg = skl_regression(temperature_s, icecream)
print(f"slope: {reg['slope']:.2f}, intercept: {reg['intercept']:.2f}, mse: {reg['mse']:.4f}")


Finally, we transform our input and target values into tensors. One
difference here is that we have to reshape our data: pytorch expects
each observation and each target to be in its own array, so for example
the target values `[100.5, 110.2, 133.5]` must be converted to
`[[100.5], [110.2], [133.5]]`. In other words, our input and target data
are now arrays with one column instead of vectors.

In [None]:
x = torch.tensor(temperature_s).float().view(-1, 1)
y = torch.tensor(icecream).float().view(-1, 1)
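As a quick check of the reshaping described above, the sketch below
compares a one-dimensional tensor with its one-column version. The names
`v` and `col` are only illustrative and are not used elsewhere in the
notebook.

``` python
import torch

# A plain vector of three values (shape: [3])
v = torch.tensor([100.5, 110.2, 133.5])

# view(-1, 1) keeps the values but puts each one in its own row;
# -1 lets pytorch infer the number of rows
col = v.view(-1, 1)

print(v.shape)  # torch.Size([3])
print(col.shape)  # torch.Size([3, 1])
print(col)
```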

## Regression with pytorch and a single neuron neural network

In the previous notebooks, we created our model by just creating a
simple `forward` function, like this:

``` python
def forward(x):
    return w * x + b
```

This approach is adequate for a simple model like the one we have now,
but for more complex models such as neural networks, we will need to use
pytorch’s built-in functions to define them.

In fact, a simple linear regression with a single explanatory variable
can be considered as a neural “network” with just one neuron. So we will
try to rewrite our simple model using pytorch’s notation.

One way to define our “network” is to use the *Module* notation,
provided by `torch.nn.Module`. This notation requires us to create a new
Python class which inherits from `nn.Module`, and to define at least an
`__init__()` method (called when the model is created) and a
`forward()` method, which takes input data as argument, applies our
model and returns the predicted values.

To create our simple linear regression model, we will use `nn.Linear`,
which lets us define linear layers of arbitrary size. Here our layer
will have a single neuron which will take a single number as input (a
temperature value) and will output a single number (a predicted ice
cream sale volume). In pytorch notation, this means that our layer will
have `in_features` of size 1, and `out_features` of size 1.
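To make the role of `in_features` and `out_features` more concrete, here
is a minimal sketch that creates such a layer on its own and inspects
it. The names `layer`, `inputs` and `outputs` are illustrative only.

``` python
import torch
from torch import nn

# A linear layer with one input feature and one output feature
layer = nn.Linear(in_features=1, out_features=1)

# The layer stores a weight matrix of shape (out_features, in_features)
# and a bias vector of shape (out_features,)
print(layer.weight.shape)  # torch.Size([1, 1])
print(layer.bias.shape)  # torch.Size([1])

# Applying the layer to 3 observations of size 1 returns 3 predictions
# of size 1, each computed as weight * input + bias
inputs = torch.tensor([[0.0], [1.0], [2.0]])
outputs = layer(inputs)
print(outputs.shape)  # torch.Size([3, 1])
```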

Here is the code of a `LinearNetwork` class which implements this model.

In [None]:
from torch import nn


class LinearNetwork(nn.Module):
    """
    Simple linear regression model with only one input variable.
    """

    def __init__(self):
        # Call the parent constructor (mandatory)
        super().__init__()
        # Create a "linear" attribute which will contain a linear layer with input and
        # output of size 1
        self.linear = nn.Linear(in_features=1, out_features=1)

    def forward(self, x):
        """
        Method which implements the model forward pass, i.e. which takes input data
        as argument, applies the model to it and returns the result.
        """
        # Apply our linear layer to input data
        return self.linear(x)


Once our model class has been created, we can use it to create a new
model object (or model instance).

In [None]:
model = LinearNetwork()

It is important to distinguish between:

-   a model class, like `LinearNetwork`, which is a Python class
    describing a model architecture
-   a model object or model instance, like `model`, which is a concrete
    model created using the `LinearNetwork` architecture

We can use the `summary` function of the `torchinfo` package to display
a description of our `model` object.

In [None]:
summary(model)

We can see that `model` has one layer and two parameters: the weight and
the bias of our single “neuron”. Pytorch takes care of creating these
parameters, so we don’t have to manually create `w` and `b` tensors
anymore.
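If you want to look at these parameters without `torchinfo`, one
possible approach is to iterate over `named_parameters()`. The sketch
below redefines the same architecture so that it runs on its own, and
uses the illustrative name `demo_model` to avoid overwriting `model`.

``` python
import torch
from torch import nn


class LinearNetwork(nn.Module):
    # Same architecture as the LinearNetwork class defined above
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(in_features=1, out_features=1)

    def forward(self, x):
        return self.linear(x)


demo_model = LinearNetwork()

# named_parameters() yields (name, tensor) pairs for every trainable parameter
for name, param in demo_model.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)

# Total number of parameters: one weight + one bias
n_params = sum(p.numel() for p in demo_model.parameters())
print(n_params)  # 2
```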

We can pass input data directly to our `model` object. In this case, it
will call `model.forward()`, which applies the model to the input data
to compute predictions. We can see that both give the same result here,
although calling the object directly is the usual way, since it also
runs any hooks registered on the model. The predictions are random
because the `model` parameters have been initialized randomly during
`model` creation.

In [None]:
model(x)

In [None]:
model.forward(x)
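The following sketch illustrates the relation between the two calls on a
bare `nn.Linear` layer. It also registers a forward hook (a callback
that pytorch runs after the forward pass) to show one case where calling
the object and calling `forward()` differ. The names `layer`, `inputs`
and `calls` are illustrative.

``` python
import torch
from torch import nn

layer = nn.Linear(in_features=1, out_features=1)
inputs = torch.tensor([[-1.0], [0.0], [1.0]])

# Both calls run the same forward computation
out_call = layer(inputs)
out_forward = layer.forward(inputs)
print(torch.equal(out_call, out_forward))  # True

# A forward hook is only triggered when the module itself is called
calls = []
handle = layer.register_forward_hook(
    lambda module, args, output: calls.append("hook")
)
layer(inputs)  # runs the hook
layer.forward(inputs)  # bypasses the hook
print(len(calls))  # 1
handle.remove()
```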

We can now build our training process. We will use `MSELoss()` as the
loss function, and an `SGD` optimizer with a learning rate of 0.1.
However, instead of explicitly passing a list of parameters like
`[w, b]` as first optimizer argument, we will use `model.parameters()`
which will automatically provide all the parameters of our `model`
object.

In [None]:
loss_fn = nn.MSELoss()
learning_rate = 0.1
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)  # type: ignore

Finally, we define and run our training loop for a certain number of
epochs:

-   we start by resetting our gradients with `optimizer.zero_grad()`
-   we compute the predicted values by applying our `model` object to
    the input data (forward pass)
-   we compute the loss value
-   we compute the loss gradient for each parameter (backpropagation)
-   finally we adjust our parameters by calling `optimizer.step()`

In [None]:
epochs = 20
for epoch in range(epochs):
    # Set the model to training mode - important for batch normalization and dropout
    # layers. Unnecessary in this situation but added for best practices
    model.train()
    # Reset gradients
    optimizer.zero_grad()
    # Forward pass: compute predicted values
    y_pred = model(x)
    # Compute loss
    loss = loss_fn(y_pred, y)
    # Backpropagation
    loss.backward()
    # Parameters adjustment
    optimizer.step()

    # Print results for this epoch. We can get the weight and bias values by accessing the
    # "weight" and "bias" attributes of the model.linear layer
    print(
        f"{epoch + 1:2}. loss: {loss:7.1f}, weight: {model.linear.weight.item():5.2f},"
        f" bias: {model.linear.bias.item():6.2f}"
    )

We can see that our training process seems to converge towards the
“true” values computed above.
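One way to check this convergence numerically is to train for more
epochs and compare the result with a closed-form least-squares fit. The
sketch below is self-contained and makes two assumptions: `np.polyfit`
is used as a stand-in for the `skl_regression` helper, and 200 epochs
are assumed to be enough for this data. All `*_check` names are
illustrative.

``` python
import numpy as np
import torch
from sklearn import preprocessing
from torch import nn

temperature = [-1.5, 0.2, 3.4, 4.1, 7.8, 13.4, 18.0, 21.5, 32.0, 33.5]
icecream = [100.5, 110.2, 133.5, 141.2, 172.8, 225.1, 251.0, 278.9, 366.7, 369.9]
temperature_s = preprocessing.scale(temperature)

# Closed-form least-squares fit: polyfit returns [slope, intercept] for degree 1
slope, intercept = np.polyfit(temperature_s, icecream, deg=1)

# Same training procedure as above, on a bare linear layer, with more epochs
x_check = torch.tensor(temperature_s).float().view(-1, 1)
y_check = torch.tensor(icecream).float().view(-1, 1)
layer_check = nn.Linear(in_features=1, out_features=1)
loss_fn_check = nn.MSELoss()
optimizer_check = torch.optim.SGD(layer_check.parameters(), lr=0.1)
for _ in range(200):
    optimizer_check.zero_grad()
    loss_check = loss_fn_check(layer_check(x_check), y_check)
    loss_check.backward()
    optimizer_check.step()

print(f"closed form: slope={slope:.2f}, intercept={intercept:.2f}")
print(
    f"pytorch:     slope={layer_check.weight.item():.2f},"
    f" intercept={layer_check.bias.item():.2f}"
)
```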

## Regression with two explanatory variables

If we want to do a linear regression with two explanatory variables, our
input data `X` will now be an array with two columns.

In [None]:
# Input data
temperature = [-1.5, 0.2, 3.4, 4.1, 7.8, 13.4, 18.0, 21.5, 32.0, 33.5]
humidity = [50.1, 34.8, 51.3, 64.1, 47.8, 53.4, 58.0, 71.5, 32.0, 43.5]
icecream = [100.5, 110.2, 133.5, 141.2, 172.8, 225.1, 251.0, 278.9, 366.7, 369.9]


X = np.array([temperature, humidity]).transpose()
X


We can scale our input data `X` and convert `X` and `y` to tensors as
usual.

In [None]:
X = preprocessing.scale(X)
X = torch.tensor(X).float()

y = torch.tensor(icecream).float().view(-1, 1)
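As a sanity check on the scaling step, the sketch below verifies that
each column of the scaled array has a mean close to 0 and a standard
deviation close to 1. The names `X_demo` and `X_scaled` are illustrative
and do not replace the `X` tensor defined above.

``` python
import numpy as np
from sklearn import preprocessing

temperature = [-1.5, 0.2, 3.4, 4.1, 7.8, 13.4, 18.0, 21.5, 32.0, 33.5]
humidity = [50.1, 34.8, 51.3, 64.1, 47.8, 53.4, 58.0, 71.5, 32.0, 43.5]

# One row per observation, one column per explanatory variable
X_demo = np.array([temperature, humidity]).transpose()

# scale() standardizes each column independently
X_scaled = preprocessing.scale(X_demo)

print(X_scaled.shape)  # (10, 2)
print(X_scaled.mean(axis=0).round(6))  # column means, close to 0
print(X_scaled.std(axis=0).round(6))  # column standard deviations, close to 1
```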

**Exercise 1**

-   create a new model class `LinearNetwork2` which will have the same
    architecture as `LinearNetwork` but will accept input data of
    dimension 2.
-   create a new `model2` object from the `LinearNetwork2` class
-   display a summary description of `model2`
-   run a training loop of `model2` for 20 epochs with an `MSELoss` loss
    and an `SGD` optimizer with a 0.1 learning rate

*Hint:* if you want to display the weights and bias values at each
epoch, you can use `model2.linear.weight.data` and
`model2.linear.bias.item()`.
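The reason the hint uses `.data` for the weights and `.item()` for the
bias can be seen on a standalone layer. The sketch below deliberately
uses a layer with 3 inputs, which is an arbitrary choice for
illustration and not the answer to the exercise.

``` python
import torch
from torch import nn

# Illustrative layer with 3 inputs and 1 output
layer = nn.Linear(in_features=3, out_features=1)

# The bias holds a single value, so .item() returns it as a Python float
print(layer.bias.item())

# The weight holds 3 values: .item() would raise an error here, so we
# read the underlying tensor with .data (or convert it to a list)
print(layer.weight.data)
weights = layer.weight.data.flatten().tolist()
print(weights)
```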

### Generalization to any number of explanatory variables

**Exercise 2**

We created two different classes above: one for a linear regression
model with only one explanatory variable, and one for two explanatory
variables. Now we will try to create a more generic model class that can
return models accepting any number of explanatory variables.

-   Create a new `GeneralLinearNetwork` class by starting from the
    `LinearNetwork` class seen above
-   Modify the `__init__()` method so that it accepts a new argument
    called `n_variables`
-   Modify the `self.linear` creation so that it takes into account the
    value passed as `n_variables` argument

Once the class has been created:

-   instantiate a model object called `model1` which accepts input data
    with one column and apply it to the `x` input data
-   instantiate a model object called `model2` which accepts input data
    with two columns and apply it to the `X` input data