# End-to-End Regression Model Training in PyTorch

Let's train a regression model from start to finish on some example data. For this practical we will use the Boston Housing dataset, a widely used dataset for regression analysis and machine learning. It consists of 13 feature variables describing various aspects of residential homes in the Boston suburbs, and a target variable giving the median value of owner-occupied homes in $1000s. It was collected by the U.S. Census Service in 1978 and has been used for benchmarking and evaluating machine learning algorithms.

In [None]:
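# Run this cell to import the packages we need and load the dataset
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older scikit-learn version
from sklearn.datasets import load_boston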
import pandas as pd
import torch
import torch.nn as nn
boston_data = load_boston()

Run the cell below to print the keys of the dataset dictionary

In [None]:
boston_data.keys()


In the code block below, create a `pandas` dataframe called `df` from the array in the `data` field of the dictionary, assigning column names from the `feature_names` field.
Then add a column called `Price` to the dataframe, consisting of the values in the `target` field.

In [None]:
# write your code here!
df.head()

Let's take a look at the data by running the cell below:

In [None]:
df.describe()

The data are quite diverse, with very different absolute value ranges for each feature. To give the features equivalent weight in the model, it is good practice to normalise the feature data so that every column is on a comparable scale. Run the code block below to normalise the data:

In [None]:
data = df[df.columns[:-1]]
data = data.apply(
    lambda x: (x - x.mean()) / x.std()
)

data['Price'] = df.Price

data.describe()

We can see that each feature is now centred on a mean of approximately zero, with a standard deviation of 1.
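As a rough illustration of the same standardisation step, here is a minimal sketch on a made-up two-column table. The `toy` dataframe and its column names are hypothetical and are not part of the Boston data; the sketch only shows that subtracting the mean and dividing by the standard deviation puts columns with very different ranges on the same scale.

```python
import pandas as pd

# Hypothetical toy data: two features on very different scales
toy = pd.DataFrame({
    "rooms": [4.0, 5.0, 6.0, 7.0],
    "tax": [200.0, 300.0, 400.0, 500.0],
})

# Subtract each column's mean and divide by its sample standard deviation
toy_norm = toy.apply(lambda col: (col - col.mean()) / col.std())

print(toy_norm)
print(toy_norm.mean().round(6))  # means are numerically zero
print(toy_norm.std().round(6))   # standard deviations are one
```

Note that `pandas` computes the sample standard deviation (`ddof=1`) by default, which is also what the normalisation cell above relies on.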

We can now use this data to build a `Dataset` class: essentially a container that stores the data, along with a set of inbuilt functions (methods) that allow us to pass the data to our model in a format it can use. You don't need to worry too much about the details of the `Dataset` class at this stage, but you can read a brief description of its structure below.

There are three key methods in a PyTorch dataset:

- The first is the class constructor (`__init__`), which is a method that every Python class has. It tells the Python interpreter what to do when it makes an instance of the class.

- Second, we need a method called `__getitem__`. This method defines what happens when we ask the dataset for a single example datapoint, i.e. a set of features and a label.

- Finally we have the `__len__` method. This describes what to do when we call Python's `len()` function on the dataset, and returns the number of samples in the dataset.

In [None]:
class BostonDataset(torch.utils.data.Dataset):

    def __init__(self):
        self.x=data.drop('Price', axis=1).to_numpy()
        self.y=data['Price'].to_numpy()

    def __len__(self):
        return len(self.y)
    
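    def __getitem__(self,idx):
        # features has shape (13,); label has shape (1,) so that a batch of labels
        # is (batch_size, 1) and lines up with the model output without broadcasting
        features = torch.tensor(self.x[idx,:],dtype=torch.float32)
        label = torch.tensor(self.y[idx], dtype=torch.float32).unsqueeze(0)
        return features,label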

dataset=BostonDataset()




You can now get a single sample of the data by indexing. Add some code to the cell below to print the shape of the features and the label from a single sample of the dataset.

In [None]:
features,label=dataset[1]
# write code here to print the shape of the features and label

We next define a data loader. The `DataLoader` is a tool for collating batches of samples from the dataset and passing them to the model. Look online at the docs for `torch.utils.data.DataLoader` and see if you can work out how to get the data loader to give us an example batch of data, then print the shape of the batch of features and the batch of labels.

In [None]:
data_loader = torch.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)
# Add code to get the next iteration of the dataloader, and print the shape of the labels and features 
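If you get stuck, the general pattern can be sketched on stand-in data. The example below uses random tensors wrapped in a `TensorDataset` rather than the `BostonDataset` above, and all the `toy_` names are hypothetical, so the shapes it prints apply only to this stand-in data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 20 samples, 13 features, one target each
toy_x = torch.randn(20, 13)
toy_y = torch.randn(20, 1)
toy_dataset = TensorDataset(toy_x, toy_y)

toy_loader = DataLoader(toy_dataset, batch_size=8, shuffle=True)

# iter() creates an iterator over batches; next() pulls the first batch from it
batch_features, batch_labels = next(iter(toy_loader))
print(batch_features.shape)  # torch.Size([8, 13])
print(batch_labels.shape)    # torch.Size([8, 1])
```

The first dimension of each tensor is the batch size, because the data loader stacks the individual samples returned by the dataset.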

Now that we have defined how to pass the data to our model, we can build the model itself. You don't need to worry about all the details of how the model is constructed at this stage; just note that it is built from an alternating sequence of linear and nonlinear (`ReLU`) layers. Also note that it contains an inbuilt function (known as a `method`) called `forward`. The `forward` method defines what happens when we pass data to the model during the forward pass, and all the detail is handled under the surface by PyTorch!

In [None]:
class RegressionModel(nn.Module):
    def __init__(self):
        super(RegressionModel,self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(13, 13),
            nn.ReLU(),
            nn.Linear(13, 13),
            nn.ReLU(),
            nn.Linear(13, 1)
        )

    def forward(self, x):
        x = self.fc(x)
        return x
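To see what the forward pass does to the shape of the data, here is a small sketch with a cut-down model of the same pattern. `TinyRegressor` and the `toy_` variables are hypothetical names used only for illustration, and the input is random noise rather than the housing data.

```python
import torch
import torch.nn as nn

class TinyRegressor(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(13, 13),
            nn.ReLU(),
            nn.Linear(13, 1),
        )

    def forward(self, x):
        return self.fc(x)

tiny_model = TinyRegressor()
toy_batch = torch.randn(8, 13)  # 8 samples, 13 features each

# Calling the instance invokes forward() behind the scenes
toy_out = tiny_model(toy_batch)
print(toy_out.shape)  # torch.Size([8, 1])
```

Each linear layer acts on the last dimension only, so a batch of 8 samples with 13 features comes out as 8 predictions with one value each.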

Next we set things up ready for training. The definition in the previous code block is called a `class`, and is a general description of the model, whereas the variable called `model` below is an instance of that class, a single example of it.

The `criterion` is the function we use to calculate the loss between the predictions and the labels. Because we are doing regression, we use the mean squared error (`MSE`).

Finally we need an `optimizer`: the object that determines how to update the model's connection weights based on the gradients.

In [None]:
model = RegressionModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr = 0.0003)
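As a quick check of what the mean squared error computes, the sketch below compares `nn.MSELoss` with the formula written out by hand on three made-up predictions. The tensor values and variable names are hypothetical.

```python
import torch
import torch.nn as nn

toy_preds = torch.tensor([1.0, 2.0, 3.0])
toy_targets = torch.tensor([1.0, 2.0, 5.0])

mse = nn.MSELoss()
loss_builtin = mse(toy_preds, toy_targets)

# The same quantity by hand: the mean of the squared differences
loss_manual = ((toy_preds - toy_targets) ** 2).mean()

print(loss_builtin.item(), loss_manual.item())
```

Only the third prediction is wrong, by 2, so the squared errors are 0, 0 and 4, and their mean is 4/3. Note that the predictions and targets here have the same shape; if the shapes differ, PyTorch may broadcast them and silently compute a different quantity.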

The last step is to construct the training loop. The cell below contains a basic outline for the training loop. Fill in the python code to make it work correctly, per the comments.

In [None]:
torch.manual_seed(800)

epoch_idx = 0

for epoch in range(100):
    for i, (inputs, targets) in enumerate(data_loader):
        
        # reset the gradients in the optimiser
        # pass the inputs to the model and assign the result to a variable called 'outputs'
        # calculate the loss between the outputs and the targets using the criterion
        # perform the backward pass
        # step the optimiser
        pass
    print(f'Epoch: {epoch + 1}/100, Loss: {loss.item():.4f}')
    # add some code here to append the loss for each epoch to a list, so that the data can be plotted.
    epoch_idx+=1
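For reference, the order of operations in a typical PyTorch training step can be sketched on a much simpler, made-up problem. The example below fits a single linear layer to synthetic data of the form y = 2x + 1; the data, the `toy_` names and the learning rate are all assumptions chosen for illustration, and the loop runs over the whole dataset at once rather than over batches from a data loader.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical synthetic problem: y = 2x + 1 plus a little noise
toy_x = torch.linspace(-1, 1, 64).unsqueeze(1)
toy_y = 2 * toy_x + 1 + 0.05 * torch.randn_like(toy_x)

toy_model = nn.Linear(1, 1)
toy_criterion = nn.MSELoss()
toy_optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.05)

toy_losses = []
for step in range(200):
    toy_optimizer.zero_grad()                      # clear gradients from the previous step
    toy_outputs = toy_model(toy_x)                 # forward pass
    toy_loss = toy_criterion(toy_outputs, toy_y)   # compare predictions with targets
    toy_loss.backward()                            # backward pass: compute gradients
    toy_optimizer.step()                           # update the weights
    toy_losses.append(toy_loss.item())             # keep a plain float for plotting

print(f"first loss: {toy_losses[0]:.4f}, last loss: {toy_losses[-1]:.4f}")
```

Storing `toy_loss.item()` rather than the tensor itself keeps the list as ordinary Python floats, which is convenient for plotting later.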

Now let's plot our loss curve. Use the matplotlib library to make a scatter plot of your list of loss values (y-axis) against epoch number (x-axis). What can you say about the training? Has the model converged?
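The plotting calls themselves might look something like the sketch below. The loss values here are invented purely to demonstrate the matplotlib API and are not results from this model; replace `example_losses` with your own list.

```python
import matplotlib.pyplot as plt

# Hypothetical loss values, made up purely to show the plotting calls
example_losses = [0.90, 0.55, 0.38, 0.30, 0.26, 0.24, 0.23]
example_epochs = list(range(1, len(example_losses) + 1))

fig, ax = plt.subplots()
ax.scatter(example_epochs, example_losses)
ax.set_xlabel("Epoch")
ax.set_ylabel("Loss")
ax.set_title("Training loss per epoch")
plt.show()
```

A curve that flattens out towards the later epochs suggests the training has converged; one that is still falling steeply suggests the model would benefit from more epochs.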