# Thinking in Tensors, writing in PyTorch

A hands-on course by [Piotr Migdał](https://p.migdal.pl) (2019).
This notebook was prepared by [Weronika Ormaniec](https://github.com/werkaaa).

## Notebook 4: Multiple Linear Regression

Simple linear regression is a useful tool for predicting an output from a single predictor. In practice, however, we often come across problems that are described by more than one predictor. In this case we use Multiple Linear Regression.

Instead of fitting a separate linear equation for each predictor, we will create one equation of the form:
$$ Y = \alpha_0 + \alpha_1 \cdot X_1 + \alpha_2 \cdot X_2 + \dots + \alpha_n \cdot X_n$$
where $X_i$ is the $i$-th predictor, $\alpha_i$ is the coefficient we want to learn for it, $\alpha_0$ is the intercept, and $n$ is the number of predictors.
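
For many samples at once, the equation above becomes a matrix-vector product. The sketch below uses made-up numbers (the tensors `X`, `alpha_0` and `alpha` are illustrative and not part of the dataset used later) to show the idea with $n = 2$ predictors.

```python
import torch

# Hypothetical values: 3 samples, n = 2 predictors
X = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])
alpha_0 = torch.tensor(0.5)        # intercept
alpha = torch.tensor([2.0, -1.0])  # alpha_1, alpha_2

# Y = alpha_0 + alpha_1 * X_1 + alpha_2 * X_2, evaluated for every row of X
Y = alpha_0 + X @ alpha
print(Y)
```

Each row of `X` is one sample, so `X @ alpha` computes the weighted sum of predictors for all samples in a single call.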

The learning process in Multiple Linear Regression is the same as the one in Simple Linear Regression. 
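
To make "the same learning process" concrete, here is a minimal sketch of gradient descent on the mean squared error with several predictors. It runs on synthetic, noise-free data; the names `true_alpha`, `lr` and the number of steps are assumptions chosen for illustration, not values used elsewhere in this notebook.

```python
import torch

torch.manual_seed(0)

# Synthetic data (hypothetical): 100 samples, 3 predictors, no noise
X = torch.randn(100, 3)
true_alpha = torch.tensor([1.5, -2.0, 0.5])
Y = 4.0 + X @ true_alpha

# Parameters to learn: one coefficient per predictor plus an intercept
alpha = torch.zeros(3, requires_grad=True)
alpha_0 = torch.zeros(1, requires_grad=True)
lr = 0.1

for step in range(200):
    # Mean squared error between predictions and targets
    loss = ((alpha_0 + X @ alpha - Y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        # Plain gradient descent update, then reset the gradients
        alpha -= lr * alpha.grad
        alpha_0 -= lr * alpha_0.grad
        alpha.grad.zero_()
        alpha_0.grad.zero_()

print(alpha.detach(), alpha_0.detach())
```

The only difference from the single-predictor case is that `alpha` is a vector instead of a scalar.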

In [None]:
import numpy as np
import pandas as pd
# Note: load_boston was removed in scikit-learn 1.2, so this import needs an older version
from sklearn.datasets import load_boston
from livelossplot import PlotLosses

from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms

### Data

In this notebook we will analyze the Boston Housing Dataset. It contains information about 506 houses in Boston. There are 13 features of the houses, which have a greater or lesser impact on the price. Using PyTorch, we will implement a model that predicts the price of a house.

We will take the dataset from scikit-learn's datasets module.

In [None]:
boston = load_boston()
boston_data_frame = pd.DataFrame(boston.data, columns=boston.feature_names)
boston_data_frame

We can see that the predictors differ by orders of magnitude, which can be an obstacle during model training. That is why we will normalize the data so that every value lies in the range $[-1,1]$.

In [None]:
X = torch.tensor(boston.data, dtype=torch.float32)
Y = torch.tensor(boston.target, dtype=torch.float32)

In [None]:
# Column-wise maximum values (torch.argmax would return row indices instead)
tmp = torch.max(X, 0).values
tmp.type()

In [None]:
def Normalize(data):
    # Mean normalization per column: (x - mean) / (max - min)
    data_mean = torch.mean(data, 0)
    # torch.max / torch.min with a dim return (values, indices); we need the values
    data_max = torch.max(data, 0).values
    data_min = torch.min(data, 0).values
    data = (data - data_mean) / (data_max - data_min)
    return data

In [None]:
X_normalized = Normalize(X)

In [None]:
boston_data_frame = pd.DataFrame(np.array(X_normalized), columns=boston.feature_names)
boston_data_frame
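
As a quick sanity check of mean normalization, the sketch below applies the same formula to a tiny made-up tensor. The helper `mean_normalize` and the tensor `toy` are hypothetical and exist only for this illustration. After normalization each column should have a mean of roughly zero, and every value should lie in $[-1,1]$.

```python
import torch

def mean_normalize(data):
    # Hypothetical helper: subtract the column mean, divide by the column range
    col_mean = data.mean(dim=0)
    col_range = data.max(dim=0).values - data.min(dim=0).values
    return (data - col_mean) / col_range

# Two columns with very different orders of magnitude
toy = torch.tensor([[1.0, 100.0],
                    [2.0, 300.0],
                    [4.0, 500.0]])
toy_normalized = mean_normalize(toy)

print(toy_normalized.mean(dim=0))                  # close to zero per column
print(toy_normalized.min(), toy_normalized.max())  # both inside [-1, 1]
```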

This time we will divide the data into training and test sets. The test set lets us measure how well the model generalizes to examples it has not seen during training.

In [None]:
X_train = X_normalized[:400]
Y_train = Y[:400]
# Start at index 400 so that no sample is skipped between the two sets
X_test = X_normalized[400:]
Y_test = Y[400:]
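
The split above uses the first 400 rows for training, in the order in which they appear in the dataset. If the rows happen to be ordered in some way, a shuffled split may be more representative. A minimal sketch, using random stand-in tensors (`features` and `targets` are hypothetical placeholders with the same shapes as the data here):

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins with the same shapes as the normalized data and targets
features = torch.randn(506, 13)
targets = torch.randn(506)

# Random permutation of row indices, then cut it at 400
perm = torch.randperm(len(features))
train_idx, test_idx = perm[:400], perm[400:]

features_train, targets_train = features[train_idx], targets[train_idx]
features_test, targets_test = features[test_idx], targets[test_idx]

print(features_train.shape, features_test.shape)
```

Fixing the seed keeps the split reproducible between runs.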

### Model

In [None]:
linear_model = nn.Linear(in_features=13, out_features=1)
print(linear_model.weight)
print(linear_model.bias)
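
It may help to see what `nn.Linear` computes and what shape it returns. The sketch below uses random input (`x` and `target` are made up for illustration). It shows that the layer output has shape `(N, 1)`, and that subtracting a 1-D target of shape `(N,)` from it silently broadcasts to `(N, N)`, which is why the predictions are usually squeezed before computing a loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

layer = nn.Linear(in_features=13, out_features=1)
x = torch.randn(4, 13)  # 4 hypothetical samples

out = layer(x)
# The layer is just the regression equation: x @ W^T + b
manual = x @ layer.weight.T + layer.bias

target = torch.zeros(4)  # 1-D target, shape (4,)
broadcast_shape = (out - target).shape            # (4, 1) - (4,) broadcasts
squeezed_shape = (out.squeeze(1) - target).shape  # shapes now match

print(out.shape, broadcast_shape, squeezed_shape)
```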

In [None]:
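# Evaluate the untrained model; squeeze (N, 1) predictions to (N,) to match the targets
with torch.no_grad():
    y_predict_train = linear_model(X_train).squeeze(1)
    rmse_train = torch.sqrt(F.mse_loss(y_predict_train, Y_train))

    y_predict_test = linear_model(X_test).squeeze(1)
    rmse_test = torch.sqrt(F.mse_loss(y_predict_test, Y_test))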

print("The PyTorch model performance:")
print('RMSE_train is {}'.format(rmse_train))
print('RMSE_test is {}'.format(rmse_test))

In [None]:
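optim = torch.optim.SGD(linear_model.parameters(), lr=0.1)
loss_function = F.mse_loss
# Initial loss on the normalized training data (predictions squeezed to match Y_train)
loss = loss_function(linear_model(X_train).squeeze(1), Y_train)
print(loss)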

In [None]:
def train(X, Y, model, loss_function, optim, num_epochs):
    preds = torch.tensor([])
    liveloss = PlotLosses()

    for epoch in range(num_epochs):
        # Forward pass; squeeze (N, 1) predictions to (N,) to match Y
        Y_pred = model(X).squeeze(1)
        loss = loss_function(Y_pred, Y)

        # Backward pass and parameter update
        optim.zero_grad()
        loss.backward()
        optim.step()

        # Collect the detached predictions from every epoch
        preds = torch.cat([preds, Y_pred.detach()], 0)

        # mse_loss already averages over the samples, so log it as-is
        liveloss.update({
            'loss': loss.item(),
        })
        liveloss.draw()

    return preds

predictions = train(X_train, Y_train, linear_model, loss_function, optim, num_epochs=80)

In [None]:
print(linear_model.weight)
print(linear_model.bias)

In [None]:
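# Evaluate the trained model; squeeze (N, 1) predictions to (N,) to match the targets
with torch.no_grad():
    y_predict_train = linear_model(X_train).squeeze(1)
    rmse_train = torch.sqrt(F.mse_loss(y_predict_train, Y_train))

    y_predict_test = linear_model(X_test).squeeze(1)
    rmse_test = torch.sqrt(F.mse_loss(y_predict_test, Y_test))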

print("The PyTorch model performance:")
print('RMSE_train is {}'.format(rmse_train))
print('RMSE_test is {}'.format(rmse_test))

As we can see, our model fits the data better after training.

We can now compare it with the scikit-learn linear regression model.

In [None]:
lin_model = LinearRegression()
lin_model.fit(np.array(X_train), np.array(Y_train))

In [None]:
# Convert tensors to NumPy arrays before passing them to scikit-learn
y_ptrain = lin_model.predict(np.array(X_train))
rmse_tr = np.sqrt(mean_squared_error(np.array(Y_train), y_ptrain))

y_ptest = lin_model.predict(np.array(X_test))
rmse_te = np.sqrt(mean_squared_error(np.array(Y_test), y_ptest))

print("The scikit-learn model performance:")
print('RMSE_train is {}'.format(rmse_tr))
print('RMSE_test is {}'.format(rmse_te))

Our model is not perfect, but it has captured some of the structure in the data and is able to make predictions even on data it has not seen during the learning process.
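
Unlike our gradient-descent training loop, scikit-learn's `LinearRegression` fits ordinary least squares directly. As a rough illustration of that idea in PyTorch, the sketch below solves a least-squares problem with `torch.linalg.lstsq` on synthetic, noise-free data. The names `A`, `true_alpha` and `true_intercept` are assumptions made for this example, which is not a reproduction of scikit-learn's internals.

```python
import torch

torch.manual_seed(0)

# Synthetic data (hypothetical): 100 samples, 3 predictors, no noise
X = torch.randn(100, 3)
true_alpha = torch.tensor([1.5, -2.0, 0.5])
true_intercept = 4.0
Y = true_intercept + X @ true_alpha

# Append a column of ones so the intercept is learned as an extra coefficient
A = torch.cat([X, torch.ones(len(X), 1)], dim=1)

# Least-squares solution of A @ coeffs = Y (right-hand side must be 2-D)
coeffs = torch.linalg.lstsq(A, Y.unsqueeze(1)).solution.squeeze(1)

alpha_hat, intercept_hat = coeffs[:3], coeffs[3]
print(alpha_hat, intercept_hat)
```

With enough epochs and a suitable learning rate, gradient descent on the mean squared error should approach the same coefficients.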