![Pytorch](images/pytorch_logo.png)

# Regression in plain PyTorch
Let's rebuild our regression, but this time we use PyTorch to handle the math!

In [1]:
import torch

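# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this import needs scikit-learn < 1.2.
from sklearn.datasets import load_boston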
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

In [2]:
# Set seed
seed = 42
torch.manual_seed(seed);

## Load data

We are using the built-in sklearn dataset Boston House Prices.

Our goal is to predict the median price of a home in a given town from a number of features, such as Crime Rate, Property Tax Rate, amount of Industry etc.

It's generally a good idea to scale our data, so we use sklearn's `MinMaxScaler` to scale each feature to the range 0 to 1. The scaler is fit on the training data only and then reused on the test data.
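To make the scaling step concrete, here is a minimal sketch of min-max scaling done by hand in PyTorch. The toy values and the names `col_min` and `col_range` are made up for illustration; the notebook itself relies on `MinMaxScaler`.

```python
import torch

# Toy feature matrix: 3 samples, 2 features (made-up values)
train = torch.tensor([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
test = torch.tensor([[2.5, 50.0]])

# Per-feature minimum and range, taken from the training data only
col_min = train.min(dim=0).values
col_range = train.max(dim=0).values - col_min

# Apply the same shift and scale to both splits
train_scaled = (train - col_min) / col_range
test_scaled = (test - col_min) / col_range

print(train_scaled)
print(test_scaled)  # a test value can land outside [0, 1]
```

Because the minimum and range come from the training split, a test value larger than anything seen in training ends up above 1. That is expected and keeps information about the test set out of the preprocessing.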

In [3]:
# Load our dataset
boston = load_boston()
train_x, test_x, train_y, test_y = train_test_split(boston.data, boston.target, random_state=seed)
scaler = MinMaxScaler()

train_x = torch.tensor(scaler.fit_transform(train_x), dtype=torch.float)
test_x = torch.tensor(scaler.transform(test_x), dtype=torch.float)
train_y = torch.tensor(train_y, dtype=torch.float).view(-1, 1)
test_y = torch.tensor(test_y, dtype=torch.float).view(-1, 1)

## Setup parameters

We have some hyperparameters to set, as well as some numbers we need to know upfront.

`layer_size` --> We need to know how many input variables there are, so we can create an equivalent number of weights

`lr` --> Aka learning rate.
When we take a step in our gradient descent, we multiply the gradient by this factor, so the step is neither too small nor too large.

`epochs` --> How many times should we keep stepping?

In [4]:
layer_size = train_x.shape[1]
lr = 0.05
epochs = 700
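To get a feel for what the learning rate does, the sketch below runs gradient descent on the toy function f(x) = x², whose minimum is at 0. The helper `descend` and the three learning rates are hypothetical and chosen only for illustration; they are not the values used in this notebook.

```python
def descend(lr, steps=20, start=5.0):
    """Run gradient descent on f(x) = x**2 and return the final x."""
    x = start
    for _ in range(steps):
        grad = 2 * x      # derivative of x**2
        x -= lr * grad    # step against the gradient, scaled by lr
    return x

small = descend(lr=0.001)  # barely moves away from the start
good = descend(lr=0.1)     # ends close to the minimum at 0
large = descend(lr=1.1)    # overshoots more on every step and diverges
print(small, good, large)
```

With the same number of steps, a tiny learning rate has hardly made progress, while an overly large one moves further from the minimum with each step.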

## Initialize weights and bias

We need one weight per feature. These are the values we are learning, so we start them as random numbers. The bias starts at zero.

In [5]:
# Initialize weights
w = torch.randn(layer_size, 1, requires_grad=True, dtype=torch.float)
b = torch.zeros(1, requires_grad=True, dtype=torch.float)
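As a quick sanity check on the shapes, here is a small sketch with made-up sizes (`n_samples` and `n_features` are assumptions, not the dataset's real dimensions). It shows how a `(features, 1)` weight matrix and a single bias turn a batch of rows into one prediction per row, and that `requires_grad=True` is what lets PyTorch fill in `.grad` later.

```python
import torch

n_samples, n_features = 4, 3          # made-up sizes for illustration
x = torch.rand(n_samples, n_features)

w = torch.randn(n_features, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

pred = x @ w + b                      # (4, 3) @ (3, 1) -> (4, 1); b broadcasts
print(pred.shape)

pred.sum().backward()                 # any scalar works for a gradient check
print(w.grad.shape, b.grad.shape)     # gradients have the same shapes as w and b
```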

## Define Loss Function

Just like before, we use mean squared error to measure how good or bad our line is.

In [6]:
# Define loss function
def mean_squared_error(y_hat, y):
    return ((y_hat - y) ** 2).mean()
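As a worked example, the sketch below evaluates the same function on three made-up predictions and compares the result with PyTorch's built-in `torch.nn.functional.mse_loss`, which averages the squared errors by default. The tensors are illustrative values only.

```python
import torch
import torch.nn.functional as F

def mean_squared_error(y_hat, y):
    return ((y_hat - y) ** 2).mean()

y_hat = torch.tensor([[2.0], [4.0], [7.0]])  # made-up predictions
y = torch.tensor([[1.0], [4.0], [5.0]])      # made-up targets

# Errors are 1, 0, 2 -> squares 1, 0, 4 -> mean 5/3
ours = mean_squared_error(y_hat, y)
builtin = F.mse_loss(y_hat, y)
print(ours.item(), builtin.item())
```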

In [7]:
# Training loop
for epoch in range(epochs):
    # Forward pass
    pred = train_x @ w + b
    loss = mean_squared_error(pred, train_y)

    # Backpropagation: autograd fills w.grad and b.grad
    loss.backward() # The magic bit!

    # Gradient descent step, done without tracking gradients
    with torch.no_grad():
        w -= w.grad * lr
        b -= b.grad * lr
        w.grad.zero_() # Gotta reset the gradients to zero, so they don't accumulate
        b.grad.zero_()

        # Validate model on the held-out test set every 10 epochs
        if epoch % 10 == 0:
            val_pred = test_x @ w + b
            val_loss = mean_squared_error(val_pred, test_y)
            print(f"Epoch: {epoch} Train Loss: {loss.item()} Test Loss: {val_loss.item()}")

Epoch: 0 Train Loss: 507.965576171875 Test Loss: 223.5579071044922
Epoch: 10 Train Loss: 99.24380493164062 Test Loss: 78.80070495605469
Epoch: 20 Train Loss: 77.88609313964844 Test Loss: 60.800838470458984
Epoch: 30 Train Loss: 65.93444061279297 Test Loss: 51.3888053894043
Epoch: 40 Train Loss: 58.795101165771484 Test Loss: 46.30845260620117
Epoch: 50 Train Loss: 54.17264175415039 Test Loss: 43.3354606628418
Epoch: 60 Train Loss: 50.90646743774414 Test Loss: 41.38874053955078
Epoch: 70 Train Loss: 48.40536880493164 Test Loss: 39.94911193847656
Epoch: 80 Train Loss: 46.364295959472656 Test Loss: 38.77071762084961
Epoch: 90 Train Loss: 44.622459411621094 Test Loss: 37.739166259765625
Epoch: 100 Train Loss: 43.09225082397461 Test Loss: 36.802215576171875
Epoch: 110 Train Loss: 41.72350311279297 Test Loss: 35.93635559082031
Epoch: 120 Train Loss: 40.48552322387695 Test Loss: 35.130828857421875
Epoch: 130 Train Loss: 39.35799789428711 Test Loss: 34.380210876464844
Epoch: 140 Train Loss: 38.