# Linear Regression

> Linear regression assumes a linear relationship between the inputs and the outputs (targets).

This notebook shows how to train a linear regression model in PyTorch in two ways:
- [from scratch](#1.-Linear-Regression-from-scratch), with the functions built manually.
- [using PyTorch built-in functions](#2.-Linear-Regression-using-PyTorch-built-ins).

## 1. Linear Regression from scratch

The figure below presents the workflow of this part.

![lnr-scratch-workflow](images/linear_regression_from_scratch.svg)

- [x] Convert inputs & targets to tensors: convert the data (*inputs* & *targets*) from NumPy arrays to tensors.
- [x] Initialize parameters: identify the number of samples, features, and targets. Initialize the *weights* and *bias* used to predict the targets. These parameters are optimized during training.
- [x] Define functions: create the *hypothesis function* (model) to predict targets from inputs, and the *cost function* (loss function) to measure the difference between the predictions and the targets.
- [x] Train model: find the *optimal values* of the parameters (weights & bias) with the gradient descent algorithm. Make sure to **reset the gradients to zero** before the next iteration.
- [x] Predict: use the optimal parameters to predict the target for a given input.
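
The steps above can be summarized in matrix form. The symbols are an assumed mapping onto the variables used in this part: $X$ is the $m \times n$ input matrix, $W$ the $a \times n$ weight matrix, $b$ the bias vector of length $a$, $Y$ the $m \times a$ target matrix, and $\alpha$ the learning rate.

$$
\hat{Y} = X W^{\top} + b, \qquad
J(W, b) = \frac{1}{m\,a} \sum_{i=1}^{m} \sum_{j=1}^{a} \left( \hat{Y}_{ij} - Y_{ij} \right)^2
$$

$$
W \leftarrow W - \alpha \frac{\partial J}{\partial W}, \qquad
b \leftarrow b - \alpha \frac{\partial J}{\partial b}
$$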

In [1]:
import numpy as np
import torch

### 1.1. Prepare data

Converting inputs & targets to tensors.

In [2]:
# inputs
inputs = np.array([[73, 67, 43], 
              [91, 88, 64], 
              [87, 134, 58], 
              [102, 43, 37], 
              [69, 96, 70]], dtype='float32')

# targets
targets = np.array([[56, 70], 
              [81, 101], 
              [119, 133], 
              [22, 37], 
              [103, 119]], dtype='float32')

# convert inputs and targets to tensors
X = torch.from_numpy(inputs)
Y = torch.from_numpy(targets)

### 1.2. Initialize parameters

In [3]:
# get number of samples (m) and of features (n)
m, n = X.shape
print('number of samples: %s' % m)
print('number of features: %s' % n)

# get number of outputs (a)
_, a = Y.shape
print('number of outputs: %s' % a)

number of samples: 5
number of features: 3
number of outputs: 2


In [4]:
# initialize parameters
W = torch.randn(a, n, requires_grad=True)  # weights
b = torch.randn(a, requires_grad=True)  # bias

### 1.3. Define functions

#### 1.3.1. Hypothesis function / Model

In [5]:
def model(X, W, b):
    Y_hat = X @ W.t() + b
    return Y_hat
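
As a quick sanity check of the shapes involved, the sketch below applies the same hypothesis function to random stand-in data and compares it with `torch.nn.functional.linear`, which computes the same affine map. The `*_demo` names and the fixed seed are assumptions for illustration and are not part of the notebook.

```python
import torch
import torch.nn.functional as F

def model(X, W, b):
    # same hypothesis function as above: one row of W per output
    return X @ W.t() + b

torch.manual_seed(0)           # fixed seed, only so the sketch is repeatable
X_demo = torch.randn(5, 3)     # 5 samples, 3 features (stand-in data)
W_demo = torch.randn(2, 3)     # 2 outputs, 3 features
b_demo = torch.randn(2)        # one bias per output, broadcast over the rows

Y_hat_demo = model(X_demo, W_demo, b_demo)
print(Y_hat_demo.shape)        # one row per sample, one column per output

# compare with the built-in affine map, allowing for float32 rounding
same = torch.allclose(Y_hat_demo, F.linear(X_demo, W_demo, b_demo), atol=1e-5)
print(same)
```

Storing the weights as `(outputs, features)` is why the transpose `W.t()` appears in the matrix product.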

#### 1.3.2. Cost function / Loss function

In [6]:
def cost_fn(Y_hat, Y):
    diff = Y_hat - Y
    return torch.sum(diff * diff)/diff.numel()
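
This cost function is the mean squared error over all elements. The sketch below checks it on a few hand-made values (illustrative only, not taken from the notebook's data) and compares it with `F.mse_loss`, whose default reduction is also the mean.

```python
import torch
import torch.nn.functional as F

def cost_fn(Y_hat, Y):
    # mean of the squared differences over all elements
    diff = Y_hat - Y
    return torch.sum(diff * diff) / diff.numel()

# hand-made values (illustrative only)
Y_hat_demo = torch.tensor([[1., 2.], [3., 4.]])
Y_demo = torch.tensor([[0., 2.], [3., 6.]])

manual = cost_fn(Y_hat_demo, Y_demo)       # (1 + 0 + 0 + 4) / 4
builtin = F.mse_loss(Y_hat_demo, Y_demo)   # default reduction is the mean
print(manual.item(), builtin.item())       # both 1.25
```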

### 1.4. Train model

Gradient descent repeatedly adjusts the weights and biases in the direction opposite to their gradients, so that the loss decreases over the iterations.

In [7]:
epochs = 100  # number of iterations
lr = 1e-5  # learning rate
for i in range(epochs):
    Y_hat = model(X, W, b)
    cost = cost_fn(Y_hat, Y)
    cost.backward()
    with torch.no_grad():
        W -= W.grad * lr
        b -= b.grad * lr
        W.grad.zero_()
        b.grad.zero_()
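
To see whether the loop is actually reducing the cost, one option is to record the cost at every epoch. The sketch below is a self-contained variant of the loop above under stated assumptions: it reuses the five samples from section 1.1, but the `*_demo` names, the fixed seed, and the `history` list are additions for illustration.

```python
import numpy as np
import torch

# same five samples as in section 1.1
inputs = np.array([[73, 67, 43], [91, 88, 64], [87, 134, 58],
                   [102, 43, 37], [69, 96, 70]], dtype='float32')
targets = np.array([[56, 70], [81, 101], [119, 133],
                    [22, 37], [103, 119]], dtype='float32')
X_demo = torch.from_numpy(inputs)
Y_demo = torch.from_numpy(targets)

torch.manual_seed(0)  # assumption: fixed seed so reruns start from the same point
W_demo = torch.randn(2, 3, requires_grad=True)
b_demo = torch.randn(2, requires_grad=True)

lr = 1e-5
history = []  # cost measured before each update
for _ in range(100):
    Y_hat = X_demo @ W_demo.t() + b_demo
    cost = torch.mean((Y_hat - Y_demo) ** 2)
    history.append(cost.item())
    cost.backward()
    with torch.no_grad():
        # update in place without recording the operation in the graph
        W_demo -= lr * W_demo.grad
        b_demo -= lr * b_demo.grad
        # gradients accumulate across backward() calls, so clear them
        W_demo.grad.zero_()
        b_demo.grad.zero_()

print('first cost: %.4f, last cost: %.4f' % (history[0], history[-1]))
```

Without the two `zero_()` calls, each `backward()` would add the new gradients to the old ones and the updates would no longer follow the current gradient.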



### 1.5. Predict

In [16]:
x = torch.tensor([[75, 63, 44.]])
y_hat = model(x, W, b)
print(y_hat.detach())

tensor([[52.8746, 66.5950]])


## 2. Linear Regression using PyTorch built-ins

The figure below presents the workflow of this part.

![lnr-scratch-workflow](images/linear_regression_pytorch_built_ins.svg)

- [x] Convert inputs & targets to tensors: convert the data (*inputs* & *targets*) from NumPy arrays to tensors. **Make sure** that the NumPy arrays have data type `float32`.
- [x] Define dataset & dataloader:
    - the dataset holds tuples of inputs & targets.
    - the dataloader shuffles the dataset and divides it into batches.
- [x] Define functions:
    - identify the number of features and targets, and set the model to a linear function.
    - set the cost function to the mean squared error loss.
- [x] Define optimizer: the optimizer specifies the algorithm used to adjust the model parameters. Set the optimizer to use stochastic gradient descent.
- [x] Train model: find the *optimal values* of the model parameters by repeating the optimization step. **Make sure** to reset the gradients to zero before the next iteration.
- [x] Predict: use the optimal parameters to predict the target for a given input.

In [17]:
import torch.nn as nn
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader
import torch.nn.functional as F

### 2.1. Prepare data

Converting inputs & targets to tensors.

In [18]:
# inputs
inputs = np.array([[73, 67, 43], 
              [91, 88, 64], 
              [87, 134, 58], 
              [102, 43, 37], 
              [69, 96, 70], 
              [74, 66, 43], 
              [91, 87, 65], 
              [88, 134, 59], 
              [101, 44, 37], 
              [68, 96, 71], 
              [73, 66, 44], 
              [92, 87, 64], 
              [87, 135, 57], 
              [103, 43, 36], 
              [68, 97, 70]], dtype='float32')

# targets
targets = np.array([[56, 70], 
              [81, 101], 
              [119, 133], 
              [22, 37], 
              [103, 119],
              [57, 69], 
              [80, 102], 
              [118, 132], 
              [21, 38], 
              [104, 118], 
              [57, 69], 
              [82, 100], 
              [118, 134], 
              [20, 38], 
              [102, 120]], dtype='float32')

# convert to tensors
X = torch.from_numpy(inputs)
Y = torch.from_numpy(targets)

In [19]:
# define dataset
dataset = TensorDataset(X, Y)

In [20]:
# define data loader
batch_size = 5
dataloader = DataLoader(dataset, batch_size, shuffle=True)
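
With 15 samples and a batch size of 5, the data loader yields three batches per epoch, and shuffling keeps each input row paired with its target row. The sketch below illustrates this on stand-in data of the same shapes. The `*_demo` tensors and the `loader` name are assumptions for illustration.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# stand-in data with the same shapes as X and Y (15 samples)
# row i of X_demo is [3i, 3i+1, 3i+2]; row i of Y_demo is [2i, 2i+1]
X_demo = torch.arange(45, dtype=torch.float32).reshape(15, 3)
Y_demo = torch.arange(30, dtype=torch.float32).reshape(15, 2)

loader = DataLoader(TensorDataset(X_demo, Y_demo), batch_size=5, shuffle=True)

n_batches = len(loader)  # 15 samples / 5 per batch
batch_shapes = [(tuple(xs.shape), tuple(ys.shape)) for xs, ys in loader]

# recover the row index from each side to confirm the pairs stay together
paired = all(torch.equal(xs[:, 0] / 3, ys[:, 0] / 2) for xs, ys in loader)
print(n_batches, batch_shapes, paired)
```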

In [25]:
for batch in dataloader:
    xs, ys = batch
    print(xs.shape)
    print(xs.data)
    print('\n')
    print(ys.shape)
    print(ys.data)
    break

torch.Size([5, 3])
tensor([[102.,  43.,  37.],
        [ 87., 135.,  57.],
        [ 92.,  87.,  64.],
        [ 91.,  88.,  64.],
        [ 74.,  66.,  43.]])


torch.Size([5, 2])
tensor([[ 22.,  37.],
        [118., 134.],
        [ 82., 100.],
        [ 81., 101.],
        [ 57.,  69.]])


### 2.2. Define functions

#### 2.2.1. Hypothesis function / Model

In [26]:
# get number of samples (m) and of features (n)
m, n = X.shape

# get number of outputs
_, a = Y.shape

# define hypothesis function
model = nn.Linear(n, a)

print(model.weight)
print(model.bias)
print(list(model.parameters()))

Parameter containing:
tensor([[-0.5470, -0.2511, -0.1040],
        [-0.4040, -0.4391, -0.1143]], requires_grad=True)
Parameter containing:
tensor([ 0.2857, -0.5363], requires_grad=True)
[Parameter containing:
tensor([[-0.5470, -0.2511, -0.1040],
        [-0.4040, -0.4391, -0.1143]], requires_grad=True), Parameter containing:
tensor([ 0.2857, -0.5363], requires_grad=True)]
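
`nn.Linear(n, a)` stores its weight with shape `(a, n)` and its bias with shape `(a,)`, the same layout as `W` and `b` in the from-scratch part. The sketch below checks this on a small layer and reproduces its output by hand. The names `layer` and `x_demo` are assumptions for illustration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)    # in_features=3, out_features=2

print(layer.weight.shape)  # (out_features, in_features)
print(layer.bias.shape)    # (out_features,)

x_demo = torch.ones(4, 3)  # 4 stand-in samples
manual = x_demo @ layer.weight.t() + layer.bias

# the layer's forward pass should match the hand-written affine map
same = torch.allclose(layer(x_demo), manual, atol=1e-5)
print(same)
```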


#### 2.2.2. Cost function / Loss function

In [27]:
cost_fn = F.mse_loss

### 2.3. Define optimizer

The optimizer specifies the algorithm used to adjust the model parameters.

In [34]:
opt = torch.optim.SGD(model.parameters(), lr=1e-5)  # stochastic gradient descent with learning rate 1e-5
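
With no momentum or weight decay, one `step()` of `torch.optim.SGD` applies the same update as the manual loop in part 1: each parameter minus the learning rate times its gradient. The sketch below checks this on random stand-in data. The names `layer` and `opt_demo`, the seed, and the larger learning rate of 0.1 are assumptions chosen to make the update easy to see.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)       # fixed seed, only so the sketch is repeatable
layer = nn.Linear(3, 2)
lr = 0.1                   # demo value, not the notebook's 1e-5
opt_demo = torch.optim.SGD(layer.parameters(), lr=lr)

xs = torch.randn(5, 3)     # stand-in batch
ys = torch.randn(5, 2)

cost = F.mse_loss(layer(xs), ys)
cost.backward()

# what plain SGD should produce: parameter - lr * gradient
expected_weight = layer.weight.detach() - lr * layer.weight.grad
expected_bias = layer.bias.detach() - lr * layer.bias.grad

opt_demo.step()            # update the parameters in place
opt_demo.zero_grad()       # clear the gradients for the next iteration

weight_ok = torch.allclose(layer.weight.detach(), expected_weight, atol=1e-6)
bias_ok = torch.allclose(layer.bias.detach(), expected_bias, atol=1e-6)
print(weight_ok, bias_ok)
```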

### 2.4. Train model

In [32]:
def fit(epochs, model, cost_fn, opt, dataloader):
    for epoch in range(epochs):
        for xs, ys in dataloader:
            ys_hat = model(xs)  # predict
            cost = cost_fn(ys_hat, ys)  # compute cost
            cost.backward()  # compute gradients
            opt.step()  # adjust model parameters
            opt.zero_grad()  # reset gradients to 0
        if (epoch+1) % 10 == 0:
            print('epoch {}/{}, cost: {:.4f}'.format(epoch+1, epochs, cost.item()))

In [30]:
fit(100, model, cost_fn, opt, dataloader)

epoch 10/100, cost: 353.9998
epoch 20/100, cost: 142.3173
epoch 30/100, cost: 100.6870
epoch 40/100, cost: 111.7669
epoch 50/100, cost: 63.0477
epoch 60/100, cost: 42.9409
epoch 70/100, cost: 28.2862
epoch 80/100, cost: 19.6485
epoch 90/100, cost: 38.8114
epoch 100/100, cost: 16.4094
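
The cost printed by `fit` is the cost of the last mini-batch of each epoch, which is one reason the values above are not strictly decreasing. A minimal sketch of evaluating the cost over the whole dataset instead is shown below. The helper `full_cost`, the `*_demo` names, and the random stand-in data are hypothetical and not part of the notebook.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

def full_cost(model, cost_fn, dataset):
    # hypothetical helper: cost over all samples, without tracking gradients
    xs, ys = dataset.tensors
    with torch.no_grad():
        return cost_fn(model(xs), ys).item()

torch.manual_seed(0)                 # fixed seed, only so the sketch is repeatable
X_demo = torch.randn(15, 3)          # stand-in inputs
Y_demo = torch.randn(15, 2)          # stand-in targets
dataset_demo = TensorDataset(X_demo, Y_demo)
model_demo = nn.Linear(3, 2)

total = full_cost(model_demo, F.mse_loss, dataset_demo)

# per-batch costs vary from batch to batch, even for a fixed model
loader = DataLoader(dataset_demo, batch_size=5, shuffle=True)
with torch.no_grad():
    batch_costs = [F.mse_loss(model_demo(xs), ys).item() for xs, ys in loader]

print(total, batch_costs)
```

Because all batches here have the same size, the average of the batch costs equals the full-dataset cost, while any single batch cost can lie above or below it.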


### 2.5. Predict

In [31]:
x = torch.tensor([[75, 63, 44.]])
y_hat = model(x)
print(y_hat.detach())

tensor([[54.3244, 68.3188]])
