# Linear Regression

This notebook reviews the basics of linear regression and implements it from first principles in NumPy and PyTorch.

Sources:

* https://www.kaggle.com/aakashns/pytorch-basics-linear-regression-from-scratch
* https://nbviewer.org/url/www.cs.toronto.edu/~frossard/post/linear_regression/Linear%20Regression.ipynb

In [None]:
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

# Create the data

In [None]:
# Input (temp, rainfall, humidity)
inputs_orig = np.array([[73, 67, 43], 
                        [91, 88, 64], 
                        [87, 134, 58], 
                        [102, 43, 37], 
                        [69, 96, 70]], dtype='float32')

# Targets (apples, oranges)
targets_orig = np.array([[56, 70], 
                         [81, 101], 
                         [119, 133], 
                         [22, 37], 
                         [103, 119]], dtype='float32')

The model predicts crop yields for apples and oranges from temperature, rainfall, and humidity.

| Region  | Temp (F) | Rainfall (mm) | Humidity (%) | Apples (ton) | Oranges (ton) |
| ---   | --- | ----| ---| --- | --- |
|Kanto  | 73  | 67  | 43 | 56  | 70  |
|Johto  | 91  | 88  | 64 | 81  | 101 |
|Hoenn  | 87  | 134 | 58 | 119 | 133 |
|Sinnoh | 102 | 43  | 37 | 22  | 37  |
|Unova  | 69  | 96  | 70 | 103 | 119 | 

Each target will be estimated as a weighted sum of the input variables, offset by the bias, where the bias and the weights will be learnt through gradient descent.

$y_{apples} = w_{11} \cdot temp + w_{12} \cdot rainfall + w_{13} \cdot humidity + b_1$

$y_{oranges} = w_{21} \cdot temp + w_{22} \cdot rainfall + w_{23} \cdot humidity + b_2$
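
The two equations above can be written as a single matrix product, which is the form the code in the following sections relies on. Here is a minimal NumPy sketch, assuming a weight matrix `W` of shape (2, 3) with one row per target and a bias vector `b` of length 2. The weight and bias values are made up for illustration and are not learnt.

```python
import numpy as np

# Inputs (temp, rainfall, humidity) for the first two regions in the table
X = np.array([[73, 67, 43],
              [91, 88, 64]], dtype='float32')

# Hypothetical weights: row 0 -> apples, row 1 -> oranges (illustrative values only)
W = np.array([[0.1, 0.2, 0.3],
              [0.4, 0.5, 0.6]], dtype='float32')
b = np.array([1.0, 2.0], dtype='float32')

# Matrix form: each row of the result is (y_apples, y_oranges) for one region
preds = X @ W.T + b

# The same computation written out term by term for the first region
y_apples = W[0, 0] * X[0, 0] + W[0, 1] * X[0, 1] + W[0, 2] * X[0, 2] + b[0]
y_oranges = W[1, 0] * X[0, 0] + W[1, 1] * X[0, 1] + W[1, 2] * X[0, 2] + b[1]

print(preds.shape)
print(np.allclose(preds[0], [y_apples, y_oranges]))
```

Storing the weights with one row per target is why the model multiplies by the transpose of the weight matrix.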


# PyTorch from scratch

In [None]:
lr = 1e-5
epochs = 1000
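# Inputs and targets are fixed data, so they do not need gradients
inputs = torch.tensor(inputs_orig)
targets = torch.tensor(targets_orig)

# One row of weights and one bias per target (apples, oranges); these are learnt
w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)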

def model(x):
  return x @ w.t() + b

def mse(t1, t2):
  diff = t1 - t2
  return torch.sum(diff * diff) / diff.numel()

for i in range(epochs):
  #print('i: {} W: {}/{}, b: {}/{}'.format(i, w, w.grad, b, b.grad))
  
  # compute predictions and loss
  preds = model(inputs)
  loss = mse(preds, targets)

  # compute the gradients
  loss.backward()

  # adjust the weights and reset the gradients
  with torch.no_grad():
    w -= lr * w.grad
    b -= lr * b.grad
    w.grad.zero_()
    b.grad.zero_()
  
  if i > 0 and i % 10 == 0:
    print('***Epoch: {}, Loss:{}***'.format(i, loss))
  
print('***Final***\nTargets: {}\nPredictions:{}\nLoss:{}\n'.format(targets, preds, loss))


***Epoch: 10, Loss:5526.90966796875***
***Epoch: 20, Loss:4034.72705078125***
***Epoch: 30, Loss:3538.3203125***
***Epoch: 40, Loss:3116.92626953125***
***Epoch: 50, Loss:2746.104736328125***
***Epoch: 60, Loss:2419.520751953125***
***Epoch: 70, Loss:2131.89111328125***
***Epoch: 80, Loss:1878.5670166015625***
***Epoch: 90, Loss:1655.4547119140625***
***Epoch: 100, Loss:1458.94921875***
***Epoch: 110, Loss:1285.875732421875***
***Epoch: 120, Loss:1133.4384765625***
***Epoch: 130, Loss:999.1751098632812***
***Epoch: 140, Loss:880.9169921875***
***Epoch: 150, Loss:776.7545776367188***
***Epoch: 160, Loss:685.005859375***
***Epoch: 170, Loss:604.1898803710938***
***Epoch: 180, Loss:533.0022583007812***
***Epoch: 190, Loss:470.2943420410156***
***Epoch: 200, Loss:415.0543518066406***
***Epoch: 210, Loss:366.3914489746094***
***Epoch: 220, Loss:323.52093505859375***
***Epoch: 230, Loss:285.751953125***
***Epoch: 240, Loss:252.4760284423828***
***Epoch: 250, Loss:223.1570587158203***
***Epoc
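
For this model, the gradients that `loss.backward()` computes in the loop above have a simple closed form. With the squared error averaged over all entries of the prediction matrix (as in `mse`), the gradient with respect to the weights is `2 * diff.T @ X / diff.size` and the gradient with respect to the bias is `2 * diff.sum(axis=0) / diff.size`, where `diff = preds - targets`. The sketch below is a NumPy check of that formula against a finite-difference estimate. It assumes zero-initialized weights, and the helper names `mse` and `analytic_grads` are hypothetical, chosen only for this example.

```python
import numpy as np

X = np.array([[73, 67, 43],
              [91, 88, 64],
              [87, 134, 58],
              [102, 43, 37],
              [69, 96, 70]], dtype='float64')
Y = np.array([[56, 70],
              [81, 101],
              [119, 133],
              [22, 37],
              [103, 119]], dtype='float64')

def mse(W, b):
    # Mean of squared errors over every entry, matching the notebook's mse()
    diff = X @ W.T + b - Y
    return np.sum(diff * diff) / diff.size

def analytic_grads(W, b):
    # Closed-form gradients of the MSE with respect to W and b
    diff = X @ W.T + b - Y                        # shape (5, 2)
    grad_W = 2.0 * diff.T @ X / diff.size         # shape (2, 3)
    grad_b = 2.0 * diff.sum(axis=0) / diff.size   # shape (2,)
    return grad_W, grad_b

# Assumed starting point for the check (the notebook uses random values)
W = np.zeros((2, 3))
b = np.zeros(2)
grad_W, grad_b = analytic_grads(W, b)

# Central finite-difference estimate of the gradient for one weight, W[0, 1]
eps = 1e-4
W_plus = W.copy()
W_plus[0, 1] += eps
W_minus = W.copy()
W_minus[0, 1] -= eps
numeric = (mse(W_plus, b) - mse(W_minus, b)) / (2 * eps)
print(np.isclose(grad_W[0, 1], numeric, rtol=1e-4))

# One gradient descent step with the notebook's learning rate
lr = 1e-5
loss_before = mse(W, b)
loss_after = mse(W - lr * grad_W, b - lr * grad_b)
print(loss_after < loss_before)
```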

# PyTorch built-ins

In [None]:
from torch.utils.data import TensorDataset, DataLoader

lr = 1e-5
epochs = 1000
batch_size = 5
# Inputs and targets are fixed data, so they do not need gradients
inputs = torch.tensor(inputs_orig)
targets = torch.tensor(targets_orig)

train_ds = TensorDataset(inputs, targets)
train_dl = DataLoader(train_ds, batch_size, shuffle=True)

model = nn.Linear(3, 2)
print(model.weight)
print(model.bias)

optimizer = torch.optim.SGD(model.parameters(), lr=lr)
loss_fcn = F.mse_loss

for i in range(epochs):
  for xb, yb in train_dl:
    # compute predictions and loss
    preds = model(xb)
    loss = loss_fcn(preds, yb)

    # compute the gradients
    loss.backward()

    # adjust the weights and reset the gradients
    optimizer.step()
    optimizer.zero_grad()
  
  if i > 0 and i % 10 == 0:
    print('***Epoch: {}, Loss:{}***'.format(i, loss))
  
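# Re-run the model on the unshuffled inputs so the predictions line up with the targets
with torch.no_grad():
  final_preds = model(inputs)
print('***Final***\nTargets: {}\nPredictions:{}\nLoss:{}\n'.format(targets, final_preds, loss))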

Parameter containing:
tensor([[ 0.1913, -0.3394, -0.2842],
        [ 0.1763,  0.5359, -0.4976]], requires_grad=True)
Parameter containing:
tensor([-0.0097,  0.4950], requires_grad=True)
***Epoch: 10, Loss:828.1764526367188***
***Epoch: 20, Loss:606.9547119140625***
***Epoch: 30, Loss:541.121337890625***
***Epoch: 40, Loss:485.4707946777344***
***Epoch: 50, Loss:436.34466552734375***
***Epoch: 60, Loss:392.9202575683594***
***Epoch: 70, Loss:354.5188903808594***
***Epoch: 80, Loss:320.5444641113281***
***Epoch: 90, Loss:290.47125244140625***
***Epoch: 100, Loss:263.8368225097656***
***Epoch: 110, Loss:240.23330688476562***
***Epoch: 120, Loss:219.30184936523438***
***Epoch: 130, Loss:200.72633361816406***
***Epoch: 140, Loss:184.22805786132812***
***Epoch: 150, Loss:169.56187438964844***
***Epoch: 160, Loss:156.51165771484375***
***Epoch: 170, Loss:144.88705444335938***
***Epoch: 180, Loss:134.52023315429688***
***Epoch: 190, Loss:125.2636489868164***
***Epoch: 200, Loss:116.98703765869

# NumPy from scratch