<a href="https://colab.research.google.com/github/MasonWang025/ai-club-materials/blob/main/linear_regression_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Linear Regression with PyTorch

Predict *target variables* from *input variables*. 

To demo:

**Target variables:** apple & orange crop yields
**Input variables (features):** avg. temp, rainfall, humidity

**Given** avg. temp, rainfall, and humidity, **predict** apple & orange crop yields:

```
apple_yield = w11*temp + w12*rainfall + w13*humidity + b1
orange_yield = w21*temp + w22*rainfall + w23*humidity + b2
```

[3D linear regression](https://www.google.com/search?q=3d+linear+regression&sxsrf=ALeKk01RnS1CXqDXDdhNwRhedY_p4dHclA:1605845888673&source=lnms&tbm=isch&sa=X&ved=2ahUKEwjphoiOopDtAhXkJzQIHSfoADYQ_AUoAXoECAkQAw&cshid=1605845892324149&biw=1536&bih=722)

![data](https://i.imgur.com/6Ujttb4.png)

**Each target variable is estimated to be a weighted sum of input variables.**

The learning part is figuring out the weights and biases that lead to the best predictions.
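
To make the weighted sum concrete, here is a minimal sketch that evaluates both equations for a single row of inputs. The weights and biases are made-up values picked by hand for illustration (they are not learned), and the input row `(73, 67, 43)` is the first row of the data used in this notebook.

```python
# Hypothetical weights and biases, chosen by hand for illustration only.
w = [[0.5, 0.25, 0.0],   # w11, w12, w13 (apples)
     [0.0, 1.0, 0.5]]    # w21, w22, w23 (oranges)
b = [1.0, -2.0]          # b1, b2

temp, rainfall, humidity = 73.0, 67.0, 43.0

# Each yield is a weighted sum of the three inputs plus a bias.
apple_yield = w[0][0] * temp + w[0][1] * rainfall + w[0][2] * humidity + b[0]
orange_yield = w[1][0] * temp + w[1][1] * rainfall + w[1][2] * humidity + b[1]

print(apple_yield, orange_yield)  # 54.25 86.5
```

Training replaces these hand-picked numbers with values that make the predictions match the observed yields as closely as possible.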

In [1]:
import numpy as np
import torch

In [2]:
# input (temp, rainfall, humidity)
inputs = np.array([[73, 67, 43],
                  [91, 88, 64],
                  [87, 134, 58],
                  [102, 43, 37],
                  [69, 96, 70]], dtype="float32")
# targets (apples, oranges)
targets = np.array([[56, 70], 
                    [81, 101], 
                    [119, 133], 
                    [22, 37], 
                    [103, 119]], dtype='float32')
# hard-coded for simplicity; typically loaded from a CSV file or database
# inputs, targets 

In [3]:
# numpy interoperability
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
# inputs, targets 

#Linear Regression from Scratch


In [4]:
# weights and biases represented as tensors, randomly initialized
w = torch.randn(2, 3, requires_grad=True) # (n x 3) @ (3 x 2, i.e. w transposed) = (n x 2)
b = torch.randn(2, requires_grad=True)
print(w)
print(b)

tensor([[-0.3359, -0.7289,  0.2834],
        [-0.0224, -0.2223, -0.0199]], requires_grad=True)
tensor([0.9824, 0.0953], requires_grad=True)


![matrix-mult](https://machinethink.net/images/mps-matrix-multiplication/MatrixMultiplication@2x.png)

In [5]:
def model(x):
  return x @ w.t() + b
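
A quick shape check can make the matrix product in `model` less mysterious. The sketch below uses NumPy with placeholder values (`x_demo`, `w_demo`, and `b_demo` are hypothetical names, not part of the notebook) and compares the matrix form against the same weighted sums computed one entry at a time.

```python
import numpy as np

x_demo = np.array([[73., 67., 43.],
                   [91., 88., 64.]], dtype="float32")   # 2 examples, 3 features
w_demo = np.array([[0.5, 0.25, 0.0],
                   [0.0, 1.0, 0.5]], dtype="float32")   # 2 targets, 3 features
b_demo = np.array([1.0, -2.0], dtype="float32")         # one bias per target

# Matrix form: (2 x 3) @ (3 x 2) -> (2 x 2); b_demo is broadcast across rows.
batched = x_demo @ w_demo.T + b_demo

# Same computation, one example (i) and one target (j) at a time.
looped = np.array([[sum(w_demo[j, k] * x_demo[i, k] for k in range(3)) + b_demo[j]
                    for j in range(2)]
                   for i in range(2)], dtype="float32")

print(batched.shape)                 # (2, 2)
print(np.allclose(batched, looped))  # True
```

Each row of the result holds the two predictions (apples, oranges) for one example.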

In [6]:
preds = model(inputs)
# print(inputs) 
print(preds) # utter trash
print(targets)

tensor([[ -60.1852,  -17.2876],
        [ -75.5858,  -22.7767],
        [-109.4712,  -32.7912],
        [ -54.1333,  -12.4831],
        [ -72.3269,  -24.1819]], grad_fn=<AddBackward0>)
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


In [7]:
# utter trash, but exactly how trash?
def mse(preds, targets):
  diff = preds - targets # calc differences
  return torch.sum(diff * diff) / diff.numel() # mean of squared differences; squaring removes the sign and, unlike abs, stays smooth at zero
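
As a small worked example of the same formula, this sketch computes the mean squared error by hand in plain Python. The numbers are made up so the arithmetic is easy to follow; `preds_demo` and `targets_demo` are hypothetical names.

```python
# Tiny made-up predictions and targets (2 examples, 2 outputs each).
preds_demo = [[1.0, 2.0], [3.0, 4.0]]
targets_demo = [[1.0, 0.0], [0.0, 4.0]]

# Square every element-wise difference.
squared_errors = [(p - t) ** 2
                  for p_row, t_row in zip(preds_demo, targets_demo)
                  for p, t in zip(p_row, t_row)]

# Average over all elements, as diff.numel() does in mse().
mse_demo = sum(squared_errors) / len(squared_errors)

print(squared_errors)  # [0.0, 4.0, 9.0, 0.0]
print(mse_demo)        # 3.25
```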

In [8]:
loss = mse(preds, targets)
loss # (really trash)

tensor(20012.9180, grad_fn=<DivBackward0>)


[gradient-img](https://www.google.com/search?q=gradient+descent&tbm=isch&ved=2ahUKEwjPiOzQp5DtAhWJnZ4KHfO7APAQ2-cCegQIABAA&oq=gradient+descent&gs_lcp=CgNpbWcQAzIFCAAQsQMyAggAMgIIADICCAAyAggAMgIIADICCAAyAggAMgIIADICCAA6BggAEAUQHjoGCAAQCBAeUJMFWPcjYOYmaANwAHgAgAFriAHCBpIBBDExLjGYAQCgAQGqAQtnd3Mtd2l6LWltZ8ABAQ&sclient=img&ei=Skm3X8-7Nom7-gTz94KADw&bih=722&biw=1536)
[threeblueonebrown-vid](https://youtu.be/IHZwWFHWa-w?t=322)

In [9]:
# now, compute gradients w.r.t. weights and biases
loss.backward()
print(w.grad)

tensor([[-12494.1953, -14456.8428,  -8671.7207],
        [ -9397.2676, -10965.9746,  -6628.9155]])


In [10]:
# reset so gradients don't accumulate
w.grad.zero_()
b.grad.zero_()
print(w.grad)
print(b.grad)

tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([0., 0.])
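
To see why the reset is needed, here is a minimal sketch of gradient accumulation on a single scalar. It is separate from the model above: `x` is a hypothetical stand-in for a parameter, and the loss is simply `x * x`.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

(x * x).backward()          # d(x^2)/dx = 2x = 6
first = x.grad.item()

(x * x).backward()          # the new gradient is added to x.grad, not written over it
second = x.grad.item()

x.grad.zero_()              # in-place reset, as done for w.grad and b.grad
after_reset = x.grad.item()

print(first, second, after_reset)  # 6.0 12.0 0.0
```

Without the reset, every later `backward()` call would mix old gradients into the new ones.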


#Gradient Descent
The goal is to find the set of weights and biases where the loss is lowest. We update them based on the computed gradients.

**Idea: move in the direction opposite to the gradient to decrease the loss.**

1. Generate predictions
2. Calculate the loss
3. Compute gradients of the loss w.r.t. the weights and biases
4. Adjust the weights and biases by subtracting a small quantity proportional to the gradient
5. Reset gradients to zero
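
The steps above can be sketched on a one-parameter toy problem in plain Python. This is an illustration under simple assumptions, not the notebook's model: the loss `(w - 3)^2` and its hand-derived gradient are made up so the minimum (at `w = 3`) is known in advance.

```python
def loss_fn_demo(w):
    """Toy loss with its minimum at w = 3 (hypothetical example)."""
    return (w - 3.0) ** 2

def grad_fn_demo(w):
    """Derivative of the toy loss, worked out by hand."""
    return 2.0 * (w - 3.0)

w_demo = 0.0            # arbitrary starting point
learning_rate = 0.1

for _ in range(100):
    # Step against the gradient, scaled by the learning rate.
    w_demo -= learning_rate * grad_fn_demo(w_demo)

print(w_demo, loss_fn_demo(w_demo))
```

There is no reset step here because the gradient is recomputed from scratch on each iteration; with PyTorch's autograd, gradients accumulate, so step 5 is required.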

In [11]:
# 1: predict
preds = model(inputs)
# 2: loss
loss = mse(preds, targets)
print(loss)
# 3: gradients
loss.backward() # w.grad, b.grad
# 4 & 5: adjust and reset
with torch.no_grad():
    w -= w.grad * 1e-5 # 1e-5 is the learning rate
    b -= b.grad * 1e-5
    w.grad.zero_()
    b.grad.zero_()
print(mse(model(inputs), targets))

tensor(20012.9180, grad_fn=<DivBackward0>)
tensor(13704.1611, grad_fn=<DivBackward0>)
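
The `with torch.no_grad():` block around the update matters. The sketch below, using a hypothetical parameter `p`, shows that an in-place update of a tensor that requires gradients is rejected outside `no_grad` and works inside it.

```python
import torch

p = torch.tensor([1.0, 2.0], requires_grad=True)

# Outside no_grad, autograd refuses in-place changes to a leaf tensor
# that requires gradients.
try:
    p -= 0.1
    raised = False
except RuntimeError:
    raised = True

# Inside no_grad, the update is applied and is not tracked by autograd.
with torch.no_grad():
    p -= 0.1

print(raised)            # True
print(p.requires_grad)   # True
```

The parameter still requires gradients afterwards, so the next forward pass is tracked as usual.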


In [12]:
for i in range(100): # one pass over the training data is an epoch; train for 100 epochs
    preds = model(inputs)
    loss = mse(preds, targets)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * 1e-5
        b -= b.grad * 1e-5
        w.grad.zero_()
        b.grad.zero_()

print(mse(model(inputs), targets))

# they are close!!
print(preds)
print(targets)

tensor(201.1277, grad_fn=<DivBackward0>)
tensor([[ 61.8456,  74.6797],
        [ 86.8689, 100.3502],
        [100.5687, 126.6601],
        [ 47.4759,  61.9916],
        [ 94.8228, 104.0255]], grad_fn=<AddBackward0>)
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


#PyTorch built-ins

In [13]:
import torch.nn as nn

In [14]:
# input (temp, rainfall, humidity)
inputs = np.array([[73, 67, 43], [91, 88, 64], [87, 134, 58], 
                   [102, 43, 37], [69, 96, 70], [73, 67, 43], 
                   [91, 88, 64], [87, 134, 58], [102, 43, 37], 
                   [69, 96, 70], [73, 67, 43], [91, 88, 64], 
                   [87, 134, 58], [102, 43, 37], [69, 96, 70]], 
                  dtype='float32')

# targets (apples, oranges)
targets = np.array([[56, 70], [81, 101], [119, 133], 
                    [22, 37], [103, 119], [56, 70], 
                    [81, 101], [119, 133], [22, 37], 
                    [103, 119], [56, 70], [81, 101], 
                    [119, 133], [22, 37], [103, 119]], 
                   dtype='float32')

inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)

inputs # 15 examples

tensor([[ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.],
        [ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.],
        [ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.]])

In [15]:
# TensorDataset
from torch.utils.data import TensorDataset

train_ds = TensorDataset(inputs, targets)
train_ds[2:5]

(tensor([[ 87., 134.,  58.],
         [102.,  43.,  37.],
         [ 69.,  96.,  70.]]), tensor([[119., 133.],
         [ 22.,  37.],
         [103., 119.]]))

In [16]:
# DataLoader
from torch.utils.data import DataLoader

batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True) # pass in TensorDataset train_ds

In [17]:
for xb, yb in train_dl: # batches of 5!
  print(xb)
  print(yb)
  break

tensor([[ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 73.,  67.,  43.],
        [ 73.,  67.,  43.],
        [102.,  43.,  37.]])
tensor([[119., 133.],
        [ 22.,  37.],
        [ 56.,  70.],
        [ 56.,  70.],
        [ 22.,  37.]])
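
Conceptually, a shuffling data loader permutes the example indices and hands them out in fixed-size chunks. The following is a rough plain-Python sketch of that idea, not how `DataLoader` is implemented; `iterate_batches` and its parameters are hypothetical.

```python
import random

def iterate_batches(num_examples, batch_size, shuffle=True, seed=None):
    """Yield lists of example indices, one list per batch (hypothetical helper)."""
    indices = list(range(num_examples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    # Walk through the (possibly shuffled) indices in steps of batch_size.
    for start in range(0, num_examples, batch_size):
        yield indices[start:start + batch_size]

batches = list(iterate_batches(num_examples=15, batch_size=5, seed=0))
for batch in batches:
    print(batch)
```

With 15 examples and a batch size of 5, each epoch consists of 3 batches, and every example appears exactly once per epoch.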


In [18]:
model = nn.Linear(3, 2)

#model.weight, model.bias
list(model.parameters())

[Parameter containing:
 tensor([[-0.2159, -0.5317,  0.1585],
         [-0.0622,  0.4129,  0.2038]], requires_grad=True),
 Parameter containing:
 tensor([0.1015, 0.4506], requires_grad=True)]
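
`nn.Linear(3, 2)` stores a weight of shape `(2, 3)` and a bias of shape `(2,)`, matching the `w` and `b` created by hand earlier. This short sketch, with hypothetical names `layer` and `x_demo`, checks that calling the layer gives the same result as the manual `x @ w.t() + b`.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)                      # 3 input features -> 2 outputs
x_demo = torch.tensor([[73., 67., 43.],
                       [91., 88., 64.]])

manual = x_demo @ layer.weight.t() + layer.bias   # the from-scratch model
builtin = layer(x_demo)                           # the built-in forward pass

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(manual, builtin, atol=1e-4))
```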

In [19]:
preds = model(inputs)
preds

tensor([[-44.4657,  32.3338],
        [-56.1884,  44.1636],
        [-80.7355,  62.1830],
        [-38.9157,  19.3970],
        [-54.7419,  50.0585],
        [-44.4657,  32.3338],
        [-56.1884,  44.1636],
        [-80.7355,  62.1830],
        [-38.9157,  19.3970],
        [-54.7419,  50.0585],
        [-44.4657,  32.3338],
        [-56.1884,  44.1636],
        [-80.7355,  62.1830],
        [-38.9157,  19.3970],
        [-54.7419,  50.0585]], grad_fn=<AddmmBackward>)

In [20]:
# Loss func
import torch.nn.functional as F
loss_fn = F.mse_loss

loss = loss_fn(model(inputs), targets)
loss

tensor(11212.8486, grad_fn=<MseLossBackward>)
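
With its default settings, `F.mse_loss` averages the squared differences over all elements, which is what the hand-written `mse` does. A small check on made-up tensors (`preds_demo` and `targets_demo` are hypothetical):

```python
import torch
import torch.nn.functional as F

preds_demo = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
targets_demo = torch.tensor([[1.0, 0.0], [0.0, 4.0]])

# Hand-written version, same formula as mse() earlier in the notebook.
diff = preds_demo - targets_demo
manual = torch.sum(diff * diff) / diff.numel()

# Built-in version; the default reduction is "mean".
builtin = F.mse_loss(preds_demo, targets_demo)

print(manual.item(), builtin.item())  # 3.25 3.25
```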

In [21]:
# optimizer (stochastic gradient descent)
opt = torch.optim.SGD(model.parameters(), lr=1e-5)
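
For plain SGD (no momentum or weight decay, as configured here), `opt.step()` applies the same rule as the manual update: parameter minus learning rate times gradient. The sketch below illustrates this with a hypothetical parameter `p` and a toy loss.

```python
import torch

p = torch.tensor([1.0, -2.0], requires_grad=True)
opt_demo = torch.optim.SGD([p], lr=0.1)

loss_demo = (p ** 2).sum()      # gradient is 2 * p = [2, -4]
loss_demo.backward()

expected = p.detach() - 0.1 * p.grad   # manual update: p - lr * grad
opt_demo.step()                        # optimizer applies the same rule in place
opt_demo.zero_grad()                   # clear gradients before the next step

print(p.detach(), expected)
```

The optimizer simply saves you from writing the `torch.no_grad()` block and the per-parameter updates yourself.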

In [22]:
def fit(num_epochs, model, loss_fn, opt, train_dl):    
    for epoch in range(num_epochs):
        # each epoch, we go through entire training set once
        for xb,yb in train_dl:          
            # 1: predict
            pred = model(xb)
            # 2: loss
            loss = loss_fn(pred, yb)
            # 3: gradient
            loss.backward()
            # 4: update
            opt.step()
            # 5: reset 
            opt.zero_grad()
        
        # print progress (the loss shown is that of the last batch in the epoch)
        if (epoch+1) % 10 == 0: # every 10 epochs
            print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, num_epochs, loss.item()))

In [23]:
fit(100, model, loss_fn, opt, train_dl)

Epoch [10/100], Loss: 130.7594
Epoch [20/100], Loss: 199.4286
Epoch [30/100], Loss: 156.3750
Epoch [40/100], Loss: 74.9346
Epoch [50/100], Loss: 41.6042
Epoch [60/100], Loss: 42.0286
Epoch [70/100], Loss: 40.5651
Epoch [80/100], Loss: 27.9304
Epoch [90/100], Loss: 15.4286
Epoch [100/100], Loss: 20.4979
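
The loss printed by `fit` is the loss of the last mini-batch in that epoch, so it can bounce around from one report to the next (as it does between epochs 10 and 20 here). One way to get a steadier number is to average over the whole dataset. The sketch below is an assumption about how that could be done; `average_loss` is a hypothetical helper, and the tiny dataset and model are stand-ins so the block runs on its own.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

def average_loss(model, loss_fn, dl):
    """Mean loss over every example in dl (hypothetical helper)."""
    total, count = 0.0, 0
    with torch.no_grad():                      # no gradients needed for evaluation
        for xb, yb in dl:
            # loss_fn returns a batch mean, so weight it by the batch size.
            total += loss_fn(model(xb), yb).item() * len(xb)
            count += len(xb)
    return total / count

# Stand-in data and model, unrelated to the crop-yield example.
xs = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
ys = 2 * xs
dl_demo = DataLoader(TensorDataset(xs, ys), batch_size=2, shuffle=True)
model_demo = nn.Linear(1, 1)

print(average_loss(model_demo, F.mse_loss, dl_demo))
```

In this notebook the same helper could be called as `average_loss(model, loss_fn, train_dl)` after training.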


In [24]:
preds = model(inputs)
print(preds)
print(targets)  

tensor([[ 58.3951,  71.2228],
        [ 82.9493,  98.1140],
        [114.7682, 136.9236],
        [ 28.3263,  41.8695],
        [ 99.0234, 111.9749],
        [ 58.3951,  71.2228],
        [ 82.9493,  98.1140],
        [114.7682, 136.9236],
        [ 28.3263,  41.8695],
        [ 99.0234, 111.9749],
        [ 58.3951,  71.2228],
        [ 82.9493,  98.1140],
        [114.7682, 136.9236],
        [ 28.3263,  41.8695],
        [ 99.0234, 111.9749]], grad_fn=<AddmmBackward>)
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.],
        [ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.],
        [ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


#Congrats!
You've just built and trained a linear regression model.