
## Notes

In [167]:
import numpy as np
import torch

## Linear regression from scratch

#### Training data

In [168]:
# Input (temp, rainfall, humidity)
inputs = np.array([[73, 67, 43], 
                   [91, 88, 64], 
                   [87, 134, 58], 
                   [102, 43, 37], 
                   [69, 96, 70]], dtype='float32')

In [169]:
# Targets (apples, oranges)
targets = np.array([[56, 70], 
                    [81, 101], 
                    [119, 133], 
                    [22, 37], 
                    [103, 119]], dtype='float32')

Converting the data to tensors

In [170]:
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)

In [171]:
print(inputs)
print(targets)

tensor([[ 73.,  67.,  43.],
        [ 91.,  88.,  64.],
        [ 87., 134.,  58.],
        [102.,  43.,  37.],
        [ 69.,  96.,  70.]])
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


### Doing the linear regression from scratch

The objective is to find the weights and biases for the equations below:

yield_apple  = w11 * temp + w12 * rainfall + w13 * humidity + b1

yield_orange = w21 * temp + w22 * rainfall + w23 * humidity + b2

Initializing the weights and biases randomly

In [172]:
w = torch.randn(2, 3, requires_grad=True)  # weights
b = torch.randn(2, requires_grad=True)  # biases
print(w)
print(b)

tensor([[ 0.1689,  1.6195,  0.4738],
        [-0.8709,  0.1500,  1.2688]], requires_grad=True)
tensor([1.3106, 1.6327], requires_grad=True)


#### defining the model

In [173]:
# @ --> matrix multiplication in PyTorch
# transposing w before the multiplication so that (N, 3) @ (3, 2) gives (N, 2)
def model(x):
  return x @ w.t() + b
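
As a quick check on the shapes involved, the sketch below evaluates the two yield equations term by term for one input row and compares the result with the matrix form `x @ w.t() + b`. The weight and bias values, and the names `manual` and `matrix`, are illustrative assumptions rather than values from this notebook.

```python
import torch

# Illustrative weights and biases (fixed values chosen only for this example)
w = torch.tensor([[0.1, 0.2, 0.3],
                  [0.4, 0.5, 0.6]])   # shape (2, 3): one row per fruit
b = torch.tensor([1.0, 2.0])          # shape (2,): one bias per fruit

# One sample: (temp, rainfall, humidity)
x = torch.tensor([[73.0, 67.0, 43.0]])  # shape (1, 3)

# Term-by-term evaluation of the two equations
temp, rainfall, humidity = x[0]
yield_apple = w[0, 0] * temp + w[0, 1] * rainfall + w[0, 2] * humidity + b[0]
yield_orange = w[1, 0] * temp + w[1, 1] * rainfall + w[1, 2] * humidity + b[1]
manual = torch.stack([yield_apple, yield_orange]).unsqueeze(0)  # shape (1, 2)

# Matrix form used by model(): (1, 3) @ (3, 2) + (2,) -> (1, 2)
matrix = x @ w.t() + b

print(manual)
print(matrix)
```

Both tensors should hold the same two numbers, which is why a single matrix product can replace the two hand-written equations.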

#### generating predictions

In [174]:
preds = model(inputs)
print(preds)

tensor([[142.5188,   2.6655],
        [189.5175,  16.7837],
        [260.4961,  19.5553],
        [105.7070, -33.8032],
        [201.5999,  44.7559]], grad_fn=<AddBackward0>)


In [175]:
print(targets)

tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


#### computing the loss

In [176]:
def mse(t1, t2):
  diff = t1 - t2
  return torch.sum(diff * diff) / diff.numel()

In [177]:
# compute the loss
loss = mse(preds, targets)
print(loss)

tensor(9103.2803, grad_fn=<DivBackward0>)
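
To sanity-check the hand-written `mse`, the sketch below compares it with `torch.nn.functional.mse_loss` on a pair of small made-up tensors. The input values are assumptions chosen so the result is easy to verify by hand.

```python
import torch
import torch.nn.functional as F

def mse(t1, t2):
    # mean of the squared element-wise differences
    diff = t1 - t2
    return torch.sum(diff * diff) / diff.numel()

# Small made-up tensors, used only for this check
preds = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
targets = torch.tensor([[0.0, 2.0], [3.0, 6.0]])

manual = mse(preds, targets)          # (1 + 0 + 0 + 4) / 4
builtin = F.mse_loss(preds, targets)  # default reduction is 'mean'

print(manual, builtin)
```

The squared differences are 1, 0, 0 and 4, so both calls should report 1.25.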


#### compute the gradients

PyTorch accumulates gradients: each call to .backward() adds the new gradient values to whatever is already stored in .grad. If the gradients are not reset to zero between calls, the update step uses the sum of old and new gradients, which gives wrong results.

In [178]:
# loss.backward()
# print(w)
# print(w.grad)
# print(b)
# print(b.grad)

# # resetting the gradients of w and b to zero
# w.grad.zero_()
# b.grad.zero_()
# print(w.grad)
# print(b.grad)
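
The accumulation behaviour can be seen on a tiny example. This is a minimal sketch with a made-up tensor `w` and a made-up function `sum(3 * w)`, whose gradient is 3 for every element.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(3 * w) = 3 for every element
(3 * w).sum().backward()
first = w.grad.clone()

# Second backward pass without resetting: the new gradient is added on top
(3 * w).sum().backward()
accumulated = w.grad.clone()

# Resetting in place brings the stored gradient back to zero
w.grad.zero_()
after_reset = w.grad.clone()

print(first)        # gradient after one backward pass
print(accumulated)  # gradient after two backward passes
print(after_reset)  # gradient after zero_()
```

The second value is twice the first, which is the error the reset step is meant to prevent.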

### training the model using gradient descent

In [179]:
# computing the gradients again
loss.backward()

In [180]:
# adjusting the weights and resetting the gradients
with torch.no_grad():  # no_grad tells PyTorch not to track these operations for gradient computation while the weights are updated
  w -= w.grad * 1e-5
  b -= b.grad * 1e-5
  w.grad.zero_()
  b.grad.zero_()
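
To see why the update is wrapped in `torch.no_grad()`, the sketch below tries the same in-place update with and without it. The tensor `w`, the loss `sum(w * w)`, and the step size 0.1 are made-up values for illustration.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
loss = (w * w).sum()
loss.backward()  # w.grad is now 2 * w

# Outside no_grad, an in-place update of a leaf tensor that requires grad fails
try:
    w -= w.grad * 0.1
    raised = False
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)

# Inside no_grad, the same update is allowed and is not recorded by autograd
with torch.no_grad():
    w -= w.grad * 0.1
    w.grad.zero_()

print(w)
```

After the block runs, `w` still has `requires_grad=True`, so later forward passes continue to build a graph from the updated values.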

In [181]:
print(w)
print(b)

tensor([[ 0.0812,  1.5248,  0.4160],
        [-0.8013,  0.2246,  1.3141]], requires_grad=True)
tensor([1.3095, 1.6335], requires_grad=True)


In [182]:
# calculating the loss with the updated weights
preds = model(inputs)
loss = mse(preds, targets)

In [183]:
print(loss)

tensor(6147.4692, grad_fn=<DivBackward0>)


### training for multiple epochs

In [184]:
epochs = 200
lr = 1e-5
for i in range(epochs):
  preds = model(inputs)
  loss = mse(preds, targets)
  loss.backward()
  with torch.no_grad():
    w -= w.grad * lr
    b -= b.grad * lr
    w.grad.zero_()
    b.grad.zero_()
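
A self-contained variant of this loop that records the loss at every epoch makes it easier to confirm that training is making progress. This is a sketch under stated assumptions: the fixed seed and the `history` list are additions for illustration, and the exact loss values depend on the random initialization.

```python
import torch

torch.manual_seed(0)  # fixed seed, an assumption made so the run is repeatable

inputs = torch.tensor([[73., 67., 43.], [91., 88., 64.], [87., 134., 58.],
                       [102., 43., 37.], [69., 96., 70.]])
targets = torch.tensor([[56., 70.], [81., 101.], [119., 133.],
                        [22., 37.], [103., 119.]])

w = torch.randn(2, 3, requires_grad=True)
b = torch.randn(2, requires_grad=True)

lr = 1e-5
history = []  # hypothetical helper list: one loss value per epoch
for epoch in range(200):
    preds = inputs @ w.t() + b
    loss = ((preds - targets) ** 2).mean()  # same quantity as mse(preds, targets)
    loss.backward()
    with torch.no_grad():
        w -= w.grad * lr
        b -= b.grad * lr
        w.grad.zero_()
        b.grad.zero_()
    history.append(loss.item())

print(f"first epoch: {history[0]:.2f}, last epoch: {history[-1]:.2f}")
```

Plotting or printing `history` is a simple way to spot a learning rate that is too large (the loss grows) or too small (the loss barely moves).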

In [185]:
# calculate the new loss
preds = model(inputs)
loss = mse(preds, targets)
print(loss)

tensor(26.8823, grad_fn=<DivBackward0>)


### comparing the results

The predictions are now close to the targets.

In [186]:
print(preds)
print(targets)

tensor([[ 57.6658,  70.0634],
        [ 78.6535, 104.3392],
        [125.9737, 125.0138],
        [ 22.3309,  34.3830],
        [ 95.2669, 127.3796]], grad_fn=<AddBackward0>)
tensor([[ 56.,  70.],
        [ 81., 101.],
        [119., 133.],
        [ 22.,  37.],
        [103., 119.]])


## Linear regression using pytorch built-ins

In [187]:
# importing packages for neural network
import torch.nn as nn

training data

In [188]:
# Input (temp, rainfall, humidity)
inputs = np.array([[73, 67, 43], 
                   [91, 88, 64], 
                   [87, 134, 58], 
                   [102, 43, 37], 
                   [69, 96, 70], 
                   [74, 66, 43], 
                   [91, 87, 65], 
                   [88, 134, 59], 
                   [101, 44, 37], 
                   [68, 96, 71], 
                   [73, 66, 44], 
                   [92, 87, 64], 
                   [87, 135, 57], 
                   [103, 43, 36], 
                   [68, 97, 70]], 
                  dtype='float32')

# Targets (apples, oranges)
targets = np.array([[56, 70], 
                    [81, 101], 
                    [119, 133], 
                    [22, 37], 
                    [103, 119],
                    [57, 69], 
                    [80, 102], 
                    [118, 132], 
                    [21, 38], 
                    [104, 118], 
                    [57, 69], 
                    [82, 100], 
                    [118, 134], 
                    [20, 38], 
                    [102, 120]], 
                   dtype='float32')

inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)

### dataset and dataloader

We'll create a TensorDataset, which allows access to rows from inputs and targets as tuples, and provides standard APIs for working with many different types of datasets in PyTorch.

We'll also create a DataLoader, which can split the data into batches of a predefined size while training. It also provides other utilities like shuffling and random sampling of the data.
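
As a rough illustration of how the two classes fit together, the sketch below builds a dataset with the same layout as this notebook's data (15 rows, 3 inputs, 2 targets) and inspects the batches. The tensors `xs` and `ys` are filled with made-up values.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Made-up data with the same layout as the notebook: 15 rows, 3 inputs, 2 targets
xs = torch.arange(45, dtype=torch.float32).reshape(15, 3)
ys = torch.arange(30, dtype=torch.float32).reshape(15, 2)

ds = TensorDataset(xs, ys)
x0, y0 = ds[0]  # indexing returns an (input, target) tuple

dl = DataLoader(ds, batch_size=5, shuffle=True)
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in dl]

print(len(ds), len(dl))  # number of rows, number of batches
print(batch_shapes)
```

With 15 rows and a batch size of 5, one pass over the loader yields three batches. Because `shuffle=True`, the rows in each batch differ from epoch to epoch.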

In [192]:
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

In [190]:
# define the dataset
train_ds = TensorDataset(inputs, targets)
train_ds[:2]

(tensor([[73., 67., 43.],
         [91., 88., 64.]]), tensor([[ 56.,  70.],
         [ 81., 101.]]))

In [193]:
# define data loader
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)


In [195]:
for xb, yb in train_dl:
  print('batch 1:')
  print(xb)
  print(yb)
  break

batch 1:
tensor([[101.,  44.,  37.],
        [ 88., 134.,  59.],
        [ 73.,  66.,  44.],
        [ 68.,  97.,  70.],
        [ 68.,  96.,  71.]])
tensor([[ 21.,  38.],
        [118., 132.],
        [ 57.,  69.],
        [102., 120.],
        [104., 118.]])


### nn.Linear

Instead of initializing the weights & biases manually, we can define the model using the nn.Linear class from PyTorch, which does it automatically.

In [204]:
# In Jupyter/IPython, prefix a name with ? to display its help
#?nn.Linear

In [197]:
# define the model
model = nn.Linear(3,2)
print(model.weight)
print(model.bias)

Parameter containing:
tensor([[ 0.5054,  0.2476, -0.0704],
        [ 0.5446, -0.0832,  0.3560]], requires_grad=True)
Parameter containing:
tensor([0.1566, 0.2058], requires_grad=True)
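
The layer stores its weights with shape `(out_features, in_features)`, the same layout as the hand-written `w` earlier. The sketch below, using a freshly created layer and two sample rows, checks that calling the layer matches `x @ weight.t() + bias`. The names `layer`, `builtin`, and `manual` are illustrative.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)  # 3 input features, 2 outputs
x = torch.tensor([[73.0, 67.0, 43.0],
                  [91.0, 88.0, 64.0]])

builtin = layer(x)                           # forward pass of the layer
manual = x @ layer.weight.t() + layer.bias   # the same computation written out

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(builtin, manual, atol=1e-4))
```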


PyTorch models also have a helpful .parameters method, which returns an iterator over all the weight and bias tensors in the model (wrapped in list() below to display them). For our linear regression model, we have one weight matrix and one bias vector.

In [198]:
# parameters
list(model.parameters())

[Parameter containing:
 tensor([[ 0.5054,  0.2476, -0.0704],
         [ 0.5446, -0.0832,  0.3560]], requires_grad=True),
 Parameter containing:
 tensor([0.1566, 0.2058], requires_grad=True)]
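
A related call, `named_parameters()`, pairs each tensor with its attribute name. The sketch below uses it on a fresh `nn.Linear(3, 2)` to list the shapes and count the trainable values. The names `shapes` and `n_params` are illustrative.

```python
import torch.nn as nn

model = nn.Linear(3, 2)

# named_parameters() yields (name, tensor) pairs
shapes = {name: tuple(p.shape) for name, p in model.named_parameters()}

# total number of trainable scalars: 2 * 3 weights + 2 biases
n_params = sum(p.numel() for p in model.parameters())

print(shapes)
print(n_params)
```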

In [200]:
# generate predictions
preds = model(inputs)
preds

tensor([[50.6092, 49.6903],
        [63.4265, 65.2203],
        [73.2154, 57.0768],
        [59.7460, 65.3452],
        [53.8663, 54.7093],
        [50.8670, 50.3181],
        [63.1086, 65.6595],
        [73.6504, 57.9774],
        [59.4882, 64.7173],
        [53.2905, 54.5207],
        [50.2912, 50.1295],
        [63.6844, 65.8481],
        [73.5333, 56.6376],
        [60.3218, 65.5338],
        [53.6085, 54.0815]], grad_fn=<AddmmBackward0>)

### loss function

The torch.nn.functional module contains many useful loss functions and several other utilities.

In [201]:
# import nn.functional
import torch.nn.functional as F

In [202]:
loss_fn = F.mse_loss

In [205]:
# compute the loss
loss = loss_fn(model(inputs), targets)
loss

tensor(1861.2623, grad_fn=<MseLossBackward0>)

### optimizer

Note that model.parameters() is passed as an argument to optim.SGD so that the optimizer knows which matrices should be modified during the update step

In [206]:
opt = torch.optim.SGD(model.parameters(), lr=1e-5)
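
With no momentum or weight decay, one `opt.step()` of `torch.optim.SGD` should perform the same update as the manual `w -= w.grad * lr` used earlier. The sketch below checks this on a single made-up sample. The seed and the names `expected_w` and `expected_b` are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the run is repeatable
lr = 1e-5
model = nn.Linear(3, 2)
opt = torch.optim.SGD(model.parameters(), lr=lr)

x = torch.tensor([[73.0, 67.0, 43.0]])
y = torch.tensor([[56.0, 70.0]])

loss = F.mse_loss(model(x), y)
loss.backward()

# What the manual update rule would produce: p - lr * grad
expected_w = model.weight.detach() - lr * model.weight.grad
expected_b = model.bias.detach() - lr * model.bias.grad

opt.step()       # updates every parameter passed to the optimizer
opt.zero_grad()  # clears the gradients before the next batch

print(torch.allclose(model.weight, expected_w))
print(torch.allclose(model.bias, expected_b))
```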

### Train the model

In [207]:
def fit(epochs, model, loss_fn, opt, train_dl):

  # repeat for the given number of epochs
  for epoch in range(epochs):

    # train with batches of the data
    for xb, yb in train_dl:
      
      # generate the prediction 
      pred = model(xb)

      # calculate the loss
      loss = loss_fn(pred, yb)

      # compute gradients
      loss.backward()

      # update the parameters using gradient 
      opt.step()

      # reset the gradient to zero
      opt.zero_grad()

      # Print the progress (inside the batch loop, so once per batch on every 10th epoch)
      if (epoch+1) % 10 == 0:
          print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, epochs, loss.item()))

In [208]:
fit(100, model, loss_fn, opt, train_dl)

Epoch [10/100], Loss: 844.0866
Epoch [10/100], Loss: 1046.0306
Epoch [10/100], Loss: 197.9391
Epoch [20/100], Loss: 413.0467
Epoch [20/100], Loss: 202.3129
Epoch [20/100], Loss: 778.8761
Epoch [30/100], Loss: 166.6920
Epoch [30/100], Loss: 715.8894
Epoch [30/100], Loss: 143.2899
Epoch [40/100], Loss: 128.4565
Epoch [40/100], Loss: 383.8614
Epoch [40/100], Loss: 166.8421
Epoch [50/100], Loss: 166.6998
Epoch [50/100], Loss: 11.9832
Epoch [50/100], Loss: 293.2950
Epoch [60/100], Loss: 212.6195
Epoch [60/100], Loss: 13.5321
Epoch [60/100], Loss: 109.7648
Epoch [70/100], Loss: 84.1226
Epoch [70/100], Loss: 79.1755
Epoch [70/100], Loss: 81.1316
Epoch [80/100], Loss: 106.4217
Epoch [80/100], Loss: 53.6017
Epoch [80/100], Loss: 31.0737
Epoch [90/100], Loss: 54.3171
Epoch [90/100], Loss: 26.7336
Epoch [90/100], Loss: 65.7023
Epoch [100/100], Loss: 49.1939
Epoch [100/100], Loss: 36.2880
Epoch [100/100], Loss: 25.6970
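
Each logged epoch appears three times above because the print statement runs once per batch, and the per-batch losses are noisy. One possible alternative is to log the mean batch loss once per epoch. The function `fit_with_epoch_loss` and its `log_every` parameter are hypothetical names, and the short run below uses only the first five rows of the data as a single batch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import TensorDataset, DataLoader

def fit_with_epoch_loss(epochs, model, loss_fn, opt, train_dl, log_every=10):
    """Hypothetical variant of fit(): logs the mean batch loss once per logged epoch."""
    epoch_losses = []
    for epoch in range(epochs):
        batch_losses = []
        for xb, yb in train_dl:
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()
            opt.zero_grad()
            batch_losses.append(loss.item())
        # average over the batches seen in this epoch
        mean_loss = sum(batch_losses) / len(batch_losses)
        epoch_losses.append(mean_loss)
        if (epoch + 1) % log_every == 0:
            print(f"Epoch [{epoch + 1}/{epochs}], mean loss: {mean_loss:.4f}")
    return epoch_losses

# Short run on the first five rows of the notebook's data (one batch per epoch)
torch.manual_seed(0)
xs = torch.tensor([[73., 67., 43.], [91., 88., 64.], [87., 134., 58.],
                   [102., 43., 37.], [69., 96., 70.]])
ys = torch.tensor([[56., 70.], [81., 101.], [119., 133.],
                   [22., 37.], [103., 119.]])
dl = DataLoader(TensorDataset(xs, ys), batch_size=5, shuffle=True)
net = nn.Linear(3, 2)
sgd = torch.optim.SGD(net.parameters(), lr=1e-5)
losses = fit_with_epoch_loss(20, net, F.mse_loss, sgd, dl)
```

Returning the list of epoch losses also makes it easy to plot the training curve afterwards.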
