### Basic working of backpropagation

In [18]:
import torch

In [19]:
x = torch.tensor(1.0)
y = torch.tensor(2.0)

w = torch.tensor(1.0, requires_grad=True)

In [20]:
# forward pass to compute loss
y_hat = w*x
loss = (y_hat-y)**2

loss

tensor(1., grad_fn=<PowBackward0>)

In [21]:
# Backward pass
loss.backward()
print(w.grad)

tensor(-2.)
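
The value of `w.grad` can be checked by hand. With `loss = (w*x - y)**2`, the chain rule gives `d(loss)/dw = 2*x*(w*x - y)`, which is `2*1*(1 - 2) = -2` for the values above. A minimal sketch of that check follows; the name `manual_grad` is introduced here for illustration only.

```python
import torch

x = torch.tensor(1.0)
y = torch.tensor(2.0)
w = torch.tensor(1.0, requires_grad=True)

# Forward and backward pass, as in the cells above
loss = (w * x - y) ** 2
loss.backward()

# Chain rule by hand: d(loss)/dw = 2 * x * (w*x - y)
# detach() keeps this calculation out of the autograd graph
manual_grad = 2 * x * (w.detach() * x - y)

print(w.grad, manual_grad)
```

Both printed values should agree, which confirms that autograd is applying the same chain rule.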


In [22]:
## Update the weight before the next forward-backward pass
## Wrap the update in a "with torch.no_grad():" block so that autograd does not track it
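
The cell above only describes the update step, so here is a rough sketch of what it could look like. The `learning_rate` value is an assumption chosen for illustration; it does not come from the original notebook.

```python
import torch

x = torch.tensor(1.0)
y = torch.tensor(2.0)
w = torch.tensor(1.0, requires_grad=True)

learning_rate = 0.1  # assumed value, for illustration only

loss = (w * x - y) ** 2
loss.backward()

# The update itself must not be recorded in the computation graph
with torch.no_grad():
    w -= learning_rate * w.grad

# Clear the stored gradient before the next backward pass
w.grad.zero_()

print(w)
```

Updating `w` in place outside of `torch.no_grad()` would raise an error, because `w` is a leaf tensor that requires gradients.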

### Linear regression from scratch

In [45]:
import numpy as np

In [87]:
X = np.array([1,2,3,4], dtype=np.float32)
Y = np.array([2,4,6,8], dtype=np.float32)

w = 0.0 ## We are omitting any bias for now

# Model prediction
def forward(x):
    return w * x

# Loss
def loss(y, y_pred):
    return ((y-y_pred)**2).mean()

# Gradient
def gradient(x,y,y_pred):
    return np.mean(2*x*(y_pred-y))

print(f'Prediction before training: f(5) = {forward(5):.3f}')

#Training
learning_rate = 0.01
epochs = 100

for epoch in range(epochs):
    Y_pred = forward(X)
    l = loss(Y,Y_pred)
    dw = gradient(X,Y,Y_pred)
    
    w = w - learning_rate*dw
    
    if epoch%10==0:
        print(f"Loss after epoch {epoch+1} is {l:.10f}")
    

print(f'Prediction after training: f(5) = {forward(5):.3f}')

    

Prediction before training: f(5) = 0.000
Loss after epoch 1 is 30.0000000000
Loss after epoch 11 is 1.1627856493
Loss after epoch 21 is 0.0450690463
Loss after epoch 31 is 0.0017468547
Loss after epoch 41 is 0.0000677049
Loss after epoch 51 is 0.0000026244
Loss after epoch 61 is 0.0000001018
Loss after epoch 71 is 0.0000000039
Loss after epoch 81 is 0.0000000002
Loss after epoch 91 is 0.0000000000
Prediction after training: f(5) = 10.000
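
The `gradient` function above comes from differentiating the mean squared error with respect to `w`: `d/dw mean((w*x - y)**2) = mean(2*x*(w*x - y))`. One way to gain confidence in such a hand-derived formula is to compare it with a finite-difference estimate. The sketch below does this; the helper names `loss_at`, `analytic_gradient` and `numeric_gradient`, and the test point `w0`, are hypothetical and not part of the original notebook.

```python
import numpy as np

X = np.array([1, 2, 3, 4], dtype=np.float64)
Y = np.array([2, 4, 6, 8], dtype=np.float64)

def loss_at(w):
    # Mean squared error of the model w * x
    return ((Y - w * X) ** 2).mean()

def analytic_gradient(w):
    # d/dw mean((w*x - y)^2) = mean(2 * x * (w*x - y))
    return np.mean(2 * X * (w * X - Y))

def numeric_gradient(w, eps=1e-6):
    # Central-difference approximation of the derivative
    return (loss_at(w + eps) - loss_at(w - eps)) / (2 * eps)

w0 = 0.5  # arbitrary test point
print(analytic_gradient(w0), numeric_gradient(w0))
```

The two numbers should match closely; a large gap would point to a mistake in the derivation.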


### Linear regression using autograd for gradient calculation

In [88]:
import torch

In [103]:
X = torch.tensor([1,2,3,4], dtype=torch.float32)
Y = torch.tensor([2,4,6,8], dtype=torch.float32)

w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

def forward(x):
    return w*x

def loss(y,y_pred):
    return ((y-y_pred)**2).mean()

print(f'Prediction before training: f(5) = {forward(5):.3f}')

#Training
learning_rate = 0.01
epochs = 100

for epoch in range(epochs):
    Y_pred = forward(X)
    l = loss(Y,Y_pred)
#     dw = gradient(X,Y,Y_pred)

    l.backward()
    
    with torch.no_grad():
        w -= learning_rate * w.grad
   
    #Resetting the gradient to 0
    w.grad.zero_()
    
    if epoch%10==0:
        print(f"Loss after epoch {epoch+1} is {l:.10f}")
    

print(f'Prediction after training: f(5) = {forward(5):.3f}')



Prediction before training: f(5) = 0.000
Loss after epoch 1 is 30.0000000000
Loss after epoch 11 is 1.1627856493
Loss after epoch 21 is 0.0450688973
Loss after epoch 31 is 0.0017468547
Loss after epoch 41 is 0.0000677049
Loss after epoch 51 is 0.0000026244
Loss after epoch 61 is 0.0000001018
Loss after epoch 71 is 0.0000000040
Loss after epoch 81 is 0.0000000001
Loss after epoch 91 is 0.0000000000
Prediction after training: f(5) = 10.000
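
The call to `w.grad.zero_()` in the loop matters because `backward()` adds to `w.grad` instead of overwriting it. A small sketch of this accumulation, using a toy expression `w * x` that is not part of the regression above:

```python
import torch

w = torch.tensor(1.0, requires_grad=True)
x = torch.tensor(3.0)

# First backward pass: d(w*x)/dw = 3
(w * x).backward()
first = w.grad.item()

# Second backward pass without zeroing: the new gradient is added on top
(w * x).backward()
accumulated = w.grad.item()

# After zeroing, the next backward pass starts from 0 again
w.grad.zero_()
(w * x).backward()
after_reset = w.grad.item()

print(first, accumulated, after_reset)
```

Without the reset, each epoch would step along the sum of all previous gradients, and training would typically diverge.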


### Linear regression with autograd, nn and optim packages

In [104]:
## Here, we do not compute the loss or update the weights manually. Instead, we use the loss and optimizer classes provided by torch.nn and torch.optim.
import torch
import torch.nn as nn
import torch.optim as opt

In [107]:
X = torch.tensor([1,2,3,4], dtype=torch.float32)
Y = torch.tensor([2,4,6,8], dtype=torch.float32)

w = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

def forward(x):
    return w*x

print(f'Prediction before training: f(5) = {forward(5):.3f}')

#Training
learning_rate = 0.01
epochs = 100

loss = nn.MSELoss() ## No need for hyperparameter setting here
optimizer = opt.SGD([w], lr=learning_rate)

for epoch in range(epochs):
    Y_pred = forward(X)
    l = loss(Y_pred, Y) ## Using the "loss" object that we created; nn.MSELoss takes (input, target)
#     dw = gradient(X,Y,Y_pred)

    l.backward()
    
#     with torch.no_grad():
#         w -= learning_rate * w.grad

    optimizer.step()
   
    #Resetting the gradient to 0
    optimizer.zero_grad()
    
    if epoch%10==0:
        print(f"Loss after epoch {epoch+1} is {l:.10f}")
    

print(f'Prediction after training: f(5) = {forward(5):.3f}')

Prediction before training: f(5) = 0.000
Loss after epoch 1 is 30.0000000000
Loss after epoch 11 is 1.1627856493
Loss after epoch 21 is 0.0450688973
Loss after epoch 31 is 0.0017468547
Loss after epoch 41 is 0.0000677049
Loss after epoch 51 is 0.0000026244
Loss after epoch 61 is 0.0000001018
Loss after epoch 71 is 0.0000000040
Loss after epoch 81 is 0.0000000001
Loss after epoch 91 is 0.0000000000
Prediction after training: f(5) = 10.000


NOTE : Here, we create `loss` and `optimizer` objects. The `optimizer` is instantiated with the required hyperparameters (only `learning_rate` for now). The `loss` object needs no configuration here; it takes the predictions and the targets and returns the corresponding loss.

We call `loss` on the predicted values `Y_pred` and the targets `Y`. We then call `backward()` on the result; since `w` was created with `requires_grad=True`, autograd stores the gradient of the loss with respect to `w` in `w.grad`. The optimizer was given the parameters as a list, together with the hyperparameters (here, `[w]` and `learning_rate`), so calling its `step()` method updates `w` (the update runs under `torch.no_grad()` internally). Finally, we reset the gradient using `optimizer.zero_grad()`.
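
To make the correspondence with the previous section concrete, the sketch below performs one training step twice: once with the manual update and once with `opt.SGD`. The variable names `w_manual` and `w_sgd` are introduced here for illustration. This assumes plain SGD with its default settings (no momentum or weight decay).

```python
import torch
import torch.optim as opt

learning_rate = 0.01
X = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
Y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)

# Two copies of the same starting weight
w_manual = torch.tensor(0.0, requires_grad=True)
w_sgd = torch.tensor(0.0, requires_grad=True)
optimizer = opt.SGD([w_sgd], lr=learning_rate)

# One step with the manual update
((w_manual * X - Y) ** 2).mean().backward()
with torch.no_grad():
    w_manual -= learning_rate * w_manual.grad
w_manual.grad.zero_()

# The same step through the optimizer
((w_sgd * X - Y) ** 2).mean().backward()
optimizer.step()
optimizer.zero_grad()

print(w_manual.item(), w_sgd.item())
```

Both weights should end up at essentially the same value, which is why the printed losses in this section track those of the autograd-only version so closely.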

### Creating a model to be used for prediction

In [108]:
import torch
import torch.nn as nn
import torch.optim as opt

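NOTE : `nn.Linear` treats the last dimension of its input as the number of input features, so a batch of samples is normally passed as a 2D tensor of shape (number_of_input_samples, number_of_input_features). The target tensor should have the same shape as the model output, here (number_of_input_samples, number_of_output_features).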
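
The shape convention can be illustrated with a short sketch. It reshapes the flat vector used in the earlier sections into a 2D tensor and passes it through a linear layer; the names `x_flat`, `x_2d` and `layer` are assumptions made for this example.

```python
import torch
import torch.nn as nn

# A flat vector of 4 samples, as used in the earlier sections
x_flat = torch.tensor([1, 2, 3, 4], dtype=torch.float32)

# Reshape to (number_of_samples, number_of_features) = (4, 1)
x_2d = x_flat.view(-1, 1)

layer = nn.Linear(in_features=1, out_features=1)
out = layer(x_2d)

print(x_flat.shape, x_2d.shape, out.shape)
```

The output keeps the sample dimension and replaces the feature dimension with `out_features`, so the targets should be shaped the same way.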

In [119]:
X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

X_test = torch.tensor([5], dtype=torch.float32)

m,n = X.shape # m is the number of samples and n is the number of features

input_size = n # Number of features per input sample
output_size = Y.shape[1] # Number of target values per sample (also 1 here)

model = nn.Linear(input_size, output_size)

print(f'Prediction before training: f(5) = {model(X_test).item():.3f}')

#Training
learning_rate = 0.01
epochs = 4000

loss = nn.MSELoss() ## No need for hyperparameter setting here
optimizer = opt.SGD(model.parameters(), lr=learning_rate)

for epoch in range(epochs):
    Y_pred = model(X)
    l = loss(Y_pred, Y) ## Using the "loss" object that we created; nn.MSELoss takes (input, target)
#     dw = gradient(X,Y,Y_pred)

    l.backward()
    
#     with torch.no_grad():
#         w -= learning_rate * w.grad

    optimizer.step()
   
    #Resetting the gradient to 0
    optimizer.zero_grad()
    
    if epoch%10==0:
        [w, b] = model.parameters()
        print(f'Weight and bias after {epoch+1} epoch are {w.item()} and {b.item()}')
        print(f"Loss after epoch {epoch+1} is {l:.10f}")
    

print(f'Prediction after training: f(5) = {model(X_test).item():.3f}')

Prediction before training: f(5) = -4.633
Weight and bias after 1 epoch are -0.5526863932609558 and 0.4392426908016205
Loss after epoch 1 is 62.5310745239
Weight and bias after 11 epoch are 1.2667009830474854 and 1.019435167312622
Loss after epoch 11 is 1.8291726112
Weight and bias after 21 epoch are 1.5688797235488892 and 1.0847325325012207
Loss after epoch 21 is 0.2463684976
Weight and bias after 31 epoch are 1.6267356872558594 and 1.0680372714996338
Loss after epoch 31 is 0.1938321590
Weight and bias after 41 epoch are 1.645019292831421 and 1.0389572381973267
Loss after epoch 41 is 0.1815619469
Weight and bias after 51 epoch are 1.6566723585128784 and 1.0086647272109985
Loss after epoch 51 is 0.1709685177
Weight and bias after 61 epoch are 1.6670016050338745 and 0.9789338111877441
Loss after epoch 61 is 0.1610166281
Weight and bias after 71 epoch are 1.676868200302124 and 0.9500274658203125
Loss after epoch 71 is 0.1516446322
Weight and bias after 81 epoch are 1.686418056488037 and 

Weight and bias after 971 epoch are 1.9782527685165405 and 0.06393951922655106
Loss after epoch 971 is 0.0006868963
Weight and bias after 981 epoch are 1.9788951873779297 and 0.06205080822110176
Loss after epoch 981 is 0.0006469181
Weight and bias after 991 epoch are 1.9795185327529907 and 0.06021789088845253
Loss after epoch 991 is 0.0006092641
Weight and bias after 1001 epoch are 1.980123519897461 and 0.058439161628484726
Loss after epoch 1001 is 0.0005737978
Weight and bias after 1011 epoch are 1.9807106256484985 and 0.05671295151114464
Loss after epoch 1011 is 0.0005403999
Weight and bias after 1021 epoch are 1.9812804460525513 and 0.0550377294421196
Loss after epoch 1021 is 0.0005089474
Weight and bias after 1031 epoch are 1.9818334579467773 and 0.05341199040412903
Loss after epoch 1031 is 0.0004793264
Weight and bias after 1041 epoch are 1.9823700189590454 and 0.051834236830472946
Loss after epoch 1041 is 0.0004514238
Weight and bias after 1051 epoch are 1.9828908443450928 and 0.

Weight and bias after 1911 epoch are 1.998701572418213 and 0.0038173971697688103
Loss after epoch 1911 is 0.0000024483
Weight and bias after 1921 epoch are 1.9987398386001587 and 0.0037046910729259253
Loss after epoch 1921 is 0.0000023059
Weight and bias after 1931 epoch are 1.9987770318984985 and 0.003595309564843774
Loss after epoch 1931 is 0.0000021718
Weight and bias after 1941 epoch are 1.9988131523132324 and 0.0034891560208052397
Loss after epoch 1941 is 0.0000020454
Weight and bias after 1951 epoch are 1.9988481998443604 and 0.003386115189641714
Loss after epoch 1951 is 0.0000019264
Weight and bias after 1961 epoch are 1.9988821744918823 and 0.0032861176878213882
Loss after epoch 1961 is 0.0000018146
Weight and bias after 1971 epoch are 1.9989153146743774 and 0.003189092967659235
Loss after epoch 1971 is 0.0000017087
Weight and bias after 1981 epoch are 1.9989473819732666 and 0.0030948999337852
Loss after epoch 1981 is 0.0000016091
Weight and bias after 1991 epoch are 1.99897837

Weight and bias after 2961 epoch are 1.9999439716339111 and 0.00016458974278066307
Loss after epoch 2961 is 0.0000000046
Weight and bias after 2971 epoch are 1.9999456405639648 and 0.00015975820133462548
Loss after epoch 2971 is 0.0000000043
Weight and bias after 2981 epoch are 1.999947190284729 and 0.00015508042997680604
Loss after epoch 2981 is 0.0000000041
Weight and bias after 2991 epoch are 1.9999486207962036 and 0.00015054570394568145
Loss after epoch 2991 is 0.0000000038
Weight and bias after 3001 epoch are 1.9999501705169678 and 0.0001461790525354445
Loss after epoch 3001 is 0.0000000036
Weight and bias after 3011 epoch are 1.9999516010284424 and 0.00014193398237694055
Loss after epoch 3011 is 0.0000000034
Weight and bias after 3021 epoch are 1.9999529123306274 and 0.0001378164888592437
Loss after epoch 3021 is 0.0000000032
Weight and bias after 3031 epoch are 1.999954342842102 and 0.0001338361034868285
Loss after epoch 3031 is 0.0000000030
Weight and bias after 3041 epoch are 

Weight and bias after 3791 epoch are 1.9999943971633911 and 1.591305226611439e-05
Loss after epoch 3791 is 0.0000000000
Weight and bias after 3801 epoch are 1.9999943971633911 and 1.556138704472687e-05
Loss after epoch 3801 is 0.0000000000
Weight and bias after 3811 epoch are 1.9999945163726807 and 1.5238329979183618e-05
Loss after epoch 3811 is 0.0000000000
Weight and bias after 3821 epoch are 1.9999947547912598 and 1.4929579265299253e-05
Loss after epoch 3821 is 0.0000000000
Weight and bias after 3831 epoch are 1.9999947547912598 and 1.4596987057302613e-05
Loss after epoch 3831 is 0.0000000000
Weight and bias after 3841 epoch are 1.9999948740005493 and 1.4282278243626934e-05
Loss after epoch 3841 is 0.0000000000
Weight and bias after 3851 epoch are 1.9999951124191284 and 1.4004522199684288e-05
Loss after epoch 3851 is 0.0000000000
Weight and bias after 3861 epoch are 1.9999951124191284 and 1.3685042176803108e-05
Loss after epoch 3861 is 0.0000000000
Weight and bias after 3871 epoch a

NOTE : We haven't actually written a model class of our own here. Since this is a simple regression problem, we use a single linear layer (`nn.Linear`) with one input and one output feature as the model, instantiating it directly from `torch.nn` and setting its input and output sizes.
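
For comparison, a custom model for the same problem could be written by subclassing `nn.Module`. The following is only a sketch: the class name `LinearRegression` and the attribute name `lin` are hypothetical, and the class simply wraps the same `nn.Linear` layer.

```python
import torch
import torch.nn as nn

class LinearRegression(nn.Module):  # hypothetical class name
    def __init__(self, input_dim, output_dim):
        super().__init__()
        # A single linear layer holds the weight and the bias
        self.lin = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        # Called when the model instance is applied to an input
        return self.lin(x)

model = LinearRegression(1, 1)
X_test = torch.tensor([5], dtype=torch.float32)
prediction = model(X_test)

print(prediction.shape, len(list(model.parameters())))
```

Such a model can be passed to the training loop above unchanged, since `model.parameters()` and `model(X)` behave the same way as for the bare `nn.Linear` layer.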