# Model optimization with automatic gradient computation 
<h3 span style='color:yellow'>This tutorial will cover the implementation of baseline linear regression optimization using the Autograd module.</h3>

<h3 span style='color:yellow'>The tutorial moves from fully manual to fully automatic optimization in four cases.</h3>

<h3 span style='color:yellow'>Case #1 manual operations.</h3>
<ul>
<li ><span style="color:yellow">Prediction:</span> manually.</li>
<li><span style="color:yellow">Gradient computation:</span> manually.</li>
<li><span style="color:yellow">Loss computation:</span> manually.</li>
<li><span style="color:yellow">Parameter updates:</span> manually.</li>
</ul>

<h3 span style='color:yellow'>Case #2 gradient computation by Autograd.</h3>
<ul>
<li ><span style="color:yellow">Prediction:</span> manually.</li>
<li><span style="color:lightgreen">Gradient computation:</span> Autograd.</li>
<li><span style="color:yellow">Loss computation:</span> manually.</li>
<li><span style="color:yellow">Parameter updates:</span> manually.</li>
</ul>

<h3 span style='color:yellow'>Case #3 all operations except the prediction are automatic.</h3>
<ul>
<li ><span style="color:yellow">Prediction:</span> manually.</li>
<li><span style="color:lightgreen">Gradient computation:</span> Autograd.</li>
<li><span style="color:lightgreen">Loss computation:</span> PyTorch loss.</li>
<li><span style="color:lightgreen">Parameter updates:</span> PyTorch optimizer.</li>
</ul>
<h3 span style='color:yellow'>Case #4 automatic optimization and prediction.</h3>
<ul>
<li ><span style="color:green">Prediction:</span> PyTorch model.</li>
<li><span style="color:lightgreen">Gradient computation:</span> Autograd.</li>
<li><span style="color:lightgreen">Loss computation:</span> PyTorch loss.</li>
<li><span style="color:lightgreen">Parameter updates:</span> PyTorch optimizer.</li>
</ul>

<h1>Phase #1: manual operations</h1>

In [67]:
import numpy as np
# Remember: linear regression models the output as a linear combination of inputs and weights.
# In this example the optimal weight for the x variable is 2 (y = 2*x).
x=np.array([1,2,3,4],dtype=np.float32)
y=np.array([2,4,6,8],dtype=np.float32)

In [68]:
# Weight initialization
w=0.0

In [69]:
# Model prediction
def forward(x):  # forward pass
    return w*x

In [70]:
# loss
def loss(y,y_predicted): # mean squared error (loss)
    return ((y_predicted-y)**2).mean()

print(f'The prediction before training: {forward(5):.3f}')

The prediction before training: 0.000


In [71]:
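# Gradient
# J(w) = 1/N * sum((w*x - y)**2)
# dJ/dw = 1/N * sum(2*x*(w*x - y))
# np.dot sums over the samples and returns a scalar, so this function returns
# N * dJ/dw (the summed gradient); the effective step size in training is N * LR.
def gradient(x,y,y_predicted):
    return np.dot(2*x, y_predicted-y)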
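As a sanity check on the derivative in the comments above, the following sketch compares the analytical mean-squared-error gradient with a central finite-difference estimate, and shows how the summed `np.dot` form relates to it. The helper names `mse`, `analytic_grad`, and `numeric_grad`, and the test weight `w0 = 0.5`, are illustrative assumptions and not part of the original notebook.

```python
import numpy as np

x = np.array([1, 2, 3, 4], dtype=np.float64)
y = np.array([2, 4, 6, 8], dtype=np.float64)

def mse(w):
    # J(w) = 1/N * sum((w*x - y)**2)
    return ((w * x - y) ** 2).mean()

def analytic_grad(w):
    # dJ/dw = 1/N * sum(2*x*(w*x - y))
    return (2 * x * (w * x - y)).mean()

def numeric_grad(w, eps=1e-6):
    # Central finite difference as an independent estimate of dJ/dw
    return (mse(w + eps) - mse(w - eps)) / (2 * eps)

w0 = 0.5  # arbitrary test point (assumption)
summed = np.dot(2 * x, w0 * x - y)  # np.dot sums over samples instead of averaging

print(f'analytic: {analytic_grad(w0):.4f}')
print(f'numeric:  {numeric_grad(w0):.4f}')
print(f'summed / N: {summed / len(x):.4f}')
```

The three printed values should agree, which confirms that the summed form differs from the mean gradient only by the factor N.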

In [72]:
# Training procedure
LR=0.005
N_ITER=20
for epoch in range(N_ITER):
    #prediction
    y_pred=forward(x)
    #loss
    l=loss(y,y_pred)
    #gradient
    dw=gradient(x,y,y_pred)
    # Weight update: go in the negative direction of the gradient
    w-=LR*dw
    if epoch % 1==0:   # prints every epoch; use epoch % 2 == 0 to print every second epoch
        print(f'epoch {epoch+1}: w={w:.3f}, loss={l:.3f}')
        

epoch 1: w=0.600, loss=30.000
epoch 2: w=1.020, loss=14.700
epoch 3: w=1.314, loss=7.203
epoch 4: w=1.520, loss=3.529
epoch 5: w=1.664, loss=1.729
epoch 6: w=1.765, loss=0.847
epoch 7: w=1.835, loss=0.415
epoch 8: w=1.885, loss=0.203
epoch 9: w=1.919, loss=0.100
epoch 10: w=1.944, loss=0.049
epoch 11: w=1.960, loss=0.024
epoch 12: w=1.972, loss=0.012
epoch 13: w=1.981, loss=0.006
epoch 14: w=1.986, loss=0.003
epoch 15: w=1.991, loss=0.001
epoch 16: w=1.993, loss=0.001
epoch 17: w=1.995, loss=0.000
epoch 18: w=1.997, loss=0.000
epoch 19: w=1.998, loss=0.000
epoch 20: w=1.998, loss=0.000
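
The printed weights follow a geometric pattern. Because `y = 2*x` exactly and the update uses the summed gradient `2*sum(x**2)*(w - 2)`, each step multiplies the distance to the optimum by `1 - LR*2*sum(x**2) = 1 - 0.005*60 = 0.7`. The sketch below reproduces a few of the logged weights from that closed form; the helper `w_after` and the constant `W_OPT` are hypothetical names introduced only for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4], dtype=np.float64)
LR = 0.005
W_OPT = 2.0  # optimal weight for y = 2*x

# Per-step contraction factor of the error (w - W_OPT)
rate = 1 - LR * 2 * np.sum(x ** 2)

def w_after(t, w0=0.0):
    # Weight after t updates, starting from w0
    return W_OPT + (w0 - W_OPT) * rate ** t

for t in (1, 2, 3, 20):
    print(f'epoch {t}: w={w_after(t):.3f}')
```

Since the loss is proportional to the squared error in `w`, it shrinks by roughly `0.7**2 = 0.49` per epoch, which matches the drop from 30.000 to 14.700 in the log.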


In [73]:
# Inference
data=np.array([5,3,4])
forward(data)

array([9.99202076, 5.99521245, 7.9936166 ])

<h1>Phase #2: Gradient calculation using Autograd</h1>

In [74]:
import torch
x=torch.tensor([1,2,3,4],dtype=torch.float32)
y=torch.tensor([2,4,6,8],dtype=torch.float32)
w=torch.tensor(0.0,dtype=torch.float32,requires_grad=True)
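
Setting `requires_grad=True` is what lets Autograd record the operations applied to `w`. As a minimal sketch, the snippet below runs a single backward pass at `w = 0` and compares the result with the hand-derived mean gradient. It uses a separate tensor `w_demo` so the notebook's `w` is not touched; `w_demo`, `loss_demo`, and `manual` are illustrative names.

```python
import torch

x = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
y = torch.tensor([2, 4, 6, 8], dtype=torch.float32)
w_demo = torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

# Forward pass and mean squared error, recorded by Autograd
loss_demo = ((w_demo * x - y) ** 2).mean()
loss_demo.backward()  # fills w_demo.grad with d(loss)/dw

# Hand-derived gradient: 1/N * sum(2*x*(w*x - y)), computed outside the graph
manual = (2 * x * (w_demo.detach() * x - y)).mean()

print(f'autograd: {w_demo.grad.item():.1f}, manual: {manual.item():.1f}')
```

Both values are the mean gradient, so with `LR=0.01` the first update moves the weight to 0.3.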

In [75]:
# Model prediction
def forward(x):  # forward pass
    return w*x

# loss
def loss(y,y_predicted): # consider the mean square error (loss)
    return ((y_predicted-y)**2).mean()

print(f'The prediction before training: {forward(5):.3f}')

The prediction before training: 0.000


In [76]:
LR=0.01
N_ITER=20
# The training procedure is similar to the above procedure, except for the gradient calculation
for epoch in range(N_ITER):
    # prediction
    y_pred=forward(x)
    # Loss
    l=loss(y,y_pred)
    # Gradient
    l.backward()  # dl/dw

    # Update rule: the update itself must not be recorded in the computational graph.
    with torch.no_grad():
        w-=LR*w.grad
    
    # Zero the gradient: backward() accumulates into w.grad, so reset it before the next iteration
    w.grad.zero_()
    
    if epoch %2==0:
        print(f'epoch {epoch+1}: w={w:.3f}, loss={l:.3f}')
        
    

epoch 1: w=0.300, loss=30.000
epoch 3: w=0.772, loss=15.660
epoch 5: w=1.113, loss=8.175
epoch 7: w=1.359, loss=4.267
epoch 9: w=1.537, loss=2.228
epoch 11: w=1.665, loss=1.163
epoch 13: w=1.758, loss=0.607
epoch 15: w=1.825, loss=0.317
epoch 17: w=1.874, loss=0.165
epoch 19: w=1.909, loss=0.086
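
The `w.grad.zero_()` call in the loop matters because `backward()` adds to the existing `.grad` value rather than overwriting it. A small sketch of that behavior, using a throwaway tensor `w_demo` and a toy function `3*w` (both assumptions made for illustration):

```python
import torch

w_demo = torch.tensor(1.0, requires_grad=True)

(3 * w_demo).backward()
first = w_demo.grad.item()    # gradient of 3*w with respect to w

(3 * w_demo).backward()
second = w_demo.grad.item()   # accumulated: previous gradient plus the new one

w_demo.grad.zero_()           # in-place reset, as in the training loop
cleared = w_demo.grad.item()

print(first, second, cleared)
```

Without the reset, every epoch would step along the sum of all previous gradients instead of the current one.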


In [77]:
# Inference
data=torch.tensor([5,3,4])
forward(data)

tensor([9.6124, 5.7674, 7.6899], grad_fn=<MulBackward0>)
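
The `grad_fn=<MulBackward0>` in the output shows that the inference call was still recorded by Autograd. When gradients are not needed, one option is to wrap the call in `torch.no_grad()`, the same context manager used for the weight update. The sketch below uses a stand-in weight `w_demo` (an assumption, not the trained `w`) to show the difference.

```python
import torch

w_demo = torch.tensor(2.0, requires_grad=True)  # stand-in for a trained weight
data = torch.tensor([5.0, 3.0, 4.0])

tracked = w_demo * data          # recorded in the computational graph

with torch.no_grad():
    untracked = w_demo * data    # plain tensor, no graph attached

print(tracked.requires_grad, tracked.grad_fn is not None)
print(untracked.requires_grad, untracked.grad_fn is None)
```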