<h1 style="color:red;">What is PyTorch?</h1>

    
PyTorch is a library that helps you build and train machine-learning models to recognize patterns, such as identifying objects in images or understanding language. Its core data structure is the Tensor class (torch.Tensor), which stores and operates on homogeneous multidimensional rectangular arrays of numbers.

In [1]:
import torch
# torch.empty allocates memory without initializing it, so the values shown are arbitrary
x= torch.empty(2,3)
x

tensor([[-6.7799e-09,  3.0756e-41, -6.7759e-09],
        [ 3.0756e-41,  1.1210e-43,  0.0000e+00]])

In [2]:
# torch.rand samples uniformly from the interval [0, 1)
y=torch.rand(2,3)
y

tensor([[0.9574, 0.8304, 0.7130],
        [0.0809, 0.5707, 0.8176]])

In [3]:
z=torch.rand(2,3)
z

tensor([[0.4426, 0.8664, 0.3959],
        [0.9920, 0.9242, 0.0803]])

In [4]:
#We can also specify the data type
m=torch.ones(2,3, dtype=torch.float64)
m.dtype

torch.float64

In [5]:
y+z

tensor([[1.4000, 1.6968, 1.1089],
        [1.0729, 1.4949, 0.8979]])

In [6]:
torch.add(y,z)

tensor([[1.4000, 1.6968, 1.1089],
        [1.0729, 1.4949, 0.8979]])

In [7]:
# Methods ending in an underscore work in place: y itself now holds y+z
y.add_(z)

tensor([[1.4000, 1.6968, 1.1089],
        [1.0729, 1.4949, 0.8979]])

In [8]:
y-z

tensor([[0.9574, 0.8304, 0.7130],
        [0.0809, 0.5707, 0.8176]])

In [9]:
torch.sub(y,z)

tensor([[0.9574, 0.8304, 0.7130],
        [0.0809, 0.5707, 0.8176]])

In [10]:
# Element-wise product (y still holds the in-place sum from In [7])
torch.mul(y,z)

tensor([[0.6196, 1.4700, 0.4390],
        [1.0644, 1.3815, 0.0721]])

In [11]:
# Slicing: print the full tensor, then all rows of columns 1 onward, then all rows of column 0 (indexing with a list keeps the result 2-D)
k=torch.rand(5,3)
print(k)
print(k[:,1:])
print(k[:,[0]])

tensor([[0.6463, 0.2005, 0.7957],
        [0.0374, 0.9845, 0.4592],
        [0.9075, 0.4644, 0.7022],
        [0.5143, 0.9499, 0.2334],
        [0.7156, 0.5222, 0.1222]])
tensor([[0.2005, 0.7957],
        [0.9845, 0.4592],
        [0.4644, 0.7022],
        [0.9499, 0.2334],
        [0.5222, 0.1222]])
tensor([[0.6463],
        [0.0374],
        [0.9075],
        [0.5143],
        [0.7156]])
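The slices above can be compared by shape. The following is a minimal sketch that uses a hypothetical tensor built with `torch.arange` (not part of the original notebook) so the values are predictable. It also shows the "one row, all columns" case that the cell above does not print.

```python
import torch

# A small tensor with known values so the slices are easy to read
k = torch.arange(15).reshape(5, 3)

col_1d = k[:, 0]      # all rows, column 0, returned as a 1-D tensor
col_2d = k[:, [0]]    # all rows, column 0, kept as a 2-D column
row_1d = k[0, :]      # row 0, all columns
tail = k[:, 1:]       # all rows, columns 1 onward

print(col_1d.shape, col_2d.shape, row_1d.shape, tail.shape)
```

Indexing with an integer drops that dimension, while indexing with a list or a slice keeps it.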


In [12]:
# .item() returns the value of a single-element tensor as a plain Python number
k[0,0].item()

0.6462954878807068

In [13]:
#Convert a tensor to a NumPy array
import numpy as np
a= torch.ones(1,5)
b= a.numpy()
a


tensor([[1., 1., 1., 1., 1.]])

In [14]:
b

array([[1., 1., 1., 1., 1.]], dtype=float32)

In [15]:
#Convert a numpy array to a tensor
c=np.ones(5)
d=torch.from_numpy(c)
d

tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
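One detail about the two conversions above is worth checking: for CPU tensors, `.numpy()` and `torch.from_numpy()` share the underlying memory instead of copying it. The sketch below illustrates this; the variable names are chosen here for illustration, and `torch.tensor(c)` is used as one way to get an independent copy.

```python
import numpy as np
import torch

# Tensor -> NumPy: b is a view of a's memory
a = torch.ones(5)
b = a.numpy()
a.add_(1)                      # in-place change on the tensor
print(b)                       # the NumPy array sees the change

# NumPy -> tensor: d is a view of c's memory
c = np.ones(5)
d = torch.from_numpy(c)
d_copy = torch.tensor(c)       # torch.tensor copies the data
c += 1                         # in-place change on the array
print(d)                       # the shared tensor sees the change
print(d_copy)                  # the copy does not
```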

In [16]:
# Passing requires_grad=True (the default is False) tells autograd to track operations on this tensor, so gradients can be computed later for optimization
e=torch.ones(3,4, requires_grad=True)
e

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]], requires_grad=True)

<span style="color:#FF5733">This section covers autograd and how to use it to compute gradients, which are what drive optimization.</span>

In [17]:
x1=torch.randn(3, requires_grad=True)
x1

tensor([0.7178, 0.9327, 1.1487], requires_grad=True)

<span style="color:blue">The requires_grad=True argument tells PyTorch that we want to compute gradients with respect to this tensor during backpropagation.</span>

In [18]:
y1=x1+2
y1

tensor([2.7178, 2.9327, 3.1487], grad_fn=<AddBackward0>)

<span style="color:blue">grad_fn shows that y1 was created by an addition operation. PyTorch has recorded this operation in the computation graph, which is what lets it compute gradients during the backward pass.</span>
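To make the role of grad_fn concrete, here is a small sketch with hypothetical tensors `x`, `y`, and `z` (not the notebook's `x1` and `y1`). It shows that a tensor created directly by the user is a leaf with no grad_fn, while results of operations carry one, and that `backward()` follows those recorded operations back to the leaf.

```python
import torch

x = torch.ones(3, requires_grad=True)   # created by the user: a leaf tensor
y = x + 2                               # result of an addition
z = (y * y).sum()                       # scalar built from y

print(x.is_leaf, x.grad_fn)             # leaf tensors have no grad_fn
print(y.grad_fn)                        # records the addition
print(z.grad_fn)                        # records the sum

# Walk the recorded graph backwards: dz/dx = 2 * (x + 2)
z.backward()
print(x.grad)
```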

In [19]:
y2=x1*x1*2
y2

tensor([1.0304, 1.7400, 2.6389], grad_fn=<MulBackward0>)

In [20]:
y3= y2.mean()
y3

tensor(1.8031, grad_fn=<MeanBackward0>)

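<span style="color:blue">To compute the gradient, call y3.backward(), which evaluates dy3/dx1 and stores it in x1.grad. For a scalar output no argument is needed. For a non-scalar output such as y2, backward() needs a tensor with the same shape as that output (here also the shape of x1), which weights each component of the output.</span>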
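The two cases can be checked against hand-computed derivatives. This is a minimal sketch with a hypothetical input `x = [1, 2, 3]` in place of the random `x1`, using the same function 2·x² as the notebook. For the scalar case the expected gradient is 4x/3, and for the vector case it is v·4x.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Scalar output: backward() needs no argument
y_scalar = (2 * x * x).mean()
y_scalar.backward()
grad_scalar = x.grad.clone()            # d(mean(2x^2))/dx = 4x / 3
x.grad.zero_()                          # reset before the next backward pass

# Vector output: backward() needs a tensor shaped like the output
y_vec = 2 * x * x
v = torch.tensor([0.1, 0.02, 0.003])
y_vec.backward(v)
grad_vec = x.grad.clone()               # v * d(2x^2)/dx = v * 4x

print(grad_scalar)
print(grad_vec)
```

Each pass builds its own graph from `x`, so the second `backward()` does not depend on the first.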

In [21]:
# y3.backward()  # scalar output: no argument needed (left commented out so the graph is still available for y2.backward below)


In [22]:
v= torch.tensor([0.1,0.02,0.003], dtype=torch.float32)
y2.backward(v) #for vector

In [23]:
x1.grad

tensor([0.2871, 0.0746, 0.0138])

<span style="color:blue">Suppose we no longer want PyTorch to track a tensor's history in the computational graph. We essentially have three options: 
1) x1.requires_grad_(False): a trailing underscore means the method modifies the tensor in place
2) y2.detach(): returns a new tensor with the same values that is detached from the graph
3) with torch.no_grad(): operations inside the block are not tracked</span>
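The three options listed above can be sketched side by side. The tensor names here are hypothetical, and the in-place option is applied last so that the first two still start from a tensor that requires gradients.

```python
import torch

x = torch.randn(3, requires_grad=True)

# Option 2: detach() returns a new tensor cut off from the graph
y = (x * 2).detach()

# Option 3: nothing computed inside the block is tracked
with torch.no_grad():
    z = x * 2

# Option 1: switch tracking off on x itself, in place
x.requires_grad_(False)
w = x * 2

print(y.requires_grad, z.requires_grad, x.requires_grad, w.requires_grad)
```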

In [24]:
x1.requires_grad_(False)
x1

tensor([0.7178, 0.9327, 1.1487])

In [25]:
y2.detach()


tensor([1.0304, 1.7400, 2.6389])

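<span style="color:blue">Now consider a training loop. Each call to backward() adds to the values already stored in weights.grad instead of overwriting them, so the gradient has to be reset to zero after each iteration. The first loop below shows the accumulation; the second zeroes the gradient and gets the same value every time.</span>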

In [26]:
weights=torch.ones(4, requires_grad=True)
for epoch in range(3):
    model_output=(weights*3).sum()
    model_output.backward()
    print(weights.grad)

tensor([3., 3., 3., 3.])
tensor([6., 6., 6., 6.])
tensor([9., 9., 9., 9.])


In [27]:
weights=torch.ones(4, requires_grad=True)
for epoch in range(3):
    model_output=(weights*3).sum()
    model_output.backward()
    print(weights.grad)
    weights.grad.zero_()

tensor([3., 3., 3., 3.])
tensor([3., 3., 3., 3.])
tensor([3., 3., 3., 3.])


<span style="color:red"> Backpropagation</span>

In [28]:
import torch
x=torch.tensor(1.0)
y=torch.tensor(2.0)

w=torch.tensor(1.0, requires_grad=True)

#Forward pass and  compute the loss
y_hat=w*x
loss=(y_hat-y)**2

print(loss)

#Backward pass

loss.backward()
print(w.grad)

## update weights



tensor(1., grad_fn=<PowBackward0>)
tensor(-2.)
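The cell above stops at the `## update weights` comment. A minimal sketch of how that step could be completed is shown here, assuming a plain gradient-descent update and a hypothetical learning rate of 0.01 (neither is specified in the original cell).

```python
import torch

x = torch.tensor(1.0)
y = torch.tensor(2.0)
w = torch.tensor(1.0, requires_grad=True)

learning_rate = 0.01                    # hypothetical value, chosen for illustration

# Forward pass and loss
loss = (w * x - y) ** 2

# Backward pass: dloss/dw = 2 * x * (w*x - y) = -2
loss.backward()

# Update the weight without recording the update in the graph
with torch.no_grad():
    w -= learning_rate * w.grad

# Reset the gradient so the next backward pass starts from zero
w.grad.zero_()

print(w)                                # 1.0 - 0.01 * (-2.0) = 1.02
```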


<span style="color:blue"> For this part we do everything manually with NumPy arrays and then translate the idea to PyTorch. Each training step computes the prediction, then the loss, then the gradient, and finally updates the parameter.</span>
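For reference, the gradient used in a manual implementation follows from differentiating the mean squared error of the model f(x) = w·x. The symbols N (number of samples) and η (learning rate) are notation introduced here, not names from the notebook.

```latex
J(w) = \frac{1}{N} \sum_{i=1}^{N} (w x_i - y_i)^2
\qquad
\frac{dJ}{dw} = \frac{1}{N} \sum_{i=1}^{N} 2 x_i (w x_i - y_i)
\qquad
w \leftarrow w - \eta \, \frac{dJ}{dw}
```

With X = [1, 2, 3, 4], Y = [2, 4, 6, 8] and w = 0, this gives dJ/dw = (1/4) · 2 · (−2 − 8 − 18 − 32) = −30.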


In [29]:
import numpy as np 

# f=w*x
X=np.array([1,2,3,4], dtype=np.float32)
Y=np.array([2,4,6,8], dtype=np.float32)

w=0.0

#model prediction
def forward(x): 
    return w*x

#loss =MSE

def loss(y,y_predicted):
    return((y_predicted-y)**2).mean()

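#gradient of the loss with respect to w
#note: np.dot already sums over the samples and .mean() on the resulting scalar is a no-op,
#so this returns N times the gradient of the MSE (N=4 here)
def gradient(x,y,y_predicted):
    return np.dot(2*x,y_predicted-y).mean()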

print(f'Prediction before training: f(5)={forward(5):.3f}')

#Training
learning_rate = 0.01
n_iters=30

for epoch in range(n_iters):
    #prediction=forward pass 
    y_pred=forward(X)

    #loss
    l=loss(Y,y_pred)

    #gradient
    dw=gradient(X,Y,y_pred)

    #update weights
    w-=learning_rate*dw
    if epoch %2==0:
        print(f'epoch{epoch+1}: w={w:.3f}, loss= {l:.8f}')

print(f'Prediction after training : f(5) = {forward(5):.3f}')

    


    
    

Prediction before training: f(5)=0.000
epoch1: w=1.200, loss= 30.00000000
epoch3: w=1.872, loss= 0.76800019
epoch5: w=1.980, loss= 0.01966083
epoch7: w=1.997, loss= 0.00050331
epoch9: w=1.999, loss= 0.00001288
epoch11: w=2.000, loss= 0.00000033
epoch13: w=2.000, loss= 0.00000001
epoch15: w=2.000, loss= 0.00000000
epoch17: w=2.000, loss= 0.00000000
epoch19: w=2.000, loss= 0.00000000
epoch21: w=2.000, loss= 0.00000000
epoch23: w=2.000, loss= 0.00000000
epoch25: w=2.000, loss= 0.00000000
epoch27: w=2.000, loss= 0.00000000
epoch29: w=2.000, loss= 0.00000000
Prediction after training : f(5) = 10.000


<span style="color:blue"> Let's see how this translates to PyTorch: the gradient is now computed by autograd with backward(), while the forward pass and the loss are still written by hand.</span>

In [30]:
import torch

# f=w*x
X=torch.tensor([1,2,3,4], dtype=torch.float32)
Y=torch.tensor([2,4,6,8], dtype=torch.float32)

w=torch.tensor(0.0, dtype=torch.float32, requires_grad=True)

#model prediction
def forward(x): 
    return w*x

#loss =MSE

def loss(y,y_predicted):
    return((y_predicted-y)**2).mean()

#gradient: computed by autograd through l.backward(), so no manual gradient function is needed

print(f'Prediction before training: f(5)={forward(5):.3f}')

#Training
learning_rate = 0.01
n_iters=100

for epoch in range(n_iters):
    #prediction=forward pass 
    y_pred=forward(X)

    #loss
    l=loss(Y,y_pred)

    #gradient= Backward pass
    l.backward() #dl/dw

    #update weights
    with torch.no_grad():
        w-=learning_rate*w.grad

    #zero gradient
    w.grad.zero_()
    
    if epoch %2==0:
        print(f'epoch{epoch+1}: w={w:.3f}, loss= {l:.8f}')

print(f'Prediction after training : f(5) = {forward(5):.3f}')

    


    
    

Prediction before training: f(5)=0.000
epoch1: w=0.300, loss= 30.00000000
epoch3: w=0.772, loss= 15.66018772
epoch5: w=1.113, loss= 8.17471695
epoch7: w=1.359, loss= 4.26725292
epoch9: w=1.537, loss= 2.22753215
epoch11: w=1.665, loss= 1.16278565
epoch13: w=1.758, loss= 0.60698116
epoch15: w=1.825, loss= 0.31684780
epoch17: w=1.874, loss= 0.16539653
epoch19: w=1.909, loss= 0.08633806
epoch21: w=1.934, loss= 0.04506890
epoch23: w=1.952, loss= 0.02352631
epoch25: w=1.966, loss= 0.01228084
epoch27: w=1.975, loss= 0.00641066
epoch29: w=1.982, loss= 0.00334642
epoch31: w=1.987, loss= 0.00174685
epoch33: w=1.991, loss= 0.00091188
epoch35: w=1.993, loss= 0.00047601
epoch37: w=1.995, loss= 0.00024848
epoch39: w=1.996, loss= 0.00012971
epoch41: w=1.997, loss= 0.00006770
epoch43: w=1.998, loss= 0.00003534
epoch45: w=1.999, loss= 0.00001845
epoch47: w=1.999, loss= 0.00000963
epoch49: w=1.999, loss= 0.00000503
epoch51: w=1.999, loss= 0.00000262
epoch53: w=2.000, loss= 0.00000137
epoch55: w=2.000, l

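<span style="color:blue">The autograd version converges more slowly than the manual one, but not because autograd is less efficient. The manual gradient used np.dot, which sums over the four samples, so its steps were four times larger than the MSE gradient that backward() computes (w=1.200 versus w=0.300 after the first epoch).</span>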
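The difference between the two runs can be traced to the first gradient. The sketch below evaluates, at w = 0, the `np.dot` expression from the manual cell, the element-wise mean, and the autograd result. The names `grad_sum`, `grad_mean`, and `w0` are chosen here for illustration.

```python
import numpy as np
import torch

X = np.array([1, 2, 3, 4], dtype=np.float32)
Y = np.array([2, 4, 6, 8], dtype=np.float32)
w0 = 0.0
y_pred = w0 * X

# Expression from the manual cell: np.dot sums over the samples
grad_sum = np.dot(2 * X, y_pred - Y)

# Gradient of the mean squared error: average over the samples
grad_mean = (2 * X * (y_pred - Y)).mean()

# Autograd on the same loss
w = torch.tensor(w0, requires_grad=True)
loss = ((w * torch.from_numpy(X) - torch.from_numpy(Y)) ** 2).mean()
loss.backward()

print(grad_sum, grad_mean, w.grad.item())
```

With a learning rate of 0.01, the summed gradient moves w to 1.2 in one step and the mean gradient moves it to 0.3, which matches the first line of each training log.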

<span style="color:blue"> Now let's replace the hand-written model, loss, and weight update with PyTorch's built-in modules: nn.Linear, nn.MSELoss, and torch.optim.SGD.</span>

In [31]:
import torch
import torch.nn as nn # neural network module
# f=w*x
X=torch.tensor([[1],[2],[3],[4]], dtype=torch.float32)
Y=torch.tensor([[2],[4],[6],[8]], dtype=torch.float32)

X_test=torch.tensor([5], dtype=torch.float32)
n_samples, n_features = X.shape
print(n_samples, n_features)

input_size=n_features
output_size=n_features
model=nn.Linear(input_size, output_size)

#the MSE loss and the optimizer are defined below; the gradient comes from autograd via l.backward()

print(f'Prediction before training: f(5)={model(X_test).item():.3f}')

#Training
learning_rate = 0.01
n_iters=100

loss = nn.MSELoss()
optimizer= torch.optim.SGD(model.parameters(), lr=learning_rate)
for epoch in range(n_iters):
    #prediction=forward pass 
    y_pred=model(X)

    #loss: nn.MSELoss expects (prediction, target); MSE is symmetric, so the value is unchanged
    l=loss(y_pred,Y)

    #gradient= Backward pass
    l.backward() #dl/dw

    #update weights
    optimizer.step()
    #zero gradient
    optimizer.zero_grad()
    
    if epoch %2==0:
        [w,b]=model.parameters()
        print(f'epoch{epoch+1}: w={w[0][0]:.3f}, loss= {l:.8f}')

print(f'Prediction after training : f(5) = {model(X_test).item():.3f}')

    


    
    

4 1
Prediction before training: f(5)=0.131
epoch1: w=0.357, loss= 29.84667778
epoch3: w=0.814, loss= 14.38744164
epoch5: w=1.131, loss= 6.94411182
epoch7: w=1.352, loss= 3.36018634
epoch9: w=1.505, loss= 1.63444066
epoch11: w=1.612, loss= 0.80335093
epoch13: w=1.686, loss= 0.40301198
epoch15: w=1.738, loss= 0.21006858
epoch17: w=1.774, loss= 0.11698116
epoch19: w=1.799, loss= 0.07197370
epoch21: w=1.817, loss= 0.05011751
epoch23: w=1.830, loss= 0.03941014
epoch25: w=1.839, loss= 0.03407263
epoch27: w=1.846, loss= 0.03132283
epoch29: w=1.850, loss= 0.02982105
epoch31: w=1.854, loss= 0.02892222
epoch33: w=1.857, loss= 0.02831589
epoch35: w=1.859, loss= 0.02785236
epoch37: w=1.861, loss= 0.02745970
epoch39: w=1.862, loss= 0.02710311
epoch41: w=1.863, loss= 0.02676594
epoch43: w=1.864, loss= 0.02644010
epoch45: w=1.865, loss= 0.02612161
epoch47: w=1.866, loss= 0.02580859
epoch49: w=1.867, loss= 0.02550017
epoch51: w=1.868, loss= 0.02519578
epoch53: w=1.869, loss= 0.02489522
epoch55: w=1.87
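In the log above the weight is still around 1.87 at epoch 55, while the hand-written model f = w·x had reached 2.000 by that point. One likely contributing factor is that `nn.Linear` includes a bias term by default, so the model being fitted is f = w·x + b, and the weight and bias are learned together. The sketch below is an assumption-labeled variant, not the notebook's method: it passes `bias=False` so the layer matches f = w·x, and fixes a seed for repeatability.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)                    # arbitrary seed, only for repeatability

X = torch.tensor([[1], [2], [3], [4]], dtype=torch.float32)
Y = torch.tensor([[2], [4], [6], [8]], dtype=torch.float32)

# bias=False makes the layer compute w*x only (assumption for this comparison)
model = nn.Linear(1, 1, bias=False)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(100):
    loss = loss_fn(model(X), Y)         # forward pass and loss
    loss.backward()                     # gradients via autograd
    optimizer.step()                    # update the weight
    optimizer.zero_grad()               # reset gradients for the next step

w_final = model.weight.item()
print(f'w = {w_final:.3f}')
```

Without the bias, each step shrinks the distance between w and 2 by a constant factor of 0.85, so 100 steps are enough to reach 2 to within float32 precision.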

<span style="color:blue"> Here we used the ready-made nn.Linear model, but we can also define a custom model by subclassing nn.Module, as follows.</span>

In [32]:
import torch
import torch.nn as nn # neural network module
# f=w*x
X=torch.tensor([[1],[2],[3],[4]], dtype=torch.float32)
Y=torch.tensor([[2],[4],[6],[8]], dtype=torch.float32)

X_test=torch.tensor([5], dtype=torch.float32)
n_samples, n_features = X.shape
print(n_samples, n_features)

input_size=n_features
output_size=n_features

class LinearRegression(nn.Module):

    def __init__(self, input_dim, output_dim):
        super(LinearRegression,self).__init__()
        self.lin = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        return self.lin(x)
    
#model=nn.Linear(input_size, output_size)
model=LinearRegression(input_size, output_size)
#the MSE loss and the optimizer are defined below; the gradient comes from autograd via l.backward()

print(f'Prediction before training: f(5)={model(X_test).item():.3f}')

#Training
learning_rate = 0.01
n_iters=100

loss = nn.MSELoss()
optimizer= torch.optim.SGD(model.parameters(), lr=learning_rate)
for epoch in range(n_iters):
    #prediction=forward pass 
    y_pred=model(X)

    #loss: nn.MSELoss expects (prediction, target); MSE is symmetric, so the value is unchanged
    l=loss(y_pred,Y)

    #gradient= Backward pass
    l.backward() #dl/dw

    #update weights
    optimizer.step()
    #zero gradient
    optimizer.zero_grad()
    
    if epoch %2==0:
        [w,b]=model.parameters()
        print(f'epoch{epoch+1}: w={w[0][0]:.3f}, loss= {l:.8f}')

print(f'Prediction after training : f(5) = {model(X_test).item():.3f}')

    


    
    

4 1
Prediction before training: f(5)=-3.447
epoch1: w=-0.234, loss= 55.52162170
epoch3: w=0.389, loss= 26.76190567
epoch5: w=0.822, loss= 12.91467190
epoch7: w=1.123, loss= 6.24732018
epoch9: w=1.332, loss= 3.03685331
epoch11: w=1.477, loss= 1.49077260
epoch13: w=1.578, loss= 0.74604261
epoch15: w=1.649, loss= 0.38714239
epoch17: w=1.698, loss= 0.21401006
epoch19: w=1.733, loss= 0.13032314
epoch21: w=1.757, loss= 0.08970562
epoch23: w=1.774, loss= 0.06982809
epoch25: w=1.787, loss= 0.05994007
epoch27: w=1.796, loss= 0.05486553
epoch29: w=1.802, loss= 0.05211221
epoch31: w=1.807, loss= 0.05048028
epoch33: w=1.811, loss= 0.04939189
epoch35: w=1.813, loss= 0.04856873
epoch37: w=1.816, loss= 0.04787694
epoch39: w=1.818, loss= 0.04725181
epoch41: w=1.819, loss= 0.04666230
epoch43: w=1.821, loss= 0.04609346
epoch45: w=1.822, loss= 0.04553780
epoch47: w=1.824, loss= 0.04499200
epoch49: w=1.825, loss= 0.04445421
epoch51: w=1.826, loss= 0.04392352
epoch53: w=1.827, loss= 0.04339956
epoch55: w=1