
# Training Neural Networks

The NN we have constructed isn't trained yet. With randomly initialised weights it cannot tell the classes apart, so it assigns roughly equal probability to every class. Training the NN is therefore required to get meaningful results.

To measure how far a prediction is from the correct label, we calculate a loss function. Mean squared loss is a common choice for regression; for classification, as in this notebook, we use cross-entropy loss.
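
The code cells further down use nn.CrossEntropyLoss. As a small worked illustration in plain Python (a sketch, not the notebook's own code; the function name cross_entropy is hypothetical), the cross-entropy loss for one sample is the negative log of the probability given to the true class. A model that spreads its probability equally over the 10 digit classes gets a loss of ln(10), about 2.30, which is the value to expect from an untrained network.

```python
import math

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class (one sample)."""
    return -math.log(probs[true_class])

# An untrained network is roughly uniform over the 10 digit classes.
uniform_probs = [0.1] * 10
uniform_loss = cross_entropy(uniform_probs, true_class=3)
print(round(uniform_loss, 4))  # 2.3026, i.e. ln(10)

# A confident, correct prediction gives a much smaller loss.
confident_probs = [0.01] * 10
confident_probs[3] = 0.91
confident_loss = cross_entropy(confident_probs, true_class=3)
```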

## Gradient Descent and Back Propagation
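To minimise the loss, we use a technique known as gradient descent. The gradient of the loss with respect to the weights points in the direction in which the loss increases fastest, so each update moves the weights a small step in the opposite direction. The gradients themselves are computed with **back propagation**, which applies the chain rule backwards through the layers of the NN.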
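
A minimal sketch of the idea on a one-parameter toy problem, in plain Python. The loss function, starting point and learning rate are all assumptions chosen for illustration, not values from this notebook.

```python
def loss_fn(w):
    """A toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def grad_fn(w):
    """Analytic derivative of loss_fn with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0              # arbitrary starting point
learning_rate = 0.1  # step size (assumed value for illustration)

for step in range(100):
    # Move against the gradient, i.e. in the direction that lowers the loss.
    w = w - learning_rate * grad_fn(w)

print(round(w, 4))  # 3.0
```

In a real network the derivative is not written by hand; back propagation computes it for every weight.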



In [0]:
import numpy as np
from torchvision import datasets, transforms
import torch
import matplotlib.pyplot as plt
from torch import nn
import torch.nn.functional as F

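# Convert images to tensors and normalize pixel values from [0, 1] to [-1, 1]
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

# Download the MNIST training set and load it in shuffled batches of 64
trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)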
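
The arithmetic behind the transform above can be sketched in plain Python. ToTensor() scales pixel values to the range [0, 1], and Normalize((0.5,), (0.5,)) then computes (x - mean) / std for each pixel. The function name normalize below is hypothetical and only mirrors that formula.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Same arithmetic as Normalize: subtract the mean, divide by the std."""
    return (pixel - mean) / std

print(normalize(0.0))  # -1.0 (black pixel)
print(normalize(0.5))  # 0.0
print(normalize(1.0))  # 1.0 (white pixel)
```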

In [7]:
# Building Feed Forward Network and calculating Loss

model = nn.Sequential(nn.Linear(784,128),
                      nn.ReLU(),
                      nn.Linear(128,64),
                      nn.ReLU(),
                      nn.Linear(64,10))


criterion = nn.CrossEntropyLoss()
images, labels = next(iter(trainloader))

# Flatten each 1x28x28 image into a 784-long vector
images = images.view(images.shape[0], -1)
logits = model(images)
loss = criterion(logits, labels)

print(loss)

tensor(2.3012, grad_fn=<NllLossBackward>)
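
The flattening step in the cell above can be checked on its own. This sketch uses a dummy batch of zeros with the same shape as an MNIST batch (an assumption, so that nothing needs to be downloaded).

```python
import torch

# A dummy batch shaped like MNIST: 64 images, 1 channel, 28x28 pixels.
batch = torch.zeros(64, 1, 28, 28)

# Keep the batch dimension and let -1 infer the rest (1 * 28 * 28 = 784).
flat = batch.view(batch.shape[0], -1)
print(flat.shape)  # torch.Size([64, 784])
```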


We can add nn.LogSoftmax (or F.log_softmax) as the final layer so that the model outputs log-probabilities. The actual probabilities are then recovered with torch.exp(output).

With log-probabilities as the output, we calculate the loss using nn.NLLLoss.

In [17]:
model = nn.Sequential(nn.Linear(784,128),
                      nn.ReLU(),
                      nn.Linear(128,64),
                      nn.ReLU(),
                      nn.Linear(64,10),
                      nn.LogSoftmax(dim=1))


criterion = nn.NLLLoss()
images, labels = next(iter(trainloader))

images = images.view(images.shape[0], -1)
logps = model(images)  # the model now outputs log-probabilities, not raw logits
loss = criterion(logps, labels)

print(loss)

tensor(2.3222, grad_fn=<NllLossBackward>)
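
The two setups compute the same quantity: nn.CrossEntropyLoss on raw scores matches nn.LogSoftmax followed by nn.NLLLoss. A short sketch on made-up scores and labels (both are assumptions for illustration) shows this.

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(4, 10)           # raw scores for 4 samples, 10 classes
labels = torch.tensor([1, 0, 7, 3])   # made-up target classes

# Option 1: cross-entropy applied directly to raw scores.
ce_loss = nn.CrossEntropyLoss()(logits, labels)

# Option 2: log-softmax first, then negative log-likelihood.
log_probs = nn.LogSoftmax(dim=1)(logits)
nll_loss = nn.NLLLoss()(log_probs, labels)

print(torch.allclose(ce_loss, nll_loss))  # True

# Exponentiating log-probabilities recovers probabilities that sum to 1 per row.
probs = torch.exp(log_probs)
```

This is also why both cells above print a grad_fn of NllLossBackward.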


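PyTorch has a module called **Autograd** which keeps track of the operations performed on tensors and, when asked (by calling backward()), uses that record to calculate gradients. The tracking can also be switched off, for example with torch.no_grad().

The gradients are calculated with respect to tensors created with requires_grad=True.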

In [0]:
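x = torch.randn(2, 2, requires_grad=True)  # track operations on x
print(x)
y = x**2
z = y.mean()
print(x.grad)   # None: backward() has not been called yet
z.backward()    # compute dz/dx
print(x.grad)   # equals x / 2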
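
As a quick check of what Autograd computes, the sketch below uses a fixed tensor of ones (an assumption, to make the result easy to verify by hand) and also shows gradient tracking being switched off with torch.no_grad().

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
z = (x ** 2).mean()
z.backward()

# d/dx mean(x^2) = 2x / 4 = x / 2 for a 2x2 tensor.
print(torch.allclose(x.grad, x.detach() / 2))  # True

# Inside torch.no_grad() operations are not tracked.
with torch.no_grad():
    untracked = x * 2
tracked = x * 2
print(untracked.requires_grad, tracked.requires_grad)  # False True
```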

In [19]:
# using it in NN

model = nn.Sequential(nn.Linear(784,128),
                      nn.ReLU(),
                      nn.Linear(128,64),
                      nn.ReLU(),
                      nn.Linear(64,10),
                      nn.LogSoftmax(dim=1))


criterion = nn.NLLLoss()
images, labels = next(iter(trainloader))

images = images.view(images.shape[0], -1)
logits = model(images)
loss = criterion(logits, labels)

print("Before backward:", model[0].weight.grad)
loss.backward()
print("After backward:", model[0].weight.grad)

Before backward: None
After backward: tensor([[ 2.6282e-03,  2.6282e-03,  2.6282e-03,  ...,  2.6282e-03,
          2.6282e-03,  2.6282e-03],
        [-1.1559e-03, -1.1559e-03, -1.1559e-03,  ..., -1.1559e-03,
         -1.1559e-03, -1.1559e-03],
        [ 2.4392e-03,  2.4392e-03,  2.4392e-03,  ...,  2.4392e-03,
          2.4392e-03,  2.4392e-03],
        ...,
        [-1.4205e-03, -1.4205e-03, -1.4205e-03,  ..., -1.4205e-03,
         -1.4205e-03, -1.4205e-03],
        [-2.0766e-04, -2.0766e-04, -2.0766e-04,  ..., -2.0766e-04,
         -2.0766e-04, -2.0766e-04],
        [ 3.9423e-05,  3.9423e-05,  3.9423e-05,  ...,  3.9423e-05,
          3.9423e-05,  3.9423e-05]])


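## Training!

To update the weights using the gradients, we use an optimizer from the optim module of torch. Here it is stochastic gradient descent, optim.SGD.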

In [20]:
from torch import optim
optimizer = optim.SGD(model.parameters(), lr=0.01)


print('Initial weights -', model[0].weight)

images, labels = next(iter(trainloader))
images = images.view(images.shape[0], -1)  # flatten without assuming a batch size of 64

optimizer.zero_grad()  # Clears old gradients, which would otherwise accumulate - VERY IMPORTANT

output = model(images)
loss = criterion(output, labels)
loss.backward()
print("Gradient -", model[0].weight.grad)

Initial weights - Parameter containing:
tensor([[ 0.0098,  0.0087, -0.0070,  ...,  0.0335,  0.0150,  0.0325],
        [ 0.0081, -0.0270,  0.0135,  ..., -0.0288, -0.0058, -0.0119],
        [-0.0108,  0.0004,  0.0320,  ...,  0.0213,  0.0070, -0.0161],
        ...,
        [-0.0091,  0.0233,  0.0271,  ...,  0.0088, -0.0228, -0.0226],
        [-0.0196, -0.0241,  0.0256,  ...,  0.0118, -0.0292,  0.0024],
        [-0.0204, -0.0056,  0.0226,  ..., -0.0205, -0.0016,  0.0053]],
       requires_grad=True)
Gradient - tensor([[ 0.0034,  0.0034,  0.0034,  ...,  0.0034,  0.0034,  0.0034],
        [ 0.0049,  0.0049,  0.0049,  ...,  0.0049,  0.0049,  0.0049],
        [ 0.0027,  0.0027,  0.0027,  ...,  0.0027,  0.0027,  0.0027],
        ...,
        [-0.0025, -0.0025, -0.0025,  ..., -0.0025, -0.0025, -0.0025],
        [-0.0001, -0.0001, -0.0001,  ..., -0.0001, -0.0001, -0.0001],
        [ 0.0011,  0.0011,  0.0011,  ...,  0.0011,  0.0011,  0.0011]])
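
Why clearing the gradients matters: backward() adds new gradients to whatever is already stored in .grad. A minimal sketch with a two-element tensor (made-up values) shows this. The manual w.grad.zero_() call stands in for optimizer.zero_grad() in this illustration.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w ** 2).sum().backward()
first = w.grad.clone()        # gradient 2*w = [2., 4.]

(w ** 2).sum().backward()     # no reset: the new gradient is added on top
accumulated = w.grad.clone()  # [4., 8.]

w.grad.zero_()                # manual reset of the stored gradient
(w ** 2).sum().backward()
after_reset = w.grad.clone()  # back to [2., 4.]

print(first, accumulated, after_reset)
```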


In [21]:
optimizer.step()
print('Updated weights -', model[0].weight)

Updated weights - Parameter containing:
tensor([[ 0.0097,  0.0086, -0.0071,  ...,  0.0334,  0.0150,  0.0324],
        [ 0.0081, -0.0270,  0.0135,  ..., -0.0289, -0.0059, -0.0120],
        [-0.0108,  0.0004,  0.0320,  ...,  0.0212,  0.0069, -0.0161],
        ...,
        [-0.0091,  0.0233,  0.0272,  ...,  0.0088, -0.0227, -0.0226],
        [-0.0196, -0.0241,  0.0256,  ...,  0.0118, -0.0292,  0.0024],
        [-0.0204, -0.0056,  0.0226,  ..., -0.0205, -0.0016,  0.0053]],
       requires_grad=True)
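
For plain SGD (no momentum or weight decay, as configured above), optimizer.step() subtracts the learning rate times the gradient from each weight. A small sketch on a made-up two-element parameter compares the optimizer against that rule written out by hand.

```python
import torch
from torch import optim

w = torch.tensor([0.5, -1.0], requires_grad=True)
optimizer = optim.SGD([w], lr=0.01)

loss = (w ** 2).sum()
loss.backward()

# Expected result of a plain SGD step: w - lr * grad
expected = w.detach() - 0.01 * w.grad

optimizer.step()
print(torch.allclose(w.detach(), expected))  # True
```

With lr=0.01 and gradients of the order of 0.001, each step changes the weights only in the fourth or fifth decimal place, which matches the small differences between the initial and updated weights printed above.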


In [0]:
## COMPLETE MODEL
model = nn.Sequential(nn.Linear(784, 128),
                      nn.ReLU(),
                      nn.Linear(128, 64),
                      nn.ReLU(),
                      nn.Linear(64, 10),
                      nn.LogSoftmax(dim=1))

criterion = nn.NLLLoss()
optimizer = optim.SGD(model.parameters(), lr=0.03)

epochs = 5
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten each image into a 784-long vector
        images = images.view(images.shape[0], -1)

        optimizer.zero_grad()  # Clears old gradients - VERY IMPORTANT
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    # Average loss per batch for this epoch
    print(f"Training loss:{running_loss/len(trainloader)}")
    
    


Training loss:0.6049599748557565
Training loss:0.27909948513197747
Training loss:0.21470904898748341
Training loss:0.17165970954813684
Training loss:0.14449586481380183
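
The same loop pattern (zero_grad, forward pass, loss, backward, step) can be tried without downloading MNIST. The sketch below uses randomly generated features and labels as a stand-in dataset. The sizes, learning rate and the name toy_model are assumptions for illustration only.

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Synthetic stand-in for a dataset: 256 samples, 20 features, 3 classes.
features = torch.randn(256, 20)
targets = (features @ torch.randn(20, 3)).argmax(dim=1)

toy_model = nn.Sequential(nn.Linear(20, 32),
                          nn.ReLU(),
                          nn.Linear(32, 3),
                          nn.LogSoftmax(dim=1))
criterion = nn.NLLLoss()
optimizer = optim.SGD(toy_model.parameters(), lr=0.1)

epoch_losses = []
for epoch in range(20):
    running_loss = 0.0
    for start in range(0, len(features), 64):          # mini-batches of 64
        batch_x = features[start:start + 64]
        batch_y = targets[start:start + 64]

        optimizer.zero_grad()                          # clear old gradients
        loss = criterion(toy_model(batch_x), batch_y)  # forward pass + loss
        loss.backward()                                # backward pass
        optimizer.step()                               # weight update
        running_loss += loss.item()
    epoch_losses.append(running_loss / 4)              # 256 / 64 = 4 batches

print(epoch_losses[0], epoch_losses[-1])
```

As with the MNIST run above, the average loss per epoch should fall as training proceeds.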


In [27]:
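# Note: helper.py is fetched with wget in the next cell, so this install is likely unnecessary
# (the PyPI package named "helper" appears to be unrelated to Udacity's helper.py).
# !pip install helper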




In [29]:
!wget https://raw.githubusercontent.com/udacity/deep-learning-v2-pytorch/3bd7dea850e936d8cb44adda8200e4e2b5d627e3/intro-to-pytorch/helper.py

--2019-06-06 12:44:53--  https://raw.githubusercontent.com/udacity/deep-learning-v2-pytorch/3bd7dea850e936d8cb44adda8200e4e2b5d627e3/intro-to-pytorch/helper.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2813 (2.7K) [text/plain]
Saving to: ‘helper.py’


2019-06-06 12:44:53 (83.3 MB/s) - ‘helper.py’ saved [2813/2813]



In [31]:
%matplotlib inline
import helper  # helper.py downloaded above; import it by module name, without the .py suffix

images, labels = next(iter(trainloader))

img = images[0].view(1, 784)
# Turn off gradients to speed up this part
with torch.no_grad():
    logps = model(img)

# The network outputs log-probabilities; take the exponential to get probabilities
ps = torch.exp(logps)
helper.view_classify(img.view(1, 28, 28), ps)
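
The predicted class can also be read directly from the log-probabilities, without any plotting helper. This sketch uses made-up scores for a single image (an assumption) in place of the trained model's output.

```python
import torch
import torch.nn.functional as F

# Made-up raw scores for one image; class 1 has the highest score.
scores = torch.tensor([[0.1, 2.0, -1.0, 0.3, 0.0, -0.5, 0.2, 0.4, -0.2, 0.1]])
logps = F.log_softmax(scores, dim=1)   # what a LogSoftmax output layer returns

ps = torch.exp(logps)                  # log-probabilities -> probabilities
top_p, top_class = ps.topk(1, dim=1)   # most likely class and its probability

print(top_class.item())  # 1
```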