<a href="https://colab.research.google.com/github/TheDeshBhakt/DeepLearning_Pytorch/blob/master/feed_forward_neural_network_using_pytorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
from torch.autograd import Variable

**Initialize Hyperparameters**



In [0]:
input_size = 784       # Image size = 28 * 28 = 784
hidden_size = 500      # Number of nodes in the hidden layer
num_classes = 10       # Number of output classes (digits 0 to 9)
num_epochs = 5         # Number of passes over the entire training set
batch_size = 100       # Number of samples processed in one iteration
learning_rate = 1e-3   # Step size used by the optimizer
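
As a quick sanity check on these values, the sketch below works out how many iterations one epoch takes. It assumes the standard MNIST training split of 60,000 images; `num_train_samples`, `steps_per_epoch` and `total_steps` are names introduced here for illustration only.

```python
import math

# Values copied from the hyperparameter cell above.
batch_size = 100
num_epochs = 5
num_train_samples = 60000  # assumption: size of the standard MNIST training split

steps_per_epoch = math.ceil(num_train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
print(steps_per_epoch, total_steps)  # 600 3000
```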

**Using the MNIST dataset (a large database of handwritten digits, 0 to 9, that is widely used for image classification)**

In [6]:
train_dataset = dsets.MNIST(root = './data', train = True, transform=transforms.ToTensor(), download = True)

test_datasets = dsets.MNIST(root = './data', train = False, transform=transforms.ToTensor())

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw
Processing...
Done!


# Load the Dataset


In [0]:
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_datasets, batch_size=batch_size, shuffle = False)
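
To see what a `DataLoader` hands back on each iteration without downloading anything, here is a minimal sketch on synthetic data. The `fake_*` tensors are made-up stand-ins for MNIST, and the dataset size of 250 is chosen only to show what happens when the size is not a multiple of the batch size.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for MNIST: 250 random "images" of shape 1 x 28 x 28 with labels 0-9.
fake_images = torch.rand(250, 1, 28, 28)
fake_labels = torch.randint(0, 10, (250,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(dataset=fake_dataset, batch_size=100, shuffle=True)

# Each iteration yields one (images, labels) pair for a whole batch.
batch_shapes = [(images.shape, labels.shape) for images, labels in loader]
for image_shape, label_shape in batch_shapes:
    print(tuple(image_shape), tuple(label_shape))
# The last batch holds only 50 samples because 250 is not a multiple of 100.
```

With MNIST's 60,000 training images and a batch size of 100, every batch is full, so this edge case does not come up in the notebook.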

# Build Feedforward Neural Network

**Feedforward Neural Network Model Structure**

The FNN consists of two fully-connected layers (fc1 and fc2) with a non-linear ReLU layer in between. This structure is usually called a 1-hidden-layer FNN, because the output layer (fc2) is not counted.

The forward pass sends the input images (x) through the network and produces an output (out) with one score per class, indicating how likely it is that the image belongs to each of the 10 classes. These scores are raw values rather than probabilities. For example, an image of the digit 7 might receive a high score for class 7 and a lower score for class 1.

In [0]:
class Net(nn.Module):
  def __init__(self, input_size, hidden_size, num_classes):
    super(Net, self).__init__()                     # Initialize the parent class nn.Module
    self.fc1 = nn.Linear(input_size, hidden_size)   # 1st fully-connected layer: 784 (input data) -> 500 (hidden nodes)
    self.relu = nn.ReLU()                           # Non-linear ReLU layer: max(0, x)
    self.fc2 = nn.Linear(hidden_size, num_classes)  # 2nd fully-connected layer: 500 (hidden nodes) -> 10 (output classes)

  def forward(self, x):                             # Forward pass: apply the layers one after another
    out = self.fc1(x)
    out = self.relu(out)
    out = self.fc2(out)
    return out
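
The `out` returned by `forward` comes straight from a linear layer, so its values need not lie between 0 and 1 or sum to 1. As a minimal sketch with made-up scores for a single image (the numbers are purely illustrative), softmax turns such scores into probabilities:

```python
import torch

# Made-up scores (logits) for one image over the 10 digit classes.
logits = torch.tensor([[1.0, -0.5, 0.2, 0.0, 3.0, -1.0, 0.5, 4.5, 0.1, -2.0]])

probs = torch.softmax(logits, dim=1)   # normalize across the class dimension
predicted_class = probs.argmax(dim=1)  # index of the largest probability

print(probs.sum(dim=1))   # sums to 1 (up to floating-point error)
print(predicted_class)    # tensor([7])
```

Softmax does not change which class scores highest, so for picking a prediction the raw scores are enough.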

**Instantiate the FeedForward Neural Network**

We create an actual FNN instance from the class defined above.

In [0]:
net = Net(input_size, hidden_size, num_classes)
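
To get a feel for the size of this network, the sketch below counts its trainable parameters. It rebuilds the same layer sizes with `nn.Sequential` so that the snippet runs on its own; `model` and `num_params` are names introduced here, not part of the notebook.

```python
import torch.nn as nn

# Same layer sizes as the notebook, rebuilt with nn.Sequential for brevity.
model = nn.Sequential(
    nn.Linear(784, 500),
    nn.ReLU(),
    nn.Linear(500, 10),
)

# Each Linear layer holds a weight matrix plus one bias per output node.
num_params = sum(p.numel() for p in model.parameters())
print(num_params)  # 397510 = (784*500 + 500) + (500*10 + 10)
```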

# Enable GPU

Set this flag to True to run the code on a GPU when one is available.

In [0]:
use_cuda = True

In [0]:
if use_cuda and torch.cuda.is_available():
  net.cuda()
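
An alternative to calling `.cuda()` behind a flag is to pick a `torch.device` once and move both the model and the data to it. This is only a sketch of that idiom, not what the notebook does; `model` is a small stand-in for `net`, and `batch` is random data.

```python
import torch
import torch.nn as nn

# Falls back to the CPU automatically when no GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(784, 10).to(device)       # small stand-in for `net`
batch = torch.rand(4, 784, device=device)   # create the input on the same device

outputs = model(batch)
print(outputs.shape, outputs.device.type)
```

The model and its inputs must live on the same device, otherwise the forward pass raises an error.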

# Choose the Loss Function and Optimizer
The loss function **(criterion)** measures how far the network's output is from the true class, which tells us how well or how badly the network performs.

The **optimizer** decides how the weights are updated from the gradients so that training converges toward the best weights for this network.

In [0]:
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
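
To make the criterion less of a black box, here is a small worked example with made-up scores. It checks that `nn.CrossEntropyLoss` equals the negative log-probability of the correct class, averaged over the batch. The tensors and the name `manual` are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])   # made-up scores: 2 samples, 3 classes
labels = torch.tensor([0, 2])              # correct class index per sample

loss = nn.CrossEntropyLoss()(logits, labels)

# Manual version: negative log-probability of the correct class, averaged over the batch.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(2), labels].mean()

print(loss.item(), manual.item())  # the two values agree
```

Because the criterion applies log-softmax internally, the network can return raw scores and needs no softmax layer of its own.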

# Training of Feed Forward Neural Network 

Training takes a few minutes depending on your machine, typically at least 3 to 5.

In [32]:
for epoch in range(num_epochs):
  for i, (images, labels) in enumerate(train_loader):  # Load one batch of (images, labels); i is the batch index
    images = Variable(images.view(-1, 28*28))          # Flatten each 28 x 28 image into a vector of size 784 (Variable is a no-op wrapper since PyTorch 0.4)
    labels = Variable(labels)

    if use_cuda and torch.cuda.is_available():
      images = images.cuda()
      labels = labels.cuda()

    optimizer.zero_grad()                # Reset the gradients left over from the previous step
    outputs = net(images)                # Forward pass: compute the class scores for each image
    loss = criterion(outputs, labels)    # Compute the loss between the class scores and the true labels
    loss.backward()                      # Backward pass: compute the gradients of the loss w.r.t. the weights
    optimizer.step()                     # Optimizer: update all weights and biases using the gradients

    if (i+1) % 100 == 0:                 # Logging
      print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' % (epoch+1, num_epochs, i+1, len(train_dataset) // batch_size, loss.item()))

Epoch [1/5], Step [100/600], Loss: 0.0980
Epoch [1/5], Step [200/600], Loss: 0.1087
Epoch [1/5], Step [300/600], Loss: 0.1217
Epoch [1/5], Step [400/600], Loss: 0.1007
Epoch [1/5], Step [500/600], Loss: 0.0425
Epoch [1/5], Step [600/600], Loss: 0.1776
Epoch [2/5], Step [100/600], Loss: 0.0874
Epoch [2/5], Step [200/600], Loss: 0.0493
Epoch [2/5], Step [300/600], Loss: 0.0336
Epoch [2/5], Step [400/600], Loss: 0.0405
Epoch [2/5], Step [500/600], Loss: 0.1266
Epoch [2/5], Step [600/600], Loss: 0.0267
Epoch [3/5], Step [100/600], Loss: 0.0254
Epoch [3/5], Step [200/600], Loss: 0.0603
Epoch [3/5], Step [300/600], Loss: 0.0750
Epoch [3/5], Step [400/600], Loss: 0.0938
Epoch [3/5], Step [500/600], Loss: 0.0756
Epoch [3/5], Step [600/600], Loss: 0.0403
Epoch [4/5], Step [100/600], Loss: 0.0188
Epoch [4/5], Step [200/600], Loss: 0.0211
Epoch [4/5], Step [300/600], Loss: 0.0362
Epoch [4/5], Step [400/600], Loss: 0.0266
Epoch [4/5], Step [500/600], Loss: 0.0283
Epoch [4/5], Step [600/600], Loss:
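
The call to `optimizer.zero_grad()` in the loop above matters because PyTorch adds new gradients to the ones already stored. The following sketch shows this on a tiny made-up parameter `w`; it uses SGD only to keep the example small, while the notebook itself uses Adam.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

(w ** 2).sum().backward()
first_grad = w.grad.clone()      # d/dw sum(w^2) = 2w = [2, 4]

(w ** 2).sum().backward()        # second backward pass without resetting
accumulated = w.grad.clone()     # gradients add up: [4, 8]

optimizer.zero_grad()            # reset before the next backward pass
(w ** 2).sum().backward()
after_reset = w.grad.clone()     # back to [2, 4]

print(first_grad.tolist(), accumulated.tolist(), after_reset.tolist())
```

Without the reset, each step would update the weights using the sum of the gradients from all previous batches.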

# Testing the FeedForward Neural Network Model


Similar to training, we load batches of test images and collect the outputs. The differences are:

* No loss or gradient calculation
* No weight updates
* The number of correct predictions is counted

In [34]:
correct = 0
total = 0
with torch.no_grad():                              # No gradients are needed for evaluation
    for images, labels in test_loader:
        images = Variable(images.view(-1, 28*28))  # Flatten each 28 x 28 image into a vector of size 784

        if use_cuda and torch.cuda.is_available():
            images = images.cuda()
            labels = labels.cuda()

        outputs = net(images)
        _, predicted = torch.max(outputs, 1)           # Choose the class with the highest score
        total += labels.size(0)                        # Increment the total count
        correct += (predicted == labels).sum().item()  # Increment the correct count

print('Accuracy of the network on the 10K test images: %d %%' % (100 * correct / total))

Accuracy of the network on the 10K test images: 99 %
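
The accuracy computation can be traced by hand on a tiny batch. The scores and labels below are made up for illustration; the calls are the same ones used in the test loop.

```python
import torch

outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.3, 0.2],
                        [0.0, 0.1, 0.9],
                        [2.2, 2.1, 0.0]])   # made-up scores: 4 samples, 3 classes
labels = torch.tensor([1, 0, 2, 1])

_, predicted = torch.max(outputs, 1)          # second return value is the argmax per row
correct = (predicted == labels).sum().item()  # .item() converts the 0-dim tensor to an int
accuracy = 100 * correct / labels.size(0)

print(predicted.tolist(), correct, accuracy)  # [1, 0, 2, 0] 3 75.0
```

The last sample is misclassified because class 0 scores 2.2 against 2.1 for the true class 1, so three of four predictions are correct.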


**It's amazing that we get 99% accuracy with a simple feedforward neural network**

**Save The Trained Model for further Use**

*We save the trained model's parameters (its state_dict) to a pickle file that can be loaded and used later*

In [0]:
torch.save(net.state_dict(), 'feedforward_neural_model.pkl')
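
Loading the parameters back is the mirror image of saving them. The sketch below is self-contained, so it saves and restores a stand-in model with the same layer sizes in a temporary directory; `make_model`, `trained` and `restored` are hypothetical names. To load the file written above, you would instead create `Net(input_size, hidden_size, num_classes)` and call `load_state_dict` on it.

```python
import os
import tempfile

import torch
import torch.nn as nn

def make_model():
    # Stand-in with the same layer sizes as the notebook's network.
    return nn.Sequential(nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))

trained = make_model()
path = os.path.join(tempfile.mkdtemp(), "feedforward_neural_model.pkl")
torch.save(trained.state_dict(), path)

restored = make_model()                       # must match the saved architecture
restored.load_state_dict(torch.load(path))    # copy the saved weights into the new model
restored.eval()                               # switch to inference mode

x = torch.rand(2, 784)
same = torch.allclose(trained(x), restored(x))
print(same)  # True
```

Because only the `state_dict` is stored, the file holds the weights but not the class definition, so the code that defines the architecture has to be available when loading.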

**Model saved for further use**

!! Making of Feedforward Neural Network Done !!