# Foundations of Artificial Intelligence and Machine Learning
## A Program by IIIT-H and TalentSprint
#### To be done in the Lab

The objective of this experiment is to understand how to implement a multilayer perceptron (MLP) using PyTorch.

In this experiment we will use the MNIST database, a dataset of handwritten digits. It has 60,000 training samples and 10,000 test samples. Each image is 28 x 28 pixels, and each pixel holds a grayscale value from 0 to 255.

It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image.

It is a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal effort on preprocessing and formatting.
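
To make the pixel representation concrete, here is a minimal sketch in plain Python. It uses a made-up 28 x 28 image rather than a real MNIST sample, and the names `image`, `flat`, and `scaled` are illustrative only. It flattens the grid into the 784-value vector an MLP consumes and rescales the 0 to 255 intensities to the range [0, 1], which is the scaling `transforms.ToTensor()` applies to 8-bit images.

```python
# A synthetic 28 x 28 grayscale "image": the values are made up, not MNIST data.
HEIGHT, WIDTH = 28, 28
image = [[(row * WIDTH + col) % 256 for col in range(WIDTH)] for row in range(HEIGHT)]

# Flatten row by row into a single vector of 28 * 28 = 784 values.
flat = [pixel for row in image for pixel in row]

# Rescale 8-bit intensities from [0, 255] to [0.0, 1.0].
scaled = [pixel / 255.0 for pixel in flat]

print(len(flat))                  # number of input features: 784
print(min(scaled), max(scaled))   # range after scaling: 0.0 1.0
```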

#### PyTorch

It’s a Python-based scientific computing package targeted at two sets of audiences:

1. A replacement for NumPy to use the power of GPUs

2. A deep learning research platform that provides maximum flexibility and speed

For more information, refer to the following URL:

http://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html
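
As a rough illustration of the first point, the sketch below creates two small tensors, moves them to a GPU if one happens to be available, and performs NumPy-style arithmetic on them. The tensor values and the names `a`, `b`, `total`, and `product` are arbitrary choices for this example.

```python
import torch

# Create two small tensors; the API mirrors NumPy arrays.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)

# Use a GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
a, b = a.to(device), b.to(device)

# Element-wise addition and matrix multiplication run on the chosen device.
total = a + b
product = a @ b

# Move the results back to the CPU before converting them to Python lists.
print(total.cpu().tolist())     # [[2.0, 3.0], [4.0, 5.0]]
print(product.cpu().tolist())   # [[3.0, 3.0], [7.0, 7.0]]
```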

In this experiment we will implement an MLP using PyTorch. We are going to do this step by step:

    1. Load MNIST dataset, and visualize
    2. Define the Neural Network
    3. Define loss and optimizer
    4. Train the model
    5. Test the model

In [None]:
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import matplotlib.pyplot as plt

In [None]:
torch.__version__

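#### Setting Hyperparameters

In [None]:
# Hyperparameters
input_size = 784       # 28 x 28 pixels, flattened
hidden_size = 500      # neurons in the hidden layer
num_classes = 10       # digits 0-9
num_epochs = 10        # full passes over the training set
batch_size = 10        # samples per weight update
learning_rate = 0.001  # step size used by the optimizer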
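
The following sketch works out what these values imply for training. It assumes the full 60,000-sample MNIST training set is used; `num_train_samples`, `steps_per_epoch`, and `total_steps` are names introduced here for illustration.

```python
# Values copied from the hyperparameter cell above.
input_size = 784
num_epochs, batch_size = 10, 10
num_train_samples = 60_000  # MNIST training set size (assumed to be used in full)

# Each 28 x 28 image is flattened into one input vector.
assert input_size == 28 * 28

# One weight update happens per batch.
steps_per_epoch = num_train_samples // batch_size
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)   # 6000 60000
```

A larger `batch_size` means fewer, less noisy updates per epoch, so it usually has to be tuned together with `learning_rate`.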

#### 1. Loading MNIST dataset

Now we'll load the MNIST data. The first time this runs, the data may have to be downloaded, which can take a while.

In [None]:
# Load the training set (downloaded to ../data if not already present)
train_dataset = dsets.MNIST(root='../data',
                            train=True,
                            transform=transforms.ToTensor(),
                            download=True)
# Load the test set
test_dataset = dsets.MNIST(root='../data',
                           train=False,
                           transform=transforms.ToTensor())

Next, we wrap each dataset in a `DataLoader`, which returns the data in batches.

In [None]:
# Data loader for the training set (reshuffled every epoch)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

# Data loader for the test set (order kept fixed)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
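
To see how batching behaves without downloading anything, here is a small sketch that uses random tensors in place of MNIST. The dataset size of 25 and the names `toy_dataset` and `toy_loader` are assumptions made for this example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 25 fake "images" of shape 1 x 28 x 28 with random labels in 0..9.
images = torch.rand(25, 1, 28, 28)
labels = torch.randint(0, 10, (25,))
toy_dataset = TensorDataset(images, labels)

# Same kind of arguments as for the MNIST loaders: batch size and shuffling.
toy_loader = DataLoader(dataset=toy_dataset, batch_size=10, shuffle=True)

# 25 samples with batch_size=10 give batches of 10, 10, and 5.
batch_sizes = [x.size(0) for x, y in toy_loader]
print(len(toy_loader), batch_sizes)   # 3 [10, 10, 5]
```

The last batch is smaller because the dataset size is not a multiple of the batch size. With MNIST and `batch_size = 10` this does not occur, since 60,000 and 10,000 are both divisible by 10.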

The data loaders provide iterators over the datasets. Here we fetch the first batch of training inputs (X) and labels (y) from the loader and inspect its size and type.

In [None]:
for (X_train, y_train) in train_loader:
    print('X_train:', X_train.size(), 'type:', X_train.type())
    print('y_train:', y_train.size(), 'type:', y_train.type())
    break

#### Plotting first 10 training digits

In [None]:
pltsize = 1
plt.figure(figsize=(10 * pltsize, pltsize))

# Show the first 10 images of the batch together with their labels
for i in range(10):
    plt.subplot(1, 10, i + 1)
    plt.axis('off')
    plt.imshow(X_train[i].numpy().reshape(28, 28), cmap="gray")
    plt.title('Class: ' + str(y_train[i].item()))

#### 2. Defining the Neural Network

Let's define the network as a Python class. This class inherits from `nn.Module`.

Three functions are involved when working with this class:

- ### **\__init__()**:
In this function, we declare all the layers of our neural network, including the number of neurons, non-linear activations, etc.

- ### **forward()**:
This function computes the forward pass of the network. Here, we connect the layers declared in \__init__() according to the network architecture we want to build. In this case, $x \rightarrow fc1 \rightarrow relu \rightarrow fc2 \rightarrow out$.

`forward()` is invoked by calling the object of this class directly. For example:

```
net = Net(input_size, hidden_size, num_classes)
out = net(x)
```

- ### **backward()**:
This function computes the gradients of the loss with respect to all the parameters of the network. We do not define it in our class. It is called on the loss tensor returned by the loss function.

```
loss.backward()
```

We only have to write the **\__init__()** and **forward()** methods. PyTorch's autograd engine records the operations performed during the forward pass and uses them to compute the gradients when **backward()** is called.
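
A minimal sketch of this mechanism, using a single scalar weight instead of a network; the values 3.0 and 2.0 are arbitrary. After `backward()` is called, the gradient is stored in the `.grad` attribute of the weight.

```python
import torch

# A single trainable weight and a fixed input.
w = torch.tensor(3.0, requires_grad=True)
x = torch.tensor(2.0)

# Forward pass: loss = (w * x) ** 2 = 36
loss = (w * x) ** 2

# Backward pass: autograd fills in w.grad = d(loss)/dw = 2 * w * x**2 = 24
loss.backward()

print(loss.item(), w.grad.item())   # 36.0 24.0
```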

In [None]:
class Net(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        # Hidden layer: input_size -> hidden_size
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        # Output layer: hidden_size -> num_classes
        self.fc2 = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        # Return raw class scores (logits). nn.CrossEntropyLoss applies
        # log-softmax internally, so no softmax layer is added here.
        out = self.fc2(out)
        return out

#### Creating a neural network object

In [None]:
net = Net(input_size, hidden_size, num_classes)
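
A quick sanity check on such a network is to pass a dummy batch through it and count its parameters. The sketch below uses an `nn.Sequential` stand-in with the same layer sizes (784, 500, 10) so that it runs on its own; `mlp`, `dummy`, and `num_params` are names chosen for this example.

```python
import torch
import torch.nn as nn

# Stand-in with the same layer sizes as the network above (an assumption for this sketch).
mlp = nn.Sequential(nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))

# A dummy batch of 4 flattened "images".
dummy = torch.rand(4, 784)
scores = mlp(dummy)

# Weights plus biases: (784 * 500 + 500) + (500 * 10 + 10) = 397510
num_params = sum(p.numel() for p in mlp.parameters())

print(tuple(scores.shape))   # (4, 10): one row of 10 class scores per sample
print(num_params)            # 397510
```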

#### 3. Loss and Optimizer

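We shall use cross-entropy as the loss function. `nn.CrossEntropyLoss` combines log-softmax and the negative log-likelihood loss, so it expects the raw class scores (logits) of the network as its input.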

In [None]:
criterion = nn.CrossEntropyLoss()
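
For a single sample, the cross-entropy loss is the negative log of the softmax probability assigned to the correct class. The plain-Python sketch below spells this out for hypothetical scores over three classes; the helper functions `softmax` and `cross_entropy` are written for this example and are not PyTorch functions.

```python
import math

def softmax(scores):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(scores, target):
    """Negative log-probability assigned to the correct class."""
    return -math.log(softmax(scores)[target])

scores = [2.0, 1.0, 0.1]                  # hypothetical logits for 3 classes
loss_correct = cross_entropy(scores, 0)   # class 0 has the highest score: small loss
loss_wrong = cross_entropy(scores, 2)     # class 2 has the lowest score: larger loss

print(round(loss_correct, 3), round(loss_wrong, 3))
```

By default, `nn.CrossEntropyLoss` averages this per-sample value over the batch.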

We shall use stochastic gradient descent (SGD) as the optimizer.

In [None]:
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate)  
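
The update that plain SGD applies to each weight can be sketched in a few lines of Python. This example minimizes a made-up one-dimensional function, $f(w) = (w - 3)^2$, with a step size of 0.1; neither is taken from the experiment.

```python
# Minimize f(w) = (w - 3) ** 2 with plain gradient descent.
lr = 0.1   # step size (hypothetical value for this toy problem)
w = 0.0    # initial weight

for step in range(50):
    grad = 2 * (w - 3)   # df/dw at the current w
    w = w - lr * grad    # the SGD update rule

print(round(w, 4))   # 3.0, the minimizer of f
```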

#### 4. Train the Model

In [None]:
# In each epoch
for epoch in range(num_epochs):
    net.train()  # put the network in training mode
    # For each batch of images in the training set
    for i, (images, labels) in enumerate(train_loader):

        # Flatten each 28 x 28 image into a 784-dimensional vector
        images = images.view(-1, 28*28)

        # Reset the gradients to 0
        optimizer.zero_grad()

        # Forward pass (this calls the "forward" function within Net)
        outputs = net(images)

        # Compute the loss
        loss = criterion(outputs, labels)

        # Compute the gradients of the loss with respect to all weights
        loss.backward()

        # Update the weights using the optimizer
        # For SGD: w = w - lr * (gradient of the loss with respect to w)
        optimizer.step()

        # Report progress every 1000 batches
        if (i+1) % 1000 == 0:
            print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f'
                  % (epoch+1, num_epochs, i+1, len(train_loader), loss.item()))
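
The same four-step cycle (reset gradients, forward pass, backward pass, update) can be tried on a problem small enough to run in well under a second. The sketch below fits a one-weight linear model to synthetic data following $y = 2x + 1$; the data, the model, the `MSELoss` criterion, and the learning rate of 0.1 are all choices made for this toy example and differ from the MNIST setup.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data following y = 2x + 1 (not MNIST).
x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 2 * x + 1

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for step in range(100):
    optimizer.zero_grad()           # reset gradients from the previous step
    loss = criterion(model(x), y)   # forward pass and loss
    loss.backward()                 # compute gradients
    optimizer.step()                # update the weights
    losses.append(loss.item())

# The loss should shrink as the model approaches y = 2x + 1.
print(losses[0], losses[-1])
```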


#### 5. Test the Model

In [None]:
correct = 0
total = 0
net.eval()  # put the network in evaluation mode
# Gradients are not needed for evaluation
with torch.no_grad():
    # For each batch of images in the test set
    for images, labels in test_loader:
        # Flatten each 28 x 28 image into a 784-dimensional vector
        images = images.view(-1, 28*28)

        # Find the output by doing a forward pass through the network
        outputs = net(images)

        # Predict the class with the highest score for each sample
        _, predicted = torch.max(outputs, 1)

        # Update 'total' and 'correct' for this batch
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the %d test images: %.2f %%' % (total, 100 * correct / total))
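
The accuracy computation itself is simple enough to check by hand. The plain-Python sketch below uses hypothetical scores for four samples and three classes: the predicted class is the index of the largest score in each row, and accuracy is the percentage of predictions that match the labels.

```python
# Hypothetical class scores for 4 samples and 3 classes.
scores = [
    [0.1, 2.0, 0.3],
    [1.5, 0.2, 0.1],
    [0.2, 0.1, 0.9],
    [0.8, 0.7, 0.1],
]
labels = [1, 0, 2, 1]

# The predicted class is the index of the largest score in each row.
predicted = [row.index(max(row)) for row in scores]

# Accuracy is the percentage of predictions that match the labels.
correct = sum(p == t for p, t in zip(predicted, labels))
accuracy = 100.0 * correct / len(labels)

print(predicted, accuracy)   # [1, 0, 2, 0] 75.0
```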


#### Exercise 1:

Experiment with the number of epochs, batch_size, hidden layer size, non-linearity, etc., and observe how the test accuracy changes.