# Simple code for neural network training in PyTorch

by Fayyaz Minhas

Here, we create a simple neural network (a multilayer perceptron) with PyTorch.

# MNIST Classification
Digit classification using DataLoader

In [26]:
# Hack
# PIL no longer supports PILLOW_VERSION. Instead uses __version__
# torchvision needs PIL.PILLOW_VERSION
import PIL
PIL.PILLOW_VERSION = PIL.__version__

In [30]:
# pip install tqdm



Each image is represented by a 28x28 matrix, so flattening it gives 28 x 28 = 784 features per image (input_size = 784).
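
As a minimal sketch of the flattening step, the snippet below reshapes a batch of 28x28 images into rows of 784 features, using the same `view(-1, 28*28)` call that appears in the training loop. The random batch is only a stand-in for real MNIST images.

```python
import torch

# Stand-in for a batch of 100 grayscale images: (batch, channel, height, width)
images = torch.rand(100, 1, 28, 28)

# Flatten each image into one row of 28*28 = 784 features
flat = images.view(-1, 28 * 28)

print(flat.shape)  # torch.Size([100, 784])
```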

We will use 500 neurons in the hidden layer. 

We want 10 outputs (targets), one for each digit 0-9.
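
To illustrate how 10 outputs map to a digit, here is a small sketch with hand-written (hypothetical) output scores for two images. The predicted digit is simply the index of the largest score.

```python
import torch

# Hypothetical raw outputs for a batch of 2 images, 10 scores each
logits = torch.tensor([
    [0.1, 0.2, 3.5, 0.0, 0.3, 0.1, 0.2, 0.4, 0.1, 0.0],  # largest score at index 2
    [1.2, 0.1, 0.3, 0.2, 0.1, 0.0, 0.4, 4.1, 0.2, 0.3],  # largest score at index 7
])

# The predicted digit is the index of the largest output in each row
predicted = logits.argmax(dim=1)
print(predicted)  # tensor([2, 7])
```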

We will use 5 epochs.

batch_size=100: the network processes 100 examples at a time and performs one weight update per batch.
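
A quick back-of-the-envelope calculation shows what these settings imply, assuming the standard MNIST training split of 60,000 images:

```python
import math

num_train = 60000   # size of the standard MNIST training split
batch_size = 100
num_epochs = 5

# One weight update per batch
updates_per_epoch = math.ceil(num_train / batch_size)
total_updates = updates_per_epoch * num_epochs

print(updates_per_epoch, total_updates)  # 600 3000
```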

PyTorch comes with MNIST built-in.

Load MNIST as separate training and test datasets and store them in a folder 'data' in the current directory.

transforms.ToTensor converts each image to a tensor directly (and scales the pixel values to the range [0, 1]), so there is no need to do that explicitly later.

download=True: if the data has not already been downloaded, it will be downloaded.

DataLoader makes our job easy when interacting with large datasets.

Create two dataloaders: for train and test. 

The DataLoader automatically feeds the network batches of batch_size=100 examples. shuffle=True reshuffles the data at every epoch, so the composition of each batch is random.
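
The batching behaviour can be seen without downloading anything. This sketch wraps synthetic data (an assumption, standing in for MNIST) in a `TensorDataset` and iterates over it with a `DataLoader`:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in dataset: 1000 examples, 784 features each, labels 0-9
features = torch.rand(1000, 784)
labels = torch.randint(0, 10, (1000,))
dataset = TensorDataset(features, labels)

loader = DataLoader(dataset, batch_size=100, shuffle=True)

num_batches = 0
for x, y in loader:
    num_batches += 1  # x has shape (100, 784), y has shape (100,)

print(num_batches, x.shape, y.shape)  # 10 torch.Size([100, 784]) torch.Size([100])
```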

Create a class Net inherited from nn.Module.

In [31]:
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms
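from tqdm.notebook import tqdm  # tqdm.tqdm_notebook is deprecated
USE_CUDA = torch.cuda.is_available()

def cuda(v):
    # Move a tensor or module to the GPU when one is available
    if USE_CUDA:
        return v.cuda()
    return v
def toTensor(v, dtype=torch.float, requires_grad=False):
    # Variable is deprecated: plain tensors already carry autograd state.
    # clone().detach() copies an existing tensor without the warning that torch.tensor(tensor) raises.
    t = v.clone().detach() if torch.is_tensor(v) else torch.tensor(v)
    return cuda(t.type(dtype).requires_grad_(requires_grad))
def toNumpy(v):
    # .cpu() is a no-op for tensors that are already on the CPU
    return v.detach().cpu().numpy()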

#####################################################################################################################
    
# Hyper Parameters 
input_size = 784 #number of features
hidden_size = 500
num_classes = 10
num_epochs = 5
batch_size = 100
learning_rate = 0.001

# MNIST Dataset 
train_dataset = dsets.MNIST(root='./data', 
                            train=True, 
                            transform=transforms.ToTensor(),  
                            download=True)

test_dataset = dsets.MNIST(root='./data', 
                           train=False, 
                           transform=transforms.ToTensor())

# Data Loader (Input Pipeline)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)

# Representation: components of the network and how they are connected
# Neural Network Model (1 hidden layer)
class Net(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(Net, self).__init__()
        # weight layer: matrix of weights. 784 i/p feeding into 500 neurons. 
        self.fc1 = nn.Linear(input_size, hidden_size) 
        # apply relu on top
        self.relu = nn.ReLU()
        # pass the outputs of hidden layer to 10 neurons of the o/p layer
        self.fc2 = nn.Linear(hidden_size, num_classes)
    
    # connects the layers
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

# create the network
# move it to the GPU when CUDA is available
model = cuda(Net(input_size, hidden_size, num_classes))

    
# Loss and Optimizer
criterion = nn.CrossEntropyLoss()  #loss function 
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  

# Train the Model
model.train() #set the mode to training 
for epoch in tqdm(range(num_epochs)):
    for i, (images, labels) in enumerate(tqdm(train_loader)):  # batches of 100 images and labels in random order
        # Flatten each image to 784 features and move the batch to the GPU if available
        images = toTensor(images.view(-1, 28*28))
        labels = toTensor(labels,dtype=torch.long)
        
        # Forward + Backward + Optimize
        optimizer.zero_grad()  # zero the gradient buffer
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()






You can monitor the error by plotting the loss against the number of epochs, which gives an idea of how well the model is converging.
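
One possible way to collect the values for such a plot is to average the batch losses within each epoch. The sketch below does this on a small synthetic problem; the data, the tiny model and the name `epoch_losses` are illustrative assumptions, not part of the notebook above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data: 200 examples, 20 features, 3 classes
x = torch.randn(200, 20)
y = torch.randint(0, 3, (200,))

model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 3))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

num_epochs = 5
batch_size = 50
epoch_losses = []  # one average loss value per epoch

for epoch in range(num_epochs):
    running_loss = 0.0
    num_batches = 0
    for start in range(0, len(x), batch_size):
        xb = x[start:start + batch_size]
        yb = y[start:start + batch_size]
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()  # .item() extracts a plain Python float
        num_batches += 1
    epoch_losses.append(running_loss / num_batches)

print(epoch_losses)
# With matplotlib installed, the curve can be drawn with:
# plt.plot(range(1, num_epochs + 1), epoch_losses)
```

The same bookkeeping could be added to the MNIST training loop by accumulating `loss.item()` after each `optimizer.step()`.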

Test the network using test_loader.

In [32]:
# Test the Model
model.eval() #set the mode to evaluation
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for images, labels in test_loader:
        images = toTensor(images.view(-1, 28*28)) # flattens it
        outputs = model(images) # computes output
        _, predicted = torch.max(outputs, 1) # index of the largest output is the predicted digit
        total += labels.size(0)
        correct += (toNumpy(predicted) == toNumpy(labels)).sum()

print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / total))

# Save the Model
torch.save(model.state_dict(), 'model.pkl')



Accuracy of the network on the 10000 test images: 97 %
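
Since only the `state_dict` is saved, reloading requires rebuilding the same architecture first. The following is a minimal sketch of that round trip; it uses a stand-in `nn.Sequential` model with the same layer sizes and a temporary file path, both of which are assumptions made so the example is self-contained.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in for Net(input_size, hidden_size, num_classes) with the same layer sizes
def make_model():
    return nn.Sequential(nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 10))

trained = make_model()
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
torch.save(trained.state_dict(), path)

# Rebuild the same architecture, then load the saved weights into it
restored = make_model()
restored.load_state_dict(torch.load(path, map_location="cpu"))
restored.eval()

# Both models should now produce identical outputs
x = torch.rand(4, 784)
with torch.no_grad():
    same = torch.equal(trained(x), restored(x))
print(same)  # True
```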


PyTorch lets you work at different levels of detail when training your ML models.