# 2. Augmented training dataset

In this part, we train our model on an augmented training dataset. Data augmentation artificially enlarges the training set by creating modified copies of existing examples, either by making minor changes to them or by generating new data points. The goal is to make the training data richer and more varied so that the model generalizes better. We will then look at the accuracy of the model after training on the augmented dataset.

In [1]:
import numpy as np
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import torch.optim as optim
from torch.utils.data.sampler import SubsetRandomSampler
from net import Net

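The data augmentation in our case is fairly simple: each training image is randomly cropped and resized to 36×36 pixels (`RandomResizedCrop(36)`) and then flipped horizontally with probability 0.5 (`RandomHorizontalFlip()`). These transforms are applied on the fly each time an image is loaded, so the dataset keeps its size but the model sees a slightly different variant of each image at every epoch. We will see what impact this has on the predictions of the model. Otherwise, we proceed exactly as in the first part.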
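To make the random flip concrete, here is a minimal sketch of what a horizontal flip with probability `p` does to an image tensor. The helper `random_hflip` is a hypothetical name introduced for illustration; it is not the torchvision implementation, only a stand-in that mimics the behaviour described above using `torch.flip`.

```python
import torch

def random_hflip(img, p=0.5):
    """Flip a (C, H, W) tensor left-right with probability p (illustrative helper)."""
    if torch.rand(1).item() < p:
        return torch.flip(img, dims=[-1])  # reverse the width axis
    return img

# Toy "image": 1 channel, 2 rows, 3 columns
img = torch.arange(6.0).reshape(1, 2, 3)

always = random_hflip(img, p=1.0)  # p=1.0: the image is always flipped
never = random_hflip(img, p=0.0)   # p=0.0: the image is never flipped

print(img)
print(always)
```

With the default `p=0.5`, roughly half of the images drawn in a given epoch are mirrored, and which ones changes from epoch to epoch.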

In [2]:
train_dir = './train_images'    # folder containing training images
test_dir = './test_images'    # folder containing test images

transform = transforms.Compose(
    [transforms.Grayscale(),   # transforms to gray-scale (1 input channel)
     transforms.ToTensor(),    # transforms to Torch tensor (needed for PyTorch)
     transforms.Normalize(mean=(0.5,),std=(0.5,))]) # subtracts mean (0.5) and divides by standard deviation (0.5) -> resulting values in [-1, +1]

# Define the transformations for augmentation
transforms_augmented = transforms.Compose([
        transforms.Grayscale(), 
        transforms.RandomResizedCrop(36),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(mean=(0.5,),std=(0.5,))
])
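As a quick check of the normalization step, the arithmetic behind `Normalize(mean=(0.5,), std=(0.5,))` can be sketched in plain Python. The function `normalize` below is a hypothetical helper for illustration; it assumes pixel values already scaled to [0, 1], which is what `ToTensor()` produces for 8-bit images.

```python
def normalize(x, mean=0.5, std=0.5):
    """Apply (x - mean) / std to a single pixel value."""
    return (x - mean) / std

# Pixel values in [0, 1] are mapped to [-1, 1]
pixels = (0.0, 0.25, 0.5, 1.0)
normalized = [normalize(x) for x in pixels]
print(normalized)
```

Black (0.0) maps to -1, mid-gray (0.5) to 0, and white (1.0) to +1.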
       


In [4]:
# Define two pytorch datasets (train/test) 
train_data_augmented = torchvision.datasets.ImageFolder(train_dir, transform=transforms_augmented)
test_data = torchvision.datasets.ImageFolder(test_dir, transform=transform)

valid_size = 0.2   # proportion of validation set (80% train, 20% validation)
batch_size = 32    

# Define randomly the indices of examples to use for training and for validation
num_train = len(train_data_augmented)
indices_train = list(range(num_train))
np.random.shuffle(indices_train)
split_tv = int(np.floor(valid_size * num_train))
train_new_idx, valid_idx = indices_train[split_tv:],indices_train[:split_tv]

# Define two "samplers" that will randomly pick examples from the training and validation set
train_sampler = SubsetRandomSampler(train_new_idx)
valid_sampler = SubsetRandomSampler(valid_idx)
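The split logic above can be checked in isolation. The sketch below reproduces it on ten toy indices; `split_indices` and the `seed` argument are assumptions added for illustration (the cell above does not fix a seed), and a NumPy `Generator` is used instead of `np.random.shuffle` so that the example is reproducible.

```python
import numpy as np

def split_indices(num_examples, valid_size=0.2, seed=0):
    """Shuffle indices 0..num_examples-1 and split them into train/validation lists."""
    rng = np.random.default_rng(seed)          # seeded generator (assumption, for reproducibility)
    indices = rng.permutation(num_examples)    # shuffled copy of range(num_examples)
    split = int(np.floor(valid_size * num_examples))
    return indices[split:].tolist(), indices[:split].tolist()

toy_train_idx, toy_valid_idx = split_indices(10)

print(len(toy_train_idx), len(toy_valid_idx))
# The two subsets never share an index
print(set(toy_train_idx) & set(toy_valid_idx))
```

Because both subsets are slices of the same shuffled list, every example ends up in exactly one of them.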

In [5]:
# Dataloaders (take care of loading the data from disk, batch by batch, during training)
train_loader = torch.utils.data.DataLoader(train_data_augmented, batch_size=batch_size, sampler=train_sampler, num_workers=4)
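# Validation images are loaded without augmentation; ImageFolder sorts files deterministically, so valid_idx refers to the same images
valid_data = torchvision.datasets.ImageFolder(train_dir, transform=transform)
valid_loader = torch.utils.data.DataLoader(valid_data, batch_size=batch_size, sampler=valid_sampler, num_workers=4)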
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, shuffle=True, num_workers=4)

classes = ('noface','face')  # indicates that "1" means "face" and "0" non-face (only used for display)


In [6]:
net = Net()
n_epochs = 16

optimizer = optim.Adam(net.parameters(), lr=0.001, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

In [7]:
# Training
net.train()  # training mode (only matters if Net contains layers such as dropout or batch norm)
# loop over epochs: one epoch = one pass through the whole training dataset
for epoch in range(1, n_epochs+1):
    running_loss = 0.0  # sum of the batch losses over this epoch
    # loop over iterations: one iteration = 1 batch of examples
    for data, target in train_loader:
        optimizer.zero_grad() # zero the gradient buffers
        output = net(data)
        loss = criterion(output, target)
        running_loss += loss.item()  # accumulate a Python float rather than a tensor attached to the autograd graph
        loss.backward()
        optimizer.step() # does the update
    print('epoch: %d, running_loss: %5.7f' % (epoch, running_loss))

epoch: 1, running_loss: 745.0629272
epoch: 2, running_loss: 432.7305298
epoch: 3, running_loss: 341.3478088
epoch: 4, running_loss: 292.0621338
epoch: 5, running_loss: 267.2956848
epoch: 6, running_loss: 251.9956818
epoch: 7, running_loss: 238.5132904
epoch: 8, running_loss: 232.7164001
epoch: 9, running_loss: 224.1664276
epoch: 10, running_loss: 222.7513428
epoch: 11, running_loss: 206.9162445
epoch: 12, running_loss: 213.8405457
epoch: 13, running_loss: 203.5930023
epoch: 14, running_loss: 203.8099213
epoch: 15, running_loss: 201.2661285
epoch: 16, running_loss: 196.9024353
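The validation loader defined earlier is not used in the training loop above. One way it could be used is to compute the mean loss and accuracy on the validation set after each epoch. The sketch below is an illustration under stated assumptions: `evaluate` is a hypothetical helper, and the toy model and random 36×36 inputs stand in for `Net` and the real images, which are not available here.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion):
    """Return (mean loss per example, accuracy) of model over loader."""
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for data, target in loader:
            output = model(data)
            # criterion returns the batch mean, so weight it by the batch size
            total_loss += criterion(output, target).item() * target.size(0)
            correct += (output.argmax(dim=1) == target).sum().item()
            total += target.size(0)
    return total_loss / total, correct / total

# Toy stand-ins (assumptions): a linear classifier and random grayscale images
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(36 * 36, 2))
toy_images = torch.randn(64, 1, 36, 36)
toy_labels = torch.randint(0, 2, (64,))
toy_loader = DataLoader(TensorDataset(toy_images, toy_labels), batch_size=32)

val_loss, val_acc = evaluate(toy_model, toy_loader, nn.CrossEntropyLoss())
print('validation loss: %.4f, validation accuracy: %.4f' % (val_loss, val_acc))
```

In the notebook, a call such as `evaluate(net, valid_loader, criterion)` at the end of each epoch would show whether the validation loss keeps decreasing along with the training loss; remember to call `net.train()` again before the next epoch.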


In [9]:
correct = 0
total = 0
net.eval()  # evaluation mode (no effect if Net has no dropout/batch-norm layers)
with torch.no_grad():
    for images, labels in test_loader:
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest score = predicted class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on test images: %5.6f %%' % (
    100 * correct / total))

Accuracy of the network on test images: 97.981122 %


As we can see, the accuracy on the test images is higher than that of the model trained on the original training dataset in the first part. This suggests that data augmentation can improve performance, even with transformations as simple as the ones used here.
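A single overall accuracy can hide differences between the two classes, for instance if the test set contains many more non-faces than faces. A per-class breakdown can be sketched as follows; `per_class_accuracy` is a hypothetical helper, and the predictions and labels below are made-up toy values, not results from the model.

```python
def per_class_accuracy(predicted, labels, classes):
    """Return {class name: accuracy} for integer predictions and labels."""
    correct = {name: 0 for name in classes}
    total = {name: 0 for name in classes}
    for p, y in zip(predicted, labels):
        total[classes[y]] += 1
        if p == y:
            correct[classes[y]] += 1
    # Skip classes that do not appear in the labels to avoid dividing by zero
    return {name: correct[name] / total[name] for name in classes if total[name] > 0}

classes = ('noface', 'face')
toy_predicted = [0, 0, 1, 1, 1, 0]  # toy values for illustration only
toy_labels = [0, 0, 1, 1, 0, 1]

toy_accuracy = per_class_accuracy(toy_predicted, toy_labels, classes)
print(toy_accuracy)
```

In the evaluation cell, the same counts could be accumulated batch by batch by passing `predicted.tolist()` and `labels.tolist()`.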

In [23]:
# Save the model parameters (state_dict only, not the class definition)
torch.save(net.state_dict(), './saved_model.pth')
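Since only the `state_dict` is saved, reloading the model requires creating an instance of the same architecture first and then loading the weights into it. The round trip can be sketched as below; `make_model` is a hypothetical toy architecture standing in for `Net`, and the file is written to a temporary directory rather than `./saved_model.pth`.

```python
import os
import tempfile

import torch
import torch.nn as nn

def make_model():
    """Toy stand-in for Net (assumption): a linear classifier on 36x36 grayscale images."""
    return nn.Sequential(nn.Flatten(), nn.Linear(36 * 36, 2))

trained = make_model()
path = os.path.join(tempfile.mkdtemp(), 'toy_model.pth')
torch.save(trained.state_dict(), path)         # save the weights only

restored = make_model()                        # same architecture, fresh random weights
restored.load_state_dict(torch.load(path))     # overwrite them with the saved weights
restored.eval()                                # switch to evaluation mode before inference

same_weights = all(
    torch.equal(a, b)
    for a, b in zip(trained.state_dict().values(), restored.state_dict().values())
)
print(same_weights)
```

For the notebook's model, the equivalent would be `net = Net()` followed by `net.load_state_dict(torch.load('./saved_model.pth'))`.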