PyTorch is a powerful deep learning framework for Python that specializes in the creation of image classification systems. This notebook will serve as a practical example of how to implement a custom image classificiation network in Pytorch, using the Fruits 360 dataset.   

To start off with, we'll begin by importing all the libraries we need. 

In [1]:
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset
from torchvision import datasets, transforms

We'll then set the directory for the data.

In [2]:
# Set main directory for use in getting training/testing data

main_dir = "./fruits-360/"

Now we can set up the transforms we are going to use for the datasets.  We'll do this for both the training and testing datasets, and then use the DataLoader to create iterable objects out of the preprocessed data. We'lll do some random rotating and flipping of the images, and we'll also normalize the data. It's critical to transform the data into a Tensor so that the deep neural network model can interpret it. Applying random perturbations to the training dataset can help make the image classifier more robust, able to recognize images that have been altered in certain ways from your target images.

Traditionally, you don't pass the image perturbations to the test dataset, although you can. 

After we declare the transforms, we'll join the train and test directories to the base URL to get the full paths and then make our data iterable with the DataLoader. 

In [3]:
# Declare transforms for train and test data

train_transforms = transforms.Compose([
    transforms.Resize((32, 32)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(25),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),
                         (0.5,0.5,0.5)),
])

test_transforms = transforms.Compose([transforms.Resize((32, 32)),
    transforms.ToTensor(),
    transforms.Normalize((0.5,0.5,0.5),
                         (0.5,0.5,0.5)),])

# Set the directories for the training and testing data, create the datasets with the transforms

train_data = datasets.ImageFolder(os.path.join(main_dir, 'Train/'), transform=train_transforms)
test_data = datasets.ImageFolder(os.path.join(main_dir, 'Test/'), transform=test_transforms)

# Use the dataloaders to create iterable objects out of the datasets

train_loader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=64, shuffle=True)

We also want to set the number of classes in the dataset, as the final output of the model needs to be equal to the number of classes. We set the training device here too, cuda - if available.

In [4]:
# Set the number of classes
classes = 118

# Specify the device to use
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

We can now go about creating the deep neural network. We have our custom model inherit from `nn.Module`, and we define both the convolutional portion of the network as well as the fully connected layers/classifier portion of the model. We then add these together in a method and flatten the inputs heading into the fully connected layers.

In [5]:
# Create the Model to use for training

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=5, stride=1, padding=2)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=5, stride=1, padding=2)
        self.fc1 = nn.Linear(8*8*16, 300)
        self.fc2 = nn.Linear(300, classes)
        self.conv1_bn = nn.BatchNorm2d(8)
        self.conv2_bn = nn.BatchNorm2d(16)
        self.pool = nn.MaxPool2d(2, 2)

    # Can combine layers and activations if you want:
    # Ex. X = self.conv1_bn(self.conv1(X))

    def forward(self, X):
        X = self.conv1(X)
        X = self.conv1_bn(X)
        X = F.relu(X)
        X = self.pool(X)
        X = self.conv2(X)
        X = self.conv2_bn(X)
        X = F.relu(X)
        X = self.pool(X)
        X = X.view(-1, 8*8*16)
        X = self.fc1(X)
        X = F.relu(X)
        X = F.dropout(X)
        X = self.fc2(X)

        return X

We now instantiate the model in a variable and declare our optimizer, criterion, and learning rate. We'll also specify the device to use, GPU/cuda if available. If you wanted, you could also use a learning rate schedule here to decrement the learning rate when it reaches a pleateau.

In [6]:
# Set number of training epochs and learning rate
# May also want to use learning rate scheduler

epochs = 40
learning_rate = 0.0001

# Declare the model, loss criterion, and chosen optimizer

model = Model().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), learning_rate, weight_decay=0.0001)

Now we can create the function to train the model. We'll set the model to training mode and decrement the learning rate at the start of every epoch. We also need to get the current batch, the image data, and the target from the train loader. We'll set the data and targets as variables, and send them to the device. We then need to zero the gradients before we start training. We run the data through the model and save the output in a variable. We then calculate the loss and carry out backpropogation, and after backprop we can carry out a step of optimization.

Then we'll just evaluate the model. We set the model to eval mode, and create variables to hold the loss and number of instances correctly classified. We then get the image data and targets from the DataLoader for the test data, and like before, run the data through the model. We can do this just by saying "else", since if it isn't in training mode logically it must be in evaluation mode.

Finally, we can print out some metrics and analyze the results of the training.

In [None]:
# Declare variables to hold the training and testing loss
# Which will be updated over the course of the epoch

epoch_loss_train = []
epoch_loss_test = []


# For our chosen number of epochs, train and check performance on test set

for i in range(epochs):

    # Declare variables to hold metrics

    train_loss = 0.0
    train_total = 0
    train_correct = 0

    # For the features and labels in the train loader
    for x, y in train_loader:

        # Declare the inputs and labels, zero the gradients,
        # run the inputs through the model and calculate the loss
        inputs, labels = x.to(device), y.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Get the most likely prediction
        # Do backprop to calculate gradient and optimize
        _, prediction = torch.max(outputs.data, 1)
        loss.backward()
        optimizer.step()

        # Update the values
        train_loss += loss.item()
        train_total += labels.size(0)
        train_correct += (prediction == labels).sum().item()

    else:

        test_loss = 0.0
        test_total = 0
        test_correct = 0

        # Set to no_grad for purposes of evaluation

        with torch.no_grad():
            for test_images, test_targets in test_loader:
                images, targets = test_images.to(device), test_targets.to(device)
                test_outputs = model(images)
                difference = criterion(test_outputs, targets)
                test_loss += difference.item()
                _, test_pred = torch.max(test_outputs.data, 1)
                test_total += targets.size(0)
                test_correct += (test_pred == targets).sum().item()

        epoch_loss_train.append(train_loss/len(train_loader))
        epoch_loss_test.append(test_loss/len(test_loader))

        print("---------")
        print("End Epoch")
        print('Training Accuracy: {:.2f}%'.format(100 * train_correct / train_total))
        print('Testing Accuracy: {:.2f}%'.format(100 * test_correct / test_total))

---------
End Epoch
Training Accuracy: 20.59%
Testing Accuracy: 34.41%
---------
End Epoch
Training Accuracy: 53.65%
Testing Accuracy: 52.00%
---------
End Epoch
Training Accuracy: 70.98%
Testing Accuracy: 61.81%
---------
End Epoch
Training Accuracy: 79.19%
Testing Accuracy: 66.87%
---------
End Epoch
Training Accuracy: 84.81%
Testing Accuracy: 71.50%
---------
End Epoch
Training Accuracy: 88.11%
Testing Accuracy: 73.51%
---------
End Epoch
Training Accuracy: 90.33%
Testing Accuracy: 75.31%
---------
End Epoch
Training Accuracy: 92.10%
Testing Accuracy: 78.92%
