<h1>Training Convolutional Neural Networks to Categorize Clothing with PyTorch</h1>

<b>Part 1: Setting Up and Loading the Data</b>
I’ll be showing you how I built my convolutional neural network in PyTorch. I trained it on the Fashion-MNIST dataset, whose training set contains 60,000 grayscale images of clothes at 28x28 resolution (the test set adds another 10,000). Let’s jump right in!

In [10]:
# Begin with the imports
import torch
import torchvision
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

Then we initialize the hyperparameters: the number of epochs (full passes over the training set), the number of classes (there are 10: t-shirt, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag, and boot), the batch size for mini-batch gradient descent, and the learning rate for the optimizer.

In [11]:
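#Initializing hyperparameters
# number of full passes over the training set
num_epochs = 8
# number of clothing categories to classify into
num_classes = 10

# number of images per mini-batch
batch_size = 100
# step size used by the optimizer
learning_rate = 0.001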
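To see what these values imply for training, here is a small sketch that works out how many optimizer steps they produce. It assumes the 60,000-image training split described earlier; the names num_train_examples, steps_per_epoch, and total_steps are only illustrative.

```python
import math

# Values copied from the hyperparameter cell above
num_epochs = 8
batch_size = 100

# Assumption: the Fashion-MNIST training split holds 60,000 images
num_train_examples = 60_000

# One optimizer step per mini-batch; ceil would cover a final partial batch
steps_per_epoch = math.ceil(num_train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch)  # 600
print(total_steps)      # 4800
```

The 600 steps per epoch match the "Step [100/600]" counter in the training log later on.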

Time to retrieve the dataset! We load both the training set and the test set of Fashion-MNIST and set the transform parameter to convert the images into tensors (transforms.ToTensor()). We also normalize them by setting the mean and standard deviation (transforms.Normalize((0.5,), (0.5,))). The images have a single grayscale channel, so we pass one mean and one standard deviation rather than three.

In [3]:
#Download and load dataset
# Fashion-MNIST has one channel, so Normalize takes one-element tuples
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])

train_dataset = datasets.FashionMNIST(root='./data', 
                            train=True, 
                            download=True,
                            transform=transform)

test_dataset = datasets.FashionMNIST(root='./data', 
                           train=False, 
                           download=True,
                           transform=transform)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Processing...
Done!
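As a rough illustration of what the normalization step does, the sketch below applies the same formula, (x - mean) / std, by hand to a few pixel values. It assumes a mean and standard deviation of 0.5 and inputs already scaled to [0, 1] by ToTensor; the tensor pixels is made up for the example.

```python
import torch

# Assumption: mean and std of 0.5, applied to values in [0, 1]
mean, std = 0.5, 0.5

# Made-up pixel values: darkest, middle gray, and brightest
pixels = torch.tensor([0.0, 0.5, 1.0])

# The per-channel formula that a normalization transform computes
normalized = (pixels - mean) / std

print(normalized)  # values -1, 0, and 1
```

Under these assumptions the pixel range [0, 1] is mapped to [-1, 1], which centers the inputs around zero.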


The last step in this stage is to wrap each dataset in a data loader, which serves the data in mini-batches. We shuffle the training set so that each epoch sees the examples in a different order and the batches aren’t biased by the order in which the data is stored.

In [4]:
#Loading dataset into dataloader
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 
                                           batch_size=batch_size, 
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 
                                          batch_size=batch_size, 
                                          shuffle=False)
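To get a feel for what a data loader yields, here is a minimal sketch that runs one over random stand-in data instead of the real dataset. The names fake_images, fake_labels, and fake_dataset are hypothetical, and 250 samples are used only so that the last batch comes out smaller.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data: 250 random "images" shaped like Fashion-MNIST
fake_images = torch.randn(250, 1, 28, 28)
fake_labels = torch.randint(0, 10, (250,))
fake_dataset = TensorDataset(fake_images, fake_labels)

loader = DataLoader(fake_dataset, batch_size=100, shuffle=True)

# 250 samples at batch size 100 gives batches of 100, 100, and 50
batch_sizes = [images.shape[0] for images, labels in loader]
print(batch_sizes)

# Each batch is a pair: images (N, 1, 28, 28) and labels (N,)
images, labels = next(iter(loader))
print(images.shape, labels.shape)
```

With the real training set, 60,000 divides evenly by 100, so every batch holds exactly 100 images.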

<b>Part 2: Constructing the Convolutional Neural Network</b>

We define our neural network by subclassing nn.Module, the base class for all neural network modules in PyTorch.

Our convolutional network will have two convolution layers, each followed by a non-linear function (ReLU) and a max-pooling layer, and finally a fully connected layer that outputs one score per class. The softmax that turns these scores into probabilities is applied inside the cross-entropy loss, so the network itself doesn’t need a softmax layer.


[Figure: Layers of the Convolutional Neural Network]
We also apply dropout before the fully connected layer as regularization to reduce overfitting. Thus, we can initialize the layers in the network as follows:

In [5]:
# define your cnn model
class CNNModel(nn.Module):
    def __init__(self):
        super(CNNModel, self).__init__()
        
        #Convolution 1
        self.cnn1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=5, stride=1, padding=2)
        self.relu1 = nn.ReLU()
        
        #Max pool 1
        self.maxpool1 = nn.MaxPool2d(kernel_size=2)
        
        #Convolution 2
        self.cnn2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=2)
        self.relu2 = nn.ReLU()
        
        #Max pool 2
        self.maxpool2 = nn.MaxPool2d(kernel_size=2)
        
        #Dropout for regularization
        self.dropout = nn.Dropout(p=0.5)
        
        #Fully Connected 1
        self.fc1 = nn.Linear(32*7*7, 10)
        
    # In the forward function, we apply each layer to the input in order.
    # Before the dropout and the fully connected layer, we flatten each
    # example's feature maps into a vector (the first dimension stays the batch size).
    def forward(self, x):
        #Convolution 1
        out = self.cnn1(x)
        out = self.relu1(out)
        
        #Max pool 1
        out = self.maxpool1(out)
        
        #Convolution 2
        out = self.cnn2(out)
        out = self.relu2(out)
        
        #Max pool 2
        out = self.maxpool2(out)
        
        #Flatten to (batch_size, 32*7*7)
        out = out.view(out.size(0), -1)
        
        #Dropout
        out = self.dropout(out)
        
        #Fully connected 1
        out = self.fc1(out)
        return out
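The input size of the fully connected layer, 32*7*7, follows from how each layer changes the spatial size. Here is a minimal sketch of that arithmetic, using the standard output-size formula floor((n + 2p - k) / s) + 1; the helper conv_output_size is a hypothetical name introduced for illustration, and the pooling stride of 2 reflects nn.MaxPool2d defaulting its stride to the kernel size.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 28                                        # input images are 28x28
size = conv_output_size(size, 5, stride=1, padding=2)  # convolution 1 keeps 28
size = conv_output_size(size, 2, stride=2)             # max pool 1 halves to 14
size = conv_output_size(size, 5, stride=1, padding=2)  # convolution 2 keeps 14
size = conv_output_size(size, 2, stride=2)             # max pool 2 halves to 7

# 32 output channels from the second convolution, each 7x7
flattened = 32 * size * size

print(size)       # 7
print(flattened)  # 1568
```

The padding of 2 on each 5x5 convolution is what keeps the spatial size unchanged, so only the pooling layers shrink the image.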

<b>Part 3: Creating Instances</b>

Now that we’ve created a class for our convolutional neural network, we need to create an instance of it. The class only defines the layers and the forward propagation; there is no actual neural net until we instantiate it.

In [12]:
#Create instance of model
model = CNNModel()

'''We use cross-entropy loss to measure how far the output scores of the network are from the true labels (it applies softmax internally).'''
#Create instance of loss
criterion = nn.CrossEntropyLoss()

'''Finally, we create the optimizer (Adam), which updates the parameters of the model using the gradients.'''
#Create instance of optimizer (Adam)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
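To make the role of the loss concrete, here is a small sketch showing that nn.CrossEntropyLoss takes raw scores and combines a log-softmax with a negative log-likelihood. The logits and targets are made-up values, and loss_manual is just an illustrative name.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up raw scores for 4 examples over 10 classes, plus their true labels
logits = torch.randn(4, 10)
targets = torch.tensor([3, 0, 9, 1])

# Built-in loss: expects raw scores, not probabilities
loss_builtin = nn.CrossEntropyLoss()(logits, targets)

# Manual version: log-softmax, pick the true-class entry, negate, average
log_probs = F.log_softmax(logits, dim=1)
loss_manual = -log_probs[torch.arange(4), targets].mean()

print(torch.allclose(loss_builtin, loss_manual))  # True
```

This is why the model's forward function can return the fully connected layer's output directly.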

<b>Part 4: Training the Model</b>

After creating the network and the instances, we’re ready to train it using the dataset! We iterate through the dataset and for each mini-batch, we perform forward propagation, calculate the cross-entropy loss, perform backward propagation and use the gradients to update the parameters.

In [7]:
#Train the model
model.train()  # make sure dropout is active during training
iteration = 0  # counts optimizer steps across all epochs
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        #Tensors from the data loader can be used directly (no Variable wrapper needed)
        
        #Clear the gradients from the previous step
        optimizer.zero_grad()
        
        #Forward propagation
        outputs = model(images)
        
        #Calculating cross-entropy loss (softmax is applied inside the criterion)
        loss = criterion(outputs, labels)
        
        #Backward propagation
        loss.backward()
        
        #Updating parameters
        optimizer.step()
        
        iteration += 1
        
        #Total number of labels in this mini-batch
        total = labels.size(0)
        
        #Obtaining predictions from max value
        _, predicted = torch.max(outputs.detach(), 1)
        
        #Calculate the number of correct answers in this mini-batch
        correct = (predicted == labels).sum().item()
        
        #Print loss and mini-batch accuracy every 100 steps
        if (i + 1) % 100 == 0:
            print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}, Accuracy: {:.2f}%'
                  .format(epoch + 1, num_epochs, i + 1, len(train_loader), loss.item(),
                          (correct / total) * 100))

Epoch [1/8], Step [100/600], Loss: 0.6973, Accuracy: 76.00%
Epoch [1/8], Step [200/600], Loss: 0.4977, Accuracy: 79.00%
Epoch [1/8], Step [300/600], Loss: 0.5466, Accuracy: 81.00%
Epoch [1/8], Step [400/600], Loss: 0.3835, Accuracy: 88.00%
Epoch [1/8], Step [500/600], Loss: 0.4047, Accuracy: 86.00%
Epoch [1/8], Step [600/600], Loss: 0.3863, Accuracy: 88.00%
Epoch [2/8], Step [100/600], Loss: 0.3510, Accuracy: 86.00%
Epoch [2/8], Step [200/600], Loss: 0.3373, Accuracy: 87.00%
Epoch [2/8], Step [300/600], Loss: 0.3519, Accuracy: 85.00%
Epoch [2/8], Step [400/600], Loss: 0.2520, Accuracy: 90.00%
Epoch [2/8], Step [500/600], Loss: 0.3548, Accuracy: 87.00%
Epoch [2/8], Step [600/600], Loss: 0.2723, Accuracy: 90.00%
Epoch [3/8], Step [100/600], Loss: 0.4901, Accuracy: 82.00%
Epoch [3/8], Step [200/600], Loss: 0.3767, Accuracy: 82.00%
Epoch [3/8], Step [300/600], Loss: 0.5188, Accuracy: 85.00%
Epoch [3/8], Step [400/600], Loss: 0.4118, Accuracy: 87.00%
Epoch [3/8], Step [500/600], Loss: 0.310
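The accuracy figures in this log are computed per mini-batch by taking the highest-scoring class as the prediction. Here is a toy sketch of that step, with made-up scores for 4 examples and 3 classes instead of real network outputs.

```python
import torch

# Made-up scores: 4 examples, 3 classes
logits = torch.tensor([[2.0, 0.1, -1.0],
                       [0.3, 0.2, 1.5],
                       [0.0, 3.0, 0.5],
                       [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 2, 1, 2])

# torch.max along dim 1 returns (max values, indices); the indices are the predictions
_, predicted = torch.max(logits, 1)

correct = (predicted == labels).sum().item()
accuracy = 100 * correct / labels.size(0)

print(predicted.tolist())  # [0, 2, 1, 0]
print(accuracy)            # 75.0
```

Since each logged value covers a single batch of 100 images, the numbers jump around from one line to the next.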

<b>Part 5: Testing the Model</b>

Our model is trained, so all that’s left to do is test it! We run the neural network on the test dataset, compare our outputs to the correct labels, and determine our overall accuracy.

In [8]:
#Testing the model
model.eval()  # switch dropout off so predictions are deterministic
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(images)
        #The predicted class is the index of the highest score
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct / total))

Test Accuracy of the model on the 10000 test images: 89.46 %
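Because the network contains a dropout layer, it behaves differently in training mode and in evaluation mode. Here is a minimal sketch of that difference on a standalone nn.Dropout layer, assuming the same p=0.5 as in the model; the input x is made up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

dropout = nn.Dropout(p=0.5)
x = torch.ones(1, 1000)

# Training mode: elements are zeroed at random, survivors are scaled by 1 / (1 - p)
dropout.train()
train_out = dropout(x)

# Evaluation mode: dropout is a no-op
dropout.eval()
eval_out = dropout(x)

print(sorted(train_out.unique().tolist()))  # [0.0, 2.0]
print(torch.equal(eval_out, x))             # True
```

Calling model.eval() before testing puts every dropout layer in the second mode, and model.train() switches back.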


The neural network should achieve an accuracy between 88% and 90%, which is reasonably close to the 91.6% benchmark for 2-layer CNNs on the Fashion-MNIST dataset. Note that Fashion-MNIST is much harder to classify than the original MNIST digit dataset. To achieve a higher accuracy, we could add more layers, preprocess the data more carefully, or train for more epochs.