In this project, we will build a convolutional neural network (hereafter CNN) to classify the fashion images in the [Fashion MNIST dataset](https://www.kaggle.com/zalando-research/fashionmnist). The dataset consists of 70,000 greyscale images of 28 x 28 pixels. Each image belongs to one of 10 classes, identified by the following label numbers:

| Label number | Label |
| --- | --- |
| 0 | T-shirt/top |
| 1 | Trouser |
| 2 | Pullover |
| 3 | Dress |
| 4 | Coat |
| 5 | Sandal |
| 6 | Shirt |
| 7 | Sneaker |
| 8 | Bag |
| 9 | Ankle boot |
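
When inspecting predictions later, it can help to have the table above as a lookup. The following is a minimal sketch; the names `label_names` and `label_to_name` are illustrative and are not part of the notebook.

```python
# Mapping from Fashion MNIST label numbers to class names (copied from the table above)
label_names = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot",
}


def label_to_name(label):
    """Return the class name for an integer label between 0 and 9."""
    return label_names[int(label)]


print(label_to_name(9))
```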

We will implement a CNN that predicts the label of a given image. Let us start by importing the packages.

In [3]:
import torch
from torch import nn
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
from torchvision import datasets  # Needed for datasets.FashionMNIST below
from torchvision.transforms import ToTensor, Lambda, Compose
import matplotlib.pyplot as plt
import matplotlib.font_manager
import numpy as np
import sklearn as skl
#from torchviz import make_dot
import torch.optim as optim

As usual in machine learning, we need to define the hyperparameters of the model. We collect them in one place now, both to keep the code organized and because we will need the batch size shortly.

In [18]:
# Model parameters
p_dropout = 0.1   # Dropout probability
batch_size = 1000   # Mini-batch size
lr = 1e-3   # Learning rate
n_epochs = 200   # Number of epochs
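
To get a feeling for what these values imply, the sketch below computes how many mini-batches one epoch contains and how many parameter updates the full run would perform. It assumes the standard Fashion MNIST split of 60,000 training and 10,000 test images; the variable names `n_train`, `batches_per_epoch` and `total_updates` are illustrative.

```python
import math

n_train = 60_000   # Training images in the standard split (the remaining 10,000 are test images)
batch_size = 1000  # Same value as in the hyperparameter cell
n_epochs = 200     # Same value as in the hyperparameter cell

# One optimizer step is taken per mini-batch
batches_per_epoch = math.ceil(n_train / batch_size)
total_updates = batches_per_epoch * n_epochs

print(batches_per_epoch, total_updates)
```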

In [2]:
# Custom Dataset class that wraps the torchvision dataset
class CustomImageDataset(Dataset):
    def __init__(self, dataset):
        super().__init__()
        self.dataset = dataset

    # Number of samples in the wrapped dataset
    def __len__(self):
        return len(self.dataset)

    # Return the i-th (image, label) pair
    def __getitem__(self, i):
        image, label = self.dataset[i]
        # Keep the integer class label (0-9) as the target, since the goal is to classify the image
        return image, label
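
The way such a wrapper is indexed can be tried out without downloading anything. The sketch below is an illustration only: it uses random tensors in place of the real images, and `WrappedDataset`, `fake_images` and `fake_labels` are hypothetical names.

```python
import torch
from torch.utils.data import Dataset, TensorDataset


class WrappedDataset(Dataset):
    """Hypothetical wrapper that forwards (image, label) pairs unchanged."""

    def __init__(self, dataset):
        super().__init__()
        self.dataset = dataset

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, i):
        image, label = self.dataset[i]
        return image, label


# Synthetic stand-ins for eight 1 x 28 x 28 greyscale images and their labels
fake_images = torch.rand(8, 1, 28, 28)
fake_labels = torch.randint(0, 10, (8,))

wrapped = WrappedDataset(TensorDataset(fake_images, fake_labels))
image, label = wrapped[0]
print(len(wrapped), tuple(image.shape), int(label))
```

Any object that implements `__len__` and `__getitem__` in this way can be passed to a `DataLoader`.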

In [12]:
# Download training data
train_d = datasets.FashionMNIST(
    root='Dataset',
    train=True,
    download=True,
    transform=ToTensor(),
)

train_data = train_d
train_data = CustomImageDataset(train_data)

# Download test data
test_d = datasets.FashionMNIST(
    root="Dataset",
    train=False,
    download=True,
    transform=ToTensor(),
)

test_data = test_d
test_data = CustomImageDataset(test_data)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to Dataset\FashionMNIST\raw\train-images-idx3-ubyte.gz
Extracting Dataset\FashionMNIST\raw\train-images-idx3-ubyte.gz to Dataset\FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to Dataset\FashionMNIST\raw\train-labels-idx1-ubyte.gz
Extracting Dataset\FashionMNIST\raw\train-labels-idx1-ubyte.gz to Dataset\FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to Dataset\FashionMNIST\raw\t10k-images-idx3-ubyte.gz
Extracting Dataset\FashionMNIST\raw\t10k-images-idx3-ubyte.gz to Dataset\FashionMNIST\raw
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to Dataset\FashionMNIST\raw\t10k-labels-idx1-ubyte.gz
Extracting Dataset\FashionMNIST\raw\t10k-labels-idx1-ubyte.gz to Dataset\FashionMNIST\raw

In [15]:
# Check some info about the data downloaded
train_data.__class__.__mro__

(__main__.CustomImageDataset,
 torch.utils.data.dataset.Dataset,
 typing.Generic,
 object)

In [16]:
# Check some info about the data downloaded
test_data.__class__.__mro__

(__main__.CustomImageDataset,
 torch.utils.data.dataset.Dataset,
 typing.Generic,
 object)

Once the Dataset objects are loaded, we instantiate the DataLoader objects that feed the inputs to the network. A DataLoader also splits the data into mini-batches (and shuffles it, if requested), so the network can be trained in batches.

In [20]:
train_dl = DataLoader(train_data, batch_size=batch_size, shuffle=True)
test_dl = DataLoader(test_data, batch_size=batch_size, shuffle=False)
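
The batching behaviour can be checked on synthetic data. The sketch below is an illustration, not part of the notebook: it builds a hypothetical `fake_data` set of 2,500 random images so that the last batch is smaller than `batch_size`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 2,500 random 1 x 28 x 28 "images" with random labels (illustrative data only)
fake_data = TensorDataset(torch.rand(2500, 1, 28, 28), torch.randint(0, 10, (2500,)))
loader = DataLoader(fake_data, batch_size=1000, shuffle=True)

# Iterating over the loader yields (images, labels) mini-batches
batch_sizes = [images.shape[0] for images, labels in loader]
print(len(loader), batch_sizes)
```

With the default `drop_last=False`, the final, incomplete batch is kept rather than discarded.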

We can check whether CUDA is available for training. CUDA lets PyTorch run tensor computations on an NVIDIA GPU, which usually speeds up training considerably; if no such GPU is available, we fall back to the CPU.

In [21]:
# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print("Using {} device".format(device))

Using cpu device
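
Both the model and every batch must live on the same device before a forward pass. The following sketch shows the pattern on a single hypothetical linear layer and a random batch; it runs on the CPU when no GPU is present.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Move the layer's parameters to the selected device
layer = torch.nn.Linear(28 * 28, 10).to(device)

# Move the input batch to the same device before using it
x = torch.rand(4, 1, 28, 28).to(device)
out = layer(x.flatten(start_dim=1))

print(tuple(out.shape), out.device.type)
```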


Now that we have loaded the dataset and wrapped it in appropriate DataLoader objects, it is time to define the model we will train to predict the label of the input image. As said before, the goal is a CNN. The version defined below is a first baseline built from fully connected (`nn.Linear`) layers and contains no convolutional layers.

In [None]:
# Model definition
# Layer sizes
n_inputs = 28 * 28   # Each 28 x 28 image is flattened into 784 features
n_hidden1 = 128   # Hidden layer width (assumed value, tune as needed)
n_outputs = 10   # One output score per class

class Model(nn.Module):
    # Define model elements
    def __init__(self):
        # The super() builtin returns a proxy object (temporary object of the superclass)
        # that allows us to access methods of the base class.
        super().__init__()
        # Flatten each input image into a vector
        self.flatten = nn.Flatten()
        # Sequence of transformations implemented by the layers of the network.
        # Note that these are fully connected layers, not convolutional ones.
        self.cnn = nn.Sequential(
            nn.Linear(n_inputs, n_hidden1),
            nn.Dropout(p_dropout),
            nn.ReLU(),
            nn.Linear(n_hidden1, n_outputs)
        )

    # Transform inputs into outputs (one raw score per class) using the layers above
    def forward(self, X):
        X = self.flatten(X)
        output = self.cnn(X)
        return output

# Now we can create a model and send it at once to the device
model = Model().to(device)
# We can also inspect its parameters using its state_dict
print(model.state_dict())
print(model)
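
Since the stated goal is a convolutional network, the sketch below shows what a small convolutional variant could look like. It is an illustration under assumptions rather than the notebook's architecture: the class name `SmallCNN`, the channel counts (16 and 32), the kernel size and the pooling layout are all choices made for this example.

```python
import torch
from torch import nn


class SmallCNN(nn.Module):
    """Hypothetical two-block CNN for 1 x 28 x 28 inputs and 10 classes."""

    def __init__(self, n_outputs=10, p_dropout=0.1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1 x 28 x 28 -> 16 x 28 x 28
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16 x 14 x 14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32 x 14 x 14
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32 x 7 x 7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                 # -> 1568 features
            nn.Dropout(p_dropout),
            nn.Linear(32 * 7 * 7, n_outputs),             # One raw score per class
        )

    def forward(self, X):
        return self.classifier(self.features(X))


cnn = SmallCNN()
dummy = torch.rand(5, 1, 28, 28)   # Random batch standing in for five images
logits = cnn(dummy)
n_params = sum(p.numel() for p in cnn.parameters())
print(tuple(logits.shape), n_params)
```

Unlike the fully connected model, this variant takes the images in their 1 x 28 x 28 layout and only flattens them after the convolutional blocks, so the spatial structure is available to the first layers.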
