![clothing_classification](clothing_classification.png)


Fashion Forward is a new AI-based e-commerce clothing retailer.
The company wants to use image classification to categorize new product listings automatically, so that customers can find what they are looking for more easily. The same classifier will also help with inventory management by sorting items quickly.

As a data scientist tasked with implementing a garment classifier, your primary objective is to develop a machine learning model capable of accurately categorizing images of clothing items into distinct garment types such as shirts, trousers, shoes, etc.

In [1]:
# Run the cells below first

In [2]:
!pip install torchmetrics
!pip install torchvision

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable


This code imports necessary libraries:

numpy for numerical operations.
torch and its submodules for building and training neural networks.
torchmetrics for performance metrics.
torchvision and its submodules for loading datasets and transforming images (imported in the next cell, together with the dataset loading).

In [3]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from torchmetrics import Accuracy, Precision, Recall

This code loads the FashionMNIST dataset:

train_data contains training images and labels.
test_data contains test images and labels.
transform=transforms.ToTensor() converts each PIL image to a float tensor of shape (1, 28, 28) and scales the pixel values from [0, 255] to [0.0, 1.0].

In [4]:
# Load datasets
from torchvision import datasets
import torchvision.transforms as transforms

train_data = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_data = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
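
To make the scaling step of ToTensor concrete, here is a plain-Python sketch on a tiny made-up 2x2 image. The pixel values and the helper name scale_pixels are hypothetical. The real transform also returns a torch tensor with a leading channel dimension, which this sketch only imitates with nested lists.

```python
def scale_pixels(image_uint8):
    """Scale 8-bit pixel values (0-255) to floats in [0.0, 1.0]."""
    return [[pixel / 255 for pixel in row] for row in image_uint8]

# Hypothetical 2x2 grayscale image with 8-bit intensities
tiny_image = [[0, 51], [204, 255]]

# Wrap in an outer list to imitate the channel dimension: shape (1, 2, 2)
scaled = [scale_pixels(tiny_image)]
print(scaled)
```

Black (0) maps to 0.0 and white (255) maps to 1.0, so every input the network sees lies in the same small range.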

This code extracts dataset information:

classes is a list of class names.
num_classes is the number of classes.
num_input_channels is 1, since FashionMNIST images are grayscale: each pixel holds a single intensity value (a shade of gray) rather than separate color channels.
num_output_channels is the number of output channels (feature maps) produced by the convolutional layer, set to 16 here.
image_size is the height of the images in pixels. The images are square, so it is also the width. FashionMNIST images are 28x28 pixels, so image_size is 28.

In [5]:
# Get the number of classes
classes = train_data.classes
num_classes = len(train_data.classes)

# Define some relevant variables
num_input_channels = 1
num_output_channels = 16
image_size = train_data[0][0].shape[1]


The `__init__` method initializes the layers: a convolutional layer, a ReLU activation, a max-pooling layer, a flattening layer, and a fully connected layer.
The forward method defines the forward pass: the input flows through these layers in that order, and the output is one raw score (logit) per class.

In [6]:
# Define CNN
class MultiClassImageClassifier(nn.Module):
    
    # Define the init method
    def __init__(self, num_classes):
        super(MultiClassImageClassifier, self).__init__()
        self.conv1 = nn.Conv2d(num_input_channels, num_output_channels, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()

        # Create a fully connected layer
        self.fc = nn.Linear(num_output_channels * (image_size//2)**2, num_classes)
        
    def forward(self, x):
        # Pass inputs through each layer
        x = self.conv1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.flatten(x)
        x = self.fc(x)
        return x
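
The input size of the fully connected layer, num_output_channels * (image_size//2)**2, follows from how each layer changes the spatial size. The sketch below reproduces that arithmetic in plain Python, using the standard output-size formula for convolution and pooling layers. The helper name conv_output_size is hypothetical, and the constants are copied by hand from the cells above.

```python
def conv_output_size(size, kernel_size, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

image_size = 28            # FashionMNIST images are 28x28
num_input_channels = 1
num_output_channels = 16
num_classes = 10

# conv1: kernel 3, stride 1, padding 1 keeps the spatial size unchanged
after_conv = conv_output_size(image_size, kernel_size=3, stride=1, padding=1)
# maxpool: kernel 2, stride 2 halves the spatial size
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2, padding=0)
# Flattening yields channels * height * width features
fc_in_features = num_output_channels * after_pool ** 2

# Parameter counts: weights plus biases
conv_params = num_output_channels * num_input_channels * 3 * 3 + num_output_channels
fc_params = fc_in_features * num_classes + num_classes

print(after_conv, after_pool, fc_in_features)
print(conv_params, fc_params, conv_params + fc_params)
```

Under these settings the feature map goes from 28x28 to 14x14, so the linear layer receives 16 * 14 * 14 = 3136 features, and almost all of the model's parameters sit in that final layer.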


This code creates a DataLoader for the training data:

batch_size=10 specifies the number of samples per batch.
shuffle=True shuffles the data at every epoch.

In [7]:
     
# Define the training set DataLoader
dataloader_train = DataLoader(
    train_data,
    batch_size=10,
    shuffle=True,
)
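
As a rough mental model of what batch_size and shuffle do, the sketch below batches sample indices in plain Python for a single epoch. It is not how DataLoader is implemented. The function name iterate_batches and the seed argument are hypothetical, and 60000 is the size of the FashionMNIST training set.

```python
import random

def iterate_batches(num_samples, batch_size, shuffle, seed=None):
    """Yield lists of sample indices, one list per batch."""
    indices = list(range(num_samples))
    if shuffle:
        # Visit the samples in a random order
        random.Random(seed).shuffle(indices)
    for start in range(0, num_samples, batch_size):
        yield indices[start:start + batch_size]

batches = list(iterate_batches(num_samples=60000, batch_size=10, shuffle=True, seed=0))
print(len(batches))       # number of batches in one epoch
print(len(batches[0]))    # number of indices in the first batch
```

With 60000 training images and 10 images per batch, one epoch consists of 6000 optimizer steps, and every image is visited exactly once.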

This code defines the training loop:

criterion is the loss function.
The loop iterates over epochs and batches.
For each batch, it zeroes gradients, computes outputs, calculates loss, backpropagates, and updates weights.
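It prints one loss value per epoch: the sum of the per-batch mean losses divided by the number of samples processed. Because each batch loss is already averaged over the batch, this value is the mean per-sample loss divided by the batch size (10 here).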

In [8]:

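# Define training function
def train_model(optimizer, net, num_epochs):
    # CrossEntropyLoss takes raw logits and integer class labels
    criterion = nn.CrossEntropyLoss()
    for epoch in range(num_epochs):
        running_loss = 0
        num_processed = 0
        for features, labels in dataloader_train:
            # Clear gradients left over from the previous step
            optimizer.zero_grad()
            outputs = net(features)
            loss = criterion(outputs, labels)
            # Backpropagate and update the weights
            loss.backward()
            optimizer.step()
            # loss.item() is the mean loss over the current batch
            running_loss += loss.item()
            num_processed += len(labels)
        # Sum of batch means divided by the sample count
        print(f'epoch {epoch}, loss: {running_loss / num_processed}')
        
    # Mean loss per batch for the final epoch (computed but not returned)
    train_loss = running_loss / len(dataloader_train)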
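
The criterion used above, nn.CrossEntropyLoss, applies a log-softmax to the raw logits and takes the negative log-probability of the true class, which is why the model's forward method returns logits without a softmax. The plain-Python sketch below shows that computation for a single sample. The function name cross_entropy and the example logits are hypothetical.

```python
import math

def cross_entropy(logits, label):
    """Negative log of the softmax probability assigned to the true class."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[label]

# Equal logits for all 10 classes: the model is guessing uniformly
uniform_loss = cross_entropy([0.0] * 10, label=3)
# A large logit on the true class: the model is confident and correct
confident_loss = cross_entropy([0.0, 0.0, 0.0, 8.0] + [0.0] * 6, label=3)

print(uniform_loss)
print(confident_loss)
```

A uniform guess over 10 classes gives a loss of ln(10), about 2.30, which is a useful reference point: a per-sample loss well below that means the model has learned something.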

In [9]:

# Train for 1 epoch
net = MultiClassImageClassifier(num_classes)
optimizer = optim.Adam(net.parameters(), lr=0.001)

train_model(
    optimizer=optimizer,
    net=net,
    num_epochs=1,
)


epoch 0, loss: 0.042165075822842
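
The printed value is the sum of batch-mean losses divided by the number of samples, so it is smaller than the mean per-sample loss by a factor of the batch size. The sketch below checks that relationship on made-up per-sample losses and then rescales the value printed above. It assumes every batch holds exactly 10 samples, which holds here because 60000 is divisible by 10.

```python
batch_size = 10
# Hypothetical per-sample losses for two batches of 10 samples each
batches = [[0.5] * batch_size, [0.25] * batch_size]

running_loss = 0.0
num_processed = 0
for sample_losses in batches:
    # This batch mean plays the role of loss.item() in the training loop
    batch_mean = sum(sample_losses) / len(sample_losses)
    running_loss += batch_mean
    num_processed += len(sample_losses)

printed_value = running_loss / num_processed
mean_per_sample = sum(sum(b) for b in batches) / num_processed
print(printed_value, mean_per_sample)

# Rescale the value reported by the training run above
reported = 0.042165075822842
print(reported * batch_size)
```

Read this way, the reported 0.0422 corresponds to a mean cross-entropy of roughly 0.42 per sample after one epoch.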


In [None]:
# Test the model on the test set
              
# Define the test set DataLoader
dataloader_test = DataLoader(
    test_data,
    batch_size=10,
    shuffle=False,
)
# Define the metrics
accuracy_metric = Accuracy(task='multiclass', num_classes=num_classes)
precision_metric = Precision(task='multiclass', num_classes=num_classes, average=None)
recall_metric = Recall(task='multiclass', num_classes=num_classes, average=None)
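
With average=None, the precision and recall metrics report one value per class instead of a single averaged number. The plain-Python sketch below shows what those per-class values mean on a small made-up example with three classes. The function name and the data are hypothetical, and returning 0.0 for a class with no predictions is a convention chosen for this sketch.

```python
def per_class_precision_recall(predictions, labels, num_classes):
    """Return (precision, recall) lists with one entry per class."""
    true_positive = [0] * num_classes
    predicted = [0] * num_classes   # how often each class was predicted
    actual = [0] * num_classes      # how often each class is the true label
    for pred, label in zip(predictions, labels):
        predicted[pred] += 1
        actual[label] += 1
        if pred == label:
            true_positive[pred] += 1
    precision = [tp / n if n else 0.0 for tp, n in zip(true_positive, predicted)]
    recall = [tp / n if n else 0.0 for tp, n in zip(true_positive, actual)]
    return precision, recall

# Hypothetical labels and predictions for six samples
labels      = [0, 0, 1, 1, 2, 2]
predictions = [0, 1, 1, 1, 2, 0]

precision, recall = per_class_precision_recall(predictions, labels, num_classes=3)
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
print(precision)
print(recall)
print(accuracy)
```

Precision for a class asks "of the items predicted as this class, how many were right?", while recall asks "of the items that truly belong to this class, how many were found?".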


In [10]:

# Run model on test set
net.eval()
predictions = []
# Gradients are not needed for evaluation
with torch.no_grad():
    for features, labels in dataloader_test:
        # Batches already have shape (batch, 1, image_size, image_size)
        output = net(features)
        # The predicted class is the index of the largest logit
        cat = torch.argmax(output, dim=-1)
        predictions.extend(cat.tolist())
        # Update the running state of each metric
        accuracy_metric(cat, labels)
        precision_metric(cat, labels)
        recall_metric(cat, labels)

In [11]:
# Compute the metrics
accuracy = accuracy_metric.compute().item()
precision = precision_metric.compute().tolist()
recall = recall_metric.compute().tolist()
print('Accuracy:', accuracy)
print('Precision (per class):', precision)
print('Recall (per class):', recall)

Accuracy: 0.8790000081062317
Precision (per class): [0.8530954718589783, 0.9876922965049744, 0.8258196711540222, 0.7865448594093323, 0.8270270228385925, 0.9793174862861633, 0.6869834661483765, 0.9278151988983154, 0.9727822542190552, 0.9540459513664246]
Recall (per class): [0.8130000233650208, 0.9629999995231628, 0.8059999942779541, 0.9470000267028809, 0.7649999856948853, 0.9470000267028809, 0.6650000214576721, 0.9639999866485596, 0.9649999737739563, 0.9549999833106995]
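
A quick consistency check on the numbers above: when every class has the same number of test samples, as in the balanced FashionMNIST test set, the mean of the per-class recalls equals the overall accuracy. The sketch below verifies this on the printed values and looks up the weakest class. The class names are written out by hand in the order torchvision uses for FashionMNIST, which is an assumption worth confirming against the classes variable.

```python
# Assumed to match train_data.classes (torchvision's FashionMNIST label order)
classes = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

# Values copied from the output above
accuracy = 0.8790000081062317
precision = [0.8530954718589783, 0.9876922965049744, 0.8258196711540222,
             0.7865448594093323, 0.8270270228385925, 0.9793174862861633,
             0.6869834661483765, 0.9278151988983154, 0.9727822542190552,
             0.9540459513664246]
recall = [0.8130000233650208, 0.9629999995231628, 0.8059999942779541,
          0.9470000267028809, 0.7649999856948853, 0.9470000267028809,
          0.6650000214576721, 0.9639999866485596, 0.9649999737739563,
          0.9549999833106995]

# Unweighted mean of the per-class recalls
macro_recall = sum(recall) / len(recall)
# Index of the class with the lowest recall
weakest = min(range(len(recall)), key=recall.__getitem__)

print(round(macro_recall, 4), round(accuracy, 4))
print(classes[weakest], round(precision[weakest], 3), round(recall[weakest], 3))
```

Under that label order, the weakest class on both metrics is index 6, "Shirt", which is plausibly confused with visually similar upper-body garments such as T-shirts, pullovers, and coats.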
