# Introduction


In this project, we implement a Convolutional Neural Network (CNN) in PyTorch to classify images from the FashionMNIST dataset. FashionMNIST is a widely used dataset of 70,000 grayscale images (60,000 for training and 10,000 for testing), each 28x28 pixels, divided into 10 categories of fashion items (e.g., shirts, sneakers, bags). The objective is to train a CNN that assigns each image to its category.

The model’s architecture consists of three convolutional layers, each with ReLU activation and max-pooling, followed by fully connected layers with dropout for regularization. After training, we evaluate the model on the test set using accuracy, precision, and recall.

In [1]:
!pip install torchmetrics
!pip install torchvision



In [2]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from torchmetrics import Accuracy, Precision, Recall

In [3]:
# Load datasets
from torchvision import datasets
import torchvision.transforms as transforms

train_data = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_data = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())

# Model Architecture

#### The CNN model consists of the following components:

#### 1. Convolutional Layers:
   Three convolutional layers with 32, 64, and 128 filters, respectively (3x3 kernels, padding 1), each followed by a ReLU activation and 2x2 max-pooling for downsampling. Each pooling step halves the spatial size, taking the feature maps from 28x28 to 14x14, 7x7, and finally 3x3. These layers allow the model to extract hierarchical features from the input images.
#### 2. Fully Connected Layers:
   After the convolutional layers, the output is flattened to 128 x 3 x 3 = 1152 values and passed through two fully connected layers with 256 and 128 neurons, respectively. Dropout with probability 0.25 is applied after each of these two layers to reduce overfitting.
#### 3. Output Layer:
   The final output layer consists of 10 neurons (one for each class in FashionMNIST) and returns raw logits. No softmax layer is included in the model, because CrossEntropyLoss applies log-softmax internally during loss computation. At prediction time, the class with the largest logit is selected.
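The input size of the first fully connected layer can be checked by hand. The following is a minimal sketch, assuming the 3x3 kernels with padding 1 and the 2x2 max-pooling used in this notebook's model. The helper names `conv_out` and `trace_spatial_size` are illustrative and are not part of PyTorch.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_size(size, num_blocks=3):
    """Apply Conv(3x3, padding=1) then MaxPool(2x2, stride=2) num_blocks times."""
    sizes = [size]
    for _ in range(num_blocks):
        size = conv_out(size, kernel=3, stride=1, padding=1)  # convolution keeps the size
        size = conv_out(size, kernel=2, stride=2)             # pooling halves it (floor)
        sizes.append(size)
    return sizes


sizes = trace_spatial_size(28)
flattened = 128 * sizes[-1] * sizes[-1]  # 128 channels after the third block
print(sizes)      # [28, 14, 7, 3]
print(flattened)  # 1152
```

The floor division matters at the last stage: a 7x7 map pooled with a 2x2 window yields 3x3, not 3.5x3.5, so the final row and column are dropped.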

# Training Process

The model is trained on the FashionMNIST training set using the Adam optimizer with a learning rate of 0.001. CrossEntropyLoss is used as the loss criterion because it is suited to multi-class classification. Training runs for 10 epochs with a batch size of 64. For each batch, the loop clears the old gradients, runs a forward pass, computes the loss, backpropagates, and applies one optimizer step.
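To make the loss concrete, here is a small sketch of what cross-entropy computes for a single sample when given raw logits, as `nn.CrossEntropyLoss` is. It uses only the standard library. The logits are made-up values for a three-class toy case, and the function names are illustrative.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


logits = [2.0, 1.0, 0.1]  # hypothetical raw scores, not taken from the model
probs = softmax(logits)
loss = cross_entropy_from_logits(logits, target=0)
print(probs, loss)
```

The loss is small when the target class has the largest logit and grows as probability mass shifts to other classes. For a batch, the default behavior is to average this quantity over the samples.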

During each epoch, the per-batch losses are summed and then divided by the number of batches, so the value printed after each epoch is the mean batch loss for that epoch. The model and each batch of data are moved to the GPU (if available) to accelerate computation.
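The averaging described above can be sketched as follows. This assumes the 60,000-image training split and the batch size of 64 used in this notebook. The per-batch loss values are hypothetical and serve only to show that averaging batch means is not quite the same as averaging over samples when the last batch is smaller.

```python
import math


def mean_of_batch_means(batch_losses):
    """What running_loss / len(trainloader) computes: each batch counts equally."""
    return sum(batch_losses) / len(batch_losses)


def per_sample_mean(batch_losses, batch_sizes):
    """Alternative: weight each batch's mean loss by its number of samples."""
    total = sum(loss * n for loss, n in zip(batch_losses, batch_sizes))
    return total / sum(batch_sizes)


num_samples, batch_size = 60_000, 64
num_batches = math.ceil(num_samples / batch_size)          # batches per epoch
last_batch = num_samples - (num_batches - 1) * batch_size  # size of the final batch
print(num_batches, last_batch)  # 938 32

# Hypothetical per-batch losses; the third batch is half the size of the others
losses = [0.5, 0.4, 0.9]
sizes = [64, 64, 32]
print(mean_of_batch_means(losses), per_sample_mean(losses, sizes))
```

With 938 batches per epoch and only one short batch, the difference between the two averages is negligible here, which is why the simpler form is a reasonable choice for progress reporting.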

# Evaluation

After training, the model is evaluated on the test set, where we calculate overall accuracy along with precision and recall for each of the 10 classes. The FashionMNIST test set is balanced (1,000 images per class), so accuracy is a fair summary, while the per-class precision and recall show which categories the model confuses and would be hidden by a single overall number.
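As a reference for how these metrics are defined, here is a minimal pure-Python sketch of per-class precision and recall. The function name and the toy labels are illustrative; the notebook itself relies on scikit-learn's `precision_score` and `recall_score` with `average=None`. Returning 0.0 for a class with no predictions or no true samples is an assumption of this sketch.

```python
def per_class_precision_recall(y_true, y_pred, num_classes):
    """Return (precision, recall) lists with one entry per class."""
    precision, recall = [], []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # precision: of everything predicted as c, how much really is c
        precision.append(tp / (tp + fp) if tp + fp else 0.0)
        # recall: of everything that really is c, how much was found
        recall.append(tp / (tp + fn) if tp + fn else 0.0)
    return precision, recall


# Toy example with three classes (not FashionMNIST data)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
precision, recall = per_class_precision_recall(y_true, y_pred, num_classes=3)
print(precision)
print(recall)
```

In the toy example, class 1 is always found when present (recall 1.0) but is also predicted once when the true class is 0, which lowers its precision to 2/3.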

In [4]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from sklearn.metrics import precision_score, recall_score

# Check if a GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Define the CNN class
class FashionMNIST_CNN(nn.Module):
    def __init__(self):
        super(FashionMNIST_CNN, self).__init__()
        # First convolutional block: Conv -> ReLU -> MaxPool
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.relu1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(2, 2)
        
        # Second convolutional block
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.relu2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(2, 2)
        
        # Third convolutional block
        self.conv3 = nn.Conv2d(64, 128, kernel_size=3, padding=1)
        self.relu3 = nn.ReLU()
        self.maxpool3 = nn.MaxPool2d(2, 2)
        
        # Fully connected layers
        self.flatten = nn.Flatten()  # Flatten before the FC layers
        self.fc1 = nn.Linear(128 * 3 * 3, 256)
        self.relu4 = nn.ReLU()
        self.fc2 = nn.Linear(256, 128)
        self.relu5 = nn.ReLU()
        self.fc3 = nn.Linear(128, 10)  # 10 output classes
        
        # Dropout for regularization
        self.dropout = nn.Dropout(0.25)

    # Define the forward pass
    def forward(self, x):
        x = self.maxpool1(self.relu1(self.conv1(x)))
        x = self.maxpool2(self.relu2(self.conv2(x)))
        x = self.maxpool3(self.relu3(self.conv3(x)))
        x = self.flatten(x)
        x = self.dropout(self.relu4(self.fc1(x)))
        x = self.dropout(self.relu5(self.fc2(x)))
        x = self.fc3(x)
        return x

# Load FashionMNIST data
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])

trainset = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
testset = torchvision.datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)

trainloader = DataLoader(trainset, batch_size=64, shuffle=True)
testloader = DataLoader(testset, batch_size=64, shuffle=False)

# Initialize the model, loss function, and optimizer
model = FashionMNIST_CNN().to(device)  # Move the model to the GPU if available
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop (keeping epochs low as per instruction)
num_epochs = 10
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in trainloader:
        images, labels = images.to(device), labels.to(device)  # Move data to the GPU
        
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(trainloader)}")

print("Finished Training")

# Collect predictions and true labels in a single pass over the test set,
# so the two lists are guaranteed to stay aligned
predictions = []
all_labels = []
model.eval()  # Disable dropout for evaluation
with torch.no_grad():
    for images, labels in testloader:
        images = images.to(device)  # Move test images to the GPU if available
        outputs = model(images)
        predicted = outputs.argmax(dim=1)  # Class with the largest logit
        predictions.extend(predicted.cpu().tolist())  # Move predictions back to the CPU
        all_labels.extend(labels.tolist())  # Store true labels

# Accuracy calculation
correct = sum(pred == label for pred, label in zip(predictions, all_labels))
accuracy = correct / len(all_labels) * 100
print(f"Test Accuracy: {accuracy:.2f}%")

# Precision and recall calculation
precision = precision_score(all_labels, predictions, average=None)
recall = recall_score(all_labels, predictions, average=None)

print(f"Per-class Precision: {precision}")
print(f"Per-class Recall: {recall}")

Using device: cuda
Epoch [1/10], Loss: 0.5623613644415127
Epoch [2/10], Loss: 0.3257112097717933
Epoch [3/10], Loss: 0.2721219977765068
Epoch [4/10], Loss: 0.23980968208836595
Epoch [5/10], Loss: 0.2161986966655135
Epoch [6/10], Loss: 0.19593229614245866
Epoch [7/10], Loss: 0.17976405663586564
Epoch [8/10], Loss: 0.16342382995423668
Epoch [9/10], Loss: 0.14986353732351618
Epoch [10/10], Loss: 0.1406423020532041
Finished Training
Test Accuracy: 91.73%
Per-class Precision: [0.87234043 0.99195171 0.81672598 0.91666667 0.87316562 0.98993964
 0.80463576 0.95348837 0.97235933 0.98155738]
Per-class Recall: [0.861 0.986 0.918 0.935 0.833 0.984 0.729 0.984 0.985 0.958]


# Conclusion

The CNN model classifies FashionMNIST test images with an accuracy of **91.73%** after 10 epochs, showing that it extracts useful features from the input data and generalizes reasonably well to unseen data. Precision and recall are both above 0.95 for five classes (indices 1, 5, 7, 8, and 9). The weakest class is index 6 (Shirt in the dataset's standard label order), with precision 0.805 and recall 0.729, followed by index 2 (Pullover) with precision 0.817 and index 4 (Coat) with recall 0.833.
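The per-class recall values printed in the output above can be cross-checked against the reported accuracy. Because the test set has 1,000 images per class, the mean of the per-class recalls should equal the overall accuracy. The sketch below uses the recall values from the output. The class names follow FashionMNIST's standard label order and are not printed by the notebook itself.

```python
# Per-class recall copied from the notebook output
recall = [0.861, 0.986, 0.918, 0.935, 0.833, 0.984, 0.729, 0.984, 0.985, 0.958]

# FashionMNIST label order (indices 0 to 9)
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

macro_recall = sum(recall) / len(recall)
weakest = min(range(len(recall)), key=recall.__getitem__)

print(f"Macro-averaged recall: {macro_recall:.4f}")  # 0.9173
print(f"Weakest class: {weakest} ({class_names[weakest]}), recall {recall[weakest]}")
```

The macro-averaged recall of 0.9173 matches the printed test accuracy of 91.73%, which is consistent with a balanced test set.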

While the model performs well, further improvements could be made by experimenting with different architectures, hyperparameters, or optimization strategies. Additionally, increasing the number of epochs or using data augmentation techniques may further enhance the model’s performance.

This project demonstrates the power of CNNs for image classification tasks and highlights the capabilities of PyTorch for deep learning implementations.