#  Fashion MNIST Classification Task

##  Objective
The goal of this task is to build a neural network model to classify images from the **Fashion MNIST** dataset into one of ten clothing categories.  
Will train the model, evaluate its accuracy, and display a confusion matrix to understand misclassifications.

---

## Step 1: Import Required Libraries


In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

torch.manual_seed(1234)
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', DEVICE)

Using device: cpu


##  Step 2: Load and Prepare the Data

Basic transformations to normalize the pixel values of Fashion MNIST images and convert them into tensors.

In [2]:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_data = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transform)
test_data = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
test_loader = DataLoader(test_data, batch_size=64, shuffle=False)
print('Train samples:', len(train_data))
print('Test samples:', len(test_data))

Train samples: 60000
Test samples: 10000


## Step 3: Define the Neural Network

Define simple feedforward neural network with one hidden layer.  
The output layer has 10 units — one for each clothing category.

In [None]:
class FashionNN(nn.Module):
    def __init__(self):
        super(FashionNN, self).__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28*28, 256)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(256, 10)
    
    def forward(self, x):
        x = self.flatten(x)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

model = FashionNN()


MLP(
  (fc1): Linear(in_features=784, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=128, bias=True)
  (fc3): Linear(in_features=128, out_features=10, bias=True)
  (relu): ReLU()
  (log_softmax): LogSoftmax(dim=1)
)


## Step 4: Choose a Hyperparameter to Tune

### Chosen Hyperparameter: Learning Rate (`lr`)

The **learning rate** controls how quickly or slowly a model learns during training.  
A too-high learning rate causes unstable learning, while a too-low one slows convergence.

Use **`lr = 0.001`** because it provides a good balance between stability and convergence speed for this dataset.


In [4]:
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
EPOCHS = 15
print(f'Epochs: {EPOCHS}')

Epochs: 15


## Step 5: Train the Model
Ttrain the model for 5 epochs and print the average loss after each epoch.


In [8]:
def train(model, loader):
    model.train()
    running_loss = 0.0
    for inputs, labels in loader:
        inputs, labels = inputs.to(DEVICE), labels.to(DEVICE)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * inputs.size(0)
    return running_loss / len(loader.dataset)

for epoch in range(EPOCHS):
    loss = train(model, train_loader)
    print(f'Epoch {epoch+1}/{EPOCHS}, Loss: {loss:.4f}')

Epoch 1/15, Loss: 0.1693
Epoch 2/15, Loss: 0.1588
Epoch 3/15, Loss: 0.1515
Epoch 4/15, Loss: 0.1501
Epoch 5/15, Loss: 0.1424
Epoch 6/15, Loss: 0.1358
Epoch 7/15, Loss: 0.1280
Epoch 8/15, Loss: 0.1278
Epoch 9/15, Loss: 0.1232
Epoch 10/15, Loss: 0.1146
Epoch 11/15, Loss: 0.1165
Epoch 12/15, Loss: 0.1070
Epoch 13/15, Loss: 0.1022
Epoch 14/15, Loss: 0.0998
Epoch 15/15, Loss: 0.0953


## Step 6: Evaluate the Model and Generate Confusion Matrix
The confusion matrix shows how well the model predicts each class.


In [11]:
import pandas as pd
from sklearn.metrics import confusion_matrix

# Define class names
class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat", 
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

def evaluate_model(model, test_loader):
    model.eval()
    all_preds = []
    all_labels = []
    with torch.no_grad():
        for inputs, labels in test_loader:
            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)
            all_preds.extend(preds.tolist())
            all_labels.extend(labels.tolist())
    return all_preds, all_labels

# Evaluate model
preds, labels = evaluate_model(model, test_loader)

# Generate confusion matrix
conf_matrix = confusion_matrix(labels, preds)

# Convert to DataFrame for labeled display
conf_df = pd.DataFrame(conf_matrix, index=class_names, columns=class_names)
print("Confusion Matrix:")
print(conf_df)

Confusion Matrix:
             T-shirt/top  Trouser  Pullover  Dress  Coat  Sandal  Shirt  \
T-shirt/top          829        3        24     17     1       1    116   
Trouser                2      978         2      9     5       0      3   
Pullover              17        2       796     13   114       1     54   
Dress                 23        9         8    888    27       1     39   
Coat                   0        0        59     27   861       0     49   
Sandal                 1        0         0      0     0     969      0   
Shirt                109        0        76     23    85       0    696   
Sneaker                0        0         0      0     0      38      0   
Bag                    4        0         1      4     4       3      5   
Ankle boot             1        0         0      0     0      24      0   

             Sneaker  Bag  Ankle boot  
T-shirt/top        0    9           0  
Trouser            0    1           0  
Pullover           0    3           

## Confusion Matrix Analysis
The confusion matrix shows the performance of the MLP model on the Fashion MNIST test set.  
Rows represent **true labels**, columns represent **predicted labels**, and diagonal entries are correct predictions.

**Class mapping:**  
0 = T-shirt/top, 1 = Trouser, 2 = Pullover, 3 = Dress, 4 = Coat, 5 = Sandal, 6 = Shirt, 7 = Sneaker, 8 = Bag, 9 = Ankle boot

### Observations:

- **T-shirt/top (0)** is often confused with **Shirt (6)** (116 times).  
- **Pullover (2)** is frequently misclassified as **Coat (4)** (114 times).  
- **Shirt (6)** is misclassified as **T-shirt/top (0)** (109 times) and **Pullover (2)** (76 times).  
- **Sneaker (7)** and **Ankle boot (9)** show mutual confusion (36 and 41 misclassifications).  

In [10]:
accuracy = accuracy_score(labels, preds)
precision = precision_score(labels, preds, average='macro', zero_division=0)
recall = recall_score(labels, preds, average='macro', zero_division=0)
f1 = f1_score(labels, preds, average='macro', zero_division=0)

print(f'Accuracy: {accuracy:.4f}')
print(f'Precision: {precision:.4f}')
print(f'Recall: {recall:.4f}')
print(f'F1-score: {f1:.4f}')

Accuracy: 0.8854
Precision: 0.8855
Recall: 0.8854
F1-score: 0.8852


### Hyperparameter Tuning Explanation
- **Chosen hyperparameter:** Learning rate (`lr=0.001`)
- **Reason:** The learning rate strongly affects how fast and how well the model converges.
- Lower rates (e.g., 0.0005) converge slower but are stable.
- Higher rates (e.g., 0.005) converge faster but risk overshooting.
- After tuning, `0.001` with Adam optimizer gave the best stability and accuracy balance.


## Summary of Fashion MNIST MLP Task

### Task Overview
- Implemented a **Multilayer Perceptron (MLP)** to classify Fashion MNIST images (10 classes).  
- Used PyTorch, including `nn.Module` for the model, `CrossEntropyLoss` for loss, and `Adam` optimizer.  
- Dataset was normalized and split into training and test sets using `DataLoader`.

### Hyperparameter Tuning
- **Chosen hyperparameter:** Learning rate (`lr`)  
- **Reason:** Learning rate controls the step size during optimization. Too high → unstable training; too low → slow convergence.  
- **Selected value:** `0.001` with Adam optimizer, which provided stable and efficient training.

### Model Performance
- **Accuracy:** 0.8854  
- **Precision (macro):** 0.8855  
- **Recall (macro):** 0.8854  
- **F1-score (macro):** 0.8852  
- The model achieves **>80% accuracy**.

### Confusion Matrix Analysis
- Diagonal entries show correct predictions; off-diagonal entries indicate misclassifications.  
- **Most confused classes:**  
  - T-shirt/top ↔ Shirt  
  - Pullover ↔ Coat  
  - Sneaker ↔ Ankle boot  
- Confusions make sense due to visual similarity between these items.

### Conclusion
- The MLP successfully classifies fashion items with high accuracy.  
- Learning rate tuning and model architecture choices significantly impacted performance.  
- Confusion matrix reveals specific areas where the model struggles, which can guide further improvement.
