[[Neural Networks from Scratch]]

##### What is our objective here?
We introduce the **Fashion MNIST** dataset to practise the parts of a real data pipeline: file loading, image preprocessing, shuffling, batching, and checking class balance.

##### Dataset Description
- **Fashion MNIST**: 60,000 training and 10,000 test greyscale images (28×28), ten classes:
0: T-shirt/top  
1: Trouser  
2: Pullover  
3: Dress  
4: Coat  
5: Sandal  
6: Shirt  
7: Sneaker  
8: Bag  
9: Ankle boot  

##### Data Retrieval

In [None]:
from zipfile import ZipFile
import os
import urllib.request

URL = 'https://nnfs.io/dataset/fashion_mnist_images.zip'
FILE = 'fashion_mnist_images.zip'
FOLDER = 'fashion_mnist_images'

if not os.path.isfile(FILE):
    print(f'Downloading {URL} and saving as {FILE}...')
    urllib.request.urlretrieve(URL, FILE)

print('Unzipping images...')
with ZipFile(FILE) as zip_images:
    zip_images.extractall(FOLDER)

print('Done!')

This extracts `train/` and `test/` folders, each containing one sub-folder per class label (`0` to `9`).

##### Image Loading

In [None]:
import os

# os.listdir returns entries in arbitrary order, so sort for a stable listing
labels = sorted(os.listdir('fashion_mnist_images/train'))
files = sorted(os.listdir('fashion_mnist_images/train/0'))
print(labels)   # ['0', ..., '9']
print(files[:10])

Each class has 6,000 training samples, so the dataset is **balanced**. Balance matters because a model trained on skewed data tends to favour the majority classes.
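
The balance claim above can be checked directly by counting the files in each class folder. This is a minimal sketch; the helper names `count_samples_per_class` and `is_balanced` are hypothetical, and it assumes the folder layout produced by the download cell.

```python
import os
from collections import Counter

def count_samples_per_class(split_dir):
    """Return {class_label: number_of_files} for a folder of class sub-folders."""
    counts = Counter()
    for label in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, label)
        if os.path.isdir(class_dir):  # skip stray files that are not class folders
            counts[label] = len(os.listdir(class_dir))
    return dict(counts)

def is_balanced(counts):
    """True when every class holds the same number of samples."""
    return len(set(counts.values())) <= 1

# Only runs when the dataset has already been extracted
if os.path.isdir('fashion_mnist_images/train'):
    counts = count_samples_per_class('fashion_mnist_images/train')
    print(counts, 'balanced:', is_balanced(counts))
```

If the counts differ between classes, the dataset would need rebalancing (or a weighted loss) before training.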

##### Visualisation

In [None]:
import cv2
import numpy as np
import matplotlib.pyplot as plt

image_data = cv2.imread('fashion_mnist_images/train/7/0002.png', cv2.IMREAD_UNCHANGED)
plt.imshow(image_data, cmap='gray')
plt.show()
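
The objective mentions image preprocessing, but the cells above stop at displaying a single image. The following is a sketch of one common approach for dense networks, not necessarily the one used later in these notes. It assumes pixel values are scaled from `0..255` to `[-1, 1]` and each 28×28 image is flattened to 784 features. The function name `preprocess` is hypothetical, and synthetic arrays stand in for real images.

```python
import numpy as np

def preprocess(images):
    """Flatten (n, 28, 28) uint8 images to (n, 784) floats scaled to [-1, 1]."""
    images = np.asarray(images)
    flat = images.reshape(images.shape[0], -1).astype(np.float32)
    return (flat - 127.5) / 127.5

# Stand-in for loaded images: one all-black and one all-white 28x28 picture
fake_images = np.zeros((2, 28, 28), dtype=np.uint8)
fake_images[1] = 255

X_demo = preprocess(fake_images)
print(X_demo.shape, X_demo.min(), X_demo.max())
```

Scaling keeps inputs in a small, zero-centred range, which generally makes gradient-based training better behaved than feeding raw `0..255` values.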


##### Data Shuffling

In [None]:
keys = np.array(range(X.shape[0]))
np.random.shuffle(keys)
X = X[keys]
y = y[keys]

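The indices are shuffled once and the same order is applied to both `X` and `y`, so every sample keeps its label. This is critical for batch training: if the samples are loaded folder by folder, unshuffled batches would each contain a single class, and every update would pull the model towards that class.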
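
The effect can be seen on a toy dataset. This sketch uses made-up arrays (`X_demo`, `y_demo`) in which each feature value encodes its own label, so alignment after shuffling is easy to verify. It uses `np.random.default_rng` with a fixed seed purely for repeatability.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only so the demo is repeatable

# Toy dataset stored class by class: the feature value encodes its own label
y_demo = np.repeat(np.arange(3), 4)        # [0 0 0 0 1 1 1 1 2 2 2 2]
X_demo = y_demo.reshape(-1, 1) * 10.0      # row i holds 10 * label i

first_batch_before = y_demo[:4]            # contains a single class

# One permutation of the indices, applied to both arrays
keys = rng.permutation(X_demo.shape[0])
X_shuffled = X_demo[keys]
y_shuffled = y_demo[keys]

print('before:', first_batch_before)
print('after: ', y_shuffled[:4])
# Pairing survives the shuffle because both arrays used the same keys
print('aligned:', np.array_equal(X_shuffled[:, 0], y_shuffled * 10.0))
```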

##### Batch Logic
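Instead of training on the full dataset at once, split it into fixed-size batches and round the step count up so the leftover samples get their own step: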

In [None]:
BATCH_SIZE = 128
steps = X.shape[0] // BATCH_SIZE
if steps * BATCH_SIZE < X.shape[0]:
    steps += 1

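Slice each batch as `batch_X = X[start:end]`, with `start = step * BATCH_SIZE` and `end = start + BATCH_SIZE`; the extra step covers the final, shorter batch. Because only one batch is processed at a time, this makes training on larger datasets feasible.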
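
The step arithmetic above can be sketched without any arrays. The helpers `steps_for` and `batch_bounds` are hypothetical names; the numbers are the Fashion MNIST training-set size and the batch size used in these notes.

```python
def steps_for(n_samples, batch_size):
    """Number of batches needed to cover n_samples (ceiling division)."""
    return (n_samples + batch_size - 1) // batch_size

def batch_bounds(n_samples, batch_size):
    """Yield (start, end) slice bounds; the last batch may be shorter."""
    for step in range(steps_for(n_samples, batch_size)):
        start = step * batch_size
        yield start, min(start + batch_size, n_samples)

bounds = list(batch_bounds(60_000, 128))
print(len(bounds), bounds[0], bounds[-1])
# prints: 469 (0, 128) (59904, 60000)
```

NumPy slicing already clamps an out-of-range `end`, so `X[start:start + BATCH_SIZE]` gives the same result; the explicit `min` just makes the short final batch visible. The same helper gives the step count for a validation set, for example `steps_for(10_000, 128)` is 79.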

##### Epoch and Step Loops

In [None]:
# `epochs` and `print_every` are assumed to be set before this loop
for epoch in range(1, epochs + 1):
    for step in range(steps):
        batch_X = X[step * BATCH_SIZE:(step + 1) * BATCH_SIZE]
        batch_y = y[step * BATCH_SIZE:(step + 1) * BATCH_SIZE]

        output = model.forward(batch_X, training=True)
        data_loss, reg_loss = model.loss.calculate(output, batch_y, include_regularisation=True)
        loss = data_loss + reg_loss

        predictions = model.output_layer_activation.predictions(output)
        accuracy = model.accuracy.calculate(predictions, batch_y)

        model.backward(output, batch_y)
        model.optimiser.pre_update_params()
        for layer in model.trainable_layers:
            model.optimiser.update_params(layer)
        model.optimiser.post_update_params()

        if not step % print_every or step == steps - 1:
            print(f'step: {step}, acc: {accuracy:.3f}, loss: {loss:.3f} '
                  f'(data_loss: {data_loss:.3f}, reg_loss: {reg_loss:.3f}), '
                  f'lr: {model.optimiser.current_learning_rate}')

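The outer loop sets how many full passes (`epochs`) are made over the training data; the inner loop performs one parameter update per batch, so each epoch consists of `steps` updates.

##### Validation Loop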

In [None]:
for step in range(validation_steps):
    batch_X = X_val[step * BATCH_SIZE:(step + 1) * BATCH_SIZE]
    batch_y = y_val[step * BATCH_SIZE:(step + 1) * BATCH_SIZE]

    output = model.forward(batch_X, training=False)
    model.loss.calculate(output, batch_y)
    predictions = model.output_layer_activation.predictions(output)
    model.accuracy.calculate(predictions, batch_y)

validation_loss = model.loss.calculate_accumulated()
validation_accuracy = model.accuracy.calculate_accumulated()

print(f'validation, acc: {validation_accuracy:.3f}, loss: {validation_loss:.3f}')

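Running the model on held-out data with `training=False` estimates how well it generalises: if training loss keeps falling while validation loss stalls or rises, the model is overfitting. `validation_steps` is computed from `X_val.shape[0]` in the same way as `steps`. The accumulated loss and accuracy should be reset before this loop so that training batches do not leak into the validation figures; how that is done depends on the loss and accuracy classes, which are not shown here.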
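
The validation loop relies on accumulated metrics instead of per-batch values. The sketch below shows why, assuming `calculate_accumulated()` returns a sample-weighted mean (the classes themselves are not shown in these notes). `RunningMean` is a hypothetical stand-in, not the model's actual loss class.

```python
class RunningMean:
    """Sample-weighted mean accumulated over batches of unequal size."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add_batch(self, per_sample_values):
        # Accumulate the sum and the sample count, not the batch mean
        self.total += sum(per_sample_values)
        self.count += len(per_sample_values)

    def result(self):
        return self.total / self.count

    def reset(self):
        # Call before a new pass so earlier batches do not leak in
        self.total = 0.0
        self.count = 0

val_loss = RunningMean()
val_loss.add_batch([1.0, 1.0, 1.0, 1.0])  # full batch, mean 1.0
val_loss.add_batch([3.0])                 # short final batch, mean 3.0
print(val_loss.result())
```

Averaging the two batch means would give 2.0, over-weighting the single sample in the short batch; accumulating sums and counts gives the mean over all five samples.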

##### Next Step
[[Model Evaluation]]