# Fashion-MNIST Image Classification using Logistic Regression
## Objective
By the end of this project, we will:
1. Understand how to load and explore the Fashion-MNIST dataset.
2. Preprocess images for logistic regression.
3. Train and evaluate a logistic regression classifier.
4. Interpret the results and identify strengths/weaknesses.

## Introduction to Fashion-MNIST Image Classification

The Fashion-MNIST dataset is a widely used benchmark in machine learning and computer vision. It consists of 70,000 grayscale images of fashion items, such as t-shirts, trousers, and sneakers, each sized 28×28 pixels. The goal of Fashion-MNIST classification is to develop models that can accurately recognize and classify these images into their correct clothing categories.

Because of its simplicity and well-structured format, Fashion-MNIST serves as an excellent starting point for learning image classification techniques, including logistic regression, neural networks, and deep learning models. Success on Fashion-MNIST demonstrates a model’s ability to extract meaningful features from image data and make accurate predictions on unseen samples.

In this lab, we will use the logistic regression model for classification. Logistic regression is a simple yet powerful statistical method that predicts the probability of an input belonging to a particular class. For Fashion-MNIST, it models the probability that an image corresponds to each fashion category.
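
As a rough sketch of what "the probability that an image corresponds to each fashion category" means for the multinomial form of logistic regression, the model can be written as a softmax over one linear score per class. The symbols are introduced here for illustration and do not appear elsewhere in this notebook: $\mathbf{x}$ is a flattened image, and $\mathbf{w}_k$ and $b_k$ are the weight vector and bias for class $k$.

```latex
P(y = k \mid \mathbf{x})
  = \frac{\exp\left(\mathbf{w}_k^\top \mathbf{x} + b_k\right)}
         {\sum_{j=0}^{9} \exp\left(\mathbf{w}_j^\top \mathbf{x} + b_j\right)},
  \qquad k = 0, \dots, 9
```

The predicted category is the $k$ with the largest probability.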

### Step 1: Import libraries
We import essential libraries for this task:
1. NumPy for numerical operations.
2. Matplotlib for data visualization.
3. scikit-learn for building and evaluating the logistic regression model. LogisticRegression from scikit-learn will be our classifier.
4. classification_report & confusion_matrix help evaluate performance.
5. TensorFlow to load the Fashion-MNIST dataset easily. We use Fashion-MNIST directly from Keras for convenience.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from tensorflow.keras.datasets import fashion_mnist



### Step 2: Load the Fashion-MNIST dataset
In this step, we load the Fashion-MNIST dataset using TensorFlow’s built-in tf.keras.datasets.fashion_mnist.load_data() function. This provides the dataset split into training and testing sets directly as NumPy arrays. Each image is a 28×28 grayscale pixel array, and the labels are integers from 0 to 9, each representing a specific clothing category. This built-in method is convenient and efficient for quick access to the dataset.

1. Fashion-MNIST contains 28×28 grayscale images of 10 fashion categories.
2. x_train has 60,000 images and x_test has 10,000 images.

In [None]:
# Load data: (x_train, y_train) for training, (x_test, y_test) for testing
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

print("Training set shape:", x_train.shape)
print("Test set shape:", x_test.shape)
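
Since the labels are plain integers from 0 to 9, it can help to map them to readable names. The sketch below lists the category names in label order as given in the Fashion-MNIST dataset description. The names `CLASS_NAMES` and `label_to_name` are hypothetical helpers introduced here, not part of the original notebook.

```python
# Category names in label order (index 0..9), following the Fashion-MNIST description.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label):
    """Return the clothing category name for an integer label in 0..9."""
    return CLASS_NAMES[int(label)]


# Hypothetical labels for illustration; with the real data, pass y_train[i] instead.
example_labels = [9, 0, 1]
print([label_to_name(label) for label in example_labels])
```

With the dataset loaded, `label_to_name(y_train[0])` would give the name of the first training image's category.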

### Step 3: Visualize sample images
Here we display five images from the training set. This visualization helps us become familiar with the dataset and confirms that the images and labels align.

In [None]:
plt.figure(figsize=(10,5))
for i in range(5):
    plt.subplot(1, 5, i+1)
    plt.imshow(x_train[i], cmap='gray')
    plt.title(f"Label: {y_train[i]}")
    plt.axis('off')
plt.show()


### Visualization of the class distribution of the dataset
The Fashion-MNIST dataset is well balanced, with 7,000 images in each category across the combined training and test sets.

In [None]:


# Combine training and test labels for overall distribution
all_labels = np.concatenate([y_train, y_test])

# Count occurrences of each class
unique, counts = np.unique(all_labels, return_counts=True)

# Plot class distribution
plt.figure(figsize=(8, 5))
plt.bar(unique, counts, tick_label=unique)
plt.title("Class Distribution in Fashion-MNIST Dataset")
plt.xlabel("Class Label")
plt.ylabel("Number of Images")
plt.show()


### Step 4: Preprocess the data
Since logistic regression expects input as feature vectors, each 28×28 image must be flattened into a one-dimensional vector of length 784 (28 × 28).

We also normalize the pixel values from the range 0–255 to 0–1. Normalization improves model performance and helps achieve faster and more stable convergence during training.

1. Flattening: Converts each 28×28 image into a 1D vector of size 784.
2. Normalization: Scales pixel values to the range [0, 1] for better model convergence and stability.

In [None]:
# Flatten each 28x28 image into a 784-length vector
x_train_flat = x_train.reshape(x_train.shape[0], -1)
x_test_flat = x_test.reshape(x_test.shape[0], -1)

# Normalize pixel values to range [0, 1]
x_train_flat = x_train_flat.astype('float32') / 255.0
x_test_flat = x_test_flat.astype('float32') / 255.0
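
Because the same two steps must be repeated for any image passed to the model later, one option is to wrap them in a small function. This is a minimal sketch, not the notebook's own code: `preprocess_images` is a hypothetical helper name, and the batch below is random synthetic data standing in for real images.

```python
import numpy as np


def preprocess_images(images):
    """Flatten (n, 28, 28) images to (n, 784) float32 vectors scaled to [0, 1]."""
    images = np.asarray(images)
    flat = images.reshape(images.shape[0], -1)
    return flat.astype("float32") / 255.0


# Synthetic stand-in for a batch of four grayscale images (random pixels, not real data).
rng = np.random.default_rng(0)
fake_batch = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

features = preprocess_images(fake_batch)
print(features.shape, features.dtype)
print(features.min() >= 0.0, features.max() <= 1.0)
```

Under these assumptions, `preprocess_images(x_train)` would produce the same array as the flattening and normalization cell above.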

### Step 4.1: Displaying the pixel matrix of an image
Now we will select an image and display the pixel matrix of that image.

In [None]:
# Select one image
image = x_train[0]

# Display the pixel matrix
print("Pixel matrix for the first image:")
print(image)


### Step 5: Build and Train the logistic regression model
We create a multinomial logistic regression model using the ‘saga’ solver, which is suitable for multi-class classification problems and large datasets. The model is then trained on the flattened and normalized Fashion-MNIST images along with their corresponding labels.

1. Multinomial logistic regression: Handles multiple classes (10 fashion categories in Fashion-MNIST).
2. SAGA solver: Efficient for large datasets and supports the multinomial loss function.
3. Training process: The model learns weights for each pixel to distinguish between different clothing categories.

In [None]:
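# Using the 'saga' solver, which scales to large datasets.
# Note: recent scikit-learn releases deprecate the multi_class argument.
# With 'saga' and more than two classes the multinomial (softmax) loss is
# already the default, so the argument is omitted here.
model = LogisticRegression(
    solver='saga',
    max_iter=3000,    # upper bound on passes over the data
    tol=0.01,         # loose stopping tolerance to shorten training
    verbose=1,        # print solver progress
)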

# Train the model
model.fit(x_train_flat, y_train)
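
To illustrate what "learning weights for each pixel" amounts to, the sketch below computes class probabilities by hand from a weight matrix of shape (10, 784) and a bias vector of length 10. The weights here are random placeholders, so the predictions are meaningless. A fitted scikit-learn model stores the corresponding values in its `coef_` and `intercept_` attributes.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, subtracting the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_classes, n_features = 10, 784

# Placeholder parameters standing in for a fitted model's weights and biases.
W = rng.normal(scale=0.01, size=(n_classes, n_features))
b = np.zeros(n_classes)

# Two synthetic "images", already flattened and scaled to [0, 1].
X = rng.random((2, n_features))

probs = softmax(X @ W.T + b)       # one row of 10 probabilities per image
predicted = probs.argmax(axis=1)   # most probable class per image
print(probs.shape, probs.sum(axis=1))
```

Each row of `probs` sums to 1, and the predicted label is the index of the largest entry in that row.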

### Step 6: Evaluate the model
After training, we predict labels for the test dataset. We evaluate the model’s performance using accuracy and detailed classification metrics such as precision, recall, and F1-score.
1. Accuracy gives an overall measure of correct predictions.
2. Confusion matrix shows per-class performance.
3. Classification report gives precision, recall, and F1 score for each class.

In [None]:
# Accuracy on the test set
accuracy = model.score(x_test_flat, y_test)
print(f"Test Accuracy: {accuracy:.4f}")

# Predictions for detailed evaluation
y_pred = model.predict(x_test_flat)

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:\n", cm)

# Classification report
print("\nClassification Report:\n", classification_report(y_test, y_pred))
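
To make the relationship between these metrics concrete, the sketch below derives accuracy, precision, recall, and F1 from a small, made-up 3-class confusion matrix. It assumes the scikit-learn convention that rows are true classes and columns are predicted classes. The numbers are illustrative only and are not results from this model.

```python
import numpy as np

# Made-up confusion matrix: rows = true class, columns = predicted class.
cm_example = np.array([
    [9, 1, 0],
    [2, 6, 2],
    [0, 1, 9],
])

true_positives = np.diag(cm_example)
accuracy_example = true_positives.sum() / cm_example.sum()

# Precision divides by column sums (everything predicted as the class),
# recall divides by row sums (everything that truly belongs to the class).
precision = true_positives / cm_example.sum(axis=0)
recall = true_positives / cm_example.sum(axis=1)
f1 = 2 * precision * recall / (precision + recall)

print("accuracy:", accuracy_example)
print("precision:", precision)
print("recall:", recall)
print("f1:", f1)
```

The same arithmetic applied to the 10×10 matrix `cm` should match the per-class numbers in the classification report.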

### Step 7: Visualize the predictions
This step first displays the first 10 test images with their predicted and true labels. It then shows the first 10 test images that the model classified correctly, which gives a qualitative view of its successes, followed by the first 10 test images that it misclassified, which highlights where the model could be improved. These visualizations make the numeric results easier to interpret.
1. np.where(y_pred == y_test) gets the indices where predictions match the true labels.
2. np.where(y_pred != y_test) gets the indices where they differ.
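
A tiny example shows how the two index arrays are built. The label arrays are invented for illustration, and the `_demo` names are used so they do not overwrite the notebook's own variables.

```python
import numpy as np

# Invented labels for five samples.
y_true_demo = np.array([0, 2, 2, 1, 9])
y_pred_demo = np.array([0, 1, 2, 1, 7])

# np.where on a boolean array returns a tuple; element [0] holds the indices.
correct_demo = np.where(y_pred_demo == y_true_demo)[0]
incorrect_demo = np.where(y_pred_demo != y_true_demo)[0]

print(correct_demo)    # positions 0, 2 and 3 match
print(incorrect_demo)  # positions 1 and 4 differ
```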


In [None]:
plt.figure(figsize=(10,5))
for i in range(10):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[i], cmap='gray')
    plt.title(f"Pred: {y_pred[i]}\nTrue: {y_test[i]}")
    plt.axis('off')
plt.show()

In [None]:
# Find indices of correct and incorrect predictions
correct_indices = np.where(y_pred == y_test)[0]
incorrect_indices = np.where(y_pred != y_test)[0]

# Visualize first 10 correct predictions
plt.figure(figsize=(10, 5))
for i, idx in enumerate(correct_indices[:10]):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[idx], cmap='gray')
    plt.title(f"Pred: {y_pred[idx]}\nTrue: {y_test[idx]}")
    plt.axis('off')
plt.suptitle("Correct Predictions")
plt.show()

In [None]:
# Visualize first 10 incorrect predictions
plt.figure(figsize=(10, 5))
for i, idx in enumerate(incorrect_indices[:10]):
    plt.subplot(2, 5, i+1)
    plt.imshow(x_test[idx], cmap='gray')
    plt.title(f"Pred: {y_pred[idx]}\nTrue: {y_test[idx]}", color='red')
    plt.axis('off')
plt.suptitle("Incorrect Predictions")
plt.show()

### Model output analysis

1. Logistic regression performs well on visually distinct classes such as Trouser, Sneaker, and Bag, achieving high precision, recall, and F1-scores.
2. The model struggles with visually similar top-wear items such as T-shirt, Pullover, Shirt, and Coat, leading to more misclassifications.
3. Despite these limitations, the overall accuracy of ~84.5% shows that the model captures general patterns in the data and serves as a reasonable baseline.
4. Decision: Logistic regression is sufficient for initial experiments or fast baseline evaluation, but for higher accuracy on similar fashion items, more advanced models such as CNNs, or additional feature engineering, are recommended.
5. The model’s macro F1-score of 0.84 indicates balanced performance across classes, but targeted improvements are needed for challenging categories.
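
One way to check which categories are confused with each other is to rank the off-diagonal entries of the confusion matrix. The sketch below uses a small invented matrix and a hypothetical helper, `top_confusions`. It assumes rows are true classes and columns are predicted classes, and its counts are not results from this model.

```python
import numpy as np


def top_confusions(cm, class_names, k=3):
    """Return the k largest off-diagonal entries as (true, predicted, count)."""
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)  # ignore correct predictions
    flat_order = np.argsort(off_diag, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat_order, off_diag.shape)
    return [(class_names[r], class_names[c], int(off_diag[r, c]))
            for r, c in zip(rows, cols)]


# Invented 3-class example: rows = true class, columns = predicted class.
names_demo = ["T-shirt/top", "Shirt", "Trouser"]
cm_demo = np.array([
    [80, 18, 2],
    [25, 70, 5],
    [1, 0, 99],
])

for true_name, pred_name, count in top_confusions(cm_demo, names_demo):
    print(f"{count:3d} images of {true_name} predicted as {pred_name}")
```

Applied to the full 10×10 matrix with all ten category names, the same function would list the most frequent mix-ups in the test set.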

## Regularization in Logistic Regression

### L1 and L2 regularization in logistic regression
In logistic regression, L1 regularization is a penalty added to the loss function that encourages sparsity in the model’s weights. It adds the sum of the absolute values of the coefficients to the loss. Because this penalty can drive some coefficients exactly to zero, it acts as a form of feature selection and helps prevent overfitting.

In logistic regression, L2 regularization is a penalty added to the loss function that discourages large weights. It adds the sum of the squared values of the coefficients to the loss. L2 regularization shrinks the weights gradually, which helps prevent overfitting and improves the model’s generalization.
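
The two penalty terms can be computed directly from a weight vector. The sketch below uses an arbitrary example vector and hypothetical helper names, and shows only the penalty terms themselves, not the full training loss.

```python
import numpy as np


def l1_penalty(weights):
    """Sum of absolute values of the weights."""
    return float(np.abs(weights).sum())


def l2_penalty(weights):
    """Sum of squared weights."""
    return float(np.square(weights).sum())


# Arbitrary example weights.
w = np.array([0.5, -2.0, 0.0, 1.5])

print("L1 penalty:", l1_penalty(w))  # 0.5 + 2.0 + 0.0 + 1.5 = 4.0
print("L2 penalty:", l2_penalty(w))  # 0.25 + 4.0 + 0.0 + 2.25 = 6.5
```

The largest weight (-2.0) contributes half of the L1 penalty but well over half of the L2 penalty, which is why L2 discourages large weights in particular.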

### Regularization applied in our model

In our current Fashion-MNIST logistic regression setup, we did not explicitly apply regularization. The default LogisticRegression in scikit-learn uses L2 regularization with C=1.0.

So technically, some L2 regularization is applied by default, but we did not tune the strength (C) or switch to L1. That’s why the results we observed are from a baseline model with default regularization.
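
As a rough guide to what C controls, the L2-regularized objective can be sketched as below. The notation is introduced here for illustration: $\ell$ is the per-sample log loss, $W$ and $b$ are the weights and biases, and $n$ is the number of training samples. The exact constant scaling factors are an assumption and may differ between library versions.

```latex
\min_{W,\, b} \;\; C \sum_{i=1}^{n} \ell\left(y_i, \mathbf{x}_i; W, b\right)
  \;+\; \frac{1}{2} \lVert W \rVert_F^2
```

Since C multiplies the data term, a smaller C gives the penalty more relative weight and therefore regularizes more strongly. Replacing the squared norm with $\lVert W \rVert_1$ gives the L1 variant.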


## Saving and Using the Trained Model

After training our logistic regression model on the Fashion-MNIST dataset, we can save it to disk using the joblib library. Saving the model allows us to reuse it later without retraining.

In [None]:
import joblib

# 1. Save the model
joblib.dump(model, 'fashion_mnist_logreg_model.pkl')



Now we can load 'fashion_mnist_logreg_model.pkl' to make predictions on new, unseen data.

In [None]:
# 2. Load the model
loaded_model = joblib.load('fashion_mnist_logreg_model.pkl')

# 3. Predict on a new sample (example: first test image)
# Flatten and normalize the image exactly as was done for the training data
new_image = x_test[0].reshape(1, -1).astype('float32') / 255.0
predicted_class = loaded_model.predict(new_image)

print("Predicted class:", predicted_class[0])


Here the reloaded model predicts the class of the first test image, which stands in for a new, unseen sample.