# Fashion-MNIST Dataset Classification with Scikit-Learn and PyTorch

In this project, we'll build a classifier for the Fashion-MNIST dataset. We'll start with data exploration and baseline models built with Scikit-Learn's linear classifiers, beginning with Logistic Regression. Then we'll build more advanced models, a simple neural network and a CNN, using PyTorch. This step-by-step guide is designed to be completed in short sections of 5-10 minutes each.

## High-Level Outline

### 1) Fetching, using, and manipulating datasets
- Requires knowing sklearn, NumPy, Matplotlib, and pandas
### 2) Baseline ML training with sklearn
- Use sklearn to train and evaluate your model on the data
### 3) Create your own SimpleNN and CNN
- Use PyTorch to load the data, train your model, and evaluate it
### 4) Compare sklearn and your custom models

# Sources you should use for your project

- [NumPy Documentation](https://numpy.org/doc/)
- [Matplotlib Documentation](https://matplotlib.org/stable/contents.html)
- [Pandas Documentation](https://pandas.pydata.org/docs/)
- [scikit-learn Documentation](https://scikit-learn.org/stable/documentation.html)
- [PyTorch Documentation](https://pytorch.org/)



## Table of Contents
- [Step 0 Install all necessary libraries](#step-0-install-all-necesssary-libraries) 
- [Step 1 Import Libraries and Load Dataset](#step-1-import-libraries-and-load-dataset)
- [Step 2 Explore the dataset](#step-2-explore-the-dataset)
- [Step 3 Preprocess and normalize the data](#step-3-preprocess-and-normalize-data)
- [Step 4 Split the data into test and training](#step-4-split-data-into-training-and-testing-sets)
- [Step 5 Baseline model with sklearn](#step-5-baseline-model-with-scikit-learn-linear-models)
- [Step 6 Evaluate the baseline model](#step-6-evaluate-baseline-model)
- [Step 7 Convert the data to PyTorch tensors](#step-7-convert-data-to-pytorch-tensors)
- [Step 8 Create DataLoaders](#step-8-create-dataloaders)
- [Step 9 Define custom NN with PyTorch](#step-9-define-the-pytorch-neural-network-model)
- [Step 10 Train your NN](#step-10-train-the-neural-network)
- [Step 11 Visualize the training loss](#step-11-visualize-training-loss)
- [Step 12 Evaluate the NN model](#step-12-evaluate-neural-network-model)
- [Step 13 Create a confusion matrix](#step-13-confusion-matrix-and-sample-predictions)
- [Step 14 Visualize classification vectors of custom NN](#step-14-visualizing-classification-vectors-of-a-neural-network)
- [Step 15 Repeat steps 7-14 for CNN](#step-15-repeat-steps-7-14-for-a-custom-cnn-model)
- [Step 16 Convert data to tensors](#step-16-convert-the-data-to-tensors)
- [Step 17 Create DataLoaders](#step-17-create-data-loaders)
- [Step 18 Define simple CNN model](#step-18-define-a-simple-cnn-model)
- [Step 19 Train CNN](#step-19-train-the-cnn-model)
- [Step 20 Visualize the training loss](#step-20-visualize-the-training-loss)
- [Step 21 Evaluate the CNN model](#step-21-evaluate-the-cnn-model)
- [Step 22 Create a confusion matrix](#step-22-plot-the-confusion-matrix-and-sample-predictions) 
- [Step 23 Visualize the classification vectors](#step-23-visualize-classification-vectors-for-the-cnn)
- [Step 24 Display the feature maps for CNN](#step-24-display-the-feature-maps-for-your-convolutional-network)
- [Step 25 Compare and contrast prebuilt models vs your custom models](#step-25-compare-the-different-models-accuracy)

# Step 0. Install all necesssary libraries

In [None]:
!pip install -r requirements.txt

# Step 1. Import Libraries and Load Dataset
Estimated Time: 5 minutes

First, we'll import all the necessary libraries and load the Fashion-MNIST dataset with `fetch_openml`.

### Library Imports Overview

In this project, we use several powerful libraries and modules to handle data processing, build neural networks, and visualize results. Below is a breakdown of each library and module and how it contributes to our project.

---

#### 1. `torch`
The `torch` library is the core of the PyTorch framework, a popular open-source library for machine learning and deep learning. PyTorch provides efficient, flexible tensor operations that enable us to perform mathematical operations on multidimensional arrays (tensors), which are essential for neural networks. We use `torch` for creating and manipulating tensors that store our data and for performing the calculations required during neural network training.

**From the docs:**
The torch package contains data structures for multi-dimensional tensors and defines mathematical operations over these tensors. Additionally, it provides many utilities for efficient serialization of Tensors and arbitrary types, and other useful utilities.

It has a CUDA counterpart, that enables you to run your tensor computations on an NVIDIA GPU with compute capability >= 3.0.

---

#### 2. `torch.nn`
The `torch.nn` module is part of PyTorch and provides tools for creating neural network layers and defining complex models. The `nn` module includes pre-built layers like fully connected (`Linear`), convolutional (`Conv2d`), and activation functions (`ReLU`). By using `torch.nn`, we can build each part of a neural network and chain them together to create the final model.

---

#### 3. `torch.optim`
The `torch.optim` module provides various optimization algorithms for training neural networks. Optimization is a key step in machine learning, where we adjust the model’s parameters to minimize the error in predictions. `optim` contains optimizers such as **Stochastic Gradient Descent (SGD)** and **Adam**, which update the model's weights during training based on the gradients computed from the loss function.

---

#### 4. `torch.utils.data`
The `torch.utils.data` module offers utilities for working with datasets, particularly in batching and loading data. It includes:
   - **`DataLoader`**: Manages and batches data, making it easy to feed data into our model in small groups, which is essential for efficient training.
   - **`TensorDataset`**: Wraps tensors into a dataset object, allowing us to combine data and labels into a single dataset that can be loaded by `DataLoader`.

These tools help us efficiently load and prepare data for training and testing our model.

---

#### 5. `sklearn.datasets.fetch_openml`
`fetch_openml` is a function in the `sklearn.datasets` module, part of the Scikit-Learn library. It lets us download datasets hosted on the [OpenML platform](https://www.openml.org/). In this project, we use `fetch_openml` to load the **Fashion-MNIST dataset**, a collection of 28x28 grayscale images of clothing items in 10 classes that is commonly used for training and testing image classification models.

---

#### 6. `numpy`
`numpy` is a fundamental library for scientific computing in Python, providing support for efficient numerical operations on large, multidimensional arrays. In this project, we use `numpy` to handle data operations like reshaping images or performing mathematical calculations on arrays. PyTorch integrates well with `numpy`, allowing for smooth transitions between `numpy` arrays and PyTorch tensors, making it essential for data preparation and processing.

---

#### 7. `matplotlib.pyplot`
`matplotlib.pyplot` is a submodule of Matplotlib, a popular Python plotting library. `plt` (as it’s commonly abbreviated) enables us to create data visualizations like line plots, bar charts, and images. In our project, we use `plt` to visualize training results, such as loss over time, and to display images from our dataset, which helps in understanding the model’s behavior and performance.

---

Each of these libraries plays a critical role in our project, from data handling and visualization to building and training the neural network.


In [5]:
# Import libraries

# AI libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import fetch_openml

# non AI libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
print(pd.__version__)




2.2.3


You can find the dataset we will be using on OpenML: https://openml.org/search?type=data&status=active&id=40996

In [None]:
from sklearn.datasets import fetch_openml

# Load Fashion MNIST dataset using fetch_openml


# print the description of the dataset
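
As one possible way to fill in this cell, the sketch below wraps the download in a small helper. The helper name `load_fashion_mnist` is hypothetical; the `data_id` of 40996 is taken from the OpenML link above, and `as_frame=True` is an assumption so that the later `info()` and `head()` calls work on pandas objects.

```python
from sklearn.datasets import fetch_openml


def load_fashion_mnist(data_id=40996):
    """Download Fashion-MNIST from OpenML and return (X, y, description).

    The helper name is hypothetical; data_id comes from the OpenML URL above.
    """
    # as_frame=True returns pandas objects: X is a DataFrame, y is a Series.
    dataset = fetch_openml(data_id=data_id, as_frame=True)
    return dataset.data, dataset.target, dataset.DESCR
```

Calling `X, y, description = load_fashion_mnist()` and then `print(description)` should download the data (network access required) and show the dataset description.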


# Step 2. Explore the Dataset
Estimated Time: 5 minutes

Let's take a quick look at the dataset to understand its structure.

In [2]:
# Check the shape of the data


# Show the information about the data using info() method


# Show the structure of the data using head() method


### Display the first image in the dataset

In [5]:
# Display the first image in the dataset using Matplotlib's imshow() method


## Step 3. Preprocess and Normalize Data

Estimated Time: 5 minutes


In [None]:
# Normalize the data
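
A minimal sketch of one common normalization, assuming the pixel values are stored as integers from 0 to 255: dividing by 255 rescales every pixel to the range 0 to 1. The DataFrame `X_demo` below is a random stand-in, not the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the pixel DataFrame: 4 "images" of 784 pixels in 0-255.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame(rng.integers(0, 256, size=(4, 784)))

# Scale pixel intensities from [0, 255] to [0.0, 1.0].
X_scaled = X_demo.astype("float32") / 255.0

print(X_scaled.to_numpy().min(), X_scaled.to_numpy().max())
```

Keeping the inputs in a small, consistent range tends to help both the `lbfgs` solver and gradient-based neural network training converge.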



## Step 4. Split Data into Training and Testing Sets
Estimated Time: 5 minutes

Split the data into training and testing sets using `train_test_split`. Set `test_size` and `random_state` so the split is reproducible.

In [7]:
from sklearn.model_selection import train_test_split
# Split the data into training and testing sets
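
The sketch below shows the call on random stand-in data. The values `test_size=0.2` and `random_state=42` are illustrative choices, not prescribed by this project, and `stratify` is an optional extra that keeps the class balance the same in both sets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 flattened images, 10 samples per class.
rng = np.random.default_rng(0)
X_demo = rng.random((100, 784))
y_demo = np.repeat(np.arange(10), 10)

# Hold out 20% for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

print(X_train.shape, X_test.shape)
```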


## Step 5. Baseline Model with Scikit-Learn linear models
Estimated Time: 10 minutes

Train the following three algorithms on the data:
- Logistic Regression. Use `lbfgs` as the solver
- Ridge Classifier
- SGD Classifier

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import RidgeClassifier, SGDClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, mean_squared_error
from sklearn.model_selection import train_test_split



In [9]:
# Train Logistic Regression on a subset of data

# Ridge Classifier

# SGD Classifier
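
The following sketch trains and scores the three classifiers. To stay small and runnable offline, it uses scikit-learn's bundled 8x8 digits dataset as a stand-in for Fashion-MNIST; `max_iter=1000` and `random_state=42` are illustrative assumptions.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression, RidgeClassifier, SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): the small digits set that ships with scikit-learn.
digits = load_digits()
X_small = digits.data / 16.0  # digits pixels range from 0 to 16
X_train, X_test, y_train, y_test = train_test_split(
    X_small, digits.target, test_size=0.2, random_state=42
)

models = {
    "LogisticRegression": LogisticRegression(solver="lbfgs", max_iter=1000),
    "RidgeClassifier": RidgeClassifier(),
    "SGDClassifier": SGDClassifier(random_state=42),
}

# Fit each model and record its accuracy on the held-out test set.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {accuracies[name]:.3f}")
```

On the full Fashion-MNIST data, the same loop applies, but training on a subset first keeps the run time within the step's time budget.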


## Step 6. Evaluate Baseline Model
Estimated Time: 10 minutes

Predict on the test set and evaluate each baseline model's accuracy.

In [10]:
# Compare the performance of each model


### Plot the classification vectors for the three algorithms

In [12]:

# Get the coefficients for each classifier


# Get the classes for each classifier


# Get the intercepts for each classifier


# Get the n_features for each classifier


# Define the descriptive labels for each class


# Plot the classification vectors for each classifier and each class
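
As a sketch of the idea, a fitted linear classifier stores one weight per pixel per class in `coef_`, so each row can be reshaped back into an image. The example uses the bundled 8x8 digits dataset as a stand-in (an assumption), which makes each vector 8x8 rather than 28x28, and shows only one of the three classifiers.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# Stand-in data (assumption): the bundled digits set, 64 features per image.
digits = load_digits()
clf = LogisticRegression(solver="lbfgs", max_iter=1000)
clf.fit(digits.data / 16.0, digits.target)

# coef_ has one row of per-pixel weights per class: shape (n_classes, n_features).
side = int(clf.coef_.shape[1] ** 0.5)  # 8 here; 28 for Fashion-MNIST
vectors = clf.coef_.reshape(len(clf.classes_), side, side)
limit = abs(vectors).max()  # symmetric colour scale around zero

fig, axes = plt.subplots(2, 5, figsize=(10, 4))
for ax, label, vec in zip(axes.ravel(), clf.classes_, vectors):
    ax.imshow(vec, cmap="RdBu", vmin=-limit, vmax=limit)
    ax.set_title(f"Class {label}")
    ax.axis("off")
fig.tight_layout()
```

The related attributes `clf.classes_`, `clf.intercept_`, and `clf.n_features_in_` give the class labels, the per-class bias terms, and the number of input features.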



# Moving to PyTorch

In this section we will create two custom neural networks. The goal is to deepen your understanding of both the ease of Python and the basics of machine learning.

## Step 7. Convert Data to PyTorch Tensors
Estimated Time: 5 minutes

Convert the pandas DataFrames and Series to PyTorch tensors.

In [13]:
# Convert data to tensors for a simple neural network
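
A minimal sketch of the conversion, using small stand-in objects. It assumes the labels may arrive as strings (as OpenML data often does), so they are cast to integers first; `nn.CrossEntropyLoss` expects integer class indices of type `torch.long`.

```python
import numpy as np
import pandas as pd
import torch

# Hypothetical stand-ins for the split data: a pixel DataFrame and a label Series.
X_train_demo = pd.DataFrame(np.random.default_rng(0).random((6, 784)))
y_train_demo = pd.Series(["9", "0", "0", "3", "0", "2"])  # labels stored as strings

# Features become float32; labels become int64 class indices.
X_train_tensor = torch.tensor(X_train_demo.to_numpy(), dtype=torch.float32)
y_train_tensor = torch.tensor(y_train_demo.astype("int64").to_numpy(), dtype=torch.long)

print(X_train_tensor.shape, y_train_tensor.dtype)
```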



## Step 8. Create DataLoaders
Estimated Time: 5 minutes

Create `TensorDataset` and `DataLoader` objects for batch processing.

In [14]:
# Create TensorDatasets

# Create DataLoaders
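
The sketch below pairs features with labels in a `TensorDataset` and batches them with a `DataLoader`. The tensors are random stand-ins, and `batch_size=32` is an illustrative choice.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in tensors: 100 flattened images with labels 0-9.
X_demo = torch.rand(100, 784)
y_demo = torch.randint(0, 10, (100,))

# Each dataset item is an (image, label) pair.
train_dataset = TensorDataset(X_demo[:80], y_demo[:80])
test_dataset = TensorDataset(X_demo[80:], y_demo[80:])

# Shuffle only the training data; evaluation order does not matter.
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)
```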


## Step 9. Define the PyTorch Neural Network Model
Estimated Time: 10 minutes

Define a simple feedforward neural network using PyTorch's `nn.Module`.

In [15]:
# Define the neural network model

# Instantiate the model

# Define loss function and optimizer
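
One possible shape for this cell is sketched below. The class name `SimpleNN`, the hidden size of 128, and the choice of Adam with a learning rate of 1e-3 are all assumptions for illustration; other sizes and optimizers work too.

```python
import torch
import torch.nn as nn
import torch.optim as optim


class SimpleNN(nn.Module):
    """Feedforward network: 784 inputs -> hidden layer -> 10 class scores."""

    def __init__(self, in_features=784, hidden=128, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # Return raw scores (logits); CrossEntropyLoss applies softmax internally.
        return self.fc2(self.relu(self.fc1(x)))


model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

logits = model(torch.rand(4, 784))
print(logits.shape)
```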


## Step 10. Train the Neural Network
Estimated Time: 10 minutes

Train the neural network over several epochs.

In [16]:
# Training loop
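
A training loop usually repeats four steps per batch: clear old gradients, compute the loss, backpropagate, and update the weights. The sketch below runs that loop on synthetic, learnable stand-in data with a small `nn.Sequential` model; the epoch count and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: the label is the index of the largest of the first 10 features.
X_demo = torch.randn(256, 784)
y_demo = X_demo[:, :10].argmax(dim=1)
train_loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

epoch_losses = []
for epoch in range(5):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()                    # clear gradients from the last step
        loss = criterion(model(images), labels)  # forward pass and loss
        loss.backward()                          # backpropagate
        optimizer.step()                         # update the weights
        running_loss += loss.item() * images.size(0)
    # Average loss per sample over the whole epoch.
    epoch_losses.append(running_loss / len(train_loader.dataset))
    print(f"Epoch {epoch + 1}: loss = {epoch_losses[-1]:.4f}")
```

Storing the per-epoch averages in a list such as `epoch_losses` makes it straightforward to plot them later with `plt.plot(epoch_losses)`.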



## Step 11. Visualize Training Loss
Estimated Time: 5 minutes

Plot the training loss over epochs to see how the model learns.

In [17]:
# Plot training loss over epochs



## Step 12. Evaluate Neural Network Model
Estimated Time: 10 minutes

Evaluate the trained neural network on the test set.

In [18]:
# Evaluate the model on the test set

# Calculate accuracy
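
A sketch of an evaluation helper is shown below; the function name `evaluate_accuracy` is hypothetical. It switches the model to evaluation mode, disables gradient tracking, and counts how often the highest-scoring class matches the label. The toy check uses an identity model on one-hot inputs so the expected result is known.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate_accuracy(model, loader):
    """Return the fraction of samples in `loader` that `model` classifies correctly."""
    model.eval()  # switch off training-only behaviour such as dropout
    correct, total = 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for inputs, labels in loader:
            predicted = model(inputs).argmax(dim=1)
            correct += (predicted == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Toy check: an identity "model" on one-hot inputs predicts every label correctly.
toy_labels = torch.arange(10)
toy_inputs = torch.eye(10)
toy_loader = DataLoader(TensorDataset(toy_inputs, toy_labels), batch_size=4)
print(evaluate_accuracy(nn.Identity(), toy_loader))
```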


## Step 13. Confusion Matrix and Sample Predictions

Estimated Time: 10 minutes

Visualize the model's performance with a confusion matrix and sample predictions.

In [19]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Collect all predictions and true labels


# Confusion Matrix
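
The sketch below builds and plots a confusion matrix from a few hand-written stand-in labels and predictions, using the first three Fashion-MNIST class names for display. In the project, `y_true` and `y_pred` would instead be collected from the test loader.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Hypothetical stand-in labels and predictions for three classes.
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0, 2])

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)

disp = ConfusionMatrixDisplay(
    confusion_matrix=cm, display_labels=["T-shirt/top", "Trouser", "Pullover"]
)
disp.plot(cmap="Blues")
```

Off-diagonal cells show which pairs of classes the model confuses; the diagonal counts the correct predictions.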



In [20]:
# Display a few test images with predictions



## Step 14. Visualizing Classification Vectors of a Neural Network

In this task, you will visualize the classification vectors of a neural network by extracting the weights from the final layer and reshaping them to match the input dimensions. This helps in understanding how the neural network differentiates between different classes based on the learned features.

In [21]:
# Extract the weights from the final layer

# Plot the classification vectors for each class
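
One caveat: in a network with a hidden layer, the final layer's weights have shape (classes, hidden units), so they cannot be reshaped to 28x28 directly. The sketch below uses one rough workaround, stated here as an assumption rather than the intended solution: multiply the final-layer weights by the first-layer weights, ignoring the ReLU and the biases, to get one 784-element vector per class. The model is an untrained stand-in.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Untrained stand-in for a trained 784 -> 128 -> 10 network.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

first_weights = model[0].weight.detach()  # shape (128, 784)
final_weights = model[2].weight.detach()  # shape (10, 128)

# Linear approximation (assumption): ignore the ReLU and biases and compose the layers.
class_vectors = (final_weights @ first_weights).reshape(10, 28, 28)
print(class_vectors.shape)
```

Each `class_vectors[i]` can then be shown with `plt.imshow`, in the same grid layout used for the scikit-learn classifiers.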


## Step 15 Repeat steps 7-14 for a custom CNN model
We want to test how a CNN performs compared to our simple neural network.

## Step 16 Convert the data to tensors

In [23]:
# Reshape data for CNN

# Convert data to tensors
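
Convolutional layers expect input shaped as (batch, channels, height, width), so each flattened 784-pixel row has to become a 1x28x28 image. A minimal sketch on random stand-in data:

```python
import numpy as np
import torch

# Hypothetical stand-in: 8 flattened images of 784 pixels each.
X_flat = np.random.default_rng(0).random((8, 784)).astype("float32")

# Grayscale images have a single channel; -1 lets NumPy infer the batch size.
X_images = X_flat.reshape(-1, 1, 28, 28)
X_tensor = torch.from_numpy(X_images)

print(X_tensor.shape)
```

If the data is still a pandas DataFrame at this point, calling `.to_numpy()` first gives an array that can be reshaped the same way.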


## Step 17 Create Data Loaders

In [22]:
# Create TensorDatasets


# Create DataLoaders


## Step 18 Define a simple CNN model

In [24]:
# Define the CNN model

# Instantiate the CNN model

# Define loss function and optimizer
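
As a sketch of what a simple CNN could look like, the model below stacks two convolution and pooling stages before a linear classifier. The class name `SimpleCNN`, the channel counts (16 and 32), and the Adam settings are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.optim as optim


class SimpleCNN(nn.Module):
    """Two conv/pool stages followed by a linear classifier."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)   # keeps 28x28
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)  # keeps 14x14
        self.pool = nn.MaxPool2d(2)                               # halves height and width
        self.relu = nn.ReLU()
        self.fc = nn.Linear(32 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))  # (N, 16, 14, 14)
        x = self.pool(self.relu(self.conv2(x)))  # (N, 32, 7, 7)
        return self.fc(x.flatten(start_dim=1))   # (N, num_classes)


cnn = SimpleCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn.parameters(), lr=1e-3)

cnn_logits = cnn(torch.rand(4, 1, 28, 28))
print(cnn_logits.shape)
```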



## Step 19 Train the CNN model

In [25]:
# Training loop for CNN


## Step 20 Visualize the Training Loss 

In [26]:
# Plot training loss over epochs for CNN


## Step 21 Evaluate the CNN model

In [27]:
# Evaluate the CNN model on the test set

# Calculate accuracy


## Step 22 Plot the Confusion matrix and Sample Predictions

In [28]:
# Collect all predictions and true labels for CNN

# Confusion Matrix


In [29]:
# Display a few test images with predictions from CNN


## Step 23. Visualize Classification Vectors for the CNN

In [None]:

# Extract the weights from the final layer

# Plot the classification vectors for each class

## Step 24 Display the feature maps for your convolutional network

In [30]:
# Select an input image

# Forward pass through conv1

# Plot the feature maps
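
A feature map is the output of one convolutional filter for a given input image. The sketch below passes a random stand-in image through a single untrained `nn.Conv2d` layer with 8 filters (standing in for a trained model's `conv1`) and plots each resulting map.

```python
import matplotlib.pyplot as plt
import torch
import torch.nn as nn

torch.manual_seed(0)

# Untrained stand-in for a trained model's first convolutional layer.
conv1 = nn.Conv2d(1, 8, kernel_size=3, padding=1)
image = torch.rand(1, 1, 28, 28)  # one grayscale 28x28 image

# Forward pass through the layer only; no gradients needed for plotting.
with torch.no_grad():
    feature_maps = conv1(image)  # shape (1, 8, 28, 28)

fig, axes = plt.subplots(2, 4, figsize=(8, 4))
for index, ax in enumerate(axes.ravel()):
    ax.imshow(feature_maps[0, index].numpy(), cmap="gray")
    ax.set_title(f"Filter {index}")
    ax.axis("off")
fig.tight_layout()
```

With a trained model, the maps typically highlight edges and textures that the filters have learned to detect.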



## Step 25. Compare the different models' accuracy

Estimated Time: 5 minutes
