The goal of this project was to build a Convolutional Neural Network (CNN) from scratch, without using high-level deep learning frameworks such as TensorFlow or PyTorch. All neural network computations are implemented using NumPy, while Pandas, Scikit-learn, and kagglehub are used only for data loading, preprocessing, and dataset management.
This project demonstrates a complete CNN pipeline, including convolution, pooling, flattening, dense layers, forward propagation, backpropagation, and training on the MNIST handwritten digit dataset.
The project is composed of several modular classes, each representing a core component of a CNN.
The Dense_Layer class represents a fully connected (dense) layer in the network. Each layer is initialized with a specified number of neurons, number of input features, and an activation function.
Key features:
- Xavier initialization for weight matrices
- Bias vectors for each neuron
- Supported activation functions: sigmoid, tanh, ReLU, and softmax
- Methods for computing weighted sums and applying activation functions
- Getters and setters for weights and biases
This class is used for both hidden dense layers and the final output layer.
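The core of such a layer can be sketched as follows. This is a minimal illustration, not the project's actual `Dense_Layer` API; the class and method names here are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class DenseLayerSketch:
    """Minimal dense layer: Xavier-initialized weights plus a bias vector."""
    def __init__(self, n_inputs, n_neurons, rng=None):
        rng = rng or np.random.default_rng(0)
        # Xavier/Glorot uniform bound keeps activations well-scaled
        limit = np.sqrt(6.0 / (n_inputs + n_neurons))
        self.W = rng.uniform(-limit, limit, size=(n_inputs, n_neurons))
        self.b = np.zeros(n_neurons)

    def forward(self, x):
        z = x @ self.W + self.b   # weighted sum
        return sigmoid(z)         # activation (ReLU, tanh, softmax are analogous)
```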
The Convolutional_Layer class implements a 2D convolutional layer for image inputs.
Key features:
- Support for multi-channel inputs
- Configurable number of filters, filter size, stride, and padding
- Xavier or He initialization for filters
- Manual convolution implementation using NumPy
- Generation of feature maps
This layer is responsible for learning spatial features from input images.
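The manual convolution at the heart of this layer can be sketched for a single channel and filter; the real class extends this to multiple channels and filters. The function name is illustrative.

```python
import numpy as np

def conv2d_single(image, kernel, stride=1, padding=0):
    """Naive 2D convolution (cross-correlation) for one channel."""
    if padding:
        image = np.pad(image, padding)         # zero-pad all sides
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1   # output height
    ow = (image.shape[1] - kw) // stride + 1   # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # element-wise product, summed
    return out
```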
The Pooling_Layer class performs downsampling on feature maps produced by convolutional layers.
Key features:
- Supports max pooling and average pooling
- Configurable filter size and stride
- Reduces spatial dimensions while preserving important features
- Includes backpropagation logic for both pooling types
Pooling layers improve computational efficiency and robustness to small spatial shifts.
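Max pooling over a single feature map can be sketched like this (average pooling replaces `.max()` with `.mean()`); the function name is illustrative:

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Downsample a 2D feature map by taking the max of each window."""
    oh = (fmap.shape[0] - size) // stride + 1
    ow = (fmap.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            window = fmap[i*stride:i*stride+size, j*stride:j*stride+size]
            out[i, j] = window.max()
    return out
```

During backpropagation, the gradient is routed only to the position that held the maximum in each window, which is why the layer records those positions in the forward pass.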
The Flattening_Layer class converts multi-dimensional feature maps into a one-dimensional vector suitable for dense layers.
Key features:
- Forward pass flattens feature maps into a vector
- Backward pass reshapes gradients back to the original feature map dimensions
- Ensures dimensional consistency during backpropagation
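The pattern is simple: cache the input shape on the forward pass, and reuse it to reshape gradients on the backward pass. A minimal sketch (class name is hypothetical):

```python
import numpy as np

class FlattenSketch:
    def forward(self, fmaps):
        self.shape = fmaps.shape           # remember original dimensions
        return fmaps.reshape(-1)           # collapse to a 1D vector

    def backward(self, grad):
        return grad.reshape(self.shape)    # restore feature-map dimensions
```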
The CNN class represents the full convolutional neural network and manages the interactions between all layers.
Key features:
- Sequential feedforward propagation through all layers
- Manual backpropagation for convolutional, pooling, flattening, and dense layers
- Support for cross-entropy and hinge loss functions
- Gradient-based parameter updates using stochastic gradient descent
- Training and testing routines with accuracy evaluation
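A key detail that makes the backward pass tractable is the softmax/cross-entropy pairing: their combined gradient with respect to the logits reduces to `probs - one_hot`. A minimal sketch of that piece (the actual method names in the CNN class may differ):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())          # subtract max for numerical stability
    return e / e.sum()

def cross_entropy(probs, one_hot):
    return -np.sum(one_hot * np.log(probs + 1e-12))  # epsilon avoids log(0)

# With softmax outputs, dLoss/dLogits simplifies to (probs - one_hot),
# which is the error signal backpropagation starts from at the output layer.
```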
This project uses the MNIST handwritten digit dataset, consisting of grayscale images of digits from 0 to 9.
- Data Loading: The dataset is downloaded using `kagglehub` and loaded into a Pandas DataFrame.
- Normalization: Pixel values are scaled from `[0, 255]` to `[0, 1]` to improve training stability.
- Train/Test Split: The dataset is split into training and testing sets using Scikit-learn with stratification to preserve class balance.
- Reshaping: Each image is reshaped to `(28, 28, 1)` before being passed into the CNN.
- One-Hot Encoding: Target labels are converted to one-hot encoded vectors during training for multi-class classification.
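The preprocessing steps above can be sketched with stand-in data. The array below is a hypothetical placeholder; the real project obtains the pixels via kagglehub and Pandas.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the MNIST pixel matrix: 100 flat 784-pixel images
X = np.random.randint(0, 256, size=(100, 784)).astype(np.float64)
y = np.repeat(np.arange(10), 10)            # 10 samples per digit class

X = X / 255.0                               # scale [0, 255] -> [0, 1]
X = X.reshape(-1, 28, 28, 1)                # (28, 28, 1) images for the CNN

# Stratified split preserves the class balance in both sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

one_hot = np.eye(10)[y_tr]                  # one-hot encode training labels
```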
The CNN architecture used in this project is:
- Convolutional Layer (8 filters, 3×3, stride 1, padding 1, He initialization)
- Max Pooling Layer (2×2, stride 2)
- Flattening Layer
- Dense Layer (128 neurons, ReLU activation)
- Dense Output Layer (10 neurons, Softmax activation)
This architecture is designed to balance clarity, simplicity, and performance.
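Tracing the tensor shapes through this architecture confirms how the layers connect. Using the standard output-size formula `(n + 2*pad - k) // stride + 1`:

```python
def conv_out(n, k, stride, pad):
    """Spatial output size of a convolution or pooling operation."""
    return (n + 2 * pad - k) // stride + 1

h = conv_out(28, 3, 1, 1)   # 28: 3x3 conv with padding 1 keeps 28x28
h = conv_out(h, 2, 2, 0)    # 14: 2x2 max pooling halves each side
flat = h * h * 8            # 8 feature maps -> flattened vector length
print(flat)                 # 1568, the input size of the 128-neuron dense layer
```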
The network is trained using stochastic gradient descent with configurable:
- Number of epochs
- Learning rate
- Loss function (cross-entropy by default; hinge loss is also supported)
During training, classification accuracy is reported after each epoch. After training, the model is evaluated on the test set, and overall test accuracy is displayed.
A fixed random seed is used to ensure reproducible results.
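The training loop's parameter update can be illustrated on a hypothetical single dense layer; this is not the project's training code, only the SGD update it describes, with a fixed seed for reproducibility:

```python
import numpy as np

rng = np.random.default_rng(42)     # fixed seed for reproducible results

W = rng.normal(size=(4, 3)) * 0.1   # weights: 4 inputs, 3 classes
x = rng.normal(size=4)              # a single training example
target = np.array([1.0, 0.0, 0.0])  # one-hot label for class 0
lr = 0.1                            # configurable learning rate

for epoch in range(100):            # configurable number of epochs
    z = x @ W                       # forward pass
    e = np.exp(z - z.max())
    probs = e / e.sum()             # softmax output
    grad_z = probs - target         # cross-entropy gradient w.r.t. logits
    W -= lr * np.outer(x, grad_z)   # stochastic gradient descent update
```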
This project demonstrates how a convolutional neural network can be implemented entirely from scratch using low-level numerical operations. By manually implementing convolution, pooling, flattening, backpropagation, and optimization, the project provides a strong conceptual understanding of how CNNs function internally.
Overall, this work showcases both the theoretical foundations and practical implementation of deep learning applied to image classification using the MNIST dataset.