BenTennyson4/CNNFromScratch

Convolutional Neural Network From Scratch (MNIST)

Project Overview

The goal of this project was to build a Convolutional Neural Network (CNN) from scratch, without using high-level deep learning frameworks such as TensorFlow or PyTorch. All neural network computations are implemented using NumPy, while Pandas, Scikit-learn, and kagglehub are used only for data loading, preprocessing, and dataset management.

This project demonstrates a complete CNN pipeline, including convolution, pooling, flattening, dense layers, forward propagation, backpropagation, and training on the MNIST handwritten digit dataset.


Project Structure and Core Classes

The project is composed of several modular classes, each representing a core component of a CNN.

Dense_Layer

The Dense_Layer class represents a fully connected (dense) layer in the network. Each layer is initialized with a specified number of neurons, number of input features, and an activation function.

Key features:

  • Xavier initialization for weight matrices
  • Bias vectors for each neuron
  • Supported activation functions: sigmoid, tanh, ReLU, and softmax
  • Methods for computing weighted sums and applying activation functions
  • Getters and setters for weights and biases

This class is used for both hidden dense layers and the final output layer.
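As an illustration, a dense layer of this kind might be sketched as follows (the class name and feature list come from the description above; method names and exact signatures here are assumptions, not the repository's code):

```python
import numpy as np

class Dense_Layer:
    """Illustrative sketch of a fully connected layer with Xavier init."""

    def __init__(self, n_neurons, n_inputs, activation="relu"):
        # Xavier (Glorot) uniform initialization keeps activation
        # variance roughly stable across layers
        limit = np.sqrt(6.0 / (n_inputs + n_neurons))
        self.weights = np.random.uniform(-limit, limit, (n_neurons, n_inputs))
        self.biases = np.zeros(n_neurons)   # one bias per neuron
        self.activation = activation

    def weighted_sum(self, x):
        # z = Wx + b
        return self.weights @ x + self.biases

    def activate(self, z):
        if self.activation == "relu":
            return np.maximum(0.0, z)
        if self.activation == "sigmoid":
            return 1.0 / (1.0 + np.exp(-z))
        if self.activation == "tanh":
            return np.tanh(z)
        if self.activation == "softmax":
            e = np.exp(z - z.max())  # subtract max for numerical stability
            return e / e.sum()
        raise ValueError(f"unknown activation: {self.activation}")

    def forward(self, x):
        return self.activate(self.weighted_sum(x))
```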


Convolutional_Layer

The Convolutional_Layer class implements a 2D convolutional layer for image inputs.

Key features:

  • Support for multi-channel inputs
  • Configurable number of filters, filter size, stride, and padding
  • Xavier or He initialization for filters
  • Manual convolution implementation using NumPy
  • Generation of feature maps

This layer is responsible for learning spatial features from input images.
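The manual convolution described above can be sketched as a naive loop over output positions (function name and argument layout are illustrative assumptions, not the repository's exact code):

```python
import numpy as np

def convolve2d(image, filters, stride=1, padding=0):
    """Naive multi-channel 2D convolution (cross-correlation).

    image:   (H, W, C) input
    filters: (F, k, k, C) filter bank
    returns: (H_out, W_out, F) feature maps
    """
    if padding > 0:
        # zero-pad the spatial dimensions only
        image = np.pad(image, ((padding, padding), (padding, padding), (0, 0)))
    H, W, C = image.shape
    F, k, _, _ = filters.shape
    H_out = (H - k) // stride + 1
    W_out = (W - k) // stride + 1
    out = np.zeros((H_out, W_out, F))
    for f in range(F):
        for i in range(H_out):
            for j in range(W_out):
                patch = image[i*stride:i*stride+k, j*stride:j*stride+k, :]
                out[i, j, f] = np.sum(patch * filters[f])
    return out
```

With 3×3 filters, stride 1, and padding 1 (as in the architecture below), the 28×28 spatial dimensions are preserved.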


Pooling_Layer

The Pooling_Layer class performs downsampling on feature maps produced by convolutional layers.

Key features:

  • Supports max pooling and average pooling
  • Configurable filter size and stride
  • Reduces spatial dimensions while preserving important features
  • Includes backpropagation logic for both pooling types

Pooling layers improve computational efficiency and robustness to small spatial shifts.
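A minimal max-pooling sketch, including the argmax mask that the backward pass needs (illustrative code, not the repository's exact implementation):

```python
import numpy as np

def max_pool(fmap, size=2, stride=2):
    """Max pooling over a (H, W, C) feature map.

    Also returns a mask marking where each max was taken; the backward
    pass routes the upstream gradient to exactly those positions.
    """
    H, W, C = fmap.shape
    H_out = (H - size) // stride + 1
    W_out = (W - size) // stride + 1
    out = np.zeros((H_out, W_out, C))
    mask = np.zeros_like(fmap)  # 1 where the max was taken
    for c in range(C):
        for i in range(H_out):
            for j in range(W_out):
                window = fmap[i*stride:i*stride+size, j*stride:j*stride+size, c]
                out[i, j, c] = window.max()
                r, s = np.unravel_index(window.argmax(), window.shape)
                mask[i*stride + r, j*stride + s, c] = 1
    return out, mask
```

For average pooling, the backward pass instead spreads each upstream gradient value uniformly over its pooling window.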


Flattening_Layer

The Flattening_Layer class converts multi-dimensional feature maps into a one-dimensional vector suitable for dense layers.

Key features:

  • Forward pass flattens feature maps into a vector
  • Backward pass reshapes gradients back to the original feature map dimensions
  • Ensures dimensional consistency during backpropagation
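A minimal version of such a layer might look like this (illustrative sketch):

```python
import numpy as np

class Flattening_Layer:
    """Flattens feature maps to a vector; restores shape on the backward pass."""

    def forward(self, fmaps):
        self.input_shape = fmaps.shape  # remembered for backprop
        return fmaps.flatten()

    def backward(self, grad):
        # reshape the upstream gradient back to the feature-map dimensions
        return grad.reshape(self.input_shape)
```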

CNN

The CNN class represents the full convolutional neural network and manages the interactions between all layers.

Key features:

  • Sequential feedforward propagation through all layers
  • Manual backpropagation for convolutional, pooling, flattening, and dense layers
  • Support for cross-entropy and hinge loss functions
  • Gradient-based parameter updates using stochastic gradient descent
  • Training and testing routines with accuracy evaluation
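The core ideas above (sequential forward propagation, cross-entropy loss, and SGD updates) can be sketched with a few hypothetical helpers (not the repository's exact code):

```python
import numpy as np

def forward(layers, x):
    # sequential feedforward: each layer consumes the previous output
    for layer in layers:
        x = layer.forward(x)
    return x

def cross_entropy(probs, one_hot, eps=1e-12):
    # loss for one sample; eps guards against log(0)
    return -np.sum(one_hot * np.log(probs + eps))

def sgd_step(param, grad, lr):
    # in-place stochastic gradient descent update: theta <- theta - lr * grad
    param -= lr * grad
    return param
```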

MNIST Dataset and Preprocessing

This project uses the MNIST handwritten digit dataset, consisting of grayscale images of digits from 0 to 9.

Preprocessing Steps

  • Data Loading: The dataset is downloaded using kagglehub and loaded into a Pandas DataFrame.
  • Normalization: Pixel values are scaled from [0, 255] to [0, 1] to improve training stability.
  • Train/Test Split: The dataset is split into training and testing sets using Scikit-learn with stratification to preserve class balance.
  • Reshaping: Each image is reshaped to (28, 28, 1) before being passed into the CNN.
  • One-Hot Encoding: Target labels are converted to one-hot encoded vectors during training for multi-class classification.
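The preprocessing steps above (excluding the kagglehub download itself) can be sketched on a synthetic stand-in for the pixel matrix:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the MNIST pixel matrix; the real data is fetched
# via kagglehub and loaded into a Pandas DataFrame.
rng = np.random.default_rng(42)          # fixed seed for reproducibility
X = rng.integers(0, 256, size=(1000, 784)).astype(np.float64)
y = rng.integers(0, 10, size=1000)

X /= 255.0                               # normalize [0, 255] -> [0, 1]

# stratified split preserves the class balance in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

X_train = X_train.reshape(-1, 28, 28, 1)  # image shape expected by the CNN
X_test = X_test.reshape(-1, 28, 28, 1)

one_hot = np.eye(10)[y_train]             # one-hot encode training labels
```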

Model Architecture

The CNN architecture used in this project is:

  1. Convolutional Layer (8 filters, 3×3, stride 1, padding 1, He initialization)
  2. Max Pooling Layer (2×2, stride 2)
  3. Flattening Layer
  4. Dense Layer (128 neurons, ReLU activation)
  5. Dense Output Layer (10 neurons, Softmax activation)

This architecture is intentionally small, balancing implementation clarity with solid classification performance on MNIST.
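The tensor shapes through this stack follow the standard output-size formula, out = (n + 2p − k) / s + 1, checked here for a 28×28×1 input:

```python
def conv_out(n, k, stride, pad):
    # spatial output size of a convolution or pooling window
    return (n + 2 * pad - k) // stride + 1

h = conv_out(28, 3, 1, 1)   # conv, 3x3, stride 1, padding 1 -> 28
h = conv_out(h, 2, 2, 0)    # max pool, 2x2, stride 2        -> 14
flat = h * h * 8            # flatten 14 x 14 x 8 feature maps -> 1568
```

So the first dense layer receives a 1568-dimensional vector.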


Training and Evaluation

The network is trained using stochastic gradient descent with configurable:

  • Number of epochs
  • Learning rate
  • Loss function (cross-entropy)

During training, classification accuracy is reported after each epoch. After training, the model is evaluated on the test set, and overall test accuracy is displayed.

A fixed random seed is used to ensure reproducible results.
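The accuracy reported each epoch reduces to comparing argmax predictions against the true labels, e.g.:

```python
import numpy as np

def accuracy(probs, labels):
    """Classification accuracy from softmax outputs (one row per sample)."""
    preds = np.argmax(probs, axis=1)   # predicted class per sample
    return np.mean(preds == labels)
```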


Conclusion

This project demonstrates how a convolutional neural network can be implemented entirely from scratch using low-level numerical operations. By manually implementing convolution, pooling, flattening, backpropagation, and optimization, the project provides a strong conceptual understanding of how CNNs function internally.

Overall, this work showcases both the theoretical foundations and practical implementation of deep learning applied to image classification using the MNIST dataset.
