Neural Network from Scratch using Numpy for MNIST Dataset

This repository contains Python code for implementing a simple feedforward neural network from scratch using only NumPy. The neural network is trained on the MNIST dataset for digit classification.

Table of Contents

  • Overview
  • ReLU (Rectified Linear Unit)
  • Softmax Activation
  • Forward Propagation
  • Backward Propagation
  • Loss Function
  • Preprocessing
  • Results
  • Requirements
  • Installation
  • Usage
  • License

Overview

The neural network architecture consists of an input layer, a hidden layer with ReLU activation, and an output layer with softmax activation. It's trained using stochastic gradient descent (SGD) with backpropagation.
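A minimal sketch of how such a network's parameters might be initialized (the hidden-layer width of 128 is an assumption for illustration, not taken from the repository's code):

import numpy as np

# Hypothetical layer sizes: 784 input pixels (28x28), 128 hidden units, 10 digit classes.
input_size, hidden_size, output_size = 784, 128, 10

rng = np.random.default_rng(0)
# Small random weights and zero biases for the hidden and output layers.
W1 = rng.normal(0.0, 0.01, size=(input_size, hidden_size))
b1 = np.zeros(hidden_size)
W2 = rng.normal(0.0, 0.01, size=(hidden_size, output_size))
b2 = np.zeros(output_size)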

ReLU (Rectified Linear Unit)

ReLU (Rectified Linear Unit) is a popular activation function in neural networks. It introduces non-linearity by outputting the input directly when it is positive and zero otherwise. ReLU has become the preferred choice for many architectures because of its simplicity and its effectiveness in mitigating the vanishing gradient problem.
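A minimal NumPy sketch of ReLU and its derivative (the helper names relu and relu_derivative are illustrative, not necessarily the ones used in this repository):

import numpy as np

def relu(x):
    # Element-wise max(0, x): positive inputs pass through, negatives become zero.
    return np.maximum(0, x)

def relu_derivative(x):
    # The gradient is 1 where the input was positive and 0 elsewhere.
    return (x > 0).astype(x.dtype)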

Softmax Activation

Softmax activation is commonly used in the output layer of neural networks for multi-class classification problems. It converts the raw output scores of the network into probabilities, ensuring that they sum up to one. Softmax is particularly useful when dealing with mutually exclusive classes, as it provides a probability distribution over all classes.
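A numerically stable softmax can be sketched as follows; subtracting the row-wise maximum before exponentiating is a common stabilization trick and an assumption here, not necessarily what the repository does:

import numpy as np

def softmax(z):
    # Shift by the row-wise max so the largest exponent is 0, avoiding overflow.
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    # Normalize so each row sums to one.
    return exp / np.sum(exp, axis=-1, keepdims=True)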

Forward Propagation

Forward propagation is the process of computing the output of a neural network for a given input. It involves passing the input through each layer of the network, applying the activation functions, and producing the final output. In this implementation, forward propagation maps a flattened MNIST image to a probability distribution over the ten digit classes.
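For the two-layer architecture described above, a forward pass might look like the following sketch, reusing the relu and softmax helpers from the earlier snippets (the parameter names W1, b1, W2, b2 are illustrative):

def forward(x, W1, b1, W2, b2):
    # Hidden layer: affine transform followed by ReLU.
    z1 = x @ W1 + b1
    a1 = relu(z1)
    # Output layer: affine transform followed by softmax probabilities.
    z2 = a1 @ W2 + b2
    probs = softmax(z2)
    return z1, a1, probs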

Backward Propagation

Backward propagation is the process of updating the weights of a neural network based on the computed gradients of the loss function with respect to the weights. It involves propagating the error backward from the output layer to the input layer, adjusting the weights using gradient descent. Backward propagation enables the network to learn from the training data by updating its parameters to minimize the loss.
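A sketch of the corresponding gradients for a softmax output trained with cross-entropy and a ReLU hidden layer, assuming the forward pass above (again, the names are illustrative rather than the repository's exact code):

def backward(x, y_onehot, z1, a1, probs, W2):
    batch_size = x.shape[0]
    # For softmax with cross-entropy, the gradient at the output pre-activation
    # simplifies to (predicted probabilities - one-hot targets).
    dz2 = (probs - y_onehot) / batch_size
    dW2 = a1.T @ dz2
    db2 = dz2.sum(axis=0)
    # Propagate the error back through the output weights and the ReLU.
    dz1 = (dz2 @ W2.T) * relu_derivative(z1)
    dW1 = x.T @ dz1
    db1 = dz1.sum(axis=0)
    return dW1, db1, dW2, db2

Each parameter is then nudged against its gradient, e.g. W2 -= learning_rate * dW2 in plain SGD.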

Loss Function

The loss function measures the difference between the predicted output of the neural network and the true labels. It quantifies how well the network is performing during training and provides feedback for adjusting the model parameters. In this implementation, the cross-entropy loss function is used, which is commonly employed for multi-class classification tasks.
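A minimal sketch of cross-entropy for one-hot targets (the small eps guard against log(0) is an assumption, not necessarily present in the repository's code):

def cross_entropy(probs, y_onehot, eps=1e-12):
    # Average negative log-likelihood of the true class over the batch.
    return -np.sum(y_onehot * np.log(probs + eps)) / probs.shape[0]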

Preprocessing

Preprocessing is an essential step in preparing the input data for training a neural network. It involves transforming the raw data into a format that is suitable for the network architecture and learning algorithm. In this implementation, the MNIST images are flattened and normalized to ensure that pixel values are within the range [0, 1]. Additionally, the labels are converted to one-hot encoding to represent the target classes.
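A sketch of that preprocessing, assuming the images arrive as uint8 arrays of shape (N, 28, 28) and the labels as integers 0-9:

def preprocess(images, labels, num_classes=10):
    # Flatten each 28x28 image into a 784-dimensional vector and scale pixels to [0, 1].
    x = images.reshape(images.shape[0], -1).astype(np.float32) / 255.0
    # One-hot encode the integer labels.
    y = np.zeros((labels.shape[0], num_classes), dtype=np.float32)
    y[np.arange(labels.shape[0]), labels] = 1.0
    return x, y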

Results

After training for 5 epochs, the model achieved an accuracy of approximately 90.83% on the test set.


Requirements

  • Python 3.x
  • NumPy
  • Matplotlib

Installation

Clone the repository to your local machine:

git clone https://github.com/abhie7/dl-from-scratch-mnist.git

Install the required dependencies:

pip install numpy matplotlib

Usage

  1. Open the notebook and run all the cells to train the neural network.
  2. After training, the output will display the training loss and accuracy for each epoch, as well as plots showing the training loss and accuracy trends.

License

This project is licensed under the MIT License - see the LICENSE file for details.
