CNN from Scratch – Pinktober Datathon 2025

Micro Club Challenge #4

Introduction

This project was completed as part of the 4th challenge of the Pinktober 2025 Datathon, organized by Micro Club – USTHB.

The challenge required participants to build a Convolutional Neural Network (CNN), or optionally a CNN + LSTM hybrid, entirely from scratch using NumPy only — without the use of high-level frameworks such as TensorFlow, PyTorch, or scikit-learn.

The goal of this challenge was to test deep learning fundamentals, especially understanding the inner workings of CNNs and LSTMs: from convolution operations and feature extraction to sequence modeling and backpropagation.

Team AmalAI


Challenge Description

Beyond the constraint of using only low-level libraries such as NumPy, the objective was to implement all core components manually — convolution operations, activation functions, pooling, flattening, softmax classification, and backpropagation — and to provide a clear, well-documented report explaining how each part works.


Project Structure

pinktober_cnn_lstm/
│
├── data/
│   ├── mnist-numpy/
│   │   └── mnist.npz
│   └── load_images.py
│
├── src/
│   ├── cnn.py
│   ├── loss.py
│   ├── train.py
│
├── report/
│   └── rapport.md
│
├── main.py
├── README.md
└── .gitignore


Dataset

The dataset used is MNIST (handwritten digits 0–9), downloaded from Kaggle (vikramtiwari/mnist-numpy).

Each image is 28×28 grayscale pixels.

The data is loaded using NumPy and normalized to values between 0 and 1 for better convergence.

x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
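The loading-plus-normalization step can be sketched end to end. The `mnist.npz` key names (`x_train`, `y_train`, `x_test`, `y_test`) are an assumption based on the common NumPy export of MNIST; adjust them if your archive differs.

```python
import numpy as np

def load_and_normalize(path="data/mnist-numpy/mnist.npz"):
    # Keys x_train/y_train/x_test/y_test are assumed from the standard
    # NumPy export of MNIST; adjust if your archive uses different names.
    with np.load(path) as data:
        x_train = data["x_train"].astype("float32") / 255.0
        y_train = data["y_train"]
        x_test = data["x_test"].astype("float32") / 255.0
        y_test = data["y_test"]
    return x_train, y_train, x_test, y_test
```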

To visualize a sample image:

import matplotlib.pyplot as plt

plt.imshow(x_train[0], cmap="gray")
plt.title(f"Label: {y_train[0]}")
plt.savefig("sample_image.png")

Implementation Details

1. Convolution Layer (Conv3x3)

Applies 3×3 filters that slide across the input image to extract spatial features such as edges or corners.

Each filter learns to detect a different pattern from the image.
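A minimal forward pass for such a layer might look like the sketch below — a simplified stand-in, not the repository's exact `cnn.py`:

```python
import numpy as np

class Conv3x3:
    """Valid 3x3 convolution with `num_filters` filters, no padding."""

    def __init__(self, num_filters):
        self.num_filters = num_filters
        # Small random filters; dividing by 9 keeps initial outputs modest.
        self.filters = np.random.randn(num_filters, 3, 3) / 9

    def forward(self, image):
        h, w = image.shape
        out = np.zeros((h - 2, w - 2, self.num_filters))
        for i in range(h - 2):
            for j in range(w - 2):
                region = image[i:i + 3, j:j + 3]
                # Each filter produces one value per 3x3 region.
                out[i, j] = np.sum(region * self.filters, axis=(1, 2))
        return out
```

On a 28×28 MNIST image, the output is 26×26 per filter, because a 3×3 window without padding loses one pixel on each border.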

2. Activation Function (ReLU)

Applies the Rectified Linear Unit (ReLU) function:

f(x) = max(0, x)

ReLU introduces non-linearity, allowing the model to learn complex relationships rather than simple linear patterns.
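In NumPy this is a one-liner:

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives are zeroed, positives pass through.
    return np.maximum(0, x)
```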

3. Pooling Layer (MaxPool2)

Reduces spatial dimensions by taking the maximum value from non-overlapping 2×2 regions.

This operation keeps the most important information and reduces computational cost.
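A simple sketch of 2×2 max pooling over a feature map of shape `(h, w, num_filters)` (illustrative, not necessarily the repository's exact implementation):

```python
import numpy as np

def maxpool2(feature_map):
    """Downsample by taking the max over non-overlapping 2x2 regions."""
    h, w, num_filters = feature_map.shape
    out = np.zeros((h // 2, w // 2, num_filters))
    for i in range(h // 2):
        for j in range(w // 2):
            region = feature_map[i * 2:i * 2 + 2, j * 2:j * 2 + 2]
            # Keep the strongest activation in each 2x2 window, per filter.
            out[i, j] = np.amax(region, axis=(0, 1))
    return out
```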

4. Flatten + Fully Connected Layer (Softmax)

After convolution and pooling, the 3D output is flattened into a 1D vector and passed through a fully connected (dense) layer with a softmax activation.

Softmax converts raw outputs into probabilities for each class (digits 0–9).
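The flatten-plus-dense-plus-softmax step can be sketched as follows; the class name and shapes are illustrative, and the logits are shifted by their maximum before exponentiating, a standard trick to avoid numerical overflow:

```python
import numpy as np

class SoftmaxLayer:
    """Flatten + dense layer with a softmax output over `num_classes`."""

    def __init__(self, input_len, num_classes):
        # Scaling by input_len keeps initial logits small.
        self.weights = np.random.randn(input_len, num_classes) / input_len
        self.biases = np.zeros(num_classes)

    def forward(self, x):
        flat = x.flatten()                     # 3D feature map -> 1D vector
        logits = flat @ self.weights + self.biases
        exp = np.exp(logits - np.max(logits))  # shift for numerical stability
        return exp / np.sum(exp)               # probabilities summing to 1
```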

5. Loss Function (Cross-Entropy)

Used to measure how well the model predicts the correct labels.

Implemented manually as:

loss = -np.sum(y_true_one_hot * np.log(y_pred))
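In practice it helps to clip the predicted probabilities before taking the log, so that a near-zero probability for the true class does not produce `-inf`. A hedged variant (the `eps` clipping is an addition, not necessarily in the repository's `loss.py`):

```python
import numpy as np

def cross_entropy(y_pred, y_true_one_hot, eps=1e-12):
    # Clipping avoids log(0) when the model assigns ~0 to the true class.
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true_one_hot * np.log(y_pred))
```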

6. Training Loop

For each image in the dataset:

  1. Forward pass through the CNN (Conv → ReLU → Pool → Softmax)
  2. Compute loss using cross-entropy
  3. Compute gradients for the softmax layer
  4. Update weights using a simple gradient descent step
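Putting the four steps together, one iteration might look like the sketch below. Consistent with step 3 above, only the softmax layer's parameters are updated; the gradient of cross-entropy with respect to the logits reduces to `probs - one_hot`. The function name and learning rate are illustrative.

```python
import numpy as np

def train_step(image, label, weights, biases, lr=0.005):
    """One forward pass + gradient step on the softmax layer only."""
    flat = image.flatten()
    logits = flat @ weights + biases
    exp = np.exp(logits - np.max(logits))
    probs = exp / exp.sum()                      # 1. forward pass (softmax)

    loss = -np.log(max(probs[label], 1e-12))     # 2. cross-entropy loss

    grad_logits = probs.copy()                   # 3. d(loss)/d(logits)
    grad_logits[label] -= 1                      #    = probs - one_hot

    weights -= lr * np.outer(flat, grad_logits)  # 4. gradient descent step
    biases -= lr * grad_logits
    return loss
```

Repeating `train_step` on the same example should steadily lower its loss, which is a quick sanity check for the gradient computation.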

How to Run

  1. Clone the repository:

    git clone https://github.com/aasmaa01/DeepPink-Challenge
    cd DeepPink-Challenge
    
  2. Create a virtual environment and install dependencies:

    python3 -m venv venv
    source venv/bin/activate
    pip install numpy matplotlib
    
  3. Download the dataset (if not already):

    kaggle datasets download vikramtiwari/mnist-numpy
    unzip mnist-numpy.zip -d data/mnist-numpy
    
  4. Run the main script:

    python3 main.py
    

To test faster, you can modify the number of training samples in train.py:

for i in range(100):  # instead of len(x_train)

Results

  • Successfully implemented a CNN from scratch using only NumPy.
  • Trained and tested the model on MNIST data.
  • Verified that each layer (Convolution, ReLU, Pooling, Flatten, Softmax) functions correctly.
  • Visualized a sample input image to confirm data integrity.
  • Observed valid loss and accuracy outputs during training.

Future Work

  • Add the LSTM part to process CNN features as sequential data.
  • Implement full backpropagation for all CNN layers (not just softmax).
  • Optimize the code using NumPy vectorization or GPU acceleration.
