This project was completed by Team AmalAI as part of the 4th challenge of the Pinktober 2025 Datathon, organized by Micro Club – USTHB.

The challenge required participants to build a Convolutional Neural Network (CNN), optionally combined with an LSTM, entirely from scratch using only NumPy. High-level frameworks such as TensorFlow, PyTorch, or scikit-learn were not allowed. The goal was to test deep learning fundamentals, from convolution operations and feature extraction to sequence modeling and backpropagation, by implementing every core component manually (convolution, activation functions, pooling, flattening, softmax classification, and backpropagation) and providing a clear, well-documented report explaining how each part works.
```
pinktober_cnn_lstm/
│
├── data/
│   ├── mnist-numpy/
│   │   └── mnist.npz
│   └── load_images.py
│
├── src/
│   ├── cnn.py
│   ├── loss.py
│   └── train.py
│
├── report/
│   └── rapport.md
│
├── main.py
├── README.md
└── .gitignore
```
The dataset used is MNIST (handwritten digits 0–9), downloaded from
Kaggle: vikramtiwari/mnist-numpy.
Each image is 28×28 grayscale pixels.
The data is loaded using NumPy and normalized to values between 0 and 1 for better convergence.
```python
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
```

To visualize a sample image:

```python
plt.imshow(x_train[0], cmap="gray")
plt.title(f"Label: {y_train[0]}")
plt.savefig("sample_image.png")
```

The convolution layer applies 3×3 filters that slide across the input image to extract spatial features such as edges or corners.
Each filter learns to detect a different pattern from the image.
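As an illustration, a minimal valid-mode (no padding, stride 1) 2D convolution can be written in a few lines of NumPy. This is a sketch of the idea, not necessarily the exact code in `src/cnn.py`:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide a 2D kernel over a 2D image (valid mode, stride 1),
    recording the element-wise product sum at each position."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 kernel over a 5x5 input produces a 3x3 feature map.
feature_map = conv2d(np.ones((5, 5)), np.ones((3, 3)))
print(feature_map.shape)  # (3, 3)
```

With an all-ones image and kernel, every output entry is the sum of a 3×3 window, i.e. 9, which is an easy sanity check for the sliding-window logic.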
The activation layer applies the Rectified Linear Unit (ReLU) function:
f(x) = max(0, x)
ReLU introduces non-linearity, allowing the model to learn complex relationships rather than simple linear patterns.
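In NumPy, ReLU and the mask needed for its backward pass are one-liners. The function names here are illustrative:

```python
import numpy as np

def relu(x):
    # element-wise max(0, x)
    return np.maximum(0, x)

def relu_backward(dout, x):
    # gradient flows only where the forward input was positive
    return dout * (x > 0)

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
```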
The max-pooling layer reduces spatial dimensions by taking the maximum value from each non-overlapping 2×2 region.
This operation keeps the most important information and reduces computational cost.
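A minimal sketch of 2×2 max pooling on a single 2D feature map (assuming the dimensions divide evenly by the pool size):

```python
import numpy as np

def maxpool2d(x, size=2):
    """Non-overlapping max pooling over a 2D feature map."""
    h, w = x.shape
    out = np.zeros((h // size, w // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # take the maximum of each size x size block
            out[i, j] = np.max(x[i*size:(i+1)*size, j*size:(j+1)*size])
    return out

x = np.arange(16).reshape(4, 4)
print(maxpool2d(x))  # [[ 5.  7.]
                     #  [13. 15.]]
```

Each 4×4 input shrinks to 2×2 while keeping the strongest activation in every block.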
After convolution and pooling, the 3D output is flattened into a 1D vector and passed through a fully connected (dense) layer with a softmax activation.
Softmax converts raw outputs into probabilities for each class (digits 0–9).
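A numerically stable softmax subtracts the maximum logit before exponentiating, which avoids overflow without changing the result. A sketch:

```python
import numpy as np

def softmax(logits):
    # shifting by the max logit is safe: it cancels in the ratio
    z = np.exp(logits - np.max(logits))
    return z / np.sum(z)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs)  # probabilities over the classes, summing to 1
```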
The cross-entropy loss measures how well the model predicts the correct labels. It is implemented manually as:

```python
loss = -np.sum(y_true_one_hot * np.log(y_pred))
```

For each image in the dataset:
- Forward pass through the CNN (Conv → ReLU → Pool → Softmax)
- Compute loss using cross-entropy
- Compute gradients for the softmax layer
- Update weights using a simple gradient descent step
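The last three steps above combine neatly: for a softmax output trained with cross-entropy, the gradient of the loss with respect to the logits reduces to (probabilities - one-hot target). A sketch of a single dense-layer update, where `W`, `b`, and `lr` are illustrative names rather than the exact ones used in `train.py`:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # stable softmax
    return e / np.sum(e)

def train_step(x_flat, y_onehot, W, b, lr=0.01):
    # forward: dense layer + softmax
    probs = softmax(W @ x_flat + b)
    loss = -np.sum(y_onehot * np.log(probs + 1e-9))  # epsilon guards log(0)
    # backward: d(loss)/d(logits) = probs - y for softmax + cross-entropy
    dlogits = probs - y_onehot
    W -= lr * np.outer(dlogits, x_flat)  # in-place gradient descent step
    b -= lr * dlogits
    return loss

rng = np.random.default_rng(0)
x = rng.random(4)
y = np.array([0.0, 1.0, 0.0])
W, b = rng.normal(size=(3, 4)), np.zeros(3)
losses = [train_step(x, y, W, b) for _ in range(50)]
print(losses[0] > losses[-1])  # the loss shrinks on this toy example
```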
1. Clone the repository:

   ```
   git clone https://github.com/aasmaa01/DeepPink-Challenge
   cd DeepPink-Challenge
   ```

2. Create a virtual environment and install dependencies:

   ```
   python3 -m venv venv
   source venv/bin/activate
   pip install numpy matplotlib
   ```

3. Download the dataset (if not already):

   ```
   kaggle datasets download vikramtiwari/mnist-numpy
   unzip mnist-numpy.zip -d data/mnist-numpy
   ```

4. Run the main script:

   ```
   python3 main.py
   ```

To test faster, you can modify the number of training samples in train.py:

```python
for i in range(100):  # instead of len(x_train)
```

- Successfully implemented a CNN from scratch using only NumPy.
- Trained and tested the model on MNIST data.
- Verified that each layer (Convolution, ReLU, Pooling, Flatten, Softmax) functions correctly.
- Visualized a sample input image to confirm data integrity.
- Observed valid loss and accuracy outputs during training.
- Add the LSTM part to process CNN features as sequential data.
- Implement full backpropagation for all CNN layers (not just softmax).
- Optimize the code using NumPy vectorization or GPU acceleration.