# Mask Detection CNN Model

---

This is an ML model created by Keshav Ghai (an aspiring AI/ML dev).

This is a **Convolutional Neural Network (CNN)** which detects whether a person is wearing a face mask or not from image data. Unlike previous models that worked with text or tabular data, this model learns visual patterns directly from raw image pixels. The training script **"CNN_trainer.py"** takes image data, preprocesses it, trains a CNN with 3 convolutional layers, and generates visualizations.

## What makes this different?

CNNs are special neural networks designed for image data. They use **convolutional layers** that automatically learn to detect features like edges, textures, and shapes by sliding small filters across images. This is much more effective than flattening images into vectors because it preserves spatial relationships.
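
To make the "sliding filter" idea concrete, here is a minimal NumPy sketch. It is not part of `CNN_trainer.py`; the helper `conv2d_valid`, the toy image, and the hand-written edge filter are all illustrative assumptions. A real `Conv2D` layer learns its filter values during training and handles many channels at once.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]   # the region under the filter
            out[i, j] = np.sum(patch * kernel)  # one output value per position
    return out

# Toy 5x5 grayscale image: dark on the left (0), bright on the right (1)
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=np.float32)

# Hand-written vertical-edge filter (a CNN would learn such values itself)
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=np.float32)

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)
print(feature_map)
```

The output is large where the filter straddles the dark-to-bright boundary and zero over the uniformly bright region, which is the sense in which a filter "detects" an edge while keeping track of where it occurs.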

## Imports:-
---

In [None]:
import os
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
import cv2
import pandas as pd

## 0. Dataset Normalization (Data Preprocessing)
---

Before training the model, we have to normalize the dataset.

**Why normalize?** Raw images come in different sizes and formats. Normalization standardizes them so the model can process them consistently.

### Normalization Steps:
1. **Read images** using OpenCV from the dataset folder
2. **Convert color space** from BGR (OpenCV's default) to RGB (standard format)
3. **Resize to 256×256** - ensures all images have the same dimensions
4. **Save normalized images** to a new clean directory

The `normalization.py` script handles this automatically, creating a `normalized_dataset` folder with properly formatted images ready for training.
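
As a small illustration of the color-space step, the sketch below uses plain NumPy instead of OpenCV. The two-pixel image is made up for the example. For an 8-bit, 3-channel image, swapping BGR to RGB amounts to reversing the channel axis.

```python
import numpy as np

# Hypothetical 1x2 image in OpenCV's BGR layout:
# first pixel is pure blue, second pixel is pure red
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # the blue pixel, now in RGB order
print(rgb[0, 1])  # the red pixel, now in RGB order

# Applying the swap twice gives back the original array
restored = rgb[..., ::-1]
print(np.array_equal(restored, bgr))
```

Because the swap is its own inverse, converting to RGB for processing and back to BGR before saving leaves the stored colors unchanged.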

In [None]:
# normalization.py

import os
import cv2
import numpy as np

DATASET_DIR = "./tensorflow/mask_detector/dataset"
OUTPUT_DIR = "./tensorflow/mask_detector/normalized_dataset"
TARGET_SIZE = (256, 256)

folders = ["with_mask", "without_mask", "eval"]

def ensure_dir(path):
    # Create the directory (and any missing parents) if it does not exist yet
    os.makedirs(path, exist_ok=True)

def process_folder(folder):
    input_path = os.path.join(DATASET_DIR, folder)
    output_path = os.path.join(OUTPUT_DIR, folder)
    ensure_dir(output_path)

    for img_name in os.listdir(input_path):
        img_path = os.path.join(input_path, img_name)

        # Read image with OpenCV
        img = cv2.imread(img_path)
        if img is None:
            print(f"Skipping unreadable file: {img_path}")
            continue

        # Convert BGR → RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # Resize to 256×256
        img = cv2.resize(img, TARGET_SIZE)

        # Save normalized image (cv2.imwrite expects BGR, so convert back first)
        out_path = os.path.join(output_path, img_name)
        cv2.imwrite(out_path, cv2.cvtColor(img, cv2.COLOR_RGB2BGR))

    print(f"Processed folder: {folder}")

def main():
    ensure_dir(OUTPUT_DIR)
    for folder in folders:
        process_folder(folder)

    print("Normalization complete! All images resized to 256x256.")

if __name__ == "__main__":
    main()


## 1. Define Paths and Constants
---

> These variables point to the normalized dataset directories and where to save the trained model.

In [None]:
# Paths to normalized dataset
DATA_DIR = "./tensorflow/mask_detector/normalized_dataset"
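# Note: normalization.py writes eval images to normalized_dataset/eval. This path
# assumes the eval images (plus classes.csv) were moved to their own folder, so that
# DATA_DIR contains only the two class folders.
EVAL_DIR = "./tensorflow/mask_detector/normalized_eval"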
CLASSES_CSV = os.path.join(EVAL_DIR, "classes.csv")

# Where to save the trained model
MODEL_SAVE_PATH = "./Models/mask_detector.keras"
PLOT_SAVE_DIR = "./tensorflow/mask_detector"

# Image and training settings
IMG_SIZE = (256, 256)  # Height x Width of images the model expects
BATCH_SIZE = 32        # Number of images to process together

print(f"Data will be loaded from: {DATA_DIR}")
print(f"Model will be saved to: {MODEL_SAVE_PATH}")

## 2. Load Dataset Using TensorFlow ImageDataset
---

> TensorFlow's `image_dataset_from_directory` automatically loads images and labels from the folder structure. It treats each subfolder as a class and assigns integer labels in alphanumeric order of the folder names, so here `with_mask` = 0 and `without_mask` = 1.
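
The label assignment and the 85/15 split can be sketched in plain Python. This only mirrors the documented behaviour and is not the Keras implementation: the helper `split_counts`, the image count of 1000, and the exact rounding are assumptions for illustration.

```python
# Folder names found in the dataset directory (order on disk does not matter)
class_folders = ["without_mask", "with_mask"]

# Labels follow the alphanumeric order of the folder names
class_names = sorted(class_folders)
label_of = {name: idx for idx, name in enumerate(class_names)}
print(label_of)

def split_counts(num_images, validation_split):
    """Approximate number of (training, validation) images for a given split."""
    num_val = int(validation_split * num_images)
    return num_images - num_val, num_val

# Hypothetical dataset of 1000 images with validation_split=0.15
num_train, num_val = split_counts(1000, 0.15)
print(num_train, num_val)
```

Knowing that `with_mask` maps to 0 matters later: a sigmoid output close to 1 then means "without mask", not "with mask".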

In [None]:
# Load training data (85% of dataset)
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    DATA_DIR,
    labels="inferred",           # Read labels from folder names
    label_mode="binary",          # Binary classification: mask vs no mask (0 or 1)
    color_mode="rgb",             # Load as RGB color images (3 channels)
    batch_size=BATCH_SIZE,         # Process 32 images at once
    image_size=IMG_SIZE,           # Resize all images to 256x256
    validation_split=0.15,         # 15% for validation
    subset="training",            # This loads the training portion
    seed=42                        # Fixed seed for reproducibility
)

# Load validation data (remaining 15%)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    DATA_DIR,
    labels="inferred",
    label_mode="binary",
    color_mode="rgb",
    batch_size=BATCH_SIZE,
    image_size=IMG_SIZE,
    validation_split=0.15,
    subset="validation",           # This loads the validation portion
    seed=42
)

# Prefetch: Load next batch while current batch is being processed (optimization)
train_ds = train_ds.prefetch(buffer_size=tf.data.AUTOTUNE)
val_ds = val_ds.prefetch(buffer_size=tf.data.AUTOTUNE)

print("Dataset loaded successfully!")

## 3. Understanding the CNN Architecture
---

> A CNN learns hierarchical features: early layers detect simple patterns (edges), middle layers combine them (shapes), and later layers recognize high-level features (faces, masks).

### What each component does:

**Rescaling Layer**: Converts pixel values from [0, 255] to [0, 1]. Small, consistently scaled inputs keep the gradients well-behaved, which helps training converge.

**Conv2D Layers**: Apply filters that slide across the image to detect features. `Conv2D(32, (3,3))` means 32 filters of size 3×3 pixels.

**MaxPooling2D**: Reduces spatial dimensions by keeping only the maximum value in each 2×2 window. This:
- Makes computation faster
- Helps reduce overfitting
- Focuses on the most important features
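
A minimal NumPy sketch of 2×2 max pooling on a single-channel feature map is shown here. The helper `max_pool_2x2` and the input values are illustrative; the Keras layer does the same thing per channel and per batch item.

```python
import numpy as np

def max_pool_2x2(x):
    """Keep the maximum of each non-overlapping 2x2 window of a 2D array."""
    h2, w2 = x.shape[0] // 2, x.shape[1] // 2
    x = x[:h2 * 2, :w2 * 2]                       # drop an odd last row/column
    return x.reshape(h2, 2, w2, 2).max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 1, 5, 6],
                        [2, 2, 7, 8]], dtype=np.float32)

pooled = max_pool_2x2(feature_map)
print(pooled)
```

The 4×4 input shrinks to 2×2, so every later layer has a quarter as many values to process.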

**Flatten**: Converts the 3D feature maps (height × width × channels) into a 1D vector so they can go through dense layers.

**Dense Layers**: Traditional neural network layers that learn patterns from the features extracted by convolution.

**Dropout(0.3)**: Randomly ignores 30% of neurons during training to prevent overfitting.

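**Output Layer**: Single neuron with sigmoid activation for binary classification. Because labels follow the alphabetical order of the folder names, an output near 0 means "with mask" and an output near 1 means "without mask".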
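
To see how these components fit together, the following sketch traces the feature-map size and parameter count through the layers. It assumes Keras defaults (unpadded 3×3 convolutions with stride 1, 2×2 pooling) and the layer sizes used in this notebook (32, 32, and 16 filters, then `Dense(8)` and `Dense(1)`). The helper names are made up for the example, and `model.summary()` remains the authoritative source.

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded, stride-1 convolution along one axis."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping pooling along one axis (floor division)."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Filter weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

size, channels, total_params = 256, 3, 0
for filters in (32, 32, 16):            # the three Conv2D blocks
    total_params += conv_params(channels, filters)
    size = pool_out(conv_out(size))     # Conv2D followed by MaxPooling2D
    channels = filters
    print(f"after block with {filters} filters: {size}x{size}x{channels}")

flat = size * size * channels           # Flatten
total_params += flat * 8 + 8            # Dense(8): weights + biases
total_params += 8 * 1 + 1               # Dense(1): weights + bias
print(f"flattened length: {flat}, trainable parameters: {total_params}")
```

Under these assumptions, almost all parameters sit in the first dense layer, because it connects every flattened feature to each of its 8 units.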

In [None]:
# Build the CNN model
model = tf.keras.Sequential([
    # Input: Image is normalized from [0, 255] to [0, 1]
    tf.keras.layers.Rescaling(1./255, input_shape=(*IMG_SIZE, 3)),

    # First convolutional block: Extract low-level features (edges, textures)
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(),

    # Second convolutional block: Combine features from first layer
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(),

    # Third convolutional block: Learn high-level patterns
    tf.keras.layers.Conv2D(16, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(),

    # Flatten and fully connected layers
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    
    # Binary classification output (0 or 1)
    tf.keras.layers.Dense(1, activation='sigmoid', dtype="float32")
])

# Compile the model
model.compile(
    optimizer="adam",                    # Adam optimizer: adapts the step size per parameter
    loss="binary_crossentropy",          # Loss function for binary classification
    metrics=["accuracy"]                  # Track accuracy during training
)

# Display model architecture
model.summary()

## 4. Training the Model
---

> The model learns by:
1. Making predictions on training images
2. Comparing predictions to actual labels
3. Adjusting weights to reduce error
4. Repeating for 10 epochs (passes through entire dataset)

Within each epoch, the model goes through the training set in batches of 32 images and updates its weights once per batch. This full pass is then repeated for 10 epochs.
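
The number of weight updates follows from simple arithmetic. The sketch below uses a hypothetical training set of 850 images, since the real count depends on your dataset; the function name `count_updates` is also made up for the example.

```python
import math

def count_updates(num_train_images, batch_size, epochs):
    """Return (weight updates per epoch, weight updates over the whole run)."""
    steps_per_epoch = math.ceil(num_train_images / batch_size)  # last batch may be smaller
    return steps_per_epoch, steps_per_epoch * epochs

# Hypothetical: 850 training images, batch size 32, 10 epochs
steps_per_epoch, total_updates = count_updates(850, 32, 10)
print(steps_per_epoch, total_updates)
```

The steps-per-epoch value is the same number Keras shows as the denominator of the progress bar during `model.fit`.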

In [None]:
# Train the model
history = model.fit(
    train_ds,                    # Training data
    validation_data=val_ds,      # Validation data (for monitoring)
    epochs=10                    # Number of complete passes through dataset
)

# Save the trained model
model.save(MODEL_SAVE_PATH)
print(f"\nModel saved at: {MODEL_SAVE_PATH}")

## 5. Evaluate Model Performance
---

> We test the model on both training and validation data to see:
- **Training Accuracy**: How well it learned the training data
- **Validation Accuracy**: How well it generalizes to unseen data (the real measure)

If training accuracy is much higher than validation accuracy, the model is likely overfitting.
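
One rough way to turn this rule of thumb into a check is sketched below. The function `looks_overfit` and the 5-percentage-point threshold are arbitrary assumptions for illustration, not an established cut-off, and the accuracy values are invented.

```python
def looks_overfit(train_acc, val_acc, max_gap=0.05):
    """Flag a run whose training accuracy exceeds validation accuracy by more than max_gap."""
    return (train_acc - val_acc) > max_gap

# Hypothetical results from two training runs
print(looks_overfit(0.99, 0.85))  # large gap between the two accuracies
print(looks_overfit(0.93, 0.92))  # small gap between the two accuracies
```

A flagged run is a prompt to look at the loss curves, add regularization, or collect more data, rather than a verdict on its own.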

In [None]:
# Evaluate on training set
train_loss, train_acc = model.evaluate(train_ds)

# Evaluate on validation set
val_loss, val_acc = model.evaluate(val_ds)

print(f"\n=== Model Performance ===")
print(f"Training Accuracy:   {train_acc * 100:.2f}%")
print(f"Validation Accuracy: {val_acc * 100:.2f}%")
print(f"\nTraining Loss:   {train_loss:.4f}")
print(f"Validation Loss: {val_loss:.4f}")

## 6. Visualizations
---

> Multiple graphs show how the model improved over training epochs and its actual predictions.

### a. Loss Over Epochs:-

Shows how the error decreased during training. Ideally, both curves should smoothly decrease, and validation loss should not increase sharply (sign of overfitting).

In [None]:
plt.figure(figsize=(6,4))
plt.plot(history.history["loss"], label="Train Loss")
plt.plot(history.history["val_loss"], label="Val Loss")
plt.title("Loss Over Epochs")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.grid(True)
plt.savefig(os.path.join(PLOT_SAVE_DIR, "loss_graph.png"))
plt.close()

print("Loss graph saved")

### b. Accuracy Over Epochs:-

Shows how the model's correctness improved. A rising curve means the model is learning.

In [None]:
plt.figure(figsize=(6,4))
plt.plot(history.history["accuracy"], label="Train Accuracy")
plt.plot(history.history["val_accuracy"], label="Val Accuracy")
plt.title("Accuracy Over Epochs")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.legend()
plt.grid(True)
plt.savefig(os.path.join(PLOT_SAVE_DIR, "accuracy_graph.png"))
plt.close()

print("Accuracy graph saved")

### c. Confusion Matrix:-

Shows actual vs predicted labels:
- **Top-left**: Correctly predicted "With Mask"
- **Top-right**: Incorrectly said "Without Mask" when it was "With Mask" (False Negative, treating "With Mask" as the positive class)
- **Bottom-left**: Incorrectly said "With Mask" when it was "Without Mask" (False Positive, treating "With Mask" as the positive class)
- **Bottom-right**: Correctly predicted "Without Mask"

Ideally, diagonal numbers should be high, off-diagonal should be low.
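
The layout described above can be reproduced by hand on a few made-up labels. This sketch uses NumPy only; the label lists are invented, 0 stands for "With Mask" and 1 for "Without Mask", and "With Mask" is treated as the positive class for precision and recall.

```python
import numpy as np

# Hypothetical labels: 0 = With Mask, 1 = Without Mask
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 0, 0, 1, 1]

# Rows are true labels, columns are predicted labels
cm = np.zeros((2, 2), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[t, p] += 1
print(cm)

tp, fn = cm[0, 0], cm[0, 1]   # top row: images that really have a mask
fp = cm[1, 0]                 # bottom-left: no mask, but predicted "With Mask"

accuracy = np.trace(cm) / cm.sum()   # diagonal entries are the correct predictions
recall = tp / (tp + fn)              # share of masked faces that were found
precision = tp / (tp + fp)           # share of "With Mask" predictions that were right
print(accuracy, recall, precision)
```

Accuracy alone can hide which kind of mistake dominates; here the model misses few masked faces but wrongly reports a mask fairly often.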

In [None]:
# Load the evaluation CSV
df = pd.read_csv(CLASSES_CSV)

y_true = []
y_pred = []

# Test on evaluation set
for idx, row in df.iterrows():
    filename = row["filename"]
    # CSV column "with_mask": 1 → label 0 (with_mask), 0 → label 1 (without_mask),
    # matching the label order used during training
    true_label = 0 if row["with_mask"] == 1 else 1

    img_path = os.path.join(EVAL_DIR, filename)

    # Read and preprocess image
    img = cv2.imread(img_path)
    if img is None:
        print(f"Skipping: {filename}")
        continue

    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, IMG_SIZE)
    img = img.astype("float32")  # No /255 here: the model's Rescaling layer already does it
    img = np.expand_dims(img, axis=0)  # Add batch dimension

    # Make prediction
    pred = model.predict(img)[0][0]
    pred_label = 1 if pred > 0.5 else 0

    y_true.append(true_label)
    y_pred.append(pred_label)

# Generate confusion matrix
cm = confusion_matrix(y_true, y_pred)

# Plot confusion matrix
plt.figure(figsize=(6,5))
sns.heatmap(
    cm, annot=True, fmt="d",
    xticklabels=["With Mask", "Without Mask"],
    yticklabels=["With Mask", "Without Mask"]
)
plt.title("Confusion Matrix (Eval Set)")
plt.ylabel("True Label")
plt.xlabel("Predicted Label")
plt.savefig(os.path.join(PLOT_SAVE_DIR, "confusion_matrix.png"))
plt.close()

print("Confusion matrix saved.")

## 7. Interactive Testing Mode
---

> Test the trained model on any image. The model will:
1. Read the image file
2. Preprocess it (convert to RGB, resize to 256×256; the model's Rescaling layer scales the pixels)
3. Make a prediction
4. Output "WITH Mask" or "WITHOUT Mask"

The model outputs a score between 0 and 1. Scores near 0 mean "WITH Mask" and scores near 1 mean "WITHOUT Mask"; the further the score is from 0.5, the more confident the prediction.
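
The mapping from score to label and confidence can be written as a small standalone function. The name `interpret_score` is made up for this sketch, and the 0.5 threshold matches the one used in this notebook.

```python
def interpret_score(score, threshold=0.5):
    """Map a sigmoid output in [0, 1] to a (label, confidence) pair."""
    if score > threshold:
        return "WITHOUT Mask", score          # score is the probability of class 1
    return "WITH Mask", 1.0 - score           # probability of class 0

# Hypothetical model outputs
for score in (0.9, 0.2, 0.5):
    label, confidence = interpret_score(score)
    print(f"score={score:.2f} -> {label} ({confidence * 100:.0f}% confident)")
```

A score of exactly 0.5 falls on the "WITH Mask" side with 50% confidence, which is the model saying it cannot tell.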

In [None]:
print("\nInteractive testing mode. Enter image path or 'quit' to exit.")

while True:
    path = input("\nImage path: ").strip()

    if path.lower() == "quit":
        print("Exiting testing mode.")
        break

    # Validate path
    if not os.path.exists(path):
        print("Invalid path. Try again.")
        continue

    # Preprocess image
    img = cv2.imread(path)
    if img is None:
        print("Could not read image. Try another file.")
        continue
        
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, IMG_SIZE)
    img = img.astype("float32")  # No /255 here: the model's Rescaling layer already does it
    img = np.expand_dims(img, axis=0)  # Add batch dimension

    # Make prediction
    pred = model.predict(img)[0][0]
    label = "WITHOUT Mask" if pred > 0.5 else "WITH Mask"
    confidence = pred if pred > 0.5 else (1 - pred)

    print(f"Prediction: {label}")
    print(f"Confidence: {confidence * 100:.2f}%")