## **Part A** - Binary Classification of Masked vs. Unmasked Faces using Traditional Methods 

In [None]:
import os
import cv2
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam, SGD
from tensorflow.keras.utils import to_categorical
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV

## Image Loader Function

The following function loads images from a given folder, converts them to grayscale, resizes them to **64x64**, and assigns a label.
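
`os.listdir` returns every entry in the folder, including non-image files and subdirectories, and the loader relies on `cv2.imread` returning `None` to skip them. As a minimal sketch of an alternative, the helper below filters by extension before any image is read. The name `list_image_files` and the extension set are assumptions for illustration, not part of the original notebook.

```python
import os

# Hypothetical helper; the extension set below is an assumption about the dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def list_image_files(folder):
    """Return sorted paths of regular files in `folder` with an image-like extension."""
    paths = []
    # Sorting makes the load order reproducible across machines.
    for filename in sorted(os.listdir(folder)):
        path = os.path.join(folder, filename)
        ext = os.path.splitext(filename)[1].lower()
        if os.path.isfile(path) and ext in IMAGE_EXTENSIONS:
            paths.append(path)
    return paths
```

The returned paths could be passed to `cv2.imread` one by one in place of the `os.listdir` loop; the `None` check is still worth keeping for corrupt files.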

In [None]:
def loader(folder, label):
    images = []
    labels = []
    for filename in os.listdir(folder):
        img_path = os.path.join(folder, filename)
        img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE) 
        if img is None:
            print(f"Warning: Unable to read {img_path}. Skipping...")
            continue
        img = cv2.resize(img, (64, 64)) 
        images.append(img)
        labels.append(label)
    return images, labels

## Dataset Paths

We define the paths to the dataset folders containing images with and without masks.

In [3]:
mask_folder = "./dataset/with_mask"
no_mask_folder = "./dataset/without_mask"

## Loading the Dataset

We use the `loader` function to load images from the dataset.  
- **`mask_folder`** contains images of people wearing masks (label: `1`).  
- **`no_mask_folder`** contains images of people without masks (label: `0`).  

In [4]:
maskImages, maskLabels = loader(mask_folder, 1)
noMaskImages, noMaskLabels = loader(no_mask_folder, 0)



### Preparing the Dataset

We combine the images and labels from both categories into NumPy arrays:  
- `X` contains all images (masked and unmasked).  
- `y` contains the corresponding labels (`1` for mask, `0` for no mask).  

In [5]:
X = np.array(maskImages + noMaskImages)
y = np.array(maskLabels + noMaskLabels)

### HOG Feature Extraction

The following function `hogExtractor` extracts **Histogram of Oriented Gradients (HOG)** features from a list of images.  

#### Function Breakdown:
- Iterates through each image in the dataset.
- Computes HOG features using:
  - `pixels_per_cell = (8, 8)`: Each cell is 8×8 pixels.
  - `cells_per_block = (2, 2)`: Blocks contain 2×2 cells.
  - `feature_vector = True`: Outputs a flattened feature vector.
- Returns a NumPy array of extracted features.
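
To make the size of the resulting feature vector concrete, the sketch below works out the HOG vector length from the parameters listed above. It assumes the scikit-image defaults that the notebook does not set explicitly: 9 orientation bins and blocks that slide one cell at a time. The helper name `hog_feature_length` is hypothetical.

```python
def hog_feature_length(image_size, pixels_per_cell=(8, 8),
                       cells_per_block=(2, 2), orientations=9):
    """Length of the flattened HOG vector for one single-channel image.

    Assumes blocks overlap with a stride of one cell and that
    `orientations` is left at 9 (the skimage.feature.hog default).
    """
    n_cells_r = image_size[0] // pixels_per_cell[0]
    n_cells_c = image_size[1] // pixels_per_cell[1]
    n_blocks_r = n_cells_r - cells_per_block[0] + 1
    n_blocks_c = n_cells_c - cells_per_block[1] + 1
    values_per_block = cells_per_block[0] * cells_per_block[1] * orientations
    return n_blocks_r * n_blocks_c * values_per_block

# 64x64 image -> 8x8 cells -> 7x7 blocks -> 7 * 7 * (2 * 2 * 9) values
print(hog_feature_length((64, 64)))  # 1764
```

Under these assumptions each 64x64 image is reduced from 4096 pixels to 1764 gradient-histogram values.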

In [6]:
def hogExtractor(images):
    hog_features = []
    for img in images:
        features = hog(img, pixels_per_cell=(8, 8), cells_per_block=(2, 2), feature_vector=True)
        hog_features.append(features)
    return np.array(hog_features)

### Extracting HOG Features

We apply the `hogExtractor` function to extract **HOG (Histogram of Oriented Gradients)** features from the image dataset.

In [7]:
X_features = hogExtractor(X)

### Splitting the Dataset

We split the dataset into **training** and **testing** sets using an 80-20 ratio.

In [8]:
X_train, X_test, y_train, y_test = train_test_split(X_features, y, test_size=0.2, random_state=42)

### Hyperparameter Tuning and Training SVM Model

We use **GridSearchCV** to find the best hyperparameters for an **SVM classifier**.  
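
As a rough illustration of what the search covers, the sketch below enumerates the candidate settings for the grid used in this notebook and counts the cross-validation fits. It uses only the standard library and mirrors the grid definition; it does not train anything.

```python
from itertools import product

param_grid = {'C': [0.1, 1, 10, 100], 'kernel': ['linear', 'rbf']}
cv_folds = 5

# Every combination of the listed values is one candidate model.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

# Each candidate is trained once per fold.
n_cv_fits = len(candidates) * cv_folds
print(len(candidates), n_cv_fits)  # 8 40
```

With the default `refit=True`, scikit-learn performs one additional fit of the best candidate on the whole training set after the search.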

In [14]:
param_grid = {'C': [0.1, 1, 10, 100], 'kernel': ['linear', 'rbf']}
grid = GridSearchCV(SVC(), param_grid, cv=5)
grid.fit(X_train, y_train)

print(f"Best parameters: {grid.best_params_}")
# GridSearchCV already refits the best configuration on the full training set (refit=True by default)
svm_model = grid.best_estimator_

Best parameters: {'C': 10, 'kernel': 'rbf'}


### Training a Neural Network (MLP Classifier)

We train a **Multi-Layer Perceptron (MLP) classifier** on the same HOG features, with two hidden layers of 128 and 64 units and up to 500 training iterations.

In [10]:
nn_model = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500)
nn_model.fit(X_train, y_train)

### Making Predictions

We use the trained models to predict labels for the test set.

In [15]:
svm_pred = svm_model.predict(X_test)
nn_pred = nn_model.predict(X_test)

### Evaluating the SVM Classifier

In [16]:

print("SVM Classifier Report:")
print(classification_report(y_test, svm_pred))

SVM Classifier Report:
              precision    recall  f1-score   support

           0       0.95      0.94      0.95       362
           1       0.96      0.96      0.96       453

    accuracy                           0.95       815
   macro avg       0.95      0.95      0.95       815
weighted avg       0.95      0.95      0.95       815
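
For readers unfamiliar with the columns of the report, here is a minimal sketch of how per-class precision, recall, and F1 follow from raw counts, and how a support-weighted average is formed. The counts passed to `precision_recall_f1` are made up for illustration; only the recall and support values in the last call are taken from the SVM report above. Both function names are hypothetical.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class metrics from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def weighted_average(values, supports):
    """Average of per-class values weighted by the number of samples per class."""
    return sum(v * s for v, s in zip(values, supports)) / sum(supports)

# Made-up counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(round(p, 2), round(r, 2), round(f1, 2))  # 0.9 0.75 0.82

# Recall and support per class as printed in the SVM report.
print(round(weighted_average([0.94, 0.96], [362, 453]), 2))  # 0.95
```

The support-weighted recall equals the overall accuracy, which is why both appear as 0.95 in the report.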


### Evaluating the Neural Network Classifier

In [13]:
print("Neural Network Classifier Report:")
print(classification_report(y_test, nn_pred))

Neural Network Classifier Report:
              precision    recall  f1-score   support

           0       0.91      0.92      0.91       362
           1       0.93      0.93      0.93       453

    accuracy                           0.92       815
   macro avg       0.92      0.92      0.92       815
weighted avg       0.92      0.92      0.92       815


## **Part B** - Binary Classification of Masked vs. Unmasked Faces with CNN 

## 1. Data Loading and Preprocessing

We assume our dataset is organized into two folders: one for images **with mask** and another for images **without mask**. We load the images in color, resize them to 64x64, and normalize pixel values.

In [None]:
def load_images_from_folder(folder, label, image_size=(64, 64)):
    images = []
    labels = []
    for filename in os.listdir(folder):
        img_path = os.path.join(folder, filename)
        img = cv2.imread(img_path, cv2.IMREAD_COLOR)  
        if img is None:
            print(f"Warning: Unable to read {img_path}. Skipping...")
            continue
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = cv2.resize(img, image_size)
        images.append(img)
        labels.append(label)
    return images, labels

mask_folder = "../dataset/with_mask"
no_mask_folder = "../dataset/without_mask"

mask_images, mask_labels = load_images_from_folder(mask_folder, label=1)
no_mask_images, no_mask_labels = load_images_from_folder(no_mask_folder, label=0)

print(f"Loaded {len(mask_images)} images with mask and {len(no_mask_images)} images without mask.")

Loaded 2142 images with mask and 1930 images without mask.


In [None]:
X = np.array(mask_images + no_mask_images)
y = np.array(mask_labels + no_mask_labels)
X = X.astype('float32') / 255.0

print('Total images:', X.shape[0])
print('Image shape:', X.shape[1:])

Total images: 4072
Image shape: (64, 64, 3)


## 2. Split Data into Training and Testing Sets

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print('Training set size:', X_train.shape[0])
print('Testing set size:', X_test.shape[0])

Training set size: 3257
Testing set size: 815


## 3. Building the CNN Model

We define a function `build_cnn_model` that accepts the following hyperparameters:
- **learning_rate**: step size for the optimizer
- **optimizer_choice**: `'adam'` or `'sgd'`
- **final_activation**: activation function for the final classification layer (for binary classification, `sigmoid` is the usual choice)
- **input_shape**: shape of the input images, `(64, 64, 3)` by default

The batch size is not a parameter of the model; it is passed to `model.fit` during training.

For binary classification, we use a final Dense layer with 1 neuron and a `sigmoid` activation. Other final activations can be tried through the `final_activation` argument.
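
To show what a single sigmoid output unit and the binary cross-entropy loss compute, here is a small standard-library sketch. The function names are hypothetical, and the 0.5 decision threshold is the conventional choice assumed here rather than something the notebook sets explicitly.

```python
import math

def sigmoid(z):
    """Squash a raw score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, p):
    """Loss for one sample: y_true is 0 or 1, p is the predicted P(label = 1)."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def to_label(p, threshold=0.5):
    """Turn a probability into a hard label (1 = mask, 0 = no mask)."""
    return int(p >= threshold)

p = sigmoid(2.0)
print(round(p, 3), to_label(p))  # 0.881 1
```

A confident correct prediction gives a small loss, while the same probability paired with the opposite label is penalized heavily, which is what drives the output toward 0 or 1 during training.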

In [None]:
def build_cnn_model(learning_rate=0.001, optimizer_choice='adam', final_activation='sigmoid', input_shape=(64, 64, 3)):
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dropout(0.5))
    
    model.add(Dense(1, activation=final_activation))
    
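    if optimizer_choice.lower() == 'adam':
        optimizer = Adam(learning_rate=learning_rate)
    elif optimizer_choice.lower() == 'sgd':
        optimizer = SGD(learning_rate=learning_rate)
    else:
        # Fail early instead of hitting an undefined `optimizer` below
        raise ValueError(f"Unsupported optimizer: {optimizer_choice!r}")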
    
    model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Build a baseline model
baseline_model = build_cnn_model()
baseline_model.summary()
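
The summary table itself is not reproduced in this export, so the sketch below works out by hand what the architecture implies. It assumes the Keras defaults used by the layers above: `'valid'` padding with stride 1 for `Conv2D`, and a pooling stride equal to the pool size for `MaxPooling2D`. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    # 'valid' padding, stride 1: each 3x3 convolution trims 2 pixels
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling halves the size, rounding down
    return size // pool

size, channels = 64, 3
total_params = 0
for filters in (32, 64, 128):
    # 3x3 weights per input channel, plus one bias per filter
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters

flat = size * size * channels          # input width of the first Dense layer
total_params += (flat + 1) * 128       # Dense(128)
total_params += (128 + 1) * 1          # Dense(1); Dropout has no parameters
print(size, flat, total_params)  # 6 4608 683329
```

Under these assumptions the spatial size shrinks from 64 to 6 across the three convolution and pooling stages, and most of the parameters sit in the first Dense layer.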



## 4. Training the CNN Model with Hyperparameter Variations

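Here we train the CNN under several hyperparameter settings. We vary the learning rate, the optimizer, and the batch size; the final activation is kept as `sigmoid` in every run. Each model is trained for 30 epochs, and the test set is also used as the validation data during training.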
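
When reading the training logs, the step counter per epoch (for example `102/102`) is the number of batches needed to cover the training set once. A minimal sketch of that arithmetic, using the training set size printed by the split cell; the function name `steps_per_epoch` is hypothetical.

```python
import math

def steps_per_epoch(n_samples, batch_size):
    """Number of batches per epoch; the last batch may be smaller than the rest."""
    return math.ceil(n_samples / batch_size)

n_train = 3257  # training set size reported after the 80-20 split

for batch_size in (32, 64):
    print(batch_size, steps_per_epoch(n_train, batch_size))
# 32 102
# 64 51
```

Doubling the batch size halves the number of weight updates per epoch, which is one reason the "Larger Batch" run can behave differently at the same learning rate.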

In [None]:
# Define hyperparameter experiments
experiments = [
    {'name': 'Baseline', 'learning_rate': 0.001, 'optimizer': 'adam', 'batch_size': 32, 'final_activation': 'sigmoid'},
    {'name': 'Low LR', 'learning_rate': 0.0001, 'optimizer': 'adam', 'batch_size': 32, 'final_activation': 'sigmoid'},
    {'name': 'High LR', 'learning_rate': 0.01, 'optimizer': 'adam', 'batch_size': 32, 'final_activation': 'sigmoid'},
    {'name': 'SGD Optimizer', 'learning_rate': 0.001, 'optimizer': 'sgd', 'batch_size': 32, 'final_activation': 'sigmoid'},
    {'name': 'Larger Batch', 'learning_rate': 0.001, 'optimizer': 'adam', 'batch_size': 64, 'final_activation': 'sigmoid'}
]

results = {}
num_epochs = 30

for exp in experiments:
    print(f"\nRunning experiment: {exp['name']}")
    model = build_cnn_model(learning_rate=exp['learning_rate'], optimizer_choice=exp['optimizer'], final_activation=exp['final_activation'])
    history = model.fit(X_train, y_train, epochs=num_epochs, batch_size=exp['batch_size'], validation_data=(X_test, y_test), verbose=2)
    test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
    results[exp['name']] = test_acc
    print(f"Test Accuracy for {exp['name']}: {test_acc:.4f}")

print("\nSummary of CNN Experiments on Test Dataset:")
for name, acc in results.items():
    print(f"{name}: {acc:.4f}")


Running experiment: Baseline
Epoch 1/30
102/102 - 9s - 85ms/step - accuracy: 0.7958 - loss: 0.4267 - val_accuracy: 0.8969 - val_loss: 0.2549
Epoch 2/30
102/102 - 6s - 58ms/step - accuracy: 0.9100 - loss: 0.2491 - val_accuracy: 0.9264 - val_loss: 0.2106
Epoch 3/30
102/102 - 6s - 54ms/step - accuracy: 0.9297 - loss: 0.1921 - val_accuracy: 0.9092 - val_loss: 0.2078
Epoch 4/30
102/102 - 5s - 54ms/step - accuracy: 0.9401 - loss: 0.1663 - val_accuracy: 0.9092 - val_loss: 0.2569
Epoch 5/30
102/102 - 6s - 58ms/step - accuracy: 0.9524 - loss: 0.1437 - val_accuracy: 0.9558 - val_loss: 0.1210
Epoch 6/30
102/102 - 6s - 55ms/step - accuracy: 0.9592 - loss: 0.1135 - val_accuracy: 0.9632 - val_loss: 0.1026
Epoch 7/30
102/102 - 6s - 58ms/step - accuracy: 0.9705 - loss: 0.0882 - val_accuracy: 0.9509 - val_loss: 0.1319
Epoch 8/30
102/102 - 6s - 54ms/step - accuracy: 0.9718 - loss: 0.0824 - val_accuracy: 0.9509 - val_loss: 0.1395
Epoch 9/30
102/102 - 6s - 60ms/step - accuracy: 0.9708 - loss: 0.0772 - va