
# Gender Classification Using CNN and DOE

This notebook demonstrates a pipeline for binary gender classification from facial images using a Convolutional Neural Network (CNN). A Design of Experiments (DOE) approach systematically varies architectural and training hyperparameters (learning rate, dropout rate, number of dense units, and number of convolutional filters) to find the best-performing configuration.
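
As a rough illustration of the DOE idea, the sketch below enumerates a full-factorial design over a handful of hyperparameters. The factor names mirror the arguments passed to `create_model` in this notebook, but the candidate levels are placeholders chosen for illustration, not the grid actually explored, and `full_factorial` is a hypothetical helper rather than part of `src`.

```python
from itertools import product


def full_factorial(factors):
    """Return one config dict per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


# Hypothetical levels for illustration only; not the grid used in the study.
factors = {
    "learning_rate": [0.0005, 0.001],
    "dropout": [0.2, 0.5],
    "n_units": [1024, 4096],
    "filters": [(32, 64), (64, 128)],
}

design = full_factorial(factors)
print(len(design))  # 2 * 2 * 2 * 2 = 16 runs
print(design[0])
```

Each dict in `design` could be unpacked into a model-building call, one training run per row. A full-factorial grid grows multiplicatively with every added factor, which is why fractional designs are often preferred once the number of factors increases.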


In [None]:

import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from src.data_loader import load_datasets
from src.model_builder import create_model, get_callbacks
from src.training_utils import train_and_evaluate, plot_training_curves



## Load Dataset

We begin by loading the dataset with the custom `load_datasets` helper. The dataset consists of facial portraits labeled by gender. The images share a uniform size (`img_size=130` in the call below) and are split into training, validation, and test sets.


In [None]:

train_ds, val_ds, test_ds = load_datasets("data_path_here", img_size=130)
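
The internals of `load_datasets` are not shown here. As a minimal sketch of how a three-way split can be produced, the snippet below shuffles sample indices once with a fixed seed and cuts them into train, validation, and test parts. The function name `split_indices`, the 70/15/15 proportions, and the seed are all assumptions made for illustration.

```python
import random


def split_indices(n_samples, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle sample indices once and cut them into train/val/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    n_val = round(n_samples * val_frac)
    n_test = round(n_samples * test_frac)
    n_train = n_samples - n_val - n_test
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 700 150 150
```

Splitting by index before any augmentation keeps the three sets disjoint, so no image can leak from the training set into validation or test.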



## Build and Train the CNN Model

We now build a custom CNN whose learning rate, dropout rate, number of dense units, and number of convolutional filters are configurable. The values used below were selected from our DOE analysis.


In [None]:

model = create_model(img_size=130, learning_rate=0.0005, dropout=0.2, n_units=4096, filters=(32, 64))
history = train_and_evaluate(model, train_ds, val_ds, callbacks=get_callbacks(0.89, 0.89))
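
To get a feel for how `img_size`, `filters`, and `n_units` interact, the sketch below counts trainable parameters for one plausible layout. The real architecture inside `create_model` is not shown, so everything here is an assumption: RGB input, 3×3 convolutions with 'valid' padding, 2×2 max pooling after each convolution, one hidden dense layer, and a single sigmoid output unit. `count_params` is a hypothetical helper.

```python
def count_params(img_size, filters, n_units, channels=3, kernel=3, pool=2):
    """Count trainable parameters for an assumed conv -> pool stack plus two dense layers."""
    size, in_ch, total = img_size, channels, 0
    for out_ch in filters:
        total += (kernel * kernel * in_ch + 1) * out_ch  # conv weights + biases
        size = (size - kernel + 1) // pool               # 'valid' conv, then max pooling
        in_ch = out_ch
    flat = size * size * in_ch                           # flattened feature vector length
    total += (flat + 1) * n_units                        # hidden dense layer
    total += n_units + 1                                 # single sigmoid output unit
    return flat, total


flat, total = count_params(img_size=130, filters=(32, 64), n_units=4096)
print(flat, total)  # 61504 251947969
```

Under these assumptions the hidden dense layer accounts for almost all of the parameters, which suggests that `n_units` and `img_size` would drive model size far more than the filter counts. The actual numbers depend on the real layer layout and can be checked with `model.summary()`.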



## Evaluate Model Performance

We assess the model with standard classification metrics (precision, recall, and F1-score) computed on the validation set. Because the validation set is also passed to `train_and_evaluate`, it is not strictly unseen; the held-out `test_ds` would give a less biased estimate of generalization.


In [None]:

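from sklearn.metrics import classification_report

# Gather labels and predictions in one pass so they stay aligned even if val_ds reshuffles.
y_true, y_prob = [], []
for x_batch, y_batch in val_ds:
    y_true.append(y_batch.numpy())
    y_prob.append(model.predict_on_batch(x_batch))
y_true = np.concatenate(y_true).astype(int).flatten()
y_pred = (np.concatenate(y_prob) > 0.5).astype(int).flatten()  # assumes one sigmoid output
print(classification_report(y_true, y_pred, target_names=['Female', 'Male']))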
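
For reference, the per-class numbers in the report follow directly from confusion counts. The sketch below recomputes precision, recall, and F1 for the positive class on a toy example; the label vectors are made up, and `binary_metrics` is a hypothetical helper, not part of scikit-learn.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one class from raw label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels; with target_names=['Female', 'Male'], label 1 corresponds to 'Male'.
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 1, 0, 1, 0]
print(binary_metrics(toy_true, toy_pred))  # (0.75, 0.75, 0.75)
```

Calling the helper with `positive=0` gives the metrics for the other class, matching the two rows of the report.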



## Visualize Training Progress

To better understand the learning behavior, we plot training and validation accuracy and loss over epochs. A widening gap between the training and validation curves suggests overfitting, while curves that both plateau at poor values suggest underfitting.


In [None]:

plot_training_curves(history)
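
The same diagnosis can be read off numerically. The sketch below assumes `history` behaves like a Keras `History` object, so that `history.history` is a dict with `"loss"` and `"val_loss"` lists; whether `train_and_evaluate` returns exactly that is an assumption. `summarize_history` is a hypothetical helper, and the numbers are invented for illustration.

```python
def summarize_history(hist):
    """Report the best validation epoch and the final train/validation loss gap."""
    val_loss = hist["val_loss"]
    best = min(range(len(val_loss)), key=val_loss.__getitem__)  # index of lowest val loss
    return {
        "best_epoch": best + 1,                         # 1-based epoch number
        "best_val_loss": val_loss[best],
        "final_gap": val_loss[-1] - hist["loss"][-1],   # large positive gap hints at overfitting
    }


# Made-up curves shaped like a Keras `History.history` dict.
toy_history = {
    "loss": [0.70, 0.50, 0.35, 0.25, 0.15],
    "val_loss": [0.68, 0.52, 0.45, 0.47, 0.55],
}
summary = summarize_history(toy_history)
print(summary["best_epoch"], summary["best_val_loss"])  # 3 0.45
```

In this toy run the validation loss bottoms out at epoch 3 while the training loss keeps falling, the typical signature of overfitting. With a real run, the call would be `summarize_history(history.history)` under the assumption stated above.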
