# Notebook 2: Model Development and Training

**Objective:** To define, compile, and train two distinct Convolutional Neural Network (CNN) architectures to solve our hand gesture recognition problem.

With a clean dataset prepared in the previous step, we can now focus on the models. The assessment brief requires us to experiment with two approaches:
1.  **A CNN built from scratch:** This establishes a baseline and demonstrates foundational knowledge of neural network architecture.
2.  **A model using Transfer Learning:** This leverages a pre-trained network (MobileNetV2), which typically reaches higher accuracy with less data and training time than a network trained from scratch.

This notebook will cover:
- Setting up data generators with augmentation.
- Defining the architecture for both models.
- Compiling the models with an optimizer and loss function.
- Running the training loop (`.fit()` method).
- Saving the final trained models for later evaluation.

### Setup: Imports and Configuration

We'll import TensorFlow and other necessary modules. We also define key hyperparameters for training, such as image size, batch size, and the number of epochs.

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import MobileNetV2
from pathlib import Path
import logging

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', datefmt='%H:%M:%S')
logger = logging.getLogger(__name__)

# Configuration
PROJECT_ROOT = Path.cwd().parent
DATA_DIR = PROJECT_ROOT / "data"
SAVED_MODELS_DIR = PROJECT_ROOT / "saved_models" / "v2_cropped"
SAVED_MODELS_DIR.mkdir(parents=True, exist_ok=True)

# Model Hyperparameters
IMAGE_SIZE = (150, 150)
BATCH_SIZE = 32
EPOCHS = 25
COLOR_MODE = 'rgb'
INPUT_SHAPE = IMAGE_SIZE + (3,)

### Part 1: Data Preparation with Augmentation

Before training, we need to create data generators. These tools efficiently load images from disk in batches and feed them to the model. For the training set, we apply **data augmentation**.

Data augmentation artificially expands the training set by generating modified versions of existing images (here: random rotations, shifts, shears, zooms, and horizontal flips). This makes the model more robust and reduces overfitting, because it has to learn the general features of a hand gesture rather than the specific pixels of our training images.

The validation and test sets are *not* augmented; they are only rescaled, as we need to evaluate the model on unmodified data.
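
To make these operations concrete, here is a toy sketch in plain Python of what rescaling and a horizontal flip do to pixel values. The 2x3 "image" and the helper names `rescale` and `flip_horizontal` are made up for illustration; `ImageDataGenerator` applies equivalent transformations to full image arrays, and this is not its implementation.

```python
# Toy illustration only: a tiny 2x3 grayscale "image" with 8-bit pixel values.
image = [
    [0, 51, 255],
    [102, 153, 204],
]

def rescale(img, factor=1.0 / 255):
    """Multiply every pixel by `factor`, mapping 0..255 to roughly 0.0..1.0."""
    return [[pixel * factor for pixel in row] for row in img]

def flip_horizontal(img):
    """Mirror the image left to right by reversing each row."""
    return [row[::-1] for row in img]

scaled = rescale(image)          # what every split receives
flipped = flip_horizontal(image)  # one of the training-only augmentations

print(flipped)  # [[255, 51, 0], [204, 153, 102]]
```

Rescaling is applied to every split so that the model always sees the same value range, whereas the flip (like the other random transforms) is applied only to training images.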

In [None]:
# This function is from src/utils/image_processing.py

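def create_data_generators(train_dir, validation_dir, test_dir, image_size, batch_size, color_mode='rgb'):
    """Build the augmented training generator and the rescale-only validation generator.

    `test_dir` is accepted but not used here, because no test generator
    is needed during training.
    """
    logger.info("Initializing Data Generators")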

    train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=25,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest'
    )

    val_test_datagen = ImageDataGenerator(rescale=1./255)

    train_generator = train_datagen.flow_from_directory(
        directory=train_dir,
        target_size=image_size,
        batch_size=batch_size,
        color_mode=color_mode,
        class_mode='categorical'
    )

    validation_generator = val_test_datagen.flow_from_directory(
        directory=validation_dir,
        target_size=image_size,
        batch_size=batch_size,
        color_mode=color_mode,
        class_mode='categorical'
    )
    
    return train_generator, validation_generator

# Create the generators
train_generator, validation_generator = create_data_generators(
    train_dir=DATA_DIR / "train",
    validation_dir=DATA_DIR / "validation",
    test_dir=DATA_DIR / "test", # Test generator not needed for training
    image_size=IMAGE_SIZE,
    batch_size=BATCH_SIZE
)

NUM_CLASSES = len(train_generator.class_indices)

### Part 2: Model #1 - A CNN from Scratch

Our first model is a classic CNN architecture. It consists of a stack of `Conv2D` and `MaxPooling2D` layers that act as feature extractors. These layers learn to identify low-level features like edges and textures, which are combined in deeper layers to form more complex patterns (like fingers and palms).

The feature maps are then flattened into a 1D vector and passed through a `Dense` classification head with `Dropout` to prevent overfitting.
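
The size of that flattened vector can be worked out by hand. The sketch below assumes the configuration used in this notebook's model definition: 150x150 inputs, three blocks of 3x3 convolutions with Keras's default `'valid'` padding (32, 64, and 128 filters), each followed by 2x2 max pooling, and a 512-unit dense layer. The helper names `conv_valid` and `max_pool` are illustrative, not part of Keras.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with no padding ('valid')."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output size of non-overlapping max pooling (any remainder is dropped)."""
    return size // pool

size = 150               # IMAGE_SIZE used in this notebook
filters = [32, 64, 128]  # one entry per Conv2D + MaxPooling2D block
for f in filters:
    size = max_pool(conv_valid(size))
    print(f"after block with {f} filters: {size}x{size}x{f}")

flattened = size * size * filters[-1]
dense_params = flattened * 512 + 512  # weights + biases of Dense(512)
print(flattened)     # 36992
print(dense_params)  # 18940416
```

Under these assumptions, almost all of the model's parameters sit in the first dense layer, which is one reason dropout is applied there.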

In [None]:
# This function is from src/models/architectures.py
def create_scratch_model(input_shape, num_classes):
    model = models.Sequential([
        # An explicit Input layer is the form recommended by recent Keras versions
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(128, (3, 3), activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.4),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model

# Create, compile, and summarize the model
scratch_model = create_scratch_model(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES)
scratch_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
scratch_model.summary()

#### Training the Scratch Model

Now we train the model using the `.fit()` method, passing our training and validation data generators. Keras will handle the training loop, backpropagation, and weight updates for the specified number of epochs.
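
The training cell sets `steps_per_epoch` and `validation_steps` with floor division, which skips the final partial batch whenever the sample count is not a multiple of the batch size. The sketch below compares floor and ceiling division; the sample count of 1000 is a hypothetical figure, not the size of this dataset, and the helper names are illustrative.

```python
import math

def steps_floor(num_samples, batch_size):
    """Steps per epoch when the final partial batch is skipped."""
    return num_samples // batch_size

def steps_ceil(num_samples, batch_size):
    """Steps per epoch when the final partial batch is included."""
    return math.ceil(num_samples / batch_size)

num_samples = 1000  # hypothetical dataset size, not the real count
batch_size = 32     # BATCH_SIZE in this notebook

print(steps_floor(num_samples, batch_size))  # 31 (8 images unused in the epoch)
print(steps_ceil(num_samples, batch_size))   # 32
```

Note that floor division yields zero steps if a split contains fewer images than one batch, so it is worth checking `validation_generator.samples` before training.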

In [None]:
logger.info("Starting training for Model #1: Scratch CNN")
history_scratch = scratch_model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE
)

# Save the trained model
model_save_path = SAVED_MODELS_DIR / "scratch_model.keras"
scratch_model.save(model_save_path)
logger.info(f"Model #1 saved to {model_save_path}")

### Part 3: Model #2 - Transfer Learning with MobileNetV2

Transfer learning allows us to use a model that has already been trained on a massive dataset (like ImageNet) and adapt it for our specific task. We use **MobileNetV2** as our base model because it is lightweight and highly effective for image classification.

The process involves:
1.  **Instantiating the base model** without its original classification head (`include_top=False`).
2.  **Freezing the base model's layers** (`base_model.trainable = False`). This prevents the pre-trained weights from being modified during initial training, preserving the learned features.
3.  **Adding a new classification head** on top. We add a `GlobalAveragePooling2D` layer to reduce dimensionality, followed by a `Dense` layer with `softmax` activation for our specific number of classes.
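
A quick calculation shows how small the trainable part of this model is once the base is frozen. The sketch assumes the standard MobileNetV2 (width multiplier 1.0), whose final feature map has 1280 channels and whose headless base has 2,257,984 parameters according to the Keras model summary. The class count of 10 is hypothetical, since the notebook derives `NUM_CLASSES` from the data, and `dense_params` is an illustrative helper.

```python
def dense_params(inputs, units):
    """Trainable parameters of a Dense layer: one weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

FEATURES = 1280          # channels in MobileNetV2's final feature map
BASE_PARAMS = 2_257_984  # MobileNetV2 without its top (frozen here)

num_classes = 10  # hypothetical; the notebook uses NUM_CLASSES from the data
head = dense_params(FEATURES, num_classes)

print(head)  # 12810
print(f"{head / (BASE_PARAMS + head):.2%} of all parameters are trainable")
```

`GlobalAveragePooling2D` and `Dropout` add no parameters of their own, so the final `Dense` layer accounts for everything that is learned during this stage.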

In [None]:
# This function is from src/models/architectures.py
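def create_transfer_model(input_shape, num_classes):
    # Note: 150x150 is not one of the sizes MobileNetV2 ships ImageNet weights for
    # (96, 128, 160, 192, 224), so Keras warns and loads the 224x224 weights.
    base_model = MobileNetV2(input_shape=input_shape,
                             include_top=False,
                             weights='imagenet')
    base_model.trainable = False # Freeze the base

    model = models.Sequential([
        layers.Input(shape=input_shape),
        # The generators scale pixels to [0, 1], but MobileNetV2's ImageNet
        # weights expect inputs in [-1, 1]; this layer maps one range to the other.
        layers.Rescaling(scale=2.0, offset=-1.0),
        base_model,
        layers.GlobalAveragePooling2D(),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation='softmax')
    ])
    return model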

# Create, compile, and summarize the model
transfer_model = create_transfer_model(input_shape=INPUT_SHAPE, num_classes=NUM_CLASSES)
transfer_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
transfer_model.summary()

#### Training the Transfer Learning Model

The training process is identical to that of the scratch model: we fit the new model on the same data generators. Because the base is frozen, only the weights of our custom classification head are updated during this training.

In [None]:
logger.info("Starting training for Model #2: Transfer Learning (MobileNetV2)")
history_transfer = transfer_model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // BATCH_SIZE,
    epochs=EPOCHS,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // BATCH_SIZE
)

# Save the trained model
model_save_path = SAVED_MODELS_DIR / "transfer_model.keras"
transfer_model.save(model_save_path)
logger.info(f"Model #2 saved to {model_save_path}")

### Conclusion

We have successfully trained and saved two different models for our hand gesture recognition task. The `history_scratch` and `history_transfer` objects contain the per-epoch training metrics, and the trained models are saved as `.keras` files in the `saved_models/v2_cropped/` directory.
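
As a quick sanity check before the full evaluation, the per-epoch metrics can be read from the `.history` dictionary of each history object. The helper `best_epoch` below is a hypothetical convenience function, and the numbers in `example_history` are placeholders for illustration only, not results from this notebook.

```python
def best_epoch(history_dict, metric="val_accuracy"):
    """Return (1-based epoch, value) at which `metric` is highest."""
    values = history_dict[metric]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# Placeholder numbers for illustration only; not results from this notebook.
example_history = {
    "accuracy": [0.50, 0.70, 0.80, 0.85],
    "val_accuracy": [0.55, 0.72, 0.78, 0.76],
}

epoch, value = best_epoch(example_history)
print(epoch, value)  # 3 0.78
```

With the real objects, the call would be `best_epoch(history_scratch.history)` or `best_epoch(history_transfer.history)`.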

The next and final step is to load these saved models and perform a rigorous evaluation on the unseen test dataset to determine which one performs better.