<a href="https://colab.research.google.com/github/RudyMartin/dsai-2024/blob/main/MVPS/Camp-Rock-Paper-Scissors/team_GG_AI/change_log_21_37_baseline_gestures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# change_log_21_37_baseline_gestures.ipynb

## Additional explanations of changes made to the model and why

1) Reinstated the `relu` activation in the custom dense layer
2) Unfroze the base model layers to boost performance (metrics)

Let's break down the earlier version of the code (before these changes) step by step to understand what was happening and why the changes were made:

### 1. **Loading MobileNetV2 and Adding Custom Layers**:
   ```python
   base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
   ```
   - **MobileNetV2**: The `MobileNetV2` model is loaded with pre-trained weights from the ImageNet dataset. The `include_top=False` parameter excludes the top classification layers, so you're only using the feature extraction part of the model. The `input_shape=(128, 128, 3)` specifies that the input images will have a shape of 128x128 pixels with 3 color channels (RGB).
   
   ```python
   x = base_model.output
   x = GlobalAveragePooling2D()(x)
   x = Dense(128, activation='sigmoid')(x)
   predictions = Dense(5, activation='softmax')(x)
   ```
   - **GlobalAveragePooling2D**: This layer reduces the spatial dimensions of the feature maps output by the base model to a single value per feature map. This reduces the dimensionality and prepares the data for the dense layers.
   - **Dense Layer with Sigmoid Activation**: A dense layer with 128 units and a `sigmoid` activation function is added. The `sigmoid` activation squashes the outputs to a range between 0 and 1.
   - **Output Layer with Softmax Activation**: The final dense layer has 5 units (one for each class) and uses a `softmax` activation function, which is typically used for multi-class classification problems to output one probability per class, with the probabilities summing to 1.
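
To make the two activations concrete, here is a minimal standard-library sketch of what `sigmoid` and `softmax` compute. The `logits` values are made up for illustration and do not come from the model; Keras applies the same math to whole tensors.

```python
import math

def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the max does not change the result but avoids overflow.
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for the 5 classes (illustrative values only)
logits = [2.0, 1.0, 0.1, -1.0, 0.5]
probs = softmax(logits)

print([round(p, 3) for p in probs])  # one probability per class
print(sum(probs))                    # approximately 1.0
print(sigmoid(0.0))                  # 0.5
```

The class with the largest raw score always receives the largest probability, which is why the predicted class is usually taken as the index of the maximum.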

### 2. **Freezing the Base Model**:
   ```python
   for layer in base_model.layers:
       layer.trainable = False
   ```
   - **Freezing Layers**: The layers of the base MobileNetV2 model are frozen, meaning their weights will not be updated during training. This is common when using pre-trained models, as the base layers are already trained to recognize general features.
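
One way to confirm that freezing worked is to count the trainable weight tensors. The sketch below assumes the model is wired with `Model(inputs=base_model.input, outputs=predictions)`, which the walkthrough does not show, and uses `weights=None` only to avoid downloading the ImageNet weights for this check.

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# weights=None skips the ImageNet download; this is only for inspecting
# which weights are trainable, not for real training.
base_model = MobileNetV2(weights=None, include_top=False, input_shape=(128, 128, 3))

x = GlobalAveragePooling2D()(base_model.output)
x = Dense(128, activation='sigmoid')(x)
predictions = Dense(5, activation='softmax')(x)

# Assumed wiring of the full model (not shown in the walkthrough)
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze every layer of the base model
for layer in base_model.layers:
    layer.trainable = False

# Only the two Dense layers remain trainable: a kernel and a bias each
frozen_count = len(model.trainable_weights)
print(frozen_count)
```

`model.summary()` reports the same split as trainable and non-trainable parameter counts.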

### 3. **Compiling and Training the Model**:
   ```python
   model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
   model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=15, batch_size=32)
   ```
   - **Compiling**: The model is compiled using the `Adam` optimizer, `categorical_crossentropy` loss (since it's a multi-class classification problem), and accuracy as the evaluation metric.
   - **Training**: The model is trained for 15 epochs with a batch size of 32, but only the custom layers (added after the base model) are being trained because the base model's layers are frozen.
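
The `categorical_crossentropy` loss expects one-hot encoded labels, so this walkthrough presumably relies on `y_train` and `y_test` being encoded that way (with integer labels, Keras provides `sparse_categorical_crossentropy` instead). The following standard-library sketch shows the computation for a single sample; the helper names and probability values are illustrative, not taken from the notebook.

```python
import math

NUM_CLASSES = 5  # matches the 5-unit softmax output layer

def one_hot(label, num_classes=NUM_CLASSES):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot target and predicted probabilities."""
    # Clamp probabilities away from 0 so log() is always defined.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = one_hot(2)                         # the true class is index 2
confident = [0.01, 0.01, 0.96, 0.01, 0.01]  # hypothetical good prediction
unsure = [0.2, 0.2, 0.2, 0.2, 0.2]          # hypothetical uniform guess

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True
```

Because the target is one-hot, the loss reduces to the negative log of the probability assigned to the true class.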

### 4. **Fine-Tuning (Commented Out)**:
   ```python
   '''
   for layer in base_model.layers:
       layer.trainable = True

   model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

   h = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=4, batch_size=32)
   '''
   ```
   - **Fine-Tuning**: The code to unfreeze the base model's layers and fine-tune the entire model has been commented out. Fine-tuning involves unfreezing the layers of the pre-trained model and continuing to train the entire model, usually with a lower learning rate so that the pre-trained weights are adjusted only slightly to better fit the specific task. Note that the commented-out code recompiles with `optimizer='adam'`, which uses Adam's default learning rate rather than a reduced one.
   - **Commenting Out**: Since this code block is commented out, the model does not perform fine-tuning after the initial training.
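
A fine-tuning step with an explicitly reduced learning rate could look like the sketch below. The value `1e-5` and the name `FINE_TUNE_LR` are assumptions chosen for illustration, as is the `Model(...)` wiring; `weights=None` is used only so the sketch runs without downloading weights, whereas real fine-tuning would start from the ImageNet weights and the already-trained head.

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# weights=None keeps this sketch offline; the notebook uses weights='imagenet'.
base_model = MobileNetV2(weights=None, include_top=False, input_shape=(128, 128, 3))
x = GlobalAveragePooling2D()(base_model.output)
x = Dense(128, activation='relu')(x)
predictions = Dense(5, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)  # assumed wiring

# Unfreeze every base layer so its weights can be updated.
for layer in base_model.layers:
    layer.trainable = True

# Recompile so the trainable change takes effect, with a small learning rate.
FINE_TUNE_LR = 1e-5  # illustrative value, not taken from the notebook
model.compile(optimizer=Adam(learning_rate=FINE_TUNE_LR),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Far more than the 4 weight tensors of the custom head are now trainable.
print(len(model.trainable_weights))

# Training call as in the commented-out block (data comes from the notebook):
# h = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=4, batch_size=32)
```

Keeping the learning rate small limits how far each update can move the pre-trained weights away from their starting point.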

### 5. **Saving the Model**:
   ```python
   model.save(f'{model_dir}/{model_name}.keras')
   ```
   - The model, including its architecture, weights, and optimizer state, is saved to a file for later use.

### Analysis of the Code:
- **Sigmoid Activation in Dense Layer**:
   - Typically, a `relu` (Rectified Linear Unit) activation function is used in the custom dense layers added on top of a pre-trained feature extractor. `relu` lets the network learn non-linear relationships, and because its gradient is 1 for positive inputs it is less prone to vanishing gradients than `sigmoid`.
   - Using `sigmoid` can be suboptimal in this context, as it squashes the output to a range between 0 and 1, which might not be ideal before the final `softmax` layer. This could potentially slow down the learning process or limit the ability of the model to learn complex features.

- **Not Unfreezing the Layers**:
   - Not unfreezing the base model layers means that the model is not fine-tuning the pre-trained weights. While the custom layers can still learn, the model might not perform as well as it could with fine-tuning because it’s not adapting the pre-trained features to the specific dataset.
   - Fine-tuning often leads to better performance, especially when the pre-trained model's features need slight adjustments for the specific task, although it trains many more weights and can overfit a small dataset.
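
The point about `sigmoid` slowing learning can be illustrated by comparing the derivatives of the two activations. This is a small standard-library sketch with arbitrarily chosen inputs, not a measurement from the model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of sigmoid: s(z) * (1 - s(z)), never larger than 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of relu: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

# Print input, sigmoid gradient, and relu gradient side by side
for z in [0.0, 2.0, 5.0, 10.0]:
    print(z, round(sigmoid_grad(z), 5), relu_grad(z))
```

The sigmoid gradient peaks at 0.25 and shrinks toward zero as the input grows, while the `relu` gradient stays at 1 for any positive input, so error signals passing through a `relu` layer are not scaled down in the same way.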

### Conclusion:
- **Sigmoid Activation**: The use of `sigmoid` in this context is unusual and may not be optimal. Replacing it with `relu` is generally recommended.
- **Unfreezing Layers**: Fine-tuning the model by unfreezing the base model layers is a standard practice to improve performance, especially after training the added layers. You should consider uncommenting that section to allow the model to better adapt to your specific data.