# (Project Write-up) TensorFlow Object Detection API and AWS SageMaker

# Model 1 - EfficientDet D1 640x640 (Starter Code)


### Model Performance
The model was evaluated using various metrics:
- **Overall mAP**: `0.080`.
- **DetectionBoxes Precision (mAP)** at different thresholds, with scores ranging from `0.03` to `0.2`.
- **DetectionBoxes Recall** across multiple recall metrics (e.g., AR@1, AR@10, AR@100).
- The overall mAP values indicate that the model struggles to accurately detect objects, especially small ones.
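
To make the AR@1 / AR@10 / AR@100 metrics more concrete, here is a minimal sketch of recall restricted to the top-K detections. It is a simplification: the COCO-style metrics reported by the API also average over IoU thresholds, classes, and images. The function name `recall_at_k` and the sample detections are hypothetical.

```python
def recall_at_k(detections, num_ground_truth, k):
    """Fraction of ground-truth boxes recovered by the k highest-scoring detections.

    detections: list of (score, is_true_positive) pairs for one image.
    """
    if num_ground_truth == 0:
        return 0.0
    # Keep only the k most confident detections.
    top_k = sorted(detections, key=lambda d: d[0], reverse=True)[:k]
    hits = sum(1 for _, is_tp in top_k if is_tp)
    return hits / num_ground_truth


# Hypothetical detections for one image with 4 ground-truth boxes.
detections = [(0.9, True), (0.8, False), (0.7, True), (0.4, True), (0.2, False)]
print(recall_at_k(detections, 4, 1))    # only the single best detection counts
print(recall_at_k(detections, 4, 10))   # all five detections count
```

Allowing more detections per image can only keep recall equal or raise it, which is why AR@100 is never lower than AR@1.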

### Training vs. Validation Loss
- The **Training Loss** decreased steadily, indicating effective learning from the training data.
- However, the **Validation Loss** began to increase towards the end, suggesting overfitting as the model starts to memorize training patterns rather than generalize.
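
The point where validation loss turns upward can be located programmatically. The sketch below is one simple way to do it, assuming the evaluation losses have been exported from TensorBoard as plain lists; the helper name, the `patience` parameter, and the loss values are all hypothetical and not taken from the actual run.

```python
def first_overfitting_step(steps, val_losses, patience=2):
    """Return the step with the best validation loss once the loss has
    failed to improve for `patience` consecutive evaluations, else None."""
    best_idx = 0
    for i in range(1, len(val_losses)):
        if val_losses[i] < val_losses[best_idx]:
            best_idx = i
        elif i - best_idx >= patience:
            return steps[best_idx]
    return None


# Hypothetical evaluation points: loss improves, then rises again.
steps = [500, 1000, 1500, 2000, 2500]
val_losses = [1.20, 0.95, 0.90, 0.97, 1.05]
print(first_overfitting_step(steps, val_losses))
```

The returned step is a reasonable candidate checkpoint to keep, since later checkpoints fit the training data more closely but generalize worse.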

### Observed Behavior and Improvements
This behavior is expected when models overfit or when the dataset does not adequately cover the problem domain. To improve performance:
1. **Data Augmentation** to increase dataset diversity.
2. **Regularization** (e.g., dropout, weight decay) to combat overfitting.
3. **Hyperparameter Tuning** to find an optimal setup.
4. Consider **reducing or increasing model complexity**, depending on whether overfitting or underfitting is observed.

### Best Model Suggestion
At this stage, no model performs optimally. Refining hyperparameters, using transfer learning, or increasing data quantity could improve results.

### TensorBoard
![1](train_eval/m1/imgs/1.png)
![2](train_eval/m1/imgs/2.png)
![3](train_eval/m1/imgs/3.png)
![4](train_eval/m1/imgs/4.png)



# Model 2 - EfficientDet D1 640x640 (Improving optimizer and Data Augmentation)

### Model Performance and Optimization Changes
The following hyperparameters were changed relative to the starter configuration:
- **Learning Rate**: Increased the base learning rate to 0.1 for potentially faster convergence.
- **Total Steps**: Set to 350,000 to allow extended training with gradual decay, aiming for better generalization.
- **Warmup Parameters**: Reduced `warmup_learning_rate` to 0.0005 and increased `warmup_steps` to 5000 for smoother initial training.
- **Momentum**: Raised `momentum_optimizer_value` to 0.95 for enhanced stability in training.
- **Moving Average**: Enabled moving average to smooth model convergence further.
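
The warmup and decay settings above interact, so it can help to see the resulting learning rate at a few steps. The sketch below assumes a cosine decay schedule with linear warmup, which the write-up does not state explicitly; the function itself is a hypothetical re-implementation for illustration, not the API's own code.

```python
import math


def cosine_decay_with_warmup(step, learning_rate_base=0.1, total_steps=350_000,
                             warmup_learning_rate=0.0005, warmup_steps=5000):
    """Learning rate at `step`, assuming linear warmup then cosine decay."""
    if step < warmup_steps:
        # Linear ramp from warmup_learning_rate up to learning_rate_base.
        slope = (learning_rate_base - warmup_learning_rate) / warmup_steps
        return warmup_learning_rate + slope * step
    # Cosine decay from learning_rate_base down to zero at total_steps.
    progress = min((step - warmup_steps) / (total_steps - warmup_steps), 1.0)
    return 0.5 * learning_rate_base * (1.0 + math.cos(math.pi * progress))


for step in (0, 2500, 5000, 177_500, 350_000):
    print(step, round(cosine_decay_with_warmup(step), 5))
```

Under this assumption the rate starts at the warmup value, peaks at the base rate when warmup ends, and then decays gradually over the remaining steps.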

### Results
The optimized model showed the following performance metrics:
- **mAP (mean Average Precision)** across categories:
  - mAP@0.5IOU: `~0.25`
  - mAP@0.75IOU: `~0.08`
  - Precision for small, medium, and large objects ranged from `0.03` to `0.35`.
- **Recall** improved for larger objects but remained lower for smaller detections.
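
The gap between mAP@0.5IOU and mAP@0.75IOU comes from how strictly a predicted box must overlap the ground truth. As a minimal illustration, the sketch below computes intersection over union (IoU) for two boxes in `(ymin, xmin, ymax, xmax)` form; the box coordinates are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    inter_h = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_w = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_h * inter_w
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted 2 px to the right.
ground_truth = (0, 0, 10, 10)
prediction = (0, 2, 10, 12)
overlap = iou(ground_truth, prediction)
print(overlap >= 0.5)    # counts as a match at the 0.5 threshold
print(overlap >= 0.75)   # does not count at the stricter 0.75 threshold
```

A detection like this one contributes to mAP@0.5IOU but not to mAP@0.75IOU, so loosely localized boxes pull the stricter metric down.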

### Training vs. Validation Loss
- **Training Loss** continued to decrease, showcasing learning progress.
- **Validation Loss** showed a more stable trend post-optimization, with a smaller gap compared to training loss, suggesting reduced overfitting.
  
### Expected Behavior
This reduction in the gap between training and validation loss aligns with expectations for smoother, more generalized training. With the added steps, momentum, and moving average, the model has stabilized, as seen in smoother loss curves.
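
The smoothing effect of a moving average can be illustrated with a plain exponential moving average. This is a sketch of the general technique only: the decay value and the input series are hypothetical, and in training the average is applied to model weights rather than to a list of numbers.

```python
def ema_update(shadow, value, decay):
    """One exponential-moving-average step: keep `decay` of the old
    average and blend in (1 - decay) of the new value."""
    return decay * shadow + (1.0 - decay) * value


def smooth(values, decay=0.9):
    """Apply the moving average across a whole series."""
    shadow = values[0]
    smoothed = []
    for v in values:
        shadow = ema_update(shadow, v, decay)
        smoothed.append(shadow)
    return smoothed


# Hypothetical noisy series.
noisy = [1.0, 0.4, 1.1, 0.5, 0.9]
print(smooth(noisy))
```

The smoothed series varies far less than the input, which is the same mechanism that damps step-to-step fluctuations when the average is kept over weights.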

### Suggestions for Further Improvement
1. **Data Augmentation**: Further increase data diversity for smaller object detection.
2. **Experiment with Dropout**: Add dropout layers to reduce overfitting.
3. **Increase Batch Size**: A larger batch size could stabilize gradient estimates, which is especially helpful with the increased momentum.
4. **Hyperparameter Tuning**: Further fine-tune learning rates, decay schedules, and optimizer parameters to balance performance for all object sizes.

### Best Model Recommendation
The optimized model is currently the best version. However, incorporating additional data or using a pre-trained model could yield even better results, especially for smaller objects.

### TensorBoard
![1](train_eval/m2/imgs/1.png)
![2](train_eval/m2/imgs/2.png)
![3](train_eval/m2/imgs/3.png)
![4](train_eval/m2/imgs/4.png)
![5](train_eval/m2/imgs/5.png)



# Model 3 - SSD ResNet50 V1 FPN 640x640 (RetinaNet50)

### Model Performance
The third model (Model 3) was trained and evaluated, resulting in the following performance metrics:
- **mAP (mean Average Precision)**:
  - mAP overall: `0.0837`
  - mAP for large objects: `0.2447`
  - mAP for medium objects: `0.278`
  - mAP for small objects: `0.038`
  - mAP@0.5IOU: `0.1797`
  - mAP@0.75IOU: `0.0706`
  
- **Recall** values indicated improvement in detecting larger objects with AR@100 reaching `0.2827` for large objects, while smaller objects maintained lower recall scores.
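
The small / medium / large split follows the COCO convention of bucketing objects by pixel area, with boundaries at 32x32 and 96x96 pixels. The sketch below shows that bucketing; the helper name and the handling of values exactly on a boundary are assumptions.

```python
def coco_size_bucket(box):
    """Classify a (ymin, xmin, ymax, xmax) box in pixels by its area."""
    ymin, xmin, ymax, xmax = box
    area = (ymax - ymin) * (xmax - xmin)
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# Hypothetical boxes on a 640x640 input.
print(coco_size_bucket((0, 0, 20, 20)))      # 400 px^2
print(coco_size_bucket((0, 0, 50, 60)))      # 3000 px^2
print(coco_size_bucket((0, 0, 200, 150)))    # 30000 px^2
```

On a 640x640 input a "small" object covers less than 0.25% of the image, which helps explain why small-object scores lag behind the other two buckets.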

### Training vs. Validation Loss
- **Training Loss**: Continued to decrease smoothly, indicating steady learning.
- **Validation Loss**: Showed stability, though a slight increase was observed at certain points, suggesting minor overfitting but with a manageable gap relative to training loss.

### Observed Behavior
Model 3 scores far higher on medium (`0.278`) and large (`0.2447`) objects than on small ones (`0.038`), and recall follows the same pattern. The validation loss trend reflects a model that could still benefit from further regularization and fine-tuning, especially for small-object detection.

### Recommendations for Improvement
1. **Further Regularization**: Employ dropout or weight decay to reduce overfitting.
2. **Augment Data for Small Objects**: Introduce more small-object variations in the training dataset to improve detection performance.
3. **Parameter Tuning**: Adjust learning rates or use a more gradual learning rate decay to explore more stable convergence.

### Conclusion
Model 3 shows improvement in detection of medium and large objects, but its performance on small objects remains low. Further fine-tuning and targeted data augmentation could help improve its generalization.

### TensorBoard
![1](train_eval/m3/imgs/1.png)
![2](train_eval/m3/imgs/2.png)
![3](train_eval/m3/imgs/3.png)


# Model 4 - Faster R-CNN ResNet152 V1 640x640

### Model Performance
The fourth model (Model 4) achieved the following results across various metrics:
- **mAP (mean Average Precision)**:
  - Overall mAP: `0.1144`
  - Large objects: `0.6913`
  - Medium objects: `0.4021`
  - Small objects: `0.0491`
  - mAP@0.5IOU: `0.2303`
  - mAP@0.75IOU: `0.0977`

- **Recall** values:
  - The model performed well on large and medium objects, with AR@100 for large objects reaching `0.7050` and medium objects `0.4836`.
  - Small objects had lower recall, with AR@100 for small objects at `0.0832`.

### Training vs. Validation Loss
- **Classification Loss** and **Localization Loss** showed decreasing trends, indicating the model is learning from the data.
- **Total Loss** stabilized over time but plateaued at a relatively high value, which could suggest that the model struggles with complex classes or smaller objects.

### Observed Behavior
The model's higher performance on large and medium objects compared to small ones aligns with expectations, given the generally lower mAP and recall for small object detection. The learning rate progression shows a smooth increase and plateau, contributing to stable convergence in losses over time. The object detection visualizations demonstrate accurate bounding boxes and labels for large and medium-sized objects, with some successful detections for smaller objects.

### Recommendations for Improvement
1. **Data Augmentation for Small Objects**: Increase small-object variety in the training data to boost detection performance.
2. **Hyperparameter Tuning**: Experiment with lower learning rates or slower decay to improve small-object recognition.
3. **Regularization Techniques**: Add dropout or weight decay to prevent overfitting on large and medium objects.

### Conclusion
Model 4 displays robust performance for large and medium object detection, while small object performance remains an area for improvement. The model could benefit from additional data augmentation and fine-tuning, particularly to enhance recall and precision for smaller objects.

### TensorBoard
![1](train_eval/m4/imgs/1.png)
![2](train_eval/m4/imgs/2.png)
![3](train_eval/m4/imgs/3.png)
![4](train_eval/m4/imgs/4.png)
![5](train_eval/m4/imgs/5.png)


## Model Comparison Summary

Below is a comparison of the four models based on mean Average Precision (mAP), recall, and other relevant metrics to identify the best-performing model.

| Metric                      | Model 1    | Model 2   | Model 3   | Model 4    |
|-----------------------------|------------|-----------|-----------|------------|
| **Overall mAP**             | 0.080      | 0.084     | 0.0837    | **0.1144** |
| **mAP for Large Objects**   | 0.25       | -         | 0.2447    | **0.6913** |
| **mAP for Medium Objects**  | -          | -         | 0.278     | **0.4021** |
| **mAP for Small Objects**   | -          | -         | 0.038     | **0.0491** |
| **mAP@0.5 IOU**             | 0.18       | **~0.25** | 0.1797    | 0.2303     |
| **mAP@0.75 IOU**            | 0.07       | ~0.08     | 0.0706    | **0.0977** |
| **AR@1**                    | -          | -         | 0.0215    | **0.0279** |
| **AR@10**                   | -          | -         | 0.0881    | **0.1189** |
| **AR@100 (Large)**          | 0.2827     | -         | 0.2827    | **0.7050** |
| **AR@100 (Medium)**         | 0.3466     | -         | -         | **0.4836** |
| **AR@100 (Small)**          | **0.0875** | -         | -         | 0.0832     |

A dash (-) marks a metric that was not reported for that model. Model 2 values are taken from its Results section above.

### Summary of Findings
- **Model 4** outperforms Models 1, 2, and 3 across nearly all metrics, with the highest overall mAP (0.1144) and strong performance in precision and recall for large and medium objects. The exceptions are mAP@0.5 IOU, where Model 2 leads (~0.25), and AR@100 for small objects, where Model 1 is marginally higher (0.0875 vs. 0.0832).
- **Model 1** has the lowest overall mAP (0.080).
- **Model 2** performs slightly better than Model 3 on the metrics reported for both (overall mAP, mAP@0.5 IOU, mAP@0.75 IOU) but falls short of Model 4 on overall mAP and mAP@0.75 IOU.

### Conclusion
**Model 4** is the best-performing model based on mAP and recall scores, especially for larger and medium-sized objects. Future improvements could focus on augmenting data for small objects and applying further regularization techniques to enhance performance in small-object detection.
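
The ranking by overall mAP can be reproduced with a few lines of Python. The values are the overall mAP figures from the comparison table; the variable names are illustrative only.

```python
# Overall mAP per model, copied from the comparison table.
overall_map = {
    "Model 1": 0.080,
    "Model 2": 0.084,
    "Model 3": 0.0837,
    "Model 4": 0.1144,
}

# Sort model names from best to worst overall mAP.
ranking = sorted(overall_map, key=overall_map.get, reverse=True)
best = ranking[0]

# Relative improvement of the best model over the starter model.
relative_gain = (overall_map[best] - overall_map["Model 1"]) / overall_map["Model 1"]

print(ranking)
print(f"{best} improves overall mAP by {relative_gain:.0%} over Model 1")
```

Overall mAP is only one criterion; the per-size rows in the table are what show that the gain comes mostly from large and medium objects.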


# Augmentation inside pipeline.config (used in Models 2, 3 and 4)

```
data_augmentation_options {
  random_scale_crop_and_pad_to_square {
    output_size: 640
    scale_min: 0.10000000149011612
    scale_max: 2.0
  }
}
data_augmentation_options {
  random_crop_image {
    min_object_covered: 0.0
    min_aspect_ratio: 0.75
    max_aspect_ratio: 3.0
    min_area: 0.75
    max_area: 1.0
    overlap_thresh: 0.0
  }
}
data_augmentation_options {
  random_adjust_brightness {
    max_delta: 0.2
  }
}
data_augmentation_options {
  random_adjust_contrast {
    min_delta: 0.8
    max_delta: 1.25
  }
}
```
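
To give a feel for what the brightness and contrast options do, here is a simplified pure-Python sketch. It assumes a single-channel image flattened to a list of values in `[0, 1]`, that brightness adds a random offset, and that contrast scales values around the mean; the real preprocessing runs on image tensors inside the API, so treat this as an approximation. The `rng` parameter is a hypothetical addition for reproducibility.

```python
import random


def random_adjust_brightness(pixels, max_delta=0.2, rng=random):
    """Add one random offset in [-max_delta, max_delta] to every pixel."""
    delta = rng.uniform(-max_delta, max_delta)
    return [min(1.0, max(0.0, p + delta)) for p in pixels]


def random_adjust_contrast(pixels, min_delta=0.8, max_delta=1.25, rng=random):
    """Scale each pixel's distance from the mean by one random factor."""
    factor = rng.uniform(min_delta, max_delta)
    mean = sum(pixels) / len(pixels)
    return [min(1.0, max(0.0, (p - mean) * factor + mean)) for p in pixels]


# Hypothetical 4-pixel image, seeded so the run is repeatable.
rng = random.Random(0)
image = [0.1, 0.4, 0.6, 0.9]
print(random_adjust_brightness(image, rng=rng))
print(random_adjust_contrast(image, rng=rng))
```

Neither transform moves or resizes objects, so the bounding boxes stay valid; only the two crop options in the config need to adjust box coordinates.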