# 7144COMP/CW2: Bird Multiple Object Detection Using Faster R-CNN and SSD
## PART IV: Model evaluation and deployment

### Overview

In this notebook, I evaluate my models in TensorBoard and use the logged metrics to judge convergence. Both the validation loss and the average precision at Intersection over Union (IoU) thresholds of 0.5 and 0.75 are considered.

The number of epochs to train the model is set to 1; the reason for this choice was explained in the training notebook. In addition, during the 1st epoch of training, the smoothed loss (smoothing weight of 0.8) had already settled around its final value.



For the current task, the following steps have been undertaken: 

- Launch TensorBoard displaying both the train and evaluation metrics for the given session. 
- Provide justification for the number of epochs used for training my object detection model.

### Next

In the next notebook, which is an extension of the present one, I will:

- Freeze my trained model in the correct format for inference
- Develop a Jupyter Notebook to perform inference on the frozen model using unseen test images
- Discuss my results.

### Prerequisites
This notebook runs locally on the environment *tf-gpu*.
- Environment Setup (see Part 0)
- Preprocessing (see Part 1)
- Training (see Part 2)
- Run the necessary evaluation scripts (see Part 3)

## 1. Import the necessary packages

In [1]:
import os

In [2]:
# Current directory
current_dir = os.getcwd()

# Faster R-CNN
# Model training directory and config pipeline
model_dir = os.path.join(current_dir, 'training')
pipeline_config_path = 'fasterrcnn_config.config'

# SSD
# Model training directory and config pipeline
model_dir_ssd = os.path.join(current_dir, 'training_ssd')
pipeline_config_path_ssd = 'ssd_config.config'

## 2. TensorBoard 
### 2.1. Monitor region proposal losses, evaluation metrics
Here ```logdir``` points to the training directory. When the next cell is launched, TensorBoard loads the loss graphs for the region proposal network from the ```training/train``` folder and the evaluation metrics for the given session from the ```training/eval``` folder.
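Before launching TensorBoard, it can help to confirm that the log directory actually contains event files. The following is a minimal sketch using only the standard library; the helper name `find_event_files` is hypothetical, and it assumes the usual `events.out.tfevents.` file name prefix written by TensorFlow.

```python
import os


def find_event_files(logdir):
    """Return sorted paths (relative to logdir) of TensorBoard event files."""
    found = []
    for root, _dirs, files in os.walk(logdir):
        for name in files:
            # Assumption: summary files follow the "events.out.tfevents.<...>" naming
            if name.startswith("events.out.tfevents."):
                found.append(os.path.relpath(os.path.join(root, name), logdir))
    return sorted(found)


# Lists event files under ./training (an empty list if the folder does not exist)
print(find_event_files(os.path.join(os.getcwd(), "training")))
```

An empty list for ```training/train``` or ```training/eval``` would explain a TensorBoard dashboard with no curves.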

The losses for the Region Proposal Network:

- ```Loss/RPNLoss/localization_loss```: Localization Loss or the Loss of the Bounding Box regressor for the RPN

- ```Loss/RPNLoss/objectness_loss```: Loss of the Classifier that classifies if a bounding box is an object of interest or background

The losses for the Final Classifier:

- ```Loss/BoxClassifierLoss/classification_loss```: Loss for the classification of detected objects into the various classes (here, the bird species)

- ```Loss/BoxClassifierLoss/localization_loss```: Localization Loss or the Loss of the Bounding Box regressor



### Display the train and evaluation metrics for the given session 


#### Faster RCNN

In [None]:
!tensorboard --logdir $current_dir'/training/'

2023-01-09 06:39:31.064229: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-09 06:39:32.780153: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-01-09 06:39:32.780253: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-01-09 06:39:34.912356: E tensorflow/compiler/xla/stream_executor/cuda/c

#### SSD

In [None]:
!tensorboard --logdir $current_dir'/training_ssd/'

2023-01-11 23:27:46.112863: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-01-11 23:27:46.771068: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-01-11 23:27:46.771116: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-01-11 23:27:47.556493: E tensorflow/compiler/xla/stream_executor/cuda/c

## 3. Discussion

### Training vs Validation losses

- At the last step of training, the total training loss was ```train_total_loss = 0.09```, whereas the total validation loss was ```val_total_loss = 0.3```.
- Both losses decrease towards lower error levels, which is a good sign that the model is learning.

- The validation loss is higher than the training loss because the model is evaluated on data it has not seen during training.
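The loss curves are read with a smoothing weight of 0.8. As an illustration of what that smoothing does, here is a minimal sketch of a debiased exponential moving average, which is my understanding of how TensorBoard's smoothing slider behaves. The function name `smooth` and the raw loss values are hypothetical.

```python
def smooth(values, weight=0.8):
    """Debiased exponential moving average of a scalar series (0 <= weight < 1)."""
    smoothed = []
    last = 0.0
    for i, value in enumerate(values, start=1):
        last = last * weight + (1.0 - weight) * value
        # Divide by (1 - weight**i) so that early points are not pulled towards 0
        smoothed.append(last / (1.0 - weight ** i))
    return smoothed


# Hypothetical raw loss values, for illustration only
raw_losses = [1.0, 0.6, 0.5, 0.3, 0.35, 0.3]
print(smooth(raw_losses, weight=0.8))
```

A higher weight gives a flatter curve that reacts more slowly to recent steps, so convergence should be judged on the smoothed curve together with the raw one.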

<img src="https://ayoubb.com/wp-content/uploads/2023/01/Flowchart-4.jpg" />

### Precision Metrics


#### Faster R-CNN

<table>
	<thead>
		<tr>
			<th>Metric</th>
			<th>Value</th>
		</tr>
	</thead>
	<tbody>
		<tr>
			<td>Average Precision (AP) @ [IoU=0.50:0.95, area=all]</td>
			<td>0.559</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @ [IoU=0.50, area=all]</td>
			<td>0.851</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @ [IoU=0.75, area=all]</td>
			<td>0.657</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @ [IoU=0.50:0.95, area=small]</td>
			<td>0.125</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @ [IoU=0.50:0.95, area=medium]</td>
			<td>0.402</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @ [IoU=0.50:0.95, area=large]</td>
			<td>0.643</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @ [IoU=0.50:0.95, area=all, maxDets=1]</td>
			<td>0.624</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @ [IoU=0.50:0.95, area=all, maxDets=10]</td>
			<td>0.665</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @ [IoU=0.50:0.95, area=all, maxDets=100]</td>
			<td>0.670</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @ [IoU=0.50:0.95, area=small, maxDets=100]</td>
			<td>0.270</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @ [IoU=0.50:0.95, area=medium, maxDets=100]</td>
			<td>0.561</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @ [IoU=0.50:0.95, area=large, maxDets=100]</td>
			<td>0.732</td>
		</tr>
	</tbody>
</table>



- The ```mAP``` at an ```IoU``` threshold of ```0.50``` is ```0.851```, which is relatively high. This suggests that the model detects a high fraction of objects correctly with few false positives.

- The ```mAP``` at an ```IoU``` threshold of ```0.75``` is ```0.657```, which is lower than the mAP at an IoU threshold of 0.50. This drop is expected: with a stricter overlap threshold, fewer predicted boxes overlap the ground truth closely enough to count as true positives, so the gap reflects how precisely the boxes are localised.

- The average recall ```AR``` averaged over IoU thresholds ```0.50:0.95``` (with up to 100 detections per image) is ```0.670```, which is relatively high. This suggests that the model is detecting a high fraction of actual objects.
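To make the role of the IoU threshold concrete, the sketch below computes the IoU of two boxes and checks it against the 0.50 and 0.75 thresholds. The box coordinates are hypothetical, and boxes are assumed to be given as `(xmin, ymin, xmax, ymax)`.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the intersection rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at 0 when the boxes do not overlap
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical ground-truth and predicted boxes
ground_truth = (0, 0, 10, 10)
prediction = (2, 0, 12, 10)

score = iou(ground_truth, prediction)  # intersection 80, union 120
for threshold in (0.50, 0.75):
    print(f"IoU={score:.3f} counts as a match at {threshold}: {score >= threshold}")
```

In this example the same prediction is a true positive at an IoU threshold of 0.50 but not at 0.75, which is why the AP at 0.75 can only be equal to or lower than the AP at 0.50.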



### Per-Class Precision Metrics (AP@0.5 IoU)



<table>
	<thead>
		<tr>
			<th>Class</th>
			<th>AP@0.5IOU</th>
		</tr>
	</thead>
	<tbody>
		<tr>
			<td>Erithacus Rubecula</td>
			<td>0.001856</td>
		</tr>
		<tr>
			<td>Periparus ater</td>
			<td>0.843017</td>
		</tr>
		<tr>
			<td>Pica pica</td>
			<td>0.885294</td>
		</tr>
		<tr>
			<td>Turdus merula</td>
			<td>0.829829</td>
		</tr>
	</tbody>
</table>


- The classes ```Pica_pica``` and ```Periparus_ater``` have the highest average precision, followed by ```Turdus_merula```. ```Erithacus_Rubecula``` has by far the lowest average precision (about ```0.002```), which means the model almost never detects this class correctly.

- It is worth investigating why ```Erithacus_Rubecula``` scores close to zero and considering ways to improve the model's performance for that class.
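The per-class values from the table can be ranked and screened with a few lines of code. This is only a sketch: the dictionary copies the values reported above, while the `AP_FLOOR` cut-off is a hypothetical value chosen for illustration.

```python
# AP@0.5 IoU per class, copied from the table above
per_class_ap = {
    "Erithacus Rubecula": 0.001856,
    "Periparus ater": 0.843017,
    "Pica pica": 0.885294,
    "Turdus merula": 0.829829,
}

# Rank classes from best to worst
ranked = sorted(per_class_ap.items(), key=lambda item: item[1], reverse=True)
for name, ap in ranked:
    print(f"{name:<20} {ap:.3f}")

# Hypothetical cut-off below which a class is flagged for investigation
AP_FLOOR = 0.5
weak_classes = [name for name, ap in per_class_ap.items() if ap < AP_FLOOR]
print("Classes to investigate:", weak_classes)
```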

#### SSD

<table>
	<thead>
		<tr>
			<th>Metric</th>
			<th>Value</th>
		</tr>
	</thead>
	<tbody>
		<tr>
			<td>Average Precision (AP) @[ IoU=0.50:0.95 ]</td>
			<td>0.001</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @[ IoU=0.50 ]</td>
			<td>0.005</td>
		</tr>
		<tr>
			<td>Average Precision (AP) @[ IoU=0.75 ]</td>
			<td>0.000</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @[ IoU=0.50:0.95, maxDets=1 ]</td>
			<td>0.028</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @[ IoU=0.50:0.95, maxDets=10 ]</td>
			<td>0.047</td>
		</tr>
		<tr>
			<td>Average Recall (AR) @[ IoU=0.50:0.95, maxDets=100 ]</td>
			<td>0.076</td>
		</tr>
	</tbody>
</table>


The values in the table are very low: the APs range from 0.000 to 0.005 and the ARs from 0.028 to 0.076. This indicates that the SSD model is performing poorly and has a very low detection rate.



### Justification for the number of epochs used for training the object detection model

```num_epochs```: A number of epochs of ```7``` means that the model makes seven passes through the entire training dataset. Because training is configured in steps rather than epochs, I used the following formula to calculate ```num_epochs```:

$$
\text{num_epochs} = \frac{\text{training_steps} \times \text{batch_size}}{\text{training_set_size}}
$$
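As a worked example of the formula, the sketch below converts a step count into epochs. The 21,000 steps are the value reported in this notebook; the batch size and training set size are not stated here, so the values used are hypothetical and were chosen only so that the result matches the 7 epochs reported.

```python
def num_epochs(training_steps, batch_size, training_set_size):
    """Number of passes over the training set implied by a step count."""
    return training_steps * batch_size / training_set_size


TRAINING_STEPS = 21000    # step count reported in this notebook
BATCH_SIZE = 1            # hypothetical value, for illustration only
TRAINING_SET_SIZE = 3000  # hypothetical value, for illustration only

print(num_epochs(TRAINING_STEPS, BATCH_SIZE, TRAINING_SET_SIZE))
```

With the real batch size and dataset size substituted, the same function gives the step count at which each epoch ends.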


Taking into consideration limitations such as time and hardware (memory, number of GPUs available), the following approach was applied to find the optimal ```num_epochs```:

- Incremental training: ```num_epochs``` was increased gradually from 1 to 10, with the model automatically resuming training from the last checkpoint each time. This makes it possible to monitor the model's performance on the validation set (```validation_loss```) continuously as it trains and to stop when that performance stops improving. This can help to prevent overfitting and ensures that the model is not trained for more epochs than necessary.

- I found that at step ```21000```, which is the 7th epoch, the validation loss and training loss curves stopped decreasing, which is a sign that the model stopped learning from the training dataset at that step.
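The stopping rule described above can be sketched as a simple patience check on the validation loss. This is an illustration under stated assumptions, not the procedure actually run in the notebook: the function name `first_plateau`, the `patience` and `min_delta` parameters, and the loss values are all hypothetical.

```python
def first_plateau(val_losses, patience=2, min_delta=0.0):
    """Return the 1-based index of the best evaluation once `patience`
    consecutive later evaluations fail to improve on it, else None."""
    best = float("inf")
    best_index = None
    waited = 0
    for index, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # New best validation loss: reset the patience counter
            best = loss
            best_index = index
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return best_index
    return None


# Hypothetical validation loss after each of 10 epochs, for illustration only
val_loss_per_epoch = [0.90, 0.70, 0.55, 0.45, 0.38, 0.33, 0.30, 0.31, 0.30, 0.32]
print(first_plateau(val_loss_per_epoch, patience=2))
```

With these made-up values the check points to epoch 7, after which two further evaluations bring no improvement.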

Increasing the number of epochs (```num_epochs```) during training can help improve the performance of an object detection model in a few different ways:

- **Increased model convergence**: The model sees the training data more times, which can help it learn more effectively and converge on a better solution.

- **Improved generalisation**: Up to a point, a model that has been trained for more epochs may be better able to generalise to new, unseen data. This is because the model has had more opportunities to learn the underlying patterns in the data and, when random augmentation is used, has seen more varied versions of each example.

- **More time for optimisation**: Training for more epochs gives the model more time to adjust its weights and biases through the optimization process. This can lead to improved model performance, especially if the learning rate is set appropriately.

Nevertheless, increasing the number of epochs also increases the training time and can lead to overfitting if the model is trained for too many epochs. Overfitting occurs when a model becomes *too specialized to the training data and performs poorly on new, unseen data*.

Note: It was not possible to increase the batch size due to memory limitations. 

#### **Comparison with Roboflow Cloud-based experiment (using the same model and dataset)**

- Using ```num_epochs=40``` and ```batch_size=50```, with the same augmentation steps and the same hyperparameters, the model achieved ```90.1% mAP```, ```88.4% precision``` and ```82.8% recall```, with an average class ```precision``` of ```90%``` on the validation dataset.

- This comparison suggests that increasing the number of epochs and the batch size could lead to better precision, and that a larger batch size could shorten the training time per epoch where memory allows.

<img src="https://storage.googleapis.com/roboflow-platform-cache/RjBpFWbVLQdI2NaOrqg24Eooatr2/qYHiTyjFVuJ6MWIK56Sh/4/results.png" width="800" />

### Next

- Freeze the trained model in the correct format for inference
- Develop a Jupyter Notebook to perform inference on the frozen model using unseen test images
- Discuss my results.