# Training 

## Clean the XML


Run the xmlconversion.py script with the command below. It creates a new directory named "cleaned" inside the "images" directory, copies every XML file into it, and cleans each copy by removing empty spaces. The cleaned XML files are then copied back to the "images" directory, replacing the original files.

In [1]:
!python xmlconversion.py --verbose

images/pica_pica978.xml
images/periparus_ater_337.xml
images/ErithacusRubecula0060.xml
images/periparus_ater_3.xml
images/periparus_ater_469.xml
images/periparus_ater_157.xml
images/pica_pica545.xml
images/pica_pica897.xml
images/ErithacusRubecula1062.xml
images/periparus_ater_941.xml
images/ErithacusRubecula0099.xml
images/periparus_ater_402.xml
images/pica_pica231.xml
images/ErithacusRubecula0219.xml
images/pica_pica229.xml
images/pica_pica772.xml
images/pica_pica221.xml
images/ErithacusRubecula0290.xml
images/ErithacusRubecula0621.xml
images/periparus_ater_41.xml
images/periparus_ater_99.xml
images/periparus_ater_72.xml
images/ErithacusRubecula0759.xml
images/pica_pica918.xml
images/ErithacusRubecula1044.xml
images/periparus_ater_854.xml
images/periparus_ater_78.xml
images/pica_pica38.xml
images/periparus_ater_967.xml
images/ErithacusRubecula0719.xml
images/pica_pica154.xml
images/periparus_ater_220.xml
images/periparus_ater_809.xml
images/periparus_ater_577.xml
images/ErithacusRube
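
The script itself is not reproduced here, so the following is only a minimal sketch of the workflow described above. It rests on two assumptions: that "removing empty spaces" means stripping leading and trailing whitespace from element text, and that the files are ordinary XML that the standard library can parse. The function names clean_xml_file and clean_directory are hypothetical and are not taken from xmlconversion.py.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def clean_xml_file(src: Path, dst: Path) -> None:
    """Write a copy of src to dst with whitespace stripped around element text."""
    tree = ET.parse(str(src))
    for elem in tree.iter():
        # Assumption: "cleaning" means trimming whitespace around text nodes.
        if elem.text is not None:
            elem.text = elem.text.strip()
        if elem.tail is not None:
            elem.tail = elem.tail.strip()
    tree.write(str(dst), encoding="utf-8")


def clean_directory(image_dir: Path, verbose: bool = False) -> int:
    """Clean every XML file in image_dir via an intermediate 'cleaned' folder."""
    cleaned_dir = image_dir / "cleaned"
    cleaned_dir.mkdir(exist_ok=True)
    count = 0
    # glob is not recursive, so files already in 'cleaned' are not picked up.
    for xml_path in sorted(image_dir.glob("*.xml")):
        if verbose:
            print(xml_path)
        cleaned_path = cleaned_dir / xml_path.name
        clean_xml_file(xml_path, cleaned_path)
        # Copy the cleaned file back, replacing the original annotation.
        shutil.copy(cleaned_path, xml_path)
        count += 1
    return count
```

Calling clean_directory(Path("images"), verbose=True) would print each processed path, much like the output above.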

## Partition the train / test to 90/10

Split the dataset into two pieces: a training set and a test set. About 90 percent of the images are drawn by random sampling without replacement and placed in the training set; the remaining 10 percent go into the test set. The command below performs this partition (90% train, 10% test).

In [2]:
!python partition_dataset.py -x -i ./images -r 0.1
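
To make "random sampling without replacement" concrete, here is a small sketch of a 90/10 split over a list of file names. It is not the code of partition_dataset.py: the function name split_train_test, the seed argument, and the rounding rule for the test-set size are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple


def split_train_test(
    items: Sequence[str], test_ratio: float = 0.1, seed: Optional[int] = None
) -> Tuple[List[str], List[str]]:
    """Randomly split items into (train, test) without replacement."""
    if not 0.0 <= test_ratio <= 1.0:
        raise ValueError("test_ratio must be between 0 and 1")
    rng = random.Random(seed)
    # Assumption: the test-set size is rounded to the nearest integer.
    n_test = round(len(items) * test_ratio)
    # random.sample draws without replacement, so no image lands in both sets.
    test_indices = set(rng.sample(range(len(items)), n_test))
    train = [item for i, item in enumerate(items) if i not in test_indices]
    test = [item for i, item in enumerate(items) if i in test_indices]
    return train, test
```

Passing a fixed seed makes the split reproducible, which helps when two models are to be compared on the same test images.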

## Create the TF Record


The TensorFlow Object Detection API requires the input data to be in TFRecord format, a simple format for storing a sequence of binary records. Here we create one TFRecord for the training set and one for the test set. Data in the TFRecord format can take up less space than the original data, and TensorFlow can read it with parallel I/O operations, which makes data loading more efficient.

### Update the .PBTXT file

A .pbtxt file is a simple text file that maps labels to integer values. The TensorFlow Object Detection API requires this file for both training and detection. The command below opens the .pbtxt file so that it can be modified as needed.

The .pbtxt file contains the labels of our species: Periparus ater, Erithacus rubecula and Pica pica.

In [3]:
!code './data/label_map.pbtxt'
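
For orientation, the sketch below generates label-map text for three classes. The class name strings are assumptions (the actual labels used in the annotations are not shown here), and build_label_map is a hypothetical helper, not part of the Object Detection API. IDs start at 1 because the API reserves 0 for the background class.

```python
from typing import Iterable


def build_label_map(class_names: Iterable[str]) -> str:
    """Return label-map text with one item per class, IDs starting at 1."""
    items = []
    # ID 0 is reserved for the background class, so numbering starts at 1.
    for class_id, name in enumerate(class_names, start=1):
        items.append(f"item {{\n  id: {class_id}\n  name: '{name}'\n}}\n")
    return "\n".join(items)


# Hypothetical class names; they presumably need to match the labels
# stored in the XML annotations exactly.
label_map_text = build_label_map(
    ["periparus_ater", "erithacus_rubecula", "pica_pica"]
)
print(label_map_text)
```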

### Create the TF Record (Train)

In [3]:
!python generate_tfrecord.py -x images/train -l data/label_map.pbtxt -o data/train.record

Successfully created the TFRecord file: data/train.record


### Create the TF Record (Test)

In [4]:
!python generate_tfrecord.py -x images/test -l data/label_map.pbtxt -o data/test.record

Successfully created the TFRecord file: data/test.record
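
Before any record can be written, each XML annotation has to be turned into class labels and box coordinates. The sketch below illustrates that step only. It assumes the annotations follow the Pascal VOC layout (size/width, size/height and object/bndbox elements), which is an assumption because the annotation files are not shown here. The function name parse_voc_annotation is hypothetical, and serializing the result into a TFRecord is left out. Coordinates are normalized to the range 0 to 1 by the image size, which is the form the Object Detection API expects for boxes in its records.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Union

Box = Dict[str, Union[str, float]]


def parse_voc_annotation(xml_text: str) -> List[Box]:
    """Extract labels and normalized boxes from one Pascal VOC style annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes: List[Box] = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        boxes.append(
            {
                "label": obj.findtext("name"),
                # Divide pixel coordinates by the image size to normalize them.
                "xmin": int(bndbox.findtext("xmin")) / width,
                "ymin": int(bndbox.findtext("ymin")) / height,
                "xmax": int(bndbox.findtext("xmax")) / width,
                "ymax": int(bndbox.findtext("ymax")) / height,
            }
        )
    return boxes
```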


### 1) Faster R-CNN Model

Set the model path

We have to specify the model path. Here I choose the Faster R-CNN model.

In [5]:
PATH_TO_MODEL = "faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8"

### Configure the config file

The TensorFlow Object Detection API uses a configuration file to define the hyperparameters for the object detection task, and its parameters can be customized for the project. The command at the end of this section opens the configuration file in a code editor so that it can be modified.

The model configuration for training includes several important parameters:

num_classes: This parameter determines the number of classes the model will be trained on. In this case, it is set to 3, since I am training the model on 3 classes of birds.

batch_size: It represents the number of samples the network takes for training at a time. Since my images are of high resolution and Faster R-CNN requires more computational power, I set the batch_size to 1. However, if there is sufficient memory available, this value can be increased for faster training.

num_steps: This parameter defines the number of steps the model will be trained for. Initially, I set it to 24000 steps and monitored the loss during training using TensorBoard. Based on the loss analysis, we can determine whether the model needs to be trained for more steps.

fine_tune_checkpoint: It specifies the path of the checkpoint file of the pre-trained model that will be used for fine-tuning. This file contains the weights and parameters of the pre-trained model.

fine_tune_checkpoint_type: This parameter should be set to "detection", as I am performing object detection.

label_map_path: It denotes the path of the label_map.pbtxt file that contains the mapping between class labels and their corresponding numerical IDs.

input_path: This parameter represents the path of the training TFRecord file, which contains the preprocessed and formatted training data in the TFRecord format.

With these parameters configured appropriately, the Faster R-CNN model can be trained on the bird species dataset and tuned for accurate object detection and classification.



In [7]:
!code './training/TF2/training/'{PATH_TO_MODEL}'/pipeline.config'
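
As an alternative to editing the file by hand, simple fields can be updated with a text substitution. The sketch below is a plain regular-expression approach, not the Object Detection API's own configuration utilities; set_config_value is a hypothetical helper, and the sample text in the usage lines is an illustrative fragment rather than the real pipeline.config. Note that it replaces every occurrence of a key, so fields that appear in both the training and evaluation sections (batch_size, input_path, label_map_path) would all be changed.

```python
import re


def set_config_value(config_text: str, key: str, value: str) -> str:
    """Replace the value of every 'key: value' pair in the config text.

    value must already be formatted as it should appear in the file,
    for example '3' for a number or '"detection"' for a quoted string.
    """
    # Match the key, the colon, and either a quoted string or a bare token.
    pattern = rf'(\b{re.escape(key)}\s*:\s*)("[^"]*"|\S+)'
    new_text, replaced = re.subn(
        pattern, lambda match: match.group(1) + value, config_text
    )
    if replaced == 0:
        raise KeyError(f"{key} not found in config text")
    return new_text


# Illustrative fragment only; the real file contains many more fields.
sample_config = """train_config {
  batch_size: 64
  fine_tune_checkpoint_type: "classification"
}
"""
updated = set_config_value(sample_config, "batch_size", "1")
updated = set_config_value(updated, "fine_tune_checkpoint_type", '"detection"')
print(updated)
```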

### Hyperparameters

The command below launches the training process with the following parameters:

model_dir: Specifies the directory where the checkpoints and training progress will be saved.

pipeline_config_path: Indicates the path to the model configuration file that defines the architecture and hyperparameters of the model.

num_train_steps: Specifies the total number of steps to train the model. This value determines the duration and extent of the training process.

By executing this command, the TensorFlow Object Detection API will start training the model based on the provided configuration, saving the checkpoints and tracking the training progress in the specified directory.

In [6]:
!python model_main_tf2.py --model_dir=training/TF2/training/{PATH_TO_MODEL} --pipeline_config_path=training/TF2/training/{PATH_TO_MODEL}/pipeline.config --num_train_steps=5000 --alsologtostderr


2023-06-30 00:12:18.372399: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-06-30 00:12:18.372413: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-06-30 00:12:19.762637: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-06-30 00:12:19.796004: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-30 00:12:19.796466: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1
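
Outside a notebook, the {PATH_TO_MODEL} substitution is not available, so the same command can be assembled in Python. The sketch below only builds the argument list from the flags described above; build_training_command is a hypothetical helper, and it assumes pipeline.config sits inside the model directory, as it does in the command used here.

```python
import shlex
from typing import List


def build_training_command(model_dir: str, num_train_steps: int) -> List[str]:
    """Assemble the model_main_tf2.py call as an argument list."""
    return [
        "python",
        "model_main_tf2.py",
        f"--model_dir={model_dir}",
        # Assumption: the config file is stored inside the model directory.
        f"--pipeline_config_path={model_dir}/pipeline.config",
        f"--num_train_steps={num_train_steps}",
        "--alsologtostderr",
    ]


PATH_TO_MODEL = "faster_rcnn_resnet101_v1_1024x1024_coco17_tpu-8"
command = build_training_command(f"training/TF2/training/{PATH_TO_MODEL}", 5000)
print(shlex.join(command))
# To launch training from a script: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a directory name ever contains spaces.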

## Exporting a Trained Inference Graph

Once the model has been trained, the trained inference graph needs to be exported. The command at the end of this section does this, using the following parameters:

input_type: Specifies the type of input for the inference graph, which should be set as image_tensor for this case.

pipeline_config_path: Refers to the path of the model configuration file used during the training process.

trained_checkpoint_dir: Points to the directory where the checkpoints were saved during the training process.

output_directory: Indicates the directory where the resulting model inference graph should be saved.

By executing this command, the trained inference graph will be extracted and saved in the specified output directory. This graph can then be used for inference and object detection tasks on new images.


In [9]:
!python exporter_main_v2.py --input_type image_tensor --pipeline_config_path ./training/TF2/training/{PATH_TO_MODEL}/pipeline.config --trained_checkpoint_dir ./training/TF2/training/{PATH_TO_MODEL}/ --output_directory ./training/TF2/training/{PATH_TO_MODEL}/saved_model/


2023-06-29 22:44:40.341135: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-06-29 22:44:40.341150: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-06-29 22:44:41.410296: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-06-29 22:44:41.436098: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-29 22:44:41.436561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1
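
After the export finishes, it can be worth checking that the expected files exist. The sketch below assumes the exporter's usual layout, in which the output directory receives a pipeline.config file, a checkpoint folder and a saved_model folder containing saved_model.pb. Under that assumption, because the output directory used here is itself named saved_model, the model file would end up under saved_model/saved_model/. The helper name find_missing_export_files is hypothetical.

```python
from pathlib import Path
from typing import List


def find_missing_export_files(output_directory: str) -> List[str]:
    """Return the expected export paths that do not exist (relative to the root)."""
    root = Path(output_directory)
    # Assumed layout written by the exporter into its output directory.
    expected = [
        root / "pipeline.config",
        root / "checkpoint",
        root / "saved_model" / "saved_model.pb",
    ]
    return [path.relative_to(root).as_posix() for path in expected if not path.exists()]
```

An empty list means all three entries were found; any names returned point to an incomplete or misplaced export.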

## SSD ResNet50

Set the model path


In [7]:
PATH_TO_MODEL = "ssd_resnet50_v1_fpn_640x640_coco17_tpu-8"

### Hyperparameters and Configuration of the config file

To modify the training parameters for the SSD model, the following changes will be made:

Batch Size: The batch size will be increased from 1 to 2. Since SSD models are computationally less demanding, this system's 24 GB of GPU memory can accommodate a larger batch size, which can potentially improve training efficiency.

Fine-Tune Checkpoint: The path to the fine-tune checkpoint will be updated to the pretrained weights specific to the SSD model being used. This will ensure that the model starts from a pre-trained state and benefits from transfer learning.

Number of Steps: The training will start with 10,000 steps. During the training process, the loss will be monitored using TensorBoard to assess the model's progress. Based on the loss analysis, further training steps can be determined if necessary.

By incorporating these changes into the training setup, the SSD model can be optimized for improved performance and accuracy.
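
Because each training step processes one batch, batch size and step count together determine how many images the model sees. The short calculation below compares the two setups described in this document (batch size 1 for 24000 steps, and batch size 2 for 10000 steps). The training-set size used for the epoch estimate is a hypothetical placeholder, since the actual number of training images is not stated here.

```python
def images_processed(batch_size: int, num_steps: int) -> int:
    """Total number of images fed to the model over the whole training run."""
    return batch_size * num_steps


def approx_epochs(batch_size: int, num_steps: int, num_train_images: int) -> float:
    """Approximate number of passes over a training set of the given size."""
    return images_processed(batch_size, num_steps) / num_train_images


faster_rcnn_images = images_processed(batch_size=1, num_steps=24000)
ssd_images = images_processed(batch_size=2, num_steps=10000)
print(faster_rcnn_images, ssd_images)

# Hypothetical training-set size, for illustration only.
NUM_TRAIN_IMAGES = 1000
print(approx_epochs(2, 10000, NUM_TRAIN_IMAGES))
```

Seen this way, doubling the batch size while cutting the step count changes the total amount of data processed, which is worth keeping in mind when comparing loss curves between the two models.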

In [11]:
!code './training/TF2/training/'{PATH_TO_MODEL}'/pipeline.config'

In [8]:
!python model_main_tf2.py --model_dir=training/{PATH_TO_MODEL} --pipeline_config_path=training/{PATH_TO_MODEL}/pipeline.config --num_train_steps=5000 --alsologtostderr


2023-06-30 07:57:30.847384: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-11.2/lib64
2023-06-30 07:57:30.847398: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2023-06-30 07:57:32.812321: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2023-06-30 07:57:32.844524: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2023-06-30 07:57:32.844986: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3090 computeCapability: 8.6
coreClock: 1