---

# Object detection training with TensorFlow 2.4.1
This notebook trains a model for object detection.
It can be run as a Jupyter notebook in the Google Colab environment, or exported as a Python file and run from the command line.

The software automatically detects whether it is running in a Colab environment or on your local machine.

A local machine only requires Python >= 3.7 to be installed.

The required libraries are installed and the data needed by the training algorithm is prepared automatically.
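
The environment check mentioned above can be sketched as follows. This is a minimal illustration, not necessarily how the installed tool does it: the `'google.colab' in sys.modules` test is the one used later in this notebook, while the helper names `running_in_colab` and `check_python_version` are hypothetical.

```python
import sys

def running_in_colab(modules=None):
    """Return True when the google.colab package has been loaded."""
    modules = sys.modules if modules is None else modules
    return 'google.colab' in modules

def check_python_version(version_info=None, minimum=(3, 7)):
    """Return True when the interpreter satisfies the minimum version."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:2]) >= minimum

if __name__ == '__main__':
    print('Colab environment' if running_in_colab() else 'Local machine')
    if not check_python_version():
        print('Python >= 3.7 is required')
```
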
## Training preparation:
*   Collect a set of images containing the objects that you want to detect.
*   Split the set into two folders: one for training and one for evaluation. The number of evaluation images can be from 10% to 30% of the number of training images.
*   Label the images with a standard image annotation tool such as [labelImg](https://github.com/tzutalin/labelImg), [VoTT](https://github.com/microsoft/VoTT), etc., and save an XML file for each picture in the Pascal VOC format.
*   Copy the folders with the prepared image sets to your GDrive (if you are working in a Colab environment).
*   Configure the training parameters listed in the next notebook cell.
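
The split step in the list above could be automated along these lines. This is only a sketch under stated assumptions: the function `split_dataset`, the list of image extensions and the default ratio are hypothetical and are not part of this notebook. It copies each image, together with a same-named `.xml` annotation if one exists, so that the number of evaluation images is the requested fraction of the number of training images.

```python
import random
import shutil
from pathlib import Path

# Assumed set of image extensions, adjust to your data
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp'}

def split_dataset(source_dir, train_dir, eval_dir, eval_ratio=0.2, seed=0):
    """Copy images (and their .xml annotations, if present) into train/eval folders.

    eval_ratio is the number of evaluation images relative to the number of
    training images, e.g. 0.2 means 2 evaluation images every 10 training ones.
    Returns the tuple (number of train images, number of eval images).
    """
    source_dir, train_dir, eval_dir = Path(source_dir), Path(train_dir), Path(eval_dir)
    images = sorted(p for p in source_dir.iterdir() if p.suffix.lower() in IMAGE_EXTENSIONS)
    # A fixed seed keeps the split reproducible
    random.Random(seed).shuffle(images)
    # n_eval / n_train == eval_ratio  =>  n_eval == n * ratio / (1 + ratio)
    n_eval = round(len(images) * eval_ratio / (1 + eval_ratio))
    for index, image in enumerate(images):
        target = eval_dir if index < n_eval else train_dir
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(image, target / image.name)
        annotation = image.with_suffix('.xml')
        if annotation.exists():
            shutil.copy2(annotation, target / annotation.name)
    return len(images) - n_eval, n_eval
```

For example, `split_dataset('all-images', 'images/train', 'images/eval', eval_ratio=0.2)` would fill the two default directories used in the configuration cell; `all-images` is a placeholder name.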

## Training:
Run the process and wait for the training to complete.

You can also stop the training and restart it later: unless you cleaned the model output directory, the training resumes from the last checkpoint and continues fine-tuning the model.
The training progress can be followed in TensorBoard (already included in this notebook).
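
To see which checkpoint a restarted run would most likely resume from, the output directory can be inspected with a small helper like the one below. This is a sketch that assumes the checkpoints are stored as `ckpt-N.index` files, the default naming of `tf.train.CheckpointManager`; the helper name `latest_checkpoint_number` is hypothetical and the tool itself may locate checkpoints differently.

```python
import re
from pathlib import Path

def latest_checkpoint_number(model_dir):
    """Return the highest N among 'ckpt-N.index' files, or None if there are none."""
    pattern = re.compile(r'^ckpt-(\d+)\.index$')
    numbers = []
    model_path = Path(model_dir)
    if model_path.is_dir():
        for entry in model_path.iterdir():
            match = pattern.match(entry.name)
            if match:
                numbers.append(int(match.group(1)))
    return max(numbers) if numbers else None
```

A result of `None` suggests that the directory holds no checkpoint yet, so the training would start from the base model.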

### Training in the Colab environment:
The notebook needs to mount your GDrive. It will ask you to authorize the access; follow the instructions.

---





In [None]:
#@title #Notebook configuration
#begin-module: default_cfg.py
class Cfg(object):
    #@markdown ## Data on Google Drive:
    #@markdown (If enabled, the data is read from and written to Google Drive)
    data_on_drive = True #@param {type:"boolean"}
    #@markdown ---
    #@markdown ## Base model:
    #@markdown (The base model from which the training will start)
    model_type = 'SSD MobileNet v2 320x320' #@param ['CenterNet HourGlass104 512x512', 'CenterNet HourGlass104 1024x1024', 'CenterNet Resnet50 V1 FPN 512x512', 'CenterNet Resnet101 V1 FPN 512x512', 'CenterNet Resnet50 V2 512x512', 'CenterNet MobileNetV2 FPN 512x512', 'EfficientDet D0 512x512', 'EfficientDet D1 640x640', 'EfficientDet D2 768x768', 'EfficientDet D3 896x896', 'EfficientDet D4 1024x1024', 'EfficientDet D5 1280x1280', 'EfficientDet D6 1280x1280', 'EfficientDet D7 1536x1536', 'SSD MobileNet v2 320x320', 'SSD MobileNet V1 FPN 640x640', 'SSD MobileNet V2 FPNLite 320x320', 'SSD MobileNet V2 FPNLite 640x640', 'SSD ResNet50 V1 FPN 640x640 (RetinaNet50)', 'SSD ResNet50 V1 FPN 1024x1024 (RetinaNet50)', 'SSD ResNet101 V1 FPN 640x640 (RetinaNet101)', 'SSD ResNet101 V1 FPN 1024x1024 (RetinaNet101)', 'SSD ResNet152 V1 FPN 640x640 (RetinaNet152)', 'SSD ResNet152 V1 FPN 1024x1024 (RetinaNet152)', 'Faster R-CNN ResNet50 V1 640x640', 'Faster R-CNN ResNet50 V1 1024x1024', 'Faster R-CNN ResNet50 V1 800x1333', 'Faster R-CNN ResNet101 V1 640x640', 'Faster R-CNN ResNet101 V1 1024x1024', 'Faster R-CNN ResNet101 V1 800x1333', 'Faster R-CNN ResNet152 V1 640x640', 'Faster R-CNN ResNet152 V1 1024x1024', 'Faster R-CNN ResNet152 V1 800x1333', 'Faster R-CNN Inception ResNet V2 640x640', 'Faster R-CNN Inception ResNet V2 1024x1024', 'Mask R-CNN Inception ResNet V2 1024x1024']
    #@markdown ---
    #@markdown ## Images directories:
    #@markdown The GDrive directory (Colab execution) or the local directory (machine execution) containing the image sets for training and for evaluation.
    train_images_dir = 'images/train' #@param {type:"string"}
    eval_images_dir = 'images/eval' #@param {type:"string"}
    #@markdown ---
    #@markdown ## Train directory:
    #@markdown The GDrive directory (Colab execution) or the local directory (machine execution) where the checkpoints will be saved.
    trained_model_dir = 'trained-model' #@param {type:"string"}
    #@markdown ---
    #@markdown ## Export directory:
    #@markdown The GDrive directory (Colab execution) or the local directory (machine execution) where the exported model will be saved.
    exported_model_dir = 'exported-model' #@param {type:"string"}
    #@markdown ---
    #@markdown ## Export ONNX:
    #@markdown The name of the exported ONNX model. It will be created in the exported_model_dir
    exported_onnx = 'saved_model.onnx' #@param {type:"string"}
    #@markdown ---
    #@markdown ## Export frozen:
    #@markdown The name of the exported frozen graph. It will be created in the exported_model_dir
    exported_frozen_graph = 'frozen_graph.pb' #@param {type:"string"}
    #@markdown ---
    #@markdown ## Maximum training steps:
    #@markdown The maximum number of training steps. If < 0, it is limited by the base model configuration.
    max_train_steps = -1 #@param {type:"integer"}
    #@markdown ---
    #@markdown ## Batch size:
    #@markdown The size of the batch. If < 1, the value contained in the model pipeline configuration is used.
    batch_size = 16 #@param {type:"integer"}
    #@markdown ---
    # TensorFlow version
    tensorflow_version = 'tensorflow==2.4.1' # or for example tf-nightly==2.5.0.dev20210315
    # SHA1 for the checkout of the TensorFlow object detection api
    od_api_git_sha1 = 'e356598a5b79a768942168b10d9c1acaa923bdb4'
    # SHA1 for the checkout of the ONNX conversion tool
    tf2onnx_git_sha1 = '596f23741b1b5476e720089ed0dfd5dbcc5a44d0'
#end-module
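
The fallback rules stated in the comments for `max_train_steps` and `batch_size` can be illustrated with a short sketch. The helper `resolve_training_params` and the pipeline values passed to it are hypothetical; how the tool actually reads the base model pipeline configuration is not shown in this notebook.

```python
def resolve_training_params(max_train_steps, batch_size,
                            pipeline_steps, pipeline_batch_size):
    """Apply the fallback rules described in the configuration comments.

    pipeline_steps and pipeline_batch_size stand for the values found in the
    base model pipeline configuration (hypothetical inputs for this sketch).
    """
    # A negative max_train_steps means "use the limit of the base model"
    steps = pipeline_steps if max_train_steps < 0 else max_train_steps
    # A batch size below 1 means "use the value of the pipeline configuration"
    batch = pipeline_batch_size if batch_size < 1 else batch_size
    return steps, batch

# Hypothetical pipeline values, for illustration only
print(resolve_training_params(-1, 16, pipeline_steps=50000, pipeline_batch_size=128))
```

With the defaults of the configuration cell (`max_train_steps = -1`, `batch_size = 16`), the number of steps would come from the base model and the batch size from the notebook.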


In [None]:
#@title #Mount Google Drive
#@markdown Mount Google Drive (if enabled in the configuration).
#begin-module: mount_google_drive.py
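import os
import sys

# Cfg is imported from default_cfg when that module is available;
# in the notebook it is already defined by the configuration cell.
try:
    from default_cfg import Cfg
except ImportError:
    pass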

def mount_google_drive():
    if (not os.path.exists('/mnt/MyDrive')):
        print('Mounting the GDrive')
        from google.colab import drive
        drive.mount('/mnt')
    else:
        print('GDrive already mounted')

if __name__ == '__main__':
    if (Cfg.data_on_drive and 'google.colab' in sys.modules):
        mount_google_drive()
#end-module: mount_google_drive.py
#@markdown ---


In [None]:
#@title #Setup { form-width: "10%" }
#@markdown Installation of the object detection API
# Clone the object detection model builder API.
import os

# Kill any running processes and disconnect symbolic links
if (os.path.exists('/mnt/MyDrive')):
    if (os.path.exists('/content/eval-images')):
        os.unlink('/content/eval-images')
    if (os.path.exists('/content/train-images')):
        os.unlink('/content/train-images')
    if (os.path.exists('/content/trained-model')):
        os.unlink('/content/trained-model')
    if (os.path.exists('/content/eval.log')):
        os.unlink('/content/eval.log')
    if (os.path.exists('/content/train.log')):
        os.unlink('/content/train.log')
    if (os.path.exists('/content/train.pid')):
        # Kill the background training process recorded in the PID file
        with open('/content/train.pid', 'r') as f:
            lines = f.readlines()
            os.system(f'kill -9 {lines[0].strip()}')
        os.unlink('/content/train.pid')
    if (os.path.exists('/content/eval.pid')):
        # Kill the background evaluation process recorded in the PID file
        with open('/content/eval.pid', 'r') as f:
            lines = f.readlines()
            os.system(f'kill -9 {lines[0].strip()}')
        os.unlink('/content/eval.pid')

# Clone the repository and install environment
#program_dir = '/mnt/MyDrive/ODModelBuilderTF' if (os.path.exists('/mnt/MyDrive')) else '/usr/src/ODModelBuilderTF'
program_dir = '/usr/src/ODModelBuilderTF'
if (not os.path.isdir(program_dir)):
    !git clone --recursive https://github.com/darth-vader-lg/ODModelBuilderTF.git {program_dir}
!cd {program_dir} && python install_virtual_environment.py
#@markdown ---

In [None]:
#@title #Train { form-width: "10%" }
#@markdown Train the model with the configured parameters.
# Train
model_type = Cfg.model_type
train_images_dir = Cfg.train_images_dir
eval_images_dir = Cfg.eval_images_dir
model_dir = Cfg.trained_model_dir
max_train_steps = Cfg.max_train_steps
batch_size = Cfg.batch_size
print ("Starting the training.")
# Run the training in the background, sending stdout and stderr to train.log and saving the PID
!nohup bash -c "cd {program_dir} && python3 main.py --eval_on_train_data --model_type='{model_type}' --train_images_dir={train_images_dir} --eval_images_dir={eval_images_dir} --model_dir={model_dir} --max_train_steps={max_train_steps} --batch_size={batch_size}" > train.log 2>&1 & echo $! > train.pid
#@markdown ---
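
The command line above interpolates the configured values directly, which can break if a path contains spaces or shell metacharacters. One possible way to guard against that is to quote each argument with `shlex.quote`, as in the following sketch. The flags are the ones used in the cell above; the helper `build_train_command` is hypothetical.

```python
import shlex

def build_train_command(program_dir, model_type, train_images_dir,
                        eval_images_dir, model_dir, max_train_steps, batch_size):
    """Compose the training command line, quoting every value for the shell."""
    args = [
        'python3', 'main.py', '--eval_on_train_data',
        f'--model_type={model_type}',
        f'--train_images_dir={train_images_dir}',
        f'--eval_images_dir={eval_images_dir}',
        f'--model_dir={model_dir}',
        f'--max_train_steps={max_train_steps}',
        f'--batch_size={batch_size}',
    ]
    # shlex.quote leaves safe strings untouched and wraps the others in single quotes
    command = ' '.join(shlex.quote(arg) for arg in args)
    return f'cd {shlex.quote(program_dir)} && {command}'

print(build_train_command('/usr/src/ODModelBuilderTF', 'SSD MobileNet v2 320x320',
                          'images/train', 'images/eval', 'trained-model', -1, 16))
```
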


In [None]:
#@title #Evaluation { form-width: "10%" }
#@markdown Evaluate the model with the configured parameters.
# Evaluate
import os
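# Wait until the training has created the checkpoint directory (on GDrive when enabled)
checkpoint_dir = os.path.join('/mnt/MyDrive', Cfg.trained_model_dir) if Cfg.data_on_drive else Cfg.trained_model_dir
if (not os.path.isdir(checkpoint_dir)):
    print ("Waiting for the training to start...")
    import time
    while (not os.path.isdir(checkpoint_dir)):
        time.sleep(1)
    print ("Training started.")
# Pass the directory to main.py as it is written in the configuration
checkpoint_dir = Cfg.trained_model_dir
# Run the evaluation in the background, sending stdout and stderr to eval.log and saving the PID
!nohup bash -c "cd '{program_dir}' && python3 main.py --max_train_steps=0 --checkpoint_dir={checkpoint_dir}" > eval.log 2>&1 & echo $! > eval.pid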
#@markdown ---
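
The wait loop in the cell above never gives up if the training fails before creating its directory. A variant with a timeout could look like this sketch; the helper `wait_for_directory` and its default values are hypothetical.

```python
import os
import time

def wait_for_directory(path, timeout=600.0, poll_interval=1.0):
    """Poll until path is a directory; return False if timeout seconds elapse first."""
    deadline = time.monotonic() + timeout
    while not os.path.isdir(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True
```

When the function returns `False`, checking `train.log` for errors is probably more useful than starting the evaluation.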


In [None]:
#@title #Tensorboard { form-width: "10%" }
#@markdown Display the tensorboard.
# Tensorboard (optional)
trained_model_dir = os.path.join('/mnt/MyDrive', Cfg.trained_model_dir) if Cfg.data_on_drive else Cfg.trained_model_dir
%load_ext tensorboard
%tensorboard --logdir {trained_model_dir}

In [None]:
#@title #Progress { form-width: "10%" }
#@markdown Show the progress of the training.
#@markdown This cell can be interrupted by the user, for example to export the model, and restarted afterwards.<br/>
#@markdown Interrupting it never stops the training process, which continues in the background.
# Progress display
import os
import time
train_lines_count = 0
eval_lines_count = 0
train_printed = False
while (True):
    if (os.path.isfile('/content/train.log')):
        with open('/content/train.log', 'rt') as f:
            lines = f.read().splitlines()
            n_lines = len(lines)
            if (n_lines > train_lines_count):
                if (not train_printed):
                    print('=' * 80)
                    print('Train')
                    train_printed = True
                for i in range(train_lines_count, n_lines):
                    if (lines[i].startswith('INFO:tensorflow:')):
                        print(lines[i].replace('INFO:tensorflow:', ''))
                train_lines_count = n_lines    
    if (os.path.isfile('/content/eval.log')):
        with open('/content/eval.log', 'rt') as f:
            lines = f.read().splitlines()
            n_lines = len(lines)
            if (n_lines > eval_lines_count):
                if (train_printed):
                    print('=' * 80)
                    print('Eval')
                    train_printed = False
                for i in range(eval_lines_count, n_lines):
                    if (lines[i].startswith('INFO:tensorflow:')):
                        print(lines[i].replace('INFO:tensorflow:', ''))
                eval_lines_count = n_lines    
    time.sleep(5)
#@markdown ---
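
The loop above reads each log file from the beginning at every iteration, which gets slower as the logs grow. An alternative, sketched here under the same `INFO:tensorflow:` prefix convention, is to remember the byte offset already processed and read only what was appended. The helper `read_new_log_lines` is hypothetical.

```python
import os

PREFIX = 'INFO:tensorflow:'

def read_new_log_lines(path, offset=0):
    """Return (messages, new_offset) for the complete lines appended after offset."""
    if not os.path.isfile(path):
        return [], offset
    with open(path, 'rb') as f:
        f.seek(offset)
        data = f.read()
    # Consume only up to the last newline, so a partially written line is read later
    end = data.rfind(b'\n') + 1
    lines = data[:end].decode('utf-8', errors='replace').splitlines()
    messages = [line[len(PREFIX):] for line in lines if line.startswith(PREFIX)]
    return messages, offset + end
```

In a polling loop, the returned offset is passed back on the next call, for example `messages, train_offset = read_new_log_lines('/content/train.log', train_offset)`.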


In [None]:
#@title #Export { form-width: "10%" }
#@markdown Export the model with the configured parameters.
# Export
trained_checkpoint_dir = Cfg.trained_model_dir
exported_model_dir = Cfg.exported_model_dir
exported_onnx = Cfg.exported_onnx
exported_frozen_graph = Cfg.exported_frozen_graph
!cd {program_dir} && python main.py --num_train_steps=0 --trained_checkpoint_dir={trained_checkpoint_dir} --output_directory={exported_model_dir} --onnx={exported_onnx} --frozen_graph={exported_frozen_graph}
#@markdown ---
