# Train the model

This section trains an object detection model to detect the target classes (`Cola`, `IceTea`, `Pepsi`).

Steps to create and evaluate the model:
1. Setup  

    1.1. Setup Paths  
    1.2. Download the TFOD utils  
    1.3. Get Pretrained Models  
    1.4. Create Label Map  
    1.5. Create TensorFlow records  
    1.6. Update config for Transfer Learning
    
    
2. Train the Model
3. Evaluate the Model
4. Freezing the Graph
5. Zip the model


## 1. Setup

* Define the paths of the directories and files
* Download the TFOD utils and TF Object Detection
* Create the label map
* Create the TF records
* Update the config file for transfer learning

### 1.1. Setup Paths

Setting up constants:
* Model name
* Pretrained Model name
* [Pretrained Model url](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md)
* TODO: change name of script name
* Label map name

In [None]:
# Model Name
CUSTOM_MODEL_NAME = 'my_ssd_mobnet_v5'

# Pretrained model name
PRETRAINED_MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8'
PRETRAINED_MODEL_URL = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz'

# Define labels
labels = ['Cola', 'IceTea', 'Pepsi']

Setting paths: 
* Workspace
* Scripts
* API Model Name
* Annotation
* Image
* Model
* Pretrained model
* Checkpoint
* Output
* Protoc

In [None]:
import os 

# Paths
paths = {
    'WORKSPACE_PATH': os.path.join('Tensorflow', 'workspace'),
    'SCRIPTS_PATH': os.path.join('Tensorflow','scripts'),
    'APIMODEL_PATH': os.path.join('Tensorflow','models'),
    'ANNOTATION_PATH': os.path.join('Tensorflow', 'workspace','annotations'),
    'IMAGE_PATH': os.path.join('Tensorflow', 'workspace','images'),
    'MODEL_PATH': os.path.join('Tensorflow', 'workspace','models'),
    'PRETRAINED_MODEL_PATH': os.path.join('Tensorflow', 'workspace','pre-trained-models'),
    'CHECKPOINT_PATH': os.path.join('Tensorflow', 'workspace','models',CUSTOM_MODEL_NAME), 
    'OUTPUT_PATH': os.path.join('Tensorflow', 'workspace','models',CUSTOM_MODEL_NAME, 'export'),   
    'PROTOC_PATH':os.path.join('Tensorflow','protoc')
 }

# Create all the paths from the paths dictionary
# (makedirs also creates missing parent directories such as 'Tensorflow')
for path in paths.values():
    os.makedirs(path, exist_ok=True)

### 1.2. Download the TFOD utils

Clone the TFOD-utils repo. It provides the scripts used to:

* Generate the label map
* Generate the TF records
* Update the config file


In [None]:
# Define files for the scripts & labelmap
files = {
    'PIPELINE_CONFIG':os.path.join('Tensorflow', 'workspace','models', CUSTOM_MODEL_NAME, 'pipeline.config'),
    'TF_RECORD_SCRIPT': os.path.join(paths['SCRIPTS_PATH'], 'generate_tfrecord.py'),
    'LABELMAP_SCRIPT': os.path.join(paths['SCRIPTS_PATH'], 'generate_labelmap.py'), 
    'UPDATE_CONFIG_SCRIPT': os.path.join(paths['SCRIPTS_PATH'], 'update_config_file.py'),
    'LABELMAP': os.path.join(paths['ANNOTATION_PATH'], 'label_map.pbtxt')
}

# Clone repo for utils
if not any(os.scandir(paths['SCRIPTS_PATH'])):
    !git clone https://github.com/JPCLima/TFOD-utils {paths['SCRIPTS_PATH']}   

### 1.3. Get Pretrained Models

* Download the [TensorFlow Model Garden](https://github.com/tensorflow/models) from GitHub
* Install TensorFlow Object Detection

In [None]:
# Get wget to download files
if os.name=='nt':
    !pip install wget
    import wget

# Download models and save them on the APIMODEL_PATH 
if not os.path.exists(os.path.join(paths['APIMODEL_PATH'], 'research', 'object_detection')):
    !git clone https://github.com/tensorflow/models {paths['APIMODEL_PATH']}

Install **TensorFlow Object Detection**

In [None]:
# Install Tensorflow Object Detection 
if os.name=='posix':  
    !apt-get install protobuf-compiler
    !cd Tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=. && cp object_detection/packages/tf2/setup.py . && python -m pip install . 
    
if os.name=='nt':
    url="https://github.com/protocolbuffers/protobuf/releases/download/v3.15.6/protoc-3.15.6-win64.zip"
    wget.download(url)
    !move protoc-3.15.6-win64.zip {paths['PROTOC_PATH']}
    !cd {paths['PROTOC_PATH']} && tar -xf protoc-3.15.6-win64.zip
    os.environ['PATH'] += os.pathsep + os.path.abspath(os.path.join(paths['PROTOC_PATH'], 'bin'))   
    !cd Tensorflow/models/research && protoc object_detection/protos/*.proto --python_out=. && copy object_detection\\packages\\tf2\\setup.py setup.py && python setup.py build && python setup.py install
    !cd Tensorflow/models/research/slim && pip install -e . 

Run the verification script to check that all the dependencies are correctly installed:

In [None]:
VERIFICATION_SCRIPT = os.path.join(paths['APIMODEL_PATH'], 'research', 'object_detection', 'builders', 'model_builder_tf2_test.py')
# Verify Installation
!python {VERIFICATION_SCRIPT}

Install extra dependencies:

In [None]:
!pip install tensorflow --upgrade
!pip install PyYAML
!pip install pytz
!pip install tensorflow-gpu
!pip install Pillow

Verify that `object_detection` can be imported:

In [None]:
import object_detection

Download and import **Pretrained model**

In [None]:
if os.name =='posix':
    !wget {PRETRAINED_MODEL_URL}
    !mv {PRETRAINED_MODEL_NAME+'.tar.gz'} {paths['PRETRAINED_MODEL_PATH']}
    !cd {paths['PRETRAINED_MODEL_PATH']} && tar -zxvf {PRETRAINED_MODEL_NAME+'.tar.gz'}
if os.name == 'nt':
    wget.download(PRETRAINED_MODEL_URL)
    !move {PRETRAINED_MODEL_NAME+'.tar.gz'} {paths['PRETRAINED_MODEL_PATH']}
    !cd {paths['PRETRAINED_MODEL_PATH']} && tar -zxvf {PRETRAINED_MODEL_NAME+'.tar.gz'}
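
The two branches above differ only in the shell tools they call. A platform-independent alternative, sketched here on the assumption that the standard library is acceptable, downloads with `urllib.request.urlretrieve` and unpacks with `tarfile`. `download_and_extract` is a hypothetical helper, and the demo uses a local dummy archive through a `file://` URL so it does not touch the network; in the notebook it would be called with `PRETRAINED_MODEL_URL` and `paths['PRETRAINED_MODEL_PATH']`.

```python
import os
import tarfile
import tempfile
import urllib.request
from pathlib import Path


def download_and_extract(url, dest_dir):
    """Download a .tar.gz from url into dest_dir and unpack it there."""
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, os.path.basename(url))
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path, 'r:gz') as tar:
        tar.extractall(dest_dir)
    return archive_path


with tempfile.TemporaryDirectory() as tmp:
    # Build a dummy archive standing in for the pretrained model download
    src = os.path.join(tmp, 'dummy_model')
    os.makedirs(src)
    Path(src, 'pipeline.config').write_text('placeholder')
    dummy_archive = os.path.join(tmp, 'dummy_model.tar.gz')
    with tarfile.open(dummy_archive, 'w:gz') as tar:
        tar.add(src, arcname='dummy_model')

    # "Download" the dummy archive and check that it was unpacked
    dest = os.path.join(tmp, 'pre-trained-models')
    download_and_extract(Path(dummy_archive).as_uri(), dest)
    unpacked = os.path.exists(os.path.join(dest, 'dummy_model', 'pipeline.config'))

print(unpacked)
```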

### 1.4. Create Label Map

Create the label map:
* The label names must match the labels in the XML annotation files
* Each class must have a unique ID

In [None]:
!python {files['LABELMAP_SCRIPT']} -l Cola IceTea Pepsi
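
For reference, a label map is a plain-text `.pbtxt` file with one `item` entry per class. The sketch below shows roughly what such a file could contain for the three labels. `build_label_map` is a hypothetical helper written for illustration and is not necessarily how `generate_labelmap.py` works. IDs start at 1 because 0 is reserved for the background class.

```python
def build_label_map(label_names):
    """Return label map text with one item per class, IDs starting at 1."""
    entries = []
    for class_id, name in enumerate(label_names, start=1):
        entries.append(
            "item {\n"
            f"  name: '{name}'\n"
            f"  id: {class_id}\n"
            "}\n"
        )
    return "\n".join(entries)


label_map_text = build_label_map(['Cola', 'IceTea', 'Pepsi'])
print(label_map_text)
```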

### 1.5. Create TensorFlow records
Now it's time to convert the annotations into TFRecord format. In this step the .xml annotation files are converted to .record files.

If the images were uploaded as an archive, it is extracted first. The tar command packs a group of files into an archive and unpacks them again; the flags used here are:

* -z : filters the archive through gzip (decompresses it when extracting)
* -x : extracts the archive
* -v : displays verbose information
* -f : reads the archive from the given filename


In [None]:
# Get the archive files
ARCHIVE_FILES = os.path.join(paths['IMAGE_PATH'], 'archive.tar.gz')
if os.path.exists(ARCHIVE_FILES):
  !tar -zxvf {ARCHIVE_FILES}

# Create the TF records
!python {files['TF_RECORD_SCRIPT']} -x {os.path.join(paths['IMAGE_PATH'], 'train')} -l {files['LABELMAP']} -o {os.path.join(paths['ANNOTATION_PATH'], 'train.record')} 
!python {files['TF_RECORD_SCRIPT']} -x {os.path.join(paths['IMAGE_PATH'], 'test')} -l {files['LABELMAP']} -o {os.path.join(paths['ANNOTATION_PATH'], 'test.record')} 
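
Conceptually, the conversion reads each .xml annotation, looks up the class ID in the label map, and stores the box corners normalized by the image size. Below is a minimal sketch of the parsing step using only the standard library. It assumes Pascal VOC style annotations (the layout labelImg produces); `parse_voc_annotation`, `LABEL_IDS` and the sample XML are hypothetical and not taken from `generate_tfrecord.py`.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (assumed format)
SAMPLE_XML = """
<annotation>
  <filename>cola_01.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>Cola</name>
    <bndbox><xmin>64</xmin><ymin>48</ymin><xmax>320</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>
"""

# Class IDs as they would appear in the label map
LABEL_IDS = {'Cola': 1, 'IceTea': 2, 'Pepsi': 3}


def parse_voc_annotation(xml_text, label_ids):
    """Return (filename, boxes); each box has a class id and normalized corners."""
    root = ET.fromstring(xml_text)
    width = int(root.find('size/width').text)
    height = int(root.find('size/height').text)
    boxes = []
    for obj in root.findall('object'):
        name = obj.find('name').text
        box = obj.find('bndbox')
        boxes.append({
            'class_id': label_ids[name],  # KeyError if the XML label is missing from the label map
            'xmin': int(box.find('xmin').text) / width,
            'ymin': int(box.find('ymin').text) / height,
            'xmax': int(box.find('xmax').text) / width,
            'ymax': int(box.find('ymax').text) / height,
        })
    return root.find('filename').text, boxes


filename, boxes = parse_voc_annotation(SAMPLE_XML, LABEL_IDS)
print(filename, boxes)
```

The `KeyError` on an unknown label is the reason the label names in the label map must match the XML files exactly.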

### 1.6. Update config for Transfer Learning

Copy the config file from pretrained model path to the checkpoint path

In [None]:
# Copy pipeline.config from the pretrained model folder to CHECKPOINT_PATH
# cp src_file dest_directory
if os.name =='posix':
    !cp {os.path.join(paths['PRETRAINED_MODEL_PATH'], PRETRAINED_MODEL_NAME, 'pipeline.config')} {os.path.join(paths['CHECKPOINT_PATH'])}
if os.name == 'nt':
    !copy {os.path.join(paths['PRETRAINED_MODEL_PATH'], PRETRAINED_MODEL_NAME, 'pipeline.config')} {os.path.join(paths['CHECKPOINT_PATH'])}

Change config file:
* Number of classes
* Batch Size
* Fine Tune Checkpoint
* Fine Tune Checkpoint type
* Label map path
* Annotations path

In [None]:
!python {files['UPDATE_CONFIG_SCRIPT']} -p {files['PIPELINE_CONFIG']} -m {files['LABELMAP']} -t {paths['PRETRAINED_MODEL_PATH']} -n {PRETRAINED_MODEL_NAME} -a {paths['ANNOTATION_PATH']} -c {len(labels)} -b {4}   
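
After the update it can help to confirm that the fields listed above actually changed. The sketch below pulls a few of them out of the config text with regular expressions. `summarize_pipeline_config` and the inline sample are hypothetical, with illustrative values; a real check would read the text from `files['PIPELINE_CONFIG']` instead.

```python
import re

# Hypothetical excerpt of an updated pipeline.config (values are illustrative)
SAMPLE_CONFIG = '''
model { ssd { num_classes: 3 } }
train_config {
  batch_size: 4
  fine_tune_checkpoint: "Tensorflow/workspace/pre-trained-models/example/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
}
train_input_reader { label_map_path: "Tensorflow/workspace/annotations/label_map.pbtxt" }
'''


def summarize_pipeline_config(config_text):
    """Extract the first occurrence of some fields touched for transfer learning."""
    patterns = {
        'num_classes': r'num_classes:\s*(\d+)',
        'batch_size': r'batch_size:\s*(\d+)',
        'fine_tune_checkpoint_type': r'fine_tune_checkpoint_type:\s*"([^"]*)"',
        'label_map_path': r'label_map_path:\s*"([^"]*)"',
    }
    summary = {}
    for field, pattern in patterns.items():
        match = re.search(pattern, config_text)
        summary[field] = match.group(1) if match else None
    return summary


summary = summarize_pipeline_config(SAMPLE_CONFIG)
print(summary)
```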

## 2. Train the Model

Build the training command.

Inputs of the script model_main_tf2.py: 
* Model directory: where the pipeline config is located and where checkpoints are written
* Path to the pipeline config
* Number of train steps 

In [None]:
TRAINING_SCRIPT = os.path.join(paths['APIMODEL_PATH'], 'research', 'object_detection', 'model_main_tf2.py')
command = "python {} --model_dir={} --pipeline_config_path={} --num_train_steps=3000".format(TRAINING_SCRIPT, 
                                                                                             paths['CHECKPOINT_PATH'],
                                                                                             files['PIPELINE_CONFIG'])
!{command}
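
When the notebook shell syntax is not available (for example in a plain Python script), the same command can be assembled as an argument list for `subprocess`. This is a minimal sketch: `build_train_command` is a hypothetical helper, the paths in the usage line are placeholders, and only the three flags already used above are passed.

```python
import sys


def build_train_command(training_script, model_dir, pipeline_config, num_train_steps=3000):
    """Return the training command as an argument list (robust to spaces in paths)."""
    return [
        sys.executable,  # run with the same interpreter as the current process
        training_script,
        f'--model_dir={model_dir}',
        f'--pipeline_config_path={pipeline_config}',
        f'--num_train_steps={num_train_steps}',
    ]


# Placeholder paths; in the notebook these would be TRAINING_SCRIPT,
# paths['CHECKPOINT_PATH'] and files['PIPELINE_CONFIG']
train_args = build_train_command('model_main_tf2.py', 'models/my_model', 'models/my_model/pipeline.config')
print(' '.join(train_args))
# To launch it: import subprocess; subprocess.run(train_args, check=True)
```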

## 3. Evaluate the Model

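Build the command that runs the model evaluation. It reuses model_main_tf2.py; passing --checkpoint_dir switches the script to eval-only mode.

Inputs of the script:
* Model directory, where the evaluation metrics are written
* File path of the pipeline config
* Directory that holds the checkpoints to evaluate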

In [None]:
command = "python {} --model_dir={} --pipeline_config_path={} --checkpoint_dir={}".format(TRAINING_SCRIPT, 
                                                                                          paths['CHECKPOINT_PATH'],
                                                                                          files['PIPELINE_CONFIG'], 
                                                                                          paths['CHECKPOINT_PATH'])
!{command}

### 3.1. TensorBoard

TODO: Change to command line code

From the model directory `Tensorflow\workspace\models\my_ssd_mobnet_v5` (the value of `paths['CHECKPOINT_PATH']`), run the command:

tensorboard --logdir=.

## 4. Freezing the Graph

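Build the command that exports the trained checkpoint as an inference-ready SavedModel with exporter_main_v2.py. The result is written to `paths['OUTPUT_PATH']`.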

In [None]:
FREEZE_SCRIPT = os.path.join(paths['APIMODEL_PATH'], 'research', 'object_detection', 'exporter_main_v2.py')
command = "python {} --input_type=image_tensor --pipeline_config_path={} --trained_checkpoint_dir={} --output_directory={}".format(FREEZE_SCRIPT,
                                                                                                                                   files['PIPELINE_CONFIG'],
                                                                                                                                   paths['CHECKPOINT_PATH'],
                                                                                                                                   paths['OUTPUT_PATH'])
print(command)

Run the freeze command

In [None]:
!{command}

## 5. Zip the model
After training the model in Google Colab, archive the model folder so that it can be downloaded and extracted into the models folder on the local machine.

In [None]:
!tar -czf models.tar.gz {paths['CHECKPOINT_PATH']}

from google.colab import drive
drive.mount('/content/drive')