# This notebook walks you through three steps: (1) model setup, (2) training, and (3) exporting to TFLite
---
- `training_demo` is a user-defined folder name under `workspace`. See `image_dataset\create_data.ipynb` for how to create your own training folder.
- This notebook assumes the dataset has already been prepared. If it has not, go to `image_dataset\create_data.ipynb` first.
- It is recommended to copy the commands below and run them in CMD, PowerShell, or a terminal outside this notebook.
- All the commands below must be executed from `workspace\training_demo`.
- \<Advanced>: More details are available in the [tensorflow-object-detection-api-tutorial](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html).

# Training the Model
---

## Download Pre-Trained Model
- The model used in this example is `ssd_mobilenet_v3_small_coco`.
- All TensorFlow 1 pre-trained models are listed in the [TensorFlow 1 Detection Model Zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md); you can choose and download a different model from there.
- The downloaded file is a `*.tar.gz` archive; decompress it (e.g. with 7-Zip or WinZip).
- Move `ssd_mobilenet_v3_small_coco_2020_01_14` into the folder `training_demo/pre-trained-models`:

- <pre>training_demo/
├─ ...
├─ pre-trained-models/
│  └─ ssd_mobilenet_v3_small_coco_2020_01_14/
│     ├─ checkpoint
│     ├─ frozen_inference_graph.pb
│     ├─ pipeline.config
│     └─ ... 
└─ ...
</pre>
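
If you would rather unpack the archive from Python than with a GUI tool, the standard-library `tarfile` module can do it. The following is a minimal sketch; the helper name `extract_model_archive` is an assumption made for illustration and is not part of this notebook.

```python
import os
import tarfile


def extract_model_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return its top-level names."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Only extract archives you trust: tar members may contain unsafe paths.
        archive.extractall(dest_dir)
        names = archive.getnames()
    # Report the top-level entries so the caller can confirm the folder name.
    return sorted({name.split("/")[0] for name in names})
```

Run from inside `training_demo`, a call such as `extract_model_archive("ssd_mobilenet_v3_small_coco_2020_01_14.tar.gz", "pre-trained-models")` (the archive file name is assumed to match the downloaded model) should produce the layout shown above.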

## Configure the Training Pipeline
- The parameters below depend on your file and folder names. Update them if anything changes.
    1. `training_dir`: the name of the user-defined working directory
    2. `my_model_directory_name`: the user-defined folder that stores the training weights, checkpoints, and `pipeline.config`
    3. `fine_tune_checkpoint`: the location of the downloaded pre-trained model checkpoint
    4. `train_record_fname`: the location of the TFRecord file created for training
    5. `test_record_fname`: the location of the TFRecord file created for testing
    6. `label_map_pbtxt_fname`: the location of the label map
    7. `batch_size`: increase or decrease this value depending on the available memory
    8. `num_steps`: the number of training steps
- Execute the two blocks below.
- These settings target the `pipeline.config` of `ssd_mobilenet_v3_small_coco`. If you use another model, its `pipeline.config` may differ slightly, but these attributes should still be present and relevant.
- \<Advanced>: To tune more parameters, edit `pipeline.config` directly.

In [1]:
training_dir = 'training_demo_tf1'

my_model_directory_name = 'models/my_ssd_mobilenet_v3' 
fine_tune_checkpoint = 'pre-trained-models/ssd_mobilenet_v3_small_coco_2020_01_14/model.ckpt' 
train_record_fname = 'annotations/train.record' 
test_record_fname = 'annotations/test.record' 
label_map_pbtxt_fname = "annotations/label_map.pbtxt" 
batch_size = 32 
num_steps = 80000 

In [2]:
import tensorflow as tf
import re  # the standard-library module covers every pattern used below
import shutil
import json
import os

home_path = os.getcwd() 
path_para_list = [my_model_directory_name, fine_tune_checkpoint, train_record_fname, test_record_fname, label_map_pbtxt_fname]
# Prefix each relative path with <current dir>/<training_dir> to get its full path.
update_path_para_list = list(map(lambda x : os.path.join(home_path, training_dir, x), path_para_list))

def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
num_classes = get_num_classes(update_path_para_list[4])
print('The number of class from label_map.pbtxt: {}'.format(num_classes))

def create_user_folder(dir_path):
    try:
        os.makedirs(dir_path)  # also creates missing parent folders such as models/
    except OSError as error:
        print(error)
        print('skip create...')
def copy_user_file(src, dst):
    try:
        shutil.copy(src, dst)
    except OSError as error:
        print(error)
def update_config(src_fld, dst_fld):
    print('writing custom configuration file...')

    with open(os.path.join(src_fld, 'pipeline.config')) as f:
        s = f.read()
    print('The train config file is at: {}'.format(os.path.join(dst_fld, 'pipeline.config')))    
    with open(os.path.join(dst_fld, 'pipeline.config'), 'w') as f:
                
        # label_map_path
        s = re.sub(
            'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)
        # Set training batch_size.
        s = re.sub('batch_size: [0-9]+',
                   'batch_size: {}'.format(batch_size), s)
        
        # Set training steps, num_steps
        if(not re.search('fine_tune_checkpoint: ".*?"', s)):
            s = re.sub('num_steps: [0-9]+',
                   'num_steps: {}\n  fine_tune_checkpoint: "{}"'.format(num_steps, fine_tune_checkpoint), s)
        else:
            s = re.sub('num_steps: [0-9]+',
                   'num_steps: {}'.format(num_steps), s)
            # fine_tune_checkpoint is already present, so replace its value in place
            s = re.sub('fine_tune_checkpoint: ".*?"',
                   'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
        
        # Set number of classes num_classes.
        s = re.sub('num_classes: [0-9]+',
                   'num_classes: {}'.format(num_classes), s)
        #fine-tune checkpoint type
        s = re.sub(
            'fine_tune_checkpoint_type: "classification"', 'fine_tune_checkpoint_type: "{}"'.format('detection'), s)
        
        # TFRecord files for train and test (the train section must come before the test section)
        s = re.sub(
            '(input_path: ".*?)(PATH_TO_BE_CONFIGURED)(.*?")', 'input_path: "{}"'.format(train_record_fname), s, 1)
        s = re.sub(
            '(input_path: ".*?)(PATH_TO_BE_CONFIGURED)(.*?")', 'input_path: "{}"'.format(test_record_fname), s, 1)
        
        f.write(s)            
# create model_directory            
create_user_folder(update_path_para_list[0])
# copy pipeline.config
update_config(os.path.dirname(update_path_para_list[1]), update_path_para_list[0])
#copy_user_file(os.path.join(update_path_para_list[1].split(r'checkpoint')[-2], 'pipeline.config'), update_path_para_list[0]) 

The number of class from label_map.pbtxt: 2
writing custom configuration file...
The train config file is at: C:\Users\USER\Desktop\ML_tf2_object_detection_nu\workspace\training_demo_tf1\models/my_ssd_mobilenet_v3\pipeline.config
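
To confirm that the substitutions took effect, you can read the fields back from the written file. The sketch below is one hedged way to do that with regular expressions; `summarize_pipeline_config` is a hypothetical helper, and it returns the values as strings exactly as they appear in the text.

```python
import re


def summarize_pipeline_config(text):
    """Return the fields rewritten by the cell above, as found in text."""
    def first(pattern):
        # Return the first captured group, or None if the field is absent.
        match = re.search(pattern, text)
        return match.group(1) if match else None

    return {
        "num_classes": first(r"num_classes: ([0-9]+)"),
        "batch_size": first(r"batch_size: ([0-9]+)"),
        "num_steps": first(r"num_steps: ([0-9]+)"),
        "fine_tune_checkpoint": first(r'fine_tune_checkpoint: "(.*?)"'),
        "label_map_path": first(r'label_map_path: "(.*?)"'),
        # The train reader comes first in the file, then the eval reader.
        "input_paths": re.findall(r'input_path: "(.*?)"', text),
    }
```

For example, pass it the contents of the `pipeline.config` whose path is printed above and compare the result with the values set in the first cell.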


## Training the Model
- Open a CMD.exe or PowerShell prompt and `cd` into your working folder, for example the `training_demo_tf1` folder.
    - For example: `cd ML_tf2_object_detection_nu\workspace\training_demo_tf1`
- Training command options:
    - `--model_dir` is the user-defined folder set in `my_model_directory_name`. The training checkpoints and variables are saved here.
    - `--pipeline_config_path` is the location of the user-defined `pipeline.config`.
- <pre> training_demo/
├─ ...
├─ models/
│  └─ my_ssd_mobilenet_v3/
│     └─ pipeline.config
└─ ...
</pre>
- The output may look as if it has frozen, but do NOT rush to cancel the process. By default, training logs are printed only every 100 steps, so if you wait a while you should see the loss logged at step 100.


- <img src="train_exmple_plots/train_process_tf1.png" width="400" height="300">

- The TF1 version runs evaluation during training. If an error occurs, see [help](#id-IH).
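
Before launching a long training run, it can help to check that the files the command depends on are in place. The following is a minimal sketch under the assumption that you kept the relative paths from the configuration cell; `missing_training_inputs` and `REQUIRED_FILES` are hypothetical names introduced here.

```python
import os

# Relative paths taken from the configuration cell earlier in this notebook.
REQUIRED_FILES = [
    "models/my_ssd_mobilenet_v3/pipeline.config",
    "annotations/train.record",
    "annotations/test.record",
    "annotations/label_map.pbtxt",
]


def missing_training_inputs(working_dir, required=REQUIRED_FILES):
    """Return the required files that do not exist under working_dir."""
    return [
        rel for rel in required
        if not os.path.isfile(os.path.join(working_dir, rel))
    ]
```

Calling `missing_training_inputs(".")` from inside the working folder returns an empty list when everything is present; any path it returns should be fixed before training.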

In [None]:
python model_main.py --model_dir=models/my_ssd_mobilenet_v3 --pipeline_config_path=models/my_ssd_mobilenet_v3/pipeline.config

# Evaluating the Model (Optional)
---
- The TF1 version runs evaluation during training, so this command is normally not needed.
- If you do run it, open another CMD.exe or PowerShell prompt while the training step is running.
- The procedure is the same as for training the model: run the command from inside your working folder.
    - `--checkpoint_dir` is the location of the training checkpoints, which are saved in the models folder.
- If an error occurs, see [help](#id-IH).

In [None]:
python model_main.py --model_dir=models/my_ssd_mobilenet_v3 --pipeline_config_path=models/my_ssd_mobilenet_v3/pipeline.config --checkpoint_dir=models/my_ssd_mobilenet_v3

# Monitor Training Job Progress using TensorBoard
---
- Open another CMD.exe or PowerShell prompt while the training step is running.
- Run the command from inside your working folder.
    - `--logdir` is the location of the training checkpoints and logs, which are saved in the models folder.
- Copy the URL and paste it into a browser (other than IE), as shown below:
- <img src="train_exmple_plots/tf_board_url.png" width="400" height="300">
- The dashboard looks like this:
- <img src="train_exmple_plots/tf_board_tf1.png" width="400" height="300">
- With the TF1 version, TensorBoard may occasionally fail to open; if so, see [help](#id-IH).


In [None]:
tensorboard --logdir=models/my_ssd_mobilenet_v3 --host localhost --port 8088

# Export a TFLite inference graph
---
To deploy on an edge device, use this command, which outputs a TFLite inference graph.
- Open another CMD.exe or PowerShell prompt to run the command.
- The procedure is the same as for training the model: run the command from inside your working folder.
    - `--pipeline_config_path` is the location of the user-defined `pipeline.config`.
    - `--trained_checkpoint_prefix` is the checkpoint to export, which is saved in the models folder. Check `models/` and update the `.../model.ckpt-xxxx` path to the checkpoint step you want.
    - `--output_directory` is the user-defined folder for the exported graph, for example `tflite_infer_graph_XX`. A distinct name makes it easy to tell different models apart.
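
Looking up the newest checkpoint step by hand is easy to get wrong, because file names sort alphabetically rather than numerically. The sketch below scans a model folder for `model.ckpt-N.index` files and returns the prefix with the largest `N`; `latest_checkpoint_prefix` is a hypothetical helper written for this illustration.

```python
import os
import re


def latest_checkpoint_prefix(model_dir):
    """Return 'model_dir/model.ckpt-N' for the largest N found, or None."""
    steps = []
    for name in os.listdir(model_dir):
        # TF1 checkpoints are saved as model.ckpt-N.index / .meta / .data-*
        match = re.fullmatch(r"model\.ckpt-([0-9]+)\.index", name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    # Compare steps as integers so that 1004 ranks above 568.
    return "{}/model.ckpt-{}".format(model_dir.rstrip("/\\"), max(steps))
```

For example, `latest_checkpoint_prefix("models/my_ssd_mobilenet_v3")` gives a value you can pass to `--trained_checkpoint_prefix`. TensorFlow's own `tf.train.latest_checkpoint` serves a similar purpose if you prefer to rely on the `checkpoint` index file.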

In [None]:
python export_tflite_ssd_graph.py --pipeline_config_path models/my_ssd_mobilenet_v3/pipeline.config --trained_checkpoint_prefix models/my_ssd_mobilenet_v3/model.ckpt-568 --output_directory exported-models/inference_graph_tflite --add_postprocessing_op=true

# Export a Trained Model graph (Optional)
---
- Exports the model as a regular inference graph to a user-defined folder, for example `.\exported-models\infer_graph`.

In [None]:
python export_inference_graph.py --pipeline_config_path models/my_ssd_mobilenet_v3/pipeline.config --trained_checkpoint_prefix models/my_ssd_mobilenet_v3/model.ckpt-1004 --output_directory exported-models/infer_graph

# Convert to tflite
---
- Update `source_graph_model_folder` and `output_tflite_location` for your setup.
- `dynamic_quant_enable` turns on dynamic range quantization, which stores the weights as 8-bit values. The model becomes smaller, but its accuracy may be worse. `float16_quant` stores the weights as float16 instead.
- After updating the settings, execute the blocks below.
- <pre> training_demo/
├─ ...
├─ exported-models/
│  └─ inference_graph_tflite/
│     ├─ tflite_graph.pb
│     ├─ tflite_graph.pbtxt   
│     └─ mobilenetv3_ssd_v1.tflite (the output file after executing the cells below)
└─ ...
</pre>

In [7]:
#source_graph_model_folder = "training_demo_tf1/exported-models/inference_graph_tflite/tflite_graph.pb"
#output_tflite_location = "training_demo_tf1/exported-models/inference_graph_tflite/mobilenetv3_ssd_v1.tflite"
source_graph_model_folder = "training_demo_tf1/exported-models/inference_graph_tflite/tflite_graph.pb"
output_tflite_location = "training_demo_tf1/exported-models/inference_graph_tflite/mobilenetv3_ssd_f16.tflite"
rep_dataset_loc = r"C:\Users\USERNAME\image_detection\image_dataset\COCO\images\val2017\*.jpg"
dynamic_quant_enable = False
float16_quant = True

In [8]:
import tensorflow as tf

input_arrays = ["normalized_input_image_tensor"]
output_arrays = ['TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3']
#output_arrays = ["detection_boxes", "detection_classes", "detection_scores", "num_boxes"]
#input_arrays = ["serving_default_input:0"]
#output_arrays = ['StatefulPartitionedCall','StatefulPartitionedCall:1','StatefulPartitionedCall:2','StatefulPartitionedCall:3']


converter = tf.lite.TFLiteConverter.from_frozen_graph(
  source_graph_model_folder, 
  input_arrays, 
  output_arrays, 
  input_shapes={'normalized_input_image_tensor':[1, 320, 320, 3]}
  )
converter.allow_custom_ops = True
if dynamic_quant_enable or float16_quant:
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
if float16_quant:   
    converter.target_spec.supported_types = [tf.float16]    

tflite_model = converter.convert()
open(output_tflite_location, "wb").write(tflite_model)

1976484
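
The two flags combine in a way that is easy to misread: the conversion cell enables the default optimization when either flag is set, and then narrows the supported types to float16 when `float16_quant` is set. The sketch below mirrors that logic in plain Python to show which mode results; `describe_quantization` is a hypothetical helper and does not call the converter.

```python
def describe_quantization(dynamic_quant_enable, float16_quant):
    """Name the quantization mode that the conversion cell would select."""
    if float16_quant:
        # Optimize.DEFAULT plus supported_types = [tf.float16]
        return "float16 quantization"
    if dynamic_quant_enable:
        # Optimize.DEFAULT alone, with no representative dataset
        return "dynamic range quantization"
    # Neither flag set: the converter keeps float32 weights
    return "no quantization (float32)"
```

Note that `float16_quant` takes precedence when both flags are `True`, which matches the settings used above (`dynamic_quant_enable = False`, `float16_quant = True`).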

# Convert to int8 tflite
- Converting to an int8 TFLite model requires TensorFlow 2.
- Run this section in a TF2 Python environment.

In [None]:
#import tensorflow as tf
import tensorflow.compat.v1 as tf
import random
import numpy as np
from glob import glob
import gc
import os

class my_tflite_trans():
    def __init__(self,source_model_folder, output_tflite_location, rep_dataset_loc):
        self.source_model_folder = source_model_folder
        self.output_tflite_location = output_tflite_location
        self.rep_dataset_loc = rep_dataset_loc

    def tflite_preprocess(self, image, height, width):
        if image.dtype != tf.float32:
            image = tf.image.convert_image_dtype(image, dtype=tf.float32)
    
        # Resize the image to the specified height and width.
        image = tf.expand_dims(image, 0)
        image = tf.compat.v1.image.resize_bilinear(image, [height, width],
                                       align_corners=False)
        #image = tf.squeeze(image, [0])
    
        image = tf.subtract(image, 0.5)
        image = tf.multiply(image, 2.0)
        return image
    
    def representative_dataset(self):
        files = glob(self.rep_dataset_loc)
        random.shuffle(files)
        files = files[:256]
        for file in files:
            #print(file)
            image = tf.io.read_file(file)
            image = tf.compat.v1.image.decode_jpeg(image)
            if image.get_shape()[2] == 3:  # skip images that do not have exactly 3 channels
                image = self.tflite_preprocess(image, 320, 320)
            else:
                continue
            
            yield [image]
    
    #def representative_dataset(self):
    #    for _ in range(100):
    #        data = np.random.rand(1, 320, 320, 3)
    #        yield [data.astype(np.float32)]

    def run_tflite(self):
        
        input_arrays = ["normalized_input_image_tensor"]
        output_arrays = ['TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3']
        converter = tf.lite.TFLiteConverter.from_frozen_graph(
                    self.source_model_folder, 
                    input_arrays, 
                    output_arrays, 
                    input_shapes={'normalized_input_image_tensor':[1, 320, 320, 3]}
                    )
        # Refer to: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_on_mobile_tf2.md#step-2-convert-to-tflite
        #converter = tf.lite.TFLiteConverter.from_saved_model(self.source_model_folder)
        
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8, tf.lite.OpsSet.TFLITE_BUILTINS]
        converter.representative_dataset = self.representative_dataset
        converter.inference_input_type = tf.int8  # or tf.uint8
        #converter.inference_output_type = tf.int8  # or tf.uint8
        
        converter.allow_custom_ops = True
        tflite_model = converter.convert()
        
        #del(converter)
        #del(representative_dataset)
        #gc.collect()
        
        # remove the original file
        #try:
        #    os.remove(self.output_tflite_location)
        #except OSError as e:
        #    print(e)
        
        # Save the model.
        with open(self.output_tflite_location, 'wb') as f:
            f.write(tflite_model)

In [None]:
x = my_tflite_trans(source_graph_model_folder, output_tflite_location, rep_dataset_loc)
x.run_tflite()
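
Because the converter is configured with `inference_input_type = tf.int8`, images fed to the converted model need the same normalization as `tflite_preprocess` and then a conversion to int8. The sketch below shows both steps for a single pixel value in plain Python. The `scale` and `zero_point` arguments are placeholders: the real values belong to your converted model and can be read from the `quantization` entry of `tf.lite.Interpreter.get_input_details()`.

```python
def normalize_pixel(value):
    """Map an 8-bit pixel value (0..255) to the float range [-1.0, 1.0]."""
    scaled = value / 255.0        # same scaling as convert_image_dtype
    return (scaled - 0.5) * 2.0   # same as tf.subtract / tf.multiply above


def quantize_value(real_value, scale, zero_point):
    """Apply q = round(real / scale) + zero_point, clamped to the int8 range."""
    q = int(round(real_value / scale)) + zero_point
    return max(-128, min(127, q))
```

For instance, `quantize_value(normalize_pixel(255), scale, zero_point)` gives the int8 value for a fully bright pixel once `scale` and `zero_point` are taken from your model.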

- The input shape can be set with a command like the following.

In [None]:
tflite_convert --graph_def_file=training_demo_tf1\exported-models\tf12tf2\tflite_graph.pb --output_file=training_demo_tf1\exported-models\tf12tf2\ssd_mobilenetv3_1126.tflite --input_arrays='normalized_input_image_tensor' --output_arrays='TFLite_Detection_PostProcess','TFLite_Detection_PostProcess:1','TFLite_Detection_PostProcess:2','TFLite_Detection_PostProcess:3' --input_shapes=1,320,320,3 --allow_custom_ops

<a id="id-IH"></a>
# Issue Help
---