# Introduction

We will take the following steps to train a TensorFlow2 Object Detection model on our custom data:
* Install TensorFlow2 Object Detection Dependencies
* Download Custom TensorFlow2 Object Detection Dataset
* Write Custom TensorFlow2 Object Detection Training Configuration
* Train Custom TensorFlow2 Object Detection Model
* Export Custom TensorFlow2 Object Detection Weights
* Use Trained TensorFlow2 Object Detection For Inference on Test Images

# Install TensorFlow2 Object Detection Dependencies

In [1]:
#!pip install -U --pre tensorflow=="2.2.0"

In [2]:
#import tensorflow as tf

# Why is the version not 2.2.0?
# Should we use a virtual environment to better control which packages the kernel uses?

#print(tf.__version__)

In [3]:
!python --version

Python 3.8.10


In [4]:
!pip install gitpython

You should consider upgrading via the '/anaconda/envs/py38_default/bin/python -m pip install --upgrade pip' command.


In [5]:
import os
import pathlib
from git import Repo
import logging

logging.basicConfig(level=logging.INFO)

models_folder = os.path.join(os.getcwd(), "models")

# Clone the tensorflow models repository if it doesn't already exist in this folder
if "models" in pathlib.Path.cwd().parts:
    while "models" in pathlib.Path.cwd().parts:
        os.chdir('..')

    logging.info("The models repository is already present. Changed working directory to its parent folder.")

elif not pathlib.Path('models').exists():
    os.mkdir("./models")
    repo = Repo.clone_from(
        'https://github.com/tensorflow/models.git',
        models_folder,
        depth=1,
        branch='master',
    )

    logging.info("The models have now been loaded from the tensorflow/models.git repo.")
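
The `while` loop in the cell above steps up with `os.chdir('..')` until no path component is named `models`. The same target folder can be computed without changing the working directory. This is only a minimal sketch: `parent_outside` is a hypothetical helper name that does not exist in the notebook, and the example path is made up.

```python
from pathlib import Path


def parent_outside(path, folder_name):
    """Return the closest ancestor of `path` (or `path` itself) that has
    no component named `folder_name`."""
    current = Path(path)
    # Mirror the notebook's loop: step up one level while the name is
    # still somewhere in the path.
    while folder_name in current.parts:
        current = current.parent
    return current


# Example with a made-up path; nothing is touched on disk.
print(parent_outside(Path("/work/training/models/research"), "models"))
```

The result could then be passed to a single `os.chdir` call, which keeps the directory change explicit.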

# PyCoco library

The following steps avoid build problems with pycocotools.

1. Clone the official repository.
2. Navigate to the PythonAPI folder and open the setup.py file.
3. Edit line 12 to read extra_compile_args=[]. This removes the Clang-specific arguments, which don’t work on MSVC.
4. From a CMD terminal in the PythonAPI folder, run:
python setup.py build_ext --inplace
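
Step 3 can also be scripted rather than done by hand. The sketch below works on the text of setup.py and empties whatever list is assigned to `extra_compile_args`. The sample line is an assumption about what the file contains, and `clear_extra_compile_args` is a hypothetical helper, not part of pycocotools.

```python
import re


def clear_extra_compile_args(setup_source):
    """Replace the list assigned to extra_compile_args with an empty list."""
    return re.sub(
        r"extra_compile_args\s*=\s*\[[^\]]*\]",
        "extra_compile_args=[]",
        setup_source,
    )


# Sample line, assumed to resemble the one found in PythonAPI/setup.py.
sample = "extra_compile_args=['-Wno-cpp', '-Wno-unused-function', '-std=c99'],"
cleaned = clear_extra_compile_args(sample)
print(cleaned)
```

To apply it to the real file, read setup.py into a string, pass it through the function, and write the result back.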

In [6]:
pycoco_folder = os.path.join(os.getcwd(), "pycoco")

# Clone the pycoco repository if it doesn't exist. It is needed to avoid clashes with the TF2API
if "pycoco" in pathlib.Path.cwd().parts:
    while "pycoco" in pathlib.Path.cwd().parts:
        os.chdir('..')

    logging.info("The pycoco repository is already present. Changed working directory to its parent folder.")

elif not pathlib.Path('pycoco').exists():
    os.mkdir("./pycoco")
    repo = Repo.clone_from(
        'https://github.com/cocodataset/cocoapi.git',
        pycoco_folder, 
        branch="master"
    )

    logging.info("The cocoapi repository has now been cloned into the pycoco folder.")

The build command from step 4 builds the package in place, ready to use from your current environment. To test whether the build succeeded, start Python and run: import pycocotools.
You might ask whether we should have added MSVC-specific flags to replace the Clang ones. I wondered the same, but it worked nicely, and no error has popped up for me. The original Clang flags, from what I can tell, are there to disable some warnings and force C99 compliance.

In [7]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pycoco/PythonAPI
!python setup.py build_ext --inplace

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pycoco/PythonAPI
running build_ext
skipping 'pycocotools/_mask.c' Cython extension (up-to-date)
copying build/lib.linux-x86_64-3.8/pycocotools/_mask.cpython-38-x86_64-linux-gnu.so -> pycocotools


# TF2 API: Changes to the Models folder

1. Navigate to “./research/object_detection/packages/tf2/” and edit the setup.py file. From the REQUIRED_PACKAGES list, delete the pycocotools reference (line 20). This change will prevent the installation process from trying to reinstall pycocotools from pip, which would fail and abort the whole process.
2. Copy this setup.py file to the “./research” folder, replacing the setup.py that was already there.
3. With the CMD open at the research folder, compile protocol buffers with:
protoc object_detection/protos/*.proto --python_out=.
4. If the previous command worked, nothing should appear. Not the most intuitive of things, I know. After it, run the following:
python -m pip install .
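
Since protoc prints nothing on success, one way to confirm step 3 is to check that every .proto file has a generated `_pb2.py` module next to it, which is where `--python_out=.` writes them. A minimal sketch, with `missing_pb2_modules` as a hypothetical helper and a throwaway folder standing in for object_detection/protos:

```python
import tempfile
from pathlib import Path


def missing_pb2_modules(protos_dir):
    """List .proto files in `protos_dir` that have no *_pb2.py next to them."""
    missing = []
    for proto in sorted(Path(protos_dir).glob("*.proto")):
        generated = proto.with_name(proto.stem + "_pb2.py")
        if not generated.exists():
            missing.append(proto.name)
    return missing


# Demonstration on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.proto", "a_pb2.py", "b.proto"):
        (Path(tmp) / name).touch()
    demo_missing = missing_pb2_modules(tmp)
print(demo_missing)
```

Run from the research folder, `missing_pb2_modules("object_detection/protos")` should return an empty list once compilation has worked.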

In [8]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research
!protoc object_detection/protos/*.proto --python_out=.
%cp /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/object_detection/packages/tf2/setup.py .
!pip install .

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research
Processing /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.


Collecting h5py~=3.1.0
  Using cached h5py-3.1.0-cp38-cp38-manylinux1_x86_64.whl (4.4 MB)
Collecting numpy>=1.15.4
  Using cached numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl (14.9 MB)


Building wheels for collected packages: object-detection
  Building wheel for object-detection (setup.py) ... done
  Created wheel for object-detection: filename=object_detection-0.1-py3-none-any.whl size=1679000 sha256=cc6987c3f5fc33b0e0d9572ba16b044db8dc7d37f98ca337e0db2acc0a94127d
  Stored in directory: /tmp/pip-ephem-wheel-cache-j_mu11xt/wheels/a4/27/31/b41a2f9b118ebb35237b34adc3f408b0c60bd7f122d0a7eb79
Successfully built object-detection
Installing collected packages: numpy, h5py, object-detection
  Attempting uninstall: numpy
    Found existing installation: numpy 1.21.4
    Uninstalling numpy-1.21.4:
      Successfully uninstalled numpy-1.21.4
  Attempting uninstall: h5py
    Found existing installation: h5py 2.9.0
    Uninstalling h5py-2.9.0:
      Successfully uninstalled h5py-2.9.0
  Attempting uninstall: object-detection
    Found existing installation: object-detection 0.1
    Uninstalling object-detection-0.1:
      Successfully uninstalled object-detection-0.1

Make sure to uninstall h5py and reinstall it at version 2.9.

In [9]:
!pip uninstall h5py -y

Found existing installation: h5py 3.1.0
Uninstalling h5py-3.1.0:
  Successfully uninstalled h5py-3.1.0


In [10]:
!pip install h5py==2.9

Collecting h5py==2.9
  Using cached h5py-2.9.0-cp38-cp38-manylinux1_x86_64.whl (2.8 MB)
Installing collected packages: h5py
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.6.1 requires h5py~=3.1.0, but you have h5py 2.9.0 which is incompatible.
tensorflow-gpu 2.5.0 requires grpcio~=1.34.0, but you have grpcio 1.41.1 which is incompatible.
tensorflow-gpu 2.5.0 requires h5py~=3.1.0, but you have h5py 2.9.0 which is incompatible.
tensorflow-gpu 2.5.0 requires tensorflow-estimator<2.6.0,>=2.5.0rc0, but you have tensorflow-estimator 2.2.0 which is incompatible.
Successfully installed h5py-2.9.0
You should consider upgrading via the '/anaconda/envs/py38_default/bin/python -m pip install --upgrade pip' command.


# Fire the model up

Once everything is installed, import all the libraries that are needed and launch a sample training to check that everything works smoothly. 

In [11]:
import matplotlib
import matplotlib.pyplot as plt

import os
import random
import io
import imageio
import glob
import scipy.misc
import numpy as np
from six import BytesIO
from PIL import Image, ImageDraw, ImageFont
from IPython.display import display, Javascript
from IPython.display import Image as IPyImage

import tensorflow as tf

from models.research.object_detection.utils import label_map_util
from models.research.object_detection.utils import config_util
from models.research.object_detection.utils import visualization_utils as viz_utils
from models.research.object_detection.builders import model_builder

%matplotlib inline

Run a pip freeze to see whether tensorflow-gpu is installed, and run the test to see everything works smoothly.
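
As an alternative to scanning the pip freeze output by eye, the installed versions can be queried from Python with the standard library. A minimal sketch; `installed_version` is a hypothetical helper name, and the package list is just the set of names that appear in this notebook.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages this notebook cares about.
for name in ("tensorflow", "tensorflow-gpu", "h5py", "numpy"):
    print(f"{name}: {installed_version(name)}")
```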

In [12]:
#run model builder test
!python /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/object_detection/builders/model_builder_tf2_test.py

Running tests under Python 3.8.10: /anaconda/envs/py38_default/bin/python
[ RUN      ] ModelBuilderTF2Test.test_create_center_net_deepmac
2021-11-17 14:29:41.150092: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-17 14:30:01.115639: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11567 MB memory:  -> device: 0, name: Tesla K80, pci bus id: 0001:00:00.0, compute capability: 3.7
  logging.warn(('Building experimental DeepMAC meta-arch.'
W1117 14:30:01.711951 140271730762688 model_builder.py:1100] Building experimental DeepMAC meta-arch. Some features may be omitted.
INFO:tensorflow:time(__main__.ModelBuilderTF2Test.test_create_center_net_deepmac

I1117 14:30:06.866880 140271730762688 efficientnet_model.py:147] round_filter input=16 output=16
I1117 14:30:06.866980 140271730762688 efficientnet_model.py:147] round_filter input=24 output=24
I1117 14:30:07.017105 140271730762688 efficientnet_model.py:147] round_filter input=24 output=24
I1117 14:30:07.017230 140271730762688 efficientnet_model.py:147] round_filter input=40 output=40
I1117 14:30:07.164209 140271730762688 efficientnet_model.py:147] round_filter input=40 output=40
I1117 14:30:07.164330 140271730762688 efficientnet_model.py:147] round_filter input=80 output=80
I1117 14:30:07.535388 140271730762688 efficientnet_model.py:147] round_filter input=80 output=80
I1117 14:30:07.535560 140271730762688 efficientnet_model.py:147] round_filter input=112 output=112
I1117 14:30:07.758821 140271730762688 efficientnet_model.py:147] round_filter input=112 output=112
I1117 14:30:07.758981 140271730762688 efficientnet_model.py:147] round_filter input=192 output=192
I1117 14:30:08.055815 14

I1117 14:30:11.291751 140271730762688 efficientnet_model.py:147] round_filter input=112 output=120
I1117 14:30:11.291911 140271730762688 efficientnet_model.py:147] round_filter input=192 output=208
I1117 14:30:11.660157 140271730762688 efficientnet_model.py:147] round_filter input=192 output=208
I1117 14:30:11.660293 140271730762688 efficientnet_model.py:147] round_filter input=320 output=352
I1117 14:30:11.804625 140271730762688 efficientnet_model.py:147] round_filter input=1280 output=1408
I1117 14:30:11.832585 140271730762688 efficientnet_model.py:458] Building model efficientnet with params ModelConfig(width_coefficient=1.1, depth_coefficient=1.2, resolution=260, dropout_rate=0.3, blocks=(BlockConfig(input_filters=32, output_filters=16, kernel_size=3, num_repeat=1, expand_ratio=1, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=16, output_filters=24, kernel_size=3, num_repeat=2, expand_ratio=6, strides=(2, 2), se_rati

I1117 14:30:16.538998 140271730762688 ssd_efficientnet_bifpn_feature_extractor.py:142] EfficientDet EfficientNet backbone version: efficientnet-b5
I1117 14:30:16.539172 140271730762688 ssd_efficientnet_bifpn_feature_extractor.py:144] EfficientDet BiFPN num filters: 288
I1117 14:30:16.539244 140271730762688 ssd_efficientnet_bifpn_feature_extractor.py:145] EfficientDet BiFPN num iterations: 7
I1117 14:30:16.540906 140271730762688 efficientnet_model.py:147] round_filter input=32 output=48
I1117 14:30:16.556143 140271730762688 efficientnet_model.py:147] round_filter input=32 output=48
I1117 14:30:16.556233 140271730762688 efficientnet_model.py:147] round_filter input=16 output=24
I1117 14:30:16.729480 140271730762688 efficientnet_model.py:147] round_filter input=16 output=24
I1117 14:30:16.729623 140271730762688 efficientnet_model.py:147] round_filter input=24 output=40
I1117 14:30:17.095922 140271730762688 efficientnet_model.py:147] round_filter input=24 output=40
I1117 14:30:17.096084 14

I1117 14:30:23.394538 140271730762688 efficientnet_model.py:147] round_filter input=16 output=32
I1117 14:30:23.394654 140271730762688 efficientnet_model.py:147] round_filter input=24 output=48
I1117 14:30:23.911428 140271730762688 efficientnet_model.py:147] round_filter input=24 output=48
I1117 14:30:23.911602 140271730762688 efficientnet_model.py:147] round_filter input=40 output=80
I1117 14:30:24.731386 140271730762688 efficientnet_model.py:147] round_filter input=40 output=80
I1117 14:30:24.731559 140271730762688 efficientnet_model.py:147] round_filter input=80 output=160
I1117 14:30:25.465126 140271730762688 efficientnet_model.py:147] round_filter input=80 output=160
I1117 14:30:25.465283 140271730762688 efficientnet_model.py:147] round_filter input=112 output=224
I1117 14:30:26.212736 140271730762688 efficientnet_model.py:147] round_filter input=112 output=224
I1117 14:30:26.212894 140271730762688 efficientnet_model.py:147] round_filter input=192 output=384
I1117 14:30:27.186359 

Change the current directory before moving on. This notebook switches directories often so that the paths it constructs resolve consistently.
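
Because the working directory is changed so often, a context manager that restores the previous directory can make each change easier to reason about. A minimal sketch, outside a notebook `%cd` cell; `working_directory` is a hypothetical helper, demonstrated on a throwaway folder:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, restoring it on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


# Demonstration on a throwaway directory.
start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        inside = os.getcwd()
after = os.getcwd()
print(after == start)
```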

# Import the data

Finally, remember to change the file names so that they match yours.

In [13]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training


In [14]:
import zipfile

if 'output_tfrecords' not in os.listdir("pictures"):
    with zipfile.ZipFile("./pictures/output_tfrecords.zip","r") as zip_ref:
        zip_ref.extractall("./pictures/")
else:
    pass

picture_files_directory = "/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/"

test_record_fname = os.path.join(picture_files_directory,"output_tfrecords/valid/merged_logos.tfrecord")
train_record_fname = os.path.join(picture_files_directory,"output_tfrecords/train/merged_logos.tfrecord")
label_map_pbtxt_fname = os.path.join(picture_files_directory, "output_tfrecords/train/logos_label_map.pbtxt")

print(train_record_fname,label_map_pbtxt_fname, sep="\n")

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/output_tfrecords/train/merged_logos.tfrecord
/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/output_tfrecords/train/logos_label_map.pbtxt
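
The extract-only-if-missing check from the cell above can be wrapped in a small function that reports whether extraction actually ran. This is a sketch under assumptions: `extract_once` is a hypothetical helper, and the archive is built on the fly with a made-up file so the example runs anywhere.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_once(archive, target_dir, marker):
    """Extract `archive` into `target_dir` unless `target_dir/marker` exists.

    Returns True when extraction ran, False when it was skipped.
    """
    if (Path(target_dir) / marker).exists():
        return False
    with zipfile.ZipFile(archive, "r") as zip_ref:
        zip_ref.extractall(target_dir)
    return True


# Demonstration with a tiny archive built on the fly (file names are made up).
with tempfile.TemporaryDirectory() as tmp:
    pictures = Path(tmp)
    archive = pictures / "output_tfrecords.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("output_tfrecords/train/example.txt", "placeholder")
    first_run = extract_once(archive, pictures, "output_tfrecords")
    second_run = extract_once(archive, pictures, "output_tfrecords")
print(first_run, second_run)
```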


# Configure Custom TensorFlow2 Object Detection Training Configuration




> In this section you can specify any model in the [TF2 OD model zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md) and set up your training configuration.



In [15]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training


In [16]:
!pip install prettyprinter

You should consider upgrading via the '/anaconda/envs/py38_default/bin/python -m pip install --upgrade pip' command.


In [17]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training


In [18]:
# Step 1: import model settings
# For each model, this file returns important info to actually use the model

from ModelSettings import Model_Setting
from prettyprinter import pprint

MODELS_CONFIG = Model_Setting()

pprint(MODELS_CONFIG)

{
    'efficientdet-d0': {
        'model_name': 'efficientdet_d0_coco17_tpu-32',
        'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',
        'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',
        'batch_size': 16
    },
    'faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8': {
        'model_name': 'faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8',
        'base_pipeline_file': 'faster_rcnn_resnet50_v1_640x640_coco17_tpu-8.config',
        'pretrained_checkpoint': 'faster_rcnn_inception_resnet_v2_640x640_coco17_tpu-8.tar.gz',
        'batch_size': 2
    },
    'ssd_resnet152_v1': {
        'model_name': 'ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8',
        'base_pipeline_file': 'ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.config',
        'pretrained_checkpoint': 'ssd_resnet152_v1_fpn_1024x1024_coco17_tpu-8.tar.gz',
        'batch_size': 2
    }
}


In [19]:
# Step 2: choose the model and extract relevant info

chosen_model = 'efficientdet-d0'

model_name = MODELS_CONFIG[chosen_model]['model_name']
pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']
base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']
batch_size = MODELS_CONFIG[chosen_model]['batch_size'] #if you can fit a large batch in memory, it may speed up your training

In [20]:
# The more steps, the longer the training. 
# Increase if your loss function is still decreasing and validation metrics are increasing. 
num_steps = 400000

#Perform evaluation after so many steps
num_eval_steps = 3000 

#### Deploy folder structure

Inside the "research" folder, we will create a "deploy" folder in which we will dump all the data related to the model used and its specific configuration. <br>
For this reason, the structure of the deploy folder is as follows:

Deploy:<br>
&nbsp;&nbsp;&nbsp;&nbsp;Model A:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Config 1<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;Config 2<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;...<br>

This means that, once we choose a model:
1. if there is no folder within "deploy" with the model name, then create it and create the Config 1 folder within the model folder.
2. if there is a folder with the name of the model, we need to check whether the configuration of the current model matches one already stored in that folder. If not, create a new Config folder.
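
The two rules above can be sketched as a pure function that touches no files, which makes the decision easy to test. `resolve_config_folder` is a hypothetical name, and unlike the notebook's own implementation it returns the first matching folder when several folders store the same values.

```python
def resolve_config_folder(existing, current):
    """Pick the config folder for `current`.

    `existing` maps folder names to their stored values, for example
    {"config_1": ["16", "400000", "9"]}. Returns (folder_name, is_new).
    """
    for folder, values in existing.items():
        if values == current:
            return folder, False  # same configuration: reuse the folder
    # New configuration: number it after the existing folders.
    return f"config_{len(existing) + 1}", True


stored = {"config_1": ["15", "40010", "9"], "config_2": ["13", "40020", "9"]}
print(resolve_config_folder(stored, ["13", "40020", "9"]))
print(resolve_config_folder(stored, ["16", "400000", "9"]))
```

An empty `existing` dictionary corresponds to rule 1, where the model has never been used and `config_1` is created.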

In [21]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research


In [22]:
# If it does not exist already, create the 'deploy' folder inside training/models/research

main_deploy_folder = '/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/deploy'

if "deploy" not in os.listdir(os.getcwd()):
    os.mkdir(main_deploy_folder)

In [23]:
import re

def extract_configs_for_model(chosen_model):
    
    # This is a dict with config folder names as keys and values of the config as values
    folder_to_values = dict()
    
    model_path = os.path.join(main_deploy_folder, chosen_model)
    
    for config_folder in os.listdir(model_path):
        if not config_folder == ".ipynb_checkpoints":
            print(config_folder)
            config_path = os.path.join(model_path, config_folder)

            config_file = os.path.join(config_path,r'pipeline_file.config')
            config_values = list()

            with open(config_file) as f:
                file = f.read()

                # Extract all values except the path of the data
                # This means that if we train the same config of a model on a different version of the data, this will overwrite the results
                # TODO: add path of the data as well?
                # TODO: add fine tune check points?
                config_values.append(re.search('batch_size: [0-9]+', file).group()[len('batch_size: '):])
                config_values.append(re.search('num_steps: [0-9]+', file).group()[len('num_steps: '):])
                config_values.append(re.search('num_classes: [0-9]+', file).group()[len('num_classes: '):])

            folder_to_values[config_folder] = config_values
        
    return folder_to_values
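
The three regex lookups in the function above can be folded into one loop with a capture group, which also gives a clear error when a key is missing instead of an `AttributeError` on `None`. A minimal sketch on a made-up config snippet; `read_config_values` is a hypothetical helper.

```python
import re


def read_config_values(config_text, keys=("batch_size", "num_steps", "num_classes")):
    """Return the first integer found for each key, as strings."""
    values = []
    for key in keys:
        match = re.search(rf"{key}: ([0-9]+)", config_text)
        if match is None:
            raise ValueError(f"'{key}' not found in config")
        values.append(match.group(1))
    return values


# Made-up snippet containing only the three fields of interest.
sample = "num_classes: 9\nbatch_size: 16\nnum_steps: 400000\n"
parsed = read_config_values(sample)
print(parsed)
```

As in the notebook, only the first occurrence of each key is read, so a later `batch_size` in the evaluation section of a real config is ignored.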

In [24]:
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
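
The helper above passes `max_num_classes=90`, which appears to leave out any label whose id is above 90. As a cross-check that does not depend on that limit, the ids can be counted directly in the label map text. This sketch assumes the usual `item { id: N name: '...' }` layout with one id per item; `count_label_map_ids` is a hypothetical helper and the sample labels are made up.

```python
import re


def count_label_map_ids(pbtxt_text):
    """Count distinct `id: N` entries in label map text."""
    ids = re.findall(r"^\s*id:\s*(\d+)", pbtxt_text, flags=re.MULTILINE)
    return len(set(ids))


# Made-up label map with two classes.
sample = """
item {
  id: 1
  name: 'logo_a'
}
item {
  id: 2
  name: 'logo_b'
}
"""
sample_count = count_label_map_ids(sample)
print(sample_count)
```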

In [25]:
num_classes = get_num_classes(label_map_pbtxt_fname)
print(num_classes)

9


In [26]:
current_config = [
    str(batch_size),
    str(num_steps),
    str(num_classes)
]

current_config

['16', '400000', '9']

In [27]:
"""model_name = 'ModelA'"""

"model_name = 'ModelA'"

In [28]:
#current_config = 

In [29]:
chosen_model

'efficientdet-d0'

In [30]:
# if model never used, then create folder for the model and for the current config, the latter inside the former

def update_repo_structure(chosen_model):
    
    model_folder = main_deploy_folder + '/' + chosen_model
    
    # TODO: it has to be folder, not file
    if chosen_model not in os.listdir(main_deploy_folder):
        # Case 1: model never used
        os.mkdir(model_folder)

        config_folder = model_folder + '/config_1'
        os.mkdir(config_folder)

        print('case1')
        print(config_folder)

    else:
        # Case 2: model already used

        list_configs = extract_configs_for_model(chosen_model)
        print(list_configs)

        if current_config in list(list_configs.values()):
            
            # Case A: this specific config has already been used for the model
            for key in list(list_configs.keys()):
                if list_configs[key] == current_config:
                    config_folder = key
                    print('case a')
                    print(config_folder)

        else:
            # Case B: new configs
            config_folder = model_folder + f'/config_{len(list_configs)+1}'
            os.mkdir(config_folder)
            print('case b')
            print(config_folder)
            
    return config_folder

In [31]:
# Obtain the proper config folder to use in the next cells 

config_subfolder = update_repo_structure(chosen_model)

config_1
config_6
config_3
config_4
config_2
config_5
{'config_1': ['15', '40010', '9'], 'config_6': ['16', '400000', '9'], 'config_3': ['15', '40010', '9'], 'config_4': ['16', '40080', '9'], 'config_2': ['13', '40020', '9'], 'config_5': ['16', '40100', '9']}
case a
config_6


In [32]:
config_folder = os.path.join(os.path.join(main_deploy_folder, chosen_model),config_subfolder)
config_folder

'/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/deploy/efficientdet-d0/config_6'

In [33]:
# Step 3.a: using info from step 2, download the weights of the model

import tarfile
import requests

download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint

file_to_be_opened = os.path.join(config_folder, pretrained_checkpoint)

# Download the tar.gz archive
response = requests.get(download_tar, stream=True)
if response.status_code == 200:
    with open(file_to_be_opened, 'wb') as f:
        f.write(response.raw.read())

# Extract the archive into the config folder
tar = tarfile.open(file_to_be_opened)
tar.extractall(path=config_folder)
tar.close()

# TODO: once the tar has been extracted, delete the tar file
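
One way to handle the TODO above is to delete the archive right after a successful extraction. A minimal sketch, with `extract_and_remove` as a hypothetical helper and a tiny made-up archive so the example runs without a download:

```python
import os
import tarfile
import tempfile


def extract_and_remove(archive_path, destination):
    """Extract a tar archive into `destination`, then delete the archive."""
    with tarfile.open(archive_path) as tar:
        tar.extractall(path=destination)
    # Only reached when extraction did not raise.
    os.remove(archive_path)


# Demonstration with a tiny archive built on the fly (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    payload = os.path.join(tmp, "ckpt-0.index")
    with open(payload, "w") as f:
        f.write("placeholder")
    archive = os.path.join(tmp, "model.tar.gz")
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(payload, arcname="model/checkpoint/ckpt-0.index")
    extract_and_remove(archive, tmp)
    archive_gone = not os.path.exists(archive)
    extracted = os.path.exists(
        os.path.join(tmp, "model", "checkpoint", "ckpt-0.index")
    )
print(archive_gone, extracted)
```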

In [34]:
# Step 3.b: using info from step 2, download base training configuration file

download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file

abrir = os.path.join(config_folder, base_pipeline_file)

response = requests.get(download_config, stream=True)
if response.status_code == 200:
    with open(abrir, 'wb') as f:
        f.write(response.content)

In [35]:
pipeline_fname = os.path.join(config_folder, base_pipeline_file)
print(pipeline_fname)

fine_tune_checkpoint = os.path.join(config_folder, model_name,"checkpoint", "ckpt-0")
print(fine_tune_checkpoint)

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/deploy/efficientdet-d0/config_6/ssd_efficientdet_d0_512x512_coco17_tpu-8.config
/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/deploy/efficientdet-d0/config_6/efficientdet_d0_coco17_tpu-32/checkpoint/ckpt-0


In [36]:
#write custom configuration file by slotting our dataset, model checkpoint, and training parameters into the base pipeline file

import re

print('writing custom configuration file')

with open(pipeline_fname) as f:
    s = f.read()

with open(os.path.join(config_folder, r'pipeline_file.config'), 'w') as f:
    
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"',
               f'fine_tune_checkpoint: "{fine_tune_checkpoint}"', s)
    
    logging.info("Written fine tune checkpoint")
    
    # tfrecord files train and test.
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/train)(.*?")', f'input_path: "{train_record_fname}"', s)
    s = re.sub(
        '(input_path: ".*?)(PATH_TO_BE_CONFIGURED/val)(.*?")', f'input_path: "{test_record_fname}"', s)
    
    logging.info("Written input path")

    # label_map_path
    s = re.sub(
        'label_map_path: ".*?"', f'label_map_path: "{label_map_pbtxt_fname}"', s)
    
    logging.info("Written label map")

    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+',
               f'batch_size: {batch_size}', s)

    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               f'num_steps: {num_steps}', s)
    
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+',
               f'num_classes: {num_classes}', s)
    
    # Set the base learning rate, learning_rate_base.
    s = re.sub('learning_rate_base: [a-z.0-9-]+',
               'learning_rate_base: 0.08', s)
    
    # Set the warmup learning rate, warmup_learning_rate.
    s = re.sub('warmup_learning_rate: [a-z.0-9-]+',
               'warmup_learning_rate: 0.001', s)
    
    #fine-tune checkpoint type
    s = re.sub(
        'fine_tune_checkpoint_type: "classification"', 
        'fine_tune_checkpoint_type: "{}"'.format('detection'), s)
    
    f.write(s)

INFO:root:Written fine tune checkpoint
INFO:root:Written input path
INFO:root:Written label map


writing custom configuration file
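
The log messages in the cell above are printed whether or not a pattern actually matched, so a typo in a pattern would go unnoticed. A possible safeguard is `re.subn`, which also returns the number of replacements. In this sketch `set_field` is a hypothetical helper and the sample config and Windows-style path are made up. Passing the replacement as a function keeps backslashes in paths literal.

```python
import re


def set_field(config_text, pattern, replacement):
    """Replace every match of `pattern`; fail loudly if nothing matched."""
    # A function replacement keeps backslashes in `replacement` literal.
    new_text, count = re.subn(pattern, lambda _match: replacement, config_text)
    if count == 0:
        raise ValueError(f"pattern not found: {pattern}")
    return new_text


# Made-up config fragment.
sample = (
    "num_classes: 90\n"
    "batch_size: 128\n"
    'fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"\n'
)
updated = set_field(sample, r"num_classes: [0-9]+", "num_classes: 9")
updated = set_field(
    updated,
    r'fine_tune_checkpoint: ".*?"',
    r'fine_tune_checkpoint: "C:\models\ckpt-0"',
)
print(updated)
```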


In [37]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training


In [38]:
pipeline_file = os.path.join(config_folder, 'pipeline_file.config')
pipeline_file

'/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/deploy/efficientdet-d0/config_6/pipeline_file.config'

In [39]:
if "TENSOR_RESULTS" not in os.listdir(os.getcwd()):
    os.mkdir(os.path.join(os.getcwd(),"TENSOR_RESULTS"))
    logging.info("Creating the directory TENSOR_RESULTS because it did not exist") 
else:
    logging.info("The directory TENSOR_RESULTS is already present, files will be stored there") 

INFO:root:The directory TENSOR_RESULTS is already present, files will be stored there


In [40]:
tensor_results_directory = '/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS'

model_run_directory = os.path.join('/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS',
                                  chosen_model)

if chosen_model not in os.listdir(tensor_results_directory):
    try:
        os.mkdir(os.path.join(tensor_results_directory, chosen_model))
        logging.info(f"The folder model_run_directory is set to be: \n {model_run_directory}")
    except FileExistsError:
        logging.info(f"FILEEXISTSERROR: The folder model_run_directory is set to be: \n {model_run_directory}")
else:
    logging.info(f"The folder model_run_directory WAS ALREADY PRESENT and is set to be: \n {model_run_directory}")

model_dir = os.path.join(model_run_directory, config_subfolder.split("/")[-1])

if config_subfolder.split("/")[-1] not in os.listdir(model_run_directory):
    try:
        os.mkdir(model_dir)
        logging.info(f"The folder model_dir is set to be: \n {model_dir}")
    except FileExistsError:
        logging.info(f"FILEEXISTSERROR: The folder model_dir WAS ALREADY PRESENT and is set to be: \n {model_dir}")
else:
    logging.info(f"The folder model_dir WAS ALREADY PRESENT and is set to be: \n {model_dir}")

INFO:root:The folder model_run_directory WAS ALREADY PRESENT and is set to be: 
 /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS/efficientdet-d0
INFO:root:The folder model_dir WAS ALREADY PRESENT and is set to be: 
 /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS/efficientdet-d0/config_6
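
The existence checks and `try`/`except FileExistsError` blocks in the cell above could be replaced with a single `os.makedirs(..., exist_ok=True)` call. A minimal sketch, with `ensure_run_directory` as a hypothetical helper. It takes the base name of the config folder, so it accepts either a bare folder name or a full path.

```python
import os
import tempfile


def ensure_run_directory(results_root, model, config):
    """Create results_root/model/config if needed and return its path."""
    run_dir = os.path.join(results_root, model, os.path.basename(config))
    os.makedirs(run_dir, exist_ok=True)  # no error when it already exists
    return run_dir


# Demonstration in a throwaway folder standing in for TENSOR_RESULTS.
with tempfile.TemporaryDirectory() as tmp:
    first = ensure_run_directory(tmp, "efficientdet-d0", "config_6")
    second = ensure_run_directory(tmp, "efficientdet-d0", "/some/abs/path/config_6")
    created = os.path.isdir(first)
print(first == second, created)
```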


# Train Custom TF2 Object Detector

* pipeline_file: defined above in writing custom training configuration
* model_dir: the location where TensorBoard logs and model checkpoints are saved
* num_train_steps: how long to train for
* num_eval_steps: perform eval on validation set after this many steps
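
As a sketch of how these parameters reach the training script, the command can be assembled as an argument list. The flag names are the ones used in the training cell of this notebook. `build_train_command` and the placeholder paths are hypothetical.

```python
import shlex


def build_train_command(script, pipeline_file, model_dir, num_steps, num_eval_steps):
    """Assemble the argument list for model_main_tf2.py."""
    return [
        "python", "-u", script,
        f"--pipeline_config_path={pipeline_file}",
        f"--model_dir={model_dir}",
        "--alsologtostderr",
        f"--num_train_steps={num_steps}",
        "--sample_1_of_n_eval_examples=1",
        f"--num_eval_steps={num_eval_steps}",
    ]


# Placeholder paths; substitute the values computed earlier in the notebook.
command = build_train_command(
    "object_detection/model_main_tf2.py",
    "deploy/efficientdet-d0/config_6/pipeline_file.config",
    "TENSOR_RESULTS/efficientdet-d0/config_6",
    400000,
    3000,
)
print(shlex.join(command))
```

The list can be passed to `subprocess.run(command, check=True)` when the training is launched outside a notebook cell.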







In [41]:
print("PIPELINE FILE: " + str(pipeline_file), 
      "MODEL DIRECTORY: " + str(model_dir), 
      "NUMBER OF STEPS: " + str(num_steps), 
      "NUMBER OF EVALUATION STEPS: " + str(num_eval_steps), 
      sep="\n\n")

PIPELINE FILE: /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/deploy/efficientdet-d0/config_6/pipeline_file.config

MODEL DIRECTORY: /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS/efficientdet-d0/config_6

NUMBER OF STEPS: 400000

NUMBER OF EVALUATION STEPS: 3000


In [42]:
!pip install --upgrade numpy

Collecting numpy
  Using cached numpy-1.21.4-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
Installing collected packages: numpy
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.5
    Uninstalling numpy-1.19.5:
      Successfully uninstalled numpy-1.19.5
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fbprophet 0.7.1 requires cmdstanpy==0.9.5, which is not installed.
fbprophet 0.7.1 requires setuptools-git>=1.2, which is not installed.
tensorflow 2.6.1 requires h5py~=3.1.0, but you have h5py 2.9.0 which is incompatible.
tensorflow 2.6.1 requires numpy~=1.19.2, but you have numpy 1.21.4 which is incompatible.
tensorflow-gpu 2.5.0 requires grpcio~=1.34.0, but you have grpcio 1.41.1 which is incompatible.
tensorflow-gpu 2.5.0 requires h5py~=3.1.0, but you have h5py 2.9.0 which is incompatible.
tensorflow-gpu 2.5.0 

In [43]:
gpus = tf.config.experimental.list_physical_devices('GPU')
# Enable memory growth on every visible GPU; nothing happens when none is found
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)

In [44]:
!python -u /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --sample_1_of_n_eval_examples=1 \
    --num_eval_steps={num_eval_steps} 2>&1 | sed -e "/nan/q9";echo $? > exitcode

2021-11-17 14:30:58.524391: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-17 14:30:59.005794: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11567 MB memory:  -> device: 0, name: Tesla K80, pci bus id: 0001:00:00.0, compute capability: 3.7
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
I1117 14:30:59.142397 140180763812800 mirrored_strategy.py:369] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0',)
INFO:tensorflow:Maybe overwriting train_steps: 400000
I1117 14:30:59.146613 140180763812800 config_util.py:552] Maybe overwriting train_steps: 400000
INFO:te

INFO:tensorflow:Reading unweighted datasets: ['/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/output_tfrecords/train/merged_logos.tfrecord']
I1117 14:31:01.349594 140180763812800 dataset_builder.py:163] Reading unweighted datasets: ['/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/output_tfrecords/train/merged_logos.tfrecord']
INFO:tensorflow:Reading record datasets for input file: ['/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/output_tfrecords/train/merged_logos.tfrecord']
I1117 14:31:01.349800 140180763812800 dataset_builder.py:80] Reading record datasets for input file: ['/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/pictures/output_tfrecords/train/merged_logos.tfrecord']
INFO:tensorflow:Number of filenames to read: 1
I1117 14:31:01.349889 140180763812800 dataset_builder.py:81] Number of filenames to read: 1
W1117 14:31:01.349958 140180763812800 da

INFO:tensorflow:Step 36400 per-step time 2.752s
I1117 14:53:10.615199 140180763812800 model_lib_v2.py:698] Step 36400 per-step time 2.752s
INFO:tensorflow:{'Loss/classification_loss': 0.27686253,
 'Loss/localization_loss': 0.14377001,
 'Loss/regularization_loss': 0.071173504,
 'Loss/total_loss': 0.49180606,
 'learning_rate': 0.077464215}
I1117 14:53:10.615471 140180763812800 model_lib_v2.py:701] {'Loss/classification_loss': 0.27686253,
 'Loss/localization_loss': 0.14377001,
 'Loss/regularization_loss': 0.071173504,
 'Loss/total_loss': 0.49180606,
 'learning_rate': 0.077464215}
INFO:tensorflow:Step 36500 per-step time 2.770s
I1117 14:57:47.622115 140180763812800 model_lib_v2.py:698] Step 36500 per-step time 2.770s
INFO:tensorflow:{'Loss/classification_loss': 0.27966288,
 'Loss/localization_loss': 0.24064836,
 'Loss/regularization_loss': 0.07121254,
 'Loss/total_loss': 0.59152377,
 'learning_rate': 0.077449396}
I1117 14:57:47.622406 140180763812800 model_lib_v2.py:701] {'Loss/classificat

INFO:tensorflow:Step 37900 per-step time 2.767s
I1117 16:02:22.294371 140180763812800 model_lib_v2.py:698] Step 37900 per-step time 2.767s
INFO:tensorflow:{'Loss/classification_loss': 0.25367445,
 'Loss/localization_loss': 0.17999874,
 'Loss/regularization_loss': 0.071216345,
 'Loss/total_loss': 0.50488955,
 'learning_rate': 0.07723752}
I1117 16:02:22.294661 140180763812800 model_lib_v2.py:701] {'Loss/classification_loss': 0.25367445,
 'Loss/localization_loss': 0.17999874,
 'Loss/regularization_loss': 0.071216345,
 'Loss/total_loss': 0.50488955,
 'learning_rate': 0.07723752}
INFO:tensorflow:Step 38000 per-step time 2.775s
I1117 16:06:59.779243 140180763812800 model_lib_v2.py:698] Step 38000 per-step time 2.775s
INFO:tensorflow:{'Loss/classification_loss': 0.30014488,
 'Loss/localization_loss': 0.13751608,
 'Loss/regularization_loss': 0.071184434,
 'Loss/total_loss': 0.5088454,
 'learning_rate': 0.07722207}
I1117 16:06:59.779517 140180763812800 model_lib_v2.py:701] {'Loss/classification
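
The training cell pipes its log through `sed -e "/nan/q9"` and writes the resulting exit status to a file named `exitcode`. With GNU sed, a status of 9 therefore means that a line containing `nan` appeared and the run was cut short. Below is a minimal sketch for checking that file afterwards, assuming it sits in the current working directory; the helper name `training_hit_nan` is hypothetical.

```python
from pathlib import Path

# Exit status that `sed -e "/nan/q9"` returns when a log line contains "nan"
NAN_EXIT_CODE = 9


def training_hit_nan(exitcode_path="exitcode"):
    """Return True if the recorded exit status equals the sed sentinel value."""
    recorded = Path(exitcode_path).read_text().strip()
    return int(recorded) == NAN_EXIT_CODE


# Only inspect the file if the training cell has already produced it
if Path("exitcode").exists():
    if training_hit_nan():
        print("Training stopped early: a log line containing 'nan' was found.")
    else:
        print("No 'nan' line was seen in the training log.")
```

Because the pattern is a plain substring match, any log line that happens to contain `nan` (not only a NaN loss value) would also trigger the early exit.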

In [45]:
#run model evaluation to obtain performance metrics

!python /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path={pipeline_file} \
    --model_dir={model_dir} \
    --checkpoint_dir={model_dir}

#Not yet implemented for EfficientDet

W1117 16:57:55.216971 139901781500864 model_lib_v2.py:1081] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: None
I1117 16:57:55.217181 139901781500864 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I1117 16:57:55.217309 139901781500864 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I1117 16:57:55.217396 139901781500864 config_util.py:552] Maybe overwriting eval_num_epochs: 1
W1117 16:57:55.217512 139901781500864 model_lib_v2.py:1099] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
2021-11-17 16:57:55.221450: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions

Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W1117 16:58:01.523155 139901781500864 deprecation.py:339] From /anaconda/envs/py38_default/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:206: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
Instructions for updating:
Use `tf.cast` instead.
W1117 16:58:02.729033 139901781500864 deprecation.py:339] From /anaconda/envs/py38_default/lib/python3.8/site-packages/tensorflow/python/autograph/impl/api.py:464: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Waiting for new checkpoint at /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS/efficientdet-d0/config_6
I1117 

I1117 17:09:18.988135 139901781500864 model_lib_v2.py:1010] 	+ DetectionBoxes_Recall/AR@1: 0.627302
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@10: 0.688869
I1117 17:09:18.989261 139901781500864 model_lib_v2.py:1010] 	+ DetectionBoxes_Recall/AR@10: 0.688869
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100: 0.692615
I1117 17:09:18.990267 139901781500864 model_lib_v2.py:1010] 	+ DetectionBoxes_Recall/AR@100: 0.692615
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100 (small): 0.464991
I1117 17:09:18.991269 139901781500864 model_lib_v2.py:1010] 	+ DetectionBoxes_Recall/AR@100 (small): 0.464991
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100 (medium): 0.757221
I1117 17:09:18.992248 139901781500864 model_lib_v2.py:1010] 	+ DetectionBoxes_Recall/AR@100 (medium): 0.757221
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100 (large): 0.707454
I1117 17:09:18.993275 139901781500864 model_lib_v2.py:1010] 	+ DetectionBoxes_Recall/AR@100 (large): 0.707454
INFO:tensorflow:	+ Loss/localization_loss: 0.081886
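
The evaluation log prints each metric twice (once with the `INFO:tensorflow:` prefix and once with the timestamped prefix), which makes the numbers hard to compare across runs. Below is a minimal sketch for collecting lines of the form `+ name: value` into a dictionary. The helper `parse_eval_metrics` and its regular expression are hypothetical and assume plain decimal values, as in the log above.

```python
import re

# Matches "+ <metric name>: <decimal value>" at the end of a log line
METRIC_LINE = re.compile(r"\+ (?P<name>.+?): (?P<value>[-+]?\d+(?:\.\d+)?)\s*$")


def parse_eval_metrics(log_text):
    """Collect evaluation metrics from log text; duplicated lines collapse to one key."""
    metrics = {}
    for line in log_text.splitlines():
        match = METRIC_LINE.search(line)
        if match:
            metrics[match.group("name")] = float(match.group("value"))
    return metrics


# Sample lines copied from the evaluation output above
sample_log = (
    "INFO:tensorflow:\t+ DetectionBoxes_Recall/AR@10: 0.688869\n"
    "I1117 17:09:18.989261 139901781500864 model_lib_v2.py:1010] "
    "\t+ DetectionBoxes_Recall/AR@10: 0.688869\n"
    "INFO:tensorflow:\t+ DetectionBoxes_Recall/AR@100 (small): 0.464991\n"
)

for name, value in parse_eval_metrics(sample_log).items():
    print(f"{name}: {value}")
```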


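# Run TensorBoard
To pass a Python variable such as the log directory to the TensorBoard magic command, use IPython's variable expansion (`$variable` or `{variable}`), as described here: https://stackoverflow.com/questions/14409167/how-to-pass-a-variable-to-magic-%C2%B4run%C2%B4-function-in-ipython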

In [46]:
current_training_directory = os.path.join(model_dir, "train")
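
A minimal sketch of launching TensorBoard on the training log directory from Python, assuming the `tensorboard` notebook extension is installed in the kernel. `get_ipython()` and `run_line_magic` are part of IPython; the helper `tensorboard_magic_args` and the example path are hypothetical. Outside of IPython the block only prints the argument string.

```python
import os
import shlex


def tensorboard_magic_args(logdir):
    """Build the argument string for the %tensorboard line magic."""
    # Quoting is a precaution for paths that contain spaces
    return f"--logdir {shlex.quote(logdir)}"


# Hypothetical placeholder; in this notebook it would be current_training_directory
example_logdir = os.path.join("TENSOR_RESULTS", "efficientdet-d0", "config_6", "train")
magic_args = tensorboard_magic_args(example_logdir)
print(magic_args)

try:
    ipython = get_ipython()  # only defined inside IPython/Jupyter
except NameError:
    ipython = None

if ipython is not None:
    ipython.run_line_magic("load_ext", "tensorboard")
    ipython.run_line_magic("tensorboard", magic_args)
```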

## Exporting a Trained Inference Graph
The cells below export the trained checkpoint in `model_dir` to a SavedModel with the `exporter_main_v2.py` script from the Object Detection API.

In [47]:
#see where our model saved weights
%ls $model_dir

checkpoint                   ckpt-38.data-00000-of-00001
ckpt-34.data-00000-of-00001  ckpt-38.index
ckpt-34.index                ckpt-39.data-00000-of-00001
ckpt-35.data-00000-of-00001  ckpt-39.index
ckpt-35.index                ckpt-40.data-00000-of-00001
ckpt-36.data-00000-of-00001  ckpt-40.index
ckpt-36.index                eval/
ckpt-37.data-00000-of-00001  train/
ckpt-37.index
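
The listing shows several numbered checkpoints. Within TensorFlow, `tf.train.latest_checkpoint(model_dir)` returns the most recent one by reading the `checkpoint` file. As a dependency-free illustration of the same idea, the sketch below picks the highest-numbered `ckpt-N.index` entry from a list of file names; the helper `latest_checkpoint_prefix` is hypothetical.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the 'ckpt-N' prefix with the largest N among '*.index' files, or None."""
    numbers = []
    for name in filenames:
        match = re.fullmatch(r"ckpt-(\d+)\.index", name)
        if match:
            numbers.append(int(match.group(1)))
    return f"ckpt-{max(numbers)}" if numbers else None


# File names copied from the directory listing above
listing = [
    "checkpoint", "ckpt-34.index", "ckpt-35.index", "ckpt-36.index",
    "ckpt-37.index", "ckpt-38.index", "ckpt-39.index", "ckpt-40.index",
    "ckpt-40.data-00000-of-00001", "eval", "train",
]
print(latest_checkpoint_prefix(listing))
```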


In [48]:
%cd /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/

/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training


In [49]:
model_dir

'/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS/efficientdet-d0/config_6'

In [50]:
#prepare the output directories for the conversion script
import re
import numpy as np

fine_tuned_directory = '/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/FINE_TUNED_MODEL'

if not os.path.isdir(fine_tuned_directory):
    os.mkdir(fine_tuned_directory)
    logging.info("Creating the directory FINE_TUNED_MODEL because it did not exist")
else:
    logging.info("The directory FINE_TUNED_MODEL is already present, files will be stored there")
    
model_fine_tuned_directory = os.path.join(fine_tuned_directory, chosen_model)

if chosen_model not in os.listdir(fine_tuned_directory):
    try:
        os.mkdir(model_fine_tuned_directory)
        logging.info(f"The folder model_fine_tuned_directory is set to be: \n {model_fine_tuned_directory}")
    except FileExistsError:
        logging.info(f"FILEEXISTSERROR: The folder model_fine_tuned_directory is set to be: \n {model_fine_tuned_directory}")
else:
    logging.info(f"The folder model_run_directory WAS ALREADY PRESENT and is set to be: \n {model_fine_tuned_directory}")

output_directory = os.path.join(model_fine_tuned_directory, config_subfolder.split("/")[-1])

if config_subfolder.split("/")[-1] not in os.listdir(model_fine_tuned_directory):
    try:
        os.mkdir(output_directory)
        logging.info(f"The folder output_directory is set to be: \n {output_directory}")
    except FileExistsError:
        logging.info(f"FILEEXISTSERROR: The folder output_directory WAS ALREADY PRESENT and is set to be: \n {output_directory}")

#place the model weights you would like to export here
last_model_path = model_dir
print(last_model_path)

INFO:root:The directory FINE_TUNED_MODEL is already present, files will be stored there
INFO:root:The folder model_run_directory WAS ALREADY PRESENT and is set to be: 
 /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/FINE_TUNED_MODEL/efficientdet-d0
INFO:root:The folder output_directory is set to be: 
 /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/FINE_TUNED_MODEL/efficientdet-d0/config_6


/home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/TENSOR_RESULTS/efficientdet-d0/config_6
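
The three nested directory checks in the cell above could be collapsed into a single call to `os.makedirs` with `exist_ok=True`, which creates any missing parent folders and does not fail when the folder already exists. This is a minimal sketch of that alternative, not the notebook's own code; the helper `ensure_output_directory` is hypothetical and the demonstration runs in a temporary directory.

```python
import os
import tempfile


def ensure_output_directory(root, model_name, config_name):
    """Create root/model_name/config_name if needed and return its path."""
    path = os.path.join(root, model_name, config_name)
    # exist_ok=True makes the call safe to repeat on later runs
    os.makedirs(path, exist_ok=True)
    return path


# Demonstration in a throwaway directory rather than the real project tree
with tempfile.TemporaryDirectory() as root:
    first = ensure_output_directory(root, "efficientdet-d0", "config_6")
    second = ensure_output_directory(root, "efficientdet-d0", "config_6")
    print(os.path.isdir(first), first == second)
```

In the notebook, the arguments would be `fine_tuned_directory`, `chosen_model`, and the last component of `config_subfolder`.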


In [51]:
!python /home/labuser/LogoDet/LogoDetection_DSBAProject/training_process/training/models/research/object_detection/exporter_main_v2.py \
    --trained_checkpoint_dir {last_model_path} \
    --output_directory {output_directory} \
    --pipeline_config_path {pipeline_file}

2021-11-17 17:11:08.043315: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-17 17:11:08.544212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 11567 MB memory:  -> device: 0, name: Tesla K80, pci bus id: 0001:00:00.0, compute capability: 3.7
I1117 17:11:08.682923 140082794759104 ssd_efficientnet_bifpn_feature_extractor.py:142] EfficientDet EfficientNet backbone version: efficientnet-b0
I1117 17:11:08.683123 140082794759104 ssd_efficientnet_bifpn_feature_extractor.py:144] EfficientDet BiFPN num filters: 64
I1117 17:11:08.683199 140082794759104 ssd_efficientnet_bifpn_feature_extractor.py:145] EfficientDet BiFPN num iterations: 3
I1117 17:11:08.6

In [52]:
saved_model_directory = os.path.join(output_directory, "saved_model")

In [53]:
%ls $saved_model_directory

[0m[01;34massets[0m/  saved_model.pb  [01;34mvariables[0m/
