# **Roadmap**

1. Install the TensorFlow Object Detection API
2. Set up the folder structure
3. Generate the TFRecord files required for training
4. Edit the model pipeline config file and download the pre-trained model checkpoint
5. Train and evaluate the model

⚠️ This notebook is meant to be run in Google Colab so that training can use a GPU.

In [None]:
# Check GPU setup

gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Not connected to a GPU')
else:
  print(gpu_info)

Thu Sep 29 16:18:43 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   40C    P0    27W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
# Check RAM setup

from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

Your runtime has 27.3 gigabytes of available RAM

You are using a high-RAM runtime!


# **Import libraries**

In [2]:
!pip install fiftyone

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting fiftyone
  Downloading fiftyone-0.17.2-py3-none-any.whl (2.2 MB)
Collecting sseclient-py<2,>=1.7.2
  Downloading sseclient_py-1.7.2-py2.py3-none-any.whl (8.4 kB)
Collecting universal-analytics-python3<2,>=1.0.1
  Downloading universal_analytics_python3-1.1.1-py3-none-any.whl (10 kB)
Collecting fiftyone-brain<0.10,>=0.9.1
  Downloading fiftyone_brain-0.9.1-py3-none-any.whl (47 kB)
Collecting aiofiles
  Downloading aiofiles-22.1.0-py3-none-any.whl (14 kB)
Collecting retrying
  Downloading retrying-1.3.3.tar.gz (10 kB)
Collecting mongoengine==0.20.0
  Downloading mongoengine-0.20.0-py3-none-any.whl (106 kB)
Collecting kaleido
  Downloading kaleido-0.2.1-py2.py3-none-manylinux1_x86_64.whl (79.

In [3]:
import os
import glob
import xml.etree.ElementTree as ET
import pandas as pd
import json
import tensorflow as tf
import shutil
import fiftyone
print(tf.__version__)

Migrating database to v0.17.2


INFO:fiftyone.migrations.runner:Migrating database to v0.17.2


2.8.2


# 1. Create customTF2, training and data folders in your Google Drive

Create a folder named ***customTF2***.

Create a folder named ***training*** inside the ***customTF2*** folder.
The ***training*** folder is where the checkpoints are saved during training.

Create another folder named ***data*** inside the ***customTF2*** folder.
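
The same layout can also be created from code once the drive is mounted. The snippet below is a minimal sketch: `create_project_folders` is a hypothetical helper (not part of the original notebook), and the root path you pass in is an assumption that depends on where your Drive folder ends up.

```python
from pathlib import Path


def create_project_folders(drive_root):
    """Create customTF2/ with training/ and data/ inside drive_root.

    Hypothetical helper: drive_root is whatever folder should hold customTF2.
    """
    base = Path(drive_root) / "customTF2"
    created = []
    for sub in ("training", "data"):
        folder = base / sub
        # parents=True also creates customTF2 itself; exist_ok makes re-runs safe
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, `create_project_folders('/MacBook')` would build the folders under the linked path used in the next step, assuming that link exists.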

# 2. Mount drive and link your folder

In [4]:
from google.colab import drive
drive.mount('/content/gdrive')

# Create a symbolic link so that /content/gdrive/Othercomputers/My_MacBook_Pro_Machine_Learning/ is reachable as /MacBook
!ln -s /content/gdrive/Othercomputers/My_MacBook_Pro_Machine_Learning/ /MacBook
!ls /MacBook

Mounted at /content/gdrive
customTF2  Final_Dataset_Test  Final_Dataset_Train  notebooks
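
Re-running the cell above when the link already exists may fail or create a nested link. A small Python alternative that is safe to re-run could look like the sketch below; `ensure_symlink` is a hypothetical helper name, not something the notebook defines.

```python
import os


def ensure_symlink(target, link_path):
    """Point link_path at target, replacing a stale symlink if one exists."""
    if os.path.islink(link_path):
        if os.readlink(link_path) == target:
            return link_path  # already points at the right place
        os.unlink(link_path)  # stale link: remove it and recreate below
    elif os.path.exists(link_path):
        # Refuse to overwrite a real file or folder
        raise FileExistsError(f"{link_path} exists and is not a symlink")
    os.symlink(target, link_path)
    return link_path
```

In this notebook the call would be `ensure_symlink('/content/gdrive/Othercomputers/My_MacBook_Pro_Machine_Learning/', '/MacBook')`.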


# 3. Clone the TensorFlow models git repository & install the TensorFlow Object Detection API

In [11]:
# Clone the TensorFlow models repository onto the Colab VM
!git clone --quiet https://github.com/tensorflow/models.git

# Navigate to the models/research folder to compile the protos
%cd models/research

# Compile protos.
!protoc object_detection/protos/*.proto --python_out=.

# Install TensorFlow Object Detection API.
!cp object_detection/packages/tf2/setup.py .
!python -m pip install .

/content/models/research
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Processing /content/models/research
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
Collecting avro-python3
  Downloading avro-python3-1.10.2.tar.gz (38 kB)
Collecting apache-beam
  Downloading apache_beam-2.41.0-cp37-cp37m-manylinux2010_x86_64.whl (10.9 MB)
Collecting tf-slim
  Downloading tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
Collecting lvis
  Downloading lvis-0.5.3-py3-none-any.whl (14

In [12]:
# Testing the model builder
!python object_detection/builders/model_builder_tf2_test.py

2022-09-30 10:54:37.045956: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-30 10:54:37.915007: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.7/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2022-09-30 10:54:37.915976: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.7/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
Running tests under Python 3.7.14: /usr/bin/python3
[ RUN      ] ModelBuilderTF2Test.test_create_center_net_deepmac
2022-09-30 10:54:41.157889: W tens

# 4. Train / Test split

We use the cocosplit.py script from https://github.com/akarazniewicz/cocosplit to generate a train/test split from the COCO annotations.

In [None]:
# Split the annotations 80% train / 20% test, keeping only images that have annotations

!python cocosplit.py --having-annotations -s 0.8 '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/final_annotations_coco_V6.json' '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/train.json' '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/test.json'
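
To make it clearer what such a split does to a COCO file, here is a minimal sketch of the idea. It is not the cocosplit code itself: `split_coco` and its parameters are hypothetical, and it works on an already loaded annotation dictionary rather than on files.

```python
import random


def split_coco(coco, train_fraction=0.8, having_annotations=True, seed=0):
    """Split a COCO dict into (train, test) dicts by image.

    Hypothetical illustration of the idea, not the cocosplit implementation.
    """
    images = list(coco["images"])
    if having_annotations:
        # Drop images that no annotation refers to
        annotated_ids = {ann["image_id"] for ann in coco["annotations"]}
        images = [img for img in images if img["id"] in annotated_ids]

    # Shuffle with a fixed seed so the split is reproducible
    random.Random(seed).shuffle(images)
    cut = int(len(images) * train_fraction)

    def subset(selected):
        ids = {img["id"] for img in selected}
        return {
            "images": selected,
            # Keep only the annotations that belong to the selected images
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
            "categories": coco["categories"],
        }

    return subset(images[:cut]), subset(images[cut:])
```

The important property is that the split is made per image, so all annotations of one image end up in the same set, and both sets share the full category list.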

We then copy the images into separate train and test folders according to the split.

In [None]:
# Copy every image listed in a COCO JSON file into <path_to_new_split_data>/<name_of_split>/

def generate_train_test_folders(json_file, path_to_original_data, path_to_new_split_data, name_of_split):
    with open(json_file) as f:
        json_annotations = json.load(f)

    for image in json_annotations['images']:
        path_image = image['file_name']
        destination = f"{path_to_new_split_data}/{name_of_split}/{path_image}"
        # file_name may contain sub-folders, so create them before copying
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        # copy2 also preserves file metadata such as timestamps
        shutil.copy2(f"{path_to_original_data}/{path_image}", destination)

    return f"Completed transfer of {name_of_split}"


In [None]:
# Test folder
generate_train_test_folders('/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/test.json', '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset', '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset_Split', 'test')

'Completed transfer of test'

In [None]:
# Train folder
generate_train_test_folders('/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/train.json', '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset', '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset_Split', 'train')

'Completed transfer of train'
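
After copying, it can be worth checking that every image referenced in a split file really exists in its folder. The following is a minimal sketch; `find_missing_images` is a hypothetical helper that is not part of the original notebook.

```python
import json
import os


def find_missing_images(json_file, split_dir):
    """Return the file names listed in json_file that are absent from split_dir."""
    with open(json_file) as f:
        annotations = json.load(f)
    return [
        image["file_name"]
        for image in annotations["images"]
        if not os.path.isfile(os.path.join(split_dir, image["file_name"]))
    ]
```

An empty list means the folder is complete. Here `split_dir` would be the split folder itself, for example the `test` folder inside `Final_Dataset_Split`.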

# 5. Generate label_map.pbtxt file

In [None]:
# Function used to generate the label_map.pbtxt file from the COCO annotations in JSON

def generate_label_map(path_to_json,path_to_label_map):
    
    with open(path_to_json) as f:
        json_annotations = json.load(f)
    
    pbtxt_content = ""
        
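    for category in json_annotations['categories']:
        # Shift ids by 1: label map ids must start at 1 (0 is reserved for the background class).
        # This assumes the category ids in the JSON start at 0.
        pbtxt_content = (
            pbtxt_content
            + "item {{\n    id: {0}\n    name: '{1}'\n}}\n\n".format(category['id'] + 1, category['name'])
        )
    pbtxt_content = pbtxt_content.strip()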
        
    with open(path_to_label_map, "w") as f:
        f.write(pbtxt_content)
        print('Successfully created label_map.pbtxt ')   

In [None]:
generate_label_map('../data/Final_Dataset/final_annotations_coco_V6.json', '../customTF2/label_map.pbtxt')

Successfully created label_map.pbtxt 
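
To check the result, the file can be read back into a dictionary. The sketch below uses a regular expression that only handles the simple `item { id: ... name: '...' }` layout written by the function above; `read_label_map` is a hypothetical helper, and a full parser would use the protobuf definitions from the Object Detection API instead.

```python
import re

# Matches one "item { id: N name: '...' }" entry as written by generate_label_map
ITEM_PATTERN = re.compile(r"item\s*\{\s*id:\s*(\d+)\s*name:\s*'([^']*)'\s*\}")


def read_label_map(path):
    """Return {id: name} for every item found in a simple label_map.pbtxt."""
    with open(path) as f:
        text = f.read()
    return {int(item_id): name for item_id, name in ITEM_PATTERN.findall(text)}
```

The ids in the returned dictionary should start at 1 and match the class ids stored in the TFRecord files.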


# 6. Convert Train & Test to TFRecord

In [9]:
# CLI command using the fiftyone library to convert the test set into a TFRecord file

!fiftyone convert \
            --input-dir /MacBook/Final_Dataset_Test \
            --input-type fiftyone.types.COCODetectionDataset \
            --output-dir /MacBook/customTF2/data \
            --output-kwargs tf_records_path=test.records \
            --output-type fiftyone.types.TFObjectDetectionDataset

Using input format: fiftyone.types.dataset_types.COCODetectionDataset
Using export format: fiftyone.types.dataset_types.TFObjectDetectionDataset
INFO:fiftyone.utils.data.converters:Using export format: fiftyone.types.dataset_types.TFObjectDetectionDataset
Loading dataset from '/MacBook/Final_Dataset_Test'
INFO:fiftyone.utils.data.converters:Loading dataset from '/MacBook/Final_Dataset_Test'
 100% |███████████████| 1892/1892 [18.7s elapsed, 0s remaining, 365.2 samples/s]      
INFO:eta.core.utils: 100% |███████████████| 1892/1892 [18.7s elapsed, 0s remaining, 365.2 samples/s]      
Exporting dataset to '/MacBook/customTF2/data'
INFO:fiftyone.utils.data.converters:Exporting dataset to '/MacBook/customTF2/data'
Directory '/MacBook/customTF2/data' already exists; export will be merged with existing files
Found multiple fields ['detections', 'segmentations'] with compatible type <class 'fiftyone.core.labels.Detections'>; exporting 'detections'
INFO:fiftyone.core.collections:Found multiple f

In [None]:
# CLI command using the fiftyone library to convert the training set into a TFRecord file

!fiftyone convert \
            --input-dir /MacBook/Final_Dataset_Train \
            --input-type fiftyone.types.COCODetectionDataset \
            --output-dir /MacBook/customTF2/data \
            --output-kwargs tf_records_path=train.records \
            --output-type fiftyone.types.TFObjectDetectionDataset

Using input format: fiftyone.types.dataset_types.COCODetectionDataset
Using export format: fiftyone.types.dataset_types.TFObjectDetectionDataset
INFO:fiftyone.utils.data.converters:Using export format: fiftyone.types.dataset_types.TFObjectDetectionDataset
Loading dataset from '/MacBook/Final_Dataset_Train'
INFO:fiftyone.utils.data.converters:Loading dataset from '/MacBook/Final_Dataset_Train'
 100% |███████████████| 7567/7567 [1.4m elapsed, 0s remaining, 394.3 samples/s]      
INFO:eta.core.utils: 100% |███████████████| 7567/7567 [1.4m elapsed, 0s remaining, 394.3 samples/s]      
Exporting dataset to '/MacBook/customTF2/data'
INFO:fiftyone.utils.data.converters:Exporting dataset to '/MacBook/customTF2/data'
Directory '/MacBook/customTF2/data' already exists; export will be merged with existing files
Found multiple fields ['detections', 'segmentations'] with compatible type <class 'fiftyone.core.labels.Detections'>; exporting 'detections'
INFO:fiftyone.core.collections:Found multiple f
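
A quick way to sanity-check an export is to count the records in the resulting file. The sketch below walks the TFRecord framing directly (an 8-byte little-endian length, a 4-byte CRC, the payload, and another 4-byte CRC) without verifying the checksums or decoding the examples. `count_tfrecord_examples` is a hypothetical helper, and it assumes the export wrote a single, uncompressed file.

```python
import os
import struct


def count_tfrecord_examples(path):
    """Count records in an uncompressed TFRecord file without decoding them."""
    size = os.path.getsize(path)
    count = 0
    with open(path, "rb") as f:
        while f.tell() < size:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("Truncated record header")
            (length,) = struct.unpack("<Q", header)
            # Skip the length CRC (4 bytes), the payload, and the payload CRC (4 bytes)
            f.seek(4 + length + 4, os.SEEK_CUR)
            if f.tell() > size:
                raise ValueError("Truncated record payload")
            count += 1
    return count
```

The logs above report 1892 test samples and 7567 training samples, so, assuming one record per sample, the counts for `test.records` and `train.records` should match those numbers.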

In [50]:
!python create_coco_tf_record_test.py  --logtostderr \
--train_image_dir="/MacBook/Final_Dataset_Train/data" \
--test_image_dir="/MacBook/Final_Dataset_Test/data" \
--train_annotations_file="/MacBook/Final_Dataset_Train/labels.json" \
--test__annotations_file="/MacBook/Final_Dataset_Test/labels.json" \
--output_dir="/MacBook/customTF2/data"

2022-09-30 11:09:28.629093: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-09-30 11:09:29.458822: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.7/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
2022-09-30 11:09:29.458953: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.7/dist-packages/cv2/../../lib64:/usr/lib64-nvidia
FATAL Flags parsing error: Unknown command line flag 'test__annotations_file'. Did you mean: train_annotations_file ?
Pass --helpshort or --helpfull t

In [47]:
%cd /content/models/research/object_detection/dataset_tools

/content/models/research/object_detection/dataset_tools


In [None]:
# read the JSON to get the path
# iterate over and copy the file to new folder