# **Roadmap**

1. Install the TensorFlow Object Detection API
2. Set up the folder structure
3. Generate the TFRecord files required for training
4. Edit the model pipeline config file and download the pre-trained model checkpoint
5. Train and evaluate the model

⚠️ Run this notebook in Google Colab so that training can use a GPU.

In [2]:
# Check GPU setup

gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
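# nvidia-smi prints 'failed' when it cannot reach the driver; the shell prints 'not found' when the tool is missing
if 'failed' in gpu_info or 'not found' in gpu_info: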
  print('Not connected to a GPU')
else:
  print(gpu_info)

zsh:1: command not found: nvidia-smi


In [1]:
# Check RAM setup

from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

Your runtime has 27.3 gigabytes of available RAM

You are using a high-RAM runtime!


# **Import libraries**

In [3]:
import os
import glob
import xml.etree.ElementTree as ET
import pandas as pd
import tensorflow as tf
import json
import shutil
print(tf.__version__)

# 1. Create customTF2, training and data folders in your Google Drive

Create a folder named ***customTF2*** in your Google Drive.

Inside ***customTF2***, create a folder named ***training***. Checkpoints are saved there during training.

Also inside ***customTF2***, create a folder named ***data***.
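
The folders can also be created from code rather than by hand. The sketch below is one possible way to do it; `create_training_folders` and its `drive_root` parameter are hypothetical names introduced here, and the example path assumes Drive is already mounted under `/content/gdrive`.

```python
import os

def create_training_folders(drive_root, project_name="customTF2"):
    """Create <drive_root>/<project_name>/training and .../data and return their paths."""
    project_dir = os.path.join(drive_root, project_name)
    created = {}
    for sub_folder in ("training", "data"):
        path = os.path.join(project_dir, sub_folder)
        # exist_ok=True makes the call safe to repeat on an existing layout
        os.makedirs(path, exist_ok=True)
        created[sub_folder] = path
    return created

# Example call (assumed path, valid only once Drive is mounted):
# create_training_folders("/content/gdrive/My Drive")
```

Because `exist_ok=True` is used, rerunning the cell does not fail or overwrite anything if the folders already exist.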

# 2. Mount drive and link your folder

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

# Create a symbolic link so that /mydrive points to /content/gdrive/My\ Drive/
!ln -s /content/gdrive/My\ Drive/ /mydrive
!ls /mydrive

# 3. Clone the TensorFlow models Git repository and install the TensorFlow Object Detection API

In [None]:
# Clone the TensorFlow models repository onto the Colab VM
!git clone --quiet https://github.com/tensorflow/models.git

# Navigate to models/research to compile the protos
%cd models/research

# Compile protos.
!protoc object_detection/protos/*.proto --python_out=.

# Install TensorFlow Object Detection API.
!cp object_detection/packages/tf2/setup.py .
!python -m pip install .

In [None]:
# Testing the model builder
!python object_detection/builders/model_builder_tf2_test.py

# 4. Train / Test split

We will use the script from the following repo: https://github.com/akarazniewicz/cocosplit

In [None]:
# Terminal command: split the annotations into 80% train and 20% test

python cocosplit.py --having-annotations -s 0.8 '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/final_annotations_coco_V6.json' '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/train.json' '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/test.json'
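
To make the 80/20 split concrete, here is a minimal sketch of splitting a COCO annotation dictionary by image. It is not the cocosplit implementation; `split_coco` and its parameters are hypothetical names, and the sketch assumes the standard COCO keys `images`, `annotations` and `image_id`.

```python
import random

def split_coco(coco, train_ratio=0.8, seed=0, having_annotations=True):
    """Split a COCO dict into (train, test) dicts, keeping annotations with their image."""
    annotated_ids = {ann["image_id"] for ann in coco["annotations"]}
    # Optionally drop images that have no annotation at all
    images = [img for img in coco["images"]
              if not having_annotations or img["id"] in annotated_ids]
    # Shuffle a copy with a fixed seed so the split is reproducible
    random.Random(seed).shuffle(images)
    cut = int(len(images) * train_ratio)

    def subset(selected_images):
        ids = {img["id"] for img in selected_images}
        # Copy every other top-level key (categories, info, ...) unchanged
        part = {k: v for k, v in coco.items() if k not in ("images", "annotations")}
        part["images"] = selected_images
        part["annotations"] = [a for a in coco["annotations"] if a["image_id"] in ids]
        return part

    return subset(images[:cut]), subset(images[cut:])
```

Splitting by image rather than by annotation keeps all boxes of one picture in the same split, which avoids the same image appearing in both train and test.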

# 5. Convert Train & Test to TFRecord

In [1]:
# Terminal command: we use the fiftyone CLI to convert the COCO annotations to TFRecord

fiftyone convert \
            --input-dir data/Final_Dataset/test.json \
            --input-type fiftyone.types.COCODetectionDataset \
            --output-dir data/Final_Dataset/TFRecord \
            --output-type fiftyone.types.TFObjectDetectionDataset

In [None]:
# Read the COCO JSON to get the path of each image,
# then iterate over the images and copy each file to the new split folder

In [6]:
def generate_train_test_folders(json_file, path_to_original_data, path_to_new_split_data, name_of_split):
    """Copy every image listed in a COCO JSON file into <path_to_new_split_data>/<name_of_split>/."""
    with open(json_file) as f:
        json_annotations = json.load(f)

    for image in json_annotations['images']:
        path_image = image['file_name']
        source = f"{path_to_original_data}/{path_image}"
        destination = f"{path_to_new_split_data}/{name_of_split}/{path_image}"
        # Recreate any sub-folders contained in file_name before copying
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        # copy2 also preserves file metadata such as timestamps
        shutil.copy2(source, destination)

    return f"Completed transfer of {name_of_split}"


In [7]:
generate_train_test_folders('/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/test.json', '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset', '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset_Split', 'test')

'Completed transfer of test'
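
After copying, it can be worth checking that every image referenced by the split JSON actually landed in the split folder. The helper below is a small sketch for that check; `find_missing_images` is a hypothetical name, and it assumes `file_name` entries are paths relative to the split folder, as in the function above.

```python
import json
import os

def find_missing_images(json_file, split_dir):
    """Return the file_name entries of a COCO JSON that do not exist under split_dir."""
    with open(json_file) as f:
        images = json.load(f)["images"]
    return [
        image["file_name"]
        for image in images
        if not os.path.isfile(os.path.join(split_dir, image["file_name"]))
    ]
```

An empty list means the folder is complete for that split; any returned names point to files that were not copied.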

In [None]:
# Terminal command: alternative split with the coco-split CLI (20% test, no validation set)
coco-split \
    --has_annotations \
    --test_ratio .2 \
    --valid_ratio 0 \
    --annotations_file '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset/final_annotations_coco_V6.json' \
    --train_name '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset_Split/train.json' \
    --test_name '/Users/julienberthomier/code/AmElmo/Main_Projects/Trash Detector (Wall-E)/data/Final_Dataset_Split/test.json'