Welcome to the tutorial. Here we train two different models on the given dataset, YOLO V5 and MobileNet V2, to see which one performs better, compare the results and proceed.

![mobilenet](https://machinethink.net/images/mobilenet-v2/FeatureExtractor@2x.png)

**LET'S START WITH MOBILENET FIRST:**

We create a directory called mobilenet inside which we will keep all our relevant files.

In [None]:
import os

os.mkdir("./mobilenet")

Now we will start by installing the TensorFlow Object Detection API.

# Step 1: Cloning the repository

In [None]:
!git clone https://github.com/tensorflow/models ./mobilenet/models

# Step 2: Install protobufs

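The wget library downloads files from URLs. To install protoc, download the release archive, unzip it into its own directory as shown below, and add that directory to the PATH environment variable so that the `protoc` command can be used. Protoc is used to compile .proto files. Note that the archive below is the Windows (win64) build; on a Linux machine, such as a Kaggle or Colab kernel, the matching `linux-x86_64` archive from the same release is needed instead.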

In [None]:
!pip install wget
import wget
url = "https://github.com/protocolbuffers/protobuf/releases/download/v3.15.6/protoc-3.15.6-win64.zip"
wget.download(url)

In [None]:
os.mkdir("./mobilenet/protobuf")

!cp "./protoc-3.15.6-win64.zip" "./mobilenet/protobuf"

!unzip "./mobilenet/protobuf/protoc-3.15.6-win64.zip" -d "./mobilenet/protobuf"

The cell above copies the archive into the protobuf directory and unzips it. The next cell adds protoc to the PATH.

In [None]:
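# PATH entries must be directories, so add the bin directory rather than the executable itself
os.environ['PATH'] += os.pathsep + os.path.abspath("./mobilenet/protobuf/bin")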
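
A quick way to confirm that the PATH change took effect is to ask Python where it would find the executable. This is a minimal sketch using only the standard library; `find_tool` is a hypothetical helper name, not part of the Object Detection API, and the directory is the one assumed by the cells above.

```python
import os
import shutil

def find_tool(name, extra_dir=None):
    """Return the full path of an executable, or None if it cannot be found.

    extra_dir is an optional directory searched before the regular PATH.
    """
    search_path = os.environ.get("PATH", "")
    if extra_dir is not None:
        search_path = os.path.abspath(extra_dir) + os.pathsep + search_path
    return shutil.which(name, path=search_path)

# Assumed layout: the protoc archive was unzipped into ./mobilenet/protobuf
location = find_tool("protoc", extra_dir="./mobilenet/protobuf/bin")
print(location if location else "protoc not found; check the PATH entry")
```

If this prints the "not found" message, the compile step that follows will most likely fail as well, so it is worth fixing the PATH first.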

# Step 3: Compile all the .proto files in the object detection directory as shown below and build the Object Detection API

In [None]:
!cd ./mobilenet/models/research && protoc object_detection/protos/*.proto --python_out=.
!cd ./mobilenet/models/research && cp object_detection/packages/tf2/setup.py . && python -m pip install .

# Step 5: Import the object detection API

Note: If importing the libraries raises an error, try restarting the kernel and importing again.

In [None]:
import object_detection
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from object_detection.utils import dataset_util
import ast
import cv2
import wget
import os
from shutil import copyfile 
from tqdm.notebook import tqdm
tqdm.pandas()

# Step 6: Download the mobile net v2 model from tensorflow.

You can get other models [here](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md)

After downloading it, move it to the right directory and extract it.

In [None]:
model_url = "http://download.tensorflow.org/models/object_detection/tf2/20200711/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz"
wget.download(model_url)

In [None]:
os.mkdir("./mobilenet/my_model")
MODEL_NAME = "ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz"
!mv ./ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8.tar.gz ./mobilenet/my_model
!cd ./mobilenet/my_model && tar -zxvf {MODEL_NAME}

All images in the dataset are 1280 pixels wide and 720 pixels high.

In [None]:
TRAIN_PATH = '../input/tensorflow-great-barrier-reef/train_images'

In [None]:
# Let's load the dataset
df = pd.read_csv('../input/tensorflow-great-barrier-reef/train.csv')
df.head()

In [None]:
df['annotations'].unique()

In [None]:
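# Count the bounding boxes per frame. Each box dictionary has exactly one 'x' key
# and no other key contains the letter x, so counting 'x' counts the boxes.
df['number_bbox'] = df['annotations'].apply(lambda x: x.count('x'))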
df.head()

In [None]:
df['number_bbox'].unique()   # These are the number of bounding boxes

In [None]:
proportion = lambda x: round(x/len(df) * 100,2)
not_available = len(df.loc[df['number_bbox'] == 0])
proportion(not_available)  
# For almost 80% of the frames there are no bounding boxes, so we will use only
# the remaining 20% of the data to train the model

In [None]:
# Let's separate those out
df_train = df.loc[df['number_bbox'] != 0,:].reset_index(drop=True)
df_train.sample(5)

In [None]:
df_train['annotations'][0]  # It is a string, so we need ast.literal_eval to parse it

In [None]:
import ast

eval_ = lambda x:ast.literal_eval(x)
df_train['annotations'] = df_train['annotations'].progress_apply(eval_)

In [None]:
df_train['annotations'][0]  # Looks good now
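
To make the parsing step concrete, here is a small standalone sketch of what `ast.literal_eval` does to one annotation string. The string below is a made-up value in the same shape as the `annotations` column, not a row copied from the dataset.

```python
import ast

# Made-up annotation string in the same shape as the `annotations` column
raw = "[{'x': 559, 'y': 213, 'width': 50, 'height': 32}]"

boxes = ast.literal_eval(raw)   # parses Python literals only, unlike eval()
print(type(boxes), len(boxes))
print(boxes[0]["width"])

# Frames without starfish hold the string "[]", which parses to an empty list
empty = ast.literal_eval("[]")
print(empty)
```

`literal_eval` is preferred over `eval` here because it refuses anything that is not a plain literal, so a malformed cell raises an error instead of executing code.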

In [None]:

# First, get the coordinates of the bounding boxes
def give_bbox_loc(instance):   # instance is a list of dictionaries, one per bounding box
    return [list(ins.values()) for ins in instance]

def get_path_image(instance):
    return os.path.join(TRAIN_PATH,f'video_{instance["video_id"]}',f'{instance["video_frame"]}')

def load_image(path):
    return cv2.cvtColor(cv2.imread(path+'.jpg'), cv2.COLOR_BGR2RGB)

In [None]:
df_train['box_location'] = df_train['annotations'].progress_apply(give_bbox_loc)

In [None]:
df_train.sample(5)

In [None]:
df_train["video_id"].value_counts()

In [None]:
df_train['path'] =  df_train.progress_apply(get_path_image, axis=1)

In [None]:
df_train.sample(5)

In [None]:
def normalize_image(bboxes):
    
    bboxes = bboxes.astype('float')
    bboxes[...,[0,2]] = bboxes[...,[0,2]] / 1280   # width is 1280
    bboxes[...,[1,3]] = bboxes[...,[1,3]] / 720   # height is 720
    
    # Convert the top-left corner of each box to the box center:
    # add half of the box width and height to the current x and y
    bboxes[...,[0,1]] = bboxes[...,[0,1]] + bboxes[...,[2,3]] / 2
    return bboxes


def draw_boxes(image, cordinates, classes = None, legend=False):
    img = image.copy()
    for idx in range(len(cordinates)):
        bbox = cordinates[idx]
        clas = classes[idx] if classes is not None else ''
        # Plain Python ints, because OpenCV expects the color as a tuple of numbers
        colors = tuple(int(np.random.randint(0,255)) for _ in range(3))
        # Always remember that we are using the YOLO format, so x and y are the coordinates of the box center
        x = round(float(bbox[0]) * img.shape[1],2)
        y = round(float(bbox[1]) * img.shape[0],2)
        w = round(float(bbox[2]) * img.shape[1]/2,2)   # half-width in pixels
        h = round(float(bbox[3]) * img.shape[0]/2,2)   # half-height in pixels, scaled by the image height

        # Corner coordinates, recovered from the output of the normalize function
        box = (x-w,y-h,x+w,y+h)
        r_a = int(box[0])
        r_b = int(box[1])
        r_c = int(box[2])
        r_d = int(box[3])
        cv2.rectangle(img,(r_a, r_b),(r_c, r_d), thickness = 3, color = colors)
        cv2.putText(img, clas, org=(r_a, r_d-3),color=(255, 0, 0),
                    fontFace = cv2.FONT_HERSHEY_SIMPLEX,thickness = 2,fontScale=0.66)
    return img
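
The arithmetic behind `normalize_image` and `draw_boxes` can be checked on a single box without loading any image. The sketch below is a pure-Python version under the same assumptions (1280 x 720 frames, boxes given as `[x_min, y_min, width, height]` in pixels); the function names and the sample box are hypothetical.

```python
def to_center_format(box, img_w=1280, img_h=720):
    """Pixel [x_min, y_min, width, height] -> normalized [x_center, y_center, width, height]."""
    x, y, w, h = box
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]

def to_pixel_corners(center_box, img_w=1280, img_h=720):
    """Normalized [x_center, y_center, width, height] -> pixel (x_min, y_min, x_max, y_max)."""
    xc, yc, w, h = center_box
    half_w = w * img_w / 2      # half-width scales with the image width
    half_h = h * img_h / 2      # half-height scales with the image height
    return (round(xc * img_w - half_w), round(yc * img_h - half_h),
            round(xc * img_w + half_w), round(yc * img_h + half_h))

box = [559, 213, 50, 32]        # made-up example box
center = to_center_format(box)
corners = to_pixel_corners(center)
print(center)
print(corners)
```

A round trip should give back the original corners, here `x_min + width` and `y_min + height` for the far corner, which is a cheap sanity check before plotting.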

In [None]:
# Now let's look at some of the annotated images
points = df_train.loc[df_train['number_bbox'] >= 8,:].sample(4)
n_rows = 2
n_cols = 2
plt.figure(figsize=(15,15))
for row in range(n_rows):
    for col in range(n_cols):
        index = row*n_cols + col
        plt.subplot(n_rows, n_cols, index+1)
        
        instance = points.iloc[index]
       
        image = load_image(instance['path'])
        box_cordinates = np.array(instance['box_location'])
        height = 720
        width = 1280
        classes = ['COTS'] * len(box_cordinates)
        normalized_coordinates = normalize_image(box_cordinates)
        image = draw_boxes(image,
                          normalized_coordinates,
                           classes = classes,
                           legend=True
                          )
        plt.imshow(image)
        plt.axis('off')
plt.subplots_adjust(wspace = 0.01, hspace = 0.01)

In [None]:
os.mkdir("./mobilenet/train")
os.mkdir("./mobilenet/val")

In [None]:
df_train['fold'] = 'train'
train_instances = int(len(df_train) * 0.8)
val_instances = len(df_train) - train_instances
val_index = df_train.sample(val_instances).index
df_train.loc[val_index,'fold'] = 'val'

In [None]:
df_train.fold.value_counts()

In [None]:
for idx in tqdm(range(df_train.shape[0])):
    instance = df_train.iloc[idx]
    if instance['fold'] == 'train':
      copyfile(f'{instance["path"]}.jpg', './mobilenet/train/{}.jpg'.format(instance["image_id"]))
    else:
      copyfile(f'{instance["path"]}.jpg', './mobilenet/val/{}.jpg'.format(instance["image_id"]))

In [None]:
print(len(os.listdir('./mobilenet/val')))
print(len(os.listdir('./mobilenet/train')))

# Step 7: Creating Annotations

For every example in your dataset, you should have the following information:

1. An RGB image for the dataset encoded as jpeg or png.

2. A list of bounding boxes for the image. Each bounding box should contain:
   - The bounding box coordinates (with the origin in the top-left corner), defined by 4 floating point numbers [ymin, xmin, ymax, xmax]. Note that we store the normalized coordinates (x / width, y / height) in the TFRecord dataset.
   - The class of the object in the bounding box.
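
The coordinate convention described above can be sketched as a small conversion from the dataset's pixel boxes to normalized `[ymin, xmin, ymax, xmax]`. This is only an illustration under the assumption of 1280 x 720 frames; `to_tf_box` is a hypothetical helper and the sample boxes are made up.

```python
def to_tf_box(box, img_w=1280, img_h=720):
    """Pixel [x, y, width, height] -> normalized [ymin, xmin, ymax, xmax], clipped to the image."""
    x, y, w, h = box
    xmin = max(0, x)
    ymin = max(0, y)
    xmax = min(img_w, x + w)    # clip boxes that stick out on the right
    ymax = min(img_h, y + h)    # clip boxes that stick out at the bottom
    return [ymin / img_h, xmin / img_w, ymax / img_h, xmax / img_w]

inside = to_tf_box([640, 360, 320, 180])
clipped = to_tf_box([1250, 700, 60, 40])   # extends past the frame, so it gets clipped
print(inside)
print(clipped)
```

Note the order: y comes before x in this format, which is the opposite of the `x, y, width, height` order used in the CSV.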

We will create an XML file for every image and store it in the same directory as the image itself.

The format for our annotations will be as [follows](https://drive.google.com/file/d/1V2rBGit_xQ-jjBMrSg2atIhYCbKqnJ5R/view?usp=sharing).
 You might be wondering why it uses extra fields such as segmented. The point is that I used the TFRecord repository in my other projects and did not want to change its code, so I include the extra fields here to keep it working unchanged. The important fields are folder, path, size, object and bndbox.

To create the XML annotations we will use the XML library below. It is very simple to use: just follow a tree-like structure.

X = ET.Element('Hello') # This actually creates a root tag 'Hello'

ET.SubElement(X,'Bye').text = 'Kaggle' # It creates a subtag within Hello named 'Bye' with value Kaggle.
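
As a slightly larger illustration of the same tree-building idea, the sketch below builds a tiny annotation with two boxes and serializes it to a string. The file name and coordinates are made up, and only a subset of the fields is included.

```python
import xml.etree.ElementTree as ET

root = ET.Element("annotation")
ET.SubElement(root, "filename").text = "example.jpg"   # hypothetical file name

# One <object> element per bounding box, each with its own <bndbox>
for xmin, ymin, xmax, ymax in [(559, 213, 609, 245), (100, 50, 160, 90)]:
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = "COTS"
    bndbox = ET.SubElement(obj, "bndbox")
    for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), (xmin, ymin, xmax, ymax)):
        ET.SubElement(bndbox, tag).text = str(value)   # element text must be a string

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
print(len(root.findall("object")))
```

Element text has to be a string, which is why every number is wrapped in `str()` before it is assigned.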

In [None]:
# Let's now create annotations for our dataset. I will use the .xml format to save the data.
import xml.etree.ElementTree as ET   # cElementTree was removed in Python 3.9

PATH = "./mobilenet/"
for _, instance in tqdm(df_train.iterrows()):
  fold = instance['fold']
  root = ET.Element("annotation")
  ET.SubElement(root,"folder").text = fold
  ET.SubElement(root,"filename").text = instance["image_id"] + '.jpg'
  ET.SubElement(root,"path").text = './mobilenet/{0}/{1}.jpg'.format(fold,instance["image_id"])
  source = ET.SubElement(root,'source')
  ET.SubElement(source, 'database').text = 'Unknown'
  size = ET.SubElement(root,"size")
  ET.SubElement(size, "width").text = str(1280)   # We will convert it back to int later, while generating the tfrecord.
  ET.SubElement(size, "height").text = str(720)
  ET.SubElement(size, "depth").text = str(3)
  ET.SubElement(root,'segmented').text = str(0)
  for element in instance['box_location']:
    # Create one <object> element per bounding box, so every box keeps its own fields
    classname = ET.SubElement(root,"object")
    ET.SubElement(classname, "name").text = 'COTS'
    ET.SubElement(classname, "pose").text = 'Unspecified'
    ET.SubElement(classname, "truncated").text = '0'
    ET.SubElement(classname, "difficult").text = '0'
    bndbox = ET.SubElement(classname, "bndbox")

    # No need to normalize here; that happens when the tfrecord is created.
    ET.SubElement(bndbox, "xmin").text = str(element[0])
    ET.SubElement(bndbox, "ymin").text = str(element[1])

    # If xmax or ymax goes past the image boundary, clip it to the boundary. Very important.
    xmax = min(element[0] + element[2], 1280)
    ymax = min(element[1] + element[3], 720)
    ET.SubElement(bndbox, "xmax").text = str(xmax)
    ET.SubElement(bndbox, "ymax").text = str(ymax)

  tree = ET.ElementTree(root)
  tree.write(os.path.join(PATH, fold, instance["image_id"]+'.xml'))

# Step 8: Generating TFRecord

I will use the standard approach to generate two TFRecord files, one for train and one for val.

Special thanks to [Nick](https://github.com/nicknochnack/) for the file

If you look at the repository, all it does is iterate through the XML files and generate two TFRecord files. To build each record, it extracts information such as xmin and xmax and encodes it into a tf_example (in the repository).

To know how it works refer [here](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/training.html)
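
The repository script itself is not reproduced here. As a rough illustration of the first half of that process only (reading the XML and normalizing the boxes), here is a standard-library sketch. `read_boxes` and the sample XML are hypothetical, and the final encoding into a TensorFlow example is deliberately left out.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the same layout as the files written earlier
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object><name>COTS</name>
    <bndbox><xmin>320</xmin><ymin>180</ymin><xmax>640</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""

def read_boxes(xml_text):
    """Return one dict per <object>, with corner coordinates normalized to [0, 1]."""
    root = ET.fromstring(xml_text.strip())
    width = int(root.find("size/width").text)
    height = int(root.find("size/height").text)
    rows = []
    for obj in root.findall("object"):
        bb = obj.find("bndbox")
        rows.append({
            "filename": root.find("filename").text,
            "class": obj.find("name").text,
            "xmin": int(bb.find("xmin").text) / width,
            "ymin": int(bb.find("ymin").text) / height,
            "xmax": int(bb.find("xmax").text) / width,
            "ymax": int(bb.find("ymax").text) / height,
        })
    return rows

rows = read_boxes(SAMPLE_XML)
print(rows)
```

Each returned row holds what a record needs for one box: the file name, the class name and the four normalized corners.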

In [None]:
!git clone https://github.com/Gnopal1132/Generate_TFRecord.git

In [None]:
os.mkdir('./mobilenet/annotations/')

# Step 9: Create the Label Map

The label map simply lists each class name together with its integer id.

In [None]:
labels = [{'name':'COTS', 'id':1}]
LABEL_MAP = './mobilenet/annotations/label_map.pbtxt'
with open(LABEL_MAP, 'w') as f:
    for label in labels:
        f.write('item { \n')
        f.write('\tname:\'{}\'\n'.format(label['name']))
        f.write('\tid:{}\n'.format(label['id']))
        f.write('}\n')
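
To double-check what was written, the label map text can be read back into a dictionary. The parser below is a small regex-based sketch for sanity checks only; it is not the Object Detection API's own label map reader, and `parse_label_map` is a hypothetical name.

```python
import re

def parse_label_map(text):
    """Return {name: id} for every `item { ... }` block in a label map string."""
    mapping = {}
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        name = re.search(r"name\s*:\s*['\"](.+?)['\"]", block)
        idx = re.search(r"id\s*:\s*(\d+)", block)
        if name and idx:
            mapping[name.group(1)] = int(idx.group(1))
    return mapping

# Same text layout as produced by the writing loop above
sample = "item { \n\tname:'COTS'\n\tid:1\n}\n"
parsed = parse_label_map(sample)
print(parsed)
```

Ids in a label map start at 1; id 0 is conventionally reserved for the background class.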

In [None]:
TF_RECORD_SCRIPT = './Generate_TFRecord/tfrecord.py'
IMAGE_PATH_TRAIN = "./mobilenet/train"
IMAGE_PATH_VAL = "./mobilenet/val"
ANNOTATION_PATH = "./mobilenet/annotations/"
LABELMAP = os.path.join(ANNOTATION_PATH, 'label_map.pbtxt')

!python {TF_RECORD_SCRIPT} -x {IMAGE_PATH_TRAIN} -l {LABELMAP} -o {os.path.join(ANNOTATION_PATH, 'train.record')} 
!python {TF_RECORD_SCRIPT} -x {IMAGE_PATH_VAL} -l {LABELMAP} -o {os.path.join(ANNOTATION_PATH, 'val.record')}

# Step 10: Loading the configuration file, checkpoints and basic settings for training the model

In [None]:
import tensorflow as tf
from object_detection.utils import config_util
from object_detection.protos import pipeline_pb2
from google.protobuf import text_format

In [None]:

MODEL_NAME = 'ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8'
os.mkdir('./mobilenet/{}'.format(MODEL_NAME))  # This is where we will store our model checkpoint and configuration.
!cp "./mobilenet/my_model/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8/pipeline.config" "./mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8"

In [None]:
PIPELINE_CONFIG = './mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8/pipeline.config' 
config = config_util.get_configs_from_pipeline_file(PIPELINE_CONFIG)  # Loading the configuration file

In [None]:
pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()
with tf.io.gfile.GFile(PIPELINE_CONFIG, "r") as f:                                                                                                                                                                                                                     
    proto_str = f.read()                                                                                                                                                                                                                                          
    text_format.Merge(proto_str, pipeline_config)

In [None]:
BATCH = 8
EPOCHS = 2000   # passed to --num_train_steps below, so this is a number of training steps rather than epochs

In [None]:
# Loading basic information to the pipeline
PATH = './mobilenet'
pipeline_config.model.ssd.num_classes = 1
pipeline_config.train_config.batch_size = BATCH
pipeline_config.train_config.fine_tune_checkpoint = os.path.join(os.path.join(PATH, 'my_model'), MODEL_NAME, 'checkpoint', 'ckpt-0')
pipeline_config.train_config.fine_tune_checkpoint_type = "detection"
pipeline_config.train_input_reader.label_map_path= os.path.join(PATH,'annotations','label_map.pbtxt')
pipeline_config.train_input_reader.tf_record_input_reader.input_path[:] = [os.path.join(os.path.join(PATH,'annotations'), 'train.record')]
pipeline_config.eval_input_reader[0].label_map_path = os.path.join(PATH,'annotations','label_map.pbtxt')
pipeline_config.eval_input_reader[0].tf_record_input_reader.input_path[:] = [os.path.join(os.path.join(PATH,'annotations'), 'val.record')]

In [None]:
config_text = text_format.MessageToString(pipeline_config)                                                                                                                                                                                                        
with tf.io.gfile.GFile(PIPELINE_CONFIG, "wb") as f:                                                                                                                                                                                                                     
    f.write(config_text)

In [None]:
TRAINING_SCRIPT = os.path.join("./mobilenet/models", 'research', 'object_detection', 'model_main_tf2.py')

In [None]:
CHECKPOINT_PATH = './mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8'
command = "python {} --model_dir={} --pipeline_config_path={} --num_train_steps={}".format(TRAINING_SCRIPT, CHECKPOINT_PATH, PIPELINE_CONFIG, EPOCHS)

In [None]:
command
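
An alternative to formatting one long string is to assemble the command as an argument list, which avoids quoting problems if a path ever contains spaces. This is a sketch using the same paths and flags as the cell above; `build_train_command` is a hypothetical helper.

```python
import shlex

def build_train_command(script, model_dir, config_path, num_steps):
    """Assemble the training command as an argument list."""
    return [
        "python", script,
        f"--model_dir={model_dir}",
        f"--pipeline_config_path={config_path}",
        f"--num_train_steps={num_steps}",
    ]

args = build_train_command(
    "./mobilenet/models/research/object_detection/model_main_tf2.py",
    "./mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8",
    "./mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8/pipeline.config",
    2000,
)
printable = shlex.join(args)   # shell-safe, copy-pasteable form
print(printable)
# subprocess.run(args, check=True) would launch it; omitted here because training takes a long time
```

The list form can be handed to `subprocess.run` directly, while the joined string is convenient for pasting into a terminal or a `!` cell.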

To train the model, run the line below. I am not going to train it here, as I already did so on Colab; instead I will print the highlights of the training log and the TensorBoard plots to show how the model behaved. I uploaded the final checkpoints and event files.

In [None]:
!python ./mobilenet/models/research/object_detection/model_main_tf2.py --model_dir=./mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8 --pipeline_config_path=./mobilenet/ssd_mobilenet_v2_fpnlite_640x640_coco17_tpu-8/pipeline.config --num_train_steps=2000


INFO:tensorflow:Step 100 per-step time 2.361s

I1230 19:11:27.819197 140299792009088 model_lib_v2.py:707] Step 100 per-step time 2.361s
INFO:tensorflow:{'Loss/classification_loss': 0.53248906,
 'Loss/localization_loss': 0.4413896,
 'Loss/regularization_loss': 0.1521543,
 'Loss/total_loss': 1.126033,
 'learning_rate': 0.0319994}
I1230 19:11:27.819756 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.53248906,
 'Loss/localization_loss': 0.4413896,
 'Loss/regularization_loss': 0.1521543,
 'Loss/total_loss': 1.126033,
 'learning_rate': 0.0319994}
 
 
INFO:tensorflow:Step 200 per-step time 1.842s
I1230 19:14:31.999863 140299792009088 model_lib_v2.py:707] Step 200 per-step time 1.842s
INFO:tensorflow:{'Loss/classification_loss': 0.57567745,
 'Loss/localization_loss': 0.420439,
 'Loss/regularization_loss': 0.15256938,
 'Loss/total_loss': 1.1486858,
 'learning_rate': 0.0373328}
I1230 19:14:32.000326 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.57567745,
 'Loss/localization_loss': 0.420439,
 'Loss/regularization_loss': 0.15256938,
 'Loss/total_loss': 1.1486858,
 'learning_rate': 0.0373328}
 
 
INFO:tensorflow:Step 300 per-step time 1.867s
I1230 19:17:38.737374 140299792009088 model_lib_v2.py:707] Step 300 per-step time 1.867s
INFO:tensorflow:{'Loss/classification_loss': 0.82316196,
 'Loss/localization_loss': 0.38124648,
 'Loss/regularization_loss': 0.15306622,
 'Loss/total_loss': 1.3574746,
 'learning_rate': 0.0426662}
I1230 19:17:38.737965 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.82316196,
 'Loss/localization_loss': 0.38124648,
 'Loss/regularization_loss': 0.15306622,
 'Loss/total_loss': 1.3574746,
 'learning_rate': 0.0426662}
 
 
INFO:tensorflow:Step 400 per-step time 1.853s
I1230 19:20:44.022718 140299792009088 model_lib_v2.py:707] Step 400 per-step time 1.853s
INFO:tensorflow:{'Loss/classification_loss': 0.5168445,
 'Loss/localization_loss': 0.30190247,
 'Loss/regularization_loss': 0.15343656,
 'Loss/total_loss': 0.9721835,
 'learning_rate': 0.047999598}
I1230 19:20:44.023141 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.5168445,
 'Loss/localization_loss': 0.30190247,
 'Loss/regularization_loss': 0.15343656,
 'Loss/total_loss': 0.9721835,
 'learning_rate': 0.047999598}
 
 
INFO:tensorflow:Step 500 per-step time 1.865s
I1230 19:23:50.505155 140299792009088 model_lib_v2.py:707] Step 500 per-step time 1.865s
INFO:tensorflow:{'Loss/classification_loss': 0.50315356,
 'Loss/localization_loss': 0.4802518,
 'Loss/regularization_loss': 0.15407632,
 'Loss/total_loss': 1.1374817,
 'learning_rate': 0.053333}
I1230 19:23:50.505642 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.50315356,
 'Loss/localization_loss': 0.4802518,
 'Loss/regularization_loss': 0.15407632,
 'Loss/total_loss': 1.1374817,
 'learning_rate': 0.053333}
 
 
INFO:tensorflow:Step 600 per-step time 1.853s
I1230 19:26:55.792762 140299792009088 model_lib_v2.py:707] Step 600 per-step time 1.853s
INFO:tensorflow:{'Loss/classification_loss': 0.2856278,
 'Loss/localization_loss': 0.15508056,
 'Loss/regularization_loss': 0.15469529,
 'Loss/total_loss': 0.5954037,
 'learning_rate': 0.0586664}
I1230 19:26:55.793148 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.2856278,
 'Loss/localization_loss': 0.15508056,
 'Loss/regularization_loss': 0.15469529,
 'Loss/total_loss': 0.5954037,
 'learning_rate': 0.0586664}
 
 
INFO:tensorflow:Step 700 per-step time 1.856s
I1230 19:30:01.431559 140299792009088 model_lib_v2.py:707] Step 700 per-step time 1.856s
INFO:tensorflow:{'Loss/classification_loss': 0.44273925,
 'Loss/localization_loss': 0.26241776,
 'Loss/regularization_loss': 0.15521659,
 'Loss/total_loss': 0.8603736,
 'learning_rate': 0.0639998}
I1230 19:30:01.432031 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.44273925,
 'Loss/localization_loss': 0.26241776,
 'Loss/regularization_loss': 0.15521659,
 'Loss/total_loss': 0.8603736,
 'learning_rate': 0.0639998}
 
 
INFO:tensorflow:Step 800 per-step time 1.853s
I1230 19:33:06.761075 140299792009088 model_lib_v2.py:707] Step 800 per-step time 1.853s
INFO:tensorflow:{'Loss/classification_loss': 0.7442434,
 'Loss/localization_loss': 0.26918525,
 'Loss/regularization_loss': 0.15582325,
 'Loss/total_loss': 1.1692519,
 'learning_rate': 0.069333196}
I1230 19:33:06.761445 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.7442434,
 'Loss/localization_loss': 0.26918525,
 'Loss/regularization_loss': 0.15582325,
 'Loss/total_loss': 1.1692519,
 'learning_rate': 0.069333196}
 
 
INFO:tensorflow:Step 900 per-step time 1.848s
I1230 19:36:11.598434 140299792009088 model_lib_v2.py:707] Step 900 per-step time 1.848s
INFO:tensorflow:{'Loss/classification_loss': 0.47391683,
 'Loss/localization_loss': 0.23772612,
 'Loss/regularization_loss': 0.1564635,
 'Loss/total_loss': 0.8681065,
 'learning_rate': 0.074666604}
I1230 19:36:11.598908 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.47391683,
 'Loss/localization_loss': 0.23772612,
 'Loss/regularization_loss': 0.1564635,
 'Loss/total_loss': 0.8681065,
 'learning_rate': 0.074666604}
 
 
INFO:tensorflow:Step 1000 per-step time 1.866s
I1230 19:39:18.184226 140299792009088 model_lib_v2.py:707] Step 1000 per-step time 1.866s
INFO:tensorflow:{'Loss/classification_loss': 0.56101376,
 'Loss/localization_loss': 0.32822332,
 'Loss/regularization_loss': 0.1571509,
 'Loss/total_loss': 1.0463879,
 'learning_rate': 0.08}
I1230 19:39:18.184702 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.56101376,
 'Loss/localization_loss': 0.32822332,
 'Loss/regularization_loss': 0.1571509,
 'Loss/total_loss': 1.0463879,
 'learning_rate': 0.08}
 
 
INFO:tensorflow:Step 1100 per-step time 1.857s
I1230 19:42:23.911181 140299792009088 model_lib_v2.py:707] Step 1100 per-step time 1.857s
INFO:tensorflow:{'Loss/classification_loss': 0.34899262,
 'Loss/localization_loss': 0.29847386,
 'Loss/regularization_loss': 0.15807542,
 'Loss/total_loss': 0.8055419,
 'learning_rate': 0.07999918}
I1230 19:42:23.911592 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.34899262,
 'Loss/localization_loss': 0.29847386,
 'Loss/regularization_loss': 0.15807542,
 'Loss/total_loss': 0.8055419,
 'learning_rate': 0.07999918}
 
 
INFO:tensorflow:Step 1200 per-step time 1.846s
I1230 19:45:28.488472 140299792009088 model_lib_v2.py:707] Step 1200 per-step time 1.846s
INFO:tensorflow:{'Loss/classification_loss': 0.4211598,
 'Loss/localization_loss': 0.2639899,
 'Loss/regularization_loss': 0.15880023,
 'Loss/total_loss': 0.8439499,
 'learning_rate': 0.079996705}
I1230 19:45:28.488929 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.4211598,
 'Loss/localization_loss': 0.2639899,
 'Loss/regularization_loss': 0.15880023,
 'Loss/total_loss': 0.8439499,
 'learning_rate': 0.079996705}
 
 
INFO:tensorflow:Step 1300 per-step time 1.846s
I1230 19:48:33.054533 140299792009088 model_lib_v2.py:707] Step 1300 per-step time 1.846s
INFO:tensorflow:{'Loss/classification_loss': 0.27099043,
 'Loss/localization_loss': 0.2084263,
 'Loss/regularization_loss': 0.15907411,
 'Loss/total_loss': 0.63849086,
 'learning_rate': 0.0799926}
I1230 19:48:33.054928 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.27099043,
 'Loss/localization_loss': 0.2084263,
 'Loss/regularization_loss': 0.15907411,
 'Loss/total_loss': 0.63849086,
 'learning_rate': 0.0799926}
 
 
INFO:tensorflow:Step 1400 per-step time 1.841s
I1230 19:51:37.191095 140299792009088 model_lib_v2.py:707] Step 1400 per-step time 1.841s
INFO:tensorflow:{'Loss/classification_loss': 0.30937624,
 'Loss/localization_loss': 0.16050465,
 'Loss/regularization_loss': 0.15938759,
 'Loss/total_loss': 0.62926847,
 'learning_rate': 0.07998685}
I1230 19:51:37.191522 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.30937624,
 'Loss/localization_loss': 0.16050465,
 'Loss/regularization_loss': 0.15938759,
 'Loss/total_loss': 0.62926847,
 'learning_rate': 0.07998685}
 
 
INFO:tensorflow:Step 1500 per-step time 1.867s
I1230 19:54:43.897237 140299792009088 model_lib_v2.py:707] Step 1500 per-step time 1.867s
INFO:tensorflow:{'Loss/classification_loss': 0.3307212,
 'Loss/localization_loss': 0.24422823,
 'Loss/regularization_loss': 0.16004217,
 'Loss/total_loss': 0.7349916,
 'learning_rate': 0.07997945}
I1230 19:54:43.897655 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.3307212,
 'Loss/localization_loss': 0.24422823,
 'Loss/regularization_loss': 0.16004217,
 'Loss/total_loss': 0.7349916,
 'learning_rate': 0.07997945}
 
 
INFO:tensorflow:Step 1600 per-step time 1.844s
I1230 19:57:48.281800 140299792009088 model_lib_v2.py:707] Step 1600 per-step time 1.844s
INFO:tensorflow:{'Loss/classification_loss': 0.32862264,
 'Loss/localization_loss': 0.19054511,
 'Loss/regularization_loss': 0.16015424,
 'Loss/total_loss': 0.679322,
 'learning_rate': 0.079970405}
I1230 19:57:48.282171 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.32862264,
 'Loss/localization_loss': 0.19054511,
 'Loss/regularization_loss': 0.16015424,
 'Loss/total_loss': 0.679322,
 'learning_rate': 0.079970405}
 
 
INFO:tensorflow:Step 1700 per-step time 1.866s
I1230 20:00:54.930995 140299792009088 model_lib_v2.py:707] Step 1700 per-step time 1.866s
INFO:tensorflow:{'Loss/classification_loss': 0.30334967,
 'Loss/localization_loss': 0.24827217,
 'Loss/regularization_loss': 0.16041782,
 'Loss/total_loss': 0.7120397,
 'learning_rate': 0.07995972}
I1230 20:00:54.931398 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.30334967,
 'Loss/localization_loss': 0.24827217,
 'Loss/regularization_loss': 0.16041782,
 'Loss/total_loss': 0.7120397,
 'learning_rate': 0.07995972}
 
 
INFO:tensorflow:Step 1800 per-step time 1.878s
I1230 20:04:02.684715 140299792009088 model_lib_v2.py:707] Step 1800 per-step time 1.878s
INFO:tensorflow:{'Loss/classification_loss': 0.29834068,
 'Loss/localization_loss': 0.1236572,
 'Loss/regularization_loss': 0.16077736,
 'Loss/total_loss': 0.58277524,
 'learning_rate': 0.0799474}
I1230 20:04:02.685132 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.29834068,
 'Loss/localization_loss': 0.1236572,
 'Loss/regularization_loss': 0.16077736,
 'Loss/total_loss': 0.58277524,
 'learning_rate': 0.0799474}
 
 
INFO:tensorflow:Step 1900 per-step time 1.853s
I1230 20:07:07.961340 140299792009088 model_lib_v2.py:707] Step 1900 per-step time 1.853s
INFO:tensorflow:{'Loss/classification_loss': 0.3075142,
 'Loss/localization_loss': 0.17589329,
 'Loss/regularization_loss': 0.16102299,
 'Loss/total_loss': 0.6444305,
 'learning_rate': 0.07993342}
I1230 20:07:07.961811 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.3075142,
 'Loss/localization_loss': 0.17589329,
 'Loss/regularization_loss': 0.16102299,
 'Loss/total_loss': 0.6444305,
 'learning_rate': 0.07993342}
 
 
INFO:tensorflow:Step 2000 per-step time 1.860s
I1230 20:10:13.916106 140299792009088 model_lib_v2.py:707] Step 2000 per-step time 1.860s
INFO:tensorflow:{'Loss/classification_loss': 0.18447536,
 'Loss/localization_loss': 0.14183624,
 'Loss/regularization_loss': 0.16143182,
 'Loss/total_loss': 0.4877434,
 'learning_rate': 0.07991781}
I1230 20:10:13.916568 140299792009088 model_lib_v2.py:708] {'Loss/classification_loss': 0.18447536,
 'Loss/localization_loss': 0.14183624,
 'Loss/regularization_loss': 0.16143182,
 'Loss/total_loss': 0.4877434,
 'learning_rate': 0.07991781}
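
The log above is easier to compare when it is reduced to one number per step. The sketch below pulls the total loss per step out of log text in this layout with regular expressions; `total_loss_by_step` is a hypothetical helper, and the sample is an abridged excerpt of the first and last entries shown above.

```python
import re

def total_loss_by_step(log_text):
    """Map training step -> total loss for log text in the layout shown above."""
    losses = {}
    step = None
    for line in log_text.splitlines():
        step_match = re.search(r"Step (\d+) per-step time", line)
        if step_match:
            step = int(step_match.group(1))
        loss_match = re.search(r"'Loss/total_loss': ([0-9.]+)", line)
        if loss_match and step is not None:
            # Every entry is logged twice; the duplicate overwrites with the same value
            losses[step] = float(loss_match.group(1))
    return losses

sample_log = """INFO:tensorflow:Step 100 per-step time 2.361s
INFO:tensorflow:{'Loss/classification_loss': 0.53248906,
 'Loss/total_loss': 1.126033,
 'learning_rate': 0.0319994}
INFO:tensorflow:Step 2000 per-step time 1.860s
INFO:tensorflow:{'Loss/classification_loss': 0.18447536,
 'Loss/total_loss': 0.4877434,
 'learning_rate': 0.07991781}
"""
losses = total_loss_by_step(sample_log)
print(losses)
```

In this excerpt the total loss goes from about 1.13 at step 100 to about 0.49 at step 2000, although the full log shows that the values in between fluctuate considerably.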

# Evaluate the Model

In [None]:
command = "python {} --model_dir={} --pipeline_config_path={} --checkpoint_dir={}".format(TRAINING_SCRIPT, CHECKPOINT_PATH, PIPELINE_CONFIG, CHECKPOINT_PATH)

In [None]:
command

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.019

 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.055
 
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.007
 
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.006
 
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.024
 
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.128
 
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.059
 
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.150
 
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.208
 
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.005
 
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.250
 
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.592
 
INFO:tensorflow:Eval metrics at step 2000
I1231 18:49:47.775108 139774748125056 model_lib_v2.py:1015] Eval metrics at step 2000
INFO:tensorflow:	+ DetectionBoxes_Precision/mAP: 0.019225
I1231 18:49:47.794445 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Precision/mAP: 0.019225


INFO:tensorflow:	+ DetectionBoxes_Precision/mAP@.50IOU: 0.055222
I1231 18:49:47.796933 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Precision/mAP@.50IOU: 0.055222
INFO:tensorflow:	+ DetectionBoxes_Precision/mAP@.75IOU: 0.006860
I1231 18:49:47.799041 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Precision/mAP@.75IOU: 0.006860


INFO:tensorflow:	+ DetectionBoxes_Precision/mAP (small): 0.005941
I1231 18:49:47.801052 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Precision/mAP (small): 0.005941
INFO:tensorflow:	+ DetectionBoxes_Precision/mAP (medium): 0.023949
I1231 18:49:47.803217 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Precision/mAP (medium): 0.023949


INFO:tensorflow:	+ DetectionBoxes_Precision/mAP (large): 0.127765
I1231 18:49:47.805278 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Precision/mAP (large): 0.127765
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@1: 0.058537
I1231 18:49:47.807327 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Recall/AR@1: 0.058537


INFO:tensorflow:	+ DetectionBoxes_Recall/AR@10: 0.150000
I1231 18:49:47.809374 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Recall/AR@10: 0.150000
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100: 0.207520
I1231 18:49:47.811417 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Recall/AR@100: 0.207520


INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100 (small): 0.004687
I1231 18:49:47.813374 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Recall/AR@100 (small): 0.004687
INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100 (medium): 0.250448
I1231 18:49:47.815356 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Recall/AR@100 (medium): 0.250448


INFO:tensorflow:	+ DetectionBoxes_Recall/AR@100 (large): 0.592308
I1231 18:49:47.817360 139774748125056 model_lib_v2.py:1018] 	+ DetectionBoxes_Recall/AR@100 (large): 0.592308
INFO:tensorflow:	+ Loss/localization_loss: 0.681640
I1231 18:49:47.818885 139774748125056 model_lib_v2.py:1018] 	+ Loss/localization_loss: 0.681640


INFO:tensorflow:	+ Loss/classification_loss: 6.248461
I1231 18:49:47.820402 139774748125056 model_lib_v2.py:1018] 	+ Loss/classification_loss: 6.248461
INFO:tensorflow:	+ Loss/regularization_loss: 0.161301


I1231 18:49:47.821904 139774748125056 model_lib_v2.py:1018] 	+ Loss/regularization_loss: 0.161301
INFO:tensorflow:	+ Loss/total_loss: 7.091401
I1231 18:49:47.823434 139774748125056 model_lib_v2.py:1018] 	+ Loss/total_loss: 7.091401
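
The precision and recall figures above are all defined in terms of IoU (intersection over union) between a predicted box and a ground-truth box. As a reminder of what that quantity is, here is a minimal pure-Python sketch for two boxes in `(xmin, ymin, xmax, ymax)` form; it illustrates the definition only and is not the evaluation code used by the API.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height of the overlap are zero when the boxes do not intersect
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

truth = (0, 0, 10, 10)
shifted = (5, 0, 15, 10)        # same size, shifted by half its width
print(iou(truth, truth))
print(iou(truth, shifted))
print(iou(truth, (20, 20, 30, 30)))
```

A prediction shifted by half a box width overlaps the target by only one third, so it would not count as a match at the IoU=0.50 threshold. This helps explain why small objects, where a few pixels of error matter a lot, score so much lower than large ones.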

In [None]:
from IPython.display import Image
Image(filename='../input/mobilenetresulttensorflowbeef/result/board1.png')

In [None]:
Image(filename='../input/mobilenetresulttensorflowbeef/result/board2.png')

In [None]:
Image(filename='../input/mobilenetresulttensorflowbeef/result/board3.png')

In [None]:
Image(filename='../input/mobilenetresulttensorflowbeef/result/board4.png')

In [None]:
Image(filename='../input/mobilenetresulttensorflowbeef/result/board5.png')

In [None]:
Image(filename='../input/mobilenetresulttensorflowbeef/result/board6.png')

In [None]:
from shutil import rmtree
rmtree('./mobilenet')

Now let's look at **YOLO V5**

![yolo](https://user-images.githubusercontent.com/26456083/86477109-5a7ca780-bd7a-11ea-9cb7-48d9fd6848e7.jpg)

In [None]:
# Let's first download YOLOv5
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5

# Install dependencies
%pip install -r requirements.txt

# Go back to the parent directory
%cd ../
import torch
print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

In [None]:
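!pip install wandb
import wandb
# Do not hardcode the API key in a shared notebook: wandb.login() reads the
# WANDB_API_KEY environment variable, or prompts for the key if it is not set.
wandb.login()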

In [None]:
# `sequence` is the ID of a gap-free subset of a given video
print(df_train['sequence'].unique())
# Fold index for every row; -1 means "not assigned yet"
df_train['folds'] = -1

In [None]:
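# Let's split the dataset now.
from sklearn.model_selection import GroupKFold

# Group by sequence so that frames from one sequence never end up in both train and validation
split = GroupKFold(n_splits=4)
X = df_train
Y = df_train.video_id.to_list()
groups = df_train.sequence.to_list()
for fold, (train_idx, val_idx) in enumerate(split.split(X, Y, groups)):
    # val_idx holds positional indices, so this relies on df_train having the default 0..n-1 index
    df_train.loc[val_idx, 'folds'] = fold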

df_train['folds'].value_counts()
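Because the split is grouped by `sequence`, every sequence should end up in exactly one fold. The sketch below is one quick way to check that on any pair of group/fold columns. The helper names `folds_per_group` and `leaking_groups` and the toy values are illustrative assumptions, not part of the original notebook.

```python
from collections import defaultdict

def folds_per_group(groups, folds):
    """Map each group ID to the set of fold indices its rows were assigned to."""
    seen = defaultdict(set)
    for group, fold in zip(groups, folds):
        seen[group].add(fold)
    return dict(seen)

def leaking_groups(groups, folds):
    """Return the sorted group IDs that appear in more than one fold."""
    return sorted(g for g, f in folds_per_group(groups, folds).items() if len(f) > 1)

# Toy data (hypothetical): sequences 10 and 11 each stay in one fold, 12 is split.
toy_groups = [10, 10, 11, 11, 12, 12]
toy_folds = [0, 0, 1, 1, 2, 3]
print(leaking_groups(toy_groups, toy_folds))  # [12]
```

On the real data this could be called as `leaking_groups(df_train['sequence'], df_train['folds'])`; an empty list means no sequence is shared between the training and validation folds.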

In [None]:
df_train.sample(5)

In [None]:
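# Some hyperparameters
BATCH = 16        # batch size
IMG_SHAPE = 1280  # image size passed to --img
VAL_FOLD = 2      # fold held out for validation
EPOCH = 5         # number of training epochs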

In [None]:
# Let's first create the directories that will hold our images and labels.
os.makedirs('Starfish/images/train', exist_ok=True)
os.makedirs('Starfish/images/val', exist_ok=True)
os.makedirs('Starfish/labels/train', exist_ok=True)
os.makedirs('Starfish/labels/val', exist_ok=True)

In [None]:
from shutil import copyfile
from tqdm import tqdm

# Copy every image into the train or val folder, depending on its fold
for idx in tqdm(range(df_train.shape[0])):
    instance = df_train.iloc[idx]
    if instance["folds"] != VAL_FOLD:
        copyfile(f'{instance["path"]}.jpg', 'Starfish/images/train/{}.jpg'.format(instance["image_id"]))
    else:
        copyfile(f'{instance["path"]}.jpg', 'Starfish/images/val/{}.jpg'.format(instance["image_id"]))

In [None]:
# Let's analyse the instances
TRAIN_PATH = '/kaggle/working/Starfish/images/train'
VAL_PATH = '/kaggle/working/Starfish/images/val'

print('Number of training instances are: {}'.format(len(os.listdir(TRAIN_PATH))))
print('Number of validation instances are: {}'.format(len(os.listdir(VAL_PATH))))

The next step is to create a YAML file, which is the configuration file that YOLO V5 reads.
At a minimum it contains three things:

1. The paths to the train/val/test images, which can also be given as .txt files.

2. The number of classes.

3. The list of class names.
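As a rough illustration of those three items, the sketch below checks a config dictionary before it is written to YAML. The function name `check_yolo_config` and the example paths are hypothetical; the key names `train`, `val`, `nc` and `names` are the ones used in the config cell of this notebook, and the check assumes `names` is given as a list.

```python
def check_yolo_config(config):
    """Return a list of problems found in a YOLOv5-style dataset config dict."""
    problems = []
    # The minimal keys described above: image paths, class count, class names
    for key in ("train", "val", "nc", "names"):
        if key not in config:
            problems.append(f"missing key: {key}")
    # The class count should match the number of class names
    if "nc" in config and "names" in config:
        if config["nc"] != len(config["names"]):
            problems.append(
                f"nc is {config['nc']} but {len(config['names'])} class names were given"
            )
    return problems

# Hypothetical example values
example = {"train": "images/train", "val": "images/val", "nc": 1, "names": ["COTS"]}
print(check_yolo_config(example))  # []

broken = {"train": "images/train", "nc": 2, "names": ["COTS"]}
print(check_yolo_config(broken))
```

An empty list means the dictionary has the minimal fields and a consistent class count.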

In [None]:
import yaml,glob

with open('/kaggle/working/train.txt', 'w') as work_train:
    for path in glob.glob(f"{TRAIN_PATH}/*"):
        work_train.write(path+'\n')
    
with open('/kaggle/working/val.txt', 'w') as work_val:
    for path in glob.glob(f"{VAL_PATH}/*"):
        work_val.write(path+'\n')

In [None]:
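config = dict(   # keys follow the dataset YAML format from the YOLOv5 GitHub repo
    train = TRAIN_PATH,  # folder with the training images
    val = VAL_PATH,      # folder with the validation images
    nc = 1,              # number of classes
    names = ['COTS']     # class names
)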

In [None]:
with open("./yolov5/data/data.yaml",'w') as configuration:
    yaml.dump(config, configuration, default_flow_style=False)

# If you want collections to be always serialized in the block style,
#    set the parameter default_flow_style of dump() to False

In [None]:
os.listdir("./yolov5/data")

# Create Labels for YOLOv5
To label your images, create a `.txt` file with the same name as each image (if an image contains no objects, no `.txt` file is required). The `.txt` file specifications are:

One row per object.

Each row is in `class x_center y_center width height` format.

Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide x_center and width by the image width, and y_center and height by the image height.

Class numbers are zero-indexed (start from 0).
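The normalization rule above can be sketched as a small function. It assumes each box is given in pixels as a top-left corner plus width and height (`x_min`, `y_min`, `box_w`, `box_h`), which is an assumption about the input format. The function names `to_yolo_xywh` and `to_label_row` are hypothetical, and the 1280x720 defaults are the image size used in this notebook. Clipping the corners to the image before converting is a design choice of this sketch.

```python
def to_yolo_xywh(x_min, y_min, box_w, box_h, img_w=1280, img_h=720):
    """Convert a pixel box (top-left corner + size) to normalized YOLO xywh."""
    # Clip both corners to the image so the result always stays within [0, 1]
    x1 = min(max(x_min, 0), img_w)
    y1 = min(max(y_min, 0), img_h)
    x2 = min(max(x_min + box_w, 0), img_w)
    y2 = min(max(y_min + box_h, 0), img_h)
    # x values are divided by the image width, y values by the image height
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    return x_center, y_center, (x2 - x1) / img_w, (y2 - y1) / img_h

def to_label_row(class_id, box):
    """Format one object as a 'class x_center y_center width height' row."""
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in box)

row = to_label_row(0, to_yolo_xywh(100, 50, 200, 100))
print(row)  # 0 0.156250 0.138889 0.156250 0.138889
```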


📍 Note: We don't have to remove the images without bounding boxes from the training or validation sets.

In [None]:
df_train.sample(2)

In [None]:
All_boxes = []
for idx in tqdm(range(df_train.shape[0])):
    instance = df_train.iloc[idx]

    image_id = instance['image_id']
    height = 720   # image height in pixels
    width = 1280   # image width in pixels
    # np.float was removed in NumPy 1.24; the builtin float is equivalent
    boxes = np.array(instance['box_location']).astype(float).copy()
    labels = [0] * len(boxes)  # single class: COTS -> 0
    if instance["folds"] != VAL_FOLD:
        filename = '/kaggle/working/Starfish/labels/train/{}.txt'.format(image_id)
    else:
        filename = '/kaggle/working/Starfish/labels/val/{}.txt'.format(image_id)

    with open(filename, 'w') as file:
        # normalize_image is not defined in this section; it must already exist in the notebook
        normalized_boxes = normalize_image(boxes)
        cliped_boxes = np.clip(normalized_boxes, 0, 1)
        All_boxes.extend(cliped_boxes)
        for label, box in zip(labels, cliped_boxes):
            # One row per object: class x_center y_center width height
            bb = ' '.join(f'{value:.6f}' for value in box)
            file.write(f'{label} {bb}\n')

In [None]:
train_instances = len(os.listdir('/kaggle/working/Starfish/labels/train'))
val_instances = len(os.listdir('/kaggle/working/Starfish/labels/val'))
print('training instances are: {}'.format(train_instances))
print('validation instances are: {}'.format(val_instances))
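Counting the files does not tell us whether their contents are well formed. A minimal check for a single label row, following the format described earlier (five fields, an integer class, coordinates between 0 and 1), could look like this. `is_valid_label_row` is a hypothetical helper, not part of YOLOv5.

```python
def is_valid_label_row(row, num_classes=1):
    """Check one 'class x_center y_center width height' row of a label file."""
    fields = row.split()
    if len(fields) != 5:
        return False
    try:
        class_id = int(fields[0])
        coords = [float(v) for v in fields[1:]]
    except ValueError:
        # Non-numeric field, or a class ID that is not an integer
        return False
    # Class IDs are zero-indexed; coordinates must be normalized
    return 0 <= class_id < num_classes and all(0.0 <= v <= 1.0 for v in coords)

print(is_valid_label_row("0 0.5 0.5 0.1 0.2"))  # True
print(is_valid_label_row("0 0.5 0.5 1.2 0.2"))  # False
print(is_valid_label_row("0 0.5 0.5"))          # False
```

Running it over every line of a few files in `Starfish/labels/train` is a cheap way to catch formatting problems before training starts.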

**Training with W&B**

In [None]:
%cd yolov5/

In [None]:
import torch
torch.cuda.empty_cache()

In [None]:
!python train.py --img {IMG_SHAPE} \
                 --batch {BATCH} \
                 --epochs {EPOCH} \
                 --data data.yaml \
                 --weights yolov5s.pt \
                 --project tf-reef

In [None]:
# The working directory is already yolov5 (see the %cd above), so list it directly
os.listdir(".")
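YOLOv5 saves each training run under `<project>/<name>`, where the default name is `exp` and later runs get an incremented suffix (`exp2`, `exp3`, ...), with the checkpoints in a `weights` subfolder. Assuming that layout and the `tf-reef` project name passed to `train.py`, the most recent `best.pt` could be located as sketched below. `latest_best_weights` is a hypothetical helper.

```python
from pathlib import Path

def latest_best_weights(project_dir):
    """Return the most recently modified <project>/exp*/weights/best.pt, or None."""
    candidates = list(Path(project_dir).glob("exp*/weights/best.pt"))
    if not candidates:
        return None
    # Pick the checkpoint that was written last
    return max(candidates, key=lambda p: p.stat().st_mtime)

# Prints None if no run has finished yet
print(latest_best_weights("tf-reef"))
```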

![work in progress](https://simpro.it/wp-content/uploads/2015/07/Work-in-progress-1024x603.png)