# TensorFlow - Help Protect the Great Barrier Reef
### Detect crown-of-thorns starfish in underwater image data
![](https://storage.googleapis.com/kaggle-competitions/kaggle/31703/logos/header.png?t=2021-10-29-00-30-04)

## Data Description:
In this competition, you will predict the presence and position of crown-of-thorns starfish in sequences of underwater images taken at various times and locations around the Great Barrier Reef. Predictions take the form of a bounding box together with a confidence score for each identified starfish. An image may contain zero or more starfish.
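
As a rough illustration of what one prediction looks like, the sketch below joins detections into a single space-separated string of `confidence x_min y_min width height` groups. This layout is an assumption based on the `format_prediction` helper used later in this notebook; the function name `format_boxes` and the sample numbers are hypothetical, and the Evaluation page remains the authoritative description.

```python
def format_boxes(boxes, scores):
    """Join (score, x_min, y_min, width, height) groups into one space-separated string."""
    parts = []
    for (x_min, y_min, width, height), score in zip(boxes, scores):
        parts.append(f"{score:.2f} {x_min} {y_min} {width} {height}")
    return " ".join(parts)

# Two hypothetical detections in a single frame
example = format_boxes([(100, 200, 50, 40), (300, 120, 64, 48)], [0.91, 0.45])
print(example)

# A frame with no detections yields an empty string
empty = format_boxes([], [])
```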

This competition uses a hidden test set that will be served by an API to ensure you evaluate the images in the same order they were recorded within each video. When your submitted notebook is scored, the actual test data (including a sample submission) will be available to your notebook.

### Files
>- train_images/ - Folder containing training set photos of the form video_{video_id}/{video_frame}.jpg.
>- [train/test].csv - Metadata for the images. As with other test files, most of the test metadata is only available to your notebook upon submission; just the first few rows are available for download.
>- video_id - ID number of the video the image was part of. The video ids are not meaningfully ordered.
>- video_frame - The frame number of the image within the video. Expect to see occasional gaps in the frame number from when the diver surfaced.
>- sequence - ID of a gap-free subset of a given video. The sequence ids are not meaningfully ordered.
>- sequence_frame - The frame number within a given sequence.
>- image_id - ID code for the image, in the format '{video_id}-{video_frame}'
>- annotations - The bounding boxes of any starfish detections in a string format that can be evaluated directly with Python. Does not use the same format as the predictions you will submit. Not available in test.csv. A bounding box is described by the pixel coordinate (x_min, y_min) of its upper left corner within the image together with its width and height in pixels.
>- example_sample_submission.csv - A sample submission file in the correct format. The actual sample submission will be provided by the API; this is only provided to illustrate how to properly format predictions. The submission format is further described on the Evaluation page.
>- example_test.npy - Sample data that will be served by the example API.
>- greatbarrierreef - The image delivery API that will serve the test set pixel arrays. You may need Python 3.7 and a Linux environment to run the example offline without errors.
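
To make the `annotations` column concrete, here is a minimal sketch of parsing one value into a list of [x_min, y_min, width, height] boxes. The string `raw` is a made-up example in the format described above, and `ast.literal_eval` is used here as a safer stand-in for `eval`, since it only accepts Python literals.

```python
import ast

# Hypothetical value of the `annotations` column for one frame
raw = "[{'x': 559, 'y': 213, 'width': 50, 'height': 32}, {'x': 10, 'y': 20, 'width': 30, 'height': 40}]"

# Parse the string into a list of dicts without executing arbitrary code
boxes = ast.literal_eval(raw)

# Keep only the four numbers per box, in [x_min, y_min, width, height] order
coco = [[b["x"], b["y"], b["width"], b["height"]] for b in boxes]
print(coco)

# Frames without starfish hold an empty list
n_empty = len(ast.literal_eval("[]"))
```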

### Time-series API Details
The API serves the images one by one, in order by video and frame number, as pixel arrays.

Expect to see roughly 13,000 images in the test set.

The API will require roughly two GB of memory after initialization. The initialization step (env.iter_test()) will require meaningfully more memory than that; we recommend you do not load your model until after making that call. The API will also consume less than ten minutes of runtime for loading and serving the data.

Awesome notebooks referenced:
>- https://www.kaggle.com/andradaolteanu/greatbarrierreef-yolo-full-guide-train-infer
>- https://www.kaggle.com/awsaf49/great-barrier-reef-yolov5-train
>- https://www.kaggle.com/diegoalejogm/great-barrier-reefs-eda-with-animations

# Import packages

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
import cv2
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import matplotlib.patches as patches
from plotly.offline import plot, iplot, init_notebook_mode
init_notebook_mode(connected=True)
import glob
import shutil
import sys
from joblib import Parallel, delayed
from IPython.display import display
from tqdm import tqdm
from matplotlib import animation, rc
from PIL import Image, ImageDraw
import ast
import yaml
import torch
import gc
from torchvision.ops import box_iou

SEED = 42
np.random.seed(SEED)

# Load data

In [None]:
train_data = pd.read_csv('../input/tensorflow-great-barrier-reef/train.csv')
test_data = pd.read_csv('../input/tensorflow-great-barrier-reef/test.csv')
sample_submission_data = pd.read_csv('../input/tensorflow-great-barrier-reef/example_sample_submission.csv')

# EDA

In [None]:
print(train_data.shape)
train_data.head()

In [None]:
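# Count the annotations (bounding boxes) in each image.
# ast.literal_eval parses the string safely, without executing arbitrary code.
train_data["no_annotations"] = train_data["annotations"].apply(lambda x: len(ast.literal_eval(x)))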

In [None]:
print(test_data.shape)
test_data.head()

In [None]:
print(sample_submission_data.shape)
sample_submission_data.head()

## Distribution by # images

In [None]:
# Name the columns explicitly so the result does not depend on the pandas version
temp = train_data["no_annotations"].value_counts().rename_axis('# Annotations').reset_index(name='# Images')
fig = px.bar(temp, x='# Annotations', y='# Images')
fig.update_yaxes(title="# Images")
fig.show()

In [None]:
temp_1 = train_data.groupby(['video_id']).agg({'image_id':'nunique'}).reset_index().rename(columns = {'image_id':'# Images'})
temp_2 = train_data[(train_data['annotations'].apply(lambda x: len(x))>2)].groupby(['video_id']).agg({'image_id':'nunique'}).reset_index().rename(columns = {'image_id':'# Images with annotations'})
temp = temp_1.merge(temp_2, how = 'left', on = 'video_id')
print('% Images with bounding boxes:', temp['# Images with annotations'].sum()/temp['# Images'].sum() * 100, '%')
temp['% Images with annotations'] = temp['# Images with annotations']/temp['# Images'] * 100
temp['# Images without annotations'] = temp['# Images'] - temp['# Images with annotations']
fig = px.bar(temp, x='video_id', y=['# Images without annotations', '# Images with annotations'])
fig.update_yaxes(title="# Images")
fig.show()

In [None]:
temp_1 = train_data.groupby(['video_id', 'sequence']).agg({'image_id':'nunique'}).reset_index().rename(columns = {'image_id':'# Images'})
temp_2 = train_data[(train_data['annotations'].apply(lambda x: len(x))>2)].groupby(['video_id', 'sequence']).agg({'image_id':'nunique'}).reset_index().rename(columns = {'image_id':'# Images with annotations'})
temp = temp_1.merge(temp_2, how = 'left', on = ['video_id', 'sequence'])
# Sequences without any annotated image are missing from temp_2, so fill the gaps after the merge
temp['# Images with annotations'] = temp['# Images with annotations'].fillna(0)
temp['% Images with annotations'] = temp['# Images with annotations']/temp['# Images'] * 100
temp.sort_values(by = ['% Images with annotations'], ascending = False)

## Check if all images for a video are present in the input directory

In [None]:
for i in train_data['video_id'].unique().tolist():
    all_images_for_video_id = [int(x.split('.jpg')[0]) for x in os.listdir(f'../input/tensorflow-great-barrier-reef/train_images/video_{i}')]
    print('\n Number of images in train.csv for video id', i, ':', train_data[train_data['video_id']==i].shape[0])
    print('\n Number of images in train folder for video id', i, ':', len(all_images_for_video_id))
    all_images_for_video_id_in_train_csv = train_data[train_data['video_id']==i]['video_frame'].tolist()
    all_images_for_video_id_in_train_csv_not_in_train_folder = [x for x in all_images_for_video_id_in_train_csv if x not in all_images_for_video_id]
    print('\n Number of images in train.csv but not in train folder for video id', i, ':', len(all_images_for_video_id_in_train_csv_not_in_train_folder))
    all_images_for_video_id_in_train_folder_not_in_train_csv = [x for x in all_images_for_video_id if x not in all_images_for_video_id_in_train_csv]
    print('\n Number of images in train folder but not in train csv for video id', i, ':', len(all_images_for_video_id_in_train_folder_not_in_train_csv))
    print('---'*50)

## See sample images

In [None]:
video_id = 0
temp = train_data[(train_data['video_id']==video_id) & (train_data['annotations'].apply(lambda x: len(x))>2)].reset_index(drop = True)
temp.tail()

In [None]:
sequence_id = 996
temp[temp['sequence']==sequence_id].head()

In [None]:
temp[temp['no_annotations']>4].head()

OpenCV's imread, imwrite and imshow all work with BGR channel order, so there is no need to reorder the channels when you read an image with cv2.imread and then show it with cv2.imshow.

While BGR is used consistently throughout OpenCV, most other image processing libraries use the RGB ordering. If you want to use matplotlib's imshow but read the image with OpenCV, you would need to convert from BGR to RGB.
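
The channel swap can be seen on a tiny array without loading any file. This is a minimal sketch using NumPy only; the two-pixel "image" is made up for illustration. Reversing the last axis has the same effect on pixel values as `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.

```python
import numpy as np

# A 1x2 "image" in BGR order: one pure-blue pixel and one pure-red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the B and R channels (BGR -> RGB)
rgb = bgr[..., ::-1]
print(rgb[0, 0], rgb[0, 1])
```

Note that the slice returns a view with a negative stride; wrapping it in `np.ascontiguousarray` may be needed before passing it to functions that expect contiguous memory.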

In [None]:
video_frame = 9651

img = cv2.imread('../input/tensorflow-great-barrier-reef/train_images/' + 'video_' + str(video_id) + '/' + str(video_frame) + '.jpg') 
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

annot = train_data[(train_data['video_id']==video_id) & (train_data['video_frame']==video_frame)]['annotations'].iloc[0]

# Show the image with its annotated bounding boxes
# Source: https://www.kaggle.com/andradaolteanu/greatbarrierreef-full-guide-to-bboxaugmentation

fig, axs = plt.subplots(figsize=(23, 8))

axs.imshow(img)

for a in ast.literal_eval(annot):
    rect = patches.Rectangle((a["x"], a["y"]), a["width"], a["height"], 
                             linewidth=3, edgecolor="#FF6103", facecolor='none')
    axs.add_patch(rect)

## Animation

In [None]:
train_data['annotations'] = train_data['annotations'].apply(ast.literal_eval)

In [None]:
rc('animation', html='jshtml')

v_id = 0
# num_images = train_data[train_data['video_id']==v_id]['video_frame'].tolist()
num_images = train_data[(train_data['video_id']==v_id) & (train_data['no_annotations']>0)]['video_frame'].tolist() # Only take images with annotations
# num_images = num_images[:int(len(num_images)/3)] # Take only a fraction of the images in a video

def fetch_image_list(df_tmp, video_id, num_images, start_frame_idx):
    def fetch_image(frame_id):
        path_base = '/kaggle/input/tensorflow-great-barrier-reef/train_images/video_{}/{}.jpg'
        raw_img = Image.open(path_base.format(video_id, frame_id))

        row_frame = df_tmp[(df_tmp.video_id == video_id) & (df_tmp.video_frame == frame_id)].iloc[0]
        bounding_boxes = row_frame.annotations

        for box in bounding_boxes:
            draw = ImageDraw.Draw(raw_img)
            x0, y0, x1, y1 = (box['x'], box['y'], box['x']+box['width'], box['y']+box['height'])
            draw.rectangle((x0, y0, x1, y1), outline=180, width=3)
        return raw_img

    return [np.array(fetch_image(index)) for index in num_images]

images = fetch_image_list(train_data, video_id = v_id, num_images = num_images, start_frame_idx = 0)

def create_animation(ims):
    fig = plt.figure(figsize=(9, 9))
    plt.axis('off')
    im = plt.imshow(ims[0])

    def animate_func(i):
        im.set_array(ims[i])
        return [im]

    return animation.FuncAnimation(fig, animate_func, frames = len(ims), interval = 1000//12)

create_animation(images)

In [None]:
del images

# Modelling

## YOLOv5

https://machinelearningknowledge.ai/introduction-to-yolov5-object-detection-with-tutorial/#Introduction

![](https://machinelearningknowledge.ai/ezoimgfmt/953894.smushcdn.com/2611031/wp-content/uploads/2021/06/YOLOv5-Architecture.jpg?lossy=0&strip=1&webp=1&ezimgfmt=rs:696x519/rscb1/ng:webp/ngcb1)

### Data preprocessing

In [None]:
IMAGE_DIR = '/kaggle/images' # directory to save images
LABEL_DIR = '/kaggle/labels' # directory to save labels
REMOVE_NOBBOX = True

In [None]:
train_data['old_image_path'] = train_data.apply(lambda x: '../input/tensorflow-great-barrier-reef/train_images/' + 'video_' + str(x['video_id']) + '/' + str(x['video_frame']) + '.jpg', axis = 1)
train_data['image_path']  = f'{IMAGE_DIR}/'+ train_data.image_id+'.jpg'
train_data['label_path']  = f'{LABEL_DIR}/'+ train_data.image_id+'.txt'

In [None]:
df = train_data.copy()

if REMOVE_NOBBOX:
    df = df[df['no_annotations']>0].reset_index(drop = True)

In [None]:
print(df.shape)
df.head()

In [None]:
# YOLOv5 requires write access
!mkdir -p {IMAGE_DIR}
!mkdir -p {LABEL_DIR}

def make_copy(row):
    shutil.copyfile(row.old_image_path, row.image_path)
    return

image_paths = df.old_image_path.tolist()
_ = Parallel(n_jobs=-1, backend='threading')(delayed(make_copy)(row) for _, row in tqdm(df.iterrows(), total=len(df)))

In [None]:
df['width'] = 1280
df['height'] = 720

In [None]:
def get_bbox(annots):
    bboxes = [list(annot.values()) for annot in annots]
    return bboxes

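def get_imgsize(row):
    # Optional helper (not called below); it needs the third-party `imagesize` package
    import imagesize
    row['width'], row['height'] = imagesize.get(row['image_path'])
    return row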

df['bboxes'] = df.annotations.apply(get_bbox)
df.head(2)

### Required data structure

Label structure:
    
>- COCO: [xmin, ymin, w, h]
>- VOC: [xmin, ymin, xmax, ymax]
>- YOLO input: [xmid, ymid, w, h] (normalized)
>- YOLO output: [xmin, ymin, xmax, ymax] --> VOC

We need to export our labels to YOLO format, with one *.txt file per image (if no objects in image, no *.txt file is required). 

The txt file specifications are:
>- One row per object
>- Each row is in `class x_center y_center width height` format.
>- Box coordinates must be in normalized xywh format (from 0 - 1). If your boxes are in pixels, divide x_center and width by image width, and y_center and height by image height.
>- Class numbers are zero-indexed (start from 0).
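
The specification above can be sketched as a small formatter that turns already-normalized boxes into label-file text, then reads the text back. The function name `yolo_label_lines`, the six-decimal precision, and the sample boxes are illustrative choices, not part of YOLOv5.

```python
def yolo_label_lines(boxes, class_id=0):
    """Build label-file text: one 'class x_center y_center width height' row per box."""
    rows = []
    for x_c, y_c, w, h in boxes:
        rows.append(f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return "\n".join(rows)

# Two hypothetical boxes, already normalized to the 0 - 1 range
label_text = yolo_label_lines([(0.5, 0.5, 0.25, 0.1), (0.1, 0.2, 0.05, 0.05)])
print(label_text)

# Reading the text back: drop the class id and keep the four floats per row
parsed = [[float(v) for v in line.split()[1:]] for line in label_text.splitlines()]
```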

The competition's bounding boxes are in COCO format, [x_min, y_min, width, height], so we need to convert from COCO to YOLO format.

![](https://editor.analyticsvidhya.com/uploads/95552Screenshot%202021-08-25%20at%2012.06.06%20AM.png)
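
As a worked example of the conversion, the sketch below converts one COCO box to YOLO format and back for a 1280x720 frame. The function names `coco_to_yolo` and `yolo_to_coco` and the sample box are hypothetical, single-box simplifications written for illustration.

```python
import numpy as np

def coco_to_yolo(box, image_width=1280, image_height=720):
    """[x_min, y_min, w, h] in pixels -> normalized [x_center, y_center, w, h]."""
    x_min, y_min, w, h = box
    return np.array([(x_min + w / 2) / image_width,
                     (y_min + h / 2) / image_height,
                     w / image_width,
                     h / image_height])

def yolo_to_coco(box, image_width=1280, image_height=720):
    """Normalized [x_center, y_center, w, h] -> [x_min, y_min, w, h] in pixels."""
    x_c, y_c, w, h = box
    return np.array([(x_c - w / 2) * image_width,
                     (y_c - h / 2) * image_height,
                     w * image_width,
                     h * image_height])

# A box covering the central quarter of the frame
yolo_box = coco_to_yolo([320, 180, 640, 360])
round_trip = yolo_to_coco(yolo_box)
print(yolo_box, round_trip)
```

The center of this box is (320 + 640/2, 180 + 360/2) = (640, 360), which is exactly half of the frame width and height, so every normalized value is 0.5.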

In [None]:
# Helper functions
def voc2yolo(bboxes, image_height=720, image_width=1280):
    """
    voc  => [x1, y1, x2, y2]
    yolo => [xmid, ymid, w, h] (normalized)
    """
    
    bboxes = bboxes.copy().astype(float) # otherwise all value will be 0 as voc_pascal dtype is np.int
    
    bboxes[..., [0, 2]] = bboxes[..., [0, 2]]/ image_width
    bboxes[..., [1, 3]] = bboxes[..., [1, 3]]/ image_height
    
    w = bboxes[..., 2] - bboxes[..., 0]
    h = bboxes[..., 3] - bboxes[..., 1]
    
    bboxes[..., 0] = bboxes[..., 0] + w/2
    bboxes[..., 1] = bboxes[..., 1] + h/2
    bboxes[..., 2] = w
    bboxes[..., 3] = h
    
    return bboxes

def yolo2voc(bboxes, image_height=720, image_width=1280):
    """
    yolo => [xmid, ymid, w, h] (normalized)
    voc  => [x1, y1, x2, y2]
    
    """ 
    bboxes = bboxes.copy().astype(float) # otherwise all value will be 0 as voc_pascal dtype is np.int
    
    bboxes[..., [0, 2]] = bboxes[..., [0, 2]]* image_width
    bboxes[..., [1, 3]] = bboxes[..., [1, 3]]* image_height
    
    bboxes[..., [0, 1]] = bboxes[..., [0, 1]] - bboxes[..., [2, 3]]/2
    bboxes[..., [2, 3]] = bboxes[..., [0, 1]] + bboxes[..., [2, 3]]
    
    return bboxes

def coco2yolo(bboxes, image_height=720, image_width=1280):
    """
    coco => [xmin, ymin, w, h]
    yolo => [xmid, ymid, w, h] (normalized)
    """
    
    bboxes = bboxes.copy().astype(float) # otherwise all value will be 0 as voc_pascal dtype is np.int
    
    # normalizing
    bboxes[..., [0, 2]]= bboxes[..., [0, 2]]/ image_width
    bboxes[..., [1, 3]]= bboxes[..., [1, 3]]/ image_height
    
    # conversion (xmin, ymin) => (xmid, ymid)
    bboxes[..., [0, 1]] = bboxes[..., [0, 1]] + bboxes[..., [2, 3]]/2
    
    return bboxes

def yolo2coco(bboxes, image_height=720, image_width=1280):
    """
    yolo => [xmid, ymid, w, h] (normalized)
    coco => [xmin, ymin, w, h]
    
    """ 
    bboxes = bboxes.copy().astype(float) # otherwise all value will be 0 as voc_pascal dtype is np.int
    
    # denormalizing
    bboxes[..., [0, 2]]= bboxes[..., [0, 2]]* image_width
    bboxes[..., [1, 3]]= bboxes[..., [1, 3]]* image_height
    
    # conversion (xmid, ymid) => (xmin, ymin)
    bboxes[..., [0, 1]] = bboxes[..., [0, 1]] - bboxes[..., [2, 3]]/2
    
    return bboxes

def voc2coco(bboxes, image_height=720, image_width=1280):
    bboxes  = voc2yolo(bboxes, image_height, image_width)
    bboxes  = yolo2coco(bboxes, image_height, image_width)
    return bboxes

def load_image(image_path):
    return cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)

In [None]:
cnt = 0
all_bboxes = []
bboxes_info = []
for row_idx in tqdm(range(df.shape[0])):
    row = df.iloc[row_idx]
    image_height = row.height
    image_width  = row.width
    bboxes_coco  = np.array(row.bboxes).astype(np.float32).copy()
    num_bbox     = len(bboxes_coco)
    names        = ['cots']*num_bbox
    labels       = np.array([0]*num_bbox)[..., None].astype(str)
    ## Create Annotation(YOLO)
    with open(row.label_path, 'w') as f:
        if num_bbox<1:
            annot = ''
            f.write(annot)
            cnt+=1
            continue
        
        bboxes_yolo  = coco2yolo(bboxes_coco, image_height, image_width)
        all_bboxes.append(bboxes_yolo.astype(float))
        annots = []
        
        # Write annotations in file
        for i in range(num_bbox):
            annot = ["0"] + \
                    bboxes_yolo[i].astype(str).tolist()
            
            annots.append(annot)
        string = '\n'.join([' '.join(annot) for annot in annots])
        f.write(string.strip())
            
df["yolo_bbox"] = all_bboxes
print('Missing:',cnt)

In [None]:
# Let's read one of the label files (the with-block closes the file afterwards)
with open('/kaggle/labels/0-8948.txt', 'r') as f1:
    label_text = f1.read()
label_text

In [None]:
print('Total # images:', len(os.listdir("/kaggle/images")))

In [None]:
# Retrieve a sample of data
i = 1008
images = os.listdir("/kaggle/images")[i:i+6]
vid_id = [im.split("-")[0] for im in images]
seq_id = [im.split("-")[1].split(".")[0] for im in images]

# Plot
fig, axs = plt.subplots(2, 3, figsize=(23, 10))
axs = axs.flatten()
fig.suptitle(f"Sample of images and YOLO bounding boxes", fontsize = 20)

for k in range(6):
    
    # Get the data
    im = cv2.imread(f"/kaggle/images/{images[k]}")
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
    dh, dw, _ = im.shape
    txt = open(f"/kaggle/labels/{vid_id[k]}-{seq_id[k]}.txt", "r").read().split('\n')
    txt = ' '.join(txt).split(' ')[1:]
    no_boxes = int((len(txt)+1)/5)
    
    # Draw boxes
    i = 0
    num_box = 0
    while num_box < no_boxes:
        i = i+4
        num_box += 1
        box = txt[:i][-4:]
        i = i + 1
        
        # Src: https://github.com/pjreddie/darknet/blob/810d7f797bdb2f021dbe65d2524c2ff6b8ab5c8b/src/image.c#L283-L291
        # from YOLO to COCO
        x, y, w, h = box
        x, y, w, h = float(x), float(y), float(w), float(h)

        l = int((x - w / 2) * dw)
        r = int((x + w / 2) * dw)
        t = int((y - h / 2) * dh)
        b = int((y + h / 2) * dh)

        if l < 0: l = 0
        if r > dw - 1: r = dw - 1
        if t < 0: t = 0
        if b > dh - 1: b = dh - 1

        cv2.rectangle(im, (l, t), (r, b), (255,0,0), 3)
        
    # Show image with bboxes
    axs[k].set_title(f"Sample {k}", fontsize = 14)
    axs[k].imshow(im)
    axs[k].set_axis_off()

plt.tight_layout()
plt.show()

### YOLO configuration and training

In [None]:
# Video IDs 0 and 1 have the highest # annotated images, so they are used for training.
# Video 2 is held out for validation so that no frame appears in both splits, although this leaves the validation set somewhat small.
train_ids = [0,1]
val_ids = [2]

# Get path to images & labels
train_images = list(df[df['video_id'].isin(train_ids)]["image_path"])
train_labels = list(df[df['video_id'].isin(train_ids)]["label_path"])

val_images = list(df[df['video_id'].isin(val_ids)]["image_path"])
val_labels = list(df[df['video_id'].isin(val_ids)]["label_path"])

print("Train Length:", len(df[df['video_id'].isin(train_ids)]), "\n" +
      "Val Length:", len(df[df['video_id'].isin(val_ids)]))

In [None]:
# Create train and validation path files
with open("../working/train_images.txt", "w") as file:
    for path in train_images:
        file.write(path + "\n")
        
with open("../working/val_images.txt", "w") as file:
    for path in val_images:
        file.write(path + "\n")

# Create configuration
config = {'path': '/kaggle/working',
          'train': '/kaggle/working/train_images.txt',
          'val': '/kaggle/working/val_images.txt',
          'nc': 1,
          'names': ['cots']}

with open("../working/cots.yaml", "w") as file:
    yaml.dump(config, file, default_flow_style=False)

        
print("../working AFTER:", os.listdir("../working"))

In [None]:
# ---> YOLOv5 install <---
%cd /kaggle/working     
!cp -r /kaggle/input/yolov5-lib-ds /kaggle/working/yolov5     
%cd yolov5     
%pip install -qr requirements.txt   

from yolov5 import utils
display = utils.notebook_init()

In [None]:
%%writefile /kaggle/working/hyp.yaml
lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 3.0  # warmup epochs (fractions ok)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.5  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 1.0  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
# anchors: 3  # anchors per output layer (0 to ignore)
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.10  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.5  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 0.5  # image mosaic (probability)
mixup: 0.5 # image mixup (probability)
copy_paste: 0.0  # segment copy-paste (probability)

In [None]:
MODEL     = 'yolov5s'
BATCH     = 1
EPOCHS    = 2
OPTMIZER  = 'Adam'

PROJECT   = 'great-barrier-reef-public' # w&b in yolov5

REMOVE_NOBBOX = True # remove images with no bbox
ROOT_DIR  = '/kaggle/input/tensorflow-great-barrier-reef/'

SIZE = 4000
WORKERS = 1
RUN_NAME = f"{MODEL}_size{SIZE}_epochs{EPOCHS}_batch{BATCH}_simple"

In [None]:
# Training
!python train.py --img {SIZE}\
                --batch {BATCH}\
                --epochs {EPOCHS}\
                --optimizer {OPTMIZER}\
                --data /kaggle/working/cots.yaml\
                --hyp /kaggle/working/hyp.yaml\
                --weights ../../input/yolo-5s6/best.pt\
                --workers {WORKERS}\
                --project {PROJECT}\
                --name {RUN_NAME}\
                --exist-ok

### Check model performance

In [None]:
IMG_SIZE  = 10000 
CONF      = 0.275
IOU       = 0.2

In [None]:
OUTPUT_DIR = '{}/{}'.format(PROJECT, RUN_NAME)
!ls {OUTPUT_DIR}

In [None]:
plt.figure(figsize=(12,10))
plt.axis('off')
plt.imshow(plt.imread(f'{OUTPUT_DIR}/F1_curve.png'));

In [None]:
plt.figure(figsize=(12,10))
plt.axis('off')
plt.imshow(plt.imread(f'{OUTPUT_DIR}/confusion_matrix.png'));

In [None]:
for metric in ['F1', 'PR', 'P', 'R']:
    print(f'Metric: {metric}')
    plt.figure(figsize=(12,10))
    plt.axis('off')
    plt.imshow(plt.imread(f'{OUTPUT_DIR}/{metric}_curve.png'));
    plt.show()

In [None]:
fig, ax = plt.subplots(3, 2, figsize = (2*9,3*5), constrained_layout = True)
for row in range(3):
    ax[row][0].imshow(plt.imread(f'{OUTPUT_DIR}/val_batch{row}_labels.jpg'))
    ax[row][0].set_xticks([])
    ax[row][0].set_yticks([])
    ax[row][0].set_title(f'{OUTPUT_DIR}/val_batch{row}_labels.jpg', fontsize = 12)
    
    ax[row][1].imshow(plt.imread(f'{OUTPUT_DIR}/val_batch{row}_pred.jpg'))
    ax[row][1].set_xticks([])
    ax[row][1].set_yticks([])
    ax[row][1].set_title(f'{OUTPUT_DIR}/val_batch{row}_pred.jpg', fontsize = 12)
plt.show()

In [None]:
# Change our position within the directory back
%cd /kaggle/working

# --- Trained Model ---
MODEL_PATH = f'./yolov5/great-barrier-reef-public/{RUN_NAME}/weights/best.pt'
# MODEL_PATH ='../input/yolo-5s6/best.pt'

# Load the model
model = torch.hub.load("../input/yolov5-lib-ds", "custom",
                       path=MODEL_PATH,
                       source='local', force_reload=True)

# BoundingBox Confidence
model.conf = CONF
# Intersection Over Union
model.iou = IOU

model.classes = None   # (optional list) filter by class, e.g. [0, 15, 16] for persons, cats and dogs in COCO
model.multi_label = False  # NMS multiple labels per box
model.max_det = 1000  # maximum number of detections per image

In [None]:
# Helper functions
# Source: https://github.com/awsaf49/bbox/blob/main/bbox/utils.py 
import random  # used by plot_one_box when no color is given

def plot_one_box(x, img, color=None, label=None, line_thickness=None):
    # Plots one bounding box on image img
    tl = line_thickness or round(0.002 * (img.shape[0] + img.shape[1]) / 2) + 1  # line/font thickness
    color = color or [random.randint(0, 255) for _ in range(3)]
    c1, c2 = (int(x[0]), int(x[1])), (int(x[2]), int(x[3]))
    cv2.rectangle(img, c1, c2, color, thickness=tl, lineType=cv2.LINE_AA)
    if label:
        tf = max(tl - 1, 1)  # font thickness
        t_size = cv2.getTextSize(label, 0, fontScale=tl / 3, thickness=tf)[0]
        c2 = c1[0] + t_size[0], c1[1] - t_size[1] - 3
        cv2.rectangle(img, c1, c2, color, -1, cv2.LINE_AA)  # filled
        cv2.putText(img, label, (c1[0], c1[1] - 2), 0, tl / 3, [225, 255, 255], thickness=tf, lineType=cv2.LINE_AA)
        
def draw_bboxes(img, bboxes, classes, class_ids, colors = None, show_classes = None, bbox_format = 'yolo', class_name = False, line_thickness = 2):  
     
    image = img.copy()
    show_classes = classes if show_classes is None else show_classes
    colors = (255, 0 ,0) if colors is None else colors
    
    if bbox_format == 'yolo':
        
        for idx in range(len(bboxes)):  
            
            bbox  = bboxes[idx]
            cls   = classes[idx]
            cls_id = class_ids[idx]
            color = colors[cls_id] if type(colors) is list else colors
            
            if cls in show_classes:
            
                x1 = round(float(bbox[0])*image.shape[1])
                y1 = round(float(bbox[1])*image.shape[0])
                w  = round(float(bbox[2])*image.shape[1]/2) #w/2 
                h  = round(float(bbox[3])*image.shape[0]/2)

                voc_bbox = (x1-w, y1-h, x1+w, y1+h)
                plot_one_box(voc_bbox, 
                             image,
                             color = color,
                             label = cls if class_name else str(cls_id),
                             line_thickness = line_thickness)
            
    elif bbox_format == 'coco':
        
        for idx in range(len(bboxes)):  
            
            bbox  = bboxes[idx]
            cls   = classes[idx]
            cls_id = class_ids[idx]
            color = colors[cls_id] if type(colors) is list else colors
            
            if cls in show_classes:            
                x1 = int(round(bbox[0]))
                y1 = int(round(bbox[1]))
                w  = int(round(bbox[2]))
                h  = int(round(bbox[3]))

                voc_bbox = (x1, y1, x1+w, y1+h)
                plot_one_box(voc_bbox, 
                             image,
                             color = color,
                             label = cls if class_name else str(cls_id),
                             line_thickness = line_thickness)

    elif bbox_format == 'voc':
        
        for idx in range(len(bboxes)):  
            
            bbox  = bboxes[idx]
            cls   = classes[idx]
            cls_id = class_ids[idx]
            color = colors[cls_id] if type(colors) is list else colors
            
            if cls in show_classes: 
                x1 = int(round(bbox[0]))
                y1 = int(round(bbox[1]))
                x2 = int(round(bbox[2]))
                y2 = int(round(bbox[3]))
                voc_bbox = (x1, y1, x2, y2)
                plot_one_box(voc_bbox, 
                             image,
                             color = color,
                             label = cls if class_name else str(cls_id),
                             line_thickness = line_thickness)
    else:
        raise ValueError('wrong bbox format')

    return image

In [None]:
def predict(model, img, size=768, augment=False):
    height, width = img.shape[:2]
    results = model(img, size=size, augment=augment)  # custom inference size
    preds   = results.pandas().xyxy[0]
    bboxes  = preds[['xmin','ymin','xmax','ymax']].values
    if len(bboxes):
        bboxes  = voc2coco(bboxes,height,width).astype(int)
        confs   = preds.confidence.values
        return bboxes, confs
    else:
        return [],[]
    
def format_prediction(bboxes, confs):
    annot = ''
    if len(bboxes)>0:
        for idx in range(len(bboxes)):
            xmin, ymin, w, h = bboxes[idx]
            conf             = confs[idx]
            annot += f'{conf} {xmin} {ymin} {w} {h}'
            annot +=' '
        annot = annot.strip(' ')
    return annot

def show_img(img, bboxes, bbox_format='yolo', colors = (255, 0 ,0)):
    names  = ['starfish']*len(bboxes)
    labels = [0]*len(bboxes)
    
    img = draw_bboxes(img = img,
                           bboxes = bboxes, 
                           classes = names,
                           class_ids = labels,
                           class_name = True, 
                           colors = colors, 
                           bbox_format = bbox_format,
                           line_thickness = 2)
    
    return Image.fromarray(img).resize((800, 400))

In [None]:
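def calculate_score(preds, gts, iou_th=0.5):
    """F2 score for one image; preds and gts are COCO boxes [xmin, ymin, w, h]."""
    # Work on float copies so the caller's boxes (e.g. df['bboxes']) are never modified in place
    preds = np.array(preds, dtype=np.float32).reshape(-1, 4)
    gts = np.array(gts, dtype=np.float32).reshape(-1, 4)
    if len(preds) == 0:
        return 0

    # COCO [xmin, ymin, w, h] -> VOC [xmin, ymin, xmax, ymax], the format box_iou expects
    preds[:, 2:] += preds[:, :2]
    gts[:, 2:] += gts[:, :2]

    # Pairwise IoU: rows are predictions, columns are ground-truth boxes
    iou_matrix = box_iou(torch.from_numpy(preds), torch.from_numpy(gts)).numpy()

    # Greedy one-to-one matching: each ground-truth box can be matched by one prediction only
    matched_gts = set()
    for ious in iou_matrix:
        for gt_idx in np.argsort(-ious):
            if ious[gt_idx] >= iou_th and gt_idx not in matched_gts:
                matched_gts.add(gt_idx)
                break

    num_tp = len(matched_gts)
    num_fp = len(preds) - num_tp
    num_fn = len(gts) - num_tp
    score = 5 * num_tp / (5 * num_tp + 4 * num_fn + num_fp)
    return score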
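
The expression `5 * num_tp / (5 * num_tp + 4 * num_fn + num_fp)` is the F-beta score with beta = 2, which weights recall more heavily than precision. The sketch below works through the arithmetic for made-up counts at a single IoU threshold; the helper name `f_beta` is hypothetical.

```python
def f_beta(tp, fp, fn, beta=2.0):
    """F-beta from raw counts: (1 + b^2) * tp / ((1 + b^2) * tp + b^2 * fn + fp)."""
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    # With no predictions and no ground truth the ratio is undefined; return 0.0 here
    return (1 + b2) * tp / denom if denom else 0.0

# 3 correct detections, 1 false alarm, 2 missed starfish:
# 5 * 3 / (5 * 3 + 4 * 2 + 1) = 15 / 24
score = f_beta(tp=3, fp=1, fn=2)
print(score)
```

Because a miss costs four times as much as a false alarm in the denominator, lowering the confidence threshold tends to pay off until false positives start to dominate.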

In [None]:
# Get the data
path = '/kaggle/images/0-2313.jpg'
im = cv2.imread('/kaggle/images/' + df[df['image_path']==path]['image_id'].iloc[0] + '.jpg')
vid_id = df[df['image_path']==path]['image_id'].iloc[0].split('-')[0]
seq_id = df[df['image_path']==path]['image_id'].iloc[0].split('-')[1]

im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
dh, dw, _ = im.shape
# Each non-empty line of the label file is "class x_center y_center width height"
with open(f"/kaggle/labels/{vid_id}-{seq_id}.txt", "r") as f:
    boxes = [line.split()[1:] for line in f.read().splitlines() if line.strip()]
no_boxes = len(boxes)

# Plot
fig, axs = plt.subplots(1, 1, figsize=(23, 10))
fig.suptitle(f"Sample image and YOLO bounding boxes", fontsize = 20)

# Draw every box in the label file
for box in boxes:

    # Src: https://github.com/pjreddie/darknet/blob/810d7f797bdb2f021dbe65d2524c2ff6b8ab5c8b/src/image.c#L283-L291
    # from YOLO to COCO
    x, y, w, h = box
    x, y, w, h = float(x), float(y), float(w), float(h)

    l = int((x - w / 2) * dw)
    r = int((x + w / 2) * dw)
    t = int((y - h / 2) * dh)
    b = int((y + h / 2) * dh)

    if l < 0: l = 0
    if r > dw - 1: r = dw - 1
    if t < 0: t = 0
    if b > dh - 1: b = dh - 1

    cv2.rectangle(im, (l, t), (r, b), (255,0,0), 3)

# Show image with bboxes
axs.imshow(im)
axs.set_axis_off()

plt.tight_layout()
plt.show()

In [None]:
(l, t), (r, b)

In [None]:
pd.set_option('display.max_columns', None)
df[df['image_path']==path]

In [None]:
train_data[train_data['image_path']==path]

In [None]:
# BoundingBox Confidence
model.conf = 0.2

from IPython.display import display
img = cv2.imread(path)[...,::-1]
bboxes, confis = predict(model, img, size=IMG_SIZE, augment=True)
display(show_img(img, bboxes, bbox_format='coco', colors = (255,0,0)))
display(show_img(img, df[df['image_path']==path]['bboxes'].iloc[0], bbox_format='coco', colors = (0,255,0)))

In [None]:
F2scores_list = []
conf_list = []
image_paths = df[(df['video_id'].isin(val_ids)) & (df.no_annotations>3)].sample(10).image_path.tolist()
for i in tqdm(range(0, 30, 1)):
    conf_list.append(i/100)
    model.conf = i/100
    F2scores_list_i = []
    for path in image_paths:
        img = cv2.imread(path)[...,::-1]
        bboxes, confis = predict(model, img, size=IMG_SIZE, augment=True)
        F2scores_list_i.append(calculate_score(bboxes, df[df['image_path']==path]['bboxes'].iloc[0]))
    F2scores_list.append(np.mean(F2scores_list_i))

In [None]:
plt.scatter(conf_list, F2scores_list)  # x = confidence threshold, y = mean F2, matching the axis labels
plt.title("CONF vs F2 score")
plt.xlabel('CONF')
plt.ylabel('F2')
plt.show()

In [None]:
model.iou = 0.2

In [None]:
from IPython.display import display
image_paths = df[df.no_annotations>5].sample(100).image_path.tolist()
for idx, path in enumerate(image_paths):
    img = cv2.imread(path)[...,::-1]
    bboxes, confis = predict(model, img, size=IMG_SIZE, augment=True)
    display(show_img(img, bboxes, bbox_format='coco'))
    gc.collect()
    torch.cuda.empty_cache()
    if idx>2:
        break

# Inference

In [None]:
import greatbarrierreef
env = greatbarrierreef.make_env() # initialize the environment
iter_test = env.iter_test() # an iterator which loops over the test set and sample submission

In [None]:
for idx, (img, pred_df) in enumerate(tqdm(iter_test)):
    bboxes, confs  = predict(model, img, size=IMG_SIZE, augment=True)
    annot = format_prediction(bboxes, confs)
    pred_df['annotations'] = annot
    env.predict(pred_df)
    if idx<3:
        display(show_img(img, bboxes, bbox_format='coco'))