# The Great Barrier Reef

![](https://www.worldatlas.com/r/w960-q80/upload/18/9a/e0/shutterstock-642116419.jpg)

**The Great Barrier Reef is the world’s largest coral reef system, located in the Coral Sea off the coast of Queensland in northeastern Australia. It covers approximately 344,400 sq. km and is composed of over 2,900 individual reefs, 760 fringing reefs, 300 coral cays and 900 islands that stretch over 2,300 km.**

**The reef system was built by billions of coral polyps and now supports a wide variety of marine life, including ancient sea turtles, reef fish, 134 species of sharks, 400 different hard and soft corals, and a plethora of seaweeds. The reef also has cultural significance, as it has been used by Aboriginal Australians and Torres Strait Islanders.**


**The Great Barrier Reef is a biodiversity hotspot, housing at least 450 species of hard coral as well as anemones, sponges, worms, gastropods, lobsters, crayfish, prawns and crabs. More than 1,500 species of fish inhabit the reef, such as wrasses, damselfish, triggerfish, angelfish, rays and sharks. Thirty species of cetaceans have been recorded in the reef, such as the dwarf minke whale, Indo-Pacific humpback dolphin and the humpback whale. Fifteen species of seagrass in beds attract dugongs and turtles, with six species of sea turtles choosing the reef as their breeding spot. These species are the green sea turtle, leatherback sea turtle, hawksbill turtle, loggerhead sea turtle, flatback turtle and olive ridley turtle.**

src: https://www.worldatlas.com/heritage-sites/great-barrier-reef.html

![](https://www.worldatlas.com/r/w960-q80/upload/39/f6/5d/great-barrier-reef-01.png)

# Crown-of-Thorns Starfish
   

**In normal numbers on healthy coral reefs, crown-of-thorns starfish (COTS) are an important part of the ecosystem. They tend to eat the faster-growing corals, which gives the slower-growing species a chance to catch up and enhances the coral diversity of our reefs. However, when these coral-eating starfish appear in outbreak proportions, the impact on coral reefs can be disastrous.**

![](https://th.bing.com/th/id/OIP.Et2ArvP-eXTAKdkIp7Vc0QHaFj?w=261&h=195&c=7&r=0&o=5&dpr=1.67&pid=1.7)

**The crown-of-thorns starfish preys on coral polyps. Large outbreaks of these starfish can devastate reefs. In 2000, an outbreak contributed to a loss of 66% of live coral cover on sampled reefs in a study by the Reef Research Centre (RRC). Outbreaks are believed to occur in natural cycles, worsened by poor water quality and overfishing of the starfish's predators.**

# References and Resources 
* https://www.kaggle.com/remekkinas/sahi-slicing-aided-hyper-inference-yv5-and-yx
* https://www.kaggle.com/awsaf49/great-barrier-reef-yolov5-train
* https://www.kaggle.com/remekkinas/yolox-training-pipeline-cots-dataset-lb-0-507?scriptVersionId=81353936
* https://www.kaggle.com/julian3833/reef-a-cv-strategy-subsequences
* https://www.kaggle.com/andradaolteanu/greatbarrierreef-full-guide-to-bboxaugmentation
* https://www.kaggle.com/dschettler8845/tf-find-the-cots-eda-baseline/notebook
* https://www.kaggle.com/steamedsheep/yolov5-high-resolution-training

# Imports

In [1]:
%autosave 200
import os
import shutil

import folium
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from tqdm import tqdm

Autosaving every 200 seconds


**Weights & Biases Login**

In [2]:
import wandb

try:
    from kaggle_secrets import UserSecretsClient
    user_secrets = UserSecretsClient()
    api_key = user_secrets.get_secret("wandb_token")  # read the token from Kaggle secrets
    wandb.login(key=api_key, anonymous=None)          # authenticate the W&B account

except Exception:
    # fall back to an anonymous run when the secret is unavailable
    wandb.login(anonymous='must')
    print('To use your W&B account,\nGo to Add-ons -> Secrets and provide your W&B access token. Use the Label name wandb_token. \nGet your W&B access token from here: https://wandb.ai/authorize')

wandb: W&B API key is configured (use `wandb login --relogin` to force relogin)
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


# Loading Data

In [3]:
train_dir = '../input/tensorflow-great-barrier-reef/train_images' # training data  directory 

In [4]:
train = pd.read_csv('../input/tensorflow-great-barrier-reef/train.csv')
test = pd.read_csv('../input/tensorflow-great-barrier-reef/test.csv')

sample_sub = pd.read_csv('../input/tensorflow-great-barrier-reef/example_sample_submission.csv')

#train
train.head()

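| | video_id | sequence | video_frame | sequence_frame | image_id | annotations |
|---|---|---|---|---|---|---|
| 0 | 0 | 40258 | 0 | 0 | 0-0 | [] |
| 1 | 0 | 40258 | 1 | 1 | 0-1 | [] |
| 2 | 0 | 40258 | 2 | 2 | 0-2 | [] |
| 3 | 0 | 40258 | 3 | 3 | 0-3 | [] |
| 4 | 0 | 40258 | 4 | 4 | 0-4 | [] |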


**Basic EDA**

In [5]:
#basic eda 
print('Number of Videos in train data are :',train.video_id.nunique())

for idx,grp in train.groupby('video_id'):
    annot= len(grp[grp.annotations!='[]'])
    print(f'Video {idx} has {len(grp)} Images, of which {annot} Images have Annotations')


    
# marking df rows without annotations

train.loc[train['annotations']=='[]', 'no_annot'] = 1 
train.loc[train['no_annot']!=1, 'no_annot'] = 0 

Number of Videos in train data are : 3
Video 0 has 6708 Images, of which 2143 Images have Annotations
Video 1 has 8232 Images, of which 2099 Images have Annotations
Video 2 has 8561 Images, of which 677 Images have Annotations


In [6]:
print(f'Number of Images without annotations are {train.no_annot.sum()}')
print(f'Fraction of Images without annotations is {(train.no_annot.sum()/len(train)).round(2)}')


Number of Images without annotations are 18582.0
Fraction of Images without annotations is 0.79
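
The per-video counts printed above already show how unevenly the annotations are spread. As a small worked check (plain Python, with the numbers copied by hand from the output rather than read from `train`), the annotated share per video and the overall unannotated fraction can be recomputed like this:

```python
# Per-video counts copied from the printed EDA output above.
frames = {0: 6708, 1: 8232, 2: 8561}
annotated = {0: 2143, 1: 2099, 2: 677}

total_frames = sum(frames.values())
total_annotated = sum(annotated.values())
unannotated_fraction = (total_frames - total_annotated) / total_frames

# Share of annotated frames inside each video.
for video_id in frames:
    share = annotated[video_id] / frames[video_id]
    print(f"video {video_id}: {share:.1%} of frames annotated")

print(f"overall unannotated fraction: {unannotated_fraction:.2f}")
```

Video 2 has by far the lowest share of annotated frames (roughly 8%), which is worth keeping in mind when a single video is held out for validation.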


In [7]:
#checking some annotations
train[train['no_annot'] != 1].head()

| | video_id | sequence | video_frame | sequence_frame | image_id | annotations | no_annot |
|---|---|---|---|---|---|---|---|
| 16 | 0 | 40258 | 16 | 16 | 0-16 | [{'x': 559, 'y': 213, 'width': 50, 'height': 32}] | 0.0 |
| 17 | 0 | 40258 | 17 | 17 | 0-17 | [{'x': 558, 'y': 213, 'width': 50, 'height': 32}] | 0.0 |
| 18 | 0 | 40258 | 18 | 18 | 0-18 | [{'x': 557, 'y': 213, 'width': 50, 'height': 32}] | 0.0 |
| 19 | 0 | 40258 | 19 | 19 | 0-19 | [{'x': 556, 'y': 214, 'width': 50, 'height': 32}] | 0.0 |
| 20 | 0 | 40258 | 20 | 20 | 0-20 | [{'x': 555, 'y': 214, 'width': 50, 'height': 32}] | 0.0 |


**Taking the data with annotated images**

In [8]:
#only taking images with annotations for training 
train_set = train.query(expr = 'no_annot!=1') # drop images without annotations

train_set.shape

(4919, 7)
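
Each `annotations` cell is a string holding a Python-style list of box dictionaries, so it has to be parsed before the boxes can be used. The sketch below shows one way to do that with `ast.literal_eval`, which only accepts literals and so avoids executing arbitrary code. `parse_boxes` is a hypothetical helper, and the two-box string is made up for illustration; only the single-box string comes from the table above.

```python
import ast

def parse_boxes(annotation_str):
    """Parse one `annotations` cell into a list of box dicts without using eval."""
    boxes = ast.literal_eval(annotation_str)
    if not isinstance(boxes, list):
        raise ValueError(f"expected a list of boxes, got {type(boxes).__name__}")
    return boxes

empty = "[]"
single = "[{'x': 559, 'y': 213, 'width': 50, 'height': 32}]"
# Hypothetical two-box cell, made up for illustration only.
double = ("[{'x': 10, 'y': 20, 'width': 30, 'height': 40}, "
          "{'x': 100, 'y': 200, 'width': 50, 'height': 60}]")

box_counts = [len(parse_boxes(s)) for s in (empty, single, double)]
print(box_counts)
```

Because the cell is a list, a frame can in principle carry several boxes, and any code that reads only the first element would silently drop the rest.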

# YOLO Training

**Making Directories and copying files to those directories for training.**

In [9]:
#directories to copy files 
yolo_train = './yolo_data/fold1/images/train'
yolo_val   = './yolo_data/fold1/images/val'

yolo_train_labels = './yolo_data/fold1/labels/train'
yolo_val_labels   =  './yolo_data/fold1/labels/val'

In [10]:
!mkdir -p $yolo_train
!mkdir -p $yolo_val

!mkdir -p $yolo_train_labels
!mkdir -p $yolo_val_labels
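
The mirrored `images/` and `labels/` folders are not arbitrary. To the best of my understanding, YOLOv5 finds the label file for an image by swapping the last `/images/` path component for `/labels/` and changing the extension to `.txt`. The sketch below imitates that convention with a hypothetical helper, `label_path_for`; it is an illustration, not the repository's own code.

```python
def label_path_for(image_path):
    """Map .../images/<split>/<name>.jpg to .../labels/<split>/<name>.txt."""
    # Split on the LAST '/images/' so earlier folders with that name are untouched.
    head, tail = image_path.rsplit("/images/", 1)
    stem = tail.rsplit(".", 1)[0]  # drop the image extension
    return f"{head}/labels/{stem}.txt"

example = label_path_for("./yolo_data/fold1/images/train/0-16.jpg")
print(example)
```

If an image is copied to `images/train` but its label is saved anywhere other than the matching `labels/train` path, the loader would most likely treat that image as having no objects.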

**Functions to Copy Images and Convert the Annotations in the format YOLO expects.**

In [11]:
def copy_file(filepath,destination):
    '''copy files from source to dest'''
    shutil.copy(src=filepath,
               dst = destination)
    
    
def get_annotations(annotations,
                   image_height = 720,
                   image_width = 1280):
    '''return one annotation as a YOLO label line "class x_mid y_mid width height", with x values normalized by image width and y values by image height'''
    
    
    x_mid = annotations['x'] + annotations['width'] /2
    y_mid = annotations['y'] + annotations['height'] /2
    
    x_mid =x_mid/image_width
    y_mid=y_mid/image_height
    
    width = annotations['width']/image_width
    height = annotations['height']/image_height
    
    return f'0 {x_mid} {y_mid} {width} {height} \n'

def save_annot(path_to_save,
               annot):
    '''save annotations'''
    with open(path_to_save,'w') as f:
        f.write(annot)
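
As a worked example of the conversion, the sketch below applies the same arithmetic as `get_annotations` to the first annotated frame shown earlier (`x=559, y=213, width=50, height=32` on a 1280x720 image) and then inverts it. `to_yolo` and `from_yolo` are hypothetical names used only for this illustration.

```python
def to_yolo(box, image_width=1280, image_height=720):
    """Top-left pixel box -> normalized (x_mid, y_mid, width, height)."""
    x_mid = (box["x"] + box["width"] / 2) / image_width
    y_mid = (box["y"] + box["height"] / 2) / image_height
    return x_mid, y_mid, box["width"] / image_width, box["height"] / image_height

def from_yolo(yolo_box, image_width=1280, image_height=720):
    """Normalized (x_mid, y_mid, width, height) -> top-left pixel box."""
    x_mid, y_mid, w, h = yolo_box
    width, height = w * image_width, h * image_height
    return {
        "x": x_mid * image_width - width / 2,
        "y": y_mid * image_height - height / 2,
        "width": width,
        "height": height,
    }

box = {"x": 559, "y": 213, "width": 50, "height": 32}
yolo_box = to_yolo(box)
restored = from_yolo(yolo_box)
print(yolo_box)
print(restored)
```

The centre works out to `584 / 1280 = 0.45625` horizontally and `229 / 720` vertically, and the round trip returns the original pixel box up to floating-point error.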
        

**Moving the data in directories and converting the annotations in YOLO format**

In [12]:
#using video_id 2 (the third video) as validation fold :
import ast

yolo_ims = './yolo_data/fold1/images/'
yolo_lbl = './yolo_data/fold1/labels/'

use_this_video_as_val = 2 

for _,row in tqdm(iterable=train_set.iterrows(),
                  total=len(train_set),
                  desc = 'Files Copied % :'):        #copy files,convert and save annotations
    
    if row.video_id == use_this_video_as_val:
        set_type = 'val'
    else:
        set_type = 'train'
        
    
    video_id = row.video_id
    video_frame = row.video_frame
    img_id =  row.image_id
    annots = ast.literal_eval(row.annotations)    # parse the string into a list of {'x','y','width','height'} dicts
    
    
    file_path = train_dir + f'/video_{video_id}/{video_frame}.jpg'  # filepath of img
    destination_path = yolo_ims + f'{set_type}/{img_id}.jpg'  #path to copy to 
    
    copy_file(file_path,destination_path) #copy 
    
    #get every box in yolo expected format, one line per box (not only the first box)
    yolo_annot = ''.join(get_annotations(annot) for annot in annots)
    
    #save annotations 
    save_annot(path_to_save = yolo_lbl + f'{set_type}/{img_id}.txt' ,
               annot=yolo_annot)
    
    
    


Files Copied % :: 100%|██████████| 4919/4919 [00:51<00:00, 96.07it/s]
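
Once the label files are written, a quick pass over them can catch format problems before any GPU time is spent. The sketch below is one possible check and not part of the original pipeline: `find_bad_label_lines` is a hypothetical helper, and the demo runs on a throwaway directory with made-up file contents rather than on `./yolo_data`.

```python
import tempfile
from pathlib import Path

def find_bad_label_lines(label_dir):
    """Return (file name, line number, reason) for lines that break the YOLO format."""
    problems = []
    for path in sorted(Path(label_dir).glob("*.txt")):
        for line_no, line in enumerate(path.read_text().splitlines(), start=1):
            fields = line.split()
            if not fields:
                continue  # ignore blank lines
            if len(fields) != 5:
                problems.append((path.name, line_no, "expected 5 fields"))
                continue
            try:
                values = [float(v) for v in fields[1:]]
            except ValueError:
                problems.append((path.name, line_no, "non-numeric value"))
                continue
            if not all(0.0 <= v <= 1.0 for v in values):
                problems.append((path.name, line_no, "value outside [0, 1]"))
    return problems

# Demo on a temporary directory with one well-formed and one malformed file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good.txt").write_text("0 0.45625 0.318 0.039 0.044\n")
    Path(tmp, "bad.txt").write_text("0 1.2 0.5 0.1\n")
    problems = find_bad_label_lines(tmp)

print(problems)
```

In the notebook, the same function could be pointed at `./yolo_data/fold1/labels/train` and `./yolo_data/fold1/labels/val`; an empty result would mean every line has a class id plus four normalized values.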


# Cloning YOLO repo

In [13]:
!git clone https://github.com/ultralytics/yolov5.git -q

# Saving Hyperparameter and Config Files for YOLO

**HyperParameters for YOLO:**


From: https://www.kaggle.com/steamedsheep/yolov5-high-resolution-training/notebook

In [14]:
%%writefile ./yolov5/data/hyps/hyp.heavy.2.yaml


# YOLOv5 by Ultralytics, GPL-3.0 license
# Hyperparameters for COCO training from scratch
# python train.py --batch 40 --cfg yolov5m.yaml --weights '' --data coco.yaml --img 640 --epochs 300
# See tutorials for hyperparameter evolution https://github.com/ultralytics/yolov5#tutorials

lr0: 0.01  # initial learning rate (SGD=1E-2, Adam=1E-3)
lrf: 0.1  # final OneCycleLR learning rate (lr0 * lrf)
momentum: 0.937  # SGD momentum/Adam beta1
weight_decay: 0.0005  # optimizer weight decay 5e-4
warmup_epochs: 2.0  # warmup epochs (fractions ok)   (changed from initial 3)
warmup_momentum: 0.8  # warmup initial momentum
warmup_bias_lr: 0.1  # warmup initial bias lr
box: 0.05  # box loss gain
cls: 0.5  # cls loss gain
cls_pw: 1.0  # cls BCELoss positive_weight
obj: 1.0  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold
anchor_t: 4.0  # anchor-multiple threshold
# anchors: 3  # anchors per output layer (0 to ignore)
fl_gamma: 0.0  # focal loss gamma (efficientDet default gamma=1.5)
hsv_h: 0.015  # image HSV-Hue augmentation (fraction)
hsv_s: 0.7  # image HSV-Saturation augmentation (fraction)
hsv_v: 0.4  # image HSV-Value augmentation (fraction)
degrees: 0.0  # image rotation (+/- deg)
translate: 0.1  # image translation (+/- fraction)
scale: 0.5  # image scale (+/- gain)
shear: 0.0  # image shear (+/- deg)
perspective: 0.0  # image perspective (+/- fraction), range 0-0.001
flipud: 0.5  # image flip up-down (probability)
fliplr: 0.5  # image flip left-right (probability)
mosaic: 1.0  # image mosaic (probability)
mixup: 0.5  # image mixup (probability)
copy_paste: 0.0  # segment copy-paste (probability)

Writing ./yolov5/data/hyps/hyp.heavy.2.yaml
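
To make the two learning-rate entries concrete: the file sets `lr0: 0.01` and `lrf: 0.1`, and its own comment describes the final learning rate as `lr0 * lrf`. The sketch below works that out and traces a simple linear decay over the 8 epochs used in this notebook. The linear shape and the omission of warmup are assumptions made for illustration; the scheduler in the cloned repository may use a different curve.

```python
lr0 = 0.01   # initial learning rate from the hyperparameter file
lrf = 0.1    # final learning-rate fraction from the hyperparameter file
epochs = 8   # value passed to --epochs in this notebook

final_lr = lr0 * lrf

def linear_lr(epoch):
    """Assumed linear decay from lr0 at epoch 0 to lr0 * lrf at the last epoch (no warmup)."""
    factor = (1 - epoch / (epochs - 1)) * (1.0 - lrf) + lrf
    return lr0 * factor

schedule = [linear_lr(e) for e in range(epochs)]
for epoch, lr in enumerate(schedule):
    print(f"epoch {epoch}: lr = {lr:.5f}")
```

Under these assumptions the rate falls from 0.01 to 0.001, so with only 8 epochs each epoch lowers it by a noticeable step.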



In [15]:
%%writefile ./yolov5/data/reef_f1_naive.yaml

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../yolo_data/fold1/  # dataset root dir
train: images/train  # train images (relative to 'path')
val: images/val  # val images (relative to 'path')
test:  # test images (optional)

# Classes
nc: 1  # number of classes
names: ['reef']  # class names


# Download script/URL (optional)
# download: https://ultralytics.com/assets/coco128.zip


Writing ./yolov5/data/reef_f1_naive.yaml


# Training

In [16]:
#change current working dir to yolo 

%cd yolov5/

/kaggle/working/yolov5
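
The training command in this section requests `--img 3000` with the `yolov5s6.pt` checkpoint. As far as I understand, YOLOv5 expects the image size to be a multiple of the model's largest stride and rounds it up otherwise. The sketch below shows that rounding under the assumption that the largest stride of this P6 model is 64; `round_up_to_multiple` is a hypothetical helper, not the repository's function.

```python
import math

def round_up_to_multiple(size, divisor):
    """Smallest multiple of `divisor` that is greater than or equal to `size`."""
    return math.ceil(size / divisor) * divisor

requested = 3000
stride = 64  # assumed maximum stride of a P6 model such as yolov5s6
effective = round_up_to_multiple(requested, stride)
print(requested, "->", effective)
```

Under that assumption the run would actually train at 3008 pixels, which matters for memory use at `--batch 4`.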


In [17]:

!python train.py --img 3000 --batch 4 --epochs 8 --data reef_f1_naive.yaml --weights yolov5s6.pt --name l6_3600_uflip_vm5_f1 --hyp data/hyps/hyp.heavy.2.yaml

wandb: Currently logged in as: virajkadam (use `wandb login --relogin` to force relogin)
wandb: wandb version 0.12.10 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.7
wandb: Syncing run l6_3600_uflip_vm5_f1
wandb: ⭐️ View project at https://wandb.ai/virajkadam/YOLOv5
wandb: 🚀 View run at https://wandb.ai/virajkadam/YOLOv5/runs/2o2nvy8o
wandb: Run data is saved locally in /kaggle/working/yolov5/wandb/run-20220213_083522-2o2nvy8o
wandb: Run `wandb offline` to turn off syncing.

wandb: Waiting for W&B process to finish, PID 153... (failed 1). Press ctrl-c to abort syncing.

**Delete Copied files after training**

In [18]:

!rm -r ../yolo_data/

In [19]:
!ls

CONTRIBUTING.md  __pycache__  hubconf.py	setup.cfg	val.py
Dockerfile	 data	      models		train.py	wandb
LICENSE		 detect.py    requirements.txt	tutorial.ipynb	yolov5s6.pt
README.md	 export.py    runs		utils
