#### What are you trying to do in this notebook?
My goal for this competition is to accurately identify starfish in real time by building an object detection model trained on underwater videos of coral reefs. This work will help researchers identify species that threaten Australia's Great Barrier Reef and take well-informed action to protect the reef for future generations. In this notebook we explore sequences as potential units for cross-validation, but since there are only 20 sequences and their sizes are quite dissimilar, we propose an approach that splits them into smaller chunks, which we call subsequences.

#### Why are you trying it?
To detect crown-of-thorns starfish in underwater image data. In this competition, I will predict the presence and position of crown-of-thorns starfish in sequences of underwater images taken at various times and locations around the Great Barrier Reef. Predictions take the form of a bounding box together with a confidence score for each identified starfish. An image may contain zero or more starfish.

A sequence, as stated in the data tab of the competition, is:

> `sequence` - ID of a gap-free subset of a given video. The sequence ids are not meaningfully ordered.

Subsequences, as we define them here, are parts of a sequence in which objects are either continually present or continually absent. We isolate two kinds of subsequences: with objects and without objects.
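
The definition above can be sketched in a few lines of plain Python. This is only an illustration, assuming each frame of a sequence has already been reduced to a boolean flag (True when the frame has at least one annotation); the function name `split_into_subsequences` and the example flags are hypothetical and not part of the competition data or API.

```python
from itertools import groupby


def split_into_subsequences(has_objects):
    """Split per-frame presence flags into runs of identical values.

    has_objects: list of bools, one per frame, in video order (assumed input).
    Returns a list of (start, end_exclusive, with_objects) tuples.
    """
    subsequences = []
    start = 0
    # groupby yields one group per run of consecutive equal flags.
    for with_objects, run in groupby(has_objects):
        length = sum(1 for _ in run)
        subsequences.append((start, start + length, with_objects))
        start += length
    return subsequences


# Hypothetical sequence of 8 frames.
flags = [False, False, True, True, True, False, True, True]
print(split_into_subsequences(flags))
```

Each tuple marks one subsequence, so the resulting chunks could then be distributed across cross-validation folds instead of whole sequences.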


##### This notebook demonstrates how to run inference using an EfficientDet-D0 model trained with TensorFlow Object Detection API, and submit the detection result. See [this notebook](https://www.kaggle.com/khanhlvg/cots-detection-w-tensorflow-object-detection-api/) for details on how the model was trained.

In [None]:
import io
import os
import sys
import time

import numpy as np
import tensorflow as tf
from PIL import Image  # needed by load_image_into_numpy_array

# Import the library that is used to submit the prediction result.
INPUT_DIR = '../input/tensorflow-great-barrier-reef/'
sys.path.insert(0, INPUT_DIR)
import greatbarrierreef

## Load the TensorFlow COTS detection model into memory and define some util functions for running inference.

In [None]:
MODEL_DIR = '../input/cots-detection-w-tensorflow-object-detection-api/cots_efficientdet_d0'
start_time = time.time()
tf.keras.backend.clear_session()
detect_fn_tf_odt = tf.saved_model.load(os.path.join(MODEL_DIR, 'output', 'saved_model'))
elapsed_time = time.time() - start_time
print(f'Elapsed time: {elapsed_time}s')

In [None]:
def load_image_into_numpy_array(path):
    """Load an image from file into a numpy array.

    Puts image into numpy array to feed into tensorflow graph.
    Note that by convention we put it into a numpy array with shape
    (height, width, channels), where channels=3 for RGB.

    Args:
    path: a file path (this can be local or on colossus)

    Returns:
    uint8 numpy array with shape (img_height, img_width, 3)
    """
    img_data = tf.io.gfile.GFile(path, 'rb').read()
    image = Image.open(io.BytesIO(img_data))
    (im_width, im_height) = image.size
    
    return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)

def detect(image_np):
    """Detect COTS from a given numpy image."""

    # Add a batch dimension: (height, width, 3) -> (1, height, width, 3).
    input_tensor = np.expand_dims(image_np, 0)
    detections = detect_fn_tf_odt(input_tensor)
    return detections

## Run inference and create the submission data

In [None]:
env = greatbarrierreef.make_env()   # initialize the environment
iter_test = env.iter_test()    # an iterator which loops over the test set and sample submission

In [None]:
# Detections scoring below this confidence are left out of the submission.
DETECTION_THRESHOLD = 0.25

for (image_np, sample_prediction_df) in iter_test:
    height, width, _ = image_np.shape
    
    # Run object detection using the TensorFlow model.
    detections = detect(image_np)
    
    # Parse the detection result and generate a prediction string.
    num_detections = detections['num_detections'][0].numpy().astype(np.int32)
    predictions = []
    for index in range(num_detections):
        score = detections['detection_scores'][0][index].numpy()
        if score < DETECTION_THRESHOLD:
            continue

        bbox = detections['detection_boxes'][0][index].numpy()
        y_min = int(bbox[0] * height)
        x_min = int(bbox[1] * width)
        y_max = int(bbox[2] * height)
        x_max = int(bbox[3] * width)
        
        bbox_width = x_max - x_min
        bbox_height = y_max - y_min
        
        predictions.append('{:.2f} {} {} {} {}'.format(score, x_min, y_min, bbox_width, bbox_height))
    
    # Generate the submission data.
    prediction_str = ' '.join(predictions)
    sample_prediction_df['annotations'] = prediction_str
    env.predict(sample_prediction_df)

    print('Prediction:', prediction_str)
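
To make the box conversion in the loop above easier to check without loading the model, here is a minimal standalone sketch of the same arithmetic. It assumes detections arrive as `(score, (y_min, x_min, y_max, x_max))` pairs with normalized coordinates, mirroring the `detection_boxes` layout used above; the helper name `to_prediction_string` and the sample values are hypothetical.

```python
def to_prediction_string(detections, width, height, threshold=0.25):
    """Build a 'score x_min y_min width height' string for one image.

    detections: iterable of (score, (y_min, x_min, y_max, x_max)) pairs,
    with box coordinates normalized to [0, 1] (assumed input layout).
    """
    parts = []
    for score, (y_min, x_min, y_max, x_max) in detections:
        if score < threshold:
            continue  # skip low-confidence detections
        # Scale normalized coordinates to pixels.
        left = int(x_min * width)
        top = int(y_min * height)
        box_width = int(x_max * width) - left
        box_height = int(y_max * height) - top
        parts.append('{:.2f} {} {} {} {}'.format(
            score, left, top, box_width, box_height))
    return ' '.join(parts)


# Hypothetical detections on a 1280x720 frame; the second one is filtered out.
sample = [(0.9, (0.25, 0.5, 0.5, 0.75)), (0.1, (0.0, 0.0, 1.0, 1.0))]
print(to_prediction_string(sample, width=1280, height=720))
```

An image with no detections above the threshold yields an empty string, which matches what the loop above writes to `annotations` in that case.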

#### Did it work?
This competition uses a hidden test set that is served by an API to ensure that images are evaluated in the same order they were recorded within each video. I did some EDA on the provided dataset. I plan to dig much deeper and understand the data better. There are a lot of things to discover in this dataset.

#### What did you not understand about this process?
Well, everything is provided on the competition data page. I had no problems while working on it. If you don't understand something I do in this notebook, please leave a comment.

#### What else do you think you can try as part of this approach?
We solve the greatest challenges through innovative science and technology to unlock a better future for everyone. We are thinkers, problem solvers, leaders. We blaze new trails of discovery. We aim to inspire the next generation. The Great Barrier Reef Foundation creates a better future for coral reefs and their marine life through innovative projects and global advocacy efforts.

#### PLEASE UPVOTE if you find this notebook useful!