<a href="https://colab.research.google.com/github/gl7176/GreySealCNN/blob/master/full_seal_detection_workflow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
#####  <center> Be sure to update this hyperlink above if you clone and want to point to a different GitHub </center>

# Set up the Model and Data

Download all data from Google Drive onto the Colab machine. This section sets up the environment of our virtual machine, then downloads tiles from our mosaic onto that virtual machine.

In [None]:
!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# choose a local (colab) directory to store the data.
local_download_path = os.path.expanduser('data')
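# exist_ok avoids an error if the directory is already there,
# without silently hiding other failures
os.makedirs(local_download_path, exist_ok=True)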

# 2. Auto-iterate using the query syntax
#    https://developers.google.com/drive/v2/web/search-parameters
#    The mess of characters before "in parents" is the address of the google drive folder getting pulled
#    (when you open the folder in google drive on a browser, it is the mess of characters at the end of the URL)

file_list = drive.ListFile(
    {'q': "'1INuRNVKvKMy8L_Nb6lmoVbyvScWK0-0D' in parents"}).GetList()

#    this bit pulls every file in the directory specified above
count = 0
for f in file_list:
  count += 1
  if count % 10 == 0:
    print(count)
  # 3. Create & download by id.
  fname = os.path.join(local_download_path, f['title'])
  f_ = drive.CreateFile({'id': f['id']})
  f_.GetContentFile(fname)
  

In [None]:
! ls data | head -10

In [None]:
#! pip uninstall --yes tensorflow
#! pip uninstall --yes keras
! pip install keras==2.4
! pip install tensorflow==2.3.0

### Install the Convolutional Neural Network that will do the detections. 

This section pulls code for a CNN called "retinanet", an existing model that is already trained for object detection, which we will further train for our task.

In [None]:
! git clone https://github.com/fizyr/keras-retinanet/

In [None]:
%cd keras-retinanet

Install the "retinanet" code so we can run it.

In [None]:
! pip install .

In [None]:
! python setup.py build_ext --inplace

In [None]:
%cd ../

Download the pre-trained model that we will use as a starting point for our seal detecting CNN. This includes weights and parameters that have been generated by past training of retinanet on a generic dataset of miscellaneous objects.

In [None]:
! wget -P data "https://github.com/fizyr/keras-retinanet/releases/download/0.5.1/resnet50_coco_best_v2.1.0.h5"

# Train the Model

#### Now we'll actually train the model. 

We give it the pre-trained weights that we downloaded above and tell it to use the training data in annotations.csv, running for 10 epochs of 20 steps each with a batch size of two. A step is one increment of training the model on a single batch (a small subset of the files); an epoch is a group of steps after which the model calculates its accuracy.
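
As a quick check on what those settings mean, the arithmetic below works out how many images the model processes in total. This is only a sketch of the bookkeeping; the variable names are illustrative, and images are counted with repetition, since the same tile can appear in more than one batch.

```python
# Values taken from the training command in this notebook
epochs = 10
steps_per_epoch = 20
batch_size = 2

# Each step processes one batch, so an epoch covers steps * batch_size images
images_per_epoch = steps_per_epoch * batch_size
total_images = epochs * images_per_epoch

print(images_per_epoch, total_images)
```

Raising `--steps` or `--epochs` increases this total (and the training time) proportionally.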

This process could take 5-30 minutes, but you can stop it whenever you'd like as long as it has saved a couple models to the snapshots directory.

In [None]:
! keras-retinanet/keras_retinanet/bin/train.py --weights data/resnet50_coco_best_v2.1.0.h5 --epochs 10 --steps 20 --batch-size 2 csv data/annotations.csv data/classes.csv

Let's take a look at the format of the annotations to see how the model reads them. Each row has the format:

`filename, x1, y1, x2, y2, class_name`

The X,Y pairs describe (1) the top-left and (2) the bottom-right corners of each box, respectively, and `class_name` is the label assigned to that box.
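
To make the layout concrete, here is a minimal sketch that parses two made-up rows and computes each box's width and height. It assumes the six-column layout that keras-retinanet's CSV generator reads (the four corner coordinates followed by a class name, no header row); the file names and coordinates are hypothetical, not taken from our data.

```python
import csv
import io

# Hypothetical rows: path, x1, y1, x2, y2, class_name
sample = (
    "data/tile_a.png,10,20,50,80,Adult\n"
    "data/tile_a.png,60,15,95,70,Pup\n"
)

boxes = []
for path, x1, y1, x2, y2, class_name in csv.reader(io.StringIO(sample)):
    x1, y1, x2, y2 = map(int, (x1, y1, x2, y2))
    # (x1, y1) is the top-left corner, (x2, y2) the bottom-right corner
    boxes.append({
        "path": path,
        "class": class_name,
        "width": x2 - x1,
        "height": y2 - y1,
    })

for box in boxes:
    print(box)
```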

In [None]:
! head data/annotations.csv

We are now done training the model, and we want to see how it performs on our data! This next section converts the model from training mode to inference mode so it can be used to detect seals. This conversion process takes a little time.

In [None]:
! keras-retinanet/keras_retinanet/bin/convert_model.py snapshots/resnet50_csv_10.h5 snapshots/test_model.h5

# Run Detection

Now we will get into the bulk of the Python code: first we'll load the necessary Python modules to handle and process our data.

In [None]:
# show images inline
%matplotlib inline

# automatically reload modules when they have changed
%load_ext autoreload
%autoreload 2

# import keras
import keras

# import keras_retinanet
from keras_retinanet import models
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image
from keras_retinanet.utils.visualization import draw_box, draw_caption
from keras_retinanet.utils.colors import label_color

# import miscellaneous modules
import matplotlib.pyplot as plt
import cv2
import os
import numpy as np
import time
import json
from random import shuffle

# set tf backend to allow memory to grow, instead of claiming everything
import tensorflow as tf

def get_session():
    #config = tf.ConfigProto()
    # deprecated
    config = tf.compat.v1.ConfigProto()    
    config.gpu_options.allow_growth = True
    #return tf.Session(config=config)
    # deprecated
    return tf.compat.v1.Session(config=config)

# use this environment flag to change which GPU to use
#os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# set the modified tf session as backend in keras
#keras.backend.tensorflow_backend.set_session(get_session())
# deprecated
tf.compat.v1.keras.backend.set_session(get_session())

## Load RetinaNet model

Now we will load the model that you just converted into inference mode, which is called `test_model.h5`.

In [None]:
model_path = 'snapshots/test_model.h5'

print(model_path)

# load retinanet model
model = models.load_model(model_path, backbone_name='resnet50')

# you can check out a model summary by uncommenting this line
#print(model.summary())

# load label to names mapping for visualization purposes
labels_to_names = {0: 'seal'}

## Load imagery

Now we will load all of the images that we downloaded into our data directory during setup (the tiles from our orthomosaic). Just to check, we'll print out a count of those images.

In [None]:
# load imagery
image_dir = "data/"

image_list = []
for root, dirs, files in os.walk(image_dir):
    for filename in files:
        if filename.lower().endswith('.png'):
            # join with root so files found in subdirectories get a valid path
            image_list.append(os.path.join(root, filename))
print(len(image_list))

## Test out detections

Now we'll visualize some detections from our model to see how it performs. Each detection has a "confidence score" that describes the CNN's confidence that the detection is correct. Change the minimum confidence score (the first line of code) and re-run the code to check out how your "confidence threshold" affects the numbers of false positives and false negatives.
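
The effect of the threshold can be sketched without the model at all. The scores below are hypothetical values for a single tile, listed from highest to lowest (the order the detection loop relies on); the helper `count_kept` is illustrative and not part of keras-retinanet.

```python
# Hypothetical confidence scores for one tile, highest first
scores = [0.95, 0.81, 0.64, 0.48, 0.33, 0.12]

def count_kept(scores, min_score):
    """Return how many detections have a score of at least min_score."""
    return sum(1 for s in scores if s >= min_score)

# Compare how many detections survive at a few thresholds
kept = {threshold: count_kept(scores, threshold) for threshold in (0.3, 0.5, 0.8)}
for threshold, n in kept.items():
    print(threshold, n)
```

A lower threshold keeps more boxes, which tends to add false positives; a higher one drops more boxes, which tends to add false negatives.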

In [None]:
min_score = 0.5 # minimum confidence score a detection needs in order to be kept
detection_iterations = 10 # max number of images to visualize

visualize = True

detections = {}

total_time = 0

count = 0
shuffle(image_list)

# GDL added this, necessary for multiple category labels
labels_to_names = {0: 'Adult', 1: 'Pup', 2: 'Unknown'}

for image_path in image_list:
    if count >= detection_iterations:
        break
    else:
        count +=1
        
    image = read_image_bgr(image_path)

    if visualize:
        # copy to draw on
        draw = image.copy()
        draw = cv2.cvtColor(draw, cv2.COLOR_BGR2RGB)

    # preprocess image for network
    image = preprocess_image(image)
    image, scale = resize_image(image)

    # process image
    start = time.time()
    boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
    total_time += time.time() - start

    # correct for image scale
    boxes /= scale
    if any(score >= min_score for score in scores[0]):
        detections[image_path] = []

    # visualize detections
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        # scores are sorted so we can break
        if score < min_score:
            break

        #print(score)
        #print(box)

        # TODO this does create a slight error in the boxes, might be worth doing something like
        # list(map(str, box) but then would need to cast on the other end back to float
        b = box.astype(int)
        detections[image_path].append({"box" : b, "label" : label, "score" : score})

        if visualize:
            color = label_color(label)

            # b = box.astype(int)
            draw_box(draw, b, color=color)

            caption = "{} {:.3f}".format(labels_to_names[label], score)
            draw_caption(draw, b, caption)

    if any(score >= min_score for score in scores[0]):
        if visualize:
            plt.figure(figsize=(10, 10))
            plt.axis('off')
            plt.imshow(draw)
            plt.show()
    
# average over the images actually processed in this loop, not the full list
print("Finished, time per image:", total_time / max(count, 1))

## Run Detections on all tiles ##
This section repeats the process we just tested for all tiles that make up our orthomosaic. If you want to experiment, you can vary the confidence threshold and the amount of time the model trains, then look at how it affects the resulting detections.


In [None]:
visualize = False
min_score = 0.5 # minimum confidence score a detection needs in order to be kept

detections = {}

total_time = 0

for image_path in image_list:
       
    image = read_image_bgr(image_path)

    if visualize:
        # copy to draw on
        draw = image.copy()
        draw = cv2.cvtColor(draw, cv2.COLOR_BGR2RGB)

    # preprocess image for network
    image = preprocess_image(image)
    image, scale = resize_image(image)

    # process image
    start = time.time()
    boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
    total_time += time.time() - start

    # correct for image scale
    boxes /= scale
    if any(score >= min_score for score in scores[0]):
        detections[image_path] = []

    # visualize detections
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        # scores are sorted so we can break
        if score < min_score:
            break

        #print(score)
        #print(box)

        # TODO this does create a slight error in the boxes, might be worth doing something like
        # list(map(str, box) but then would need to cast on the other end back to float
        b = box.astype(int)
        detections[image_path].append({"box" : b, "label" : label, "score" : score})

        if visualize:
            color = label_color(label)

            # b = box.astype(int)
            draw_box(draw, b, color=color)

            caption = "{} {:.3f}".format(labels_to_names[label], score)
            draw_caption(draw, b, caption)

    if any(score >= min_score for score in scores[0]):
        if visualize:
            plt.figure(figsize=(10, 10))
            plt.axis('off')
            plt.imshow(draw)
            plt.show()
    
print("Finished, time per image:", total_time/len(image_list))

Run an evaluation script to get the mean average precision (mAP) of the CNN. 

mAP is a relative model evaluation metric (i.e., it can be challenging to compare mAP values across datasets), but it is a great general metric for comparing different models and approaches to detecting objects on the same dataset.

Read more about mAP here: https://tarangshah.com/blog/2018-01-27/what-is-map-understanding-the-statistic-of-choice-for-comparing-object-detection-models/
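
For intuition, the sketch below computes average precision (AP) for one class as the area under a precision-recall curve, then averages AP over classes to get mAP. It is a simplified illustration, not the code in `evaluate.py`: the true-positive flags and ground-truth counts are hypothetical, and in a real evaluation they come from matching each detection to an annotated box by overlap.

```python
def average_precision(is_true_positive, num_ground_truth):
    """AP for one class; flags must be ordered by descending detection score."""
    tp = fp = 0
    recalls, precisions = [], []
    for hit in is_true_positive:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Make precision non-increasing from left to right (the usual envelope)
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas wherever recall increases
    ap, prev_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Hypothetical results for two classes
ap_adult = average_precision([True, True, False, True], num_ground_truth=4)
ap_pup = average_precision([True, False], num_ground_truth=1)
mean_ap = (ap_adult + ap_pup) / 2

print(ap_adult, ap_pup, mean_ap)
```

Missed seals lower recall and false detections lower precision, so both kinds of error pull AP down.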

In [None]:
! keras-retinanet/keras_retinanet/bin/evaluate.py csv data/annotations.csv data/classes.csv snapshots/test_model.h5

## Export detections ##
Write the detections to a JSON file that can be used in a GIS for spatial databases and/or visualizations.
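
The exported file maps each image path to a list of detections, each with a `box`, a `label`, and a `score`. As a rough sketch of how it could be read back later, the example below parses a small hypothetical JSON string with that structure and counts detections per class label; the tile names and values are made up for illustration.

```python
import json
from collections import Counter

# Hypothetical contents with the same structure as the exported file:
# {image_path: [{"box": [x1, y1, x2, y2], "label": int, "score": float}, ...]}
example_json = """
{
  "data/tile_a.png": [
    {"box": [10, 20, 50, 80], "label": 0, "score": 0.91},
    {"box": [60, 15, 95, 70], "label": 1, "score": 0.62}
  ],
  "data/tile_b.png": [
    {"box": [5, 5, 40, 45], "label": 0, "score": 0.77}
  ]
}
"""

detections_in = json.loads(example_json)

# Count detections per class label across all tiles
label_counts = Counter(
    det["label"] for dets in detections_in.values() for det in dets
)
print(dict(label_counts))
```

To read a real export, replace `json.loads(example_json)` with `json.load` on the opened file.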

In [None]:
class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

In [None]:
with open('data/new_detections.json', 'w') as fp:
    json.dump(detections, fp, cls=MyEncoder)