# Set up the CNN + model, Train it and Test it

**Before running this script, make sure that your Google Drive folder contains the tiles you created, the `tiling_scheme.json` file (`step 1`), and the 5 `csv` files that you created (`step 3`) to describe: the 3 data subsets (1 `csv` file), the annotations for each (3 `csv` files) and the class list (1 `csv` file).**

This code section is adapted from the following guide:
https://medium.com/@tabdulwahabamin/an-introduction-to-implementing-retinanet-in-keras-for-multi-object-detection-on-custom-dataset-be746024c653

<a href="https://colab.research.google.com/github/gl7176/GreySealCNN/blob/master/4_CNN_setup_training_testing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
#####  <center> Be sure to update the hyperlink above if you clone this notebook and want to point to a different GitHub repository </center>

### Connect to our Google Drive folder and pull all data
Note: when you run this cell, it prints a link that you must click. Grant Google the requested permissions, then copy the resulting code into the box that appears in the output section of the cell.

If customizing this code, point the `drive_folder` variable to the URL of your shared Google Drive folder.
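
As a rough illustration of what the next cell does with `drive_folder`, the sketch below turns a folder URL into the Drive query string used to list the folder's contents. The helper name `folder_query` and the example URL are hypothetical; the string handling mirrors the cell's `pointer` expression, with an added strip of any trailing slash.

```python
def folder_query(drive_folder_url):
    """Build a Drive 'in parents' query from a shared-folder URL.

    Assumes the folder ID is the last path segment of the URL,
    as in the drive_folder value used in this notebook.
    """
    folder_id = drive_folder_url.rstrip("/").split("/")[-1]
    return "'{}' in parents".format(folder_id)


# hypothetical folder URL, for illustration only
print(folder_query("https://drive.google.com/drive/folders/EXAMPLE_FOLDER_ID"))
```

Under this approach, a URL that carries extra parameters (such as `?usp=sharing`) would need those removed first, because everything after the last `/` is treated as the folder ID.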

In [1]:
# set variable to the destination google drive folder you want to pull from
drive_folder = 'https://drive.google.com/drive/folders/1INuRNVKvKMy8L_Nb6lmoVbyvScWK0-0D'

!pip install -U -q PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# 1. Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# choose a local (colab) directory to store the data.
local_download_path = os.path.expanduser('data')
# create the directory if it does not already exist
os.makedirs(local_download_path, exist_ok=True)

# 2. Auto-iterate using the query syntax
#    https://developers.google.com/drive/v2/web/search-parameters

# this bit points the code to that google drive folder
pointer = str("'" + drive_folder.split("/")[-1] + "'" + " in parents")

file_list = drive.ListFile(
    {'q': pointer}).GetList()

#    this bit pulls every file in the directory specified above
image_list = []
count = 0
countb = 0
for f in file_list:
  fname = os.path.join(local_download_path, f['title'])
  countb += 1
  if fname.endswith(".png"):
    image_list.append(os.path.basename(fname))
    count += 1
    if count % 10 == 0:
      print("{c} out of {l} tiles pulled".format(c = str(count), l=str(len(file_list)-(countb - count))))
  # 3. Create & download by id.
  f_ = drive.CreateFile({'id': f['id']})
  f_.GetContentFile(fname)
  if fname.endswith(".csv") or fname.endswith(".json"):
    print("Pulled file: " + fname)
print("Complete: " + str(count) + " tiles pulled")

Pulled file: data/tiling_scheme.json
Pulled file: data/new_detections.json
Pulled file: data/classes.csv
Pulled file: data/annotations_train.csv
Pulled file: data/annotations_test.csv
Pulled file: data/subset_list.csv
Pulled file: data/annotations_valid.csv
Pulled file: data/via_SealCNN_TrainingData.csv
10 out of 244 tiles pulled
20 out of 244 tiles pulled
30 out of 244 tiles pulled
40 out of 244 tiles pulled
50 out of 244 tiles pulled
60 out of 244 tiles pulled
70 out of 244 tiles pulled
80 out of 244 tiles pulled
90 out of 244 tiles pulled
100 out of 244 tiles pulled
110 out of 244 tiles pulled
120 out of 244 tiles pulled
130 out of 244 tiles pulled
140 out of 244 tiles pulled
150 out of 244 tiles pulled
160 out of 244 tiles pulled
170 out of 244 tiles pulled
180 out of 244 tiles pulled
190 out of 244 tiles pulled
200 out of 244 tiles pulled
210 out of 244 tiles pulled
220 out of 244 tiles pulled
230 out of 244 tiles pulled
240 out of 244 tiles pulled
Complete: 244 tiles pulled


### Identify necessary files from the input directory

In [2]:
import csv, json, glob

# use this variable to set input directory
input_dir = local_download_path

training_data_file = 'annotations_train.csv'
testing_data_file = 'annotations_test.csv'
validation_data_file = 'annotations_valid.csv'
classes_file = 'classes.csv'
subset_list_file = 'subset_list.csv'
tiling_scheme_file = 'tiling_scheme_placeholder'

checklist = {training_data_file:"training_data_file", testing_data_file:"testing_data_file", 
             validation_data_file:"validation_data_file", classes_file:"classes_file",
             subset_list_file:"subset_list_file", tiling_scheme_file:"tiling_scheme_file"}

for fname in os.listdir(input_dir):
  if fname.endswith(".csv"): 
    try: 
      vars()[checklist[fname]] = "{i}/{f}".format(i=input_dir, f=fname)
      print("required file found: {v}".format(v=vars()[checklist[fname]]))
      del checklist[fname]
    except: print("{f} detected but not listed among requirements".format(f=fname))
  if fname.endswith(".json"):
    tiling_scheme_candidate = "{i}/{f}".format(i=input_dir, f=fname)
    with open(tiling_scheme_candidate) as f:
      try:
        tile_list = json.load(f)["tile_pointers"]["image_locations"]
        tiling_scheme_file = tiling_scheme_candidate
        print("required file found: {s}".format(s=tiling_scheme_file))
        del checklist['tiling_scheme_placeholder']
      except: 
        print("{f} detected but not listed among requirements".format(f=fname))

if len(checklist) > 0:
  for key in checklist:
    print("Error: did not find {k} in your input folder".format(k=key))
  raise Exception("missing specified data files")
  
# confirm that all files in tiling_scheme_file were pulled from Google Drive
if len(tile_list) != len(image_list):
  print('Step one produced {n1} tiles, but google drive contained {n2} images. Confirm that tile set is complete.\n'.format(n1=len(tile_list), n2=len(image_list)))
  raise Exception("tile count mismatch")

new_detections.json detected but not listed among requirements
required file found: data/tiling_scheme.json
required file found: data/annotations_valid.csv
via_SealCNN_TrainingData.csv detected but not listed among requirements
required file found: data/classes.csv
required file found: data/annotations_test.csv
required file found: data/annotations_train.csv
required file found: data/subset_list.csv


### Install the Convolutional Neural Network that will do the detections. 

This section sets up the software and pulls the code for a CNN object detector called "RetinaNet", which uses a ResNet model as its backbone (ResNet-50, ResNet-101 or ResNet-152, chosen with the `resnet_model` variable below). It then downloads weights for an existing RetinaNet model (pre-trained for object detection), which we will train further for our task.

Disregard any errors or prompts to "restart runtime" unless the code stops progressing (then email me at gdl10@duke.edu).

In [3]:
# clear colab's current versions
!pip uninstall -y keras
!pip uninstall -y keras-nightly
!pip uninstall -y tensorflow
!pip install h5py==2.10.0  

# install the keras package we need
!pip3 install keras==2.4.3
# install the TF version we need
!pip3 uninstall tensorflow -y
!pip3 install tensorflow==2.3.0

Found existing installation: keras 2.7.0
Uninstalling keras-2.7.0:
  Successfully uninstalled keras-2.7.0
Found existing installation: tensorflow 2.7.0
Uninstalling tensorflow-2.7.0:
  Successfully uninstalled tensorflow-2.7.0
Collecting h5py==2.10.0
  Downloading h5py-2.10.0-cp37-cp37m-manylinux1_x86_64.whl (2.9 MB)
     |████████████████████████████████| 2.9 MB 13.0 MB/s 
Installing collected packages: h5py
  Attempting uninstall: h5py
    Found existing installation: h5py 3.1.0
    Uninstalling h5py-3.1.0:
      Successfully uninstalled h5py-3.1.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
keras-vis 0.4.1 requires keras, which is not installed.
Successfully installed h5py-2.10.0
Collecting keras==2.4.3
  Downloading Keras-2.4.3-py2.py3-none-any.whl (36 kB)
Installing collected packages: keras
Successfully installed keras-2.4.3
Collecting ten

In [4]:
# copy the files for RetinaNet
# note that this build is now deprecated, but we are fine with that
# now pulling from a personal clone that outputs error metrics
! git clone https://github.com/gl7176/keras-retinanet.git

Cloning into 'keras-retinanet'...
remote: Enumerating objects: 6236, done.
remote: Total 6236 (delta 0), reused 0 (delta 0), pack-reused 6236
Receiving objects: 100% (6236/6236), 13.48 MiB | 18.40 MiB/s, done.
Resolving deltas: 100% (4221/4221), done.


In [5]:
# change directory and install RetinaNet from the copied code
%cd keras-retinanet

! pip install .

/content/keras-retinanet
Processing /content/keras-retinanet
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
Collecting keras-resnet==0.2.0
  Downloading keras-resnet-0.2.0.tar.gz (9.3 kB)
Building wheels for collected packages: keras-retinanet, keras-resnet
  Building wheel for keras-retinanet (setup.py) ... done
  Created wheel for keras-retinanet: filename=keras_retinanet-1.0.0-cp37-cp37m-linux_x86_64.whl size=169810 sha256=1f2c739185bdf23a34bf970569509e186d25ad9cef90c06209c82d29ae5d65db
  Stored in directory: /root/.cache/pip/wheels/32/29/34/9b33c07f08b1be9e77607c1fc6b08c679489aa7ddaed329652
  Building wheel for keras-r

In [6]:
! python setup.py build_ext --inplace

running build_ext
cythoning keras_retinanet/utils/compute_overlap.pyx to keras_retinanet/utils/compute_overlap.c
  tree = Parsing.p_module(s, pxd, full_module_name)
building 'keras_retinanet.utils.compute_overlap' extension
creating build
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/keras_retinanet
creating build/temp.linux-x86_64-3.7/keras_retinanet/utils
x86_64-linux-gnu-gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fdebug-prefix-map=/build/python3.7-Y7dWVB/python3.7-3.7.12=. -fstack-protector-strong -Wformat -Werror=format-security -g -fdebug-prefix-map=/build/python3.7-Y7dWVB/python3.7-3.7.12=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.7m -I/usr/local/lib/python3.7/dist-packages/numpy/core/include -c keras_retinanet/utils/compute_overlap.c -o build/temp.linux-x86_64-3.7/keras_retinanet/utils/compute_overlap.o
In file included from /usr

In [7]:
# if you run this multiple times in a session it will break your pathing
%cd ../

/content


In [43]:
import subprocess

resnet_model = "resnet152"

if resnet_model == "resnet50":
  resnet_model_name = "retinanet_resnet50_500_classes_0.4594.h5"
elif resnet_model == "resnet101":
  resnet_model_name = "retinanet_resnet101_500_classes_0.4986.h5"
elif resnet_model == "resnet152":
  resnet_model_name = "retinanet_resnet152_500_classes_0.4991.h5"

#! wget -P data "https://github.com/ZFTurbo/Keras-RetinaNet-for-Open-Images-Challenge-2018/releases/download/v1.3/retinanet_resnet152_500_classes_0.4991.h5"

subprocess.run(['wget', '-P', 'data', 'https://github.com/ZFTurbo/Keras-RetinaNet-for-Open-Images-Challenge-2018/releases/download/v1.3/{r}'.format(r=resnet_model_name)])

CompletedProcess(args=['wget', '-P', 'data', 'https://github.com/ZFTurbo/Keras-RetinaNet-for-Open-Images-Challenge-2018/releases/download/v1.3/retinanet_resnet152_500_classes_0.4991.h5'], returncode=0)

### Train the Model

We give the model the pre-trained weights that we downloaded above and tell it to train on the `training_data_file`, using the hyper-parameters `epoch_number` and `batch_size_number`.

An epoch is a group of steps after which the model evaluates its accuracy. The epoch parameter sets the maximum number of epochs that training will run. In this framework, training stops early once the model has stopped improving (based on mAP) for several consecutive epochs; that number is set by a 'patience' variable (here, 5).
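
To make the role of the patience value concrete, here is a minimal sketch of a patience-based stopping rule. It is an illustration only: the function name `stopping_epoch` and the mAP history are hypothetical, and the exact rule applied by the training script is not shown in this notebook, so treat the logic below as an assumption.

```python
def stopping_epoch(map_per_epoch, patience=5):
    """Return the 1-based epoch at which training would stop.

    Assumed rule: stop once `patience` consecutive epochs have passed
    without the mAP exceeding the best value seen so far.
    """
    best = float("-inf")
    epochs_without_gain = 0
    for epoch, score in enumerate(map_per_epoch, start=1):
        if score > best:
            best = score
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch
    # the history ended before patience ran out
    return len(map_per_epoch)


# hypothetical mAP values, one per epoch
history = [0.10, 0.20, 0.30, 0.30, 0.29, 0.28, 0.30, 0.25, 0.40]
print(stopping_epoch(history, patience=5))
```

With this made-up history the best value is reached at epoch 3 and the next five epochs do not beat it, so the sketch stops at epoch 8 and never sees the later improvement. That is the trade-off a patience setting makes.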

A step is one increment of training the model on a single batch (subset) of images. The number of steps per epoch is bounded above by the number of training images divided by the batch size, which we calculate in the code.

A batch is the set of images analyzed in each step. Batch size is limited in practice by RAM (how many images the computer can hold in memory); given our default tile size, Colab runs out of memory at batch sizes larger than 2.
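
The relationship between training images, batch size and steps can be sketched as a single integer division. The helper name `steps_per_epoch` is hypothetical; the numbers are the ones this notebook reports (210 training images, batch size 2).

```python
def steps_per_epoch(num_training_images, batch_size):
    """Largest number of full batches that fit in the training set."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return num_training_images // batch_size


print(steps_per_epoch(210, 2))  # 105, matching the "105 steps possible" output
```

Integer division drops any leftover images that do not fill a complete batch, so 211 images at batch size 2 would still give 105 steps.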

In [44]:
# Pull the total number of training images so we can calculate the maximum step number

import csv

training_subset_count = 0
with open(subset_list_file) as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
      if "training" in row:
        training_subset_count += 1
print(training_subset_count)

210


In [None]:
import subprocess

epoch_number = 50
batch_size_number = 2
step_number = int(training_subset_count/batch_size_number)
print(str(step_number) + " steps possible")

# the following code gets passed to the terminal, but we've moved it into a
# subprocess that pulls variables instead; use terminal code for troubleshooting
# or for teaching when you want to see the training in action and use custom variables

# (uncomment for teaching purposes)
#! keras-retinanet/keras_retinanet/bin/train.py \
#--weights data/resnet50_coco_best_v2.1.0.h5 \
#--epochs 20 --steps 10 --batch-size 2 \
#csv data/annotations_train.csv data/classes.csv \
#--val-annotations data/annotations_valid.csv

# this process takes a while to run, be warned! Consider running in terminal commands
# (commented out above) if you want to see the live output as it's running

# you can also monitor epoch outputs by output files in the "output" folder

# (commented out for teaching purposes)
#model_run = subprocess.check_output(['keras-retinanet/keras_retinanet/bin/train.py', str('--backbone="{r}"'.format(r=resnet_model)),
model_run = subprocess.check_output(['keras-retinanet/keras_retinanet/bin/train.py', str('--backbone={r}'.format(r=resnet_model)),
                 '--weights', str('data/' + resnet_model_name),
                 '--epochs', str(epoch_number),  '--steps', str(step_number), '--batch-size',
                 str(batch_size_number), 'csv', training_data_file, classes_file,
                 '--val-annotations', validation_data_file]).decode("utf-8")
print(model_run)

105 steps possible


In [None]:
# find the most recently written training snapshot
list_of_files = glob.glob('snapshots/resnet*.h5')
latest_file = max(list_of_files, key=os.path.getctime)
# pull the epoch number out of a name like "resnet50_csv_10.h5"
epoch_final = latest_file[latest_file.index("_csv_")+5:-3]
best_model_training = latest_file.replace("/content/", "")
print(best_model_training)

This next section converts the model from training mode to inference mode so it can be used to detect our target objects (seals). Until now we've been updating the model based on its performance; now we're fixing the model in a static "snapshot" so we can test it out. This conversion process takes a little time.

In [None]:
# note that we are naming our model "<backbone>_best_model_inference" and locating it in the "snapshots" directory. Customize if wanted
model_name = resnet_model + "_best_model_inference"
#! keras-retinanet/keras_retinanet/bin/convert_model.py snapshots/resnet50_csv_10.h5 snapshots/best_model_inference.h5
subprocess.run(["keras-retinanet/keras_retinanet/bin/convert_model.py", best_model_training, "snapshots/{m}.h5".format(m=model_name)])


### Run Detection in inference mode

This section sets up the environment by importing general-purpose Python modules and the modules specific to Keras and RetinaNet.

In [None]:
# show images inline
%matplotlib inline

# automatically reload modules when they have changed
%load_ext autoreload
%autoreload 2

# import keras
import keras

# import keras_retinanet
from keras_retinanet import models
from keras_retinanet.utils.image import read_image_bgr, preprocess_image, resize_image
from keras_retinanet.utils.visualization import draw_box, draw_caption
from keras_retinanet.utils.colors import label_color

# import miscellaneous modules
import matplotlib.pyplot as plt
import cv2
import os
import numpy as np
import time
import json
from random import shuffle

# set tf backend to allow memory to grow, instead of claiming everything
import tensorflow as tf

def get_session():
    config = tf.compat.v1.ConfigProto()    
    config.gpu_options.allow_growth = True
    return tf.compat.v1.Session(config=config)

# use this environment flag to change which GPU to use
#os.environ["CUDA_VISIBLE_DEVICES"] = "1"

# set the modified tf session as backend in keras
tf.compat.v1.keras.backend.set_session(get_session())

### Load RetinaNet model

Now we will load the model that you just converted into inference mode. Its path is built from the `model_name` variable set above, which by default ends in `best_model_inference`.

In [None]:
model_path = 'snapshots/{m}.h5'.format(m=model_name)

print(model_path)

# load retinanet model
model = models.load_model(model_path, backbone_name=resnet_model)

# you can check out a model summary by uncommenting this line
#print(model.summary())

# load label to names mapping for visualization purposes
# pull labels from classes.csv
import csv
with open(classes_file, "r") as f:
    reader = csv.reader(f, delimiter=",")
    labels_to_names = {int(i[1]):i[0] for i in reader}

### Load test imagery

Now we will load the "testing" subset of images that we downloaded into our data directory during setup (as listed in the subsets CSV file). Just to check, we'll print out the first five names of those images.
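
The subset filtering described above can be sketched on a small in-memory CSV. The rows below are hypothetical, and the layout (file name in the first column, subset label in the second) is an assumption inferred from how the notebook's code indexes each row.

```python
import csv
import io

# hypothetical rows: column 0 = tile file name, column 1 = subset label
example_csv = (
    "tile_001.png,training\n"
    "tile_002.png,testing\n"
    "tile_003.png,validation\n"
    "tile_004.png,testing\n"
)


def select_subset(csv_text, target_subset, image_dir="data/"):
    """Return the paths of all images whose subset label matches."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=",")
    return [
        image_dir + row[0]
        for row in reader
        if len(row) > 1 and row[1] == target_subset
    ]


print(select_subset(example_csv, "testing"))
# ['data/tile_002.png', 'data/tile_004.png']
```

The label passed in must match the text in the CSV exactly, so a value such as `"test"` would return an empty list if the file uses `"testing"`.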

In [None]:
# load imagery
image_dir = "data/"

# this code pulls only the files from one subset of the data;
# "target_subset" must match a label used in subset_list.csv (e.g. "testing")
target_subset = "testing"

image_list = []
with open(subset_list_file, "r") as f:
    reader = csv.reader(f, delimiter=",")
    for i in reader:
      if i[1] == target_subset:
        image_list.append(image_dir + i[0])
print(image_list[:5])


### Test out detections

Now we'll visualize some detections from our model to see how it performs. Each detection has a "confidence score" that describes the CNN's confidence that the detection is correct. Change the minimum confidence score (the first line of code) and re-run the code to check out how your "confidence threshold" affects the numbers of false positives and false negatives.
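
As a small illustration of how the threshold changes what is kept, the sketch below applies several thresholds to one made-up list of confidence scores. The scores and the helper name `split_by_threshold` are hypothetical; no model is involved.

```python
def split_by_threshold(scores, min_score):
    """Separate scores into those kept and those dropped by a threshold."""
    kept = [s for s in scores if s >= min_score]
    dropped = [s for s in scores if s < min_score]
    return kept, dropped


# hypothetical confidence scores for detections on one tile
example_scores = [0.92, 0.71, 0.45, 0.38, 0.12]

for threshold in (0.3, 0.4, 0.8):
    kept, dropped = split_by_threshold(example_scores, threshold)
    print(threshold, len(kept), "kept,", len(dropped), "dropped")
```

A lower threshold keeps more detections, which tends to admit more false positives; a higher threshold drops more of them, which tends to miss more real objects.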

In [None]:
min_score = 0.4 # this is the CNN's confidence that the detection is correct
detection_iterations = 10 # max number of images to visualize

visualize = True

detections = {}

total_time = 0

count = 0
shuffle(image_list)

for image_path in image_list:
    if count >= detection_iterations:
        break
    else:
        count +=1
        
    image = read_image_bgr(image_path)

    if visualize:
        # copy to draw on
        draw = image.copy()
        draw = cv2.cvtColor(draw, cv2.COLOR_BGR2RGB)

    # preprocess image for network
    image = preprocess_image(image)
    image, scale = resize_image(image)

    # process image
    start = time.time()
    boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
    total_time += time.time() - start

    # correct for image scale
    boxes /= scale
    if any(score >= min_score for score in scores[0]):
        detections[image_path] = []

    # visualize detections
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        # scores are sorted so we can break
        if score < min_score:
            break

        #print(score)
        #print(box)

        # TODO this does create a slight error in the boxes, might be worth doing something like
        # list(map(str, box) but then would need to cast on the other end back to float
        b = box.astype(int)
        detections[image_path].append({"box" : b, "label" : label, "score" : score})

        if visualize:
            color = label_color(label)

            # b = box.astype(int)
            draw_box(draw, b, color=color)

            caption = "{} {:.3f}".format(labels_to_names[label], score)
            draw_caption(draw, b, caption)

    if any(score >= min_score for score in scores[0]):
        if visualize:
            plt.figure(figsize=(10, 10))
            plt.axis('off')
            plt.imshow(draw)
            plt.show()
    
# divide by the number of images actually processed, not the full list
print("Finished, time per image:", total_time/max(count, 1))

### Run Detections on all tiles
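This section repeats the process we just tested, without drawing the results, for every image in `image_list` (as written, that is the subset loaded above; point `image_list` at the full tile set to cover the whole orthomosaic). If you want to experiment, you can vary the confidence threshold and the amount of time the model trains, then look at how it affects the resulting detections.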


In [None]:
visualize = False
# min_score (the CNN's confidence that the detection is correct) carries over
# from the previous cell; assign a new value here to change the threshold

detections = {}

total_time = 0

for image_path in image_list:
       
    image = read_image_bgr(image_path)

    if visualize:
        # copy to draw on
        draw = image.copy()
        draw = cv2.cvtColor(draw, cv2.COLOR_BGR2RGB)

    # preprocess image for network
    image = preprocess_image(image)
    image, scale = resize_image(image)

    # process image
    start = time.time()
    boxes, scores, labels = model.predict_on_batch(np.expand_dims(image, axis=0))
    total_time += time.time() - start

    # correct for image scale
    boxes /= scale
    if any(score >= min_score for score in scores[0]):
        detections[image_path] = []

    # visualize detections
    for box, score, label in zip(boxes[0], scores[0], labels[0]):
        # scores are sorted so we can break
        if score < min_score:
            break

        #print(score)
        #print(box)

        # TODO this does create a slight error in the boxes, might be worth doing something like
        # list(map(str, box) but then would need to cast on the other end back to float
        b = box.astype(int)
        detections[image_path].append({"box" : b, "label" : label, "score" : score})

        if visualize:
            color = label_color(label)

            # b = box.astype(int)
            draw_box(draw, b, color=color)

            caption = "{} {:.3f}".format(labels_to_names[label], score)
            draw_caption(draw, b, caption)

    if any(score >= min_score for score in scores[0]):
        if visualize:
            plt.figure(figsize=(10, 10))
            plt.axis('off')
            plt.imshow(draw)
            plt.show()
    
print("Finished, time per image:", total_time/len(image_list))

Run an evaluation script to get the mean average precision (mAP) of the CNN. 

mAP is a relative evaluation metric (i.e. it can be challenging to compare mAP values across datasets), but it is a good general metric for comparing different models and approaches to detecting objects on the same dataset.

Read more about mAP here: https://tarangshah.com/blog/2018-01-27/what-is-map-understanding-the-statistic-of-choice-for-comparing-object-detection-models/
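
For intuition, here is a minimal sketch of one common way to compute average precision for a single class and then average it across classes. It is an illustration under stated assumptions: the function names and the example data are hypothetical, the detections are assumed to be already sorted by descending score and already matched to ground truth, and the evaluation script may use a different (for example interpolated) variant.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Non-interpolated AP for one class.

    is_true_positive: booleans for detections sorted by descending score.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    true_positives = 0
    precision_at_hits = []
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            true_positives += 1
            precision_at_hits.append(true_positives / rank)
    return sum(precision_at_hits) / num_ground_truth


def mean_average_precision(per_class):
    """Mean of the per-class AP values; per_class maps name -> (hits, count)."""
    aps = [average_precision(hits, count) for hits, count in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# hypothetical matched detections for two classes
example = {
    "adult": ([True, False, True], 2),
    "pup": ([True, True], 2),
}
print(round(mean_average_precision(example), 4))
```

In this made-up case the first class scores (1/1 + 2/3) / 2 and the second scores 1.0, so a single false positive ranked above a true positive is enough to pull the mean below 1.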

In [None]:
#! keras-retinanet/keras_retinanet/bin/evaluate.py csv data/annotations_test.csv data/classes.csv snapshots/test_model.h5

precision_metrics = subprocess.check_output(['keras-retinanet/keras_retinanet/bin/evaluate.py', 'csv', testing_data_file, classes_file, model_path]).decode("utf-8")
model_summary = 'Model {m} was generated using {e} epochs, {s} steps per epoch and a batch size of {b}'.format(m=model_name, e=epoch_final, s=step_number, b=batch_size_number)
print(model_summary)
print(precision_metrics)

### Export detections
Write out the detections to a json file that can be used in a GIS for spatial databases and/or visualizations.
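
The shape of the exported file can be sketched with plain Python values. The entry below is hypothetical; the keys `box`, `label` and `score` come from the detection loop in this notebook, while the reading of the four box values as x1, y1, x2, y2 pixel corners is an assumption.

```python
import json

# hypothetical detections: {image path: [one dict per detection]}
example_detections = {
    "data/tile_002.png": [
        # box assumed to be [x1, y1, x2, y2] in pixels
        {"box": [10, 20, 50, 80], "label": 0, "score": 0.91},
        {"box": [120, 40, 160, 95], "label": 0, "score": 0.47},
    ]
}

# write to a JSON string and read it back, as a downstream tool would
text = json.dumps(example_detections)
restored = json.loads(text)

for image_path, dets in restored.items():
    for det in dets:
        print(image_path, det["label"], det["score"], det["box"])
```

Plain lists, ints and floats serialize directly; the custom encoder in the next cell exists because the real detections hold NumPy values, which the standard `json` module cannot serialize on its own.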

In [None]:
class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

In [None]:
output_file_name = 'data/new_detections.json'
with open(output_file_name, 'w') as fp:
    json.dump(detections, fp, cls=MyEncoder)

In [None]:
from google.colab import files
files.download("/content/" + output_file_name)

### Export model and metrics

In [None]:
# export metrics (fast)
files.download("/content/output/Epoch-{n}.png".format(n=epoch_final))
files.download("/content/output/Epoch-{n}.csv".format(n=epoch_final))

In [None]:
#export inference model (slow)
files.download("/content/{m}".format(m=model_path))

In [None]:
#export training model (even slower)
files.download("/content/{m}".format(m=best_model_training))

#### At the end of this script you should have a single `json` file downloaded (in addition to model training metrics in `png` and `csv` format) and two versions of the final model: an inference version for deployment and a training version for further training. Drop the `json` in the Google Drive folder so it can be imported in the next step.

Next steps:

5) export CNN outputs