<a href="https://colab.research.google.com/github/aubricot/object_detection_for_image_cropping/blob/master/training_yolo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training YOLO in Darkflow to detect bats from EOL images
---
*Last Updated 29 January 2020*   
Use images and annotation files to train YOLO in Darkflow to detect bats from EOL images.

Datasets exported from [preprocessing.ipynb](https://colab.research.google.com/github/aubricot/object_detection_for_image_cropping/blob/master/preprocessing.ipynb) were converted to xml formatted annotation files before use in this notebook. Images and annotations should be uploaded to Google Drive before using this notebook to train YOLO in Darkflow. 
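
As a rough illustration of the XML step, the sketch below builds one annotation file with the standard library. It assumes the Pascal VOC layout (`filename`, `size`, and one `object` with a `bndbox` per box), which is the format Darkflow's training instructions refer to. The function name `voc_annotation`, the file name, the image size and the box coordinates are all hypothetical and not taken from the project's conversion script.

```python
import xml.etree.ElementTree as ET


def voc_annotation(filename, width, height, boxes, depth=3):
    """Return a Pascal VOC style XML string for one image.

    boxes is a list of (label, xmin, ymin, xmax, ymax) tuples in pixels.
    Assumption: only the fields shown here are needed for training.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)

    for label, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(int(value))

    return ET.tostring(root, encoding="unicode")


# Hypothetical example: one bat in a 640 x 480 image
print(voc_annotation("bat_001.jpg", 640, 480,
                     [("Chiroptera", 10, 20, 300, 400)]))
```

Each XML string would be saved as one file per image in the folder passed to `--annotation`.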

For each 24 hour period on Google Colab, you have up to 12 hours of GPU access. Training the object detection model on bats took 30 hours split into 3 days.

Make sure to set the runtime to Python 3 with a GPU hardware accelerator.   

References:
[Official Darkflow training instructions](https://github.com/thtrieu/darkflow)   
[Medium Blog on training using YOLO via Darkflow in Colab](https://medium.com/coinmonks/detecting-custom-objects-in-images-video-using-yolo-with-darkflow-1ff119fa002f)

## Installs
---

In [0]:
# Mount google drive to import/export files
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

In [0]:
# Change to your working directory
%cd drive/My Drive/fall19_smithsonian_informatics/train

# Install libraries
# Make sure you are using Python 3.6
!python --version
!pip install tensorflow-gpu==1.15.0rc2
!pip install cython
!pip install opencv-python

In [0]:
# Download darkflow (the tensorflow implementation of YOLO)
import os
import pathlib
import shutil 

if os.path.exists("darkflow-master"):
  %cd darkflow-master/darkflow
  !pwd

else:
    !git clone --depth 1 https://github.com/thtrieu/darkflow.git
    # Compile darkflow
    %cd darkflow
    !python setup.py build_ext --inplace
    # Rename darkflow to darkflow-master to distinguish between folder names
    shutil.move('/content/drive/My Drive/fall19_smithsonian_informatics/train/darkflow', 
              '/content/drive/My Drive/fall19_smithsonian_informatics/train/darkflow-master')

# Change wd to darkflow-master
%cd ../

In [0]:
# Check time remaining in Google Colab session
import time, psutil
uptime = time.time() - psutil.boot_time()  # seconds since the Colab VM booted
remaining = 12 * 3600 - uptime  # seconds left of the 12 hour session
print('Time remaining for this session is: {:.2f} hours'.format(remaining / 3600))

### Imports   
---

In [0]:
%cd darkflow-master

# For importing/exporting files, working with arrays, etc
from google.colab import files
import os
import pathlib
import imageio
import time
import csv
import urllib.request
import numpy as np
import pandas as pd

# For the actual object detection
from darkflow.net.build import TFNet

# For drawing onto and plotting the images
import matplotlib.pyplot as plt
import cv2
%config InlineBackend.figure_format = 'svg'
%matplotlib inline

### Model Preparation
---   
For detailed instructions on training YOLO using a custom dataset, see the [Darkflow GitHub Repository](https://github.com/thtrieu/darkflow).

In [0]:
# Download yolo.weights, the pre-trained weights file (for YOLO v2), from Google Drive
weights = 'bin/yolo'
weights_file = weights + '.weights'
# Only download if bin/yolo.weights does not already exist
if not os.path.exists(weights_file):
  !gdown --id 0B1tW_VtY7oniTnBYYWdqSHNGSUU
  !mkdir -p bin
  !mv yolo.weights bin

In [0]:
# Test installation, you should see an output with different parameters for flow
!python flow --h

#### if training on one class
---

In [0]:
# Make new label file/overwrite existing labels.txt downloaded with darkflow
!echo "Chiroptera" > labels.txt

# Download model config file edited for training with 1 class
config = 'yolo-1c'
mod_config = config + '.cfg'
# Only download if cfg/yolo-1c.cfg does not already exist
if not os.path.exists(os.path.join('cfg', mod_config)):
  %cd train/darkflow-master/cfg
  !gdown --id 1bjt5Mqvf4AZSLNARgtgmZsfHZSyFj2yx

#### if training on many classes
---

In [0]:
# Download config file to your computer
files.download('cfg/yolo.cfg') 

# Re-name the file to yolo-[no. classes]c.cfg, ex: yolo-1c.cfg
# Modify the file using a text editor
# Under last [convolutional] section (line 237), change filters, filters=(no. classes + 5)*5, ex: on 1 class, filters = 30
# Under last [region] section (line 244), change classes, classes=[no. classes]
# Manually upload the file to your working directory
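
The two manual edits described above can also be sketched in code. This is an illustrative alternative to the text editor, not part of the original workflow: `yolo_v2_filters` and `patch_cfg_text` are hypothetical helper names, and the sketch assumes the last `filters=` line before the final `[region]` section and the first `classes=` line after it are the ones to change, rather than relying on fixed line numbers.

```python
import re


def yolo_v2_filters(num_classes, num_anchors=5):
    """filters = (no. classes + 5) * 5 for the last convolutional layer."""
    return num_anchors * (num_classes + 5)


def patch_cfg_text(cfg_text, num_classes, num_anchors=5):
    """Return cfg text with filters and classes updated for num_classes."""
    lines = cfg_text.splitlines()
    # Index of the last [region] section header
    region_idx = max(i for i, line in enumerate(lines)
                     if line.strip() == "[region]")
    # Last filters= line that comes before that [region] section
    filters_idx = max(i for i in range(region_idx)
                      if re.match(r"\s*filters\s*=", lines[i]))
    lines[filters_idx] = "filters={}".format(
        yolo_v2_filters(num_classes, num_anchors))
    # First classes= line inside the [region] section
    for i in range(region_idx + 1, len(lines)):
        if re.match(r"\s*classes\s*=", lines[i]):
            lines[i] = "classes={}".format(num_classes)
            break
    return "\n".join(lines) + "\n"


# Example: 1 class gives (1 + 5) * 5 = 30 filters
print(yolo_v2_filters(1))
```

To use it, read `cfg/yolo.cfg` as text, pass it through `patch_cfg_text`, and write the result to `cfg/yolo-[no. classes]c.cfg`.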

## Train the model
---

In [0]:
# Train yolo-1c using pre-trained weights at yolo.weights for basal layers, last layer will be trained from scratch to detect bats
# Change the dataset and annotation directories to your paths in Google Drive
%cd darkflow-master
!python flow --model cfg/yolo-1c.cfg --train --trainer adam --load bin/yolo.weights --gpu 0.8 --epoch 3000 --dataset "/content/drive/My Drive/fall19_smithsonian_informatics/train/images" --annotation "test/training/annotations"

In [0]:
# For training the model starting from specified checkpoint
# Change the value of --load (25625 here) to the number of steps of your most recent checkpoint file
#--pbLoad .pb --metaLoad .meta
!python flow --load 25625 --model cfg/yolo-1c.cfg --train --savepb --trainer adam --gpu 0.8 --epoch 3000 --dataset "/content/drive/My Drive/fall19_smithsonian_informatics/train/images" --annotation "test/training/annotations"
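
To look up the step number for `--load`, something like the sketch below could be used. It assumes checkpoints are saved in a folder such as `ckpt/` with names of the form `<model name>-<step>.meta` (for example `yolo-1c-25625.meta`); check your own checkpoint folder, since both the folder and the naming pattern are assumptions here. `latest_checkpoint_step` is a hypothetical helper, not a Darkflow function.

```python
import os
import re


def latest_checkpoint_step(ckpt_dir, model_name):
    """Return the highest step among '<model_name>-<step>.meta' files, or None."""
    pattern = re.compile(r"^{}-(\d+)\.meta$".format(re.escape(model_name)))
    steps = []
    for name in os.listdir(ckpt_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


# Example usage (assumes a ckpt folder in the current directory)
if os.path.isdir("ckpt"):
    print(latest_checkpoint_step("ckpt", "yolo-1c"))
```

Passing `--load -1` instead asks Darkflow to pick up the most recent checkpoint itself, as in the protobuf export cell of this notebook.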

In [0]:
# Save the last checkpoint to protobuf file
!python flow --model cfg/yolo-1c.cfg --load -1 --savepb

In [0]:
# Saving graph and weights to protobuf file
!python flow --model cfg/yolo-1c.cfg --load bin/yolo-1c.weights --savepb

In [0]:
# Import your own images to the folder passed to --imgdir (mini_test here), then run them through your trained object detector
!python flow --pbLoad built_graph/yolo-1c.pb --gpu 0.8 --metaLoad built_graph/yolo-1c.meta --imgdir "/content/drive/My Drive/fall19_smithsonian_informatics/train/mini_test"

## 'Flow' test images through the trained object detector
--- 

In [0]:
# For uploading an image from url
# Modified from https://www.pyimagesearch.com/2015/03/02/convert-url-to-image-with-python-and-opencv/
def url_to_image(url):
  resp = urllib.request.urlopen(url)
  image = np.asarray(bytearray(resp.read()), dtype="uint8")
  image = cv2.imdecode(image, cv2.IMREAD_COLOR)
  image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
 
  return image

In [0]:
# For drawing bounding boxes around detected objects on images
def boxing(image, predictions):
    newImage = np.copy(image)
    im_height, im_width, im_depth = image.shape

    # Path to the tsv file for exporting detection results (change to your own path in Google Drive)
    tsv_path = '/content/drive/My Drive/fall19_smithsonian_informatics/train/crops_yolo_1c_test_1.tsv'
    # Write the header once, only if the file is missing or empty
    if not os.path.exists(tsv_path) or os.path.getsize(tsv_path) == 0:
        with open(tsv_path, 'a') as out_file:
            tsv_writer = csv.writer(out_file, delimiter='\t')
            tsv_writer.writerow(['obj_url', 'height', 'width', 'xmin', 'ymin', 'xmax', 'ymax'])
    # Organize results of object detection for plotting and export
    for result in predictions:
        xmin = result['topleft']['x']
        ymin = result['topleft']['y']

        xmax = result['bottomright']['x']
        ymax = result['bottomright']['y']

        confidence = result['confidence']
        label = result['label'] + " " + str(round(confidence, 3))

        # Only show boxes that are above 0.45 confidence and for the label Chiroptera (the class in labels.txt)
        if confidence > 0.45 and result['label'] == 'Chiroptera':
            # Draw boxes on images
            fontScale = min(im_width, im_height) / 600
            newImage = cv2.rectangle(newImage, (xmin, ymax), (xmax, ymin), (255, 0, 157), 3)
            newImage = cv2.putText(newImage, label, (xmin, ymax-5), cv2.FONT_HERSHEY_SIMPLEX, fontScale, (153, 255, 255), 5, cv2.LINE_AA)

            # Export detection results to the tsv file
            # image_url is the global variable set in the detection loop
            with open(tsv_path, 'a') as out_file:
                tsv_writer = csv.writer(out_file, delimiter='\t')
                tsv_writer.writerow([image_url, im_height, im_width,
                                     xmin, ymin, xmax, ymax])

        else:
            print("No bats detected in {}.".format(image_url))
    return newImage
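
The filtering step inside `boxing` can also be written as a small standalone function, which makes it easier to check without images or Drive access. The sketch below assumes predictions are a list of dictionaries with `label`, `confidence`, `topleft` and `bottomright` keys, as used in `boxing` above. `select_detections` and the example values are hypothetical.

```python
def select_detections(predictions, label, min_confidence):
    """Return (xmin, ymin, xmax, ymax) for predictions matching label above min_confidence."""
    boxes = []
    for pred in predictions:
        if pred["label"] == label and pred["confidence"] > min_confidence:
            boxes.append((pred["topleft"]["x"], pred["topleft"]["y"],
                          pred["bottomright"]["x"], pred["bottomright"]["y"]))
    return boxes


# Made-up predictions in the same layout as the detector output
example_predictions = [
    {"label": "Chiroptera", "confidence": 0.91,
     "topleft": {"x": 12, "y": 30}, "bottomright": {"x": 240, "y": 310}},
    {"label": "Chiroptera", "confidence": 0.20,
     "topleft": {"x": 5, "y": 5}, "bottomright": {"x": 50, "y": 60}},
]
print(select_detections(example_predictions, "Chiroptera", 0.45))
```

Keeping the selection separate from drawing and file export means the threshold and label can be changed in one place.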

In [0]:
# Define parameters for "flow"ing the images through the model
params = {
    'model': 'cfg/yolo-1c.cfg',
    'load': 'bin/yolo.weights',
    'gpu': 0.8,
    'threshold': 0.45, 
    'pbLoad': 'built_graph/yolo-1c.pb', 
    'metaLoad': 'built_graph/yolo-1c.meta' 
}

# Run the model
tfnet = TFNet(params)

In [0]:
# Loop through the first 5 image urls from the text file
# df1 must already be a pandas DataFrame with a "link" column of image urls
for i, row in df1.head(5).iterrows():

# For ranges of rows or all rows, use the commands below
# Can be useful if running batch jobs
#for i, row in df1.iloc[500:800].iterrows():
#for i, row in df1.iterrows():
#for i, row in df1.tail(5).iterrows():

  image_url = row["link"]
  try:
    # Record inference time
    start_time = time.time()
    image = url_to_image(image_url)
    # Detection
    result = tfnet.return_predict(image)
    end_time = time.time()
    # Draw boxes on images (call boxing once so each detection is exported once)
    boxed_image = boxing(image, result)
  
    # If running detection on >50 images, do not display detection results
    # Instead run below command to track progress
    #print('Detection complete in {} of 1,000 images'.format(i+1))
  
  except Exception:
    print('Error: check if web address {} is valid'.format(image_url))
    # Skip plotting for images that could not be loaded or processed
    continue
  
  # Plot and show detection boxes on images
  # Comment out this portion if running detection on >50 images
  _, ax = plt.subplots(figsize=(10, 10))
  ax.imshow(boxed_image)
  plt.title('{}) Inference time: {}'.format(i+1, format(end_time-start_time, '.2f')))
  plt.show()