<a href="https://colab.research.google.com/github/aubricot/object_detection_for_image_cropping/blob/master/aves_tf_ssd_rcnn.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using Faster-RCNN and SSD in Tensorflow to detect birds from images   
---   
*Last Updated 9 December 2019*   
This notebook uses Faster-RCNN and SSD for customized, large-scale image processing with Tensorflow. Using the location and dimensions of the detected birds, images will be cropped to square dimensions that are centered and padded around the detection box. Pre-trained models are used for "out of the box" inference on images of birds of varying dimensions and resolutions. In future efforts, the models will be modified and fine-tuned for other taxonomic groups.

This notebook is meant to be run entirely in Google Colab and doesn't require any software installations or downloads to your local machine. To get started, just click the "Open in Colab" button. 

This notebook is modified from [this Medium post](https://medium.com/@nickbortolotti/tensorflow-object-detection-api-in-5-clicks-from-colaboratory-843b19a1edf1). The [Tensorflow Object Detection API Tutorial](https://github.com/tensorflow/models/tree/master/research/object_detection) was also used as a reference. The Tensorflow Object Detection API is meant for building models for custom object detection. For more information, see [Tensorflow Object Detection API](https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/install.html#tensorflow-models-installation). 

## Installs
---
Install the Tensorflow Object Detection API directly to this Colab notebook.

In [0]:
!git clone https://github.com/tensorflow/models.git
!apt-get -qq install libprotobuf-java protobuf-compiler
!protoc ./models/research/object_detection/protos/string_int_label_map.proto --python_out=.
!cp -R models/research/object_detection/ object_detection/
!rm -rf models

### Imports   
---

In [0]:
%tensorflow_version 1.x

import tensorflow as tf 
tf.compat.v1.enable_eager_execution()

# For importing/exporting files, working with arrays, etc
import os
import pathlib
import six.moves.urllib as urllib
import sys
import tarfile
import zipfile
import numpy as np 
import csv
import matplotlib
import time
import pandas as pd

# For downloading the images
import tempfile
from six.moves.urllib.request import urlopen
from six import BytesIO
from collections import defaultdict
from io import StringIO

# For drawing onto and plotting the images
import matplotlib.pyplot as plt
from PIL import Image
from PIL import ImageColor
from PIL import ImageDraw
from PIL import ImageFont
from PIL import ImageOps

import cv2

from IPython.display import display

from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

### Model Preparation
--- 
Configure the model to use and select needed elements to use the Object Detection API.

In [0]:
# What model to download. Can choose between SSD or Faster RCNN by commenting out/in the different MODEL_NAME(s) below
# SSD Model
# MODEL_NAME = 'ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_2018_07_03'

# Faster RCNN Model
MODEL_NAME = 'faster_rcnn_resnet50_coco_2018_01_28'

MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# List of the strings that are used to add the correct label to each box.
PATH_TO_LABELS = os.path.join('object_detection/data', 'mscoco_label_map.pbtxt')

NUM_CLASSES = 90

# Download the model archive (URLopener is deprecated, so use urlretrieve)
urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
  file_name = os.path.basename(file.name)
  if 'frozen_inference_graph.pb' in file_name:
    tar_file.extract(file, os.getcwd())
    
detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.compat.v1.GraphDef()
  with tf.io.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
    
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

## Define functions to load in sample images
--- 

In [0]:
def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
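
The reshape above relies on PIL reporting size as (width, height) while the array is laid out as (height, width, channels). As a minimal sketch of that layout, and of the batch dimension added later before inference, the snippet below uses a made-up list of pixels in place of `image.getdata()`. The `demo_` names and pixel values are illustrative only.

```python
import numpy as np

# Hypothetical 4x2 (width x height) image; pixels are listed row by row,
# which is the order image.getdata() returns them in
demo_width, demo_height = 4, 2
demo_pixels = [(r, 0, 0) for r in range(demo_width * demo_height)]

# Same reshape as load_image_into_numpy_array: (height, width, 3)
demo_np = np.array(demo_pixels).reshape(
    (demo_height, demo_width, 3)).astype(np.uint8)

# Add the leading batch dimension the detection model expects: (1, height, width, 3)
demo_batch = np.expand_dims(demo_np, axis=0)

print(demo_np.shape, demo_batch.shape)
```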

In [0]:
def download_and_convert_image(url, display=False):
  # Download an image, convert it to RGB, and save it as a temporary JPEG
  fd, filename = tempfile.mkstemp(suffix=".jpg")
  os.close(fd)  # mkstemp returns an open file descriptor; close it so it is not leaked
  response = urlopen(url)
  image_data = BytesIO(response.read())
  pil_image_rgb = Image.open(image_data).convert("RGB")
  pil_image_rgb.save(filename, format="JPEG", quality=90)
  if display:
    # Optionally preview the downloaded image
    plt.figure()
    plt.imshow(pil_image_rgb)
  return filename

## Define function for object detection
---   
Use a Tensorflow session to detect objects using pre-trained models and display the results of detection.

In [0]:
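def show_inference(image_np_expanded):
  # Image to draw the detection results on (first and only image in the batch)
  image_np = image_np_expanded[0]
  with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
      # Define input and output Tensors for detection_graph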
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      # Each box represents a part of the image where a particular object was detected.
      detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      #max_boxes_to_draw = detection_boxes.shape[0] # add this line and remove (i) and (ii) below to show multiple detection boxes
      # Each score represents the level of confidence for each of the objects.
      # Score is shown on the result image, together with the class label.
      detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
      detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      min_score_thresh = .7

      # Actual detection.
      (boxes, scores, classes, num) = sess.run(
          [detection_boxes, detection_scores, detection_classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          min_score_thresh=min_score_thresh,
          max_boxes_to_draw=1,
          line_thickness=8)
      
      plt.figure()
      plt.imshow(image_np)
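
The introduction describes cropping each image to a square that is centered and padded around the detection box. This notebook does not include that step, so the following is only a minimal sketch of one way to compute such a crop. It assumes the box comes from `detection_boxes` in the normalized `[ymin, xmin, ymax, xmax]` order; the function name `square_crop_box` and the `pad_frac` parameter are hypothetical.

```python
def square_crop_box(box, im_width, im_height, pad_frac=0.0):
    """Return (left, top, right, bottom) pixel coordinates of a square crop
    around a normalized [ymin, xmin, ymax, xmax] detection box."""
    ymin, xmin, ymax, xmax = box
    # Convert normalized coordinates to pixels
    left, right = xmin * im_width, xmax * im_width
    top, bottom = ymin * im_height, ymax * im_height
    # Square side: longest box side plus optional padding, capped by the image size
    side = max(right - left, bottom - top) * (1.0 + pad_frac)
    side = min(side, im_width, im_height)
    # Center the square on the box, then shift it back inside the image if needed
    cx, cy = (left + right) / 2.0, (top + bottom) / 2.0
    x0 = min(max(cx - side / 2.0, 0), im_width - side)
    y0 = min(max(cy - side / 2.0, 0), im_height - side)
    return tuple(int(round(v)) for v in (x0, y0, x0 + side, y0 + side))

# Example: a box in a 400x200 pixel image
crop = square_crop_box((0.25, 0.25, 0.75, 0.5), im_width=400, im_height=200)
print(crop)
```

The returned tuple is in the order that PIL's `Image.crop` expects, so `image.crop(crop)` would produce the square image. When the box sits near an edge, the square is shifted to stay inside the image, so it is no longer exactly centered on the detection.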

## Load in sample images and run them through the object detector
---
You can either **A) Load individual images in by URL**, or, for large image batches, **B) Load multiple images from a text file of image URLs**. Other methods for importing to Google Colab are listed [here](https://colab.research.google.com/notebooks/io.ipynb#scrollTo=XDg9OBaYqRMd). 

**A) Load individual images in by URL**
Load in images by URL and run the image detector for all images. Plotted results include the image with bounding box around detected objects (birds), class type, and confidence score. Inference times are printed above images.

In [0]:
image_urls = ["https://content.eol.org/data/media/7e/9c/7a/542.15445377044.jpg",
              "https://content.eol.org/data/media/81/1c/0d/542.7816025222.jpg",
              "https://content.eol.org/data/media/7e/3c/0b/542.10578857864.jpg"]

In [0]:
for image_url in image_urls:
      image_path = download_and_convert_image(image_url)
      image = Image.open(image_path)
      # The array based representation of the image is needed for detection
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      start_time = time.time()
      # Detection function
      show_inference(image_np_expanded)
      end_time = time.time()
      # Display inference time above images
      plt.title('Inference time: {}'.format(format(end_time-start_time, '.2f')))
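
The plots show only the top box, but cropping needs the box itself. As a rough sketch, assuming `show_inference` were changed to return the raw `boxes`, `scores`, and `classes` arrays from `sess.run`, the highest-scoring bird could be selected as below. The helper name `best_bird_detection` is hypothetical, and the class id 16 for "bird" is taken from the COCO label map and should be checked against `category_index`.

```python
import numpy as np

BIRD_CLASS_ID = 16  # assumed id of "bird" in mscoco_label_map.pbtxt

def best_bird_detection(boxes, scores, classes, min_score_thresh=0.7):
    """Return (box, score) for the highest-scoring bird above the threshold,
    or None if no bird detection passes it."""
    # Drop the batch dimension added before inference
    boxes = np.squeeze(boxes, axis=0)
    scores = np.squeeze(scores, axis=0)
    classes = np.squeeze(classes, axis=0).astype(np.int32)
    # Keep only bird detections that meet the score threshold
    keep = (classes == BIRD_CLASS_ID) & (scores >= min_score_thresh)
    if not keep.any():
        return None
    # Index (in the original arrays) of the best remaining detection
    idx = np.flatnonzero(keep)[np.argmax(scores[keep])]
    return boxes[idx], float(scores[idx])

# Made-up outputs for one image with three detections
demo_boxes = np.array([[[0.1, 0.1, 0.5, 0.5],
                        [0.2, 0.3, 0.8, 0.9],
                        [0.0, 0.0, 0.1, 0.1]]])
demo_scores = np.array([[0.95, 0.90, 0.50]])
demo_classes = np.array([[1.0, 16.0, 16.0]])
result = best_bird_detection(demo_boxes, demo_scores, demo_classes)
```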

**B) Load multiple images from a text file of image URLs**
Load in multiple images from a text file of URLs and run the image detector for all images. Plotted results include the image with bounding box around detected objects (birds), class type, and confidence score. Inference times are printed above images. 

In [0]:
urls = 'https://editors.eol.org/other_files/bundle_images/files/images_for_Aves_breakdown_download_000001.txt'
df1 = pd.read_csv(urls)
df1.columns = ["link"]
df1.head()
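
One thing to check when loading a URL list this way: by default `pd.read_csv` treats the first line as a header, so if the file has no header row, the first URL becomes the column name and is lost when the column is renamed. The sketch below shows the alternative call on a small in-memory sample. The two example URLs are made up, and it is an assumption that the real file has one URL per line with no header.

```python
from io import StringIO

import pandas as pd

# Made-up stand-in for the contents of the remote text file
sample_text = "https://example.org/a.jpg\nhttps://example.org/b.jpg\n"

# header=None keeps the first line as data; names sets the column label directly
sample_df = pd.read_csv(StringIO(sample_text), header=None, names=["link"])
print(len(sample_df))
```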

In [0]:
# Loop through the first 5 image URLs from the text file
for i, image_url in df1["link"].head(5).items():
    image_path = download_and_convert_image(image_url)
    image = Image.open(image_path)
    # The array based representation of the image is needed for detection and is
    # used later to prepare the result image with boxes and labels on it
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Run the selected model (Faster-RCNN or SSD) and record inference time
    start_time = time.time()
    show_inference(image_np_expanded)
    end_time = time.time()
    # Display image number and inference time above images
    plt.title('{}) Inference time: {}'.format(i+1, format(end_time-start_time, '.2f')))