# Coffee Object Detection Demo

<!-- <img src="https://pages.databricks.com/rs/094-YMS-629/images/SBUX-DB.png"/> -->

This notebook provides a simplified example of object detection (e.g. coffee, cups, chairs) within images and videos. In addition to using *Databricks Runtime for ML - GPU*, we are also working with the [ImageAI](https://github.com/OlafenwaMoses/ImageAI/releases/download/2.0.2/imageai-2.0.2-py3-none-any.whl) library.

## Setup
The **pre-trained** models used by **[ImageAI](https://github.com/OlafenwaMoses/ImageAI/blob/master/imageai/Detection/VIDEO.md)** must be installed separately:
* **YOLOv3**: You Only Look Once (YOLO) is a state-of-the-art, real-time object detection system; we are using V3 as described in [YOLOv3: An Incremental Improvement](https://pjreddie.com/media/files/papers/YOLOv3.pdf) (used here for images)
* **RetinaNet (ResNet-50)**: a RetinaNet model (in h5 format, `resnet50_coco_best_v2.0.1.h5`) that uses a ResNet-50 residual network for feature extraction; it is loaded with `setModelTypeAsRetinaNet()` (used here for videos)

In [0]:
%sh
## Install the ImageAI pre-trained models: YoloV3, ResNet
# wget -P /dbfs/tmp/dennylee/cafe/ https://github.com/OlafenwaMoses/ImageAI/releases/download/1.0/yolo.h5
# wget -P /dbfs/tmp/dennylee/cafe/ https://github.com/OlafenwaMoses/ImageAI/releases/download/1.0/resnet50_coco_best_v2.0.1.h5
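
Because `loadModel()` needs the weights on disk, it can help to confirm that the downloads landed before continuing. The following is a minimal sketch, assuming the files are reachable through a local path such as the `/dbfs` mount; `missing_model_files` is a hypothetical helper and not part of ImageAI.

```python
import os

def missing_model_files(model_dir, filenames):
    """Return the names in `filenames` that are absent or empty in `model_dir`.

    Hypothetical helper for illustration; not part of ImageAI.
    """
    missing = []
    for name in filenames:
        path = os.path.join(model_dir, name)
        # Treat zero-byte files as missing: an interrupted download can leave one behind.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```

For example, `missing_model_files("/dbfs/tmp/dennylee/cafe", ["yolo.h5", "resnet50_coco_best_v2.0.1.h5"])` should return an empty list once both downloads have completed.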

In [0]:
## ------------------------------
## Miscellaneous Functions
## ------------------------------
# displayVid(): Shows video from mounted cloud storage
def displayVid(filepath):
  return displayHTML("""
  <video width="480" height="320" controls>
  <source src="/files/%s" type="video/mp4">
  </video>
  """ % filepath)

# displayDbfsVid(): Shows video from DBFS
def displayDbfsVid(filepath):
  return displayHTML("""
  <video width="480" height="320" controls>
  <source src="/dbfs/%s" type="video/mp4">
  </video>
  """ % filepath)

# displayImg(): Shows image from dbfs/cloud storage
def displayImg(filepath):
  dbutils.fs.cp(filepath, "FileStore/%s" % filepath)
  return displayHTML("""
  <img src="/files/%s" width="800">
  """ % filepath)

In [0]:
# Configure TensorFlow for Auto-Growth
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.visible_device_list = "0"
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.5

tf.Session(config=config)
tf.set_random_seed(42) # For reproducibility

## Extracting Coffee Drinks (and other Objects) from Video

Let's perform the same task as above, but this time extract the objects from the following video.
* The code is also written so that it can be wrapped with HorovodRunner and distributed when we are processing multiple videos (or video segments).

<!--
# Can display the video from `FileStore` 
displayHTML("""<video width="600" controls>
  <source src="/dbfs/mnt/tardis6/workspace/cafe/Coffee.mp4" type="video/mp4">
</video>""")
-->

In [0]:
displayHTML("""<video width="600" controls>
  <source src="https://pages.databricks.com/rs/094-YMS-629/images/Coffee3.mp4" type="video/mp4">
</video>
""")

The following code is similar to the previous image extraction except:
* It uses `detectObjectsFromVideo` instead of `detectObjectsFromImage`
* It is structured to be wrapped with HorovodRunner so it can be distributed across multiple nodes (the cells below run it on a single node)
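
When several videos (or segments) need processing, one common pattern is to give each Horovod process its own slice of the input list. The following is a minimal sketch of that sharding step in plain Python; `videos_for_rank` is a hypothetical helper, the file names other than `Coffee3.mp4` are made up for illustration, and in a real run the rank and size values would come from `hvd.rank()` and `hvd.size()` inside the function passed to HorovodRunner.

```python
def videos_for_rank(video_files, rank, size):
    """Return the videos that worker `rank` (0-based) out of `size` workers should process.

    Hypothetical helper for illustration; not part of Horovod or ImageAI.
    """
    if size < 1 or not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    # Sort first so every worker sees the same ordering, then take every size-th file.
    return sorted(video_files)[rank::size]

# Example: three videos split across two workers (file names are illustrative).
files = ["Coffee3.mp4", "Coffee1.mp4", "Coffee2.mp4"]
print(videos_for_rank(files, 0, 2))  # ['Coffee1.mp4', 'Coffee3.mp4']
print(videos_for_rank(files, 1, 2))  # ['Coffee2.mp4']
```

Each worker would then loop over its own slice and call `detectObjectsFromVideo` once per file, writing to a distinct output path so that workers do not overwrite each other's results.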

In [0]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
from keras.callbacks import TensorBoard
import math
import time
import tempfile

# Use TensorFlow Backend
import tensorflow as tf
tf.set_random_seed(42) # For reproducibility

In [0]:
# Check for any available GPUs
K.tensorflow_backend._get_available_gpus()

In [0]:
# Let TensorFlow grow GPU memory as needed
# (under Horovod, the GPU would also be pinned to the local rank here: one GPU per process)
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

# Initialize TensorFlow Session
sess = tf.Session(config=config)
K.set_session(sess)

# Re-import these libraries on the workers (vs. the driver)
from imageai.Detection import VideoObjectDetection
import os

# Configuration of execution paths
root_path = "/dbfs/ml/sbux/cafe"
root_video = "Coffee3.mp4"
execution_path = "/dbfs/ml/sbux/cafe/sources"
execution_path_export = "/dbfs/ml/sbux/cafe/"

# Bring in VideoObjectDetection
detector = VideoObjectDetection()
detector.setModelTypeAsRetinaNet()
detector.setModelPath(os.path.join(root_path , "resnet50_coco_best_v2.0.1.h5"))
detector.loadModel()

# Detect objects in video
video_path = detector.detectObjectsFromVideo(
  input_file_path=os.path.join(execution_path, root_video),
  output_file_path=os.path.join(execution_path_export, "Coffee3_detected"),
  frames_per_second=20, 
  log_progress=True
)

### View Objects Detected from Video

Now that we have executed `detectObjectsFromVideo` via ImageAI using the RetinaNet (ResNet-50) model, let's:
* Review the output stored in the file system
* Review the objects identified in the video
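
One way to review what was detected is to tally labels while the video is being processed. ImageAI's video documentation describes a `per_frame_function` callback for `detectObjectsFromVideo` that receives the frame number, a list of detections, and a per-label count. The sketch below assumes that callback shape (check it against the installed ImageAI version), and `make_frame_tally` is a hypothetical helper.

```python
from collections import Counter

def make_frame_tally():
    """Build a (callback, totals) pair that accumulates per-label counts across frames.

    Hypothetical helper for illustration; not part of ImageAI.
    """
    totals = Counter()

    def on_frame(frame_number, output_array, output_count):
        # output_count is assumed to map label -> number of instances in this frame.
        totals.update(output_count)

    return on_frame, totals

# Simulated frames standing in for what the detector would pass to the callback.
on_frame, totals = make_frame_tally()
on_frame(1, [], {"cup": 2, "person": 1})
on_frame(2, [], {"cup": 1})
print(totals.most_common())  # [('cup', 3), ('person', 1)]
```

If the callback is supported, it could be supplied as `per_frame_function=on_frame` in the `detectObjectsFromVideo` call. Note that the totals count detections per frame, so an object visible in many frames is counted many times.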

<!-- 
# Can view videos directly from the mount
displayHTML("""<video width="600" controls>
  <source src="/dbfs/mnt/tardis6/workspace/cafe/coffee_detected.mp4" type="video/mp4">
</video>""")
-->

In [0]:
%fs ls /ml/sbux/cafe/

| path | name | size (bytes) |
| --- | --- | --- |
| dbfs:/ml/sbux/cafe/Coffee3_detected.avi | Coffee3_detected.avi | 47507782 |
| dbfs:/ml/sbux/cafe/resnet50_coco_best_v2.0.1.h5 | resnet50_coco_best_v2.0.1.h5 | 152661008 |
| dbfs:/ml/sbux/cafe/results.jpg | results.jpg | 195043 |
| dbfs:/ml/sbux/cafe/results.jpg-objects/ | results.jpg-objects/ | 0 |
| dbfs:/ml/sbux/cafe/sources/ | sources/ | 0 |
| dbfs:/ml/sbux/cafe/yolo.h5 | yolo.h5 | 248686624 |


In [0]:
displayHTML("""
<video width="600" controls>
  <source src="https://pages.databricks.com/rs/094-YMS-629/images/Coffee3_detected.mp4" type="video/mp4">
</video>
""")

## Roadmap
We are starting with ImageAI because of its existing pre-trained models. Future plans include:
* Using `keras-yolov3` directly so we can train our own models to be more Starbucks-specific (coffee, baked goods, cups, etc.)
* Scaling up to improve the speed of these models
* Scaling out to handle more images and videos in a *distributed* fashion