# Run object detection on S+ Camera

You can run your code cell by cell with Jupyter Notebook on S+ Camera. To run a cell, click it to give it focus and press Ctrl + Enter. Try it with the cell below; you should see the output 'Welcome to S+ Jupyter Notebook Handson!'

In [None]:
import cv2
import itertools
import json
import logging
import matplotlib.pyplot as plt
import numpy as np
import os
import requests
import time

from IPython.display import display, clear_output
from pylab import *

from surplus_camera_client import SurplusCameraClient

%matplotlib inline
plt.style.use('ggplot')

logger = logging.getLogger()
assert len(logger.handlers) == 1
handler = logger.handlers[0]
handler.setLevel(logging.WARNING)

def download_and_save(url, path):
    total_size = int(requests.head(url).headers.get('content-length'))
    print("Starting download {}, {} bytes.".format(url,total_size))
    r = requests.get(url,stream=True)
    downloaded = 0
    checkpoints = [25,50,75,100]
    checkpoint = checkpoints.pop(0)
    with open(path, 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            downloaded += len(chunk)
            progress = downloaded / total_size * 100
            if checkpoint < progress:
                print("{} bytes of {} bytes downloaded.".format(downloaded,total_size))
                checkpoint = checkpoints.pop(0)
            f.write(chunk)
    print("{} saved. File size: {} bytes.\n\n".format(path, r.headers['content-length']))
    
def crop_and_resize(image, image_crop):
    # Crop `image` to the region described by `image_crop`, a dict with the
    # keys y_from, y_to, x_from and x_to, then scale the result to twice its size.
    image = image[image_crop['y_from']:image_crop['y_to'],
                  image_crop['x_from']:image_crop['x_to'],
                  :]
    image = cv2.resize(image, (int(image.shape[1] * 2), int(image.shape[0] * 2)))
    return image

print("Welcome to S+ Jupyter Notebook Handson!")

## Try capturing an image from the camera

We have loaded and initialized the required modules. Now let's instantiate the camera client and capture an image.

In [None]:
camera_client = SurplusCameraClient.get_client()

frame = camera_client.capture_image_as_nparray()

plt.grid(False)
plt.axis('off')
plt.imshow(frame)
plt.show()
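
`plt.imshow` interprets a three-channel array as RGB, while OpenCV functions conventionally work in BGR order. The notebook does not state which order `capture_image_as_nparray()` returns, so the following is only a sketch for the case where the colours in the preview look swapped. The helper name `swap_red_and_blue` is hypothetical and not part of the camera client.

```python
import numpy as np

def swap_red_and_blue(frame):
    """Reverse the channel axis, turning BGR into RGB (or the reverse)."""
    return frame[..., ::-1]

# A single pure-blue pixel written in BGR order (assumed input order).
bgr_pixel = np.array([[[255, 0, 0]]], dtype=np.uint8)
rgb_pixel = swap_red_and_blue(bgr_pixel)
print(rgb_pixel[0, 0])
```

If the preview already looks right, no conversion is needed.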

Did the camera image show up? Now let's move on to preparing for object detection.

## Download model and weights

First, we download the model (network definition and weights) and define the labels. This hands-on uses a [Caffe](https://caffe.berkeleyvision.org/) model, the pre-trained MobileNet-SSD from [https://github.com/chuanqi305/MobileNet-SSD](https://github.com/chuanqi305/MobileNet-SSD), loaded through OpenCV's `cv2.dnn` module. Note that the weights file (`MobileNetSSD_deploy.caffemodel`) is about 20 MB, so the download takes a while.

In [None]:
#Preparing model
pbtext_path = './MobileNetSSD_deploy.prototxt'
if not os.path.isfile(pbtext_path):
    pbtext_url = 'https://raw.githubusercontent.com/djmv/MobilNet_SSD_opencv/master/MobileNetSSD_deploy.prototxt'
    download_and_save(pbtext_url, pbtext_path)
else:
    print("Skipping download {} because it's already here.".format(pbtext_path))

weights_path = './MobileNetSSD_deploy.caffemodel'
if not os.path.isfile(weights_path):
    weights_url = 'https://raw.githubusercontent.com/djmv/MobilNet_SSD_opencv/master/MobileNetSSD_deploy.caffemodel'
    download_and_save(weights_url, weights_path)
else:
    print("Skipping download {} because it's already here.".format(weights_path))
    
#Labels of network.
classNames = { 0: 'background',
    1: 'aeroplane', 2: 'bicycle', 3: 'bird', 4: 'boat',
    5: 'bottle', 6: 'bus', 7: 'car', 8: 'cat', 9: 'chair',
    10: 'cow', 11: 'diningtable', 12: 'dog', 13: 'horse',
    14: 'motorbike', 15: 'person', 16: 'pottedplant',
    17: 'sheep', 18: 'sofa', 19: 'train', 20: 'tvmonitor' }

#Load the Caffe model 
net = cv2.dnn.readNetFromCaffe(pbtext_path, weights_path)
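
The cell above skips the download whenever the file already exists, so a download interrupted halfway would be reused as if it were complete. As a minimal sketch, the check could also compare the size on disk with an expected size, for example the `content-length` value that `download_and_save` already reads. The helper `needs_download` and its `expected_size` parameter are hypothetical names introduced for illustration.

```python
import os
import tempfile

def needs_download(path, expected_size=None):
    """Return True when `path` is missing, empty, or not `expected_size` bytes long."""
    if not os.path.isfile(path):
        return True
    actual_size = os.path.getsize(path)
    if actual_size == 0:
        return True
    # Only compare sizes when the caller supplied an expected size.
    return expected_size is not None and actual_size != expected_size

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'model.bin')
    print(needs_download(model_path))                    # file is missing
    with open(model_path, 'wb') as f:
        f.write(b'x' * 10)
    print(needs_download(model_path, expected_size=20))  # file is truncated
    print(needs_download(model_path, expected_size=10))  # file is complete
```

A size check does not detect corrupted content; a checksum would be needed for that.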

## Set up the inference method

Next, let's define a function for object detection. Surprisingly, the detection itself takes just the three lines of code below.

```
blob = cv2.dnn.blobFromImage(frame_resized, 0.007843, (300, 300), (127.5,127.5,127.5),False)
net.setInput(blob)
detections = net.forward()
```
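
To make the arguments of `cv2.dnn.blobFromImage` less opaque: the function subtracts the mean (127.5) from every pixel, multiplies the result by the scale factor (0.007843, roughly 1/127.5), and returns a 4-D array in batch, channel, height, width order. The following NumPy sketch reproduces only that arithmetic and layout, without resizing or channel swapping; the helper names are hypothetical and this is not OpenCV's implementation.

```python
import numpy as np

SCALE = 0.007843  # roughly 1 / 127.5
MEAN = 127.5

def normalize_pixels(pixels):
    """Subtract the mean, then multiply by the scale factor."""
    return (np.asarray(pixels, dtype=np.float32) - MEAN) * SCALE

def to_blob_layout(image_hwc):
    """Normalize a height x width x channel image and reorder it to 1 x C x H x W."""
    normalized = normalize_pixels(image_hwc)
    return normalized.transpose(2, 0, 1)[np.newaxis, ...]

# Pixel values 0, 127.5 and 255 land close to -1, 0 and 1.
print(normalize_pixels([0, 127.5, 255]))

dummy_image = np.zeros((300, 300, 3), dtype=np.uint8)
print(to_blob_layout(dummy_image).shape)
```

Mapping pixel values into roughly [-1, 1] matches the input range this pre-trained network expects, which is why these constants are passed along with the 300x300 size.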

The remaining lines do the post-processing: they draw bounding boxes and text labels on top of the image.

In [None]:
threshold = 0.2

def inference(frame):    
    frame_resized = cv2.resize(frame,(300,300))
    blob = cv2.dnn.blobFromImage(frame_resized, 0.007843, (300, 300), (127.5,127.5,127.5),False)
    net.setInput(blob)
    detections = net.forward()

    # Size of the resized frame (300x300)
    cols = frame_resized.shape[1] 
    rows = frame_resized.shape[0]
    
    #Summary
    detection_summary = {}

    # Each row of the detections array has a fixed layout:
    # index 1 is the class id, index 2 is the confidence,
    # and indices 3 to 6 are the box corners, normalized to [0, 1].
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2] #Confidence of prediction 
        if confidence > threshold: # Filter prediction 
            class_id = int(detections[0, 0, i, 1]) # Class label

            # Object location 
            xLeftBottom = int(detections[0, 0, i, 3] * cols) 
            yLeftBottom = int(detections[0, 0, i, 4] * rows)
            xRightTop   = int(detections[0, 0, i, 5] * cols)
            yRightTop   = int(detections[0, 0, i, 6] * rows)

            # Factors for scaling from the 300x300 input back to the original frame
            heightFactor = frame.shape[0]/300.0
            widthFactor = frame.shape[1]/300.0
            # Scale the detected box to the original frame
            xLeftBottom = int(widthFactor * xLeftBottom)
            yLeftBottom = int(heightFactor * yLeftBottom)
            xRightTop   = int(widthFactor * xRightTop)
            yRightTop   = int(heightFactor * yRightTop)
            # Draw the bounding box of the object
            cv2.rectangle(frame, (xLeftBottom, yLeftBottom), (xRightTop, yRightTop),
                          (0, 255, 0))

            # Draw the label and confidence of the prediction on the original frame
            if class_id in classNames:
                label = classNames[class_id] + ": " + str(confidence)
                labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)

                yLeftBottom = max(yLeftBottom, labelSize[1])
                cv2.rectangle(frame, (xLeftBottom, yLeftBottom - labelSize[1]),
                                     (xLeftBottom + labelSize[0], yLeftBottom + baseLine),
                                     (255, 255, 255), cv2.FILLED)
                cv2.putText(frame, label, (xLeftBottom, yLeftBottom),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0))

                #print(label) #print class and confidence
                
                if(classNames[class_id] in detection_summary ):
                    detection_summary[classNames[class_id]]+=1
                else:
                    detection_summary[classNames[class_id]]=1
    return frame, detection_summary
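
The indexing in `inference` relies on the layout of the array returned by `net.forward()`: its shape is (1, 1, N, 7), and each of the N rows holds an image id, a class id, a confidence, and four box coordinates normalized to [0, 1]. The sketch below applies the same filtering and scaling to a hand-made array so it runs without a model. `extract_boxes` and `fake_detections` are hypothetical names, and the values are made up for illustration.

```python
import numpy as np

def extract_boxes(detections, frame_width, frame_height, threshold=0.2):
    """Return (class_id, confidence, x_min, y_min, x_max, y_max) per kept detection."""
    boxes = []
    for row in detections[0, 0]:
        _image_id, class_id, confidence, x_min, y_min, x_max, y_max = row
        if confidence <= threshold:
            continue  # drop low-confidence predictions
        boxes.append((
            int(class_id),
            float(confidence),
            int(x_min * frame_width),
            int(y_min * frame_height),
            int(x_max * frame_width),
            int(y_max * frame_height),
        ))
    return boxes

# Two synthetic rows: a confident class 15 and a weak class 7.
fake_detections = np.array([[[
    [0, 15, 0.9, 0.25, 0.5, 0.75, 1.0],
    [0, 7, 0.1, 0.0, 0.0, 0.5, 0.5],
]]], dtype=np.float32)

boxes = extract_boxes(fake_detections, frame_width=640, frame_height=480)
print(boxes)
```

In image coordinates the first corner is the top-left of the box and the second is the bottom-right, despite the `xLeftBottom` and `xRightTop` variable names used in the cell above.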

## Try inference!

It's all set! Let's give it a try.

In [None]:
frame = camera_client.capture_image_as_nparray()
frame,detection_summary = inference(frame)

# Clear the previous output first so the summary printed below stays visible.
clear_output(wait=True)
print(detection_summary)

plt.grid(False)
plt.axis('off')
plt.imshow(frame)
plt.show()

Did it work?

Here is an example of a main loop that does the following:
1. Capture an image
2. Run object detection
3. Show the detection summary and send it to SORACOM Harvest
4. If `person` is among the detections, also send the image itself to SORACOM Harvest Files

You can stop it at any time by pressing the stop button above.

In [None]:
for i in range(0,10):
    
    frame = camera_client.capture_image_as_nparray()
    frame,detection_summary = inference(frame)
    print("Attempt #: {}".format(str(i+1)))

    if(len(detection_summary) > 0):
    
        print("""Detection summary: {}
Sent detection summary to SORACOM Harvest: {}""".format(
                json.dumps(detection_summary),  
                requests.post('http://harvest.soracom.io',json=detection_summary)
            )
        )

        if 'person' in detection_summary:
            result, jpeg = cv2.imencode('.jpg', frame)
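            upload_path = '/surplusdemo/detection.jpg'
            # Append the destination path to the endpoint URL so that the
            # file is stored at upload_path.
            sent_image = requests.post(
                'http://harvest-files.soracom.io' + upload_path,
                headers={'Content-Type':'image/jpeg'},
                data=jpeg.tobytes()
            )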
            print("Sent image with person detected to {}: {}".format(
                upload_path,
                sent_image
            ))
    else:
        print("Nothing detected.")

    plt.grid(False)
    plt.axis('off')    
    plt.imshow(frame)
    plt.show()
    print("\n")
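
The reporting part of the loop (steps 3 and 4) can also be pulled into a small function, which makes it possible to try the logic without a camera or network access. This is only a sketch under stated assumptions: `report_detections` is a hypothetical helper, `post` stands in for `requests.post`, and appending the upload path to the Harvest Files URL is an assumption about how the endpoint maps URLs to file paths.

```python
HARVEST_URL = 'http://harvest.soracom.io'
HARVEST_FILES_URL = 'http://harvest-files.soracom.io'

def report_detections(detection_summary, jpeg_bytes, post,
                      upload_path='/surplusdemo/detection.jpg'):
    """Send the summary, plus the image when a person was detected.

    `post` is any callable that accepts the same arguments as requests.post.
    Returns the list of URLs that were posted to.
    """
    posted = []
    if not detection_summary:
        return posted  # nothing detected, nothing to send
    post(HARVEST_URL, json=detection_summary)
    posted.append(HARVEST_URL)
    if 'person' in detection_summary:
        # Assumption: the destination path is appended to the endpoint URL.
        files_url = HARVEST_FILES_URL + upload_path
        post(files_url, headers={'Content-Type': 'image/jpeg'}, data=jpeg_bytes)
        posted.append(files_url)
    return posted

# A stand-in for requests.post that only records what would be sent.
calls = []
def fake_post(url, **kwargs):
    calls.append((url, sorted(kwargs)))

print(report_detections({'person': 1, 'chair': 2}, b'\xff\xd8', fake_post))
print(report_detections({}, b'', fake_post))
```

In the notebook, passing `requests.post` as `post` would send the real requests.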
        
    
    

## Congratulations!

You have walked through the S+ Camera basic demo. You can start your own project using this notebook as boilerplate :)