# Object Following (50 pts)

In this notebook we'll show how you can follow an object with JetBot!  We'll use a pre-trained neural network
that was trained on the [COCO dataset](http://cocodataset.org) to detect 90 different common objects.  These include

* Person (index 1)
* Cup (index 47)

and many others (see [this file](https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_complete_label_map.pbtxt) for the full list of class indices).  The model is sourced from the [TensorFlow object detection API](https://github.com/tensorflow/models/tree/master/research/object_detection),
which also provides utilities for training object detectors for custom tasks.  Once the model is trained, we optimize it using NVIDIA TensorRT on the Jetson Nano GPU.

This makes the network very fast, capable of real-time execution on Jetson Nano!  We won't run through all of the training and optimization steps in this notebook though. The goal here is to demonstrate what one can do with neural networks. In the final project, you will get a chance to train neural networks for obstacle avoidance and navigation. 

## Important Tips ##

- For this assignment, you will want to use the long extension cable that was recently shipped to you. This cable is 10 feet long and should give the JetBot enough room to move around.
- It will be helpful to work in a well-lit environment if possible. This will make the object tracking easier for the JetBot.
- Be careful with the JetBot! If possible, it would be helpful to find an area to work in that allows the JetBot to roam around freely. The JetBot moves quite fast and we recommend avoiding collisions. Given that we only have a short amount of time left in the semester we likely won’t be able to ship replacement parts, so please be as careful as possible. 
- You may want to angle the camera of the JetBot up a little bit so it can see the object you are following more easily (instead of being pointed in a way that only allows the camera to see the ground).

Let's get started!  First, we'll import the ``ObjectDetector`` class. Before running that cell, download the pre-trained object detection network onto your laptop/desktop and then upload it to the JetBot. You can download the pre-trained neural network from here:

https://drive.google.com/file/d/1KjlDMRD8uhgQmQK-nC2CZGHFTbq4qQQH/view

Once you have downloaded the file, please upload it to the JetBot (you can just drag and drop the file).


### Compute detections on single camera image

In [None]:
from jetbot import ObjectDetector

model = ObjectDetector('ssd_mobilenet_v2_coco.engine')

In case you are interested in some details (feel free to ignore if not): internally, the ``ObjectDetector`` class uses the TensorRT Python API to execute the engine that we provide.  It also takes care of preprocessing the input to the neural network, as
well as parsing the detected objects.  Right now it will only work for engines created using the ``jetbot.ssd_tensorrt`` package. That package has the utilities for converting
the model from the TensorFlow object detection API to an optimized TensorRT engine.

Next, let's initialize our camera.  Our detector takes 300x300 pixel input, so we'll set this when creating the camera.

> Internally, the Camera class uses GStreamer to take advantage of Jetson Nano's Image Signal Processor (ISP).  This is super fast and offloads
> a lot of the resizing computation from the CPU. 

In [None]:
from jetbot import Camera

camera = Camera.instance(width=300, height=300)

Now, let's execute our network using some camera input.  By default, the ``ObjectDetector`` class expects the ``bgr8`` format that the camera produces.  However,
you could override the default pre-processing function if your input is in a different format.

In [None]:
detections = model(camera.value)

print(detections)

If there are any COCO objects in the camera's field of view, they should now be stored in the ``detections`` variable.

### Display detections in text area

We'll use the code below to display the detected objects in a text area.

In [None]:
from IPython.display import display
import ipywidgets.widgets as widgets

detections_widget = widgets.Textarea()

detections_widget.value = str(detections)

display(detections_widget)

You should see the label, confidence, and bounding box of each object detected in each image.  There's only one image (our camera) in this example. 
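
As a rough illustration of the structure described above, the sketch below builds a hand-written stand-in for the detector output: an outer list with one entry per image, each holding a list of per-object dictionaries. The ``'label'`` and ``'bbox'`` keys are used elsewhere in this notebook; the ``'confidence'`` key name, the numeric values, and the helper names ``first_detection`` and ``describe`` are assumptions made for this example only.

```python
# Mock data only: a hand-written stand-in for what model(camera.value) might return.
# The 'confidence' key name and all values here are assumptions for illustration.
mock_detections = [  # outer list: one entry per image
    [                # inner list: one dict per detected object
        {'label': 1, 'confidence': 0.91, 'bbox': [0.10, 0.20, 0.45, 0.95]},
        {'label': 47, 'confidence': 0.62, 'bbox': [0.60, 0.55, 0.75, 0.80]},
    ]
]

def first_detection(detections, image_number=0):
    """Return the first detection for an image, or None if nothing was detected."""
    objects = detections[image_number]
    return objects[0] if objects else None

def describe(det):
    """Format one detection dictionary as a short human-readable string."""
    return "label={} confidence={:.2f} bbox={}".format(
        det['label'], det['confidence'], det['bbox'])

# Print every object found in the first (and only) image
for det in mock_detections[0]:
    print(describe(det))

# An empty inner list means nothing was detected; the helper returns None
print(first_detection([[]]))
```

Checking whether the inner list is empty before indexing into it is one way to avoid an error when the camera sees no COCO objects.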


To print just the first object detected in the first image, we could run the following:

> This may throw an error if no objects are detected.

In [None]:
image_number = 0
object_number = 0

print(detections[image_number][object_number])

### Control robot to follow central object

Now we want our robot to follow an object of a specified category (e.g., person).  To do this, we'll do the following:

1.  Detect objects matching the specified class
2.  Select object closest to center of camera's field of vision; this is the 'target' object
3.  Steer robot towards target object; otherwise stop

We'll also define the robot's speed and a simple turning controller that sets how fast the robot turns based on the distance between the target object and the center of the robot's field of view.

First, let's define some helper functions that will process the images from the JetBot.

### Task 1 (10 pts) ###

Fill in the function ``closest_detection`` below. It should return the detected object that is closest to the center of the image.
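
To make the coordinate convention concrete before you start, here is a small worked example. It repeats the two helper functions from the cell below (using ``math`` instead of NumPy so it runs on its own) and applies them to two made-up bounding boxes. It assumes, based on how the boxes are drawn later in this notebook, that ``bbox`` is ordered ``[x_min, y_min, x_max, y_max]`` with each value normalized to the range 0 to 1. It does not implement the selection step, which is left for you.

```python
import math

def detection_center(detection):
    """Center of the bounding box, shifted so that (0, 0) is the image center."""
    bbox = detection['bbox']  # assumed order: [x_min, y_min, x_max, y_max], each in [0, 1]
    center_x = (bbox[0] + bbox[2]) / 2.0 - 0.5
    center_y = (bbox[1] + bbox[3]) / 2.0 - 0.5
    return (center_x, center_y)

def norm(vec):
    """Length of a 2D vector."""
    return math.sqrt(vec[0]**2 + vec[1]**2)

# Two made-up detections: one centered in the image, one shifted to the right
centered = {'bbox': [0.25, 0.25, 0.75, 0.75]}
off_right = {'bbox': [0.5, 0.25, 1.0, 0.75]}

print(detection_center(centered))         # (0.0, 0.0)
print(detection_center(off_right))        # (0.25, 0.0)
print(norm(detection_center(off_right)))  # 0.25
```

A positive x value means the object is to the right of the image center, and a smaller norm means the object is closer to the center.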

In [None]:
import torch
import torchvision
import torch.nn.functional as F
import cv2
import numpy as np

device = torch.device('cuda') # This makes the computations happen on the JetBot's GPU

mean = 255.0 * np.array([0.485, 0.456, 0.406])
stdev = 255.0 * np.array([0.229, 0.224, 0.225])

normalize = torchvision.transforms.Normalize(mean, stdev)

# Widget for displaying camera feed
image_widget = widgets.Image(format='jpeg', width=300, height=300)

# Define function for doing some basic pre-processing on images
def preprocess(camera_value):
    global device, normalize
    x = camera_value
    x = cv2.resize(x, (224, 224))
    x = cv2.cvtColor(x, cv2.COLOR_BGR2RGB)
    x = x.transpose((2, 0, 1))
    x = torch.from_numpy(x).float()
    x = normalize(x)
    x = x.to(device)
    x = x[None, ...]
    return x

from jetbot import bgr8_to_jpeg
import time

width = int(image_widget.width)
height = int(image_widget.height)

def detection_center(detection):
    """Computes the center x, y coordinates of the object"""
    bbox = detection['bbox']
    center_x = (bbox[0] + bbox[2]) / 2.0 - 0.5
    center_y = (bbox[1] + bbox[3]) / 2.0 - 0.5
    return (center_x, center_y)
    
def norm(vec):
    """Computes the length of the 2D vector"""
    return np.sqrt(vec[0]**2 + vec[1]**2)

def closest_detection(detections):
    """TODO: Find the detection closest to the image center"""
    closest_detection = None
    for det in detections:
        # Loop through and find the detection that is closest to the image center
        # You can use the detection_center function above to find the center of the detected object
        # Note that the origin (i.e., (x,y) = (0,0)) corresponds to the center of the image. So you can
        # use the "norm" function above to find the detection that is closest to the center.
        # Return the det that corresponds to the closest detection to the image center.
        # If nothing is detected, return None.
        pass  # placeholder so the cell runs as given; replace with your implementation
    return closest_detection
        

Great, now let's initialize our robot so we can control the motors.

In [None]:
from jetbot import Robot

robot = Robot()

Finally, let's display widgets for the camera feed, the label of the object that we want to track (e.g., person = 1), and the speed. The block of code below is also responsible for making the JetBot follow the object.

### Task 2 (20 pts) ###

Fill in the block of code below that says "TODO" to make the JetBot follow the object. 

In [None]:
### You don't have to modify this block. But please read and understand ###

# Widgets for displaying label of object to be tracked, and the speed
label_widget = widgets.IntText(value=1, description='tracked label')
speed_widget = widgets.FloatSlider(value=0.25, min=0.0, max=1.0, description='speed')

# Display widgets
display(widgets.VBox([
    widgets.HBox([image_widget]),
    label_widget,
    speed_widget
]))

# Define main function that takes in an image and provides appropriate control inputs to the JetBot
def execute(change):
    image = change['new']
        
    # compute all detected objects
    detections = model(image)
    
    # draw all detections on image
    for det in detections[0]:
        # Coordinates of bounding box of object
        bbox = det['bbox'] 
        # Draw bounding box
        cv2.rectangle(image, (int(width * bbox[0]), int(height * bbox[1])), (int(width * bbox[2]), int(height * bbox[3])), (255, 0, 0), 2)
    
    # select detections that match selected class label
    matching_detections = [d for d in detections[0] if d['label'] == int(label_widget.value)]
    
    # get detection closest to center of field of view and draw it
    det = closest_detection(matching_detections) # This relies on the function you wrote above
    if det is not None:
        bbox = det['bbox']
        cv2.rectangle(image, (int(width * bbox[0]), int(height * bbox[1])), (int(width * bbox[2]), int(height * bbox[3])), (0, 255, 0), 5)
    
        
    # If nothing is detected, stop
    if det is None:
        robot.forward(float(0.0))
        
    ###################################
        
    ############ TODO: Write code that will make the robot follow the object ############
    
    # otherwise steer towards target
    else:
        # move robot forward and steer proportional to target's x-distance from center
        center = detection_center(det)
        commanded_speed = speed_widget.value
        
        # Fill in your controller here. 
        # You can use the robot.set_motors() function to set the motor speeds (see the basic_motion 
        # notebook in /Notebooks/basic_motion)
        pass  # placeholder so the cell runs as given; replace with your controller


        
    ########################################
    
    # update image widget
    image_widget.value = bgr8_to_jpeg(image)
    
execute({'new': camera.value})

camera.unobserve_all()
camera.observe(execute, names='value')

Awesome!  You should see blue boxes drawn around the detected objects.  The target object (which the robot follows) will be displayed in green.

The robot should steer towards the target when it is detected.  
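
If the box colors or positions look unexpected, it may help to see how the drawing code maps a normalized bounding box to pixel coordinates. The sketch below mirrors the arithmetic used in the ``cv2.rectangle`` calls above without needing OpenCV. The helper name ``bbox_to_pixels`` and the sample box are made up for illustration. Note that OpenCV colors are given in BGR order, which is why ``(255, 0, 0)`` appears blue.

```python
def bbox_to_pixels(bbox, width, height):
    """Convert a normalized [x_min, y_min, x_max, y_max] box to two pixel corners."""
    x1, y1, x2, y2 = bbox
    top_left = (int(width * x1), int(height * y1))
    bottom_right = (int(width * x2), int(height * y2))
    return top_left, bottom_right

# OpenCV uses blue-green-red channel order
BLUE_BGR = (255, 0, 0)   # color used for all detections
GREEN_BGR = (0, 255, 0)  # color used for the target detection

# Made-up box covering the lower middle of a 300x300 image
print(bbox_to_pixels([0.25, 0.5, 0.75, 1.0], 300, 300))  # ((75, 150), (225, 300))
```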

You can call the code block below to manually disconnect the processing from the camera and stop the robot.

In [None]:
import time

camera.unobserve_all()
time.sleep(1.0)
robot.stop()

# Submission #

Please submit a zip file to Gradescope under "HW9: Coding" that includes this notebook, Lab9 (30 pts), two videos (20 pts, see below), and the Lab10 notebook (50 pts).

For videos, please submit the following:
- (10 pts) A video (e.g., taken from your cellphone) showing the JetBot following you (or any other person). 
- (10 pts) A video showing the JetBot moving towards a different object (i.e., not a person). For this, you will have to change the "tracked label" in the code above to correspond to the object you want the JetBot to follow/move towards. You are welcome to choose any object that is convenient for you. For example, you can place a chair (or whatever object you choose) in front of the JetBot and demonstrate that your code makes the JetBot move towards that object. 