# Object Following Tutorial

<font color='red'>
Note for the next batch of interns: this tutorial is aimed at new users and event time is limited, so let users edit only a small part of the code. You can hand over pseudocode or the basic idea for object following and let users modify it within a set time frame. The part you might give users to experiment with is marked in the code below.</font> 
<a href="https://youtu.be/72aoHh-E9EE">Sample Video Link</a>


This tutorial walks through the steps for object following with the JetBot. It uses a pre-trained neural network
that was trained on the [COCO dataset](http://cocodataset.org) to detect 90 different common objects.  These include

* Person (index 1)
* Cup (index 47)

and many others (see [this file](https://github.com/tensorflow/models/blob/master/research/object_detection/data/mscoco_complete_label_map.pbtxt) for the full list of class indices).  Objects that work particularly well with the JetBot are person (index 1), cup (index 47), and bottle (index 44). The model is sourced from the [TensorFlow object detection API](https://github.com/tensorflow/models/tree/master/research/object_detection),
which also provides utilities for training object detectors for custom tasks.  Once the model is trained, we optimize it using NVIDIA TensorRT on the Jetson Nano.

This optimization makes the network fast enough for real-time execution on the Jetson Nano.  We won't run through all of the training and optimization steps in this notebook, though.

Let's get started.  First, we import the ``ObjectDetector`` class, which loads our pre-trained SSD engine.

In [1]:
from jetbot import ObjectDetector

## this will take a while to load
model = ObjectDetector('ssd_mobilenet_v2_coco.engine')

Internally, the ``ObjectDetector`` class uses the TensorRT Python API to execute the engine that we provide.  It also takes care of preprocessing the input to the neural network, as
well as parsing the detected objects.  Right now it will only work for engines created using the ``jetbot.ssd_tensorrt`` package. That package has the utilities for converting
the model from the TensorFlow object detection API to an optimized TensorRT engine.
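
As a rough sketch of what the parsed detections look like: the code later in this notebook indexes the result as `detections[0]` (the first image) and reads `det['label']` and `det['bbox']`, where `bbox` holds normalized `(x1, y1, x2, y2)` corners. The sample data and the helper names `filter_by_label` and `bbox_to_pixels` below are made up for illustration, and no robot is needed to run it.

```python
# Hypothetical sample shaped like the detector output used later in this
# notebook: one list per input image, one dict per detected object.
sample_detections = [[
    {'label': 47, 'bbox': [0.10, 0.20, 0.30, 0.60]},  # cup
    {'label': 44, 'bbox': [0.55, 0.25, 0.70, 0.80]},  # bottle
    {'label': 47, 'bbox': [0.40, 0.40, 0.60, 0.70]},  # another cup
]]

def filter_by_label(detections, label):
    """Return the detections of the first image that match `label`."""
    return [d for d in detections[0] if d['label'] == int(label)]

def bbox_to_pixels(bbox, width=300, height=300):
    """Scale normalized (x1, y1, x2, y2) corners to integer pixel coordinates."""
    x1, y1, x2, y2 = bbox
    return (int(width * x1), int(height * y1), int(width * x2), int(height * y2))

cups = filter_by_label(sample_detections, 47)
for det in cups:
    print(det['label'], bbox_to_pixels(det['bbox']))
```

The same two operations (filtering by label and scaling the box to pixels) appear inside the `execute` function further down.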

Next, let's initialize our camera and widgets.  Our detector takes 300x300 pixel input, so we'll set this when creating the camera.

> Internally, the Camera class uses GStreamer to take advantage of Jetson Nano's Image Signal Processor (ISP).  This is super fast and offloads
> a lot of the resizing computation from the CPU. 

In [2]:
from jetbot import Camera
from IPython.display import display
import ipywidgets.widgets as widgets

camera = Camera.instance(width=300, height=300)

### Control robot to follow central object

Now we want our robot to follow an object of the specified class.  To do this we will:

1.  Detect objects matching the specified class
2.  Select the object closest to the center of the camera's field of view; this is the 'target' object
3.  Steer the robot towards the target object, or stop if no target is detected
4.  Optionally, turn left if we're blocked by an obstacle (this collision avoidance step is commented out in the code below)

We'll also create some widgets to control the target object label, the robot speed, and
a "turn gain" that sets how fast the robot turns based on the distance between the target object
and the center of the robot's field of view. 
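
To make the role of the turn gain concrete, here is a minimal sketch of proportional steering written as a pure function, so it can be tried without a robot. The name `steering_command` and the clamping to the range [-1, 1] are assumptions made for this illustration and are not part of the JetBot API. `x_offset` is the target's horizontal offset from the image center, in the range [-0.5, 0.5].

```python
def clamp(value, low=-1.0, high=1.0):
    """Limit a motor value to the assumed valid range [low, high]."""
    return max(low, min(high, value))

def steering_command(x_offset, speed, turn_gain):
    """Map a horizontal offset to (left, right) motor values.

    A positive x_offset (target right of center) speeds up the left motor
    and slows the right one, which turns the robot to the right.
    """
    left = speed + turn_gain * x_offset
    right = speed - turn_gain * x_offset
    return clamp(left), clamp(right)

# Target straight ahead: both motors get the same value.
print(steering_command(0.0, speed=0.2, turn_gain=0.25))

# Target far to the right: the left motor runs faster than the right one.
print(steering_command(0.4, speed=0.2, turn_gain=0.25))
```

A larger turn gain makes the difference between the two motors grow faster with the offset, so the robot turns more sharply but may overshoot the target.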


First, let's set up our collision detection model.  Loading the trained weights is optional and commented out in the cell below; if you followed
the collision avoidance example, you can use that model, which may be better tuned for your robot's environment.

In [3]:
import torch
import torchvision
import torch.nn.functional as F
import cv2
import numpy as np

collision_model = torchvision.models.alexnet(pretrained=False)


#################### Loading of the collision avoidance model ######################################
### The two commented lines below load the trained collision avoidance weights.
### Running this model uses additional compute resources, so we do not recommend it for this tutorial.
### Uncomment them to try object following together with collision avoidance.

# collision_model.classifier[6] = torch.nn.Linear(collision_model.classifier[6].in_features, 2)
# collision_model.load_state_dict(torch.load('../collision_avoidance/best_model_alex.pth'))

#################### End of loading the collision avoidance model ##################################

device = torch.device('cuda')
collision_model = collision_model.to(device)

mean = 255.0 * np.array([0.485, 0.456, 0.406])
stdev = 255.0 * np.array([0.229, 0.224, 0.225])

normalize = torchvision.transforms.Normalize(mean, stdev)

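def preprocess(camera_value):
    """Converts a BGR camera frame into a normalized 1x3x224x224 tensor on `device`"""
    x = camera_value
    x = cv2.resize(x, (224, 224))            # AlexNet input size
    x = cv2.cvtColor(x, cv2.COLOR_BGR2RGB)   # camera frames are BGR
    x = x.transpose((2, 0, 1))               # HWC -> CHW
    x = torch.from_numpy(x).float()
    x = normalize(x)
    x = x.to(device)
    x = x[None, ...]                         # add the batch dimension
    return x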
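
The `mean` and `stdev` values above are the usual ImageNet statistics multiplied by 255, because the camera delivers pixel values in the 0 to 255 range rather than 0 to 1. The following sketch, in plain Python with a single made-up pixel, checks that normalizing with the scaled statistics matches scaling the pixel to 0..1 first. The function names are hypothetical and exist only for this comparison.

```python
import math

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_0_255(pixel):
    """Normalize an RGB pixel in 0..255 using statistics scaled by 255."""
    return [(p - 255.0 * m) / (255.0 * s)
            for p, m, s in zip(pixel, IMAGENET_MEAN, IMAGENET_STD)]

def normalize_0_1(pixel):
    """Scale an RGB pixel to 0..1 first, then use the unscaled statistics."""
    return [(p / 255.0 - m) / s
            for p, m, s in zip(pixel, IMAGENET_MEAN, IMAGENET_STD)]

pixel = [200, 120, 30]  # hypothetical RGB pixel
scaled_stats = normalize_0_255(pixel)
scaled_pixel = normalize_0_1(pixel)

# Both approaches agree up to floating point rounding.
same = all(math.isclose(a, b, rel_tol=1e-9)
           for a, b in zip(scaled_stats, scaled_pixel))
print(same)
```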

Great, now let's initialize our robot so we can control the motors.

In [4]:
from jetbot import Robot

robot = Robot()

Finally, let's display all the control widgets and connect the network execution function to the camera updates.

In [5]:
from jetbot import bgr8_to_jpeg

blocked_widget = widgets.FloatSlider(min=0.0, max=1.0, value=0.0, description='blocked')
image_widget = widgets.Image(format='jpeg', width=300, height=300)
label_widget = widgets.IntText(value=47, description='tracked label')
speed_widget = widgets.FloatSlider(value=0.2, min=0.0, max=1.0, description='speed')
turn_gain_widget = widgets.FloatSlider(value=0.25, min=0.0, max=2.0, description='turn gain')

display(widgets.VBox([
    widgets.HBox([image_widget, blocked_widget]),
    label_widget,
    speed_widget,
    turn_gain_widget
]))

width = int(image_widget.width)
height = int(image_widget.height)

def detection_center(detection):
    """Computes the center x, y coordinates of the object"""
    bbox = detection['bbox']
    center_x = (bbox[0] + bbox[2]) / 2.0 - 0.5
    center_y = (bbox[1] + bbox[3]) / 2.0 - 0.5
    return (center_x, center_y)
    
def norm(vec):
    """Computes the length of the 2D vector"""
    return np.sqrt(vec[0]**2 + vec[1]**2)

def closest_detection(detections):
    """Finds the detection closest to the image center"""
    closest = None
    for det in detections:
        if closest is None or norm(detection_center(det)) < norm(detection_center(closest)):
            closest = det
    return closest
        
def execute(change):
    image = change['new']
    
    # execute collision model to determine if blocked
    collision_output = collision_model(preprocess(image)).detach().cpu()
    prob_blocked = float(F.softmax(collision_output.flatten(), dim=0)[0])
    blocked_widget.value = prob_blocked
    
    ##### Collision avoidance did not work very well here and is not the main objective of this tutorial, so it is commented out. #####
    # Consider using the model from the collision avoidance example, if necessary.
    # # turn left if blocked
    # if prob_blocked > 0.95:
    #     robot.left(0.1)
    #     image_widget.value = bgr8_to_jpeg(image)
    #     return
    ######################################################################################################################################
        
    # compute all detected objects
    detections = model(image)
    
    # draw all detections on image
    for det in detections[0]:
        bbox = det['bbox']
        cv2.rectangle(image, (int(width * bbox[0]), int(height * bbox[1])), (int(width * bbox[2]), int(height * bbox[3])), (255, 0, 0), 2)
    
    # select detections that match selected class label
    matching_detections = [d for d in detections[0] if d['label'] == int(label_widget.value)]
    
    # get detection closest to center of field of view and draw it
    det = closest_detection(matching_detections)
    if det is not None:
        bbox = det['bbox']
        cv2.rectangle(image, (int(width * bbox[0]), int(height * bbox[1])), (int(width * bbox[2]), int(height * bbox[3])), (0, 255, 0), 5)
    
    ########################################## start of user modification ##################################################
    #### Users can work with the following logic and adapt it to implement object following in other ways
    #### (preferably starting from pseudocode that is handed out)
    # Do not move until the target object is detected
    if det is None:
        robot.set_motors(0, 0)

    # Otherwise steer towards the target
    else:
        # drive forward and steer in proportion to the target's x and y distance from the image center
        center = detection_center(det)
        if abs(center[1]) < 0.1:
            # target is vertically centered: no forward speed, only turn towards it
            speed = 0
        else:
            speed = abs(speed_widget.value * 3 * center[1])
        # send the command in both cases so a stale motor value is never left running
        robot.set_motors(
            float(speed + turn_gain_widget.value * center[0]),
            float(speed - turn_gain_widget.value * center[0])
        )
    ############################################# end of user modification ########################################
    
    # update image widget
    image_widget.value = bgr8_to_jpeg(image)
    
execute({'new': camera.value})


Run the cell below to connect the execute function to each camera frame update.

In [6]:
camera.unobserve_all()
camera.observe(execute, names='value')
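
The `execute` function receives a `change` dictionary whose `'new'` entry holds the latest frame, which is why it could also be called by hand with `{'new': camera.value}`. The sketch below imitates that callback pattern with a small stand-in class. `FakeCamera` and `on_frame` are hypothetical names that only mimic the `observe` and `unobserve_all` calls used in this notebook; this is not the JetBot `Camera` implementation.

```python
class FakeCamera:
    """Hypothetical stand-in that notifies observers when `value` changes."""

    def __init__(self):
        self._value = None
        self._observers = []

    def observe(self, handler, names='value'):
        # Register a callback that runs on every new value.
        self._observers.append(handler)

    def unobserve_all(self):
        # Drop all registered callbacks.
        self._observers.clear()

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        old, self._value = self._value, new
        for handler in list(self._observers):
            handler({'name': 'value', 'old': old, 'new': new})

seen = []

def on_frame(change):
    # Same access pattern as the notebook's execute(): read the new frame.
    seen.append(change['new'])

fake_camera = FakeCamera()
fake_camera.observe(on_frame, names='value')
fake_camera.value = 'frame-1'   # triggers on_frame
fake_camera.unobserve_all()
fake_camera.value = 'frame-2'   # no observers left, nothing is recorded
print(seen)
```

Calling `unobserve_all()` before `observe(...)`, as the cell above does, avoids attaching the same callback twice when the cell is re-run.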

Awesome!  You should see blue boxes drawn around the detected objects.  The target object (which the robot follows) is drawn in green.

The robot should steer towards the target when it is detected and stay still otherwise.  It only turns left when blocked if you enable the commented-out collision avoidance code in `execute`.

Run the cell below to manually disconnect the processing from the camera and stop the robot.

In [7]:
# If you are about to start a new kernel that needs the camera, it is important to run these lines first
import time
camera.unobserve_all()
time.sleep(1.0)
robot.stop()

The following step stops the camera before shutdown. It is useful if the camera often hangs.

In [8]:
camera.stop()