In [1]:
%load_ext autoreload
%autoreload 2
from user_study_helpers import *

# Load Dev Set and primitive data

Expand a video by clicking on it; then use `;` to play the video.

In [67]:
bounding_boxes = get_bboxes('dev')
parking_space_gt = get_gt()
visualize_boxes([bounding_boxes.filter(lambda i: i['payload']['class'] == 'car'), parking_space_gt])

100%|██████████| 2/2 [00:00<00:00, 14.01it/s]
100%|██████████| 2/2 [00:00<00:00, 475.65it/s]


VGridWidget(vgrid_spec={'compressed': True, 'data': b'x\x9c\xc4\xbd\xdb\xaeeKV\xb5\xf7*\xbf\xea\xdaFq>\xf8\xd2…

# Task: detect all empty parking spaces

Your goal is to write a Rekall program to detect all empty parking spaces (visualized in the second timeline above).

You're given a Rekall `IntervalSetMapping` object, `bounding_boxes`, that contains detections from Mask R-CNN. The Intervals contain 3D bounds, and the payloads contain the class and the class score:

In [33]:
bounding_boxes[0].get_intervals()[0]

<Interval t1:0.0 t2:30.0 x1:0.0 x2:0.08424050211906434 y1:0.5207680172390408 y2:0.6528446621365017 payload:{'class': 'car', 'score': 0.9638893008232117}>

The bounding boxes are sampled every thirty seconds (which is why the Interval above has time bounds of 0 to 30), and so are the ground truth annotations.

Your goal is to fill in the following function:

```Python
def detect_empty_parking_spaces(bounding_boxes, reference_video):
    """
    Function to detect empty parking spaces.
    
    bounding_boxes is a Rekall IntervalSetMapping object whose intervals are
    bounding box outputs from Mask R-CNN.
    
    reference_video is the video ID of the first video. We guarantee that at
    time 0 in this video, all the car detections are parking spots.
    
    This function needs to return a Rekall IntervalSetMapping object whose
    Intervals are empty parking spots in the video.
    
    The output Intervals need to have time extent (0, 30), (30, 60), etc.
    Each Interval should have the spatial extent of a single empty parking
    spot.
    """
    
    # Your code here
```

This function takes in bounding box outputs from Mask R-CNN and detects empty parking spaces. We also pass in the video ID of the first video as `reference_video`; we guarantee that at time zero of this video, every car detection is a parked car.

This function will be evaluated against an unseen test set at the end of the user study. We'll be using the Average Precision metric.
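For intuition about the metric, here is a minimal sketch of one common ranked-list definition of average precision. The notebook does not show how `compute_ap` matches predictions to ground truth or how it orders them, so this is an illustration only and may differ from the helper; the names `average_precision`, `ranked_hits`, and `num_ground_truth` are hypothetical.

```python
def average_precision(ranked_hits, num_ground_truth):
    """Generic average precision over a ranked list of predictions.

    ranked_hits: list of bools in ranking order; True means the prediction
        at that rank was matched to a ground truth item.
    num_ground_truth: total number of ground truth items.

    NOTE: hypothetical sketch, not the implementation behind compute_ap.
    """
    if num_ground_truth == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            # Precision measured at the rank of each correct prediction.
            precision_sum += hits / rank
    return precision_sum / num_ground_truth


# Three predictions, the first and third correct, two ground truth items:
# (1/1 + 2/3) / 2
print(average_precision([True, False, True], 2))
```

Under this definition, a false positive ranked early lowers the score more than one ranked late, and missed ground truth items lower it through the denominator.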

We provide some helper functions to evaluate your programs on the dev set below!

This task is inspired by [this Medium blog post](https://medium.com/@ageitgey/snagging-parking-spaces-with-mask-r-cnn-and-python-955f2231c400):
* They use an off-the-shelf object detector to detect cars (like what you have in `bounding_boxes`)
* They take a timestamp where all the parking spots are full, and use car detections to get parking spots (like time 0 of `reference_video`)
* Then empty parking spots are just parking spots where there are no cars
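The three steps above can be sketched in plain Python, without Rekall, for a single point in time. This is only an illustration of the idea: boxes are assumed to be `(x1, y1, x2, y2)` tuples, the helper names `iou` and `empty_spots` are hypothetical, and the 0.25 overlap threshold is an assumption borrowed from the solution later in this notebook.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def empty_spots(reference_cars, current_cars, min_iou=0.25):
    """Return the reference boxes that no current car box overlaps enough.

    reference_cars: car boxes from a moment when every spot is occupied,
        so each box is treated as one parking spot.
    current_cars: car boxes detected at the moment being analyzed.
    min_iou: assumed overlap threshold for calling a spot occupied.
    """
    return [
        spot
        for spot in reference_cars
        if not any(iou(spot, car) >= min_iou for car in current_cars)
    ]


reference = [(0.0, 0.5, 0.1, 0.7), (0.2, 0.5, 0.3, 0.7)]
now = [(0.01, 0.5, 0.11, 0.7)]  # only the first spot still has a car
print(empty_spots(reference, now))
```

In the actual task the same comparison has to be made for every 30-second sample of every video, which is what the Rekall operations are for.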

In [116]:
def detect_empty_parking_spaces(bounding_boxes, reference_video):
    """
    Function to detect empty parking spaces.
    
    bounding_boxes is a Rekall IntervalSetMapping object whose intervals are
    bounding box outputs from Mask R-CNN.
    
    reference_video is the video ID of the first video. We guarantee that at
    time 0 in this video, all the car detections are parking spots.
    
    This function needs to return a Rekall IntervalSetMapping object whose
    Intervals are empty parking spots in the video.
    
    The output Intervals need to have time extent (0, 30), (30, 60), etc.
    Each Interval should have the spatial extent of a single empty parking
    spot.
    """
    
    # Parking spots: merge each box in the reference video across consecutive
    # samples, then keep the merged boxes that start at time 0
    join_ref_intervals = bounding_boxes[reference_video].dilate(30).coalesce(
        ('t1', 't2'), lambda b1, b2: b1.span(b2), predicate=iou_at_least(0.5)
    ).dilate(-30).filter(lambda e: e['t1'] == 0)
    
    parking_spots = IntervalSetMapping({
        k : IntervalSet([
            Interval(Bounds3D(
                t1 = 0,
                t2 = int(bounding_boxes[k].get_intervals()[-1]['t2']),
                x1 = interval['x1'],
                x2 = interval['x2'],
                y1 = interval['y1'],
                y2 = interval['y2']
            ))
            for interval in join_ref_intervals.get_intervals()
        ])
        for k in bounding_boxes
    })
    
    cars = bounding_boxes.filter(lambda i: i['payload']['class'] == 'car')
    
    join_cars = cars.dilate(15).coalesce(
        ('t1', 't2'), lambda bounds1, bounds2: bounds1.span(bounds2), predicate = iou_at_least(0.5)
    ).dilate(-15)
    
    # Subtract from each parking spot the time ranges in which a merged
    # detection overlaps it with IoU of at least 0.25; what remains is the
    # time during which the spot is empty
    return parking_spots.minus(
        join_cars,
        predicate = and_pred(
            Bounds3D.T(overlaps()),
            Bounds3D.X(overlaps()),
            Bounds3D.Y(overlaps()),
            iou_at_least(0.25)
        ),
        window = 0.0,
    )
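The `dilate(...).coalesce(...).dilate(...)` pattern in the solution above can be illustrated with a minimal one-dimensional sketch in plain Python. It assumes that `dilate` widens both time bounds by the window and that `coalesce` merges intervals that touch or overlap in time; it leaves out the `iou_at_least(0.5)` predicate, which in the solution restricts merging to boxes that also overlap spatially. The functions below are stand-ins written for this example, not Rekall's implementation.

```python
def dilate(intervals, window):
    """Widen (or shrink, for a negative window) every interval on both ends."""
    return [(t1 - window, t2 + window) for t1, t2 in intervals]


def coalesce(intervals):
    """Merge intervals that touch or overlap in time."""
    merged = []
    for t1, t2 in sorted(intervals):
        if merged and t1 <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], t2))
        else:
            merged.append((t1, t2))
    return merged


# One box detected in the samples at 0-30 and 30-60, then again at 120-150.
samples = [(0, 30), (30, 60), (120, 150)]
tracks = dilate(coalesce(dilate(samples, 15)), -15)
print(tracks)  # [(0, 60), (120, 150)]
```

Widening before merging lets detections separated by a small gap join into one interval, and shrinking by the same amount afterwards restores the original outer bounds.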

## Helper functions

`evaluate_on_dev` runs `detect_empty_parking_spaces` on `bounding_boxes`, and visualizes the results along with the ground truth in the dev set. Then it computes the AP score on the dev set.

Of course, you can also run these functions yourself!

`compute_ap` takes the predicted parking spaces and the ground truth set, and computes the AP score. As we saw before, `visualize_boxes` takes a list of `IntervalSetMapping` objects and visualizes them.

In [70]:
def evaluate_on_dev():
    empty_parking_spaces = detect_empty_parking_spaces(bounding_boxes, 0)
    
    # Visualize the predictions: the first row will be your predictions, and
    # the second row will be the ground truth
    widget = visualize_boxes([empty_parking_spaces, parking_space_gt])
    display(widget)
    
    # Compute average precision on the dev set
    ap = compute_ap(empty_parking_spaces, parking_space_gt)
    
    print('Average precision: ', ap)

In [117]:
evaluate_on_dev()

VGridWidget(vgrid_spec={'compressed': True, 'data': b'x\x9c\xdd\x9dIo\x1bG\x10\x85\xff\x8a\xc1s \xf5\xbe\xe8\x…

100%|██████████| 2/2 [00:00<00:00, 31.70it/s]
100%|██████████| 2/2 [00:00<00:00,  3.03it/s]
100%|██████████| 2/2 [00:00<00:00,  3.21it/s]

Average precision:  0.8122743682310469





In [119]:
evaluate_on_test(detect_empty_parking_spaces)

100%|██████████| 2/2 [00:00<00:00, 583.64it/s]
100%|██████████| 2/2 [00:00<00:00, 65.77it/s]
100%|██████████| 2/2 [00:00<00:00, 74.04it/s]
100%|██████████| 2/2 [00:00<00:00,  6.18it/s]
100%|██████████| 2/2 [00:00<00:00,  7.89it/s]

Average precision:  0.75





# Post-hoc analysis

In [129]:
def detect_empty_parking_spaces_modified(bounding_boxes, reference_video):
    """
    Function to detect empty parking spaces.
    
    bounding_boxes is a Rekall IntervalSetMapping object whose intervals are
    bounding box outputs from Mask R-CNN.
    
    reference_video is the video ID of the first video. We guarantee that at
    time 0 in this video, all the car detections are parking spots.
    
    This function needs to return a Rekall IntervalSetMapping object whose
    Intervals are empty parking spots in the video.
    
    The output Intervals need to have time extent (0, 30), (30, 60), etc.
    Each Interval should have the spatial extent of a single empty parking
    spot.
    """
    
    # Parking spots: merge each box in the reference video across consecutive
    # samples, then keep the merged boxes that start at time 0
    join_ref_intervals = bounding_boxes[reference_video].dilate(30).coalesce(
        ('t1', 't2'), lambda b1, b2: b1.span(b2), predicate=iou_at_least(0.5)
    ).dilate(-30).filter(lambda e: e['t1'] == 0)
    
    parking_spots = IntervalSetMapping({
        k : IntervalSet([
            Interval(Bounds3D(
                t1 = 0,
                t2 = int(bounding_boxes[k].get_intervals()[-1]['t2']),
                x1 = interval['x1'],
                x2 = interval['x2'],
                y1 = interval['y1'],
                y2 = interval['y2']
            ))
            for interval in join_ref_intervals.get_intervals()
        ])
        for k in bounding_boxes
    })
    
    # Modification: keep detections of every class instead of only 'car'
    cars = bounding_boxes.filter(lambda i: True)
    
    join_cars = cars.dilate(15).coalesce(
        ('t1', 't2'), lambda bounds1, bounds2: bounds1.span(bounds2), predicate = iou_at_least(0.5)
    ).dilate(-15)
    
    # Subtract from each parking spot the time ranges in which a merged
    # detection overlaps it with IoU of at least 0.25; what remains is the
    # time during which the spot is empty
    return parking_spots.minus(
        join_cars,
        predicate = and_pred(
            Bounds3D.T(overlaps()),
            Bounds3D.X(overlaps()),
            Bounds3D.Y(overlaps()),
            iou_at_least(0.25)
        ),
        window = 0.0,
    )

In [130]:
def evaluate_on_dev_modified():
    empty_parking_spaces = detect_empty_parking_spaces_modified(bounding_boxes, 0)
    
    # Visualize the predictions: the first row will be your predictions, and
    # the second row will be the ground truth
    widget = visualize_boxes([empty_parking_spaces, parking_space_gt])
    display(widget)
    
    # Compute average precision on the dev set
    ap = compute_ap(empty_parking_spaces, parking_space_gt)
    
    print('Average precision: ', ap)

In [131]:
evaluate_on_dev_modified()

VGridWidget(vgrid_spec={'compressed': True, 'data': b'x\x9c\xdd\x9d[O\x1bI\x10\x85\xffJ\xe4\xe7\x15\xf4\xfd\xc…

100%|██████████| 2/2 [00:00<00:00, 38.01it/s]
100%|██████████| 1/1 [00:00<00:00,  1.86it/s]
100%|██████████| 2/2 [00:00<00:00,  3.75it/s]

Average precision:  0.9453781512605042





In [132]:
evaluate_on_test(detect_empty_parking_spaces_modified)

100%|██████████| 2/2 [00:00<00:00, 734.43it/s]
100%|██████████| 2/2 [00:00<00:00, 12.21it/s]
100%|██████████| 2/2 [00:00<00:00, 85.35it/s]
100%|██████████| 2/2 [00:00<00:00,  8.97it/s]
100%|██████████| 2/2 [00:00<00:00,  9.25it/s]

Average precision:  0.9865771812080537



