# VIC: Introduction to Visual Computing - 2021/22
## Assignment 2

**Instructor:** Maria Vakalopoulou\
**T.A.:** Joseph Boyd\
**Due Date:** February 25, 2022

This is your second assignment in the computer vision course. This time you are to implement a system that outputs the bounding boxes (bb) of cars in dashcam videos. You are expected to submit your predictions to a Kaggle challenge at the following link, where you will also find the dataset and further instructions:

https://www.kaggle.com/t/1df65691991d413ab190f6cc4d31b968
 
**Scoring:**

 - Your work will be evaluated as usual (complexity of the solution, clean implementation, good documentation) **PLUS** the 5 best submissions will receive +1 on the grade of the assignment.
 
**Deliverables:**

 - Your code, as a single Python module (optionally with a requirements.txt or a Dockerfile). See the example prenom_nom.py. Your module has to implement the same interface; if it fails to run, your solution is considered failing.
 - Your report. Short summary of your algorithm, motivation for the algorithm used, failing cases, code and results (~1 page).
 
Send your assignment by email to maria.vakalopoulou@centralesupelec.fr. The subject of the email should be: VIC_Assignement2_name

**Guidelines:**

* Remember, you can pre-process the data (rgb2gray, resize, prefilter) however you like, but you are still expected to output bounding boxes for the raw images.

* You are free to design your solution pipeline but NO DEEP LEARNING APPROACHES ARE ALLOWED. You should use classic methods to complete this assignment!!

* You cannot use external datasets in your final pipeline, including in the training of any machine learning models.

* Even if your pipeline is not the most successful, you will be rewarded for being innovative/having a good implementation.
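The first guideline implies that any box detected on a pre-processed frame must be mapped back to raw-image pixels before submission. Below is a minimal sketch of that mapping for the resize case. The helper name `boxes_to_raw_coords` and its `scale_x`/`scale_y` parameters are hypothetical and not part of the provided code, and it assumes boxes use the `(x, y, dx, dy)` layout of the training annotations.

```python
import numpy as np

def boxes_to_raw_coords(boxes, scale_x, scale_y):
    """Map (x, y, dx, dy) boxes found on a resized frame back to raw-frame pixels.

    scale_x = resized_width / raw_width, scale_y = resized_height / raw_height.
    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    # x and dx scale with the width, y and dy with the height.
    factors = np.array([scale_x, scale_y, scale_x, scale_y])
    return np.rint(boxes / factors).astype(int)

# A box detected on a frame that was downscaled by 2 in both directions.
small_boxes = [(50, 40, 30, 20)]
raw_boxes = boxes_to_raw_coords(small_boxes, scale_x=0.5, scale_y=0.5)
print(raw_boxes)
```

If the pre-processing also crops the frame, the crop offset would have to be added to `x` and `y` after rescaling; that case is not handled here.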

In [6]:
import pandas as pd

df_ground_truth = pd.read_csv('./train.csv', index_col=0)
df_ground_truth.head()

| frame_id | bounding_boxes |
|---|---|
| 1 | 0 225 214 317 0 172 345 254 285 240 155 131 70... |
| 2 | 0 254 190 293 0 169 338 271 276 238 160 137 70... |
| 3 | 0 306 59 241 0 155 306 318 235 233 191 149 713... |
| 4 | 0 143 239 298 164 223 240 172 721 293 94 76 57... |
| 5 | 0 217 137 270 55 209 323 208 731 296 99 79 573... |


In [9]:
import os
import numpy as np
from matplotlib import pyplot as plt
from skimage.io import imread
import matplotlib.patches as patches


data_root = './train/'

_N = 202 # number of frames

def format_id(frame):
    assert _N >= frame
    return '%03d' % frame

def read_frame(root, frame):
    """Read the image of the frame with the given integer frame id."""
    assert _N >= frame
    return imread(os.path.join(root, format_id(frame)+'.jpg'))

def annotations_for_frame(solution, frame):
    assert frame in solution.index
    bbs = solution[solution.index == frame].bounding_boxes.values[0]
    bbs = list(map(int, bbs.split(' ')))
    # Integer division: the number of sections must be a whole number of 4-value boxes.
    return np.array_split(bbs, len(bbs) // 4)

def show_annotation(solution, frame):
    assert frame <= _N
    img = read_frame(data_root, frame)
    bbs = annotations_for_frame(solution, frame)

    fig, ax = plt.subplots(figsize=(10, 8))

    for x, y, dx, dy in bbs:
        rect = patches.Rectangle((x, y), dx, dy, edgecolor='r', facecolor='none')
        ax.add_patch(rect)

    ax.imshow(img)
    ax.set_title('Annotations for frame {}.'.format(frame))

from ipywidgets import interact, widgets
from IPython.display import display

def f_display(frame_id):
    show_annotation(df_ground_truth, frame_id)

interact(f_display, frame_id=widgets.IntSlider(min=1, max=_N, step=1, value=1))

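The annotations in `train.csv` store each frame's boxes as one space-separated string of `x y dx dy` values. The sketch below converts a list of boxes to that string layout and back. `encode_boxes` and `decode_boxes` are hypothetical helper names, and whether the submission file uses exactly this layout is an assumption to check against the Kaggle instructions.

```python
import numpy as np

def encode_boxes(boxes):
    """Flatten (x, y, dx, dy) boxes into one space-separated string of integers."""
    flat = np.asarray(boxes, dtype=int).reshape(-1)
    return ' '.join(str(v) for v in flat)

def decode_boxes(encoded):
    """Inverse of encode_boxes: return an (n_boxes, 4) integer array."""
    values = [int(v) for v in encoded.split()]
    return np.array(values, dtype=int).reshape(-1, 4)

# The first two boxes of frame 1 in train.csv, used as stand-in predictions.
predicted = [(0, 225, 214, 317), (0, 172, 345, 254)]
encoded = encode_boxes(predicted)
print(encoded)
print(decode_boxes(encoded))
```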