<a href="https://colab.research.google.com/github/google/applied-machine-learning-intensive/blob/master/content/04_classification/08_video_processing_project/colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### Copyright 2020 Google LLC.

In [None]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Video Classification with Pre-Trained Models Project

In this project we will import a pre-trained model that recognizes objects and use it to identify objects in a video. We'll draw a box around each identified object in every frame, and then reassemble the frames into a new video that shows the boxes.

# Exercises

## Exercise 1: Coding

You will process a video frame by frame, identify objects in each frame, and draw a bounding box with a label around each car in the video.
 
Use the [SSD MobileNet V1 Coco](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md) (*ssd_mobilenet_v1_coco*) model. The video you'll process can be found [on Pixabay](https://pixabay.com/videos/cars-motorway-speed-motion-traffic-1900/). The 640x360 version of the video is smallest and easiest to handle, though any size should work since you must scale down the images for processing.
 
Your program should:
 
* ~~Read in a video file (use the one in this colab if you want)~~
* ~~Load the TensorFlow model linked above~~
* ~~Loop over each frame of the video~~
* ~~Scale the frame down to a size the model expects~~
* ~~Feed the frame to the model~~
* ~~Loop over detections made by the model~~
* ~~If the detection score is above some threshold, draw a bounding box onto the frame and put a label in or near the box~~
* ~~Write the frame back to a new video~~
 
Some tips:
 
* Processing an entire video is slow, so consider truncating the video or skipping over frames during development. Skipping frames will make the video choppy. But you'll be able to see a wider variety of images than you would with a truncated video with all of the original frames in the clip.
* The model expects a 300x300 image. You'll likely have to scale your frames to fit the model. When you get a bounding box, that box is relative to the scaled image. You'll need to scale the bounding box out to the original image size.
* Don't start by trying to process the video. Instead, capture one frame and work with it until you are happy with your object detection, bounding boxes, and labels. Once you get those done, use the same logic on the other frames of the video.
* The [Coco labels file](https://github.com/nightrome/cocostuff/blob/master/labels.txt) can be used to identify classified objects.
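
The box-scaling tip above can be sketched in plain Python. This is a minimal illustration, not part of the assignment code: the helper names `scale_box` and `denormalize_box` are hypothetical, and the second helper assumes the model reports boxes normalized to the 0-1 range in `(ymin, xmin, ymax, xmax)` order, which is an assumption to verify against the model output you actually get.

```python
def scale_box(box, scaled_size, original_size):
    """Maps a pixel box from the scaled image onto the original frame.

    box: (x1, y1, x2, y2) in pixels of the scaled image.
    scaled_size, original_size: (width, height) tuples.
    """
    x1, y1, x2, y2 = box
    scaled_w, scaled_h = scaled_size
    orig_w, orig_h = original_size
    return (
        round(x1 * orig_w / scaled_w),
        round(y1 * orig_h / scaled_h),
        round(x2 * orig_w / scaled_w),
        round(y2 * orig_h / scaled_h),
    )


def denormalize_box(box, frame_size):
    """Converts an assumed normalized (ymin, xmin, ymax, xmax) box to pixels.

    Returns (x1, y1, x2, y2) in pixels of a frame of size (width, height).
    """
    ymin, xmin, ymax, xmax = box
    width, height = frame_size
    return (int(xmin * width), int(ymin * height),
            int(xmax * width), int(ymax * height))


# A box found on a 300x300 image, mapped back to a 640x360 frame.
print(scale_box((30, 60, 150, 240), (300, 300), (640, 360)))
# A normalized box mapped directly onto the 640x360 frame.
print(denormalize_box((0.25, 0.5, 0.75, 1.0), (640, 360)))
```

If the boxes are normalized, the second form is simpler because it needs no knowledge of the scaled size at all.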
 

### **Student Solution**

### Imports 

In [None]:
#Imports
import cv2 as cv
import numpy as np
import pandas as pd
from google.colab.patches import cv2_imshow
import matplotlib.pyplot as plt
import tensorflow as tf
import urllib.request
import os
import tarfile
import shutil


### Load the TensorFlow Model 

In [None]:
# Obtaining the model file
url = 'http://download.tensorflow.org/models/object_detection/'

filename = 'ssd_mobilenet_v1_coco_2018_01_28.tar.gz'

url1 = url + filename

urllib.request.urlretrieve(url1, filename)

dir_name = filename[0:-len('.tar.gz')]

In [None]:
# Extracting model data
dir_name = filename[0:-len('.tar.gz')]

if os.path.exists(dir_name):
  shutil.rmtree(dir_name) 

tarfile.open(filename, 'r:gz').extractall('./')

os.listdir(dir_name)

### Graph

In [None]:
# Defining frozen graph
frozen_graph = os.path.join(dir_name, 'frozen_inference_graph.pb')

with tf.io.gfile.GFile(frozen_graph, "rb") as f:
  graph_def = tf.compat.v1.GraphDef()
  loaded = graph_def.ParseFromString(f.read())

In [None]:
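# Wrapping the graph
def wrap_graph(graph_def, inputs, outputs, print_graph=False):
  """Wraps a frozen TF1 GraphDef as a callable TF2 function.

  inputs and outputs are tensor names such as 'image_tensor:0'.
  print_graph is accepted but currently unused.
  """
  wrapped = tf.compat.v1.wrap_function(
    lambda: tf.compat.v1.import_graph_def(graph_def, name=""), [])

  # Keep only the part of the graph between the named inputs and outputs
  return wrapped.prune(
    tf.nest.map_structure(wrapped.graph.as_graph_element, inputs),
    tf.nest.map_structure(wrapped.graph.as_graph_element, outputs))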
    


In [None]:
# Class ID to label mapping, from GitHub
dict = {
    0:"background",
    1:"person",
    2:"bicycle",
    3:"car",
    4:"motorcycle",
    5:"airplane",
    6:"bus",
    7:"train",
    8:"truck",
    9:"boat",
    10:"trafficlight",
    11:"firehydrant",
    12:"unknown",
    13:"stopsign",
    14:"parkingmeter",
    15:"bench",
    16:"bird",
    17:"cat",
    18:"dog",
    19:"horse",
    20:"sheep",
    21:"cow",
    22:"elephant",
    23:"bear",
    24:"zebra",
    25:"giraffe",
    26:"unknown",
    27:"backpack",
    28:"umbrella",
    29:"unknown",
    30:"unknown",
    31:"handbag",
    32:"tie",
    33:"suitcase",
    34:"frisbee",
    35:"skis",
    36:"snowboard",
    37:"sportsball",
    38:"kite",
    39:"baseballbat",
    40:"baseballglove",
    41:"skateboard",
    42:"surfboard",
    43:"tennisracket",
    44:"bottle",
    45:"unknown",
    46:"wineglass",
    47:"cup",
    48:"fork",
    49:"knife",
    50:"spoon",
    51:"bowl",
    52:"banana",
    53:"apple",
    54:"sandwich",
    55:"orange",
    56:"broccoli",
    57:"carrot",
    58:"hotdog",
    59:"pizza",
    60:"donut",
    61:"cake",
    62:"chair",
    63:"couch",
    64:"pottedplant",
    65:"bed",
    66:"unknown",
    67:"diningtable",
    68:"unknown",
    69:"unknown",
    70:"toilet",
    71:"unknown",
    72:"tv",
    73:"laptop",
    74:"mouse",
    75:"remote",
    76:"keyboard",
    77:"cellphone",
    78:"microwave",
    79:"oven",
    80:"toaster",
    81:"sink",
    82:"refrigerator",
    83:"unknown",
    84:"book",
    85:"clock",
    86:"vase",
    87:"scissors",
    88:"teddybear",
    89:"hairdrier",
    90:"toothbrush"
}
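
As an alternative to typing the mapping by hand, it could be built from the text of a labels file. The sketch below is hedged: the helper `parse_labels` is hypothetical, and it assumes each line has the shape `id: name` (for example `3: car`), which should be checked against the actual file before relying on it. Lines that do not match are skipped.

```python
def parse_labels(text):
    """Builds {class_id: label} from lines assumed to look like '3: car'."""
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if ':' not in line:
            continue  # blank or malformed line
        class_id, name = line.split(':', 1)
        class_id = class_id.strip()
        if not class_id.isdigit():
            continue  # the part before the colon is not a numeric ID
        labels[int(class_id)] = name.strip()
    return labels


# Small inline sample standing in for the downloaded file contents.
sample = "1: person\n2: bicycle\n3: car\n\nnot a label line\n"
labels = parse_labels(sample)
print(labels.get(3, "unknown"))
print(labels.get(12, "unknown"))
```

Looking labels up with `.get(class_id, "unknown")` avoids a `KeyError` for IDs the file does not define.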

In [None]:
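# Model
# Tensor names exposed by the frozen detection graph
outputs = (
  'num_detections:0',
  'detection_classes:0',
  'detection_scores:0',
  'detection_boxes:0',
)

# Wrap the graph once here; rebuilding it for every frame is slow
model = wrap_graph(graph_def=graph_def,
                   inputs=["image_tensor:0"],
                   outputs=outputs)

# threshold=0.5 is an arbitrary starting point; tune it for your video
def drawBoxes(frame, threshold=0.5):
  image = frame

  # OpenCV frames are BGR, while the model is expected to take RGB input
  rgb = cv.cvtColor(image, cv.COLOR_BGR2RGB)

  # The model takes a batch of uint8 images, so wrap the frame in a list
  tensor = tf.convert_to_tensor([rgb], dtype=tf.uint8)
  detections = model(tensor)

  num = int(detections[0][0].numpy())
  classes = detections[1][0].numpy()
  scores = detections[2][0].numpy()
  boxes = detections[3][0].numpy()

  height = image.shape[0]
  width = image.shape[1]
  for i in range(num):
    # Skip low-confidence detections
    if scores[i] < threshold:
      continue
    # Boxes are normalized (ymin, xmin, ymax, xmax); scale them to pixels
    y1 = int(boxes[i][0] * height)
    x1 = int(boxes[i][1] * width)
    y2 = int(boxes[i][2] * height)
    x2 = int(boxes[i][3] * width)
    # Class IDs come back as floats, so convert before the lookup
    label = dict.get(int(classes[i]), "unknown")
    image = cv.rectangle(image, (x1, y1), (x2, y2), (0,255,0), 2)
    cv.putText(image, label, (x1,y2), cv.FONT_HERSHEY_SIMPLEX, .8, [0,0,0], 2)

  return image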
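
The exercise asks for boxes only around cars whose detection score is above some threshold. The selection step can be sketched separately from TensorFlow and OpenCV, which makes it easy to try on one frame's results. This is an illustration under assumptions: `select_detections` is a hypothetical helper, the 0.5 threshold is an arbitrary starting value, and class ID 3 is taken from the `car` entry in the label mapping above.

```python
CAR_CLASS_ID = 3  # 'car' in the label mapping above


def select_detections(classes, scores, boxes, threshold=0.5,
                      wanted=(CAR_CLASS_ID,)):
    """Keeps detections that score at least `threshold` and are a wanted class.

    classes, scores and boxes are parallel sequences for one frame.
    Returns a list of (class_id, score, box) tuples.
    """
    selected = []
    for class_id, score, box in zip(classes, scores, boxes):
        # Class IDs may arrive as floats, so compare on the integer value
        if score >= threshold and int(class_id) in wanted:
            selected.append((int(class_id), float(score), tuple(box)))
    return selected


# Made-up values for one frame: a confident car, a weak car, and a person.
classes = [3.0, 3.0, 1.0]
scores = [0.9, 0.2, 0.8]
boxes = [(0.1, 0.1, 0.5, 0.5), (0.2, 0.2, 0.3, 0.3), (0.0, 0.0, 1.0, 1.0)]
for class_id, score, box in select_detections(classes, scores, boxes):
    print(class_id, score, box)
```

Only the first detection survives: the second is below the threshold and the third is not a car.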

### Video

#### Import Video

In [None]:
# Uploading file
# Takes a minute to run
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))

#### Video and Model

In [None]:
# The video is stored here
video = cv.VideoCapture('Cars.mp4')


In [None]:
# Information about the video
height = int(video.get(cv.CAP_PROP_FRAME_HEIGHT))
width = int(video.get(cv.CAP_PROP_FRAME_WIDTH))
fps = video.get(cv.CAP_PROP_FPS)
total_frames = int(video.get(cv.CAP_PROP_FRAME_COUNT))

print(f'height: {height}')
print(f'width: {width}')
print(f'frames per second: {fps}')
print(f'total frames: {total_frames}')
print(f'video length (seconds): {total_frames / fps}')

In [None]:
# This code block defines the output video
fourcc = cv.VideoWriter_fourcc(*'mp4v')
output_video = cv.VideoWriter('Updated Cars Video.mp4', fourcc, fps, (width, height))

In [None]:
# Reads every 25th frame, detects objects, and writes the annotated frame to 'Updated Cars Video.mp4'
# This code takes 5-7 minutes to run
for i in range(0, total_frames, 25):
  print(i, " / ", total_frames)
  video.set(cv.CAP_PROP_POS_FRAMES, i)
  ret, frame = video.read()
  # Check that the read succeeded before passing the frame to the model
  if not ret:
    raise Exception(f"Problem reading frame {i} from video")
  frame = drawBoxes(frame)
  output_video.write(frame)
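
Because the loop keeps only every 25th frame but the writer uses the original frame rate, the output clip plays back much faster than the source. The arithmetic can be sketched as follows. The helper names are hypothetical and the 750-frame, 25 fps figures are made-up example values, not measurements of the Pixabay video.

```python
def sampled_frames(total_frames, step):
    """Frame indices visited when reading every `step`-th frame."""
    return list(range(0, total_frames, step))


def output_duration(total_frames, step, fps):
    """Seconds of playback when the sampled frames are written at `fps`."""
    return len(range(0, total_frames, step)) / fps


def duration_preserving_fps(fps, step):
    """Writer frame rate that keeps the output as long as the original."""
    return fps / step


# Hypothetical clip: 750 frames at 25 fps (30 seconds), sampled every 25 frames.
print(len(sampled_frames(750, 25)))       # frames written
print(output_duration(750, 25, 25.0))     # seconds at the original fps
print(duration_preserving_fps(25.0, 25))  # fps that keeps the 30-second length
```

Passing the lower frame rate to `cv.VideoWriter` would keep the original duration at the cost of choppy motion, which is the trade-off the tips describe.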

In [None]:
#Releases original video
video.release()

In [None]:
#Releases Updated video
output_video.release()

## Exercise 2: Ethical Implications

Even the most basic models have the potential to affect segments of the population in different ways. It is important to consider how your model might positively and negatively affect different types of users.

In this section of the project, you will reflect on the positive and negative implications of your model. Frame the context of your model creation using this narrative:

> The city of Seattle is attempting to reduce traffic congestion in its downtown area. As part of this project, they plan to allow each local driver one free trip to downtown Seattle per week. After that, the driver will have to pay a $50 toll for each extra day per week driven. As an early proof of concept for this project, your team is tasked with using machine learning to correctly identify automobiles on the road. The next phase of the project will involve detecting license plate numbers and then cross-referencing that data with RFID chips that should be mounted in all local drivers' cars.

### **Student Solution**

**Positive Impact**

Your model is trying to solve a problem. Think about who will benefit from that problem being solved and write a brief narrative about how the model will help.

> *The city of Seattle will benefit because reducing the heavy traffic downtown could lead to fewer car accidents and therefore fewer car-related deaths. The city also benefits from the money earned from the $50 tolls. This model will help identify the license plates of the cars being driven so the city will know whom to bill. People walking downtown can also benefit from the reduced traffic.*


**Negative Impact**

Models rarely benefit everyone equally. Think about who might be negatively impacted by the predictions your model is making. This person(s) might not be directly using the model, but they might be impacted indirectly.

> *Businesses and workers in downtown Seattle will be negatively impacted. Businesses will be hurt because people who would have driven downtown just to visit them may not do so once the toll makes the trip expensive. People who work downtown will also be hurt because they will have to spend more money simply driving to work.*

**Bias**

Models can be biased for many reasons. The bias can come from the data used to build the model (e.g., sampling, data collection methods, available sources) and/or from the interpretation of the predictions generated by the model.

Think of at least two ways bias might have been introduced to your model and explain both below.

> *One source of bias could be that the model may not detect license plates in weather other than sunny conditions. For example, it is unclear whether the model could detect a license plate in the rain.*

> *Another source of bias could be license plates displayed in car windows. For instance, drivers could put their license plates in the window to avoid paying the fines. In that case police would not pull them over, and the camera might not be able to detect the plate.*

**Changing the Dataset to Mitigate Bias**

Having bias in your dataset is one of the primary ways in which bias is introduced to a machine learning model. Look back at the input data you fed to your model. Think about how you might change something about the data to reduce bias in your model.

What change or changes could you make to reduce the bias in your dataset? Consider the data you have, how and where it was collected, and what other sources of data might be used to reduce bias.

Write a summary of changes that could be made to your input data.

> *Since the data is potentially biased by the weather and time of day when the original video was taken, we can increase the number of cars in the dataset so that the model accounts for different weather and times of day.*

**Changing the Model to Mitigate Bias**

Is there any way to reduce bias by changing the model itself? This could include modifying algorithmic choices, tweaking hyperparameters, etc.

Write a brief summary of changes you could make to help reduce bias in your model.

> *Since the model has potential bias from the weather conditions when recording data and making predictions, we can adjust the data by including recordings of cars in all kinds of weather conditions. We wouldn't suggest changing the model itself yet because this bias could be fixed with more data.*

**Mitigating Bias Downstream**

Models make predictions. Downstream processes make decisions. What processes and/or rules should be in place for people and systems interpreting and acting on the results of your model to reduce bias? Describe these rules and/or processes below.

> *Since the predictions may be biased because they do not cover all circumstances, one rule for people and systems interpreting and acting on the model's results is simply to take into account that license plates may be hard to read in certain weather conditions.*

---