
#### Copyright 2020 Google LLC.

In [None]:
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Video Classification with Pre-Trained Models Project

In this project we will import a pre-existing model that recognizes objects and use the model to identify those objects in a video. We'll edit the video to draw boxes around the identified object, and then we'll reassemble the video so the boxes are shown around objects in the video.

# Exercises

## Exercise 1: Coding

You will process a video frame by frame, identify objects in each frame, and draw a bounding box with a label around each car in the video.
 
Use the [SSD MobileNet V1 Coco](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md) (*ssd_mobilenet_v1_coco*) model. The video you'll process can be found [on Pixabay](https://pixabay.com/videos/cars-motorway-speed-motion-traffic-1900/). The 640x360 version of the video is smallest and easiest to handle, though any size should work since you must scale down the images for processing.
 
Your program should:
 
* Read in a video file (use the one in this colab if you want)
* Load the TensorFlow model linked above
* Loop over each frame of the video
* Scale the frame down to a size the model expects
* Feed the frame to the model
* Loop over detections made by the model
* If the detection score is above some threshold, draw a bounding box onto the frame and put a label in or near the box
* Write the frame back to a new video
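
The thresholding step in the list above can be tried out without the model. The following is only a minimal sketch: the function name `filter_detections`, the 0.5 default threshold, and the sample values are assumptions made for this illustration. The three parallel lists stand in for the per-image classes, scores, and boxes that a detection model returns.

```python
def filter_detections(classes, scores, boxes, threshold=0.5):
    """Keep only detections whose score is at or above `threshold`.

    All names here are hypothetical. The inputs are parallel sequences
    holding one class ID, one score, and one box per detection.
    """
    kept = []
    for class_id, score, box in zip(classes, scores, boxes):
        if score >= threshold:
            kept.append((int(class_id), float(score), tuple(box)))
    return kept


# Made-up sample values: a confident car (class 3) and a weak person (class 1).
sample = filter_detections(
    classes=[3.0, 1.0],
    scores=[0.92, 0.30],
    boxes=[(0.1, 0.2, 0.5, 0.6), (0.0, 0.0, 0.2, 0.2)],
)
print(sample)
```

A higher threshold draws fewer, more reliable boxes; a lower one catches more objects at the cost of more false positives, so the value is worth tuning on a single frame first.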
 
Some tips:
 
* Processing an entire video is slow, so consider truncating the video or skipping over frames during development. Skipping frames will make the video choppy, but you'll see a wider variety of images than you would in a truncated clip that keeps every original frame.
* The model expects a 300x300 image. You'll likely have to scale your frames to fit the model. When you get a bounding box, that box is relative to the scaled image. You'll need to scale the bounding box out to the original image size.
* Don't start by trying to process the video. Instead, capture one frame and work with it until you are happy with your object detection, bounding boxes, and labels. Once you get those done, use the same logic on the other frames of the video.
* The [Coco labels file](https://github.com/nightrome/cocostuff/blob/master/labels.txt) can be used to identify classified objects.
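
Two of the tips above, skipping frames and scaling bounding boxes back to the original frame, can be sketched in plain Python. This is an illustration under stated assumptions, not a required implementation: the helper names `scale_box` and `sampled_frames` are hypothetical, and the box is assumed to be normalized to the range 0 to 1 in `(ymin, xmin, ymax, xmax)` order, so multiplying by the original frame size yields pixel coordinates directly.

```python
def scale_box(box, frame_width, frame_height):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to pixel corners.

    Returns (x1, y1, x2, y2) in the coordinates of the original frame.
    Assumes the box values are fractions of the image size (0 to 1).
    """
    ymin, xmin, ymax, xmax = box
    return (
        int(xmin * frame_width),
        int(ymin * frame_height),
        int(xmax * frame_width),
        int(ymax * frame_height),
    )


def sampled_frames(total_frames, step):
    """Return the frame indices to process when keeping every `step`-th frame."""
    return list(range(0, total_frames, step))


print(scale_box((0.25, 0.5, 0.75, 1.0), 640, 360))  # (320, 90, 640, 270)
print(sampled_frames(10, 4))  # [0, 4, 8]
```

Because the coordinates are fractions of the image, the same box works for the 300x300 model input and for the full-size frame; only the multipliers change.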
 

### **Student Solution**

In [None]:
import urllib.request
import os
import tarfile
import shutil
import tensorflow as tf
import cv2 as cv
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Download the pre-trained model archive.
base_url = 'http://download.tensorflow.org/models/object_detection/'
file_name = 'ssd_mobilenet_v1_coco_2018_01_28.tar.gz'
url = base_url + file_name
urllib.request.urlretrieve(url, file_name)

# Extract the archive, replacing any previously extracted copy.
dir_name = file_name[0:-len('.tar.gz')]
if os.path.exists(dir_name):
  shutil.rmtree(dir_name)
with tarfile.open(file_name, 'r:gz') as archive:
  archive.extractall('./')

# Load the frozen inference graph into a GraphDef.
frozen_graph = os.path.join(dir_name, 'frozen_inference_graph.pb')
with tf.io.gfile.GFile(frozen_graph, "rb") as f:
  graph_def = tf.compat.v1.GraphDef()
  graph_def.ParseFromString(f.read())


def wrap_graph(graph_def, inputs, outputs, print_graph=False):
  wrapped = tf.compat.v1.wrap_function(
    lambda: tf.compat.v1.import_graph_def(graph_def, name=""), [])
  return wrapped.prune(
    tf.nest.map_structure(wrapped.graph.as_graph_element, inputs),
    tf.nest.map_structure(wrapped.graph.as_graph_element, outputs))
  
dict = {            # maps COCO class IDs to label names
0:"background",
1:"person",
2:"bicycle",
3:"car",
4:"motorcycle",
5:"airplane",
6:"bus",
7:"train",
8:"truck",
9:"boat",
10:"trafficlight",
11:"firehydrant",
12:"unknown",
13:"stopsign",
14:"parkingmeter",
15:"bench",
16:"bird",
17:"cat",
18:"dog",
19:"horse",
20:"sheep",
21:"cow",
22:"elephant",
23:"bear",
24:"zebra",
25:"giraffe",
26:"unknown",
27:"backpack",
28:"umbrella",
29:"unknown",
30:"unknown",
31:"handbag",
32:"tie",
33:"suitcase",
34:"frisbee",
35:"skis",
36:"snowboard",
37:"sportsball",
38:"kite",
39:"baseballbat",
40:"baseballglove",
41:"skateboard",
42:"surfboard",
43:"tennisracket",
44:"bottle",
45:"unknown",
46:"wineglass",
47:"cup",
48:"fork",
49:"knife",
50:"spoon",
51:"bowl",
52:"banana",
53:"apple",
54:"sandwich",
55:"orange",
56:"broccoli",
57:"carrot",
58:"hotdog",
59:"pizza",
60:"donut",
61:"cake",
62:"chair",
63:"couch",
64:"pottedplant",
65:"bed",
66:"unknown",
67:"diningtable",
68:"unknown",
69:"unknown",
70:"toilet",
71:"unknown",
72:"tv",
73:"laptop",
74:"mouse",
75:"remote",
76:"keyboard",
77:"cellphone",
78:"microwave",
79:"oven",
80:"toaster",
81:"sink",
82:"refrigerator",
83:"unknown",
84:"book",
85:"clock",
86:"vase",
87:"scissors",
88:"teddybear",
89:"hairdrier",
90:"toothbrush"
}


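def drawBoxes(frame, threshold=0.5):
  """Detect objects in a frame and draw a labeled box around each one.

  The 0.5 default threshold is an assumed starting point; tune as needed.
  """
  image = frame
  outputs = (
    'num_detections:0',
    'detection_classes:0',
    'detection_scores:0',
    'detection_boxes:0',
  )
  # Note: the graph is re-wrapped on every call, which is slow. Wrapping it
  # once outside this function would speed up processing a whole video.
  model = wrap_graph(graph_def=graph_def,
                     inputs=["image_tensor:0"],
                     outputs=outputs)
  # OpenCV frames are BGR, but the model expects RGB input.
  rgb = cv.cvtColor(image, cv.COLOR_BGR2RGB)
  # The model takes a batch of uint8 images, so wrap the frame in a list.
  tensor = tf.convert_to_tensor([rgb], dtype=tf.uint8)
  detections = model(tensor)
  # Results come back in the same order as `outputs`.
  count = int(detections[0][0].numpy())
  classes = detections[1][0].numpy()
  scores = detections[2][0].numpy()
  coords = detections[3][0].numpy()
  height = image.shape[0]
  width = image.shape[1]
  for i in range(count):
    # Skip low-confidence detections.
    if scores[i] < threshold:
      continue
    # Boxes are normalized [ymin, xmin, ymax, xmax]; scale them to pixels.
    y1 = int(coords[i][0] * height)
    x1 = int(coords[i][1] * width)
    y2 = int(coords[i][2] * height)
    x2 = int(coords[i][3] * width)
    # Look up the label for the detected class instead of a fixed label.
    label = dict.get(int(classes[i]), "unknown")
    image = cv.rectangle(image, (x1, y1), (x2, y2), (0,255,0), 2)
    cv.putText(image, label, (x1,y2), cv.FONT_HERSHEY_SIMPLEX, .8, [0,0,0], 2)
  return image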

input_video = cv.VideoCapture('people')         # open the input video
height = int(input_video.get(cv.CAP_PROP_FRAME_HEIGHT))
width = int(input_video.get(cv.CAP_PROP_FRAME_WIDTH))
fps = input_video.get(cv.CAP_PROP_FPS)
total_frames = int(input_video.get(cv.CAP_PROP_FRAME_COUNT))
fourcc = cv.VideoWriter_fourcc(*'mp4v')
output_video = cv.VideoWriter('people.mp4', fourcc, fps, (width, height))
for i in range(0, total_frames, 1):
  input_video.set(cv.CAP_PROP_POS_FRAMES, i)
  ret, frames = input_video.read()
  # Confirm the read succeeded before running detection on the frame.
  if not ret:
    raise Exception(f"Problem reading frame {i} from video")
  frames = drawBoxes(frames)
  output_video.write(frames)
input_video.release()   # close the input video
output_video.release()  # finalize people.mp4

---

## Exercise 2: Ethical Implications

Even the most basic models have the potential to affect segments of the population in different ways. It is important to consider how your model might positively and negatively affect different types of users.

In this section of the project, you will reflect on the positive and negative implications of your model. Frame the context of your model creation using this narrative:

> The city of Seattle is attempting to reduce traffic congestion in its downtown area. As part of this project, they plan to allow each local driver one free trip to downtown Seattle per week. After that, the driver will have to pay a $50 toll for each extra day per week driven. As an early proof of concept for this project, your team is tasked with using machine learning to correctly identify automobiles on the road. The next phase of the project will involve detecting license plate numbers and then cross-referencing that data with RFID chips that should be mounted in all local drivers' cars.

### **Student Solution**

**Positive Impact**

Your model is trying to solve a problem. Think about who will benefit from that problem being solved and write a brief narrative about how the model will help.

Many people will benefit from the reduction of traffic congestion. The main beneficiaries of this model would be drivers traveling on the new, less congested roads. Workers in the city, such as police officers, would also benefit because they would have less trouble regulating traffic. The city as a whole would see an indirect benefit, since automobile emissions would be reduced under the new system. Finally, public transportation could benefit, since fewer cars will be on the road and more people will need a new way to travel.

**Negative Impact**

Models rarely benefit everyone equally. Think about who might be negatively impacted by the predictions your model is making. This person(s) might not be directly using the model, but they might be impacted indirectly.

Not all parties will benefit from this new system. For example, drivers who work in downtown Seattle and do not want to use public transportation would be negatively impacted. Local business owners downtown could also be negatively impacted: with fewer cars coming in, their businesses could see a decrease in customer traffic, because people would have no way of reaching them without paying an additional fee after their first visit of the week.

**Bias**

Models can be biased for many reasons. The bias can come from the data used to build the model (e.g., sampling, data collection methods, available sources) and/or from the interpretation of the predictions generated by the model.

Think of at least two ways bias might have been introduced to your model and explain both below.

Since the data being analyzed may not be clear, bias is inevitable when trying to account for multiple moving vehicles. For example, a system that automatically identifies cars could perform poorly if the model is not given enough examples of different car models. Therefore, one main problem that may arise is the incorrect identification of some motor vehicles.

Another example of bias that may be encountered is the incorrect reading of a license plate. Since the license plate serves as the identification of a car, one incorrect digit could lead to a different person being charged a fee because of a mistake made by the program. To mitigate biases like these, the best option would be to train the program on as much data as needed, or to try different camera angles, so that accuracy is as high as possible.

**Changing the Dataset to Mitigate Bias**

Having bias in your dataset is one of the primary ways in which bias is introduced to a machine learning model. Look back at the input data you fed to your model. Think about how you might change something about the data to reduce bias in your model.

What change or changes could you make to reduce the bias in your dataset? Consider the data you have, how and where it was collected, and what other sources of data might be used to reduce bias.

Write a summary of changes that could be made to your input data.

> *Since the data has potential bias A, we can adjust the method we use to determine which car has already used its free downtown pass. One adjustment could be using each car's VIN. As stated earlier, although using license plate numbers is feasible, one missed number could distort the entire process.*

**Changing the Model to Mitigate Bias**

Is there any way to reduce bias by changing the model itself? This could include modifying algorithmic choices, tweaking hyperparameters, etc.

Write a brief summary of changes you could make to help reduce bias in your model.

> *Since the model has potential bias A, we can adjust the way in which vehicles are checked and admitted. As stated earlier, cameras may not correctly capture everything that is required about a person's vehicle, but having designated booths with someone checking each vehicle could be a better solution.*



**Mitigating Bias Downstream**

Models make predictions. Downstream processes make decisions. What processes and/or rules should be in place for people and systems interpreting and acting on the results of your model to reduce bias? Describe these rules and/or processes below.

> *Since the predictions have potential bias A, we can adjust the criteria for who should and shouldn't be subject to the newly proposed downtown toll. For example, someone who works downtown should not have to pay it.*

---