# Intro to Object Detection


**TABLE OF CONTENTS**
- [What is an image?](#What-is-an-image?)
- [Introduction to Computer Vision](#Introduction-to-Computer-Vision)
- [The Viola-Jones algorithm](#The-Viola-Jones-algorithm)
- [Intro to OpenCV](#Intro-to-OpenCV)
--- 
![image](https://blog-c7ff.kxcdn.com/blog/wp-content/uploads/2018/08/shutterstock_668209624-1.jpg)

# What is an image?
[Return to top](#Intro-to-Object-Detection)

For half a century, we've been trying to give computers the ability to see. 
Hearing is not the same as listening, and taking pictures is not the same as seeing.

How computers see is very different from how we humans perceive things. Human perception of color derives from the stimulation of cone cells in the eye, while the process of perception itself is still much debated.

Ever since the invention of digital imagery, computers have been able to store images, but only recently have they begun to truly see. When you take a picture with a camera, the captured image is digitized and stored as a computer file. The light intensity is essentially transformed into a matrix of pixels.

Let's say your camera offers 12 megapixels. This means that each picture is represented by 12 million pixels, each assigned a color built from a combination of three primary colors: red, green, and blue. This is called the RGB color model.

In the common 8-bit encoding, each color channel takes an integer value between 0 and 255 (low to high intensity).

![img](https://web.stanford.edu/class/cs101/image-diagram2.png)

Computer vision is essentially the field of manipulating those matrices to derive meaningful information.
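To make the "matrix of pixels" idea concrete, here is a minimal sketch in plain Python. The tiny 2x2 image and the `to_grayscale` helper are illustrative assumptions, not part of any library; the weights are the widely used ITU-R BT.601 luma coefficients.

```python
# A tiny 2x2 "image": each pixel is an (R, G, B) tuple with values in 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red,  green
    [(0, 0, 255), (255, 255, 255)],  # blue, white
]


def to_grayscale(img):
    """Collapse each RGB pixel to a single intensity using BT.601 luma weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in img
    ]


height, width = len(image), len(image[0])
gray = to_grayscale(image)
print(height, width)  # 2 2
print(gray)           # [[76, 150], [29, 255]]
```

In practice a library such as NumPy or OpenCV would hold the same data in an array, but the structure is the same: rows of pixels, each with three channel values.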

# Introduction to Computer Vision
[Return to top](#Intro-to-Object-Detection)

You've probably used computer vision on several occasions, with software like Photoshop or your phone's picture editor. The filters they offer are essentially mathematical operations executed on the original picture.

![img](https://blogsimages.adobe.com/jkost/files/2013/09/39filters.jpg)
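As a rough illustration of a filter being "just math on the matrix", the sketch below applies two simple operations to a grayscale image stored as a list of rows. The function names `invert` and `brighten` are hypothetical, and real editors use far more elaborate operations.

```python
def invert(gray):
    """Negative filter: dark pixels become bright and vice versa."""
    return [[255 - p for p in row] for row in gray]


def brighten(gray, amount):
    """Add a constant to every pixel, clamping the result to the 0..255 range."""
    return [[max(0, min(255, p + amount)) for p in row] for row in gray]


gray = [[0, 100], [200, 255]]
negative = invert(gray)
brighter = brighten(gray, 80)
print(negative)  # [[255, 155], [55, 0]]
print(brighter)  # [[80, 180], [255, 255]]
```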



But computer vision has many more applications, such as image retrieval, surveillance, and self-driving vehicles.

These applications rely on object detection: the use of algorithms to detect and classify instances of real-world objects present in a picture or video.

The basic concept is that each object class has its own special features that help in classifying it (e.g. all circles are round, the Eiffel Tower always has the same shape, etc.).

Object class detection uses these special features. For example, when looking for circles, we search for points that all lie at the same distance from a single point (the center). Similarly, when looking for squares, we search for shapes with perpendicular corners and equal side lengths. A similar approach is used for face identification, where the eyes, nose, and lips can be located and features such as skin color and the distance between the eyes can be measured.
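The circle rule above can be sketched in a few lines. This is only an illustration under simplifying assumptions: the outline points are taken as already extracted from the image, and `looks_like_circle` and its `tolerance` parameter are hypothetical names rather than part of any library.

```python
import math


def looks_like_circle(points, tolerance=0.05):
    """Return True if all points lie at roughly the same distance from their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    distances = [math.hypot(x - cx, y - cy) for x, y in points]
    mean_radius = sum(distances) / len(distances)
    if mean_radius == 0:
        return False  # all points coincide, so there is no shape to speak of
    return all(abs(d - mean_radius) <= tolerance * mean_radius for d in distances)


# 16 points sampled on a unit circle, and 8 points on the outline of a square.
circle = [(math.cos(2 * math.pi * k / 16), math.sin(2 * math.pi * k / 16)) for k in range(16)]
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

print(looks_like_circle(circle))  # True
print(looks_like_circle(square))  # False
```

The square fails because its corners sit farther from the center than the midpoints of its sides, which is exactly the kind of class-specific feature the text describes.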

# The Viola-Jones algorithm
[Return to top](#Intro-to-Object-Detection)

The Viola–Jones object detection framework, proposed in 2001 by Paul Viola and Michael Jones, was the first object detection framework to provide competitive detection rates in real time. The paper can be downloaded by clicking [here](https://www.cs.cmu.edu/~efros/courses/LBMV07/Papers/viola-cvpr-01.pdf).

**Haar-like features:**

In order to understand how the algorithm works, we must understand what it is trying to identify. In this case, it looks for Haar-like features.

A Haar-like feature considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in each region, and calculates the difference between these sums. This difference is then used to categorize subsections of an image. For example, say we have an image database of human faces. It is a common observation that, across all faces, the region of the eyes is darker than the region of the cheeks. Therefore, a common Haar feature for face detection is a set of two adjacent rectangles, one covering the eye region and one covering the cheek region.

![img](https://i.stack.imgur.com/mVBld.png)
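The two-rectangle feature described above can be sketched directly from its definition. The helper names and the 4x4 patch below are made up for illustration; the patch mimics a dark "eye" band sitting above a bright "cheek" band.

```python
def region_sum(gray, top, left, height, width):
    """Sum the pixel intensities inside one rectangle of a grayscale image."""
    return sum(
        gray[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def two_rect_haar_feature(gray, top, left, height, width):
    """Vertical two-rectangle feature: (sum of lower half) - (sum of upper half)."""
    half = height // 2
    upper = region_sum(gray, top, left, half, width)
    lower = region_sum(gray, top + half, left, half, width)
    return lower - upper


patch = [
    [30, 40, 35, 30],      # dark "eye" rows
    [35, 30, 40, 35],
    [200, 210, 205, 200],  # bright "cheek" rows
    [205, 200, 210, 205],
]
flat = [[100] * 4 for _ in range(4)]

print(two_rect_haar_feature(patch, 0, 0, 4, 4))  # 1360
print(two_rect_haar_feature(flat, 0, 0, 4, 4))   # 0
```

A large positive value signals a dark-over-bright pattern, while a uniform region scores zero. The Viola–Jones paper computes these sums through an integral image for speed; the direct summation here is only for readability.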

**Viola-Jones object detection framework:**


![gif](https://www.pyimagesearch.com/wp-content/uploads/2014/10/sliding_window_example.gif)

To get the intuition behind the algorithm, picture a window that slides over the image, looking for Haar-like features. When a feature is recognized, a positive indication is fed back to the model, and when enough positive indications are found in one area (e.g. two eyes and a mouth), the algorithm classifies that area of interest as a face, an eye, a body, or whatever we are trying to identify.
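The sliding-window part of this intuition can be sketched as a small generator. The name `sliding_windows` and its parameters are assumptions for illustration; a real detector would also repeat the scan at several window sizes.

```python
def sliding_windows(img_height, img_width, win_size, step):
    """Yield the (top, left) corner of each square window that fits inside the image."""
    for top in range(0, img_height - win_size + 1, step):
        for left in range(0, img_width - win_size + 1, step):
            yield top, left


# Scan a 6x8 image with a 4x4 window, moving 2 pixels at a time.
positions = list(sliding_windows(6, 8, win_size=4, step=2))
print(positions)
# [(0, 0), (0, 2), (0, 4), (2, 0), (2, 2), (2, 4)]
```

At each position, the detector would evaluate its Haar-like features on the pixels inside the window and decide whether that window is worth keeping.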



As one can imagine, regions like the nose, the eyebrows, and the mouth are a perfect fit for Haar-like features and can serve as very powerful predictors.

![img](https://docs.opencv.org/3.4.3/haar.png)



# Intro to OpenCV
[Return to top](#Intro-to-Object-Detection)

![img](https://jayrambhia.files.wordpress.com/2012/06/opencv_hor_900_1.jpg)


Now that we know the basic theory behind face detection, let's move on to the implementation.

For this exercise we use OpenCV, an open-source library of programming functions mainly aimed at real-time computer vision. 
Luckily for us, OpenCV comes with pre-trained Haar cascade classifiers (face, eyes, smiles, etc.). This means that all we have to do is load a classifier and apply it to our processed images. Let's get started!
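As a preview, loading a pre-trained cascade and applying it might look like the sketch below. It assumes the `opencv-python` package, which exposes the bundled cascade files through `cv2.data.haarcascades`. The `detect_faces` wrapper and its default parameter values are illustrative choices, and the exercise notebook may structure things differently.

```python
def detect_faces(image_path, scale_factor=1.1, min_neighbors=5):
    """Return a list of (x, y, w, h) bounding boxes for faces found in an image file."""
    import cv2  # imported here so the function can be defined without OpenCV installed

    # Pre-trained frontal-face cascade shipped with the opencv-python package.
    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    classifier = cv2.CascadeClassifier(cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Haar cascades work on pixel intensities, so convert to grayscale first.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    faces = classifier.detectMultiScale(
        gray, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )
    return [tuple(int(v) for v in box) for box in faces]
```

Calling `detect_faces("photo.jpg")`, where `photo.jpg` is a placeholder for any image on disk, would return one box per detected face. Raising `min_neighbors` tends to reduce false positives at the cost of missing some faces.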

To move on to the face-detection exercise, open the file titled face_detection_template.ipynb or click [here](./face_detection_template.ipynb).