# FACE DETECTION

## WHAT IS FACE DETECTION?


##### ----> Face detection is a type of computer vision technology that identifies people’s faces within digital images. This is very easy for humans, but computers need precise instructions. The images might contain many objects that aren’t human faces, like buildings, cars, animals, and so on.

##### ---->  It is distinct from other computer vision technologies that involve human faces, like facial recognition, analysis, and tracking.



##### ---->   Facial recognition involves identifying the face in the image as belonging to person X and not person Y. It is often used for biometric purposes, like unlocking your smartphone.


##### ---->   Facial analysis tries to understand something about people from their facial features, like determining their age, gender, or the emotion they are displaying.

##### ---->   Facial tracking is mostly present in video analysis and tries to follow a face and its features (eyes, nose, and lips) from frame to frame. The most popular applications are various filters available in mobile apps like Snapchat.

##### ---->   All of these problems have different technological solutions. This tutorial will focus on a traditional solution for the first challenge: face detection.

## HOW DO MACHINES SEE IMAGES?


##### ---->   The smallest element of an image is called a pixel, or a picture element. It is basically a dot in the picture. An image contains multiple pixels arranged in rows and columns.



##### ----> You will often see the number of rows and columns expressed as the image resolution. For example, an Ultra HD TV has a resolution of 3840x2160, meaning it is 3840 pixels wide and 2160 pixels high.



#####  ---->   But a computer does not understand pixels as dots of color. It only understands numbers. To convert colors to numbers, the computer uses various color models.

##### ---->   In color images, pixels are often represented in the RGB color model. RGB stands for Red Green Blue. Each pixel is a mix of those three colors. RGB is great at modeling all the colors humans perceive by combining various amounts of red, green, and blue.



##### ----> Since a computer only understands numbers, every pixel is represented by three numbers, corresponding to the amounts of red, green, and blue present in that pixel. You can learn more about color spaces in "Image Segmentation Using Color Spaces in OpenCV + Python".


##### ----> In grayscale (black and white) images, each pixel is a single number, representing the amount of light, or intensity, it carries. In many applications, the range of intensities is from 0 (black) to 255 (white). Values between 0 and 255 are shades of gray.


##### ----> If each grayscale pixel is a number, an image is nothing more than a matrix (or table) of numbers.
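##### ----> A minimal sketch of this idea in plain Python. The 3x4 image and the helper name `rgb_to_gray` are made up for illustration; the weights 0.299, 0.587, and 0.114 are the standard luma weights commonly used to turn an RGB pixel into a single intensity.

```python
# A tiny, made-up grayscale image: 3 rows x 4 columns, one intensity per pixel.
gray_image = [
    [0,  64, 128, 255],
    [32, 96, 160, 224],
    [16, 80, 144, 208],
]

rows = len(gray_image)
columns = len(gray_image[0])
print(f"{columns}x{rows} image")  # width x height, like a resolution
print("pixel at row 1, column 2:", gray_image[1][2])


def rgb_to_gray(red, green, blue):
    """Collapse one RGB pixel (three numbers) into a single intensity."""
    return round(0.299 * red + 0.587 * green + 0.114 * blue)


print(rgb_to_gray(255, 255, 255))  # white stays at the top of the range
print(rgb_to_gray(0, 0, 0))        # black stays at the bottom
```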


## IMPORTANT LIBRARIES 

##### 1) scikit-image

##### 2) scikit-learn

##### 3) OpenCV

## TO DETECT FACES IN AN IMAGE I USED: VIOLA-JONES OBJECT DETECTION FRAMEWORK

#####  ----> This algorithm is named after two computer vision researchers who proposed the method in 2001: Paul Viola and Michael Jones.

##### ---> They developed a general object detection framework that was able to provide competitive object detection rates in real time. It can be used to solve a variety of detection problems, but the main motivation comes from face detection.

##### The Viola-Jones algorithm has four main steps:

### 1) SELECTING HAAR-LIKE FEATURES

#####               --->  All human faces share some similarities. If you look at a photograph showing a person’s face, you will see, for example, that the eye region is darker than the bridge of the nose. The cheeks are also brighter than the eye region. We can use these properties to help us understand if an image contains a human face.

##### ----> A simple way to find out which region is lighter or darker is to sum up the pixel values of both regions and compare them. The sum of pixel values in the darker region will be smaller than the sum of pixels in the lighter region. This can be accomplished using Haar-like features.

##### ----> A Haar-like feature is represented by taking a rectangular part of an image and dividing that rectangle into multiple parts. These features are often visualized as adjacent black and white rectangles.
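##### ----> The comparison of two regions can be sketched in plain Python. This is only an illustration: the 4x4 patch values are invented, and the function names `region_sum` and `two_rectangle_feature` are hypothetical helpers, not part of any library.

```python
def region_sum(image, top, left, height, width):
    """Sum the pixel values inside one rectangle of a 2D list."""
    return sum(
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    )


def two_rectangle_feature(image, top, left, height, width):
    """Two stacked rectangles: sum of the lower half minus sum of the upper half.

    A large positive value means the lower half is brighter than the upper half.
    """
    half = height // 2
    upper = region_sum(image, top, left, half, width)
    lower = region_sum(image, top + half, left, half, width)
    return lower - upper


# Made-up patch: a dark band (eye region) above a bright band (cheeks).
patch = [
    [40,  50,  45,  40],
    [50,  40,  50,  45],
    [200, 210, 205, 200],
    [210, 200, 200, 205],
]
flat = [[100] * 4 for _ in range(4)]

print(two_rectangle_feature(patch, 0, 0, 4, 4))  # 1630 - 360 = 1270
print(two_rectangle_feature(flat, 0, 0, 4, 4))   # 0, no contrast at all
```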

### 2) CREATING AN INTEGRAL IMAGE

#####    ---->An integral image (also known as a summed-area table) is the name of both a data structure and an algorithm used to obtain this data structure. It is used as a quick and efficient way to calculate the sum of pixel values in an image or rectangular part of an image.
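##### ----> A minimal sketch of a summed-area table in plain Python, assuming the common layout with one extra row and column of zeros. The function names are hypothetical. Once the table is built, the sum of any rectangle takes four lookups, no matter how large the rectangle is.

```python
def integral_image(image):
    """Build a summed-area table with an extra zero row and column.

    table[r][c] holds the sum of all pixels in rows 0..r-1 and columns 0..c-1.
    """
    rows, cols = len(image), len(image[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (
                image[r][c] + table[r][c + 1] + table[r + 1][c] - table[r][c]
            )
    return table


def rectangle_sum(table, top, left, height, width):
    """Sum of a rectangle from four table lookups."""
    bottom, right = top + height, left + width
    return (
        table[bottom][right]
        - table[top][right]
        - table[bottom][left]
        + table[top][left]
    )


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
print(table[3][3])                       # whole image: 45
print(rectangle_sum(table, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```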

### 3) ADABOOST

#####  ---->Boosting is based on the following question: “Can a set of weak learners create a single strong learner?” A weak learner (or weak classifier) is defined as a classifier that is only slightly better than random guessing.




#####   ---->In face detection, this means that a weak learner can classify a subregion of an image as a face or not-face only slightly better than random guessing. A strong learner is substantially better at picking faces from non-faces.

##### ----> The power of boosting comes from combining many (thousands of) weak classifiers into a single strong classifier. In the Viola-Jones algorithm, each Haar-like feature represents a weak learner. To decide the type and size of a feature that goes into the final classifier, AdaBoost checks the performance of all classifiers that you supply to it.

##### ----> To calculate the performance of a classifier, you evaluate it on all subregions of all the images used for training. Some subregions will produce a strong response in the classifier. Those will be classified as positives, meaning the classifier thinks they contain a human face.

##### ----> Subregions that don’t produce a strong response don’t contain a human face, in the classifier’s opinion. They will be classified as negatives.

#####    ----->The classifiers that performed well are given higher importance or weight. The final result is a strong classifier, also called a boosted classifier, that contains the best performing weak classifiers.

##### ----> The algorithm is called adaptive because, as training progresses, it places more emphasis on those images that were incorrectly classified. The weak classifiers that perform better on these hard examples are weighted more strongly than others.
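##### ----> The re-weighting idea can be sketched with a toy version of discrete AdaBoost. This is an illustration under stated assumptions, not the Viola-Jones training procedure itself: each sample is a single made-up number standing in for a feature response, each weak learner is a one-threshold test, and all names are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: one threshold test that answers +1 (face) or -1 (non-face)."""
    return polarity if x >= threshold else -polarity


def train_adaboost(samples, labels, rounds):
    """Pick one stump per round and return a list of (alpha, threshold, polarity)."""
    n = len(samples)
    weights = [1.0 / n] * n  # every training example starts equally important
    ensemble = []
    for _ in range(rounds):
        # Find the stump with the lowest weighted error on the current weights.
        best = None
        for threshold in sorted(set(samples)):
            for polarity in (1, -1):
                error = sum(
                    w for x, y, w in zip(samples, labels, weights)
                    if stump_predict(x, threshold, polarity) != y
                )
                if best is None or error < best[0]:
                    best = (error, threshold, polarity)
        error, threshold, polarity = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)  # better stumps get more say
        ensemble.append((alpha, threshold, polarity))
        # Adaptive step: misclassified examples gain weight, correct ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for x, y, w in zip(samples, labels, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def strong_predict(ensemble, x):
    """Strong classifier: weighted vote of all selected weak classifiers."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can separate.
samples = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
labels = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = train_adaboost(samples, labels, rounds=3)
predictions = [strong_predict(ensemble, x) for x in samples]
print(predictions == labels)  # three weak stumps together fit the training data
```

##### ----> On this toy data the best single stump still misclassifies three of the ten samples, while the weighted vote of three stumps classifies all of them correctly.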


###   4) CASCADING CLASSIFIERS

##### ----> The definition of a cascade is a series of waterfalls coming one after another. A similar concept is used in computer science to solve a complex problem with simple units. The problem here is reducing the number of computations for each image.

##### ----> To solve it, Viola and Jones turned their strong classifier (consisting of thousands of weak classifiers) into a cascade of stages, where each stage is built from one or more weak classifiers. The job of the cascade is to quickly discard non-faces and avoid wasting precious time and computations.

##### ----> When an image subregion enters the cascade, it is evaluated by the first stage. If that stage evaluates the subregion as positive, meaning that it thinks it’s a face, the output of the stage is "maybe".

##### ----> If a subregion gets a "maybe", it is sent to the next stage of the cascade. If that one gives a positive evaluation, then that’s another "maybe", and the subregion is sent to the third stage.

##### ----> This process is repeated until the subregion passes through all stages of the cascade. If all classifiers approve the subregion, it is finally classified as a human face and is presented to the user as a detection.

##### ----> If, however, the first stage gives a negative evaluation, then the subregion is immediately discarded as not containing a human face. If it passes the first stage but fails the second stage, it is discarded as well. Basically, the subregion can get discarded at any stage of the cascade.

#####   ----> This is designed so that non-faces get discarded very quickly, which saves a lot of time and computational resources. Since every classifier represents a feature of a human face, a positive detection basically says, “Yes, this subregion contains all the features of a human face.” But as soon as one feature is missing, it rejects the whole subregion.

#####     ---->To accomplish this effectively, it is important to put your best performing classifiers early in the cascade. In the Viola-Jones algorithm, the eyes and nose bridge classifiers are examples of best performing weak classifiers.
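##### ----> The early-exit behaviour of a cascade can be sketched as a loop over stages. This is a simplified illustration: the stage functions, the dictionary keys, and the feature values are all hypothetical stand-ins for trained classifiers and real Haar-like feature responses.

```python
def run_cascade(stages, subregion):
    """Return (is_face, stages_evaluated) for one subregion.

    The loop stops at the first stage that rejects the subregion,
    so later stages never run for obvious non-faces.
    """
    for index, stage in enumerate(stages, start=1):
        if not stage(subregion):
            return False, index
    return True, len(stages)


# Hypothetical stages, best performing checks first.
stages = [
    lambda region: region["eyes_darker_than_cheeks"] > 0,
    lambda region: region["nose_bridge_brighter_than_eyes"] > 0,
    lambda region: region["mouth_darker_than_chin"] > 0,
]

face_like = {
    "eyes_darker_than_cheeks": 1270,
    "nose_bridge_brighter_than_eyes": 640,
    "mouth_darker_than_chin": 210,
}
blank_wall = {
    "eyes_darker_than_cheeks": 0,
    "nose_bridge_brighter_than_eyes": 0,
    "mouth_darker_than_chin": 0,
}

print(run_cascade(stages, face_like))   # (True, 3): every stage said "maybe"
print(run_cascade(stages, blank_wall))  # (False, 1): rejected by the first stage
```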

#####   ---->Now that you understand how the algorithm works, it is time to use it to detect faces with Python.

## Import OpenCV and load the image into memory:



In [10]:
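import cv2 as cv

# Read image from your local file system
original_image = cv.imread('docs_1/image4.jpg')
# imread() returns None instead of raising an error when the file cannot be read
if original_image is None:
    raise FileNotFoundError('Could not read docs_1/image4.jpg')
# Convert color image to grayscale for Viola-Jones
grayscale_image = cv.cvtColor(original_image, cv.COLOR_BGR2GRAY)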

##### ----->Next, you need to load the Viola-Jones classifier. If you installed OpenCV from source, it will be in the folder where you installed the OpenCV library.



##### ----> Depending on the version, the exact path might vary, but the folder name will be haarcascades, and it will contain multiple files. The one used here is haarcascade_frontalface_default.xml; in this example, a copy of it is loaded from the docs_1 folder.

In [11]:
## Load the classifier and create a cascade object for face detection
face_cascade = cv.CascadeClassifier('docs_1/haarcascade_frontalface_default.xml')

##### ----> The face_cascade object has a method detectMultiScale(), which receives an image as an argument and runs the classifier cascade over the image. The term MultiScale indicates that the algorithm looks at subregions of the image at multiple scales, to detect faces of varying sizes:



In [12]:
detected_faces = face_cascade.detectMultiScale(grayscale_image)
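
##### ----> To get a feel for what "multiple scales" means, here is a plain-Python sketch that lists square window sizes and counts how many subregions they produce. It is not OpenCV's implementation: the base size of 24, the scale factor of 1.5, the step of one pixel, and both function names are assumptions chosen for illustration.

```python
def window_sizes(image_width, image_height, base_size=24, scale_factor=1.5):
    """Return the square window sizes that still fit inside the image."""
    sizes = []
    size = float(base_size)
    while size <= min(image_width, image_height):
        sizes.append(int(size))
        size *= scale_factor  # next, larger scale
    return sizes


def count_windows(image_width, image_height, size, step=1):
    """Number of positions a size x size window can take with the given step."""
    across = (image_width - size) // step + 1
    down = (image_height - size) // step + 1
    return across * down


sizes = window_sizes(100, 100)
total = sum(count_windows(100, 100, size) for size in sizes)
print(sizes)
print(total, "subregions to test in a 100x100 image")
```

##### ----> Even a small image yields thousands of subregions, which is why the cascade needs to reject non-faces quickly. In OpenCV, detectMultiScale() accepts optional parameters such as scaleFactor and minNeighbors to tune this search.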

#####    ---->The variable detected_faces now contains all the detections for the target image. To visualize the detections, you need to iterate over all detections and draw rectangles over the detected faces.


#####     ---->OpenCV’s rectangle() draws rectangles over images, and it needs to know the pixel coordinates of the top-left and bottom-right corner. The coordinates indicate the row and column of pixels in the image.

#####   ---->Luckily, detections are saved as pixel coordinates. Each detection is defined by its top-left corner coordinates and width and height of the rectangle that encompasses the detected face.

##### ----> Adding the width to the column and the height to the row gives you the bottom-right corner of the rectangle:

In [14]:
for (column, row, width, height) in detected_faces:
    cv.rectangle(original_image, (column, row), (column + width, row + height), (255, 0, 0), 2)
# rectangle() accepts the following arguments:
# The original image
# The coordinates of the top-left point of the detection
# The coordinates of the bottom-right point of the detection
# The color of the rectangle (a tuple of blue, green, and red amounts, 0-255, because OpenCV uses BGR order; (255, 0, 0) is blue)
# The thickness of the rectangle line

## Finally, you need to display the image

In [15]:
cv.imshow('Image', original_image)
cv.waitKey(0)
cv.destroyAllWindows()

#####   ----> imshow() displays the image. waitKey() waits for a keystroke. Otherwise, imshow() would display the image and immediately close the window. Passing 0 as the argument tells it to wait indefinitely. Finally, destroyAllWindows() closes the window when you press a key.

### COMPLETE CODE SNIPPET

In [17]:
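import cv2 as cv

# Read the image and stop early if the file cannot be read
original_image = cv.imread('docs_1/image4.jpg')
if original_image is None:
    raise FileNotFoundError('Could not read docs_1/image4.jpg')

# Viola-Jones works on grayscale intensities
grayscale_image = cv.cvtColor(original_image, cv.COLOR_BGR2GRAY)

# Load the pre-trained frontal face cascade and run it over the image
face_cascade = cv.CascadeClassifier('docs_1/haarcascade_frontalface_default.xml')
detected_faces = face_cascade.detectMultiScale(grayscale_image)

# Draw a blue (BGR order) rectangle around every detection
for (column, row, width, height) in detected_faces:
    cv.rectangle(original_image, (column, row), (column + width, row + height), (255, 0, 0), 2)

# Show the result until a key is pressed
cv.imshow('Image', original_image)
cv.waitKey(0)
cv.destroyAllWindows()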