# State-of-the-Art Deep Learning for Face Detection

## 1. MTCNN (Multi-Task Cascaded Convolutional Neural Network)

- The *“Multi-Task Cascaded Convolutional Neural Network,”* or MTCNN for short, was described by Kaipeng Zhang et al. in the 2016 paper titled *“Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks.”*
- MTCNN is popular because it achieved **state-of-the-art** results at the time on a range of benchmark datasets, and because it can also locate **facial features** such as the **eyes** and **mouth**, a task called **landmark detection**.
- The network uses a cascade of three networks:
    - first, the image is rescaled to a range of different sizes (called an **image pyramid**),
    - then the first model (Proposal Network or **P-Net**) proposes **candidate facial regions**,
    - the second model (Refine Network or **R-Net**) **filters the bounding boxes**, and
    - the third model (Output Network or **O-Net**) refines the result and outputs the **facial landmarks**.
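
The image pyramid from the first step can be sketched as below. This is a minimal illustration and not the library's own code: the function name `pyramid_scales` is hypothetical, and the defaults (a 12-pixel P-Net input, a minimum face size of 20 pixels, and a scale factor of 0.709) are assumptions based on commonly used MTCNN settings.

```python
def pyramid_scales(height, width, min_face_size=20, scale_factor=0.709, net_input=12):
    """Return the resize factors that make up the image pyramid.

    All default values are assumptions, chosen to mirror common MTCNN settings.
    """
    # First scale: a face of min_face_size pixels is shrunk to the P-Net input size.
    scale = net_input / min_face_size
    min_side = min(height, width) * scale

    scales = []
    # Keep shrinking until the shorter image side no longer fits the P-Net input.
    while min_side >= net_input:
        scales.append(scale)
        scale *= scale_factor
        min_side *= scale_factor
    return scales


scales = pyramid_scales(480, 640)
print(len(scales), ["%.3f" % s for s in scales])
```

Each factor yields one resized copy of the image, so a smaller `min_face_size` or a `scale_factor` closer to 1 produces more pyramid levels and more work for P-Net.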
    
<img src="resource/mtcnnmodel.png" style="width:400px"></img>

- Install the `mtcnn` library (created by *ipazc/MTCNN*), which ships with a **pre-trained** model

In [None]:
! pip install mtcnn

- Detect faces in an **image** using `mtcnn`

In [None]:
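import cv2
import matplotlib.pyplot as plt  # used later to display the annotated image
from mtcnn.mtcnn import MTCNN

# Create the detector once and reuse it for every image and frame
detector_mtcnn = MTCNN()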

In [None]:
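# cv2.imread returns BGR; the mtcnn README examples pass RGB images to the detector
img = cv2.cvtColor(cv2.imread("lena.jpg"), cv2.COLOR_BGR2RGB)
faces = detector_mtcnn.detect_faces(img)
for face in faces:
    print(face)  # dict with 'box', 'confidence' and 'keypoints'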
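
Each detection printed above is a dictionary whose `'box'` entry is `[x, y, width, height]` and whose `'confidence'` entry is a value between 0 and 1. The sketch below shows one way to keep only confident detections and to clip their boxes to the image, since a box can in some cases extend slightly past the image border. The helper name `usable_boxes`, the 0.9 threshold, and the sample detections are all hypothetical.

```python
def usable_boxes(faces, img_width, img_height, min_confidence=0.9):
    """Return (x1, y1, x2, y2) corner boxes for confident detections, clipped to the image."""
    boxes = []
    for face in faces:
        if face['confidence'] < min_confidence:
            continue  # drop weak detections
        x, y, w, h = face['box']
        # Clip the corners so they can be used safely for slicing or drawing.
        x1, y1 = max(0, x), max(0, y)
        x2, y2 = min(img_width, x + w), min(img_height, y + h)
        if x2 > x1 and y2 > y1:  # skip boxes that end up empty
            boxes.append((x1, y1, x2, y2))
    return boxes


# Hypothetical detections in the same layout as the detect_faces output
sample = [
    {'box': [-5, 10, 50, 60], 'confidence': 0.99},
    {'box': [30, 30, 20, 20], 'confidence': 0.40},
]
print(usable_boxes(sample, img_width=100, img_height=100))  # [(0, 10, 45, 70)]
```

With a real image, `img_width` and `img_height` would come from `img.shape[1]` and `img.shape[0]`.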

In [None]:
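img = cv2.imread("lena.jpg")
# Detect on an RGB copy; keep the BGR image for drawing with OpenCV
faces = detector_mtcnn.detect_faces(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
for face in faces:
    x, y, w, h = face['box']  # top-left corner, width, height
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 255), 2)
    print("confidence : %.2f%%" % (face['confidence'] * 100))

# Reverse the channel axis (BGR -> RGB) so matplotlib shows the right colors
plt.imshow(img[:, :, ::-1])
plt.show()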
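
Besides the box, each detection carries a `'keypoints'` dictionary with the landmark positions (`'left_eye'`, `'right_eye'`, `'nose'`, `'mouth_left'`, `'mouth_right'`) as `(x, y)` pixel coordinates. As a small illustration of what the landmarks make possible, the sketch below estimates how far a face is tilted from the line through the eyes. The helper name `eye_roll_degrees` and the sample coordinates are hypothetical.

```python
import math


def eye_roll_degrees(keypoints):
    """Angle of the line through both eyes, in degrees (0 means the eyes are level)."""
    lx, ly = keypoints['left_eye']
    rx, ry = keypoints['right_eye']
    # Image y grows downward, so a positive angle means the right eye sits lower.
    return math.degrees(math.atan2(ry - ly, rx - lx))


# Hypothetical landmarks in the same layout as face['keypoints']
level = {'left_eye': (100, 100), 'right_eye': (200, 100)}
tilted = {'left_eye': (100, 100), 'right_eye': (200, 200)}
print(eye_roll_degrees(level), eye_roll_degrees(tilted))
```

Rotating the face crop by the negative of this angle is one simple way to level the eyes before passing the face to a later stage such as recognition.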

> **IMPORTANT NOTICE**: *MTCNN requires a 3-channel color image (a 3D array of height × width × channels), which means we do not need to convert the image to grayscale beforehand.*

- Detect faces in **video frames** using `mtcnn`

In [None]:
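def draw_ped(img, label, x0, y0, xt, yt, color=(255,127,0), text_color=(255,255,255)):
    """Draw a labeled box on img: (x0, y0) is the top-left corner, (xt, yt) the bottom-right."""
    # Measure the label so the box and the label background can be sized to fit it
    (w, h), baseline = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
    # Outline of the detection, widened if the label is wider than the box
    cv2.rectangle(img,
                  (x0, y0 + baseline),
                  (max(xt, x0 + w), yt),
                  color,
                  2)
    # Filled background behind the label text (thickness -1 fills the rectangle)
    cv2.rectangle(img,
                  (x0, y0 - h),
                  (x0 + w, y0 + baseline),
                  color,
                  -1)
    # Label text; (x0, y0) is the bottom-left corner of the text
    cv2.putText(img,
                label,
                (x0, y0),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.5,
                text_color,
                1,
                cv2.LINE_AA)
    return img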

In [None]:
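cap = cv2.VideoCapture(0)  # 0 selects the default camera

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # stop when no frame could be read
        break

    # Detect on an RGB copy (OpenCV frames are BGR) and draw on the original frame
    faces = detector_mtcnn.detect_faces(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    for face in faces:
        x, y, w, h = face['box']
        label_text = "face (%.2f %%)" % (face['confidence'] * 100)
        frame = draw_ped(frame, label_text, x, y, x + w, y + h,
                         color=(0, 255, 255), text_color=(50, 50, 50))

    cv2.imshow("Face detection MTCNN", frame)

    # Press 'q' to quit; the mask keeps only the low byte of the key code
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Free the camera and close the preview window
cap.release()
cv2.destroyAllWindows()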