# FACE AND EYE RECOGNITION USING VIOLA&JONES (HAAR CASCADE CLASSIFIER)

This notebook includes a code for face and eye recognition from our webcam images using Haar Cascade classifier. Once eyes are extracted from the original image, that images is cropped and used to predict, using our model trainned with our dataset, if the eyes are opened or closed.

Haar Cascade have some issues detecting closed eyes as eyes, also it can detect noise holes, chin or mouth as eyes and our model also have some problems classifing this closed eyes as closed, but its an initial code.

Object Detection using Haar feature-based cascade classifiers is an effective object detection method proposed by Paul Viola and Michael Jones in their paper, "Rapid Object Detection using a Boosted Cascade of Simple Features" in 2001. It is a machine learning based approach where a cascade function is trained from a lot of positive and negative images. It is then used to detect objects in other images.

Initially, the algorithm needs a lot of positive images (images of faces) and negative images (images without faces) to train the classifier. Then we need to extract features from it. For this, Haar features shown in the below image are used. They are just like our convolutional kernel. Each feature is a single value obtained by subtracting sum of pixels under the white rectangle from sum of pixels under the black rectangle.

<img src="notebook_images/haar.jpeg">

OpenCV library includes some models already trained for many object detections, for our case there are models for face and eye detection, that can be easily used with Python.

This is a old model that gives quite good results and can be used at real-time applications, but Haar cascades are notoriously prone to false-positives — the Viola-Jones algorithm can easily report a face in an image when no face is present.

More modern and accurate models must be taken into account, as this method has now been far surpassed by other methods, such as using Histogram of Oriented Gradients (HOG) + Linear SVM and deep learning (CNN, YOLO). 

In [6]:
import numpy as np
import cv2
from keras.models import load_model
from skimage.transform import resize
import os
import matplotlib.pyplot as plt
import pandas as pd
from tensorflow.keras.preprocessing.image import load_img
from tensorflow.keras.preprocessing.image import img_to_array
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix, classification_report
import time
import torch

In [7]:
path = r"C:\Users\guill\Desktop\INDIZEN\Capstone\Repositorio\CV_Capstone\eye classifier notebooks\videos"
model = torch.hub.load('ultralytics/yolov5', 'custom', path='model/yolo.pt', force_reload=True)
clf_model = load_model('model/eye_classifier1_v4.h5')
for file in os.listdir(path):
    cap = cv2.VideoCapture(file)
    frames = 120
    if (cap.isOpened()== False):
        print("Error opening video stream or file")
    while cap.isOpened():
        ret, frame = cap.read()
        # if frame is read correctly ret is True
        if not ret:
            print("Can't receive frame (stream end?). Exiting ...")
            break
        if frames == 120:
            start = time.time()
        elif frames == 0:
            frames = 121
            end = time.time()
            seconds = end - start
            print ("Time taken : {0} seconds".format(seconds))
            # Calculate frames per second
            fps  = 120 / seconds
            print("Estimated frames per second : {0}".format(fps))
        result = model(frame)        
        df = result.pandas().xyxy[0]
        if len(df)>2:
            df2 = df
        for j in range(len(df)):
            if float(df["confidence"][j]) > 0.5:
                xmin = int(df["xmin"][j])
                xmax = int(df["xmax"][j])
                ymin = int(df["ymin"][j])
                ymax = int(df["ymax"][j])
                eye_image = frame[ymin:ymax, xmin:xmax]
                eye_scaled = resize(eye_image, (80, 80), preserve_range=True).astype(np.uint8)
                eye_scaled_norm = eye_scaled.astype("float32") / 255
                out_probabilities = clf_model.predict(np.reshape(eye_scaled_norm,(1,80,80,3)))
                result = "OPENED" + str(out_probabilities[0][0]) if out_probabilities[0][0] > 0.5 else "CLOSED " + str(out_probabilities[0][0])
                result = "OPENED" if out_probabilities[0][0] > 0.5 else "CLOSED "
                text_x = int(xmin)
                text_y = int(ymin-20)
                #DRAW TEXT OVER EYES
                cv2.rectangle(frame,(xmin,ymin),(xmax,ymax),color=(255, 0, 0), thickness=3)
                cv2.putText(frame, result, (text_x, text_y), cv2.FONT_HERSHEY_PLAIN, 1, (255,0,0), 1, cv2.LINE_AA)
        cv2.imshow('frame', frame)
        frames = frames - 1
        if cv2.waitKey(1) == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()

Downloading: "https://github.com/ultralytics/yolov5/archive/master.zip" to C:\Users\guill/.cache\torch\hub\master.zip
[31m[1mrequirements:[0m torchvision>=0.8.1 not found and is required by YOLOv5, attempting auto-update...
[31m[1mrequirements:[0m Command 'pip install 'torchvision>=0.8.1' ' returned non-zero exit status 1.
YOLOv5  2022-5-26 Python-3.8.5 torch-1.10.2+cpu CPU

Fusing layers... 
Model summary: 213 layers, 7012822 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape... 


Error opening video stream or file
Error opening video stream or file
Error opening video stream or file
Error opening video stream or file
Error opening video stream or file
Error opening video stream or file
Error opening video stream or file
Error opening video stream or file
Error opening video stream or file


In [5]:
file = "prueba_mauricio2.mp4"
model = torch.hub.load('ultralytics/yolov5', 'custom', path='model/yolo.pt', force_reload=True)
clf_model = load_model('model/eye_classifier1_v4.h5')
cap = cv2.VideoCapture(file)
frames = 120
if (cap.isOpened()== False):
    print("Error opening video stream or file")
while cap.isOpened():
    ret, frame = cap.read()
    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    if frames == 120:
        start = time.time()
    elif frames == 0:
        frames = 121
        end = time.time()
        seconds = end - start
        print ("Time taken : {0} seconds".format(seconds))
        # Calculate frames per second
        fps  = 120 / seconds
        print("Estimated frames per second : {0}".format(fps))
    result = model(frame)        
    df = result.pandas().xyxy[0]
    if len(df)>2:
        df2 = df
    for j in range(len(df)):
        if float(df["confidence"][j]) > 0.5:
            xmin = int(df["xmin"][j])
            xmax = int(df["xmax"][j])
            ymin = int(df["ymin"][j])
            ymax = int(df["ymax"][j])
            eye_image = frame[ymin:ymax, xmin:xmax]
            eye_scaled = resize(eye_image, (80, 80), preserve_range=True).astype(np.uint8)
            eye_scaled_norm = eye_scaled.astype("float32") / 255
            gray = cv2.cvtColor(eye_scaled_norm, cv2.COLOR_BGR2GRAY)
            out_probabilities = clf_model.predict(np.reshape(gray,(1,80,80,1)))
            result = "OPENED" + str(out_probabilities[0][0]) if out_probabilities[0][0] > 0.5 else "CLOSED " + str(out_probabilities[0][0])
            result = "OPENED" if out_probabilities[0][0] > 0.5 else "CLOSED "
            text_x = int(xmin)
            text_y = int(ymin-20)
            #DRAW TEXT OVER EYES
            cv2.rectangle(frame,(xmin,ymin),(xmax,ymax),color=(255, 0, 0), thickness=3)
            cv2.putText(frame, result, (text_x, text_y), cv2.FONT_HERSHEY_PLAIN, 1, (255,0,0), 1, cv2.LINE_AA)
    cv2.imshow('frame', frame)
    frames = frames - 1
    if cv2.waitKey(1) == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()

Downloading: "https://github.com/ultralytics/yolov5/archive/master.zip" to C:\Users\guill/.cache\torch\hub\master.zip
[31m[1mrequirements:[0m torchvision>=0.8.1 not found and is required by YOLOv5, attempting auto-update...
[31m[1mrequirements:[0m Command 'pip install 'torchvision>=0.8.1' ' returned non-zero exit status 1.
YOLOv5  2022-5-25 Python-3.8.5 torch-1.10.2+cpu CPU

Fusing layers... 
Model summary: 213 layers, 7012822 parameters, 0 gradients, 15.8 GFLOPs
Adding AutoShape... 


Time taken : 34.466389656066895 seconds
Estimated frames per second : 3.4816527404655853
