# Eye Blinking Detection

Notebook by [Prashant Brahmbhatt](https://www.github.com/hashbanger)

For detection we have to compute a metric called the **Eye Aspect Ratio (EAR)**.  
More on this can be found in [this paper](http://vision.fe.uni-lj.si/cvww2016/proceedings/papers/05.pdf).

Traditional approaches to blink detection involve steps such as:  
- Eye localization.
- Thresholding to find the whites of the eyes.
- Determining if the “white” region of the eyes disappears for a period of time (indicating a blink).

The EAR instead requires only a simple calculation based on the ratio of distances between the facial landmarks of the eyes, which makes it both fast and efficient.

Our process involves two steps:
- performing facial landmark detection
- detecting blinks in video streams

## The Eye Aspect Ratio

Each eye is represented by 6 (x, y) coordinates, starting at the left corner of the eye and working clockwise around it.  
![eye](eye.jpg)

From the referenced paper, we can take away one equation for EAR:  
![ear](ear.png)      

The formula above is the ratio of the eye's vertical distance to its horizontal distance. The 2 in the denominator is there because the numerator sums two pairs of vertical points, while the denominator uses only one pair of horizontal points.
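
For readers who cannot see the image, the equation can be written out as follows. Here $p_1, \dots, p_6$ are the six eye landmarks in the order described above, so `eye[0]` in the `eye_aspect_ratio()` function of this notebook corresponds to $p_1$:

$$
\mathrm{EAR} = \frac{\lVert p_2 - p_6 \rVert + \lVert p_3 - p_5 \rVert}{2\,\lVert p_1 - p_4 \rVert}
$$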

The EAR stays almost constant while the eye is open, but drops rapidly as the eye closes because the vertical distances approach 0. When the eye opens again, the EAR rises back to almost the same level. This drop followed by a recovery indicates a blink.
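
To make this concrete, here is a small sketch with made-up landmark coordinates (they are not taken from a real face). The helper name `ear_from_points` and the point values are assumptions for illustration, and the sketch uses `math.dist` from the standard library (Python 3.8+) in place of SciPy.

```python
import math

def ear_from_points(eye):
    """EAR for six (x, y) landmarks ordered p1..p6 (hypothetical helper)."""
    a = math.dist(eye[1], eye[5])  # vertical distance p2-p6
    b = math.dist(eye[2], eye[4])  # vertical distance p3-p5
    c = math.dist(eye[0], eye[3])  # horizontal distance p1-p4
    return (a + b) / (2.0 * c)

# Made-up image coordinates (y grows downwards), same eye width in both cases.
open_eye = [(0, 10), (2, 8), (4, 8), (6, 10), (4, 12), (2, 12)]
closed_eye = [(0, 10), (2, 9.8), (4, 9.8), (6, 10), (4, 10.2), (2, 10.2)]

open_ear = ear_from_points(open_eye)
closed_ear = ear_from_points(closed_eye)
print(f"open: {open_ear:.3f}, closed: {closed_ear:.3f}")
# prints: open: 0.667, closed: 0.067
```

The horizontal distance is the same in both cases, so the drop in EAR comes entirely from the shrinking vertical distances.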

## Detecting blinks and facial landmarks using OpenCV

The following code is to be written in a script `detect_blinks.py`.

In [None]:
from scipy.spatial import distance as dist
from imutils.video import FileVideoStream
from imutils.video import VideoStream
from imutils import face_utils
import numpy as np
import argparse
import imutils
import time
import dlib
import cv2

Now we have to define the `eye_aspect_ratio()` function

In [None]:
def eye_aspect_ratio(eye):
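    '''Take the array of six eye coordinates and return the eye aspect ratio'''
    # distances between the two pairs of vertical eye landmarks
    A = dist.euclidean(eye[1], eye[5])
    B = dist.euclidean(eye[2], eye[4])
    # distance between the horizontal eye landmarks (the eye corners)
    C = dist.euclidean(eye[0], eye[3])

    ear = (A + B) / (2.0 * C)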
    
    return ear

Now we parse our command line arguments

In [None]:
ap = argparse.ArgumentParser()
ap.add_argument('-p','--shape-predictor', required = True,
               help = 'path to the facial landmark predictor')
ap.add_argument('-v','--video', default = '',
               help = 'path to input video file') # leave this out for a live video stream
args = vars(ap.parse_args())
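
As a quick illustration of what `args` ends up holding, the sketch below feeds an explicit argument list to a simplified copy of the parser instead of reading the real command line. The file names `predictor.dat` and `blink.mp4` are placeholders, and `--video` is left optional here as an assumption. Note how argparse turns the dash in `--shape-predictor` into an underscore in the dictionary key.

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-p", "--shape-predictor", required=True,
                help="path to the facial landmark predictor")
ap.add_argument("-v", "--video",
                help="path to input video file")

# An explicit list stands in for the real command line (placeholder file names).
demo_args = vars(ap.parse_args(["-p", "predictor.dat", "-v", "blink.mp4"]))
print(demo_args)
# prints: {'shape_predictor': 'predictor.dat', 'video': 'blink.mp4'}
```

This is why the rest of the script reads the predictor path as `args['shape_predictor']` rather than `args['shape-predictor']`.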

The next step is to set up two constants, which may need tuning for your use case, and to declare two variables.  
- The first constant is for setting the default threshold value.  
- The second constant is the number of frames the EAR should be below threshold to consider it a blink.

In [7]:
EYE_AR_THRESH = 0.3
EYE_AR_CONSEC_FRAMES = 3

# COUNTER: consecutive frames with the EAR below the threshold
# TOTAL: number of blinks detected so far
COUNTER = 0
TOTAL = 0

Now we initialize the dlib face detector and facial landmark detector

In [None]:
print('Loading the facial landmark predictor ###########')
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(args['shape_predictor'])

Since the dlib facial landmark predictor returns all 68 (x, y) coordinates, we need to slice out the coordinates for each of our eyes.  
We can use the index ranges provided by `face_utils` to get the coordinates for the eyes.

In [9]:
(lStart, lEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
(rStart, rEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]
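
The sketch below shows how such index pairs are used for slicing, without needing dlib or imutils installed. The index values in `ASSUMED_IDXS` are an assumption about what imutils stores for the 68-point model, so check `face_utils.FACIAL_LANDMARKS_IDXS` in your installed version. The `shape` list is a stand-in for the real landmark array.

```python
# Assumed index ranges for the two eyes in the 68-point model;
# verify against face_utils.FACIAL_LANDMARKS_IDXS in your installation.
ASSUMED_IDXS = {"right_eye": (36, 42), "left_eye": (42, 48)}

# Stand-in for the (68, 2) landmark array: point i is simply (i, i).
shape = [(i, i) for i in range(68)]

(lStart, lEnd) = ASSUMED_IDXS["left_eye"]
(rStart, rEnd) = ASSUMED_IDXS["right_eye"]

left_eye = shape[lStart:lEnd]    # the end index is exclusive
right_eye = shape[rStart:rEnd]
print(len(left_eye), len(right_eye))
# prints: 6 6
```

Each slice yields exactly the six points that `eye_aspect_ratio()` expects.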

To use the live stream from a built-in webcam instead of a video file, we need the line  
        `vs = VideoStream(src=0).start()`

For a Raspberry Pi camera module, use  
        `vs = VideoStream(usePiCamera=True).start()`

In [None]:
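if args.get("video"):
    # a video path was given: read frames from the file
    vs = FileVideoStream(args["video"]).start()
    fileStream = True
else:
    vs = VideoStream(src=0).start()  # for built-in camera
    # vs = VideoStream(usePiCamera=True).start()  # for Raspberry Pi camera module
    fileStream = False  # live stream
time.sleep(1.0)  # give the stream a moment to start delivering frames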

Now we write the main code for our script

In [None]:
# Looping over the frames of the video stream
while True:
    # if this is a file video stream, then we need to check if
    # there are any more frames left in the buffer to process
    if fileStream and not vs.more():
        break
        
    # grab the frame from the threaded video stream and stop if
    # no frame is available
    frame = vs.read()
    if frame is None:
        break

    # resize the frame and convert it to grayscale
    frame = imutils.resize(frame, width=450)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    rects = detector(gray, 0)

After getting the frame, we loop over each of the detected faces. As suggested in the paper, we average the EAR of both eyes, since a person most likely blinks with both eyes at the same time.

In [None]:
    for rect in rects:
        # determine the facial landmarks and then convert them to a NumPy array
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)

        # extract the left and right eye coordinates, then use the
        # coordinates to compute the eye aspect ratio for both eyes
        leftEye = shape[lStart:lEnd]
        rightEye = shape[rStart:rEnd]

        leftEAR = eye_aspect_ratio(leftEye)
        rightEAR = eye_aspect_ratio(rightEye)

        # averaging the eye aspect ratio together for both eyes
        ear = (leftEAR + rightEAR) / 2.0

Now we simply visualize the detected facial landmarks for the eye regions.   
We compute the convex hull of each eye and draw it on the frame.

In [None]:
        leftEyeHull = cv2.convexHull(leftEye)
        rightEyeHull = cv2.convexHull(rightEye)
        cv2.drawContours(frame, [leftEyeHull], -1, (0, 255, 0), 1 )
        cv2.drawContours(frame, [rightEyeHull], -1, (0, 255, 0), 1 )

We still have to determine whether a blink happened.

In [None]:
        if ear < EYE_AR_THRESH:
            # eyes are closed in this frame: extend the run of closed-eye frames
            COUNTER += 1
        else:
            # eyes are open again: count a blink if they were closed
            # for at least EYE_AR_CONSEC_FRAMES consecutive frames
            if COUNTER >= EYE_AR_CONSEC_FRAMES:
                TOTAL += 1
            COUNTER = 0
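
The counting logic can be tried out without a camera by running it over a list of EAR values. The sketch below wraps the same logic in a hypothetical function `count_blinks`. Its default arguments mirror the constants used in this notebook, and the EAR sequence is made up for illustration.

```python
def count_blinks(ear_values, thresh=0.3, consec_frames=3):
    """Count blinks in a sequence of per-frame EAR values (hypothetical helper)."""
    counter = 0  # consecutive frames below the threshold
    total = 0    # blinks counted so far
    for ear in ear_values:
        if ear < thresh:
            counter += 1
        else:
            # the eye reopened: count the run only if it was long enough
            if counter >= consec_frames:
                total += 1
            counter = 0
    return total

# Made-up EAR values, one per frame.
ears = [0.32, 0.31, 0.12, 0.10, 0.11, 0.33,   # 3 closed frames: counted
        0.30, 0.15, 0.31,                      # 1 closed frame: ignored
        0.09, 0.08, 0.07, 0.10, 0.34]          # 4 closed frames: counted
blinks = count_blinks(ears)
print(blinks)
# prints: 2
```

A blink is only counted once the eye reopens, so a run of closed-eye frames that is still in progress when the video ends is not counted.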

Finally, we need to put the information on the screen.

In [None]:
        cv2.putText(frame, "Blinks : {}".format(TOTAL), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        cv2.putText(frame, "E.A.R : {:.2f}".format(ear), (300, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        
    cv2.imshow("frame", frame)
    key = cv2.waitKey(1) & 0xFF
    
    if key == ord("q"):
        break
        
cv2.destroyAllWindows()
vs.stop()

**References:**  
www.medium.com  
www.pyimagesearch.com  
www.stackoverflow.com

### de nada!