## Real-time blink detection

In this article we will learn how to build a real-time blink detector using computer vision. Along the way we will use a few supporting libraries and a little mathematics, and we will walk through the complete pipeline with a step-by-step breakdown of the code.

## Applications of blink detection

1. **Driver drowsiness detection:** As the name suggests, blink detection is useful in real-world applications that need to tell whether a driver is sleepy by monitoring eye movement or blinks.


2. **Iris tracking:** Another use case is tracking iris movement, for example when building AR-style applications.


3. **Virtual gaming:** We are in the age of virtual reality, and most VR games so far are driven by hand or body movement, but we can also build games that are driven by eye movement.

## How will we achieve this?

* First we extract only the landmark points located around the eyes to get the enclosed eye area, and then we compute the **EAR (Eye Aspect Ratio)**, which tells us whether a blink has occurred.
* Each eye is described by **6 (x, y) coordinates**, starting at the left corner of the eye and proceeding in the **clockwise direction**.
* The EAR expresses the relation between the **height and width** of the eye as measured from these coordinates.
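
The relation in the last bullet can be written out explicitly. Labelling the six landmarks $p_1$ to $p_6$ (a notation assumed here; it corresponds to `eye[0]` to `eye[5]` in the EAR function later in this article), with $p_1$ and $p_4$ as the two eye corners, the ratio computed by that function is:

$$
\mathrm{EAR} = \frac{\lVert p_2 - p_6 \rVert + \lVert p_3 - p_5 \rVert}{2\,\lVert p_1 - p_4 \rVert}
$$

The numerator sums the two vertical distances and the denominator is twice the horizontal distance, so the value drops toward zero as the eyelids close while the eye width stays roughly the same.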

## Let's start by importing the required libraries

In [1]:
from scipy.spatial import distance as dist
from imutils.video import FileVideoStream
from imutils.video import VideoStream
from imutils import face_utils
import numpy as np
import imutils
import time
import dlib
import cv2

* **distance:** This SciPy module helps us compute the **Euclidean distance**, so we do not have to write the calculation by hand.
* **FileVideoStream:** This class streams frames from a **video file** on disk (.mp4 or another format).
* **VideoStream:** This class streams **real-time** video from the **webcam**.
* **face_utils:** This module provides helpers for working with facial landmarks (here, the eyes).
* **numpy:** This library provides **array** types and other mathematical operations.
* **imutils:** This library provides convenience functions for OpenCV, such as resizing a frame.
* **time:** This module gives access to the system time and to delays, i.e. the **sleep function**.
* **dlib:** This is the heart of the application: it gives us access to the **68 landmarks** of the face in real time.
* **cv2:** OpenCV, the computer vision library we use for image processing.

## Eye aspect ratio (EAR) function

In [2]:
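def eye_aspect_ratio(eye):
    # Vertical distances between the upper and lower eyelid landmarks
    A = dist.euclidean(eye[1], eye[5])
    B = dist.euclidean(eye[2], eye[4])

    # Horizontal distance between the two eye corners
    C = dist.euclidean(eye[0], eye[3])

    # Sum of the vertical distances divided by twice the eye width
    ear = (A + B) / (2.0 * C)

    return ear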

Code breakdown:

1. The function computes the **Euclidean distance** between pairs of eye landmark coordinates.
2. First it computes the two distances between the **vertical eye landmarks** (`A` and `B`).
3. Then it computes the distance between the **horizontal eye landmarks** (`C`) in the same way.
4. From these distances it calculates the **Eye Aspect Ratio**: the sum of the vertical distances divided by twice the horizontal distance.
5. Finally it **returns the EAR.**
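
To see how the ratio behaves, here is a small sketch with made-up coordinates (not real dlib output). It assumes Python 3.8+ and uses the standard library's `math.dist` as a stand-in for `dist.euclidean` so that it runs without SciPy; the helper name `ear_from_points` is hypothetical.

```python
import math

def ear_from_points(eye):
    """Same formula as eye_aspect_ratio, using math.dist instead of SciPy."""
    a = math.dist(eye[1], eye[5])  # first vertical distance
    b = math.dist(eye[2], eye[4])  # second vertical distance
    c = math.dist(eye[0], eye[3])  # horizontal distance between the corners
    return (a + b) / (2.0 * c)

# Illustrative landmarks in image coordinates (y grows downward),
# ordered clockwise starting at the left corner.
open_eye = [(0, 0), (2, -1), (4, -1), (6, 0), (4, 1), (2, 1)]
closed_eye = [(0, 0), (2, -0.3), (4, -0.3), (6, 0), (4, 0.3), (2, 0.3)]

open_ear = ear_from_points(open_eye)
closed_ear = ear_from_points(closed_eye)
print(f"open: {open_ear:.2f}, closed: {closed_ear:.2f}")
```

With these values the open eye gives an EAR of about 0.33 and the nearly closed eye about 0.10, so a threshold between the two separates the states.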

## Define constants

In [3]:
EYE_AR_THRESH = 0.3
EYE_AR_CONSEC_FRAMES = 3

* **Eye aspect ratio constant:** This value is the threshold for detecting a closed eye: when the EAR falls below it, the eye is treated as closed.
* **Count of frames:** This value is the minimum number of consecutive frames the EAR must stay below the threshold for the closure to count as a blink.

## Initializing the variables

In [4]:
COUNTER = 0
TOTAL = 0

* **Counter:** The number of consecutive frames in which the EAR has been below the `EYE_AR_THRESH` constant.
* **Total:** The total number of blinks detected so far.
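
The way these two variables interact with the two constants can be tried out without a camera. The sketch below replays the same update rule that the main loop of this article uses, over a made-up list of per-frame EAR values; the function name `count_blinks` and the sample values are assumptions for illustration only.

```python
def count_blinks(ear_values, thresh=0.3, consec_frames=3):
    """Replay the counter logic over a list of per-frame EAR values."""
    counter = 0  # consecutive frames with EAR below the threshold
    total = 0    # blinks counted so far
    for ear in ear_values:
        if ear < thresh:
            counter += 1
        else:
            # Eye reopened: count a blink only if it stayed closed long enough
            if counter >= consec_frames:
                total += 1
            counter = 0
    return total

# Made-up sequence: one 3-frame closure, then a single-frame dip
ears = [0.35, 0.34, 0.20, 0.18, 0.19, 0.33, 0.20, 0.34, 0.35]
blinks = count_blinks(ears)
print(blinks)
```

Only the three-frame closure is counted; the single-frame dip is ignored as noise. Note that a blink is registered when the eye reopens, so a closure still in progress at the end of the sequence is not counted.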

## Initialize dlib's face detector

In [5]:
print("Loading the dlib's face detector")
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

Loading the dlib's face detector


* **detector**: Here we initialize dlib's **frontal face detector**, which finds the faces in a frame.
* **predictor**: Here we use **shape_predictor** to load the pre-trained **.dat** model file, which predicts the landmarks for each detected face. The file must be available at the given path.
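
Since `dlib.shape_predictor` needs the model file to exist at the given path, it can help to check for it before loading. This is a minimal sketch, assuming the file sits in the working directory as in the cell above; `require_file` and `MODEL_PATH` are hypothetical names and not part of dlib.

```python
import os

# Assumed location: same directory the script is run from
MODEL_PATH = "shape_predictor_68_face_landmarks.dat"

def require_file(path):
    """Return the path if it points to an existing file, else raise."""
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"Model file not found: {path!r}. Place it in the working "
            "directory or pass the full path."
        )
    return path

# Intended use (needs dlib and the model file, so left commented out here):
# predictor = dlib.shape_predictor(require_file(MODEL_PATH))
```

This way a missing file is reported with a clear message before any video processing starts.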

## Get the indices of the facial landmarks (here, the eyes)

In [6]:
(lStart, lEnd) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
(rStart, rEnd) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]

* First we grab the start and end landmark indices of the **left eye** from the **face_utils** mapping.
* Then we do the same for the **right eye**.
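
Each entry is a `(start, end)` pair meant for slicing the landmark array. The sketch below uses a dummy landmark list to show that each slice yields the six eye points. The index values are what we understand imutils to ship for the 68-point model, so treat them as an assumption and print `face_utils.FACIAL_LANDMARKS_IDXS` to confirm them for your installed version.

```python
# Index ranges assumed for imutils' 68-point mapping (verify locally)
LEFT_EYE = (42, 48)
RIGHT_EYE = (36, 42)

# Stand-in for the 68 (x, y) landmarks returned for one face
shape = [(i, i) for i in range(68)]

(lStart, lEnd) = LEFT_EYE
(rStart, rEnd) = RIGHT_EYE

left_eye = shape[lStart:lEnd]    # end index is exclusive
right_eye = shape[rStart:rEnd]

print(len(left_eye), len(right_eye))
```

Both slices contain six points, which is exactly what the EAR function expects as input.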

## Loading the video / real-time stream

In [7]:
print("Starting the video/live streaming")
# Option 1: read from a video file
vs = FileVideoStream("Video.mp4").start()
fileStream = True
# Option 2: read from a camera (comment out the two lines above and uncomment below)
# vs = VideoStream(src = 0).start() # run this line if you want to run it on webcam.
# vs = VideoStream(usePiCamera = True).start() # Raspberry Pi camera module
# fileStream = False
time.sleep(1.0)

Starting the video/live streaming


Code breakdown:

1. When using **FileVideoStream**, we initialize the object with the location of the video file and then call **start()** on it.
2. We set **fileStream** to **True** so that the main loop knows it is reading from a file and can stop once the frames run out.
3. If we want **real-time** streaming instead, we use **VideoStream(src=0).start()** and set **fileStream** to **False**.
4. Then we use the **sleep** function from the **time** library to pause for a second and give the stream time to start up.

## Main logic

In [8]:
while True:

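    # When reading from a file, stop once the buffer has no more frames
    if fileStream and not vs.more():
        break

    frame = vs.read()
    # read() may return None, e.g. at the end of a file or if the camera fails
    if frame is None:
        break

    frame = imutils.resize(frame, width = 450)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)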

    rects = detector(gray, 0)

    for rect in rects:

        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)

        leftEye = shape[lStart:lEnd]
        rightEye = shape[rStart:rEnd]
        leftEAR = eye_aspect_ratio(leftEye)
        rightEAR = eye_aspect_ratio(rightEye)

        ear = (leftEAR + rightEAR) / 2.0

        leftEyeHull = cv2.convexHull(leftEye)
        rightEyeHull = cv2.convexHull(rightEye)

        cv2.drawContours(frame, [leftEyeHull], -1, (0, 255, 0), 1)
        cv2.drawContours(frame, [rightEyeHull], -1, (0, 255, 0), 1)


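        # Eyes are treated as closed while the EAR stays below the threshold
        if ear < EYE_AR_THRESH:
            COUNTER += 1

        else:
            # Eyes reopened: count a blink if they were closed long enough
            if COUNTER >= EYE_AR_CONSEC_FRAMES:
                TOTAL += 1

            # Reset the eye frame counter
            COUNTER = 0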

        cv2.putText(frame, "Blinks:{}".format(TOTAL), (10, 30), cv2.FONT_HERSHEY_COMPLEX, 0.7, (0, 0, 255), 2)
        cv2.putText(frame, "EAR:{:.2f}".format(ear), (300, 30), cv2.FONT_HERSHEY_COMPLEX, 0.7, (0, 0, 255), 2)

    cv2.imshow("Frame", frame)
    key = cv2.waitKey(12) & 0xFF

    if key == ord("q"):
        break

cv2.destroyAllWindows()
vs.stop()

Code breakdown:

1. First we loop over the video stream, and when reading from a file we also check whether there are any **more frames left in the buffer**.
2. We read the next **frame** from the video/live stream, **resize** it to the desired width, and convert it to **grayscale**.
3. Using the **detector** we detect the faces in the grayscale frame.
4. For each face, the **predictor** finds the facial landmarks, which we convert into a **NumPy array**.
5. We slice out the **left and right** eye coordinates, compute the **Eye Aspect Ratio** for each eye, and average the two values.
6. We compute the **convex hull** of each eye and draw it with **drawContours** so that we can visualise the eyes.
7. If the **EAR** is below the **threshold value**, we increment the **consecutive-frame counter** (`COUNTER`).
8. Otherwise the eyes are open again: if they had been closed for at least `EYE_AR_CONSEC_FRAMES` frames we increment the **number of blinks** (`TOTAL`), and in either case we reset the counter to zero.
9. Using the **putText** method we draw the **number of blinks** and the **Eye Aspect Ratio (EAR) value** on each frame.
10. Finally, we display the frame with **imshow**, exit the loop when **q** is pressed, and clean up by destroying all the windows and stopping the stream.

## Conclusion

1. First we looked at real-world applications of **blink detection**, and then we saw in a nutshell what we were going to build.
2. The key takeaway from this article is how to **segment the eyes** using their landmark coordinates.
3. We learnt about the concept of **Euclidean distance** and how to compute it with a library function.
4. We also came across the concept of the **Eye Aspect Ratio (EAR)**, which is the soul of this application.
5. Finally, we learnt how the **dlib** library detects the landmarks of the face, and how to read video files as well as live streams.

Video link:
https://usercontent.one/wp/www.computervision.zone/wp-content/uploads/2022/01/Video.mp4?media=1632743877