# <font style = "color:rgb(50,120,229)">Blink and Drowsiness Detection</font>

# <font style = "color:rgb(50,120,229)">Overview</font>

In this module, we will learn two simple but impactful applications involving analysis of the eye region. 

The first application is blink detection, which has a variety of use cases. It can serve as a Human-Computer Interface (HCI), especially for people with severe paralysis or conditions such as multiple sclerosis or cerebral palsy, who may be unable to move most parts of their body except the eyes. Their eye movements and blinks can be tracked to understand what they want to say or how they feel.

A fun use case is to use a blink as the signal for the camera app to take a selfie, since pressing the button on the camera can shake it a bit, no matter how stable the hand is.

Another related application is drowsy driver detection. We all like to go on long drives, but fatigue is inseparable from them, which brings us to the problem of drowsy driving. The risks and sometimes tragic results of drowsy driving are alarming. The National Highway Traffic Safety Administration conservatively estimates that 100,000 police-reported crashes each year are the direct result of driver fatigue, resulting in an estimated 1,550 deaths, 71,000 injuries, and \$12.5 billion in monetary losses.

In the following sections, we will see how to create a simple application for detecting blinks and drowsiness using OpenCV. Once you are done with it, you can make your own modifications to the code and build a mobile application for iOS or Android, or a standalone gadget using hardware such as a Raspberry Pi that can be mounted on the car dashboard.

# <font style = "color:rgb(50,120,229)">Approach</font>

The first thing we need to do is check the status of the eye: whether it is open or closed. Many methods have been proposed in the literature for this. One method checks whether the pupil is visible by detecting the black circle in the eye region. Another finds the eye region and counts the white pixels: if the number of white pixels is above a threshold, the eye is considered open.

We will track the eyelids of the person using Dlib’s facial landmark detector. We will use this to decide the status of the eye for every frame and make further decisions such as the blink count and whether the person is drowsy. The steps below lead to our application. After describing the process, we will see how it is done in the code, along with detailed explanations and intermediate results.

## <font style = "color:rgb(50,120,229)">Step 1 : Find the eye region using Dlib Facial Landmark Detector</font>

We will use Dlib’s Facial Landmark Detector for finding the eye region. We could also use OpenCV's Haar Cascade eye detector, but we found it to be less accurate, and we need the exact positions of the eyelids to perform blink detection. The sample image below shows the points detected using Dlib. The left eye points are 36 to 41 and the right eye points are 42 to 47.

<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeregion.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeregion.jpg"/></a></center>
<center>Figure showing the dlib points for the eye region</center>
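
As a small illustration of the index ranges above, the sketch below pulls the two eye contours out of a 68-point landmark list. The function name `eye_points` and the dummy landmark list are assumptions made for this example; in the tutorial, the list comes from Dlib's shape predictor.

```python
# Index ranges for the eyes in Dlib's 68-point model, as stated above
LEFT_EYE_INDEX = list(range(36, 42))    # points 36..41
RIGHT_EYE_INDEX = list(range(42, 48))   # points 42..47

def eye_points(landmarks):
    """Split a list of 68 (x, y) tuples into left-eye and right-eye points."""
    if len(landmarks) != 68:
        raise ValueError("expected 68 landmarks, got {}".format(len(landmarks)))
    left = [landmarks[i] for i in LEFT_EYE_INDEX]
    right = [landmarks[i] for i in RIGHT_EYE_INDEX]
    return left, right

# Dummy landmarks: point i is placed at (i, 2*i) purely for demonstration
dummy = [(i, 2 * i) for i in range(68)]
left_eye, right_eye = eye_points(dummy)
print(left_eye[0], right_eye[-1])
```

Each eye is described by six points, which is enough to outline both eyelids.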

## <font style = "color:rgb(50,120,229)">Step 2 : Find the area enclosed by the eyelids</font>

We want to check whether the eye is open or closed. Using the left / right eye points we can create a mask for the eyelids as shown in the figure below. We can find the number of pixels inside the white region which corresponds to the area enclosed by the eyelids.



| <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeMask1.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeMask1.jpg"/></a></center> | <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeMask2.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeMask2.jpg"/></a></center> |
| -------- | -------- | 
|<center>Mask for eyelids of open eye</center>    | <center>Mask for eyelids of closed eye</center>     | 

## <font style = "color:rgb(50,120,229)">Step 3 : Check the status of the eye (Open / Closed)</font> 

A face close to the camera will have a larger eye area than one far from the camera, so we need to normalize the area for robust decision making. To do so, we take the distance between the two corners of the eye (this length scales with the size of the face) and divide the area of the eye by the square of this distance. The eye is open if the normalized area is greater than a threshold; otherwise it is closed.

We can also use other measurements to normalize the eye area, such as the total area of the face or the area enclosed by points 17, 26 and 28, as long as the measurement reflects the size of the face.
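
To see why dividing by the squared corner-to-corner distance removes the effect of scale, here is a minimal sketch. It is not the tutorial's implementation: instead of counting pixels in a mask, it computes the polygon area directly with the shoelace formula, and it works on a single eye. The function names and the sample coordinates are assumptions made for this example.

```python
import numpy as np

def polygon_area(points):
    """Area of a simple polygon from its ordered vertices (shoelace formula)."""
    pts = np.asarray(points, dtype=np.float64)
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def normalized_eye_area(eye):
    """Eye area divided by the squared corner-to-corner distance.

    `eye` holds six (x, y) points ordered like Dlib points 36..41,
    so index 0 and index 3 are the two corners of the eye.
    """
    pts = np.asarray(eye, dtype=np.float64)
    corner_dist_sq = np.sum((pts[3] - pts[0]) ** 2)
    return polygon_area(pts) / corner_dist_sq

# Hypothetical open eye, and the same eye twice as close to the camera
eye_far = [(0, 0), (10, -5), (20, -5), (30, 0), (20, 5), (10, 5)]
eye_near = [(2 * x, 2 * y) for x, y in eye_far]

print(normalized_eye_area(eye_far), normalized_eye_area(eye_near))
```

The raw area of the nearer eye is four times larger, but the squared corner distance also grows by a factor of four, so the two normalized values are equal.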

## <font style = "color:rgb(50,120,229)">Step 4 : Decision on blink and drowsiness using a Finite State Machine</font>

A blink is expected to last for 300 to 400 milliseconds. So, if the eye remains closed for more than 800-900 ms, we can say that the person is either drowsy or sleeping. On the other hand, if the eye reopens after just 100-200 ms, should it be considered a valid blink or should it be discarded? That depends on the application.

After getting the status of the eye, we need to distinguish three cases: a false blink, a valid blink and drowsiness. To do this in a systematic way, we will use a Finite State Machine ( FSM ) to keep track of the history of the eye status and decide whether a blink should be counted or the person is drowsy.

A Finite State Machine ( FSM ) is a simple model for keeping track of events triggered by external inputs. It uses intermediate states to decide what happens when a particular input arrives and which event is triggered. In our case, the events are blink and drowsiness, and the external input is the status of the eye. State $n$ indicates that the eye has been found closed for $n$ consecutive frames.

Let’s take an example to illustrate the working of the Finite State Machine. First, let’s make the following assumptions for the sake of simplicity.

1. The time between two frames is 100 ms. ( Note : This may vary from computer to computer due to differences in processor speed and camera resolution. )

2. A blink is valid if it lasts for at least 300 ms ( or 3 frames ). ( Note : This may differ from person to person, as some people blink very fast and others more slowly. It also depends on whether you want to record very fast or partial blinks. )

3. The person is drowsy if the eye remains closed for more than 900 ms ( or 9 frames ).

The FSM under the above assumptions for blink and drowsiness detection is drawn below.
<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-FSM.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-FSM.jpg"/></a></center>
<center>Finite state machine showing the state transitions and respective decisions.</center>

The explanation for the above diagram is given below.

1. The input is the eye status and output is the blink or drowsiness status. The bubbles represent the states. The transition from state X to Y is shown by an arrow along with the input ( Open / Closed ) and output ( Blink = 0 / 1 or Drowsiness ).

2. The initial state is 0 and stays there as long as the eye remains open ( O ) and blink is not reported ( B=0 ). This is indicated by the looping arrow on the left of state 0.

3. In state 0, if the eye closes ( C ), the state increments to 1 and still B=0; this is indicated by the arrow on top of state 0. In every subsequent state, if the eye remains closed, the state increments and waits for the next frame.

4. In state 1, if the eye opens ( O ), the blink has lasted for less than 200 ms and is therefore invalid. So, we don’t report the blink ( B=0 ) and reset the state to 0. Similar transitions happen for state 2.

5. In state 3, if the eye opens ( O ), the blink has lasted for at least 300 ms and is therefore valid. The state is reset to 0 and the blink is reported ( B=1 ). This logic applies to all subsequent states up to state 9.

6. If the state reaches 9, it means 900 ms have passed after the eye had closed. Thus if the eye is found closed ( C ) in this state, we report drowsiness ( D=1 ) and stay in this state. This is indicated by the last looping arrow on the right of state 9.

**<font style = "color:rgb(255,0,0)">NOTE</font>** : In real life, the eye closes and opens gradually. In the actual implementation, the eye is considered closed only when it is almost closed ( say > 80% closed ). The time taken for the gradual opening / closing of the eye ( which can be 100-200 ms ) is not accounted for in the above theory. In the actual implementation, we account for this time by setting the minimum blink time below 300 ms ( say 200 ms ).
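
The state diagram described above can be sketched in a few lines of plain Python. This is a toy version under the simplifying assumptions of the example ( 100 ms per frame, 3 frames for a valid blink, 9 frames for drowsiness ); the names `fsm_step` and `run` are made up for this illustration and are not part of the tutorial code.

```python
VALID_BLINK_FRAMES = 3   # 300 ms at 100 ms per frame
DROWSY_FRAMES = 9        # 900 ms at 100 ms per frame

def fsm_step(state, eye_open):
    """Advance the FSM by one frame and return (next_state, blink, drowsy)."""
    if eye_open:
        # Eye reopened: report a blink only if it stayed closed long enough
        blink = 1 if state >= VALID_BLINK_FRAMES else 0
        return 0, blink, 0
    if state >= DROWSY_FRAMES:
        # Eye still closed in the last state: stay there and report drowsiness
        return DROWSY_FRAMES, 0, 1
    # Eye closed: move to the next state and wait for the next frame
    return state + 1, 0, 0

def run(sequence):
    """Feed a sequence of eye statuses (True = open) through the FSM."""
    state, blinks, drowsy_seen = 0, 0, False
    for eye_open in sequence:
        state, blink, drowsy = fsm_step(state, eye_open)
        blinks += blink
        drowsy_seen = drowsy_seen or bool(drowsy)
    return blinks, drowsy_seen

print(run([False, False, True]))        # closed for 2 frames: false blink
print(run([False] * 3 + [True]))        # closed for 3 frames: valid blink
print(run([False] * 10))                # closed for 10 frames: drowsy
```

Keeping the transition logic in a function that takes the state as an argument makes it easy to test with hand-written sequences before connecting it to a camera.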

### <font style = "color:rgb(8,133,37)">Blink Detection and Drowsy Driver Code and Tutorial</font>

In [4]:
import cv2
import dlib
import time
import sys
import numpy as np
import matplotlib.pyplot as plt
from dataPath import DATA_PATH
%matplotlib inline

In [2]:
import matplotlib
matplotlib.rcParams['figure.figsize'] = (6.0,6.0)
matplotlib.rcParams['image.cmap'] = 'gray'

In [5]:
FACE_DOWNSAMPLE_RATIO = 1.5
RESIZE_HEIGHT = 360

thresh = 0.43

#global variables for dlib face landmark detector
modelPath = DATA_PATH + "models/shape_predictor_68_face_landmarks.dat"
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(modelPath)

# dlib points for eyes
leftEyeIndex = [36, 37, 38, 39, 40, 41]
rightEyeIndex = [42, 43, 44, 45, 46, 47]

# Variables for the blink / drowsiness state machine
blinkCount = 0
drowsy = 0
state = 0
blinkTime = 0.2     # 200 ms
drowsyTime = 1.0    # 1000 ms

The function below finds the status of the eye ( Open / Closed ) as described in Steps 2 and 3 above. Note that it takes the size of the mask from the global variable frame.

In [6]:
def checkEyeStatus( landmarks ):

  # Create a black image to be used as a mask for the eyes
  mask = np.zeros(frame.shape[:2], dtype = np.float32)

  # Create a convex hull using the points of the left and right eye
  hullLeftEye = []
  for i in range(0,len(leftEyeIndex)):
    hullLeftEye.append((landmarks[leftEyeIndex[i]][0],landmarks[leftEyeIndex[i]][1]))

  cv2.fillConvexPoly(mask, np.int32(hullLeftEye), 255)

  hullRightEye = []
  for i in range(0,len(rightEyeIndex)):
    hullRightEye.append((landmarks[rightEyeIndex[i]][0], landmarks[rightEyeIndex[i]][1]))

  cv2.fillConvexPoly(mask, np.int32(hullRightEye), 255)

  # cv2.imshow("mask",mask)

  # find the squared distance between the two corners of the left eye
  lenLeftEyeX = landmarks[leftEyeIndex[3]][0] - landmarks[leftEyeIndex[0]][0]
  lenLeftEyeY = landmarks[leftEyeIndex[3]][1] - landmarks[leftEyeIndex[0]][1]

  lenLeftEyeSquare = lenLeftEyeX*lenLeftEyeX + lenLeftEyeY*lenLeftEyeY

  # find the area under the eye region
  eyeRegionCount = cv2.countNonZero(mask)

  # normalize the area by the length of eye
  # The threshold will not work without the normalization
  # the same amount of eye opening will have more area if it is close to the camera
  normalizedCount = eyeRegionCount/np.float32(lenLeftEyeSquare)

  eyeStatus = 1          # 1 -> Open, 0 -> closed
  if (normalizedCount < thresh):
    eyeStatus = 0

  return eyeStatus

<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeMask1.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-eyeMask1.jpg" width=400/></a></center>
<center>Mask created from the left and right eyes</center>

The function below implements the Finite State Machine explained in Step 4 above. For every new frame, it takes the eye status and transitions to the next state with some output. The output ( blink / no blink / drowsy ) depends on the current state and the current input ( eye status ). The limits falseBlinkLimit and drowsyLimit are global variables that are computed from the measured frame rate later in this notebook.

In [7]:
#simple finite state machine to keep track of the blinks. we can change the behaviour as needed.
def checkBlinkStatus(eyeStatus):
  global state,blinkCount,drowsy

  #open state and false blink state
  if( state >=0 and state <= falseBlinkLimit):
    # if eye is open then stay in this state
    if(eyeStatus):
      state = 0
    # else go to next state
    else:
      state += 1

  #closed state for (drowsyLimit - falseBlinkLimit) frames
  elif(state > falseBlinkLimit and state <= drowsyLimit):
    if(eyeStatus):
      state = 0
      blinkCount += 1
    else:
      state += 1

  # Extended closed state -- drowsy
  else:
    if(eyeStatus):
      state = 0
      blinkCount += 1
      drowsy = 0
    else:
      drowsy = 1
  # print("state {}, drowsy {}".format(state, drowsy))

| <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame75.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame75.jpg"/></a></center> | <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame76.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame76.jpg"/></a></center> | <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame77.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame77.jpg"/></a></center> |<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame78.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame78.jpg"/></a></center>|
| -------- | -------- | -------- |--|

<center>Illustration of False Blink</center>

| <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame129.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame129.jpg"/></a></center> | <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame130.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame130.jpg"/></a></center> | <center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame136.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame136.jpg"/></a></center> |<center> <a href="https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame137.jpg"><img src = "https://www.learnopencv.com/wp-content/uploads/2017/12/opcv4face-w4-m6-frame137.jpg"/></a></center>|
| -------- | -------- | -------- |--|

<center>Illustration of Valid Blink</center>

In the code below, we find the facial landmark points using the dlib face detector and shape predictor. Face detection is the slower step, so we run it on a downsampled copy of the image. The detected rectangle is then scaled back to the coordinates of the original image, and the landmark detector is applied to the original image. If no face is detected, the function returns the value 1.

In [8]:
def getLandmarks(im):
  imSmall = cv2.resize(im,None,
                       fx=1.0/FACE_DOWNSAMPLE_RATIO,
                       fy=1.0/FACE_DOWNSAMPLE_RATIO,
                       interpolation = cv2.INTER_LINEAR)
  #detect faces
  rects = detector(imSmall, 0)
  if len(rects) == 0:
    return 1

  #scale the points before sending to the pose predictor as we will send the original image
  newRect = dlib.rectangle(int(rects[0].left()*FACE_DOWNSAMPLE_RATIO),
                           int(rects[0].top()*FACE_DOWNSAMPLE_RATIO),
                           int(rects[0].right()*FACE_DOWNSAMPLE_RATIO),
                           int(rects[0].bottom()*FACE_DOWNSAMPLE_RATIO))

  # Collect the facial landmark points as a list of (x, y) tuples
  points = [(p.x, p.y) for p in predictor(im, newRect).parts()]
  return points
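
The rectangle scaling used above can be tried in isolation. The sketch below uses a plain tuple in place of a dlib rectangle so that it runs without dlib; the helper name `scale_rect` and the sample rectangle are assumptions made for this example.

```python
def scale_rect(rect, ratio):
    """Scale a (left, top, right, bottom) rectangle by `ratio`, truncating to ints."""
    left, top, right, bottom = rect
    return (int(left * ratio), int(top * ratio),
            int(right * ratio), int(bottom * ratio))

FACE_DOWNSAMPLE_RATIO = 1.5

# Hypothetical face rectangle found in the downsampled image
small_rect = (40, 30, 120, 110)
full_rect = scale_rect(small_rect, FACE_DOWNSAMPLE_RATIO)
print(full_rect)
```

Because the image was shrunk by a factor of 1 / FACE_DOWNSAMPLE_RATIO before detection, multiplying each coordinate by FACE_DOWNSAMPLE_RATIO maps the rectangle back onto the original image.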

The code below measures the time per frame ( the inverse of the frame rate ). This is used to get the number of frames that correspond to 200 ms ( blinkTime ) and 1000 ms ( drowsyTime ) on your computer, since the FSM is designed on the basis of frames and not on the basis of time. You can decrease / increase blinkTime to suit your blink speed.

In [None]:
capture = cv2.VideoCapture(0)
#####################################################################################
# Calculate the FPS for initialization
# Different computers will have relatively different speeds
# Since all operations are on frame basis
# We want to find how many frames correspond to the blink and drowsy limit

# Reading some dummy frames to adjust the sensor to the lighting
for i in range(5):
  ret, frame = capture.read()

totalTime = 0.0
validFrames = 0
dummyFrames = 50
spf = 0

while(validFrames < dummyFrames):
  validFrames += 1
  t = time.time()
  ret, frame = capture.read()
  height, width = frame.shape[:2]
  IMAGE_RESIZE = np.float32(height)/RESIZE_HEIGHT
  frame = cv2.resize(frame,None,
                     fx=1.0/IMAGE_RESIZE,
                     fy=1.0/IMAGE_RESIZE,
                     interpolation = cv2.INTER_LINEAR)

  landmarks = getLandmarks(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
  timeLandmarks = time.time() - t

  # if face not detected then dont add this time to the calculation
  if landmarks == 1:
    validFrames -= 1
    cv2.putText(frame, "Unable to detect face, Please check proper lighting", (10, 50), cv2.FONT_HERSHEY_COMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
    cv2.putText(frame, "Or Decrease FACE_DOWNSAMPLE_RATIO", (10, 150), cv2.FONT_HERSHEY_COMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
    cv2.imshow("Blink Detection Demo ",frame)
    if cv2.waitKey(1) & 0xFF == 27:
      sys.exit()
  else:
    totalTime += timeLandmarks

spf = totalTime/dummyFrames

print("Current SPF (seconds per frame) is {:.2f} ms".format(spf*1000) )
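
The idea behind the calibration loop, averaging the processing time over frames where a face was found, can be sketched without a camera. This is only an illustration: `average_seconds_per_frame` and `fake_get_landmarks` are hypothetical names, and the stand-in function simply returns 1 ( the "no face" value used in this tutorial ) for every fifth frame.

```python
import time

def average_seconds_per_frame(process_frame, frames, is_valid):
    """Average processing time over the frames whose result passes is_valid."""
    total, valid = 0.0, 0
    for frame in frames:
        t = time.perf_counter()
        result = process_frame(frame)
        elapsed = time.perf_counter() - t
        if is_valid(result):
            # Only frames with a usable result contribute to the average
            total += elapsed
            valid += 1
    if valid == 0:
        raise RuntimeError("no valid frames to time")
    return total / valid

# Stand-in for landmark detection: fails on every fifth frame
def fake_get_landmarks(frame_id):
    if frame_id % 5 == 0:
        return 1
    return [(frame_id, frame_id)] * 68

spf = average_seconds_per_frame(fake_get_landmarks, range(50), lambda r: r != 1)
print("Estimated SPF: {:.6f} ms".format(spf * 1000))
```

Unlike the loop above, this sketch times a fixed list of frames rather than waiting until a target number of valid frames has been collected.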

The numbers of frames for the false blink limit and the drowsy limit are calculated below. These are used by the FSM in the checkBlinkStatus function.



In [None]:
drowsyLimit = drowsyTime/spf
falseBlinkLimit = blinkTime/spf
print ('''drowsyLimit {} ( {:.2f} ms) ,  
        False blink limit {} ( {:.2f} ms) '''.format(drowsyLimit, 
        drowsyLimit*spf*1000, falseBlinkLimit, 
        (falseBlinkLimit+1)*spf*1000))
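
To get a feel for these numbers, the sketch below repeats the same division for two assumed processing speeds. The helper `frame_limits` and the SPF values of 50 ms and 100 ms are chosen only for illustration; your measured SPF will differ.

```python
def frame_limits(spf, blink_time=0.2, drowsy_time=1.0):
    """Convert time thresholds in seconds into frame counts for a given SPF."""
    false_blink_limit = blink_time / spf
    drowsy_limit = drowsy_time / spf
    return false_blink_limit, drowsy_limit

for spf_example in (0.05, 0.1):
    fb, dr = frame_limits(spf_example)
    print("SPF {:.0f} ms: false blink limit {:.1f} frames, drowsy limit {:.1f} frames".format(
        spf_example * 1000, fb, dr))
```

A faster computer processes more frames in the same time, so the same 200 ms and 1000 ms thresholds map to larger frame counts.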

The main loop is given below. It runs until the Esc key is pressed: it reads a frame and resizes it to a fixed height to reduce variations across computers. ( You can remove this step if you have a fast computer and want to work with full-size images. )

In [None]:
# The main loop
while(1):
  try:
    t = time.time()
    ret, frame = capture.read()
    if not ret:
      # Stop if the camera did not return a frame
      break
    height, width = frame.shape[:2]
    IMAGE_RESIZE = np.float32(height)/RESIZE_HEIGHT
    frame = cv2.resize(frame,None,
                       fx=1.0/IMAGE_RESIZE,
                       fy=1.0/IMAGE_RESIZE,
                       interpolation = cv2.INTER_LINEAR)
    landmarks = getLandmarks(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))

    # if face not detected
    if landmarks == 1:
      cv2.putText(frame, "Unable to detect face, Please check proper lighting", (10, 50), cv2.FONT_HERSHEY_COMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
      cv2.putText(frame, "Or Decrease FACE_DOWNSAMPLE_RATIO", (10, 150), cv2.FONT_HERSHEY_COMPLEX, 0.5, (0, 0, 255), 1, cv2.LINE_AA)
      cv2.imshow("Blink Detection Demo ",frame)
      if cv2.waitKey(1) & 0xFF == 27:
        break
      continue

    # check whether eye is open or closed
    eyeStatus = checkEyeStatus(landmarks)

    # pass the eyestatus to the state machine
    # to determine the blink count and drowsiness status
    checkBlinkStatus(eyeStatus)
    # Plot the eyepoints on the face for showing
    for i in range(0,len(leftEyeIndex)):
      cv2.circle(frame, (landmarks[leftEyeIndex[i]][0],landmarks[leftEyeIndex[i]][1]), 1, (0, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    for i in range(0,len(rightEyeIndex)):
      cv2.circle(frame, (landmarks[rightEyeIndex[i]][0],landmarks[rightEyeIndex[i]][1]), 1, (0, 0, 255), thickness=1, lineType=cv2.LINE_AA)
    if(drowsy):
      cv2.putText(frame, "!!! DROWSY !!! ", (50, 100), cv2.FONT_HERSHEY_COMPLEX, 1.5, (0, 0, 255), 3, cv2.LINE_AA)
    else:
      cv2.putText(frame, "Blinks : {}".format(blinkCount), (50, 50), cv2.FONT_HERSHEY_COMPLEX, .9, (0,0,255), 2, cv2.LINE_AA)

    cv2.imshow("Blink Detection Demo ",frame)

    if cv2.waitKey(1) & 0xFF == 27:
      break
    print("Time taken", time.time() - t)


  except Exception as e:
    print(e)
capture.release()
cv2.destroyAllWindows()

# <font style = "color:rgb(50,120,229)">References and Further Readings</font>

1. [http://drowsydriving.org/about/facts-and-stats/](http://drowsydriving.org/about/facts-and-stats/)

2. [http://docs.opencv.org/trunk/d7/d8b/tutorial_py_face_detection.html](http://docs.opencv.org/trunk/d7/d8b/tutorial_py_face_detection.html)

3. [http://docs.opencv.org/2.4/doc/tutorials/objdetect/cascade_classifier/cascade_classifier.html](http://docs.opencv.org/2.4/doc/tutorials/objdetect/cascade_classifier/cascade_classifier.html)

4. [https://www.learnopencv.com/training-better-haar-lbp-cascade-eye-detector-opencv/](https://www.learnopencv.com/training-better-haar-lbp-cascade-eye-detector-opencv/)

5. [http://www.pyimagesearch.com/2017/04/24/eye-blink-detection-opencv-python-dlib/](http://www.pyimagesearch.com/2017/04/24/eye-blink-detection-opencv-python-dlib/)

6. [http://www.pyimagesearch.com/2017/05/08/drowsiness-detection-opencv/](http://www.pyimagesearch.com/2017/05/08/drowsiness-detection-opencv/)