References:
- https://www.pyimagesearch.com/2017/04/03/facial-landmarks-dlib-opencv-python/
- https://github.com/lincolnhard/head-pose-estimation/blob/master/video_test_shape.py
- https://www.learnopencv.com/head-pose-estimation-using-opencv-and-dlib/

In my setup the dlib package worked only with Python versions up to 3.6, not later ones, so I created a Python 3.6 virtualenv and installed dlib in it. The file "shape_predictor_68_face_landmarks.py" needs to be downloaded and kept in the same directory as the virtualenv.
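
Before running the cells below, it can help to confirm that the model file is where the notebook expects it. The following is a minimal sketch, assuming the file name used in this notebook and that the notebook is started from the directory holding the file. The helper name `missing_files` is hypothetical and not part of dlib.

```python
from pathlib import Path

# File name as used in this notebook (assumption: it sits in the working directory)
MODEL_FILE = "shape_predictor_68_face_landmarks.py"


def missing_files(paths):
    """Return the entries of `paths` that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


missing = missing_files([MODEL_FILE])
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("Model file found.")
```

If the file is reported as missing, `dlib.shape_predictor` will fail to load it, so it is cheaper to find out here than inside the webcam loop.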

In [1]:
# import the necessary packages
from imutils.video import VideoStream
from imutils import face_utils
from matplotlib import pyplot as plt
import datetime
import imutils
import time
import dlib
import cv2
import numpy as np
import math

Real-time facial landmark detection on the webcam feed gives the required coordinates on a detected face. The path to our pre-trained model is "shape_predictor_68_face_landmarks.py".

In [2]:
p = "shape_predictor_68_face_landmarks.py"

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor(p)

2D coordinates of a few points: for a face, the corners of the eyes, the tip of the nose, the corners of the mouth, etc. are chosen. Dlib's facial landmark detector provides many points to choose from.
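
As a small illustration of choosing a few points out of the full landmark set, the sketch below picks the six indices that this notebook uses later from a list of 68 (x, y) pairs. The names `POSE_LANDMARKS` and `select_pose_points` are hypothetical, and the landmark data is made up so that the result is easy to check by eye.

```python
# Indices into the 68-point landmark array, as used later in this notebook
POSE_LANDMARKS = {
    "nose_tip": 30,
    "chin": 8,
    "left_eye_left_corner": 36,
    "right_eye_right_corner": 45,
    "left_mouth_corner": 48,
    "right_mouth_corner": 54,
}


def select_pose_points(landmarks):
    """Pick the six (x, y) points used for pose estimation from 68 landmarks."""
    if len(landmarks) != 68:
        raise ValueError(f"expected 68 landmarks, got {len(landmarks)}")
    return [tuple(landmarks[i]) for i in POSE_LANDMARKS.values()]


# Stand-in data: landmark i is placed at (i, 2 * i)
fake_landmarks = [(i, 2 * i) for i in range(68)]
print(select_pose_points(fake_landmarks))
# [(30, 60), (8, 16), (36, 72), (45, 90), (48, 96), (54, 108)]
```

With real data, `landmarks` would be the array returned by `face_utils.shape_to_np`.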

3D locations of the same points: the 3D location of each 2D feature point is also needed.
The 3D locations of a few points in some arbitrary reference frame are used.
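
To make the link between the 3D and the 2D points concrete, here is a minimal pinhole-camera sketch in plain Python. It assumes the 3D points are already expressed in the camera frame (so no rotation or translation is applied), a focal length equal to the image width, and an optical center at the image center. The function names are hypothetical. Pose estimation searches for the rotation and translation that make such projections land on the detected 2D points, and the second function shows the error measure that is minimized.

```python
def project_point(point_3d, focal_length, center):
    """Project a camera-frame 3D point onto the image plane (pinhole model)."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    u = focal_length * x / z + center[0]
    v = focal_length * y / z + center[1]
    return (u, v)


def reprojection_error(points_3d, points_2d, focal_length, center):
    """Sum of squared pixel distances between observed and projected points."""
    total = 0.0
    for p3, (u_obs, v_obs) in zip(points_3d, points_2d):
        u, v = project_point(p3, focal_length, center)
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return total


# Assumed 640x480 frame: focal length ~ width, optical center ~ image center
width, height = 640, 480
focal_length = width
center = (width / 2, height / 2)

# Two points in the camera frame; z is the distance from the camera
points_3d = [(0.0, 0.0, 1000.0), (100.0, -50.0, 1000.0)]
projected = [project_point(p, focal_length, center) for p in points_3d]
print(projected)  # [(320.0, 240.0), (384.0, 208.0)]

# Observed points: the second one is off by 3 px in x and 4 px in y
observed = [(320.0, 240.0), (387.0, 212.0)]
print(reprojection_error(points_3d, observed, focal_length, center))  # 25.0
```

A point on the optical axis lands on the image center, and the error is zero only when every projected point coincides with its observed counterpart.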


cv2.VideoCapture is used to get the live feed from the webcam, and the loop runs only while the webcam is working.
Each frame is converted to grayscale, faces are detected in it, and the landmarks are found.



Define a function to calculate the angle by which the face is tilted away from the correct posture.

In [3]:
def func(p1, p2):
    """Return the angle, in whole degrees, of the line from p1 to p2.

    The angle is measured from the straight-down direction of the image
    (y grows downwards): 0 means p2 is directly below p1, positive angles
    mean p2 lies to the left of p1 and negative angles to the right.
    """
    dx = p2[0] - p1[0]
    dy = p2[1] - p1[1]
    # atan2 covers all four quadrants as well as vertical and horizontal
    # lines, so no division by zero can occur
    return int(math.degrees(math.atan2(-dx, dy)))

In [9]:
cap = cv2.VideoCapture(0)

while cap.isOpened():
    # Capture frame-by-frame; stop if the webcam returns no frame
    ret, image = cap.read()
    if not ret:
        break

    # Convert the frame to grayscale for the face detector
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    
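    # Detect faces in the grayscale frame
    rects = detector(gray)

    # With no face in view there are no landmarks to work with,
    # so just show the frame and move on to the next one
    if len(rects) == 0:
        cv2.imshow("Output", image)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        continue

    for rect in rects:
        # Predict the landmarks and convert them to a NumPy array
        # (if several faces are found, the last one is used below)
        shape = predictor(gray, rect)
        shape = face_utils.shape_to_np(shape)
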
    # 2D image points: only the six required coordinates are taken
    image_points = np.array([
        shape[30],  # Nose tip
        shape[8],   # Chin
        shape[36],  # Left eye left corner
        shape[45],  # Right eye right corner
        shape[48],  # Left mouth corner
        shape[54],  # Right mouth corner
    ], dtype="double")


    # 3D model points: the same for all images (world coordinates)
    model_points = np.array([
        (0.0, 0.0, 0.0),           # Nose tip
        (0.0, -330.0, -65.0),      # Chin
        (-225.0, 170.0, -135.0),   # Left eye left corner
        (225.0, 170.0, -135.0),    # Right eye right corner
        (-150.0, -150.0, -125.0),  # Left mouth corner
        (150.0, -150.0, -125.0),   # Right mouth corner
    ])


 
    # Approximate the optical center by the center of the image and the focal length
    # by the width of the image in pixels, and assume there is no radial distortion.
    size = image.shape
    focal_length = size[1]
    center = (size[1] / 2, size[0] / 2)
    camera_matrix = np.array(
        [[focal_length, 0, center[0]],
         [0, focal_length, center[1]],
         [0, 0, 1]], dtype="double"
    )

    dist_coeffs = np.zeros((4, 1))  # Assuming no lens distortion
    
    # SOLVEPNP_ITERATIVE is an iterative method based on Levenberg-Marquardt optimization.
    # It finds the pose that minimizes the reprojection error, that is, the sum of squared
    # distances between the observed imagePoints and the objectPoints projected with
    # projectPoints().
    (success, rotation_vector, translation_vector) = cv2.solvePnP(
        model_points, image_points, camera_matrix, dist_coeffs,
        flags=cv2.SOLVEPNP_ITERATIVE)

    # Project the 3D point (0, 0, 1000.0) onto the image plane.
    # It is used to draw a line sticking out of the nose.
    (nose_end_point2D, jacobian) = cv2.projectPoints(
        np.array([(0.0, 0.0, 1000.0)]), rotation_vector,
        translation_vector, camera_matrix, dist_coeffs)

    # Draw all six required coordinate points (x, y) on the image
    for p in image_points:
        cv2.circle(image, (int(p[0]), int(p[1])), 3, (0, 0, 255), -1)

    # Line from the nose tip (p1) to the projected 3D point (p2)
    p1 = (int(image_points[0][0]), int(image_points[0][1]))
    p2 = (int(nose_end_point2D[0][0][0]), int(nose_end_point2D[0][0][1]))
    cv2.line(image, p1, p2, (255, 0, 0), 2)
    
    # Parameters for cv2.putText, which writes text on the image
    font = cv2.FONT_HERSHEY_SIMPLEX
    org = (10, 20)
    fontScale = 0.5
    color = (0, 255, 0)
    thickness = 1

    angle = func(p1, p2)

    # The angle tells us in which direction the head must be rotated
    if -160 < angle < -20:
        image = cv2.putText(image, "Tilt head towards left", org, font, fontScale, color, thickness, cv2.LINE_AA)
    elif 20 < angle < 160:
        image = cv2.putText(image, "Tilt head towards right", org, font, fontScale, color, thickness, cv2.LINE_AA)
    else:
        image = cv2.putText(image, "Perfect! Open your mouth", org, font, fontScale, color, thickness, cv2.LINE_AA)
    
    # Show the image
    cv2.imshow("Output", image)

    # Press q to exit and turn off the webcam
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.destroyAllWindows()
cap.release()
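
The threshold logic inside the loop can also be checked without a webcam. The sketch below restates the same ±20 and ±160 degree limits as a small standalone function. The name `feedback_for_angle` and the `tolerance` parameter are hypothetical and only serve to make the rule testable by hand.

```python
def feedback_for_angle(angle, tolerance=20):
    """Map a tilt angle in degrees to the on-screen message used in the loop."""
    if -180 + tolerance < angle < -tolerance:
        return "Tilt head towards left"
    if tolerance < angle < 180 - tolerance:
        return "Tilt head towards right"
    return "Perfect! Open your mouth"


for test_angle in (0, -45, 90, 170, 20):
    print(test_angle, "->", feedback_for_angle(test_angle))
```

Angles close to 0 or close to ±180 fall through to the "Perfect" message, and the boundary values themselves (for example exactly 20) also count as acceptable.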
    




In [10]:
cv2.destroyAllWindows()
cap.release()

In [27]:
#from matplotlib import pyplot as plt
#plt.imshow(image)