Computer Pointer Controller

Gaze Cursor Control program is designed to control mouse pointer using real time video input. User can provide either video or live camera streaming as input.

Project Set Up and Installation

Step: 1

Make sure you have installed openVINO 2020.2.+

Step: 2

Clone the repository: https://github.com/zeeshananjumjunaidi/gaze-cursor-control

Step: 4

Make sure you have installed python3.5
for Linux run python -m pip install -r requirements.txt
for Windows run python -m pip install -r requirements-win.txt

Step: 5

Initialize the openVINO environment:

Linux

source /opt/intel/openvino/bin/setupvars.sh -pyver 3.5

Windows

<openvino-directory\bin\setupvars.bat>

Step 3

Download following models by using OpenVINO model downloader

Linux

1. Face Detection Model

python /opt/intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "face-detection-adas-binary-0001"  --output_dir models/intel

2. Facial Landmarks Detection Model

python /opt/intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "landmarks-regression-retail-0009"  --output_dir models/intel

3. Head Pose Estimation Model

python /opt/intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "head-pose-estimation-adas-0001"  --output_dir models/intel

4. Gaze Estimation Model

python /opt/intel/openvino/deployment_tools/tools/model_downloader/downloader.py --name "gaze-estimation-adas-0002"  --output_dir models/intel

Windows

1. Face Detection Model

python <openvino-directory>/deployment_tools/tools/model_downloader/downloader.py --name "face-detection-adas-binary-0001"  --output_dir models/intel

2. Facial Landmarks Detection Model

python <openvino-directory>/deployment_tools/tools/model_downloader/downloader.py --name "landmarks-regression-retail-0009"  --output_dir models/intel

3. Head Pose Estimation Model

python <openvino-directory>/deployment_tools/tools/model_downloader/downloader.py --name "head-pose-estimation-adas-0001"  --output_dir models/intel

4. Gaze Estimation Model

python <openvino-directory>/deployment_tools/tools/model_downloader/downloader.py --name "gaze-estimation-adas-0002"  --output_dir models/intel

Demo

Edit run.sh and update the path of models.

Linux

Open terminal and navigate to project root directory. Then run ./run.sh You can edit run.sh to change parameters, like if you want to provide video file input or use model from different location etc...

Windows

Open command prompt and navigate to project root folder. Then run run.bat You can edit run.bat to change parameters, like if you want to provide video file input or use model from different location etc...

Parameters can be changed in run.sh (for Linux) & run.bat (for Windows)

Documentation

Models Documentation:

App Parameters

usage: main.py [-h] -f FACEDETECTION -fl FACIALLANDMARK -hp HEADPOSE -g
               GAZEESTIMATION -i INPUT [-l CPU_EXTENSION]
               [-prob PROB_THRESHOLD] [-d DEVICE] [-pf] [-pfl] [-php] [-pge]
               -fliph FLIP_HORIZONTAL

optional arguments:
  -h, --help            show this help message and exit
  -f FACEDETECTION, --facedetection FACEDETECTION
                        Path of Face Detection Model .xml file. required*
  -fl FACIALLANDMARK, --faciallandmark FACIALLANDMARK
                        Path of Face Landmark Model .xml file. required*
  -hp HEADPOSE, --headpose HEADPOSE
                        Path of Head Pose Model .xml file. required*
  -g GAZEESTIMATION, --gazeestimation GAZEESTIMATION
                        Path of Gaze Estimation model .xml file. required*
  -i INPUT, --input INPUT
                        Provide video path or write "cam" for camera
                        streaming. required*
  -l CPU_EXTENSION, --cpu_extension CPU_EXTENSION
                        Provide CPU extension
  -prob PROB_THRESHOLD, --prob_threshold PROB_THRESHOLD
                        probability threshold for face detection values from
                        0.1 to 1
  -d DEVICE, --device DEVICE
                        Target device to be used by OpenVINO.CPU, GPU, FPGA or
                        MYRIADRead https://docs.openvinotoolkit.org/latest/_do
                        cs_IE_DG_inference_engine_intro.html for more
                        information.default: CPU
  -pf, --previewFaceDetection
                        Preview face detection
  -pfl, --previewFaceLandmark
                        Preview face landmarks
  -php, --previewHeadPose
                        Preview head pose
  -pge, --previewGazeEstimation
                        Preview gaze esitmation vector
  -fliph FLIP_HORIZONTAL, --flip_horizontal FLIP_HORIZONTAL
                        Flip input horizontally, incase of video is flipped
                        horizontally

Project Structure

.run.sh (for Linux) Main file to run the Gaze Cursor Control pipeline.
.run.bat (for Windows) Main file to run the Gaze Cursor Control pipeline.
src folder contains all models definition, including input feeder, mouse controller, and main pipeline to run.
src/main.py Main pipeline file to run this project. This require input of models, video stream, and device to be used for inference.
src/mouse_controller.py this file contain MouseController class which take x,y coords., speed, and precision and set those values to actual mouse pointer.
src/face_detection_model.py Used for face detection
src/face_landmark_model.py Used for detecting eyes in a given face.
src/head_pose_model.py Used for detecting Head pose/ orientation.
src/gaze_estimation_model.py Used for gaze prediction given left, right eyes and head pose angles.
src/input_feeder.py Contains InputFeeder class to initialize camera and return frame sequentially.
models folder contains models provided by the Intel for face detection, landmark detection, gaze estimation, and head pose prediction.
media folder contains demo input files for testing this program.

Benchmarks

Following result is taken on Virtual Machine (Intel® Core™ i7-7700 Processor @4.40GHz).

Name	Model Load Time FP32	Inference time (FP32)
Face detection	0.172	0.008
Face Landmark detection	0.057	0.0007
Head Pose Estimation	0.062	0.002
Gaze Estimation	0.080	0.001
Total Average	0.092	0.003

Name	Model Load Time FP16	Inference time (FP16)
Face detection	0.348	0.013
Facial Landmark detection	0.105	0.0001
Head Pose detection	0.173	0.002
Gaze Estimation	0.170	0.003
Total Average Inference	0.199	0.004

Model Type	FPS
FP32	~13FPS
FP16	~12FPS

Result on Intel® Core™ i7-4702MQ @ 2.20GHz Processor

Name	Model Load Time FP32	Inference time (FP32)
Face detection	0.6873111724853516	0.030996084213256836
Face Landmark detection	0.2661552429199219	0.0010018348693847656
Head Pose Estimation	0.21300029754638672	0.001981973648071289
Gaze Estimation	0.24499297142028809	0.002998828887939453
Total Average	0.3528649210929870775	0.0092446804046630859

Name	Model Load Time FP16	Inference time (FP16)
Face detection	0.5799996852874756	0.028020381927490234
Facial Landmark detection	0.16299867630004883	0.0010178089141845703
Head Pose detection	0.17101383209228516	0.0009822845458984375
Gaze Estimation	0.21700549125671387	0.0029838085174560547
Total Average Inference	0.282754421234130865	0.008251070976257324125

Model Type	FPS
FP32	~14FPS
FP16	~16FPS

Result on IGPU Intel® HD Graphics 4600

Name	Model Load Time FP32	Inference time (FP32)
Face detection	0.3946526050567627	0.026006698608398438
Face Landmark detection	0.16802239418029785	0.0009887218475341797
Head Pose Estimation	0.18301916122436523	0.002997159957885742
Gaze Estimation	0.22898149490356445	0.0020008087158203125
Total Average	0.243668914	0.007998347

Name	Model Load Time FP16	Inference time (FP16)
Face detection	0.5998866558074951	0.026001930236816406
Facial Landmark detection	0.17197704315185547	0.0009853839874267578
Head Pose detection	0.18929433822631836	0.0020143985748291016
Gaze Estimation	0.22699999809265137	0.0020155906677246094
Total Average Inference	0.282754421234130865	0.007754326

Model Type	FPS
FP32	~16FPS
FP16	~16FPS

Results

I tested this program on Intel i7 (7700K) VM. From the results, it seems like FP16 too long time for loading, but for i7 (7700K) loading time is low and there is not much difference in model inference as FPS are almost similar in both type of the models. Higher floating point precision usually have higher accuracy.

Normally higher precision model usually takes more time for inference than lower precision models.

From the result, we can see that loading time and inference time of FP16 is slower than the FP32 by 10ms in virtual machine although it is running on i7 7700 4.40GHz. While on i7 4702MQ 2.20GHz, FP16 seems faster than FP32 by average of 2FPS and 0.20ms.

We prefer using lower precision model when we have constraints such as low power consumption and less processing power.

Testing face detection model on different hardware.

I tested Face detection model on multiple devices.

CPU - i5-6500te:iei-mustang-f100-a10 - FP32
IGPU - i5-6500te:intel-hd-530 - FP32
VPU - intel-ncs2 - FP32
FPGA - iei-mustang-f100-a10 - FP16

Model Loading Time

Inference Time

FPS

Edge Cases

If program unable to find face in video input, it will print 'Unable to detect the face' and continue to read another frame.
Model will use only one face detected in the streaming.

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
bin		bin
images		images
models/intel		models/intel
src		src
.gitignore		.gitignore
README.md		README.md
requirements-win.txt		requirements-win.txt
requirements.txt		requirements.txt
run.bat		run.bat
run.sh		run.sh

zeeshananjumjunaidi/gaze-cursor-control

Folders and files

Latest commit

History

Repository files navigation