Computer Pointer Controller is an application that controls a computer's mouse pointer using a person's gaze. It uses multiple deep learning models to achieve this. I have built and tested it on Mac, but the instructions below should work on Windows and Ubuntu with a few syntax changes.
Install OpenVino Toolkit version 2020.3.194 from here. Follow the instructions to set up the OpenVino environment.
The setup also requires some pre-trained deep learning models, which can be downloaded using the CLI or directly from the OpenVino website.
- Face Detection Model
- Facial Landmarks Detection Model
- Head Pose Estimation Model
- Gaze Estimation Model
- for face-detection-adas-binary-0001
python "/opt/intel/openvino_2020.3.194/deployment_tools/tools/model_downloader/downloader.py" --name "face-detection-adas-binary-0001"
- for landmarks-regression-retail-0009
python "/opt/intel/openvino_2020.3.194/deployment_tools/tools/model_downloader/downloader.py" --name "landmarks-regression-retail-0009"
- for head-pose-estimation-adas-0001
python "/opt/intel/openvino_2020.3.194/deployment_tools/tools/model_downloader/downloader.py" --name "head-pose-estimation-adas-0001"
- for gaze-estimation-adas-0002
python "/opt/intel/openvino_2020.3.194/deployment_tools/tools/model_downloader/downloader.py" --name "gaze-estimation-adas-0002"
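The four download commands above differ only in the model name, so they could also be scripted. The following is a minimal sketch, not part of the project: the helper names `build_commands` and `download_all` are hypothetical, and the downloader path is copied from the commands above and assumes the default Mac install location.

```python
import subprocess
import sys

# Path copied from the commands above; adjust it if OpenVINO is installed elsewhere.
DOWNLOADER = "/opt/intel/openvino_2020.3.194/deployment_tools/tools/model_downloader/downloader.py"

# The four model names used by this project.
MODELS = [
    "face-detection-adas-binary-0001",
    "landmarks-regression-retail-0009",
    "head-pose-estimation-adas-0001",
    "gaze-estimation-adas-0002",
]


def build_commands(downloader=DOWNLOADER, models=MODELS, python=sys.executable):
    """Return one downloader command (as an argument list) per model name."""
    return [[python, downloader, "--name", name] for name in models]


def download_all(commands):
    """Run each command in turn and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `download_all(build_commands())` would run the same four downloads one after another.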
Note: If the OpenVino environment is not already set up in your shell, run the command below before starting the application:
source /opt/intel/openvino_2020.3.194/bin/setupvars.sh
python <demo.py directory>
The directory has the following structure:

Note: The code uses the webcam by default but can be tested on a video file by passing the -i argument. The command line arguments below can be used to modify the parameters; each of them also has a default set in the code.
1. -h : Get the information about all the command line arguments
2. -fd : Specify the path of Face Detection model's xml file
3. -fl : Specify the path of Facial Landmarks Detection model's xml file
4. -hp : Specify the path of Head Pose Estimation model's xml file
5. -ge : Specify the path of Gaze Estimation model's xml file
6. -i : Specify the path of input video file or enter cam for taking input video from webcam
7. -d : Specify the target device to run inference on. Supported devices are: CPU, GPU, FPGA (to run on FPGA, use HETERO:FPGA,CPU), MYRIAD.
8. -pt : Specify the probability threshold the face detection model uses to accept a detected face in a video frame.
9. -spd : Specify the speed of the mouse pointer
10. -prc : Specify the precision of the mouse pointer
11. -ctrl : Specify whether to control the mouse or not
12. -vis : Specify whether to visualize the output or not
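As an illustration of how the flags listed above might be declared, here is a minimal `argparse` sketch. It is not the project's actual parser: the flag names come from the list, but every default value, the value types for -spd and -prc, and the choice to treat -ctrl and -vis as on/off switches are assumptions made for this example. The -h flag is provided by `argparse` automatically.

```python
import argparse


def build_parser():
    """Build a parser with the flags from the list; all defaults are placeholders."""
    parser = argparse.ArgumentParser(description="Computer Pointer Controller (sketch)")
    # Model paths: the default file names are assumptions, not the project's real defaults.
    parser.add_argument("-fd", default="face-detection-adas-binary-0001.xml",
                        help="path of Face Detection model's xml file")
    parser.add_argument("-fl", default="landmarks-regression-retail-0009.xml",
                        help="path of Facial Landmarks Detection model's xml file")
    parser.add_argument("-hp", default="head-pose-estimation-adas-0001.xml",
                        help="path of Head Pose Estimation model's xml file")
    parser.add_argument("-ge", default="gaze-estimation-adas-0002.xml",
                        help="path of Gaze Estimation model's xml file")
    parser.add_argument("-i", default="cam",
                        help="path of input video file, or cam for the webcam")
    parser.add_argument("-d", default="CPU",
                        help="target device: CPU, GPU, HETERO:FPGA,CPU or MYRIAD")
    # The threshold default of 0.6 is an assumption.
    parser.add_argument("-pt", type=float, default=0.6,
                        help="probability threshold for face detection")
    # Speed and precision are kept as plain strings here; their real types are not stated.
    parser.add_argument("-spd", default="fast", help="speed of the mouse pointer")
    parser.add_argument("-prc", default="medium", help="precision of the mouse pointer")
    # Treated as switches in this sketch; the project may expect an explicit value instead.
    parser.add_argument("-ctrl", action="store_true", help="control the mouse")
    parser.add_argument("-vis", action="store_true", help="visualize the output")
    return parser
```

With this sketch, `build_parser().parse_args(["-i", "demo.mp4", "-d", "GPU", "-vis"])` would read a video file, target the GPU, and enable visualization while leaving the other parameters at their placeholder defaults.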
- face-detection-adas-binary-0001
| Type | Size of Model | Time to load |
|---|---|---|
| FP32-INT1 | 1.86M | 381.69ms |
- head-pose-estimation-adas-0001
| Type | Size of Model | Time to load |
|---|---|---|
| FP32 | 7.34M | 449.49ms |
- landmarks-regression-retail-0009
| Type | Size of Model | Time to load |
|---|---|---|
| FP32 | 786KB | 186.40ms |
- gaze-estimation-adas-0002
| Type | Size of Model | Time to load |
|---|---|---|
| FP32 | 7.24M | 333.55ms |
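Load times like those in the tables above could be measured with a small timing wrapper. The sketch below is one possible way to do it and is not taken from the project: `timed_ms` and `load_model` are hypothetical helpers, the weights file is assumed to sit next to the xml file with the same name, and the README does not state exactly which steps the reported numbers include.

```python
import os
import time


def timed_ms(fn, *args, **kwargs):
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


def load_model(xml_path, device="CPU"):
    """Read an IR model and load it onto a device with the Inference Engine."""
    # Imported lazily so timed_ms stays usable without OpenVINO installed.
    from openvino.inference_engine import IECore

    ie = IECore()
    # Assumption: the .bin weights file shares the .xml file's directory and stem.
    bin_path = os.path.splitext(xml_path)[0] + ".bin"
    net = ie.read_network(model=xml_path, weights=bin_path)
    return ie.load_network(network=net, device_name=device)
```

For example, `exec_net, ms = timed_ms(load_model, "gaze-estimation-adas-0002.xml")` would return the loaded network together with the time the load took.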
Running the application on a MacBook with an Intel Core i9 processor and 16 GB of memory gave the following results:
These results show that controlling the mouse pointer takes a long time even for a single frame, but when inference is used only to display the output, the application processes a frame in 120ms.

Make sure to enable the OpenVino environment before executing the demo. The firewall may also need to be turned off in some cases.
- Lighting matters greatly for the video feed, so the models sometimes cannot see the gaze clearly.
- If the model cannot detect a face, tracking is thrown off and the pointer continues in the same direction.
- If more than one face is detected in the frame, the model uses the first detected face to control the mouse pointer.
- Some firewall restrictions may not allow the application to control the mouse.
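The "first detected face" behavior described in the list above can be sketched as follows. This is an illustration, not the project's code: `first_face_box` is a hypothetical helper, the 0.6 threshold is a placeholder, and the input is assumed to follow the face detection model's documented output layout of shape [1, 1, N, 7], where each row is [image_id, label, confidence, x_min, y_min, x_max, y_max] with coordinates normalized to the 0..1 range.

```python
def first_face_box(detections, frame_w, frame_h, threshold=0.6):
    """Return (x_min, y_min, x_max, y_max) in pixels for the first detection
    at or above the threshold, or None when no face qualifies."""
    # detections[0][0] is the list of N rows; this indexing works for nested
    # lists as well as for NumPy arrays.
    for det in detections[0][0]:
        confidence = det[2]
        if confidence >= threshold:
            # Scale normalized coordinates to pixels and clamp them to the frame.
            x_min = max(int(det[3] * frame_w), 0)
            y_min = max(int(det[4] * frame_h), 0)
            x_max = min(int(det[5] * frame_w), frame_w)
            y_max = min(int(det[6] * frame_h), frame_h)
            return (x_min, y_min, x_max, y_max)
    return None
```

Returning `None` when no face passes the threshold would let a caller skip the pointer update for that frame instead of continuing in the previous direction.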