
Acomoeye-NN: NVGaze gaze estimation with upgrades

Implementation of a one-shot direct gaze estimation NN based on NVGaze, as described in the NVIDIA paper. In the long run, the project aims to deliver low-latency eye tracking for the high demands of VR/AR goggles. The implementation builds on Intel's OpenVINO training extensions, specifically the license plate recognition (LPRNet) repo.


I have added a few techniques on top of the base NVGaze implementation, each of which can be switched on/off in the config (a minimal sketch of the first one follows the list below):

  • 🍉 coordConv (paper)
  • 🍉 globalContext (paper)
  • 🍉 coarseDropout (example)
  • 🍉 fireBlocks (code)
  • 🍉 selfAttention (paper)
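
As a concrete example of the first upgrade, here is a minimal coordConv sketch in TensorFlow 1.x. It illustrates the technique from the linked paper, not the repo's actual implementation, and assumes NHWC tensors: two channels of normalized pixel coordinates are concatenated to the input before a standard convolution.

import tensorflow as tf

def coord_conv2d(inputs, filters, kernel_size, **kwargs):
    """Conv2D with two extra channels holding normalized pixel coordinates."""
    shape = tf.shape(inputs)
    batch, height, width = shape[0], shape[1], shape[2]
    ys = tf.linspace(-1.0, 1.0, height)                  # (H,) y coords in [-1, 1]
    xs = tf.linspace(-1.0, 1.0, width)                   # (W,) x coords in [-1, 1]
    y_grid, x_grid = tf.meshgrid(ys, xs, indexing='ij')  # each (H, W)
    # Broadcast the grids over the batch and add a channel axis: (B, H, W, 1).
    y_grid = tf.tile(y_grid[tf.newaxis, :, :, tf.newaxis], [batch, 1, 1, 1])
    x_grid = tf.tile(x_grid[tf.newaxis, :, :, tf.newaxis], [batch, 1, 1, 1])
    augmented = tf.concat([inputs, x_grid, y_grid], axis=-1)
    return tf.layers.conv2d(augmented, filters, kernel_size, **kwargs)

The extra channels let the filters condition on absolute position, which is useful when the eye's location in the frame carries signal.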


I have achieved ~2.5° angular generalization error on the Acomo-14 dataset, with the base implementation running as fast as 0.36 ms inference in OpenVINO 🎉.

Quick Start Guide

Install dependencies (Windows/Linux):

  1. install an NVIDIA driver that supports at least CUDA 10.0
  2. download & install 🐍 Anaconda; add Anaconda to PATH (you may check 'set PATH as default' in the Windows installer)
conda create -n tf_gpu python=3.6
conda activate tf_gpu
conda install -c anaconda tensorflow-gpu=1.15
conda install -c conda-forge opencv=3.4
conda install -c anaconda cython
conda install -c conda-forge imgaug
git clone https://github.com/czero69/acomoeye-NN.git

Dataset: Acomo-14

The dataset covers 8 subjects across 14 tests, with 49 gaze points per test. Each gaze point has 1k photos of a 'narrowing' pupil with a gaze direction label: gazeX (yaw) and gazeY (pitch). Rotations are applied in the order yaw, then pitch, as extrinsic rotations; note that this differs from the NVGaze datasets' notation (see the conversion sketch after the download links below). The dataset was gathered with a 300Hz IR camera mounted in an Oculus DK2, over a 40°x40° FOV. It is ready to be trained on; images are cropped and resized to 127x127.

💿 Download Acomo-14 dataset.

  • ⚡ for training use: Train6-merged-monocular-R-shuffled-train.csv (80% of gaze points from 7 subjects in 13 tests)
  • ⚡ for validation use: Train6-merged-monocular-R-shuffled-test-SMALL.csv (100% of gaze points from 1 unseen subject, plus 20% of gaze points from 7 subjects in 13 tests)
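
Since the labels are yaw/pitch angles applied as extrinsic rotations in the order yaw, then pitch, a small helper can turn them into 3D gaze vectors and compute the angular error reported in the tables below. A minimal NumPy sketch; the axis and sign conventions are assumptions, so check them against the dataset loader:

import numpy as np

def gaze_to_vector(yaw_deg, pitch_deg):
    """Rotate the +z axis extrinsically: first yaw about y, then pitch about x.

    Axis and sign conventions are assumed for illustration only.
    """
    yaw, pitch = np.radians(yaw_deg), np.radians(pitch_deg)
    x = np.sin(yaw)
    y = -np.cos(yaw) * np.sin(pitch)
    z = np.cos(yaw) * np.cos(pitch)
    return np.stack([x, y, z], axis=-1)

def angular_error_deg(v_pred, v_true):
    """Angle between predicted and ground-truth gaze vectors, in degrees."""
    v_pred = v_pred / np.linalg.norm(v_pred, axis=-1, keepdims=True)
    v_true = v_true / np.linalg.norm(v_true, axis=-1, keepdims=True)
    cos = np.clip(np.sum(v_pred * v_true, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))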

Run

1️⃣ Edit paths for train and eval in acomo_basic/config.py:

cd tensorflow_toolkit/eyetracking/
gedit acomo_basic/config.py
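
A sketch of what the edited entries might look like. The field names below are illustrative assumptions; only the train/eval class structure and the CUDA_VISIBLE_DEVICES attribute are confirmed (see Tips), so match the names actually present in acomo_basic/config.py:

class train:
    file_list_path = '/data/acomo14/Train6-merged-monocular-R-shuffled-train.csv'
    batch_size = 128
    CUDA_VISIBLE_DEVICES = '0'   # train on the first GPU

class eval:
    file_list_path = '/data/acomo14/Train6-merged-monocular-R-shuffled-test-SMALL.csv'
    CUDA_VISIBLE_DEVICES = ''    # leave the GPU free for concurrent training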

2️⃣ Training & eval:

python tools/train.py acomo_basic/config.py
python tools/eval.py acomo_basic/config.py

#️⃣ You can run train and eval concurrently, so that you can later plot per-subject accuracies in TensorBoard.

Tips

🔸 To run eval or training on the TensorFlow CPU device instead of the GPU, set in the eval/train class in config.py:

CUDA_VISIBLE_DEVICES = ""

🔸 To export an OpenVINO model (you must install the OpenVINO environment, see below):

python tools/export.py --data_type FP32 acomo_basic/config.py

🔸 To try the OpenVINO inference engine (79000 is the exported iteration number):

python tools/infer_ie.py --model model_acomo_basic/export_79000/IR/FP32/eyenet.xml --config acomo_basic/config.py /path/to/img/0005353.png

🔸 To plot loss, accuracies, per-subject accuracies, etc., run TensorBoard:

cd ./model_acomo_basic
tensorboard --logdir ./ --port 5999
localhost:5999		# paste in web browser

Results

🔷 Table 1. Inference time for 127x127 input and first layer L=16.

inference engine     baseline  coordConv  globalContext  coarseDropout  fireBlocks  attention  cgDa    cgDaf
tf-gpu (ms)          0.7568    0.7691     1.2115         0.7636         1.5305      0.8589     1.3492  2.0812
tf-cpu (ms)          0.6877    0.8687     0.9158         0.6959         1.1433      0.7450     1.1114  2.0415
openvino-cpu (ms)    0.3621    0.3977     1.0357         0.3643         0.6118      0.4936     1.2463  1.4081
-----------------    --------  ---------  -------------  -------------  ----------  ---------  ------  ------
parameters count     157755    158043     158459         157755         22424       159853     160845  25514
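
Timings like the openvino-cpu row can be reproduced with a short benchmark against the exported IR. A minimal sketch using the OpenVINO ~2020 Python bindings (`inference_engine`); the input layout and channel count are assumptions, and API names differ across OpenVINO releases:

import time
import numpy as np
from openvino.inference_engine import IECore

MODEL = 'model_acomo_basic/export_79000/IR/FP32/eyenet.xml'

ie = IECore()
net = ie.read_network(model=MODEL, weights=MODEL.replace('.xml', '.bin'))
exec_net = ie.load_network(network=net, device_name='CPU')
input_name = next(iter(net.input_info))

# Dummy grayscale 127x127 input in NCHW layout (layout/channels assumed).
frame = np.random.rand(1, 1, 127, 127).astype(np.float32)

exec_net.infer({input_name: frame})   # warm-up run
runs = 1000
start = time.perf_counter()
for _ in range(runs):
    exec_net.infer({input_name: frame})
elapsed_ms = (time.perf_counter() - start) / runs * 1e3
print('mean inference: %.4f ms' % elapsed_ms)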

🔷 Table 2. Generalization to unseen subjects, angular error in degrees, trained for 1M iterations. Each cell reports the error without affine calibration (raw, left) and with affine calibration (calibrated, right). (N/T): N = number of subjects at test time, T = number of subjects at train time. NVGaze datasets were used 'as is'; e.g. NVGaze-AR was not cropped to pupil location. Input res: 127x127, first layer L=16.

Dataset             baseline       coordConv      globalContext  coarseDropout  fireBlocks     attention      cgDa           cgDaf
NVGaze-AR (2/40)    6.37° / 4.88°  6.16° / 3.81°  6.93° / 4.43°  7.34° / 5.34°  6.41° / 4.91°  6.41° / 4.99°  9.40° / 5.65°  9.64° / 6.46°
NVGaze-VR (4/9)     3.49° / 2.33°  3.00° / 2.51°  3.30° / 2.57°  3.58° / 2.63°  3.27° / 2.67°  3.04° / 2.29°  2.79° / 2.47°  3.21° / 2.69°
Acomo (1/8)         5.24° / 3.44°  4.93° / 3.48°  5.11° / 2.68°  6.17° / 3.66°  4.23° / 3.28°  4.86° / 3.22°  7.51° / 3.86°  3.99° / 2.57°

🔷 Table 3. Generalization to new gaze vectors (among known subjects), angular error in degrees, trained for 1M iterations. Each cell reports the error without affine calibration (raw, left) and with affine calibration (calibrated, right). (N): N = number of subjects at test time. NVGaze datasets were used 'as is'; e.g. NVGaze-AR was not cropped to pupil location. Input res: 127x127, first layer L=16.

Dataset                baseline       coordConv      globalContext  coarseDropout  fireBlocks     attention      cgDa           cgDaf
NVGaze-synthetic (40)  2.31° / 2.09°  2.27° / 2.14°  2.23° / 2.04°  2.25° / 2.13°  2.15° / 2.07°  1.79° / 1.69°  1.99° / 1.95°  1.88° / 1.76°
NVGaze-AR (42)         3.76° / 3.23°  3.51° / 2.87°  3.81° / 3.17°  3.95° / 3.36°  3.95° / 3.43°  3.45° / 3.00°  4.10° / 3.32°  4.37° / 3.61°
NVGaze-VR (9)          2.89° / 2.48°  2.52° / 2.09°  2.62° / 2.16°  2.73° / 2.28°  2.86° / 2.55°  2.63° / 2.34°  2.42° / 2.26°  2.45° / 2.24°
Acomo (8)              4.05° / 3.01°  3.84° / 2.97°  3.72° / 2.66°  4.39° / 3.24°  3.51° / 2.97°  3.84° / 2.98°  4.50° / 3.21°  3.41° / 2.67°

Choosing the best upgrade technique for each dataset, the proposed upgrades give the following average accuracy improvements:

  • -0.82° generalization new subject raw error
  • -0.70° generalization new subject affine calibrated error
  • -0.51° generalization new gaze vectors raw error
  • -0.38° generalization new gaze vectors affine calibrated error

OpenVINO Training Extensions

OpenVINO Training Extensions provide a convenient environment to train Deep Learning models and convert them using OpenVINO™ Toolkit for optimized inference.

Setup OpenVINO Training Extensions

  1. Clone the repository into the working directory:
cd /<path_to_working_dir>
git clone https://github.com/opencv/openvino_training_extensions.git
  2. Install prerequisites:
sudo apt-get install libturbojpeg python3-tk python3-pip virtualenv

Citation

Please cite the NVGaze paper if you use this NN architecture:

@inproceedings{kim2019,
	author = {Kim, Joohwan and Stengel, Michael and Majercik, Alexander and De Mello, Shalini and Dunn, David and Laine, Samuli and McGuire, Morgan and Luebke, David},
	title = {NVGaze: An Anatomically-Informed Dataset for Low-Latency, Near-Eye Gaze Estimation},
	booktitle = {Proceedings of the SIGCHI Conference on Human Factors in Computing Systems},
	series = {CHI '19},
	year = {2019},
	isbn = {978-1-4503-5970-2/19/05},
	location = {Glasgow, Scotland UK},
	numpages = {10},
	url = {https://sites.google.com/nvidia.com/nvgaze},
	doi = {10.1145/3290605.3300780},
	acmid = {978-1-4503-5970-2/19/05},
	publisher = {ACM},
	address = {New York, NY, USA},
	keywords = {eye tracking, machine learning, dataset, virtual reality},
}
@inproceedings{7410785,
	author = {E. {Wood} and T. {Baltru\v{s}aitis} and X. {Zhang} and Y. {Sugano} and P. {Robinson} and A. {Bulling}},
	booktitle = {2015 IEEE International Conference on Computer Vision (ICCV)},
	title = {Rendering of Eyes for Eye-Shape Registration and Gaze Estimation},
	year = {2015},
	pages = {3756-3764},
}

If you find LPRNet useful in your research, or you are using the FireBlocks implementation, please consider citing the following paper:

@article{icv2018lprnet,
	title = {LPRNet: License Plate Recognition via Deep Neural Networks},
	author = {Sergey Zherzdev and Alexey Gruzdev},
	journal = {arXiv:1806.10447},
	year = {2018},
}
