Implementation of "Head Mounted Pupil Tracking Using Convolutional Neural Network"
Repository contents:

  • DEMO
  • detector
  • evaluator
  • rebuttal
  • train_evaluator
  • LICENSE
  • README.md
  • gen_detector.sh
  • gen_videofiles.py
  • videofiles_example.txt


PLEASE CHECK THE GITHUB REPOSITORY FOR THE LATEST VERSION.

Paper

REQUIREMENTS:

  • OpenCV 3.3.0
  • TensorFlow 1.2.1
  • TensorFlow Slim
  • Python 3.6

HOW TO USE IT

  1. Set the OpenCV directory in ./detector/CMakeLists.txt:
SET("OpenCV_DIR" "<path to the /opencv/build/>")
  2. Download the LPW dataset and decompress it into ./LPW.
  3. Generate videofile.txt (a hedged sketch of this step follows this list):
python gen_videofiles.py
  4. Compile the detector:
cd detector
make
  5. Run the detector:
./detector/PupilDetection
  6. Wait until PupilDetection finishes.
  7. Download the pretrained model and decompress it into ./pretrain/.
  8. Run the evaluator:
python evaluator/evaluator.py
  9. Check the results in ./result.
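For reference, here is a minimal sketch of what the videofile-generation step could look like. It assumes gen_videofiles.py simply walks ./LPW and writes one video path per line, as videofiles_example.txt suggests; the actual script in this repository may differ:

```python
# Hypothetical sketch of gen_videofiles.py (assumption: it lists
# every LPW clip, one path per line, into videofile.txt).
import glob
import os

def main():
    # LPW is laid out as ./LPW/<subject>/<clip>.avi
    paths = sorted(glob.glob(os.path.join("LPW", "*", "*.avi")))
    with open("videofile.txt", "w") as f:
        for p in paths:
            f.write("./%s\n" % p)
    print("wrote %d video paths" % len(paths))

if __name__ == "__main__":
    main()
```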

Explanation of the results

Structure:

- result
    - <alpha><Is_average_filter><videonumber>: the result of a single video; there are 64 files like this (e.g. 0.005False60). A small parsing helper follows this list.
    - <alpha><Is_average_filter><finish time stamp>: the aggregate result over all videos, e.g. 0.005False60time.struct_time(tm_year=2017, tm_mon=8, tm_mday=22, tm_hour=16, tm_min=12, tm_sec=26, tm_wday=1, tm_yday=234, tm_isdst=1)
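Since the per-video result filenames pack the run parameters together, a small helper (hypothetical, not part of the repository) can split them back apart:

```python
# Hypothetical helper: split a per-video result filename such as
# "0.005False60" into (alpha, is_average_filter, video_number).
# It does not handle the timestamped aggregate files.
import re

def parse_result_name(name):
    m = re.match(r"([0-9.]+)(True|False)(\d+)$", name)
    if m is None:
        raise ValueError("unexpected result filename: " + name)
    return float(m.group(1)), m.group(2) == "True", int(m.group(3))

print(parse_result_name("0.005False60"))  # (0.005, False, 60)
```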

For the result of each video:

line 1 (e.g. ./LPW/23/2.avi): the file path of the video.
line 2 (e.g. NEEDTOIMPROVE11): NumberOfFrame(upperbound) - NumberOfFrame(evaluator). It is not relevant to the paper.
line 3 - line 503 (e.g. Pixcel  242: 0.999): the accuracy (0.999) under the condition that distance(point(predict), point(groundtruth)) is within the pixel threshold on that line.

For the result of all videos:

line 1 (e.g. NEEDTOIMPROVE11): NumberOfFrame(upperbound) - NumberOfFrame(evaluator). It is not relevant to the paper.
line 2 - line 502 (e.g. Pixcel  242: 0.999): the accuracy (0.999) under the condition that distance(point(predict), point(groundtruth)) is within the pixel threshold on that line.
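To work with these files programmatically, here is a minimal sketch of a reader (a hypothetical helper, not part of the repository; it assumes exactly the per-video layout described above):

```python
# Hypothetical helper for reading a per-video result file such as
# ./result/0.005False60. Assumes the layout described above:
# line 1 = video path, line 2 = NEEDTOIMPROVE count, remaining
# lines = "Pixcel  <threshold>: <accuracy>".
def read_result(path):
    curve = {}  # pixel-error threshold -> accuracy
    with open(path) as f:
        video = f.readline().strip()
        need_to_improve = f.readline().strip()
        for line in f:
            parts = line.replace("Pixcel", "").split(":")
            if len(parts) == 2:
                curve[float(parts[0])] = float(parts[1])
    return video, need_to_improve, curve

video, nti, curve = read_result("./result/0.005False60")
print(video, len(curve))
```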

Optional: fine-tuning

  1. Build the dataset:
python ./train_evaluator/makedataset.py
  2. Download the pretrained VGG model and put it into ./train_evaluator/pretrain_vgg.
  3. Train (a hedged sketch of the weight-restoring step follows this list):
python ./train_evaluator/train.py
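For orientation, a hedged sketch of how restoring the pretrained VGG-16 weights for fine-tuning could look with TF-Slim under TensorFlow 1.2. The checkpoint name vgg_16.ckpt, the input size, and the mean-squared-error loss are assumptions; the actual procedure is in ./train_evaluator/train.py:

```python
# Hypothetical fine-tuning sketch using TF-Slim (TensorFlow 1.2).
# Restores pretrained VGG-16 weights except the final layer, then
# trains a 2-value regression head for the pupil center (x, y).
import tensorflow as tf
import tensorflow.contrib.slim as slim
from tensorflow.contrib.slim.nets import vgg

images = tf.placeholder(tf.float32, [None, 224, 224, 3])
labels = tf.placeholder(tf.float32, [None, 2])  # pupil center (x, y)

# Reuse the VGG-16 body; replace the 1000-way classifier with 2 outputs.
predictions, _ = vgg.vgg_16(images, num_classes=2, is_training=True)
loss = tf.losses.mean_squared_error(labels, predictions)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)

# Restore everything except the replaced final layer (fc8).
vars_to_restore = slim.get_variables_to_restore(exclude=["vgg_16/fc8"])
init_fn = slim.assign_from_checkpoint_fn(
    "./train_evaluator/pretrain_vgg/vgg_16.ckpt", vars_to_restore)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    init_fn(sess)
    # ... feed batches from the dataset built by makedataset.py ...
```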

Code reference

  • ./detector/algo.h, ./detector/blob_gen.h, ./detector/canny_impl.h, ./detector/filter_edges.h, ./detector/find_best_edge.h: we use the first part of the ElSe algorithm [1], which is based on morphological features, as one of the answer candidates (a toy illustration follows this list).
  • ./evaluator/vgg.py: we use the VGG-16 [2] architecture.
  • ./LPW/: we use part (about 1/80) of the [LPW dataset] to fine-tune the network and evaluate the method on this dataset.
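The following toy sketch illustrates the general idea of a morphology-based pupil candidate (the darkest blob after thresholding and erosion). It is only an illustration, not the ElSe algorithm; the real candidate generation lives in the C++ headers listed above:

```python
# Toy illustration of a morphology-based pupil candidate, NOT the
# ElSe algorithm: threshold the dark pupil region, clean it up with
# erosion, and take the centroid of the largest remaining blob.
import cv2
import numpy as np

def rough_pupil_candidate(gray):
    # Pupils are dark: keep only the darkest pixels.
    _, mask = cv2.threshold(gray, 50, 255, cv2.THRESH_BINARY_INV)
    # Erode to remove thin dark structures such as eyelashes.
    mask = cv2.erode(mask, np.ones((5, 5), np.uint8), iterations=2)
    # [-2] picks the contour list across OpenCV 3.x return variants.
    contours = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                cv2.CHAIN_APPROX_SIMPLE)[-2]
    if not contours:
        return None
    blob = max(contours, key=cv2.contourArea)
    m = cv2.moments(blob)
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])
```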

[1] Fuhl, Wolfgang, et al. "ElSe: Ellipse selection for robust pupil detection in real-world environments." Proceedings of the Ninth Biennial ACM Symposium on Eye Tracking Research & Applications. ACM, 2016.

[2] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).