
End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context (a.k.a. Multi Clue Gaze)

Yiran Guan*, Zhuoguang Chen*, Wenzheng Zeng, Zhiguo Cao, Yang Xiao

Huazhong University of Science and Technology

*: equal contribution, †: corresponding author

🥰 Our work has been accepted by IEEE Signal Processing Letters!

✨ Demo code has been added to this repo!

Inspired by gaze360-demo and yolov5-crowdhuman, our demo estimates the gaze of every person in a video and visualizes it. See MCGaze_demo for more details.
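As an illustration of the visualization step, here is a minimal sketch that projects a predicted gaze direction onto a frame as an arrow. The function name and the (yaw, pitch) sign convention are assumptions for illustration, not the demo's actual API:

    import cv2
    import numpy as np

    def draw_gaze(frame, eye_pos, yaw, pitch, length=100, color=(0, 0, 255)):
        """Draw a gaze arrow starting at eye_pos (x, y); angles in radians.

        The sign convention below is a common one in gaze demos and is
        assumed here, not taken from MCGaze_demo.
        """
        dx = -length * np.sin(yaw) * np.cos(pitch)  # horizontal offset in image
        dy = -length * np.sin(pitch)                # vertical offset in image
        tip = (int(eye_pos[0] + dx), int(eye_pos[1] + dy))
        cv2.arrowedLine(frame, (int(eye_pos[0]), int(eye_pos[1])), tip,
                        color, thickness=2, tipLength=0.2)
        return frame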

Introduction

This repository contains the official implementation of the paper "End-to-end Video Gaze Estimation via Capturing Head-face-eye Spatial-temporal Interaction Context".

We propose to facilitate video gaze estimation by capturing the spatial-temporal interaction context among the head, face, and eyes in an end-to-end learning manner, which has not been well explored before. Experiments on the challenging Gaze360 dataset verify the superiority of our approach.
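To make the idea concrete, below is a purely conceptual sketch of letting head, face, and eye cues interact per frame (spatially) and across frames (temporally) before regressing a gaze direction. It is NOT the MCGaze architecture; every module choice here is an illustrative assumption:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class InteractionContextSketch(nn.Module):
        """Illustrative only: not the MCGaze architecture."""

        def __init__(self, dim=256):
            super().__init__()
            self.spatial = nn.MultiheadAttention(dim, num_heads=8)
            self.temporal = nn.GRU(dim, dim, batch_first=True)
            self.regress = nn.Linear(dim, 3)

        def forward(self, head_f, face_f, eye_f):
            # Each input: (batch, time, dim) features from some backbone.
            B, T, D = head_f.shape
            cues = torch.stack([head_f, face_f, eye_f], dim=0)  # (3, B, T, D)
            cues = cues.flatten(1, 2)                           # (3, B*T, D), seq-first
            fused, _ = self.spatial(cues, cues, cues)           # per-frame cue interaction
            fused = fused.mean(dim=0).view(B, T, D)             # pool the three cues
            ctx, _ = self.temporal(fused)                       # temporal interaction
            return F.normalize(self.regress(ctx), dim=-1)       # (B, T, 3) unit gaze

The point of the sketch is only the data flow: per-frame fusion of the three cues, followed by temporal aggregation, with a single end-to-end regression head.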

Results and models

In our work, we evaluate our model under two different dataset settings (the Gaze360 setting and the L2CS setting, i.e., only samples with a detectable face are considered) for a fair comparison with previous methods.

You can download the model checkpoints from the links in the table below.

Setting           Backbone   MAE-Front180   Weight
Gaze360 setting   R-50       10.74          Google Drive
L2CS setting      R-50       9.81           Google Drive
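MAE-Front180 is the mean angular error in degrees on Gaze360's front-facing 180° subset (lower is better). For reference, here is a minimal sketch of the standard angular-error computation between unit 3D gaze vectors; it is not the repository's evaluation code:

    import numpy as np

    def mean_angular_error_deg(pred, gt):
        """Mean angular error in degrees between two (N, 3) arrays of gaze vectors."""
        pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
        gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
        cos_sim = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
        return float(np.degrees(np.arccos(cos_sim)).mean())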

Get Started

Prepare your Python environment

  1. Create a new conda environment:

    conda create -n MCGaze python=3.9
    conda activate MCGaze
  2. Install PyTorch (version 1.7.1 is recommended).

    pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
  3. Install MMDetection.

    • Install MMCV-full first (version 1.4.8 is recommended).

      pip install mmcv-full==1.4.8 -f https://download.openmmlab.com/mmcv/dist/cu110/torch1.7.1/index.html
    • Then install this repository in editable mode:

      cd MCGaze
      pip install -v -e .

    If you encounter any difficulties, please open a new issue or contact us.
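A quick sanity check for the environment, assuming the recommended versions above:

    import torch
    import mmcv
    import mmdet

    # Expected: torch 1.7.1 (+cu110), mmcv-full 1.4.8, and mmdet importable.
    print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
    print("mmcv:", mmcv.__version__)
    print("mmdet:", mmdet.__version__)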

Prepare your dataset

  1. Download the Gaze360 dataset from the official website.

  2. Download train.txt and test.txt from Gaze360's official GitHub repo.

  3. Use our code to reorganize the file structure (modify the paths in the script first):

    • python tools/gaze360_img_reorganize.py
  4. Download the COCO-format annotations from the provided link, and put them into the corresponding folders.

The folder MCGaze/data should have the following hierarchy:

 └── data
     ├── gaze360
     |   ├── train_rawframes
     |   |   ├── 1
     |   |   |   ├── 00000.png
     |   |   |   ├── 00001.png
     |   |   |   └── ...
     |   |   ├── 2
     |   |   └── ...
     |   ├── test_rawframes
     |   |   ├── 1
     |   |   |   ├── 00000.png
     |   |   |   ├── 00001.png
     |   |   |   └── ...
     |   ├── train.json
     |   └── test.json
     └── l2cs
         ├── train_rawframes
         |   ├── 1
         |   |   ├── 00000.png
         |   |   └── ...
         |   ├── 2
         |   └── ...
         ├── test_rawframes
         ├── train.json
         └── test.json
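To verify the layout, the short script below checks each split, assuming the JSON files follow the standard COCO convention with "images" and "annotations" lists (the data root is illustrative; adjust it as needed):

    import json
    from pathlib import Path

    root = Path("data")  # adjust if your data lives elsewhere
    for split in ["gaze360/train", "gaze360/test", "l2cs/train", "l2cs/test"]:
        frames = root / f"{split}_rawframes"
        assert frames.is_dir(), f"missing frame folder: {frames}"
        with open(root / f"{split}.json") as f:
            coco = json.load(f)
        # COCO-style files should carry "images" and "annotations" lists.
        print(split, "->", len(coco["images"]), "images,",
              len(coco["annotations"]), "annotations")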

Inference & Evaluation

  • Run the commands below for inference and evaluation under the different settings.

If you want to evaluate the model without training it yourself, download our checkpoints first (we recommend creating a new folder "ckpts" and putting the checkpoint files in it).

Also remember to check that the file paths in the shell scripts are correct.

  • Gaze360 setting:

    bash tools/test_gaze360.sh

  • L2CS setting:

    bash tools/test_l2cs.sh

Training

  • Run the commands below to begin training under the different settings.

  • Gaze360 setting:

    bash tools/train_gaze360.sh

  • L2CS setting:

    bash tools/train_l2cs.sh

Acknowledgement

This code is inspired by MPEblink, TeViT and MMDetection. Thanks for their great contributions to the computer vision community.

Citation

If MCGaze is useful or relevant to your research, please kindly recognize our contributions by citing our paper:

@article{guan2023end,
  title={End-to-End Video Gaze Estimation via Capturing Head-Face-Eye Spatial-Temporal Interaction Context},
  author={Guan, Yiran and Chen, Zhuoguang and Zeng, Wenzheng and Cao, Zhiguo and Xiao, Yang},
  journal={IEEE Signal Processing Letters},
  volume={30},
  pages={1687--1691},
  year={2023},
  publisher={IEEE}
}
