
[WACV 2026] UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training

[arXiv] · [project page] · [online demo] · [Hugging Face paper page] · [PyPI]

Jiawei Qin1, Xucong Zhang2, Yusuke Sugano1,

1The University of Tokyo, 2Delft University of Technology


Overview

This repository contains the official PyTorch implementation of both the MAE pre-training and the UniGaze gaze estimation models.

Todo:

  • ✅ Release pre-trained MAE checkpoints (B, L, H) and gaze estimation training code.
  • ✅ Release UniGaze models for inference.
  • ✅ Code for predicting gaze from videos
  • ✅ (2025 June 08 updated) Release the MAE pre-training code.
  • ✅ (2025 August 25 updated) Online demo is available.
  • ✅ Easier pip installation.
  • ✅ (2026 March updated) Release the gaze dataset normalization code.

Easy Use of UniGaze

You can install UniGaze via pip:

pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
pip install timm==0.3.2
pip install unigaze
pip install -r requirements.txt 

You can find our UniGaze on the PyPI page: https://pypi.org/project/unigaze/

Available Models

| Model name | Backbone | Training Data |
| --- | --- | --- |
| `unigaze_b16_joint` | UniGaze-B | Joint Datasets |
| `unigaze_l16_joint` | UniGaze-L | Joint Datasets |
| `unigaze_h14_joint` | UniGaze-H | Joint Datasets |
| `unigaze_h14_cross_X` | UniGaze-H | ETH-XGaze |

Loading a model in Python:

import unigaze
model = unigaze.load("unigaze_h14_joint", device="cuda")   # downloads weights from HF on first use
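Like most appearance-based gaze estimators, the models regress gaze as pitch/yaw angles; the exact output format of the `unigaze` package should be checked against its docs. As a minimal, self-contained sketch (no UniGaze dependency), here is the standard conversion from pitch/yaw to a unit 3D gaze vector, using the convention common in gaze estimation (e.g. ETH-XGaze) where (0, 0) points toward the camera:

```python
import numpy as np

def pitchyaw_to_vector(pitch, yaw):
    """Convert pitch/yaw angles (radians) to a unit 3D gaze vector.

    Convention: pitch = 0, yaw = 0 maps to (0, 0, -1), i.e. looking
    straight at the camera.
    """
    x = -np.cos(pitch) * np.sin(yaw)
    y = -np.sin(pitch)
    z = -np.cos(pitch) * np.cos(yaw)
    return np.array([x, y, z])

v = pitchyaw_to_vector(0.0, 0.0)   # ≈ [0, 0, -1]
```

The returned vector is always unit-norm, which makes it convenient for computing angular errors or rendering gaze rays.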

Predicting Gaze from Videos

To predict gaze direction from videos, use the following script:

projdir=<...>/UniGaze/unigaze
cd ${projdir}
python predict_gaze_video.py \
    --model_name "unigaze_h14_joint"  \
    -i ./input_video 
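Per-frame predictions on video can be jittery. The script's internal post-processing is not documented here, but a simple exponential moving average over the per-frame gaze vectors (a common smoothing choice, shown purely as an illustration) looks like this:

```python
import numpy as np

def smooth_gaze(vectors, alpha=0.3):
    """Exponentially smooth a sequence of per-frame 3D gaze vectors.

    alpha controls responsiveness: higher alpha follows new frames more
    closely, lower alpha smooths more. Each output is renormalized to
    unit length so it remains a valid direction.
    """
    smoothed, prev = [], None
    for v in vectors:
        v = np.asarray(v, dtype=float)
        prev = v if prev is None else alpha * v + (1.0 - alpha) * prev
        smoothed.append(prev / np.linalg.norm(prev))
    return smoothed
```

A constant input sequence passes through unchanged, while abrupt frame-to-frame flips are damped.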

Pre-training (MAE)

Please refer to MAE Pre-Training.
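The core of MAE pre-training is masking a large fraction (typically 75%) of image patches and reconstructing them. As a sketch of the random per-patch masking step used in the original MAE implementation (details of this repository's variant may differ):

```python
import torch

def random_masking(x, mask_ratio=0.75):
    """Randomly mask a fraction of patch tokens, MAE-style.

    x: (batch, num_patches, dim). Returns the kept tokens, a binary
    mask in the original patch order (1 = masked), and the indices
    needed to restore that order after the decoder.
    """
    B, N, D = x.shape
    len_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N)                     # per-patch random scores
    ids_shuffle = torch.argsort(noise, dim=1)    # lowest scores are kept
    ids_restore = torch.argsort(ids_shuffle, dim=1)
    ids_keep = ids_shuffle[:, :len_keep]
    x_kept = torch.gather(x, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))
    mask = torch.ones(B, N)
    mask[:, :len_keep] = 0                       # 0 = kept, 1 = masked
    mask = torch.gather(mask, 1, ids_restore)    # back to original order
    return x_kept, mask, ids_restore
```

With a ViT's 196 patches and the default ratio, only 49 tokens enter the encoder, which is what makes MAE pre-training cheap at scale.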

Training (Gaze Estimation)

For detailed training instructions, please refer to UniGaze Training.
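Gaze estimation models are conventionally evaluated by the mean angular error between predicted and ground-truth gaze directions. A minimal reference implementation of that metric (independent of this repository's training code):

```python
import numpy as np

def angular_error_deg(g_pred, g_true):
    """Angular error in degrees between 3D gaze vectors.

    Both inputs are normalized first, so non-unit vectors are accepted;
    the dot product is clipped to avoid NaNs from rounding at +/-1.
    """
    g_pred = g_pred / np.linalg.norm(g_pred, axis=-1, keepdims=True)
    g_true = g_true / np.linalg.norm(g_true, axis=-1, keepdims=True)
    cos = np.clip(np.sum(g_pred * g_true, axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos))
```

Identical directions give 0 degrees; orthogonal directions give 90.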


Loading Pretrained Models

  • You can refer to load_mae.ipynb for instructions on loading the model and integrating it into your own codebase.
    • If you want to load only the MAE backbone, use the custom_pretrained_path argument.
## Loading MAE-backbone only - this will not load the gaze_fc
mae_h14 = MAE_Gaze(model_type='vit_h_14', custom_pretrained_path='checkpoints/mae_h14/mae_h14_checkpoint-299.pth')
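Under the hood, loading a backbone without its task head usually means filtering the checkpoint's state dict and calling `load_state_dict(strict=False)`. The following is a generic sketch of that pattern (the checkpoint layout and the `gaze_fc` key prefix are assumptions based on the snippet above, not the repository's exact code):

```python
import torch
import torch.nn as nn

def load_backbone_only(model, ckpt_path, skip_prefix="gaze_fc"):
    """Load a checkpoint into `model`, skipping task-head weights.

    Keys starting with `skip_prefix` are dropped before loading, and
    strict=False tolerates the resulting missing keys (the head keeps
    its random initialization).
    """
    state = torch.load(ckpt_path, map_location="cpu")
    state = state.get("model", state)    # some checkpoints nest under "model"
    filtered = {k: v for k, v in state.items() if not k.startswith(skip_prefix)}
    missing, unexpected = model.load_state_dict(filtered, strict=False)
    return missing, unexpected
```

Inspecting the returned `missing` and `unexpected` key lists is a quick sanity check that only the head was skipped.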

Citation

If you find our work useful for your research, please consider citing:

@article{qin2025unigaze,
  title={UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training},
  author={Qin, Jiawei and Zhang, Xucong and Sugano, Yusuke},
  journal={IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2025}
}

We also acknowledge the excellent work on MAE.


License

This model is licensed under the ModelGo Attribution-NonCommercial-ResponsibleAI License, Version 2.0 (MG-NC-RAI-2.0); you may use this model only in compliance with the License. You may obtain a copy of the License at

https://github.com/Xtra-Computing/ModelGo/blob/main/MGL/V2/MG-BY-NC-RAI/LICENSE

A comprehensive introduction to the ModelGo license can be found here: https://www.modelgo.li/


Beyond Human Eye Gaze Estimation

Our method also works for different "faces":



Contact

If you have any questions, feel free to contact Jiawei Qin at jqin@iis.u-tokyo.ac.jp.
