[arXiv] · [Project page] · [Online demo] · [Hugging Face paper page]

Jiawei Qin¹, Xucong Zhang², Yusuke Sugano¹

¹The University of Tokyo, ²Delft University of Technology
This repository contains the official PyTorch implementation of both the MAE pre-training and UniGaze.
- ✅ Release pre-trained MAE checkpoints (B, L, H) and gaze estimation training code.
- ✅ Release UniGaze models for inference.
- ✅ Code for predicting gaze from videos
- ✅ (2025 June 08 updated) Release the MAE pre-training code.
- ✅ (2025 August 25 updated) Online demo is available.
- ✅ Easier pip installation.
- ✅ (2026 March updated) Release the gaze dataset normalization code.
You can install UniGaze with pip:

```shell
pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
pip install timm==0.3.2
pip install unigaze
pip install -r requirements.txt
```

You can find UniGaze on the PyPI page: https://pypi.org/project/unigaze/
| Model name | Backbone | Training Data |
|---|---|---|
| `unigaze_b16_joint` | UniGaze-B | Joint Datasets |
| `unigaze_l16_joint` | UniGaze-L | Joint Datasets |
| `unigaze_h14_joint` | UniGaze-H | Joint Datasets |
| `unigaze_h14_cross_X` | UniGaze-H | ETH-XGaze |
```python
import unigaze

model = unigaze.load("unigaze_h14_joint", device="cuda")  # downloads weights from HF on first use
```

To predict gaze direction from videos, use the following script:
```shell
projdir=<...>/UniGaze/unigaze
cd ${projdir}

python predict_gaze_video.py \
    --model_name "unigaze_h14_joint" \
    -i ./input_video
```

For MAE pre-training, please refer to MAE Pre-Training.
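Gaze predictions in normalized-camera pipelines are typically expressed as a (pitch, yaw) angle pair. Assuming that convention (an assumption, not confirmed by this README), a minimal sketch of converting such a pair to a 3D unit gaze vector:

```python
import math

def pitchyaw_to_vector(pitch: float, yaw: float):
    """Convert a (pitch, yaw) gaze angle pair, in radians, to a unit 3D vector.

    Uses the common normalized-camera convention, where (0, 0)
    points straight along the -z axis (into the camera).
    """
    x = -math.cos(pitch) * math.sin(yaw)
    y = -math.sin(pitch)
    z = -math.cos(pitch) * math.cos(yaw)
    return (x, y, z)

# Looking straight ahead maps to the -z axis.
print(pitchyaw_to_vector(0.0, 0.0))
```

The resulting vector is useful for rendering a gaze arrow on video frames or for comparing predictions against 3D ground truth.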
For detailed training instructions, please refer to UniGaze Training.
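Trained gaze estimators are typically evaluated by the angular error between predicted and ground-truth 3D gaze vectors. A minimal, generic implementation of this metric (a sketch, not code from this repository):

```python
import math

def angular_error_deg(g1, g2):
    """Angle in degrees between two 3D gaze vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    # Clamp to [-1, 1] to guard acos against floating-point drift.
    cos_angle = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_angle))

print(angular_error_deg((0, 0, -1), (0, 0, -1)))  # identical vectors: 0.0
print(angular_error_deg((1, 0, 0), (0, 1, 0)))    # orthogonal vectors: 90.0
```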
- You can refer to load_mae.ipynb for instructions on loading the model and integrating it into your own codebase.
- If you want to load the MAE backbone, use the `custom_pretrained_path` argument.
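Conceptually, loading the backbone only means dropping the gaze head's entries from the checkpoint's state dict before loading it. A minimal sketch with a toy dictionary (the key names are illustrative assumptions, not the repo's exact keys):

```python
# Toy checkpoint state dict; key names are illustrative assumptions.
checkpoint = {
    "patch_embed.proj.weight": "...",
    "blocks.0.attn.qkv.weight": "...",
    "gaze_fc.weight": "...",
    "gaze_fc.bias": "...",
}

# Keep every backbone weight, drop the gaze head.
backbone_state = {k: v for k, v in checkpoint.items() if not k.startswith("gaze_fc")}
print(sorted(backbone_state))
```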
```python
# Loading MAE-backbone only - this will not load the gaze_fc head.
# MAE_Gaze is defined in this repo; see load_mae.ipynb for the exact import.
mae_h14 = MAE_Gaze(model_type='vit_h_14', custom_pretrained_path='checkpoints/mae_h14/mae_h14_checkpoint-299.pth')
```

If you find our work useful for your research, please consider citing:
```bibtex
@article{qin2025unigaze,
  title={UniGaze: Towards Universal Gaze Estimation via Large-scale Pre-Training},
  author={Qin, Jiawei and Zhang, Xucong and Sugano, Yusuke},
  journal={IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2025}
}
```
We also acknowledge the excellent work on MAE.
This model is licensed under the ModelGo Attribution-NonCommercial-ResponsibleAI License, Version 2.0 (MG-NC-RAI-2.0); you may use this model only in compliance with the License. You may obtain a copy of the License at
https://github.com/Xtra-Computing/ModelGo/blob/main/MGL/V2/MG-BY-NC-RAI/LICENSE
A comprehensive introduction to the ModelGo license can be found here: https://www.modelgo.li/
Our method also works for different "faces":
If you have any questions, feel free to contact Jiawei Qin at jqin@iis.u-tokyo.ac.jp.

