Shoubin Yu1,2 · Lei Shu1 · Antoine Yang1 · Yao Fu1 · Srinivas Sunkara1 · Maria Wang1 · Jindong Chen1 · Mohit Bansal2 · Boqing Gong1
1 Google DeepMind · 2 University of North Carolina at Chapel Hill
Ego2Web is a benchmark for evaluating web agents grounded in egocentric video understanding. It connects real-world human activities with web-based tasks, enabling research at the intersection of embodied perception and web interaction.
If you find this work useful, please cite:
@inproceedings{yu2026ego2web,
  title={Ego2Web: Benchmarking Web Agents with Egocentric Video Grounding},
  author={Yu, Shoubin and Shu, Lei and Yang, Antoine and Fu, Yao and
          Sunkara, Srinivas and Wang, Maria and Chen, Jindong and
          Bansal, Mohit and Gong, Boqing},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
