Large multimodal models (LMMs) have made remarkable progress in video comprehension, yet their performance remains limited in first-person scenarios. The interactive nature of egocentric video is critical for applications such as embodied intelligence, but it introduces complex visual contexts that conventional models struggle to capture. To bridge this gap, we introduce OpenMMEgo, which innovates along three dimensions: data, model, and training strategy. To provide rich spatiotemporal visual knowledge, we curate OME10M, a large-scale, high-quality dataset of over 8.2M egocentric video QA pairs synthesized from the Ego4D series. We also establish OMEBench, a comprehensive benchmark for rigorous assessment of egocentric understanding. To cope with the frequent viewpoint shifts inherent in egocentric video, we apply semantic-aware visual token compression, complemented by a curriculum learning strategy that fosters stable learning across data of varying complexity. OpenMMEgo consistently improves the performance of LMMs on egocentric benchmarks without sacrificing general video understanding. Notably, Qwen2.5-VL tuned with OpenMMEgo substantially outperforms other models of the same size on egocentric video understanding.
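
To give a rough sense of what semantic-aware visual token compression can look like, here is a minimal, purely illustrative sketch. It is not the OpenMMEgo implementation (which is not yet released); it assumes a simple similarity-based scheme in which tokens that are semantically redundant with the previously kept frame are dropped. The function name `compress_video_tokens` and the `keep_ratio` parameter are hypothetical.

```python
# Illustrative sketch only, not the OpenMMEgo implementation.
# Idea: drop each frame's tokens that are already well explained by the
# previously kept frame, so rapid egocentric viewpoint shifts do not
# inflate the visual token budget.

import torch
import torch.nn.functional as F


def compress_video_tokens(tokens: torch.Tensor, keep_ratio: float = 0.5) -> list[torch.Tensor]:
    """Keep the least redundant `keep_ratio` fraction of each frame's visual tokens.

    tokens: [T, N, D] patch tokens for T frames, N tokens per frame.
    Returns a list of per-frame tensors; frames after the first have fewer tokens.
    """
    T, N, _ = tokens.shape
    keep = max(1, int(N * keep_ratio))
    compressed = [tokens[0]]                              # first frame is kept intact
    for t in range(1, T):
        prev = F.normalize(compressed[-1], dim=-1)        # [M, D] previously kept tokens
        cur = F.normalize(tokens[t], dim=-1)              # [N, D] current frame tokens
        redundancy = (cur @ prev.T).max(dim=-1).values    # best cosine match per token
        idx = redundancy.argsort()[:keep]                 # least redundant tokens survive
        compressed.append(tokens[t][idx.sort().values])   # preserve original token order
    return compressed


if __name__ == "__main__":
    dummy = torch.randn(8, 196, 1024)                     # 8 frames of ViT patch tokens
    out = compress_video_tokens(dummy, keep_ratio=0.25)
    print([tuple(f.shape) for f in out])                  # first frame full, rest reduced
```

In a scheme like this, keeping the least redundant tokens preserves the new visual content that appears as the wearer's viewpoint changes while discarding repeated background; the actual OpenMMEgo compression may differ in its details.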
We will release our code and data soon.
If you find our work useful, please consider citing us!
@inproceedings{hao2025openmmego,
  title={Open{MME}go: Enhancing Egocentric Understanding for {LMM}s with Open Weights and Data},
  author={Hao, Luo and Zihao, Yue and Wanpeng, Zhang and Yicheng, Feng and Sipeng, Zheng and Deheng, Ye and Zongqing, Lu},
  booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
  year={2025}
}