BeingBeyond/OpenMMEgo

OpenMMEgo: Enhancing Egocentric Understanding for LMMs with Open Weights and Data

Recent large multimodal models (LMMs) have substantially improved video comprehension, yet their performance remains limited in first-person scenarios. The interactive nature of egocentric videos is critical for applications such as embodied intelligence, but it introduces complex visual contexts that conventional models struggle to capture. To bridge this gap, we introduce OpenMMEgo, with innovations across three dimensions: data, model, and training strategy. To provide rich spatiotemporal visual knowledge, we curate OME10M, a large-scale, high-quality dataset comprising over 8.2M egocentric video QA pairs synthesized from the Ego4D series. We also establish OMEBench, a comprehensive benchmark for rigorous assessment of egocentric understanding. To cope with the frequent viewpoint shifts inherent in egocentric videos, we implement semantic-aware visual token compression. We further adopt a curriculum learning strategy to foster stable learning across data of varying complexity. OpenMMEgo consistently improves the performance of LMMs on egocentric benchmarks without sacrificing general video understanding performance. Notably, Qwen2.5-VL tuned with OpenMMEgo substantially outperforms other models of the same size in egocentric video understanding.

Code

We will release our code and data soon.

Citation

If you find our work useful, please consider citing us!

@inproceedings{
hao2025openmmego,
title={Open{MME}go: Enhancing Egocentric Understanding for {LMM}s with Open Weights and Data},
author={Luo, Hao and Yue, Zihao and Zhang, Wanpeng and Feng, Yicheng and Zheng, Sipeng and Ye, Deheng and Lu, Zongqing},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025}
}
