
MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models

School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen
*Equal contribution †Corresponding author

The Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)

[Paper] [Project Page]

🔥 Details will be released. Stay tuned 🍻 👍

If you find this work useful for your research, please kindly cite our paper and star our repo.

Updates

  • [09/2024] Project page released!
  • [09/2024] MoME has been accepted by NeurIPS 2024!
  • [07/2024] arXiv paper released.

Introduction

This is the GitHub repository of MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models. In this work, we propose a mixture of multimodal experts (MoME) to mitigate task interference and obtain a generalist MLLM.

Our MoME consists of two key components: a mixture of vision experts (MoVE) and a mixture of language experts (MoLE). MoVE adaptively modulates the features transformed from various vision encoders and is compatible with a wide range of transformation architectures. MoLE incorporates sparsely gated experts into LLMs, yielding consistent improvements at roughly unchanged inference cost. A minimal sketch of both ideas is shown below.
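Since the code has not been released yet, the sketch below only illustrates the two ideas described above; it is not the authors' implementation. All names (AdaptiveVisionFusion, SparselyGatedFFN), routing choices, and dimensions are illustrative assumptions: MoVE-style modulation is shown as an instance-level gate over vision features already mapped to a shared space, and MoLE-style sparse gating routes each token to a small subset of FFN experts.

import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveVisionFusion(nn.Module):
    # MoVE-style idea (hypothetical sketch): adaptively modulate and combine
    # features that several vision encoders have already transformed into a
    # shared hidden space.
    def __init__(self, hidden_dim: int, num_encoders: int):
        super().__init__()
        self.router = nn.Linear(hidden_dim, num_encoders)  # instance-level gate

    def forward(self, expert_feats: torch.Tensor) -> torch.Tensor:
        # expert_feats: (batch, num_encoders, seq_len, hidden_dim)
        summary = expert_feats.mean(dim=(1, 2))             # (batch, hidden_dim)
        weights = F.softmax(self.router(summary), dim=-1)   # (batch, num_encoders)
        weights = weights[:, :, None, None]                 # broadcast over tokens/dims
        return (weights * expert_feats).sum(dim=1)          # (batch, seq_len, hidden_dim)


class SparselyGatedFFN(nn.Module):
    # MoLE-style idea (hypothetical sketch): route each token to a small subset
    # of FFN experts, adding capacity at roughly unchanged per-token compute.
    def __init__(self, hidden_dim: int, num_experts: int = 4, top_k: int = 1):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(hidden_dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_dim, 4 * hidden_dim),
                nn.GELU(),
                nn.Linear(4 * hidden_dim, hidden_dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        logits = self.gate(x)                                # (batch, seq_len, num_experts)
        topk_vals, topk_idx = logits.topk(self.top_k, dim=-1)
        topk_w = F.softmax(topk_vals, dim=-1)                # weights over selected experts
        out = torch.zeros_like(x)
        # Dense loop for clarity; efficient implementations dispatch only the
        # tokens routed to each expert.
        for e, expert in enumerate(self.experts):
            routed = (topk_idx == e)                         # (batch, seq_len, top_k)
            if routed.any():
                w = (topk_w * routed).sum(dim=-1, keepdim=True)  # (batch, seq_len, 1)
                out = out + w * expert(x)
        return out


# Tiny smoke test with made-up shapes.
if __name__ == "__main__":
    feats = torch.randn(2, 3, 16, 64)                        # 3 vision experts, 16 tokens
    tokens = AdaptiveVisionFusion(64, 3)(feats)              # (2, 16, 64)
    out = SparselyGatedFFN(64)(tokens)                       # (2, 16, 64)
    print(tokens.shape, out.shape)

In this toy version the gate is applied per instance for vision fusion and per token for the language experts; with top_k=1 each token activates a single expert, which is what keeps inference cost roughly flat as experts are added.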

The architecture of the proposed MoME model:

Multitasking Benchmark

We collected 24 datasets and categorized them into four groups for instruction-tuning and evaluation:

Evaluation results

Here we list the multitasking performance comparison of MoME and baselines. Please refer to our paper for more details.

Result figures: MoVE, MoLE, and SOTA comparisons (see paper).

Qualitative Examples


Citation

If you find this work useful for your research, please kindly cite our paper:

@inproceedings{shen2024mome,
    title={MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models},
    author={Shen, Leyang and Chen, Gongwei and Shao, Rui and Guan, Weili and Nie, Liqiang},
    booktitle={Advances in Neural Information Processing Systems},
    year={2024}
}
