
Emergent Mixture-of-Experts: Can Dense Pre-trained Transformers Benefit from Emergent Modular Structures?

Repo Overview

For more details about the method, please refer to our paper.

In this section, we provide an overview of the main components of the repository.

Vision: This section contains experiments with full fine-tuning on DomainBed, focusing on vision-related tasks and evaluations.

Language: This section contains experiments with full fine-tuning and LoRA tuning on GLUE, covering both in-distribution (ID) GLUE and out-of-distribution (OOD) GLUE-X evaluations.

Demons: This section provides the data sources and plotting code for all figures in the paper.

Tutel: Modified from Tutel MoE to support the added avg-k gating (a minimal sketch is given at the end of this overview).

For specific instructions on running the code for each component, please refer to the README.md file within the corresponding folder.
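For intuition about the avg-k gating mentioned above, here is a minimal, hypothetical sketch. It assumes avg-k gating means ranking experts by the mean pre-activation score of the neurons assigned to them; the function name, tensor shapes, and contiguous-block expert partition are illustrative assumptions, not the actual implementation, which lives in the modified tutel folder.

```python
import torch

def avg_k_gating(x, w1, num_experts, k):
    # Hypothetical sketch: route each token to the k experts whose neurons
    # have the highest average pre-activation score.
    # x:  (batch, d_model) token hidden states
    # w1: (d_ff, d_model) first FFN weight; neurons are assumed to be grouped
    #     into contiguous blocks of size d_ff // num_experts, one block per expert
    neuron_scores = x @ w1.t()                                  # (batch, d_ff)
    neuron_scores = neuron_scores.view(x.size(0), num_experts, -1)
    expert_scores = neuron_scores.mean(dim=-1)                  # (batch, num_experts)
    topk_scores, topk_idx = expert_scores.topk(k, dim=-1)       # top-k experts per token
    return topk_idx, topk_scores

# Toy usage with made-up sizes
x = torch.randn(4, 768)
w1 = torch.randn(3072, 768)
expert_idx, expert_scores = avg_k_gating(x, w1, num_experts=8, k=2)
```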

Environments

Please prepare the Vision and Language environments separately, following the instructions in each part. Importantly, we add functions such as avg-k gating to the original Tutel MoE, so please install tutel from the local tutel folder via cd ./tutel and pip install ./.

Acknowledgement

The MoE module is built on Tutel MoE. Note that we have added the avg-k gating function to the original gate, so please install it from the local tutel folder and follow the corresponding instructions.

The Vision codebase is built on GMoE and the original DomainBed.

The language training module is built on Transformers; the OOD evaluation is built on GLUE-X.

The MoE split method is built on MoEfication.

License

This source code is released under the MIT license, which is included in this repository.
