[BUG] Problems with Mixture-of-Experts (MoE) #367

Open
nikit-srivastava opened this issue Mar 16, 2024 · 1 comment
@nikit-srivastava

Hello,

Thank you for the nice work on this training framework. However, I have noticed problems with inference, checkpoint conversion, and fine-tuning of MoE-based GPT models. Several existing issues report the same problems but have not yet been addressed.

In general, the inference example (generate_text.sh) does not work when --num-experts is set to a value greater than 1, and the checkpoint conversion scripts (convert_checkpoint) are not equipped to handle MoE models.
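For reference, here is a minimal repro sketch of the inference failure. It assumes the stock generate_text.sh invocation of the GPT text-generation entry point; the paths, model sizes, tokenizer files, and expert count below are placeholders for our setup, not the exact values from the example script:

```bash
#!/bin/bash
# Hypothetical repro sketch: the usual generate_text.sh style invocation,
# with MoE enabled via --num-experts. All paths and sizes are placeholders.
CHECKPOINT_PATH=checkpoints/gpt_moe   # MoE checkpoint (placeholder)
VOCAB_FILE=gpt2-vocab.json            # placeholder
MERGE_FILE=gpt2-merges.txt            # placeholder

python tools/generate_samples_gpt.py \
       --tensor-model-parallel-size 1 \
       --num-layers 24 \
       --hidden-size 1024 \
       --num-attention-heads 16 \
       --seq-length 1024 \
       --max-position-embeddings 1024 \
       --micro-batch-size 1 \
       --tokenizer-type GPT2BPETokenizer \
       --vocab-file $VOCAB_FILE \
       --merge-file $MERGE_FILE \
       --load $CHECKPOINT_PATH \
       --num-experts 8 \
       --temperature 1.0
# With --num-experts 1 (or the flag omitted) generation runs as expected;
# with any value greater than 1, text generation fails.
```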

I would like to bring this issue to the attention of the repository maintainers. It is a major roadblock in our research, preventing us from analyzing or publishing our findings, and we would be very grateful if it could be resolved soon.

If you need any other information, or access to model weights for testing, please feel free to ask. I can also offer to fix or implement the missing features myself if you point me in the right direction.

@suu990901

Have you solved this problem?
