I tried the examples (pretrain GPT, and GPT with MoE) but failed to run both.
Running the pretrain GPT example fails with an error like "Element 1 of tensors does not require grad and does not have a grad_fn".
Running the MoE examples always fails with an error saying `ep_size` is not a valid argument when calling MoE in DeepSpeed (I tried DeepSpeed versions 0.5.0 through 0.6.1; unfortunately, none works).
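[Editor's note] For context, `ep_size` in DeepSpeed's MoE layer sets the expert-parallelism degree: the experts are sharded across that many ranks, so older DeepSpeed releases that predate the keyword reject it. The snippet below is a minimal pure-Python sketch of the partitioning semantics only; the function name `local_expert_ids` is hypothetical and is not part of the DeepSpeed API.

```python
# Sketch of expert-parallel partitioning semantics (illustration only,
# not DeepSpeed source): `num_experts` experts are sharded across
# `ep_size` ranks, each rank hosting num_experts // ep_size of them.

def local_expert_ids(num_experts: int, ep_size: int, rank: int) -> list[int]:
    """Return the expert indices hosted on `rank` in an expert-parallel
    group of size `ep_size`. Assumes num_experts divides evenly."""
    assert num_experts % ep_size == 0, "num_experts must be divisible by ep_size"
    per_rank = num_experts // ep_size
    return list(range(rank * per_rank, (rank + 1) * per_rank))

# 8 experts sharded over 4 ranks: rank 1 hosts experts 2 and 3.
print(local_expert_ids(8, 4, 1))  # → [2, 3]
```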
Could anyone kindly help me with these issues?
Thanks
@starkhu and @getao -- can you please try the MoE examples with the main branch? Our moe branch is now old, but MoE support has already been merged into the main branch.
hyoo pushed a commit to hyoo/Megatron-DeepSpeed that referenced this issue on Apr 21, 2023.