forked from NVIDIA/Megatron-LM
Issues: microsoft/Megatron-DeepSpeed
Issues list
#389 Inquiry on Sequence Parallel Support for VocabParallelEmbedding (opened May 18, 2024 by qinxiangyujiayou)
#385 Sequence Parallel is incompatible with Rotary Positional Embedding (opened May 9, 2024 by anogkongda)
#381 Call for Conversion from Huggingface to Megads with MoE (opened Apr 24, 2024 by ControllableGeneration)
#358 Loss is increasing when fine-tuning from a Megatron-DeepSpeed pretrained checkpoint (opened Mar 5, 2024 by SefaZeng)
#357 Unreasonably low throughput on HGX-H100s [bug] (opened Mar 1, 2024 by GuanhuaWang)
#356 FileNotFoundError: [Errno 2] No such file or directory: 'dataset/index-cache/xxx_doc_idx.npy' [bug] (opened Mar 1, 2024 by GuanhuaWang)
#329 How to convert a DeepSpeed model to Megatron when pp=2, tp=2, nnode=2 (opened Jan 11, 2024 by lonelydancer)