ZERO + Model Parallel Implementation Example #3828
adhithadias started this conversation in General
Replies: 0 comments
I have been looking for an example that uses DeepSpeed to combine ZeRO (stage 2 or 3) with model parallelism (specifically tensor parallelism, i.e. tensor slicing), but I have not been able to find one. The Megatron-DeepSpeed repository has examples that combine DP and MP, but as far as I can tell ZeRO is not used for the DP part. The DeepSpeedExamples repository contains examples that use ZeRO, but I don't see one that combines ZeRO with MP. Could someone please point me to an example that uses both ZeRO and MP with DeepSpeed?