:orphan:

Scale to 1 trillion+ parameters with multiple distributed strategies.

.. displayitem::
   :header: Scale with distributed strategies
   :description: Learn about different distributed strategies to reach bigger model parameter sizes.
   :col_css: col-md-6
   :button_link: ../accelerators/gpu_intermediate.html
   :height: 150
   :tag: intermediate

.. displayitem::
   :header: Reach 1 trillion parameters on GPUs
   :description: Scale to 1 trillion parameters on GPUs with FSDP and DeepSpeed.
   :col_css: col-md-6
   :button_link: ../advanced/model_parallel.html
   :height: 150
   :tag: advanced