Skip to content

Latest commit

 

History

History
31 lines (23 loc) · 983 Bytes

distributed.rst

File metadata and controls

31 lines (23 loc) · 983 Bytes

Distributed Training APIs

SageMaker distributed training libraries offer both data parallel and model parallel training strategies. They combine software and hardware technologies to improve inter-GPU and inter-node communications. They extend SageMaker’s training capabilities with built-in options that require only small code changes to your training scripts.

The SageMaker Distributed Data Parallel Library

.. toctree::
    :maxdepth: 2

    smd_data_parallel
    sdp_versions/latest
    smd_data_parallel_use_sm_pysdk
    smd_data_parallel_release_notes/smd_data_parallel_change_log

The SageMaker Distributed Model Parallel Library

.. toctree::
   :maxdepth: 2

   smd_model_parallel
   smp_versions/latest
   smd_model_parallel_general
   smd_model_parallel_release_notes/smd_model_parallel_change_log