DeepSpeed v0.1.0

Released by @jeffra on 19 May 06:41 · 2175 commits to master since this release · c61e23b

DeepSpeed 0.1.0 Release Notes

Features

  • Distributed Training with Mixed Precision
    • 16-bit mixed precision
    • Single-GPU/Multi-GPU/Multi-Node
  • Model Parallelism
    • Support for Custom Model Parallelism
    • Integration with Megatron-LM
  • Memory and Bandwidth Optimizations
    • Zero Redundancy Optimizer (ZeRO) stage 1 with all-reduce (see the config sketch after this list)
    • Constant Buffer Optimization (CBO)
    • Smart Gradient Accumulation
  • Training Features
    • Simplified training API (see the initialization sketch after this list)
    • Gradient Clipping
    • Automatic loss scaling with mixed precision
  • Training Optimizers
    • Fused Adam optimizer and arbitrary torch.optim.Optimizer
    • Memory-bandwidth-optimized FP16 Optimizer
    • Large Batch Training with LAMB Optimizer
    • Memory-efficient Training with ZeRO Optimizer
  • Training-Agnostic Checkpointing (see the checkpoint sketch after this list)
  • Advanced Parameter Search (see the scheduler sketch after this list)
    • Learning Rate Range Test
    • 1Cycle Learning Rate Schedule
  • Simplified Data Loader
  • Performance Analysis and Debugging
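
The simplified training API replaces hand-written distributed and mixed-precision boilerplate with an engine object returned by `deepspeed.initialize`. A minimal sketch, assuming a JSON config file is supplied via `--deepspeed_config`; `SimpleModel` and the synthetic `data_loader` are hypothetical placeholders, not part of this release:

```python
import argparse
import torch
import deepspeed

class SimpleModel(torch.nn.Module):
    """Hypothetical stand-in for a real network."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(784, 10)

    def forward(self, x):
        return self.linear(x)

parser = argparse.ArgumentParser()
parser = deepspeed.add_config_arguments(parser)  # adds --deepspeed_config etc.
args = parser.parse_args()

model = SimpleModel()

# The returned engine owns the optimizer, the FP16 loss scaler, and
# data-parallel communication. A custom model-parallel unit (e.g. from the
# Megatron-LM integration) can be supplied through the `mpu` argument.
model_engine, optimizer, _, _ = deepspeed.initialize(
    args=args,
    model=model,
    model_parameters=model.parameters())

# Synthetic batches, purely for illustration.
data_loader = [(torch.randn(8, 784), torch.randint(0, 10, (8,)))
               for _ in range(10)]

for inputs, labels in data_loader:
    inputs = inputs.to(model_engine.device)
    labels = labels.to(model_engine.device)
    loss = torch.nn.functional.cross_entropy(model_engine(inputs), labels)
    model_engine.backward(loss)  # applies automatic loss scaling under FP16
    model_engine.step()          # optimizer step, gradient clipping, LR schedule
```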
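Most of the memory, precision, and optimizer features above are switches in the JSON config consumed by `deepspeed.initialize`. A sketch of one such config, written out from Python; the values are illustrative, and key names should be checked against the documentation for this release (in particular, `zero_optimization` is shown here in its simple boolean form):

```python
import json

ds_config = {
    "train_batch_size": 4096,
    "gradient_accumulation_steps": 4,  # smart gradient accumulation
    "gradient_clipping": 1.0,
    "fp16": {
        "enabled": True                # 16-bit training with automatic loss scaling
    },
    "zero_optimization": True,         # ZeRO stage 1 optimizer-state partitioning
    "optimizer": {
        "type": "Lamb",                # LAMB for large-batch training; "Adam" selects fused Adam
        "params": {
            "lr": 2e-3,
            "weight_decay": 0.01
        }
    }
}

# deepspeed.initialize reads this file via the --deepspeed_config flag.
with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```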
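The learning rate range test and the 1Cycle schedule are configured the same way, through the `scheduler` section of the config. A sketch of both, with illustrative step counts and learning rates:

```python
# Learning Rate Range Test: sweep the LR upward to locate a stable maximum.
lr_range_test = {
    "scheduler": {
        "type": "LRRangeTest",
        "params": {
            "lr_range_test_min_lr": 1e-5,
            "lr_range_test_step_size": 200,
            "lr_range_test_step_rate": 5,
            "lr_range_test_staircase": False
        }
    }
}

# 1Cycle: ramp the LR up and back down over a single long cycle.
one_cycle = {
    "scheduler": {
        "type": "OneCycle",
        "params": {
            "cycle_min_lr": 1e-4,
            "cycle_max_lr": 1e-3,
            "cycle_first_step_size": 1000
        }
    }
}
```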
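Training-agnostic checkpointing goes through the engine, so the same two calls cover FP16, ZeRO, and model-parallel state. A minimal sketch reusing the `model_engine` from the initialization sketch above; the directory, tag, and `client_state` contents are illustrative:

```python
# Save model, optimizer, and scheduler state plus arbitrary client state.
model_engine.save_checkpoint("checkpoints", tag="step_1000",
                             client_state={"step": 1000})

# Restore; returns the checkpoint path and the saved client state.
load_path, client_state = model_engine.load_checkpoint("checkpoints",
                                                       tag="step_1000")
```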