diff --git a/docs/_pages/training.md b/docs/_pages/training.md index 9c95fa7857ca..e31651cc487a 100644 --- a/docs/_pages/training.md +++ b/docs/_pages/training.md @@ -44,7 +44,7 @@ optimizations on advanced hyperparameter tuning and optimizers. For example: | 64 V100 GPUs | DeepSpeed | **8.68** hr| | 16 V100 GPUs | DeepSpeed | **33.22** hr| - *BERT codes and tutorials will be available soon.* + *BERT code and tutorials will be available soon.* * DeepSpeed trains GPT2 (1.5 billion parameters) 3.75x faster than state-of-art, NVIDIA Megatron on Azure GPUs.