Merged
28 commits
- 9d6110e  Update installation instructions (tjruwase, Sep 4, 2020)
- e525e68  Format fix (tjruwase, Sep 4, 2020)
- d492642  Merge branch 'master' into olruwase/docs (Sep 6, 2020)
- 326573c  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 6, 2020)
- 15f80a2  Merge branch 'olruwase/docs' of github.com:microsoft/DeepSpeed into o… (tjruwase, Sep 6, 2020)
- e3f93df  ZeRO tutorial (tjruwase, Sep 9, 2020)
- 7cf4898  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 9, 2020)
- f532570  Format fixes (tjruwase, Sep 9, 2020)
- 2707f03  Merge branch 'master' into olruwase/docs (tjruwase, Sep 9, 2020)
- 9a9eda4  Merge branch 'master' into olruwase/docs (tjruwase, Sep 10, 2020)
- 12e0312  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 10, 2020)
- b10a11c  ZeRO-Offload (tjruwase, Sep 10, 2020)
- c8ae5c8  Merge branch 'olruwase/docs' of github.com:microsoft/DeepSpeed into o… (tjruwase, Sep 10, 2020)
- 6dd6276  ZeRO and ZeRO-Offload tutorials (tjruwase, Sep 10, 2020)
- d61c679  Update navigation page (tjruwase, Sep 10, 2020)
- 934684d  Format fixes (tjruwase, Sep 10, 2020)
- 4b90869  Merge branch 'master' into olruwase/docs (jeffra, Sep 10, 2020)
- 2f745e7  Add yuxhe feedback (tjruwase, Sep 10, 2020)
- 2b81602  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 10, 2020)
- 481f743  Merge branch 'olruwase/docs' of github.com:microsoft/DeepSpeed into o… (tjruwase, Sep 10, 2020)
- 35eeb1d  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 10, 2020)
- 6bd6171  Fix blog post link (tjruwase, Sep 10, 2020)
- 5582e3a  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 16, 2020)
- b11522f  Fix OneBit-Adam link (tjruwase, Sep 16, 2020)
- 1f6decc  Merge branch 'master' into olruwase/docs (tjruwase, Sep 16, 2020)
- 79b7615  Fix date link (tjruwase, Sep 16, 2020)
- adb0e91  Merge branch 'master' of github.com:microsoft/DeepSpeed into olruwase… (tjruwase, Sep 16, 2020)
- 4959fdd  Merge branch 'olruwase/docs' of github.com:microsoft/DeepSpeed into o… (tjruwase, Sep 16, 2020)
4 changes: 2 additions & 2 deletions docs/_pages/config-json.md
@@ -78,8 +78,8 @@ title: "DeepSpeed Configuration JSON"

| Fields | Value | Example |
| ------ | ------------------------------------------------------------ | ------------------------------ |
- | type | The scheduler name. See [here](https://deepspeed.readthedocs.io/en/latest/deepspeed.pt.html) for a list of supported schedulers. | `"1Cycle"` |
- | params | Dictionary of parameters to instantiate the scheduler. The parameter names should match the scheduler constructor signature. | `{"lr": 0.001, "eps": 1e-8}` |
+ | type | The scheduler name. See [here](https://deepspeed.readthedocs.io/en/latest/deepspeed.pt.html) for a list of supported schedulers. | `"WarmupLR"` |
+ | params | Dictionary of parameters to instantiate the scheduler. The parameter names should match the scheduler constructor signature. | `{"warmup_min_lr": 0, "warmup_max_lr": 0.001}` |

Example of ***scheduler***

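The updated table row corresponds to a scheduler block along these lines (a minimal sketch built only from the field names and example values shown in the table above; any additional `WarmupLR` parameters are omitted):

```json
{
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": 0,
      "warmup_max_lr": 0.001
    }
  }
}
```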
2 changes: 1 addition & 1 deletion docs/_posts/2020-09-09-onebit-adam-news.md
@@ -16,6 +16,6 @@ across distributed devices. We introduce a new algorithm - 1-bit Adam - and
its efficient implementation in DeepSpeed. 1-bit Adam offers the ***same convergence*** as Adam and incurs up to ***5x less communication***, enabling up to ***3.5x higher throughput for BERT-Large pretraining*** and up to ***2.7x higher throughput for SQuAD fine-tuning*** on bandwidth-limited clusters.

* Brief overview, see our [press release]({{ site.press_release_v3 }}).
- * Detailed technology deep dive, see our [blog post](https://www.deepspeed.ai/news/2020/09/09/onebit-adam-blog-post.html).
+ * Detailed technology deep dive, see our [blog post](https://www.deepspeed.ai/news/2020/09/08/onebit-adam-blog-post.html).
* Tutorial on how to reproduce our results, see our [1-bit Adam tutorial](/tutorials/onebit-adam/).
* The source code for 1-bit Adam can be found in the [DeepSpeed repo](https://github.com/microsoft/deepspeed). The implementation of 1-bit Adam is in [onebit_adam.py](https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/fp16/onebit_adam.py) and CUDA-Aware communication for 1-bit Adam is in [custom_collectives.py](https://github.com/microsoft/DeepSpeed/blob/master/deepspeed/runtime/custom_collectives.py). Example codes to try this feature can be found in the [DeepSpeedExamples repo](https://github.com/microsoft/deepspeedexamples) as shown in the [tutorial](/tutorials/onebit-adam/).
2 changes: 1 addition & 1 deletion docs/_tutorials/onebit-adam.md
@@ -2,7 +2,7 @@
title: "1-bit Adam: Up to 5x less communication volume and up to 2x faster training"
---

- In this tutorial, we are going to introduce the 1-bit Adam optimizer in DeepSpeed. 1-bit Adam can improve model training speed on communication-constrained clusters, especially for communication-intensive large models by reducing the overall communication volume by up to 5x. Detailed description of the 1-bit Adam algorithm, its implementation in DeepSpeed, and performance evaluation is available from our [blog post](https://www.deepspeed.ai/news/2020/09/09/onebit-adam-blog-post.html).
+ In this tutorial, we are going to introduce the 1-bit Adam optimizer in DeepSpeed. 1-bit Adam can improve model training speed on communication-constrained clusters, especially for communication-intensive large models by reducing the overall communication volume by up to 5x. Detailed description of the 1-bit Adam algorithm, its implementation in DeepSpeed, and performance evaluation is available from our [blog post](https://www.deepspeed.ai/news/2020/09/08/onebit-adam-blog-post.html).

To illustrate the benefits and usage of 1-bit Adam optimizer in DeepSpeed, we use the following two training tasks as examples:

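For context, the tutorial's optimizer is enabled through the DeepSpeed configuration JSON. The block below is a hedged sketch, not taken from this diff: the `"OneBitAdam"` type string matches the optimizer the tutorial introduces, but the parameter names and values shown (`lr`, `freeze_step`, `cuda_aware`) are illustrative assumptions and should be checked against the tutorial itself.

```json
{
  "train_batch_size": 4096,
  "optimizer": {
    "type": "OneBitAdam",
    "params": {
      "lr": 0.0002,
      "freeze_step": 23000,
      "cuda_aware": true
    }
  },
  "fp16": {
    "enabled": true
  }
}
```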