
Port of DeepSpeed's overview and feature discussion. #27

Closed
ShadenSmith wants to merge 3 commits from the readme_pr branch

Conversation

ShadenSmith
Contributor

I left a few TODOs as placeholders for the other documentation assignments.

The other major TODO is to ensure that our documentation (especially this README.md landing page) is nicely aligned with the blog post. We should be able to read the blog post and easily connect claims to our documentation here.

@ShadenSmith added the documentation label (Improvements or additions to documentation) on Feb 6, 2020
@ShadenSmith
Contributor Author

Closing as this was included in the Megatron PR.

@ShadenSmith closed this on Feb 7, 2020
@ShadenSmith deleted the readme_pr branch on February 7, 2020 at 21:35
jeffra pushed a commit to jeffra/DeepSpeed that referenced this pull request May 15, 2020
…ster_weights

Load checkpoints with different DP degree
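
The referenced commit concerns loading a checkpoint when the data-parallel (DP) degree differs between save and load time. As a rough illustration only (not DeepSpeed's actual code), the core idea is that per-rank state shards written under the old DP degree must be merged and re-sliced for the new one; the helper names below (merge_shards, reshard_for_rank) are hypothetical:

import torch

def merge_shards(shards):
    # Concatenate the per-rank shards that were saved under the old DP degree.
    return torch.cat(shards)

def reshard_for_rank(full_state, new_dp_degree, rank):
    # Split the merged state into equal chunks for the new DP degree and
    # return the chunk owned by this rank.
    return torch.chunk(full_state, new_dp_degree)[rank]

# Example: a checkpoint written with DP degree 4 is reloaded with DP degree 2.
old_shards = [torch.arange(i * 4, (i + 1) * 4, dtype=torch.float32) for i in range(4)]
merged = merge_shards(old_shards)                         # 16 elements in original order
print(reshard_for_rank(merged, new_dp_degree=2, rank=0))  # first 8 elements
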
rraminen pushed a commit to rraminen/DeepSpeed that referenced this pull request Apr 28, 2021
* Support nvidia bert dataset

* Format fixes

* E2E run of Nvidia Data with SQUAD 90.6 F1

* Minor fixes

* Update README

* Update README
rraminen pushed a commit to rraminen/DeepSpeed that referenced this pull request Jul 26, 2021
liamcli pushed a commit to determined-ai/DeepSpeed that referenced this pull request Sep 27, 2021
* Pull changes from DeepSpeed

* Update op builder compatibility

* Update sparse_attn.py

Co-authored-by: sid <sidney.black@aleph-alpha.de>
delock referenced this pull request in delock/DeepSpeedSYCLSupport Nov 8, 2022
commit 747f4202c55b50431fb1d3434cafd7332322a037 (HEAD, origin/xpu-main, origin/HEAD)
Author: Guo Yejun <yejun.guo@intel.com>
Date:   Thu Oct 20 19:43:48 2022 +0800

    transformee.py: use torch.gelu instead of small op combination (#27)
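
The commit title describes replacing a GELU assembled from several small elementwise ops with PyTorch's single fused call. A minimal sketch of that kind of change (an illustration under the assumption that the old code composed the erf-based formula by hand, not the actual diff):

import math
import torch
import torch.nn.functional as F

def gelu_small_ops(x):
    # GELU built from elementary ops: 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + torch.erf(x / math.sqrt(2.0)))

x = torch.randn(4, 8)
manual = gelu_small_ops(x)   # several small kernels (div, erf, add, mul)
fused = F.gelu(x)            # one fused kernel
assert torch.allclose(manual, fused, atol=1e-6)
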
Development

Successfully merging this pull request may close these issues.

Table of contents in README.md
Port DeepSpeed overview documentation
2 participants