DeepSpeed Chat #3186

tjruwase · 2023-04-11T17:24:43Z

DeepSpeed Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

https://github.com/microsoft/DeepSpeed/tree/master/blogs/deepspeed-chat

* Merge chatgpt v2 to v3 - finalized (#484)
* [squash] staging chatgpt v1 (#463)
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
* [partial] formatting fixes
* quantizer fixes
* fix for bert tests
* formatting fixes
* re-enable _param_slice_mappings in z2
* Enable the QKV requires_grad when in training mode (#466)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* fixes for attention enable_training flag
* commit to trigger CI
* fix for distil-bert param
* fixes for training context errors
* remove reza's qkv-optimization (#469)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* Chatgpt - Fuse lora params at HybridEngine (#472)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* add option to enable non-pin mode (#473)
* Chatgpt - fuse lora non pinned case (#474)
* Fix fuse/unfuse lora for Z3 and non-pinned parameter
* unfuse_lora_weight for non-pinned case
* fix the multiple issue for lora parameters
* formatting
* fuse lora only when available
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* Chatgpt/release inference cache (#475)
* Fix fuse/unfuse lora for Z3 and non-pinned parameter
* unfuse_lora_weight for non-pinned case
* release/retake the inference cache after/before generate
* remove duplicated _fuse_lora function
* fix formatting
* fix hybrid-engine config issue
* update formatting
* Chatgpt - fuse qkv v2 (#478)
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
* ChatGPT: Refactor Hybrid Engine Config (#477)
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
* Inference Workspace Tweaks (#481)
* Safety checks around inference workspace allocation, extra flushing
* Formatting fixes
* Merge fix
* Chatgpt/inference tp (#480)
* Update the merged-QKV weights only if there is difference with the model parameter
* remove the hard-coded size
* always reset qkv params to updated ones after running step
* Add the infernce-tp group and tensor sharding to run inference in model-parallel mode
* optimize the gather/mp-sharding part
* Add hybrid_engine changes
* fix config issue
* Formatting fixes. Reset_qkv duplicate removal.
* fix bloom container.
* fix format.
---------
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
* fix formatting
* more clean-up
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
* fix a bug on lora-fusion (#487)
* Cholmes/v3 workspace bugfixes (#488)
* Miscellaneous workspace fixes, new config param
* Fix typo
---------
Co-authored-by: Reza Yazdani <44502768+RezaYazdaniAminabadi@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: yaozhewei <zheweiy@berkeley.edu>
Co-authored-by: Tunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Lok Chand Koppaka <lokoppak@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>

jeffra

🎉🎉🎉

awan-10 and others added 3 commits April 10, 2023 13:41

Use current AG method from comm (#492)

65344e6

bump to 0.9.0 (#498)

36a3659

tjruwase requested review from jeffra, samyam, mrwyattii, RezaYazdaniAminabadi, awan-10, cmikeh2 and arashb as code owners April 11, 2023 17:24

Merge branch 'master' into staging-deepspeed-chat-v2

49d9d1c

awan-10 changed the title ~~RLHF Training support~~ DeepSpeed Chat Release Apr 11, 2023

tjruwase changed the title ~~DeepSpeed Chat Release~~ DeepSpeed-Chat Apr 11, 2023

jeffra mentioned this pull request Apr 11, 2023

DeepSpeed Chat Release microsoft/DeepSpeedExamples#264

Merged

tjruwase changed the title ~~DeepSpeed-Chat~~ DeepSpeed Chat Apr 11, 2023

awan-10 approved these changes Apr 11, 2023


Merge branch 'master' into staging-deepspeed-chat-v2

562d4c4

jeffra approved these changes Apr 11, 2023


Merge branch 'master' into staging-deepspeed-chat-v2

d250654

RezaYazdaniAminabadi approved these changes Apr 11, 2023


jeffra merged commit 47f9f13 into master Apr 11, 2023

jeffra deleted the staging-deepspeed-chat-v2 branch April 11, 2023 18:53

tjruwase mentioned this pull request Apr 12, 2023

[BUG] terminate called after throwing an instance of 'std::bad_alloc' #3126

Open

mrwyattii mentioned this pull request Apr 13, 2023

Fix for Stable Diffusion #3218

Merged

conglongli added the deepspeed-chat Related to DeepSpeed-Chat label Apr 30, 2023
