DeepSpeed4Science #4357

conglongli · 2023-09-18T20:58:57Z

Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies: https://deepspeed4science.ai/

Core DeepSpeed4Science Team:
Shuaiwen Leon Song (DeepSpeed4Science lead), Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Xiaoxia (Shirley) Wu, Masahiro Tanaka, Martin Cai, Adam Graham, Charlie Zhou, Yuxiong He (DeepSpeed team lead)

Main author of DS4Sci_EvoformerAttention: Shiyang Chen (@cctry)

* fix conv_flops_compute when padding is a str when stride=1
* fix error
* change type of paddings to tuple
* fix padding calculation
* apply formatting check
---------
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Arash Bakhtiari <arash@bakhtiari.org>
Co-authored-by: Cheng Li <pistasable@gmail.com>
Co-authored-by: Ethan Doe <yidoe@microsoft.com>
Co-authored-by: yidoe <68296935+yidoe@users.noreply.github.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: GuanhuaWang <alexwgh333@gmail.com>
Co-authored-by: cmikeh2 <connorholmes@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>

* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format
* add ZeRO++ Japanese blog
* add links
---------
Co-authored-by: HeyangQin <heyangqin@microsoft.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>

* Integrating evoformer attention
* add cutlass version check
* Update error message
* add benchmark
* Update
* Update evoformer_attn.py
* Update run_evoformer_test.py
* Update evoformer_attn.py
* Update run_evoformer_test.py
* support more GPU archs
* add copyright
* add tests
* Fix bugs
* Update benchmark
* update
* Fix nvcc macro
* clean code
* fix formatting
* fix yaml import
* skip unit test when not compatible
* fix yaml requirement
* revert changes
* update tutorial
* update
* fix formatting
* fix format
* skip evoformer attn in pre-compile-ops
* revert changes
* update tutorial
* fix cutlass check
* update tutorial
* refactor tutorial
* revise
* Updated the Megatron-DS section (#565)
* Updated the Megatron-DS section
* minor fix
* minor fix
* minor fix
* separate evoformer tutorial
* Revised the ds4science landing page (#566)
* Updated the Megatron-DS section
* minor fix
* minor fix
* minor fix
* Revised the landing page
* Revised the landing page
* Removing unused file
* fix links image position
* modify main page
* fix doc
---------
Co-authored-by: Shiyang Chen <csycfl@gmail.com>
Co-authored-by: Minjia Zhang <33713995+minjiaz@users.noreply.github.com>

* origin/master:
  Allow multiple inference engines in single script (microsoft#4384)
  adds triton flash attention2 kernel (microsoft#4337)
  Fix llama meta tensor loading in AutoTP and kernel injected inference (microsoft#3608)
  Fix min torch version (microsoft#4375)
  Fix multinode runner to properly append to PDSH_SSH_ARGS_APPEND (microsoft#4373)
  add the missing method (microsoft#4363)
  Openfold fix (microsoft#4368)
  deepspeed4science japanese blog (microsoft#4369)
  deepspeed4science chinese blog (microsoft#4366)
  Enable workflow dispatch on Torch 1.10 CI tests (microsoft#4361)
  Update conda env to have max pydantic version (microsoft#4362)
  add deepspeed4science blog link (microsoft#4364)
  added check to avoid undefined behavior when the input_id length is greater than max_tokens (microsoft#4349)
  Add the policy to run llama model from the official repo (microsoft#4313)
  fix deepspeed4science links (microsoft#4358)
  DeepSpeed4Science (microsoft#4357)
  Support InternLM (microsoft#4137)
  Pass base_dir to model files can be loaded for auto-tp/meta-tensor. (microsoft#4348)

HeyangQin and others added 30 commits June 21, 2023 11:51

zero++ tutorial PR (#3783)

df1859d

fix interpolate flops compute (#3782)

a8c182a

use Flops Profiler to test model.generate() (#2515)

c4c442f

* Update profiler.py
* pre-commit run --all-files
* Delete .DS_Store
* Delete .DS_Store
* Delete .DS_Store
---------
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Cheng Li <pistasable@gmail.com>

revert PR #3611 (#3786)

fc9e1ee

bump to 0.9.6

40045dc

ZeRO++ chinese blog (#3793)

49a0a1b

* zeropp chinese blog
* try better quality images
* make title larger
* even larger...
* various fix
* center captions
* more fixes
* fix format

remove staging trigger (#3792)

2c62cb4

adding zero++ to navigation panel of deepspeed.ai (#3796)

01b843a

Bug Fixes for autotuner and flops profiler (#1880)

b4a2c0a

* fix autotuner when backward is not called
* fix format
---------
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>

Missing strided copy for gated MLP (#3788)

b7e1010

Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>

Requires grad checking. (#3789)

e5b1ead

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

bump to 0.10.0

9c756cf

Fix Bug in transform.cu (#3534)

a204edc

* Bug fix
* Fixed formatting error
---------
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>

bug fix: triton importing error (#3799)

f6e2e38

Co-authored-by: Stephen Youn <styoun@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>

Merge branch 'master' of github.com:microsoft/DeepSpeed

c1a7d3c

Merge branch 'master' of github.com:microsoft/DeepSpeed

65ed548

Merge branch 'master' of github.com:microsoft/DeepSpeed

d7ac329

Merge branch 'master' of github.com:microsoft/DeepSpeed

83f1102

Merge branch 'master' of github.com:microsoft/DeepSpeed

16555b2

Merge branch 'master' of github.com:microsoft/DeepSpeed

9d7b654

Merge branch 'master' of github.com:microsoft/DeepSpeed

c121f90

Merge branch 'master' of github.com:microsoft/DeepSpeed

f6b2962

Merge branch 'master' of github.com:microsoft/DeepSpeed

dd6bb04

Merge branch 'master' of github.com:microsoft/DeepSpeed

6e5a1f1

Merge branch 'master' of github.com:microsoft/DeepSpeed

1fbbbbf

Merge branch 'master' of github.com:microsoft/DeepSpeed

e44eb86

jeffra and others added 8 commits September 14, 2023 18:00

Merge branch 'master' of github.com:microsoft/DeepSpeed

1ba7b2a

Merge branch 'master' of github.com:microsoft/DeepSpeed

98e04da

Merge branch 'master' of github.com:microsoft/DeepSpeed

cddb8bc

Merge branch 'master' of github.com:microsoft/DeepSpeed

1be044f

Merge branch 'master' of github.com:microsoft/DeepSpeed

1bd4f94

Merge branch 'master' of github.com:microsoft/DeepSpeed

e58eb07

Merge branch 'master' of github.com:microsoft/DeepSpeed

00dfab9

conglongli requested review from RezaYazdaniAminabadi, jeffra, mrwyattii, awan-10, cmikeh2, arashb, tjruwase and loadams as code owners September 18, 2023 20:58

conglongli removed request for arashb, jeffra, tjruwase, mrwyattii, RezaYazdaniAminabadi and loadams September 18, 2023 20:59

awan-10 approved these changes Sep 18, 2023

conglongli enabled auto-merge September 18, 2023 22:11

conglongli added this pull request to the merge queue Sep 18, 2023

cmikeh2 approved these changes Sep 18, 2023

Merged via the queue into master with commit f876d81 Sep 18, 2023
16 checks passed

conglongli deleted the staging-deepspeed4science-v1 branch September 18, 2023 23:32
