microsoft / DeepSpeed Public

Pull requests: microsoft/DeepSpeed

146 Open 2,625 Closed

Pull requests list

Monitor was always enabled causing performance degradation

#5633 opened Jun 9, 2024 by deepcharm

stage_1_and_2: optimize clip calculation to use clamp

#5632 opened Jun 9, 2024 by nelyahu

reduce all-to-all communication volume when both expert and non-expert are tensor-parallel

#5626 opened Jun 7, 2024 by taozhiwei

Hybrid Offloading for ZeRO3

#5625 opened Jun 7, 2024 by tohtana • Draft

fix: quantization with DeepSpeed HE

#5624 opened Jun 6, 2024 by Atry

Add support for Phi-3 small to FastGen

#5614 opened Jun 4, 2024 by adk9 • Draft

fixes in _partition_param_sec function

#5613 opened Jun 4, 2024 by mmhab

[INF] Enable torch compile for inference

#5612 opened Jun 4, 2024 by oelayan7

Upgrade HPU image to v1.16.0.

#5610 opened Jun 4, 2024 by vshekhawat-hlab

Fixed Windows inference build.

#5609 opened Jun 3, 2024 by costin-eseanu

Add an argument to enable the injection of missing state during the conversion of universal checkpoints

#5608 opened Jun 3, 2024 by xylian86

Fix overlap communication of ZeRO stage 1 and 2

#5606 opened Jun 3, 2024 by penn513

pipe/_exec_backward_pass: fix immediate grad update

#5605 opened Jun 3, 2024 by nelyahu

[CPU] Allow deepspeed.comm.inference_all_reduce in torch.compile graph

#5604 opened Jun 3, 2024 by delock

state_dict_factory: llama checkpoint - support SWIGLU

#5601 opened Jun 2, 2024 by nelyahu

WA for Torch-compile-Z3-act-apt accuracy issue from the Pytorch repo

#5590 opened May 30, 2024 by NirSonnenschein

FastGen H100 MoE support: Add PyTorch multi-gemm MOE implementation

#5586 opened May 29, 2024 by HeyangQin

Update profiler.py

#5584 opened May 29, 2024 by gameofdimension

Remove compile wrapper to simplify access to model attributes

#5581 opened May 29, 2024 by tohtana

reduce cpu host overhead when using moe

#5578 opened May 29, 2024 by ranzhejiang

_exec_forward_pass: place zeros(1) on the same device as the param

#5576 opened May 28, 2024 by nelyahu

Reuse KV cache of prefixes

#5572 opened May 27, 2024 by tohtana • Draft

[CPU] SHM based allreduce improvement for small message size

#5571 opened May 27, 2024 by delock

assumption of torch.initial_seed function accepting seed arg in DeepSpeedAccelerator abstract class is incorrect

#5569 opened May 26, 2024 by polisettyvarma

Add support for Microsoft Phi-3 model to DeepSpeed-FastGen

#5559 opened May 21, 2024 by adk9
