Add Compressedbackend for Onebit optimizers #5473

Open
wants to merge 13 commits into
base: master

Liangliang-Ma
Contributor

While adding onebit optimizer support for XPU devices, we noticed that the main difference between accelerators in the implementation of compressed_allreduce lies in packbits and unpackbits: CUDA uses cupy and NPU uses torch_npu. Instead of replacing these with XPU-only functions, we provide a CompressedBackend that does the compressed_allreduce work and lets users add their own packbits/unpackbits kernels, which gives a general path for all kinds of accelerators.

In this PR, we:

  1. Add CompressedBackend for onebitAdam, onebitLamb and zerooneAdam
  2. Add an XPU implementation of packbits/unpackbits in SYCL, built through PackbitsBuilder
  3. Add tests for onebit with CompressedBackend
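
For readers unfamiliar with the two kernels, here is a minimal CPU-only sketch of what packbits/unpackbits do for sign compression. It is not the SYCL kernel from this PR: the names `packbits_ref` and `unpackbits_ref` are hypothetical, and the bit order (most significant bit first, zero-padded last byte) is an assumption chosen to match the default layout of numpy.packbits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference helper: store one sign bit per element
// (1 for values >= 0, 0 for negative values), most significant bit first.
// The last byte is zero-padded when the element count is not a multiple of 8.
std::vector<std::uint8_t> packbits_ref(const std::vector<float>& values)
{
    std::vector<std::uint8_t> packed((values.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] >= 0.0f) {
            packed[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8));
        }
    }
    return packed;
}

// Hypothetical inverse: expand the first `count` bits back to +1.0 / -1.0.
// The caller must ensure count <= packed.size() * 8.
std::vector<float> unpackbits_ref(const std::vector<std::uint8_t>& packed, std::size_t count)
{
    std::vector<float> signs(count);
    for (std::size_t i = 0; i < count; ++i) {
        const bool bit = ((packed[i / 8] >> (7 - i % 8)) & 1u) != 0;
        signs[i] = bit ? 1.0f : -1.0f;
    }
    return signs;
}
```

Ten floats pack into two bytes here, and unpacking with the original element count drops the padding bits again.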


at::Tensor packbits(at::Tensor tensor, int input_size, int rank)
{
/*
Contributor

@Liangliang-Ma the function documentation needs to be moved to line 39 right before the function def line.

Contributor Author

Updated.

Contributor

delock commented May 7, 2024

@tjruwase this PR abstracts the generic part of 1-bit Adam and implements the accelerator-dependent part with the DeepSpeed custom op builder, so 1-bit Adam no longer needs to depend on accelerator-specific libraries.

@inkcherry I remember you investigated 1-bit Adam portability before. FYI, this PR implements a portable version of 1-bit Adam support.
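
The split between a generic part and an accelerator-dependent part can be illustrated with a small sketch. Everything below is hypothetical and is not the code in this PR: `BitKernels`, `CompressorSketch`, and `cpu_kernels` are made-up names, the RMS scale is chosen only for illustration, and the collective communication step is left out entirely. The point is only that the generic code calls pack/unpack through hooks, so a device-specific build can swap in its own kernels.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical hook pair: the only accelerator-specific code in this sketch.
struct BitKernels {
    std::function<Bytes(const std::vector<bool>&)> pack;
    std::function<std::vector<bool>(const Bytes&, std::size_t)> unpack;
};

struct Compressed {
    float scale;        // one magnitude shared by all elements
    std::size_t count;  // number of original elements
    Bytes bits;         // packed sign bits
};

// Generic part: knows nothing about the device, only about the hooks.
class CompressorSketch {
public:
    explicit CompressorSketch(BitKernels kernels) : kernels_(std::move(kernels)) {}

    Compressed compress(const std::vector<float>& values) const
    {
        double sum_sq = 0.0;
        std::vector<bool> signs(values.size());
        for (std::size_t i = 0; i < values.size(); ++i) {
            sum_sq += static_cast<double>(values[i]) * values[i];
            signs[i] = values[i] >= 0.0f;
        }
        // Illustrative choice: RMS of the input as the shared scale.
        const float scale =
            values.empty() ? 0.0f : static_cast<float>(std::sqrt(sum_sq / values.size()));
        return {scale, values.size(), kernels_.pack(signs)};
    }

    std::vector<float> decompress(const Compressed& c) const
    {
        const std::vector<bool> signs = kernels_.unpack(c.bits, c.count);
        std::vector<float> out(c.count);
        for (std::size_t i = 0; i < c.count; ++i) { out[i] = signs[i] ? c.scale : -c.scale; }
        return out;
    }

private:
    BitKernels kernels_;
};

// Plain CPU kernels standing in for a device implementation.
BitKernels cpu_kernels()
{
    return {
        [](const std::vector<bool>& bits) {
            Bytes packed((bits.size() + 7) / 8, 0);
            for (std::size_t i = 0; i < bits.size(); ++i) {
                if (bits[i]) { packed[i / 8] |= static_cast<std::uint8_t>(0x80u >> (i % 8)); }
            }
            return packed;
        },
        [](const Bytes& packed, std::size_t count) {
            std::vector<bool> bits(count);
            for (std::size_t i = 0; i < count; ++i) {
                bits[i] = ((packed[i / 8] >> (7 - i % 8)) & 1u) != 0;
            }
            return bits;
        }};
}
```

Constructing `CompressorSketch` with a different `BitKernels` value is the only change needed to target another device in this sketch, which mirrors the motivation stated in the comment above.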

Liangliang-Ma marked this pull request as draft May 11, 2024 08:50
Liangliang-Ma marked this pull request as ready for review May 14, 2024 06:44
Liangliang-Ma
Contributor Author

Hi @tjruwase, could you please help review this PR? Thanks!

3 participants