
add deepspeed #21708

Merged: 15 commits merged into conda-forge:main on Jan 15, 2023

Conversation

@hadim (Member) commented on Jan 11, 2023

Checklist

  • Title of this PR is meaningful: e.g. "Adding my_nifty_package", not "updated meta.yaml".
  • License file is packaged (see here for an example).
  • Source is from official source.
  • Package does not vendor other packages. (If a package uses the source of another package, they should be separate packages or the licenses of all packages need to be packaged).
  • If static libraries are linked in, the license of the static library is packaged.
  • Package does not ship static libraries. If static libraries are needed, follow CFEP-18.
  • Build number is 0.
  • A tarball (url) rather than a repo (e.g. git_url) is used in your recipe (see here for more details).
  • GitHub users listed in the maintainer section have posted a comment confirming they are willing to be listed there.
  • When in trouble, please check our knowledge base documentation before pinging a team.

@hadim (Member Author) commented on Jan 11, 2023

Another attempt to get deepspeed into conda-forge.

@conda-forge-webservices commented:

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/deepspeed) and found it was in an excellent condition.

@hadim mentioned this pull request on Jan 11, 2023
@hadim (Member Author) commented on Jan 12, 2023

@jaimergp @hmaarrfk: I have some questions regarding this recipe.

In short, pytorch is only needed at build time, when the deepspeed ops are being built, and that can only happen when CUDA is available.

I tried to play with the selectors so that the CPU build does not depend on a specific pytorch version. The goal is to reduce the number of unneeded builds here.
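For illustration only, a minimal sketch of the selector layout being described, assuming the usual `cuda_compiler_version` key. This is a hypothetical excerpt, not the actual recipe in this PR:

```yaml
# Hypothetical excerpt -- not the recipe from this PR.
requirements:
  host:
    - python
    - pip
    # pytorch is only needed at build time, to compile the CUDA ops,
    # so the CPU variant leaves it out entirely
    - pytorch  # [cuda_compiler_version != "None"]
  run:
    - python
    - pytorch
    # only the GPU builds get pinned to the pytorch version they were built against
    - {{ pin_compatible('pytorch') }}  # [cuda_compiler_version != "None"]
```

With that layout the CPU variant stays a single, pytorch-version-independent build, while each GPU variant is tied to the pytorch it was compiled against.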

Please let me know what you think.

@hmaarrfk (Contributor) commented:
I guess you can build for slightly more CUDA versions now, but the migrator will have a hard time updating you to the latest pytorch if you don't add the skips now.

This looks great!

@hadim (Member Author) commented on Jan 12, 2023

Thanks @hmaarrfk. Regarding the skip for CPU and pytorch at build time, do you think this could cause any issue for conda/mamba when it comes to picking the appropriate build? I am afraid the CPU build will be picked more easily, since it's not tied to a pytorch version, versus the GPU builds that are tied to one (but maybe it's all good there?).

> I guess you can build for slightly more CUDA versions now

I guess I cannot do that in this PR and will wait for it to be merged.

@hmaarrfk (Contributor) commented:
> GPU builds that are tied to it

It may prefer the GPU build because it has fewer dependencies. Use the build-number trick for the most robustness:
https://github.com/conda-forge/pytorch-cpu-feedstock/blob/main/recipe/meta.yaml#L4
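Roughly, the pattern in the linked feedstock (sketched from memory here; the exact offset and build strings should be taken from the real meta.yaml) gives the CUDA variant a higher build number, so the solver's preference is explicit rather than an accident of the dependency graph:

```yaml
# Sketch of the build-number trick, not the exact pytorch-cpu-feedstock contents.
{% set build = 0 %}
# bump the CUDA variant's build number so it is preferred whenever
# both variants satisfy the user's spec
{% if cuda_compiler_version != "None" %}
{% set build = build + 200 %}
{% endif %}

build:
  number: {{ build }}
```

A distinct build string per variant (e.g. a `cuda*`/`cpu*` prefix) usually accompanies this so the two lines of builds stay distinguishable.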

Review threads on recipes/deepspeed/bld.bat and recipes/deepspeed/build.sh (resolved).
hadim and others added 2 commits January 12, 2023 10:43
Co-authored-by: Mark Harfouche <mark.harfouche@gmail.com>
@hmaarrfk (Contributor) commented:
Is there anything else you need from staged-recipes, or can we merge?

@hadim (Member Author) commented on Jan 15, 2023

I am OK to merge and address the ninja dependency issue in the feedstock: #21708 (comment)

Also, what should I do about the CUDA arch list above? Should I hard-code it?
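One way to hard-code it, sketched here purely as an illustration: `TORCH_CUDA_ARCH_LIST` is the variable PyTorch's extension builder reads, but the concrete list of architectures below is an assumption and would need to be checked against what deepspeed and the CUDA toolkit actually support.

```yaml
build:
  number: 0
  script_env:
    # hypothetical hard-coded architecture list for the CUDA ops;
    # it could equally be exported from build.sh instead
    - TORCH_CUDA_ARCH_LIST=6.0;6.1;7.0;7.5;8.0;8.6+PTX  # [cuda_compiler_version != "None"]
```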

@hadim (Member Author) commented on Jan 15, 2023

@hmaarrfk LGTM from my side here.

@hmaarrfk (Contributor) commented:
  csrc/transformer/dropout_kernels.cu(102): error: no operator "*" matches these operands
              operand types are: __half2 * const __half2

  csrc/transformer/dropout_kernels.cu(103): error: no operator "*" matches these operands
              operand types are: __half2 * const __half2

  csrc/transformer/dropout_kernels.cu(216): error: no operator "*" matches these operands
              operand types are: __half2 * const __half2

  csrc/transformer/dropout_kernels.cu(217): error: no operator "*" matches these operands
              operand types are: __half2 * const __half2

  csrc/transformer/dropout_kernels.cu(335): error: no operator "*" matches these operands
              operand types are: __half2 * const __half2

  csrc/transformer/dropout_kernels.cu(336): error: no operator "*" matches these operands
              operand types are: __half2 * const __half2


You might have to raise the minimum version.
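For what it's worth, the `__half2` operator overloads are only defined for sufficiently recent compute capabilities and toolkits, so the usual way out is to raise the minimum CUDA version being built for and/or drop the oldest architectures from the arch list. A hypothetical skip, assuming the standard `cuda_compiler_version` key (the version shown is illustrative, not necessarily the one dropped here):

```yaml
build:
  # hypothetical: skip toolkits too old for the __half2 arithmetic in these kernels
  skip: true  # [cuda_compiler_version == "10.2"]
```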

@hadim (Member Author) commented on Jan 15, 2023

@hmaarrfk Seems OK now.

@hmaarrfk merged commit 2cf45ea into conda-forge:main on Jan 15, 2023
@hadim deleted the deepspeed branch on January 15, 2023
@hadim (Member Author) commented on Jan 15, 2023

Thanks @hmaarrfk!

@hmaarrfk (Contributor) commented:
No problem.

@weiji14 mentioned this pull request on Jan 16, 2023
@h-vetinari (Member) left a comment:
Thanks for packaging this!

I think the following is not up to scratch though and should be fixed soon.

Review comment on recipes/deepspeed/meta.yaml (resolved).