Add `pyproject.toml` with legacy build backend to keep most logic in `setup.py` #7033

loadams · 2025-02-13T18:10:29Z

- Test all build cases this needs to support.
- Test Windows builds/support: `Successfully built deepspeed-0.16.5+1d869d1f-cp311-cp311-win_amd64.whl`
- Confirm pre-compiling ops: works with `--no-build-isolation`.
- Test that commit hashes are added to dev builds (when building wheels there is an error from `python -m build`: `Successfully built deepspeed-0.16.5+1d869d1f.tar.gz and deepspeed-0.16.5+unknown-py3-none-any.whl`).
- Add `pyproject.toml` to path triggers, similar to `setup.py`.
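
The dev-build item concerns the local version suffix visible in the artifact names (`+1d869d1f` versus `+unknown`). A minimal sketch of how such a suffix can be composed, assuming the hash comes from `git rev-parse --short HEAD` and falls back to `unknown` when git cannot provide one. The helper names `short_git_hash` and `dev_version` are hypothetical and are not taken from DeepSpeed's `setup.py`:

```python
import subprocess
from typing import Optional


def short_git_hash(repo_dir: str = ".") -> Optional[str]:
    """Return the short commit hash of repo_dir, or None if git cannot provide one."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--short", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # git is missing, the directory does not exist, or it is not a git checkout
        return None
    return result.stdout.strip() or None


def dev_version(base: str, git_hash: Optional[str]) -> str:
    """Append a local version label, using 'unknown' when no hash is known."""
    return f"{base}+{git_hash or 'unknown'}"


if __name__ == "__main__":
    print(dev_version("0.16.5", short_git_hash()))
```

By default `python -m build` builds the sdist first and then builds the wheel from that sdist, which contains no `.git` directory. That is one plausible way a `+unknown` label could end up on the wheel only.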

The main goal of this effort is to comply with the upcoming changes in pip 25.1, which will break editable installs. Future PRs will fully move from `setup.py` to `pyproject.toml`.

Fixes: #7031

MII equivalent PR: deepspeedai/DeepSpeed-MII#555
DS-Kernels equivalent PR: deepspeedai/DeepSpeed-Kernels#20

jeffra · 2025-02-14T04:34:39Z

@mrwyattii we just went through some of this with arctic training. If it’s helpful @loadams let’s discuss on slack a bit. There’s a ton that’s currently happening in setup.py, this could be a big lift? But I agree, needs to happen!

loadams · 2025-02-19T18:31:50Z

Edit: this is no longer correct with latest changes.

The current problem is that the logic inside setup.py aside from the call to setup() isn't run. This means that we don't append cupy into the requirements file, and tests fail as a result. This impacts other parts of the build experience, so we will need to do more work to switch to a modern build backend.

Update CUDA compute capability for cross compile according to wiki page. https://en.wikipedia.org/wiki/CUDA#GPUs_supported --------- Signed-off-by: Hongwei <hongweichen@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

bf16 with moe refresh optimizer state from bf16 ckpt will raise IndexError: list index out of range Signed-off-by: shaomin <wukon1992@gmail.com> Co-authored-by: shaomin <wukon1992@gmail.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Description This PR includes Tecorigin SDAA accelerator support. With this PR, DeepSpeed supports SDAA as backend for training tasks. --------- Signed-off-by: siqi <siqi@tecorigin.com> Co-authored-by: siqi <siqi@tecorigin.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Keeps lines within PEP 8 length limits. Enhances readability with a single, concise expression. Preserves original functionality. --------- Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai> Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Max Kovalenko <mkovalenko@habana.ai> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: shaomin <wukon1992@gmail.com> Signed-off-by: Stas Bekman <stas@stason.org> Signed-off-by: siqi <siqi@tecorigin.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Signed-off-by: Wei Wu <wuwei211x@gmail.com> Signed-off-by: ShellyNR <shelly.nahir@live.biu.ac.il> Signed-off-by: Lai, Yejing <yejing.lai@intel.com> Signed-off-by: Hongwei <hongweichen@microsoft.com> Signed-off-by: Liang Cheng <astarxp777@gmail.com> Signed-off-by: A-transformer <astarxp777@gmail.com> Co-authored-by: Raza Sikander <srsikander@habana.ai> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Max Kovalenko <mkovalenko@habana.ai> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: wukong1992 <wukong1992@users.noreply.github.com> Co-authored-by: shaomin <wukon1992@gmail.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Co-authored-by: loadams <loadams@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: siqi654321 <siqi202311@163.com> Co-authored-by: siqi <siqi@tecorigin.com> Co-authored-by: Wei Wu <45323446+U-rara@users.noreply.github.com> Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com> Co-authored-by: Shelly Nahir <73890534+ShellyNR@users.noreply.github.com> Co-authored-by: snahir <snahir@habana.ai> Co-authored-by: Yejing-Lai <yejing.lai@intel.com> Co-authored-by: A-transformer <astarxp777@gmail.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Resolves #6997 This PR conditionally quotes environment variable values—only wrapping those containing special characters (like parentheses) that could trigger bash errors. Safe values remain unquoted. --------- Signed-off-by: Saurabh <saurabhkoshatwar1996@gmail.com> Signed-off-by: Saurabh Koshatwar <saurabhkoshatwar1996@gmail.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Correct the BACKWARD_PREFETCH_SUBMIT mismatch FORWARD_PREFETCH_SUBMIT = 'forward_prefetch_submit' --------- Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai> Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Max Kovalenko <mkovalenko@habana.ai> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: shaomin <wukon1992@gmail.com> Signed-off-by: Stas Bekman <stas@stason.org> Signed-off-by: siqi <siqi@tecorigin.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Signed-off-by: Wei Wu <wuwei211x@gmail.com> Signed-off-by: ShellyNR <shelly.nahir@live.biu.ac.il> Signed-off-by: Lai, Yejing <yejing.lai@intel.com> Signed-off-by: Hongwei <hongweichen@microsoft.com> Signed-off-by: A-transformer <astarxp777@gmail.com> Co-authored-by: Raza Sikander <srsikander@habana.ai> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Max Kovalenko <mkovalenko@habana.ai> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: wukong1992 <wukong1992@users.noreply.github.com> Co-authored-by: shaomin <wukon1992@gmail.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Co-authored-by: loadams <loadams@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: siqi654321 <siqi202311@163.com> Co-authored-by: siqi <siqi@tecorigin.com> Co-authored-by: Wei Wu <45323446+U-rara@users.noreply.github.com> Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com> Co-authored-by: Shelly Nahir <73890534+ShellyNR@users.noreply.github.com> Co-authored-by: snahir <snahir@habana.ai> Co-authored-by: Yejing-Lai <yejing.lai@intel.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

…Tests (#7146)

Enhances CI/nightly coverage for the Gaudi2 device. Tests added:
- test_autotp_training.py
- test_ulysses.py
- test_linear::TestLoRALinear and test_linear::TestBasicLinear
- test_ctx::TestEngine

These provide coverage for the model parallelism and linear features. The tests are stable: 10/10 runs pass. The new tests are expected to increase CI time by 3-4 minutes and nightly job time by 15 minutes.

Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai> Signed-off-by: Logan Adams <loadams@microsoft.com>

@tjruwase Don't merge yet, I will leave a comment when it is ready for merge. Thank you. --------- Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: inkcherry <mingzhi.liu@intel.com> Signed-off-by: Logan Adams <loadams@microsoft.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <loadams@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

This PR is a continuation of the efforts to improve DeepSpeed performance when using PyTorch compile.

Dynamo breaks the graph because `flat_tensor.requires_grad = False`:
* Is a side-effecting operation on tensor metadata
* Occurs in a context where Dynamo expects static tensor properties for tracing

`flat_tensor.requires_grad` is redundant and can be safely removed because:
* The `_allgather_params()` function is already decorated with `@torch.no_grad()`, which ensures the desired property
* `flat_tensor` is created using `torch.empty()`, which sets `requires_grad=False` by default.

--------- Signed-off-by: Max Kovalenko <mkovalenko@habana.ai> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

ZeRO3 requires explicit cleaning in tests when reusing the environment. This PR adds `destroy` calls to the tests to free memory and avoid potential errors due to memory leaks. Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

agronholm · 2025-04-09T15:50:18Z

pyproject.toml

+    "setuptools>=64",
+    "torch",
+    "wheel"


If you depend on setuptools 70.1 or later, you won't need wheel.

Suggested change

-    "setuptools>=64",
-    "torch",
-    "wheel"
+    "setuptools>=70.1",
+    "torch"
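
The review comment amounts to a version threshold. A rough sketch of that rule, assuming plain dotted numeric versions (no pre-release or local suffixes). The helpers `parse_release` and `wheel_needed` are hypothetical and are not part of setuptools or pip:

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn a plain dotted version such as '70.1.0' into (70, 1, 0)."""
    return tuple(int(part) for part in version.split("."))


def wheel_needed(setuptools_version: str) -> bool:
    """Per the review comment, 'wheel' is only a build requirement below setuptools 70.1."""
    release = parse_release(setuptools_version)
    # Pad so that '70' compares as (70, 0) rather than as a shorter tuple.
    padded = release + (0,) * max(0, 2 - len(release))
    return padded < (70, 1)


if __name__ == "__main__":
    for candidate in ("64.0.0", "70.0", "70.1", "75.8.0"):
        print(candidate, "-> needs wheel:", wheel_needed(candidate))
```

For real requirement strings, a full version parser is a safer choice than this tuple comparison, since it also handles pre-releases and other suffixes.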

stas00 reviewed Feb 14, 2025

loadams changed the title ~~Add pyproject.toml~~ Add pyproject.toml with legacy build backend to keep most logic in setup.py Feb 19, 2025

loadams marked this pull request as ready for review February 25, 2025 18:37

loadams requested review from jeffra, sfc-gh-mwyatt and tjruwase February 25, 2025 19:44

rraminen and others added 22 commits March 25, 2025 08:51

[ROCm] Enable fp_quantizer on ROCm (#7027)

230f479

This change is required to successfully build fp_quantizer extension on ROCm. --------- Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Add pyproject.toml

130c11c

Signed-off-by: Logan Adams <loadams@microsoft.com>

Formatting fix

b2e16cb

Signed-off-by: Logan Adams <loadams@microsoft.com>

add gds chinese blog (#7034)

72bfb70

cc @tjruwase @jomayeri --------- Co-authored-by: root <root@ftqtmec25000000.taxzvufipdhelhupulxcbvr15f.ux.internal.cloudapp.net> Signed-off-by: Logan Adams <loadams@microsoft.com>

Add chinese blog for deepspeed windows, and fix format (#7035)

48c02ad

Fix #7029 - Add Chinese blog for deepspeed windows - Fix format in README.md Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

AIO on ROCM (#7023)

4996cae

Adding compile support for AIO library on AMD GPUs. --------- Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Update setuptools min requirement

52709fa

Signed-off-by: Logan Adams <loadams@microsoft.com>

Switch build to legacy

9eb0618

Signed-off-by: Logan Adams <loadams@microsoft.com>

Add no-build isolation

36ce373

Signed-off-by: Logan Adams <loadams@microsoft.com>

Control trace cache warnings (#7039)

a0ff11a

Make trace cache warnings configurable, and disabled by default. Fix #6985, #4081, #5033, #5006, #5662 --------- Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Update setup.py handling of ROCm cupy (#7051)

5326873

Signed-off-by: Logan Adams <loadams@microsoft.com>

nv-ds-chat breaks with latest transformers (#7052)

c22be1a

Signed-off-by: Logan Adams <loadams@microsoft.com>

Test with non legacy backend

2873a11

Signed-off-by: Logan Adams <loadams@microsoft.com>

Need legacy backend to execute setup.py, though it executes it differently, so we aren't seeing cupy installed.

4e88463

Signed-off-by: Logan Adams <loadams@microsoft.com>

Update to actually use legacy backend

6d837ca

Signed-off-by: Logan Adams <loadams@microsoft.com>

Rename aio_thread_count to intra_op_parallelism (#7056)

5b2f713

Propagate API change. Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

add autoTP training zero2 tests (#7049)

c0f4235

- add zero2 test - minor fix with transformer version update & ds master merge. Signed-off-by: inkcherry <mingzhi.liu@intel.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Update version.txt after 0.16.4 release (#7063)

41718ad

**Auto-generated PR to update version.txt after a DeepSpeed release** Released version - 0.16.4 Author - @loadams Co-authored-by: loadams <loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

fix an outdated doc wrt CUDA_VISIBLE_DEVICES (#7058)

c83ade6

@jeffra and I fixed this many years ago, so bringing this doc to a correct state. --------- Signed-off-by: Stas Bekman <stas@stason.org> Signed-off-by: Logan Adams <loadams@microsoft.com>

A-transformer and others added 12 commits March 25, 2025 08:51

Unpin transformers version for most workflows (#7139)

6de20f6

Unpin transformers version for all workflows except `nv-torch-latest-v100` as this still has a tolerance issue with some quantization tests. Signed-off-by: Logan Adams <loadams@microsoft.com>

Update container version that runs on A6000 tests. (#7153)

4ca7ba5

Changes from huggingface/transformers#36654 in transformers cause issues with the torch 2.5 version we were using. This just updated us to use a newer version. --------- Signed-off-by: Logan Adams <loadams@microsoft.com>

fix leak of z3 buffer

2db922f

Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

[NFC] Typo fix in SP layer. (#7152)

7b7ac9e

Signed-off-by: c8ef <c8ef@outlook.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

Link AutoTP blog in the front page (#7167)

31ec2b7

Signed-off-by: Hongwei <hongweichen@microsoft.com> Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com> Signed-off-by: Logan Adams <loadams@microsoft.com>

loadams force-pushed the loadams/pyproject-toml branch from 705edb3 to 31ec2b7 Compare March 25, 2025 15:51

loadams requested review from tohtana, jomayeri and hwchen2017 as code owners March 25, 2025 15:51

loadams and others added 2 commits March 25, 2025 08:54

Merge branch 'master' into loadams/pyproject-toml

e40df22

Remove unneeded requires in build system declaration

4c32a9d

Signed-off-by: Logan Adams <loadams@microsoft.com>

sfc-gh-mwyatt approved these changes Mar 25, 2025

loadams and others added 6 commits March 25, 2025 13:14

Add build to the pyproject

c4b24fd

Add no torch build triggers

b330b4e

Signed-off-by: Logan Adams <loadams@microsoft.com>

Remove no build isolation from nv-torch-latest

3601c29

Merge branch 'master' into loadams/pyproject-toml

e0d9ba4

Merge branch 'master' into loadams/pyproject-toml

42e42a0

Merge branch 'master' into loadams/pyproject-toml

3c94f51

loadams mentioned this pull request Apr 9, 2025

Support complicated use cases with TiedLayerSpec #7208

Merged

agronholm reviewed Apr 9, 2025

loadams added 3 commits April 9, 2025 12:21

Merge branch 'master' into loadams/pyproject-toml

221f2be

Merge branch 'master' into loadams/pyproject-toml

dfcc24d

Merge branch 'master' into loadams/pyproject-toml

b4ed94a
