Added __HIP_PLATFORM_AMD__=1 #4570

rraminen · 2023-10-26T03:44:30Z

This PR is required in addition to #4539 to define `__HIP_PLATFORM_AMD__` on ROCm.

It is needed to build DeepSpeed extensions in Docker images whose PyTorch was built before pytorch/pytorch#111975.

cc: @jeffdaily @jithunnair-amd
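As a rough illustration of the change described above, the sketch below shows one way a build helper could append the `-D__HIP_PLATFORM_AMD__=1` define to the C++ compiler flags when targeting ROCm. The function name `add_hip_platform_define` and its `is_rocm` parameter are hypothetical and not taken from the DeepSpeed source; only the macro name and value come from this PR's title.

```python
from typing import List

# Macro name and value as given in the PR title.
HIP_PLATFORM_DEFINE = "-D__HIP_PLATFORM_AMD__=1"


def add_hip_platform_define(cxx_args: List[str], is_rocm: bool) -> List[str]:
    """Return a copy of cxx_args with the HIP platform define added on ROCm.

    Hypothetical helper for illustration only; it is not DeepSpeed's
    actual builder code.
    """
    args = list(cxx_args)  # copy so the caller's list is not mutated
    if is_rocm and HIP_PLATFORM_DEFINE not in args:
        args.append(HIP_PLATFORM_DEFINE)
    return args


if __name__ == "__main__":
    print(add_hip_platform_define(["-O3", "-std=c++17"], is_rocm=True))
    print(add_hip_platform_define(["-O3", "-std=c++17"], is_rocm=False))
```

Returning a new list and skipping the flag when it is already present keeps the helper safe to call more than once, which matters if a newer PyTorch already supplies the define.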

* Remove PP Grad Tail Check (microsoft#2538)

  * Only communicate grad tail if it exists

    Co-authored-by: Dashiell Stander <dash.stander@gmail.com>

  * Revert previous patch and just always send the grad tail

  * Formatting

  ---------

  Co-authored-by: Dashiell Stander <dash.stander@gmail.com>
  Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
  Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>

* Added __HIP_PLATFORM_AMD__=1 (microsoft#4570)

* fix multiple definition while building evoformer (microsoft#4556)

  The current builder for evoformer uses the same name for `attention.cpp` and `attention.cu`, leading to the same intermediate filename `attention.o`:

  ```shell
  march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/zejianxie/.conda/envs/dll/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/zejianxie/.conda/envs/dll/include build/temp.linux-x86_64-cpython-310/csrc/deepspeed4science/evoformer_attn/attention.o build/temp.linux-x86_64-cpython-310/csrc/deepspeed4science/evoformer_attn/attention.o build/temp.linux-x86_64-cpython-310/csrc/deepspeed4science/evoformer_attn/attention_back.o
  ```

  and

  ```
  `attention_impl(at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&)':
  tmpxft_0012bef1_00000000-6_attention.compute_86.cudafe1.cpp:(.text+0x330): multiple definition of `attention_impl(at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&, at::Tensor&)';
  build/temp.linux-x86_64-cpython-310/csrc/deepspeed4science/evoformer_attn/attention.o:tmpxft_0012bef1_00000000-6_attention.compute_86.cudafe1.cpp:(.text+0x330): first defined here
  /home/zejianxie/.conda/envs/dll/bin/../lib/gcc/x86_64-conda-linux-gnu/11.4.0/../../../../x86_64-conda-linux-gnu/bin/ld: build/temp.linux-x86_64-cpython-310/csrc/deepspeed4science/evoformer_attn/attention.o:(.bss+0x0): multiple definition of `torch::autograd::(anonymous namespace)::graph_task_id';
  build/temp.linux-x86_64-cpython-310/csrc/deepspeed4science/evoformer_attn/attention.o:(.bss+0x0): first defined here
  ```

  I use the following to reproduce the issue and confirm that my fix works:

  ```
  git clone https://github.com/NVIDIA/cutlass --depth 1
  CUTLASS_PATH=$PWD/cutlass DS_BUILD_EVOFORMER_ATTN=1 pip install ./DeepSpeed --global-option="build_ext"
  ```

  ![image](https://github.com/microsoft/DeepSpeed/assets/41792945/9e406b37-330c-431c-8bf9-6be378dee4ff)

  Co-authored-by: Conglong Li <conglong.li@gmail.com>

* Update ccl.py

---------

Co-authored-by: Quentin Anthony <qganthony@yahoo.com>
Co-authored-by: Dashiell Stander <dash.stander@gmail.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Ramya Ramineni <62723901+rraminen@users.noreply.github.com>
Co-authored-by: Xie Zejian <xiezej@gmail.com>
Co-authored-by: Conglong Li <conglong.li@gmail.com>
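The evoformer failure quoted in the commit message above comes from two sources mapping to one object file. The sketch below shows how such collisions could be detected before compiling. It assumes, consistent with the log but not confirmed by it, that the object path is formed by swapping the source suffix for `.o`. The helper name `find_object_collisions` is hypothetical, and the `.cu` extension of `attention_back` is a guess made only for the example.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def find_object_collisions(sources: List[str]) -> Dict[str, List[str]]:
    """Map each colliding object path to the sources that would produce it.

    Assumes the object path is the source path with its suffix replaced
    by ".o" (an assumption made for this illustration).
    """
    by_object = defaultdict(list)
    for src in sources:
        obj = str(PurePosixPath(src).with_suffix(".o"))
        by_object[obj].append(src)
    # Keep only object paths claimed by more than one source file.
    return {obj: srcs for obj, srcs in by_object.items() if len(srcs) > 1}


if __name__ == "__main__":
    base = "csrc/deepspeed4science/evoformer_attn"
    sources = [
        f"{base}/attention.cpp",
        f"{base}/attention.cu",
        f"{base}/attention_back.cu",  # extension assumed for the example
    ]
    for obj, srcs in find_object_collisions(sources).items():
        print(f"{obj} <- {srcs}")
```

Renaming one of the two colliding sources, or compiling them into separate directories, would remove the duplicate `attention.o` entry on the link line.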

@jeffdaily

This PR is required in addition to #4539 to define `__HIP_PLATFORM_AMD__` on ROCm. It is required for the DeepSpeed non-JIT build; the JIT build is covered by #4570.

It is needed to build DeepSpeed extensions in Docker images whose PyTorch was built before pytorch/pytorch#111975.

cc: @jeffdaily @jithunnair-amd

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>

Added __HIP_PLATFORM_AMD__=1

30a83c5

rraminen requested review from jeffra, RezaYazdaniAminabadi and cmikeh2 as code owners October 26, 2023 03:44

jeffdaily-ms approved these changes Oct 26, 2023

loadams approved these changes Oct 26, 2023

loadams enabled auto-merge October 26, 2023 16:09

loadams added this pull request to the merge queue Oct 26, 2023

Merged via the queue into microsoft:master with commit 764f5b0 Oct 26, 2023
15 checks passed

rraminen mentioned this pull request Oct 30, 2023

Added __HIP_PLATFORM_AMD__=1 for non JIT build #4585

Merged

baodii pushed a commit to baodii/DeepSpeed that referenced this pull request Nov 7, 2023

Added __HIP_PLATFORM_AMD__=1 (microsoft#4570)

72a0401

mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024

Added __HIP_PLATFORM_AMD__=1 (microsoft#4570)

5996e8d
