Hello, I am trying to build the DeepSpeed ops from source on an AMD MI300A with ROCm 6.3.3.
DeepSpeed environment variables (sourced before building; I am using the amdclang compiler):
# Point to ROCm
export ROCM_PATH=/opt/rocm-6.3.3/
export cc=amdclang
export CC=amdclang++
export PYTORCH_ROCM_ARCH="gfx942"
export GPU_ARCHS="gfx942"
# Configure DeepSpeed Build Flags
export DS_BUILD_OPS=1
export DS_BUILD_CUTLASS_OPS=0
export DS_BUILD_EVOFORMER_ATTN=0
export DS_BUILD_FP_QUANTIZER=0
export DS_BUILD_GDS=0
export DS_BUILD_RAGGED_DEVICE_OPS=0
export DS_BUILD_SPARSE_ATTN=0
export DS_BUILD_DEEP_COMPILE=0
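One detail worth double-checking in the variables above: build tools conventionally read the C compiler from `CC` and the C++ compiler from `CXX`, while a lowercase `cc` is usually ignored. The snippet below is a minimal sketch of the conventional spelling, assuming `amdclang` and `amdclang++` are on `PATH`. It is not confirmed to affect the error reported further down, since the failing step in the log is run through `hipcc`.

```shell
# Conventional variable names: CC for the C compiler, CXX for the C++ compiler.
# Assumption: amdclang and amdclang++ from the ROCm install are on PATH.
export CC=amdclang
export CXX=amdclang++

# Report what the build would pick up, without aborting if a tool is missing.
for tool in "$CC" "$CXX"; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found: $tool -> $(command -v "$tool")"
    else
        echo "not on PATH: $tool"
    fi
done
```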
Commands for building:
cd /path/to/Deepspeed/
source venv/bin/activate
pip install --no-build-isolation -e .
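Because the compiler output is very long, it may help to save the full build log and then list only the distinct error messages. The sketch below assumes the log was saved to a file named `build.log` (a hypothetical name); the helper `unique_errors` is also hypothetical and only filters text, it does not run the build.

```shell
# Print each distinct "error: ..." message found in a build log, once.
# Usage: unique_errors <logfile>
unique_errors() {
    # -o keeps only the matched part, dropping the file:line:column prefix
    # so that the same message from different locations collapses to one line.
    grep -o 'error: .*' "$1" | sort -u
}

# Example: the log could be captured with
#   pip install --no-build-isolation -e . 2>&1 | tee build.log
# and summarized afterwards if the file exists.
if [ -f build.log ]; then
    unique_errors build.log
fi
```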
Error message:
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/c10/util/Deprecated.h:24:43: note: expanded from macro 'C10_DEPRECATED_MESSAGE'
24 | #define C10_DEPRECATED_MESSAGE(message) [[deprecated(message)]]
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:437:27: warning: 'data' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
437 | g.data<scalar_t>(),
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:247:7: note: 'data' has been explicitly marked deprecated here
247 | T * data() const {
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:437:27: warning: 'data<float>' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
437 | g.data<scalar_t>(),
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:246:3: note: 'data<float>' has been explicitly marked deprecated here
246 | C10_DEPRECATED_MESSAGE("Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead.")
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/c10/util/Deprecated.h:24:43: note: expanded from macro 'C10_DEPRECATED_MESSAGE'
24 | #define C10_DEPRECATED_MESSAGE(message) [[deprecated(message)]]
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:446:32: warning: 'data' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
446 | w_l2_i.data<scalar_t>(),
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:247:7: note: 'data' has been explicitly marked deprecated here
247 | T * data() const {
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:446:32: warning: 'data<float>' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
446 | w_l2_i.data<scalar_t>(),
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:246:3: note: 'data<float>' has been explicitly marked deprecated here
246 | C10_DEPRECATED_MESSAGE("Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead.")
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/c10/util/Deprecated.h:24:43: note: expanded from macro 'C10_DEPRECATED_MESSAGE'
24 | #define C10_DEPRECATED_MESSAGE(message) [[deprecated(message)]]
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:447:32: warning: 'data' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
447 | u_l2_i.data<scalar_t>());
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:247:7: note: 'data' has been explicitly marked deprecated here
247 | T * data() const {
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:447:32: warning: 'data<float>' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
447 | u_l2_i.data<scalar_t>());
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:246:3: note: 'data<float>' has been explicitly marked deprecated here
246 | C10_DEPRECATED_MESSAGE("Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead.")
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/c10/util/Deprecated.h:24:43: note: expanded from macro 'C10_DEPRECATED_MESSAGE'
24 | #define C10_DEPRECATED_MESSAGE(message) [[deprecated(message)]]
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:451:44: warning: 'data' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
451 | num_blocks, w_l2_i.data<scalar_t>(), u_l2_i.data<scalar_t>());
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:247:7: note: 'data' has been explicitly marked deprecated here
247 | T * data() const {
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:451:44: warning: 'data<float>' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
451 | num_blocks, w_l2_i.data<scalar_t>(), u_l2_i.data<scalar_t>());
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:246:3: note: 'data<float>' has been explicitly [[deprecated(message)]]
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:471:32: warning: 'data' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
471 | u_l2_i.data<scalar_t>(),
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:247:7: note: 'data' has been explicitly marked deprecated here
247 | T * data() const {
| ^
csrc/lamb/fused_lamb_hip_kernel.hip:471:32: warning: 'data<float>' is deprecated: Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead. [-Wdeprecated-declarations]
471 | u_l2_i.data<scalar_t>(),
| ^
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/ATen/core/TensorBody.h:246:3: note: 'data<float>' has been explicitly marked deprecated here
246 | C10_DEPRECATED_MESSAGE("Tensor.data<T>() is deprecated. Please use Tensor.data_ptr<T>() instead.")
| ^
/lustrds_benchmarks/DeepSpeed/csrc/transformer/inference/includes -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/csrc/includes -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-DBF16_AVAILABLE -DROCM_WAVEFRONT_SIZE=64 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1016\" -DTORCH_EXTENSION_NAME=kernelsinference_core_ops -D_GLIBCXX_USE_CXX11_ABI=1 --offload-arch=gfx942 -fno-gpu-rdc
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:367:12: error: use of undeclared identifier '__ll2bfloat16_rn'; did you mean '__ll2float_rn'?
367 | return __ll2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~
| __ll2float_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:576:32: note: '__ll2float_rn' declared here
576 | __device__ static inline float __ll2float_rn(long long int x) { return (float)x; }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:372:12: error: use of undeclared identifier '__int2bfloat16_rn'; did you mean '__int2float_rn'?
372 | return __int2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~
| __int2float_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:545:32: note: '__int2float_rn' declared here
545 | __device__ static inline float __int2float_rn(int x) { return (float)x; }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:377:12: error: use of undeclared identifier '__short2bfloat16_rn'; did you mean '__float22bfloat162_rn'?
377 | return __short2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~~~
| __float22bfloat162_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_bf16.h:574:45: note: '__float22bfloat162_rn' declared here
574 | __BF16_HOST_DEVICE_STATIC__ __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:377:32: error: no viable conversion from 'int16_t' (aka 'short') to 'float2' (aka 'HIP_vector_type<float, 2>')
377 | return __short2bfloat16_rn(val);
| ^~~
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:471:9: note: candidate constructor not viable: no known conversion from 'int16_t' (aka 'short') to 'const HIP_vector_type<float, 2> &' for 1st argument
471 | HIP_vector_type(const HIP_vector_type&) = default;
| ^ ~~~~~~~~~~~~~~~~~~~~~~
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:474:9: note: candidate constructor not viable: no known conversion from 'int16_t' (aka 'short') to 'HIP_vector_type<float, 2> &&' for 1st argument
474 | HIP_vector_type(HIP_vector_type&&) = default;
| ^ ~~~~~~~~~~~~~~~~~
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:466:9: note: candidate template ignored: requirement 'sizeof...(Us) == 2U' was not satisfied [with Us = <int16_t>]
466 | HIP_vector_type(Us... xs) noexcept
| ^
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:457:9: note: explicit constructor is not a candidate
457 | HIP_vector_type(U x_) noexcept
| ^
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_bf16.h:574:80: note: passing argument to parameter 'a' here
574 | __BF16_HOST_DEVICE_STATIC__ __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:382:12: error: use of undeclared identifier '__int2bfloat16_rn'; did you mean '__int2float_rn'?
382 | return __int2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~
| __int2float_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:545:32: note: '__int2float_rn' declared here
545 | __device__ static inline float __int2float_rn(int x) { return (float)x; }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:387:12: error: use of undeclared identifier '__ull2bfloat16_rn'; did you mean '__ull2float_rn'?
387 | return __ull2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~
| __ull2float_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:629:32: note: '__ull2float_rn' declared here
629 | __device__ static inline float __ull2float_rn(unsigned long long int x) { return (float)x; }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:392:12: error: use of undeclared identifier '__uint2bfloat16_rn'; did you mean '__uint2float_rn'?
392 | return __uint2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~~
| __uint2float_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:598:32: note: '__uint2float_rn' declared here
598 | __device__ static inline float __uint2float_rn(unsigned int x) { return (float)x; }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:397:12: error: use of undeclared identifier '__ushort2bfloat16_rn'; did you mean '__float22bfloat162_rn'?
397 | return __ushort2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~~~~
| __float22bfloat162_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_bf16.h:574:45: note: '__float22bfloat162_rn' declared here
574 | __BF16_HOST_DEVICE_STATIC__ __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:397:33: error: no viable conversion from 'uint16_t' (aka 'unsigned short') to 'float2' (aka 'HIP_vector_type<float, 2>')
397 | return __ushort2bfloat16_rn(val);
| ^~~
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:471:9: note: candidate constructor not viable: no known conversion from 'uint16_t' (aka 'unsigned short') to 'const HIP_vector_type<float, 2> &' for 1st argument
471 | HIP_vector_type(const HIP_vector_type&) = default;
| ^ ~~~~~~~~~~~~~~~~~~~~~~
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:474:9: note: candidate constructor not viable: no known conversion from 'uint16_t' (aka 'unsigned short') to 'HIP_vector_type<float, 2> &&' for 1st argument
474 | HIP_vector_type(HIP_vector_type&&) = default;
| ^ ~~~~~~~~~~~~~~~~~
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:466:9: note: candidate template ignored: requirement 'sizeof...(Us) == 2U' was not satisfied [with Us = <uint16_t>]
466 | HIP_vector_type(Us... xs) noexcept
| ^
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_vector_types.h:457:9: note: explicit constructor is not a candidate
457 | HIP_vector_type(U x_) noexcept
| ^
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_bf16.h:574:80: note: passing argument to parameter 'a' here
574 | __BF16_HOST_DEVICE_STATIC__ __hip_bfloat162 __float22bfloat162_rn(const float2 a) {
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:402:12: error: use of undeclared identifier '__uint2bfloat16_rn'; did you mean '__uint2float_rn'?
402 | return __uint2bfloat16_rn(val);
| ^~~~~~~~~~~~~~~~~~
| __uint2float_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:598:32: note: '__uint2float_rn' declared here
598 | __device__ static inline float __uint2float_rn(unsigned int x) { return (float)x; }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:416:12: error: use of undeclared identifier '__float2bfloat162_rn'; did you mean '__float22bfloat162_rn'?
416 | return __float2bfloat162_rn(val);
| ^~~~~~~~~~~~~~~~~~~~
| __float22bfloat162_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_hip_bf16.h:574:45: note:
| __float2int_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:473:30: note: '__float2int_rn' declared here
473 | __device__ static inline int __float2int_rn(float x) { return (int)__ocml_rint_f32(x); }
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:556:12: error: use of undeclared identifier '__bfloat162ull_rn'; did you mean '__float2ull_rn'?
556 | return __bfloat162ull_rn(val);
| ^~~~~~~~~~~~~~~~~
| __float2ull_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:502:49: note: '__float2ull_rn' declared here
502 | __device__ static inline unsigned long long int __float2ull_rn(float x) {
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:583:12: error: use of undeclared identifier '__bfloat162uint_rn'; did you mean '__float2uint_rn'?
583 | return __bfloat162uint_rn(val);
| ^~~~~~~~~~~~~~~~~~
| __float2uint_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:491:39: note: '__float2uint_rn' declared here
491 | __device__ static inline unsigned int __float2uint_rn(float x) {
| ^
In file included from deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip:10:
/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes/conversion_utils_hip.h:610:12: error: use of undeclared identifier '__bfloat162uint_rn'; did you mean '__float2uint_rn'?
610 | return __bfloat162uint_rn(val);
| ^~~~~~~~~~~~~~~~~~
| __float2uint_rn
/opt/rocm-6.3.3/lib/llvm/bin/../../../include/hip/amd_detail/amd_device_functions.h:491:39: note: '__float2uint_rn' declared here
491 | __device__ static inline unsigned int __float2uint_rn(float x) {
| ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated when compiling for gfx942.
failed to execute:/opt/rocm-6.3.3/lib/llvm/bin/clang++ --offload-arch=gfx942 -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/core_ops/bias_activations -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/core_ops/blas_kernels -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/core_ops/cuda_layer_norm -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/core_ops/cuda_rms_norm -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/core_ops/gated_activations -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/core_ops/cuda_linear -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/DeepSpeed/deepspeed/inference/v2/kernels/includes -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/lib/python3.11/site-packages/torch/include/THH -I/opt/rocm-6.3.3/include -I/lustre/hpe/ws13/ws13.a/ws/hpchpate-ds_benchmarks/venv-deepspeed/include -I/opt/cray/pe/python/3.11.7/include/python3.11 -c -x hip deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.hip -o "/tmp/tmpvj8fusoa.build-temp/deepspeed/inference/v2/kernels/core_ops/bias_activations/bias_activation_hip.o" -fPIC -D__HIP_PLATFORM_AMD__=1 -DUSE_ROCM=1 -DHIPBLAS_V2 -DCUDA_HAS_FP16=1 -D__HIP_NO_HALF_OPERATORS__=1 -D__HIP_NO_HALF_CONVERSIONS__=1 -O3 -std=c++17 -U__HIP_NO_HALF_OPERATORS__ -U__HIP_NO_HALF_CONVERSIONS__ -U__HIP_NO_HALF2_OPERATORS__ -DROCM_VERSION_MAJOR=6 -DROCM_VERSION_MINOR=3 -DBF16_AVAILABLE -DROCM_WAVEFRONT_SIZE=64 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" 
-DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1016\" -DTORCH_EXTENSION_NAME=kernelsinference_core_ops -D_GLIBCXX_USE_CXX11_ABI=1 -fno-gpu-rdc
error: command '/opt/rocm-6.3.3/bin/hipcc' failed with exit code 1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building editable for deepspeed
Failed to build deepspeed
[notice] A new release of pip is available: 25.3 -> 26.0.1
[notice] To update, run: pip install --upgrade pip
error: failed-wheel-build-for-install
× Failed to build installable wheels for some pyproject.toml based projects
╰─> deepspeed
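The errors suggest that several bfloat16 conversion intrinsics used in conversion_utils_hip.h (for example __ll2bfloat16_rn, __int2bfloat16_rn and __bfloat162uint_rn) are either missing from the HIP headers in ROCm 6.3.3 or follow a different naming convention there. Could you please look into this and suggest a possible solution?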
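To check whether the identifiers from the error messages are declared at all in the installed HIP bf16 header, something like the following could be run. This is a sketch: the header path is the one shown in the log (with the `../` segments resolved), and the helper name `check_intrinsics` is hypothetical.

```shell
# Report which of the given identifiers appear in a header file.
# Usage: check_intrinsics <header> <name>...
check_intrinsics() {
    header=$1
    shift
    for name in "$@"; do
        # -w matches whole words only, so a longer identifier that merely
        # contains the name does not count as a hit.
        if grep -qw -- "$name" "$header" 2>/dev/null; then
            echo "present: $name"
        else
            echo "missing: $name"
        fi
    done
}

# Header path as it appears in the error log.
HDR=/opt/rocm-6.3.3/include/hip/amd_detail/amd_hip_bf16.h
check_intrinsics "$HDR" \
    __ll2bfloat16_rn __int2bfloat16_rn __short2bfloat16_rn \
    __ull2bfloat16_rn __uint2bfloat16_rn __ushort2bfloat16_rn \
    __bfloat162ull_rn __bfloat162uint_rn
```

If the names are reported as missing, one possible stopgap is to skip pre-building the extension that fails (`kernelsinference_core_ops` in the log) in the same way the other `DS_BUILD_*` flags are set to 0. I assume the flag would be `DS_BUILD_INFERENCE_CORE_OPS=0`, but I have not verified this on this system.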