BFloat16 support for torch.sort #58196
template <>
struct cub::FpLimits<c10::BFloat16>
{
  static __host__ __device__ __forceinline__ c10::BFloat16 Max() {
    unsigned short max_word = 0x7F7F;
    return reinterpret_cast<c10::BFloat16&>(max_word);
  }

  static __host__ __device__ __forceinline__ c10::BFloat16 Lowest() {
    unsigned short lowest_word = 0xFF7F;
    return reinterpret_cast<c10::BFloat16&>(lowest_word);
  }
};

template <> struct cub::NumericTraits<c10::BFloat16>: cub::BaseTraits<cub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
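For readers wondering where the constants 0x7F7F and 0xFF7F come from: a bfloat16 value is the upper 16 bits of an IEEE-754 binary32, so widening the pattern by 16 bits gives the float it represents. The following host-only sketch is not part of the PR; the helper names `bf16_bits_to_float` and `check_bf16_limits` are hypothetical and exist only for illustration. It checks that 0x7F7F is the largest finite bfloat16, that 0xFF7F is its negation, and that the next pattern up (0x7F80) is already infinity.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not from the PR): widen a raw bfloat16 bit pattern
// to float by placing it in the upper 16 bits of a 32-bit word.
static float bf16_bits_to_float(std::uint16_t bits) {
    const std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
    float out;
    std::memcpy(&out, &wide, sizeof(out));  // bit copy, avoids aliasing issues
    return out;
}

// Self-check for the two bit patterns used in the specialization.
static void check_bf16_limits() {
    const float max_val = bf16_bits_to_float(0x7F7F);
    const float lowest_val = bf16_bits_to_float(0xFF7F);

    // 0x7F7F: sign 0, exponent 0xFE, mantissa 0x7F, i.e. (2 - 2^-7) * 2^127.
    assert(std::isfinite(max_val));
    assert(max_val == std::ldexp(1.9921875f, 127));

    // 0xFF7F is the same magnitude with the sign bit set.
    assert(lowest_val == -max_val);

    // One step above the maximum is the infinity encoding.
    assert(bf16_bits_to_float(0x7F80) == std::numeric_limits<float>::infinity());
}
```

Calling `check_bf16_limits()` from any host program runs the assertions. The sketch uses `std::memcpy` rather than the `reinterpret_cast` from the diff only to stay within standard C++ on the host.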
@jeffdaily Is there any way to make this hack work on ROCm? Looks like in https://github.com/ROCmSoftwarePlatform/cub-hip/blob/hip_port_1.7.4/cub/util_type.cuh#L1102-L1194, there is no specialization for __half, so I think specializing for c10::BFloat16 won't work either.
OK, I was looking at an obsolete repo. The new repo does have __half specialization. But still, ROCm CI is failing:
May 13 02:15:01 In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/Randperm.hip:6:
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:89:18: error: explicit specialization of non-template struct 'FpLimits'
May 13 02:15:01 struct ::hipcub::FpLimits<c10::BFloat16>
May 13 02:15:01 ^ ~~~~~~~~~~~~~~~
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:89:18: error: no struct named 'FpLimits' in namespace 'hipcub'
May 13 02:15:01 struct ::hipcub::FpLimits<c10::BFloat16>
May 13 02:15:01 ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:89:18: error: cannot define or redeclare 'FpLimits' here because namespace 'detail' does not enclose namespace 'hipcub'
May 13 02:15:01 struct ::hipcub::FpLimits<c10::BFloat16>
May 13 02:15:01 ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:30: error: explicit specialization of non-template struct 'NumericTraits'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01 ^ ~~~~~~~~~~~~~~~
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:30: error: no struct named 'NumericTraits' in namespace 'hipcub'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01 ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:30: error: cannot define or redeclare 'NumericTraits' here because namespace 'detail' does not enclose namespace 'hipcub'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01 ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:70: error: unknown template name 'BaseTraits'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01 ^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:91: error: no member named 'FLOATING_POINT' in namespace 'hipcub'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01 ~~~~~~~~~~^
May 13 02:15:01 8 errors generated when compiling for gfx900.
May 13 02:15:01 CMake Error at torch_hip_generated_Randperm.hip.o.cmake:192 (message):
May 13 02:15:01 Error generating file
May 13 02:15:01 /var/lib/jenkins/workspace/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_Randperm.hip.o
May 13 02:15:01
@zasdfgbnm, can you please let us know how the compilation problem was fixed for the ROCm build?
It's not solved; bfloat16 is not compiled for ROCm.
Okay, that makes sense. It's okay to skip bfloat16 radix sort for ROCm. We'll look into adding bf16 support for radix sort into hipcub/rocprim.
Internal issue to track adding bf16 support for radix sort into hipcub/rocprim: SWDEV-288399
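As background on why radix sort needs per-type traits at all: a radix sort orders unsigned bit patterns, so floating-point keys are first mapped to unsigned keys whose integer order matches numeric order. The sketch below illustrates the usual sign-flip transform on 16-bit bfloat16 patterns, under the assumption that this is what the FLOATING_POINT category in the traits amounts to; it is not code from this PR, hipcub, or rocprim, and every function name in it is hypothetical. `std::sort` stands in for the radix passes, and NaN handling is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helpers: convert between float and raw bfloat16 bits by
// truncation. Only used here on values exactly representable in bfloat16.
static std::uint16_t float_to_bf16_bits(float value) {
    std::uint32_t wide;
    std::memcpy(&wide, &value, sizeof(wide));
    return static_cast<std::uint16_t>(wide >> 16);
}

static float bf16_bits_to_float(std::uint16_t bits) {
    const std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
    float out;
    std::memcpy(&out, &wide, sizeof(out));
    return out;
}

// Map raw bits to a key whose unsigned order matches numeric order:
// negative values get all bits flipped, non-negative values get only
// the sign bit flipped.
static std::uint16_t twiddle_in(std::uint16_t bits) {
    const std::uint16_t high_bit = 0x8000;
    const std::uint16_t mask = (bits & high_bit) ? 0xFFFF : high_bit;
    return static_cast<std::uint16_t>(bits ^ mask);
}

// Inverse of twiddle_in.
static std::uint16_t twiddle_out(std::uint16_t key) {
    const std::uint16_t high_bit = 0x8000;
    const std::uint16_t mask = (key & high_bit) ? high_bit : 0xFFFF;
    return static_cast<std::uint16_t>(key ^ mask);
}

// Sort by transformed keys; std::sort stands in for the radix passes.
static std::vector<float> sort_via_keys(const std::vector<float>& values) {
    std::vector<std::uint16_t> keys;
    keys.reserve(values.size());
    for (float v : values) {
        keys.push_back(twiddle_in(float_to_bf16_bits(v)));
    }
    std::sort(keys.begin(), keys.end());
    std::vector<float> sorted;
    sorted.reserve(keys.size());
    for (std::uint16_t k : keys) {
        sorted.push_back(bf16_bits_to_float(twiddle_out(k)));
    }
    return sorted;
}
```

With this mapping, the pattern 0xFF7F (the lowest finite value) becomes a smaller key than 0x7F7F (the largest finite value), which is the ordering a radix sort needs.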
Can you also please remove fillSliceWithIndex_kernel from SortingCommon.cuh? I don't think it's used anywhere.
Summary: Pull Request resolved: pytorch#58196 Reviewed By: anjali411 Differential Revision: D28721364 Pulled By: ngimel fbshipit-source-id: 0785f7100fb76d69da7a73022c7d2eb43c91fa6e
Summary: Related to [https://github.com/pytorch/pytorch/issues/58196](https://github.com/pytorch/pytorch/pull/58196) cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd Pull Request resolved: #71226 Reviewed By: malfet Differential Revision: D34152115 Pulled By: seemethere fbshipit-source-id: 53841c91976bdb5a0002362f22a54ec23aa2f78f
Summary: Related to [https://github.com/pytorch/pytorch/issues/58196](https://github.com/pytorch/pytorch/pull/58196) cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd Pull Request resolved: #71226 Reviewed By: malfet Differential Revision: D34152115 Pulled By: seemethere fbshipit-source-id: 53841c91976bdb5a0002362f22a54ec23aa2f78f (cherry picked from commit 963027c)
Summary: Related to [https://github.com/pytorch/pytorch/issues/58196](https://github.com/pytorch/pytorch/pull/58196) cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd Pull Request resolved: pytorch/pytorch#71226 Reviewed By: malfet Differential Revision: D34152115 Pulled By: seemethere fbshipit-source-id: 53841c91976bdb5a0002362f22a54ec23aa2f78f (cherry picked from commit 963027c7f28cf20e1c4e5722eb62b5629e735a8e)
Summary: The changes add support for dtype BF16 for sort operator in ROCm. Relates - #58196 Relanding the change - #71226 jeffdaily jithunnair-amd dllehr-amd Please review this PR. Pull Request resolved: #72854 Reviewed By: zou3519 Differential Revision: D34284313 Pulled By: malfet fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b
Summary: The changes add support for dtype BF16 for sort operator in ROCm. Relates - #58196 Relanding the change - #71226 jeffdaily jithunnair-amd dllehr-amd Please review this PR. Pull Request resolved: #72854 Reviewed By: zou3519 Differential Revision: D34284313 Pulled By: malfet fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b (cherry picked from commit e9e7e3e)
Summary: The changes add support for dtype BF16 for sort operator in ROCm. Relates - pytorch/pytorch#58196 Relanding the change - pytorch/pytorch#71226 jeffdaily jithunnair-amd dllehr-amd Please review this PR. Pull Request resolved: pytorch/pytorch#72854 Reviewed By: zou3519 Differential Revision: D34284313 Pulled By: malfet fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b (cherry picked from commit e9e7e3e0472b726ff2fd5f115962d3c835fb33db)