zasdfgbnm
Collaborator

No description provided.

@facebook-github-bot
Contributor

facebook-github-bot commented May 12, 2021

💊 CI failures summary and remediations

As of commit 2b33ff1 (more details on the Dr. CI page):


  • 2/3 failures introduced in this PR
  • 1/3 broken upstream at merge base 60af6e9 on May 25 from 12:31am to 8:40pm

2 failures not recognized by patterns:

Job | Step | Action
GitHub Actions Linux CI (pytorch-linux-xenial-py3.6-gcc5.4) / test | Clean up docker images | 🔁 rerun
CircleCI docker-pytorch-linux-bionic-cuda10.2-cudnn7-py3.6-clang9 | Check if image should be built | 🔁 rerun

🚧 1 fixed upstream failure:

These were probably caused by upstream breakages that were already fixed.

Please rebase on the viable/strict branch.

If your commit is older than viable/strict, run these commands:

git fetch https://github.com/pytorch/pytorch viable/strict
git rebase FETCH_HEAD


@zasdfgbnm marked this pull request as ready for review May 13, 2021 00:42
Comment on lines +65 to +79
template <>
struct cub::FpLimits<c10::BFloat16>
{
    static __host__ __device__ __forceinline__ c10::BFloat16 Max() {
        unsigned short max_word = 0x7F7F;
        return reinterpret_cast<c10::BFloat16&>(max_word);
    }

    static __host__ __device__ __forceinline__ c10::BFloat16 Lowest() {
        unsigned short lowest_word = 0xFF7F;
        return reinterpret_cast<c10::BFloat16&>(lowest_word);
    }
};

template <> struct cub::NumericTraits<c10::BFloat16>: cub::BaseTraits<cub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
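
For readers wondering where 0x7F7F and 0xFF7F come from, here is a minimal host-only sketch. It assumes the usual bfloat16 layout (the upper 16 bits of an IEEE-754 binary32 value). The helper names `bf16_bits_to_float`, `twiddle_in`, and `check_bf16_limits` are made up for illustration and are not part of c10 or CUB. The key transform shown is the common radix-sort trick for floating-point keys; that CUB's floating-point traits do the same thing with the `unsigned short` storage type is my understanding, not something this PR states.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: decode a bfloat16 bit pattern by placing it in the
// upper 16 bits of a float32 (assumes float is IEEE-754 binary32).
inline float bf16_bits_to_float(std::uint16_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
    std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
    float out;
    std::memcpy(&out, &wide, sizeof(out));  // well-defined type punning
    return out;
}

// Hypothetical helper: order-preserving key transform of the kind radix sorts
// apply to floating-point keys. Negative keys get all bits flipped,
// non-negative keys get only the sign bit flipped, so unsigned integer
// comparison of the results matches floating-point ordering.
inline std::uint16_t twiddle_in(std::uint16_t key) {
    const std::uint16_t high_bit = 0x8000;
    const std::uint16_t mask = (key & high_bit) ? 0xFFFF : high_bit;
    return static_cast<std::uint16_t>(key ^ mask);
}

// Sanity checks on the two limit words used in the snippet above.
inline void check_bf16_limits() {
    // Largest finite bfloat16: exponent 254, mantissa 0x7F = (255/128) * 2^127.
    const float max_bf16 = std::ldexp(255.0f, 120);
    assert(bf16_bits_to_float(0x7F7F) == max_bf16);
    assert(bf16_bits_to_float(0xFF7F) == -max_bf16);
    // One step above 0x7F7F is already infinity, so 0x7F7F is the finite max.
    assert(std::isinf(bf16_bits_to_float(0x7F80)));
    // After the transform: lowest < -0 < +0 < max, compared as unsigned.
    assert(twiddle_in(0xFF7F) < twiddle_in(0x8000));
    assert(twiddle_in(0x8000) < twiddle_in(0x0000));
    assert(twiddle_in(0x0000) < twiddle_in(0x7F7F));
}
```

Calling `check_bf16_limits()` from any host program exercises the checks. The sketch uses `std::memcpy` rather than the `reinterpret_cast` in the snippet under review only to stay within standard C++ on the host; it is not a suggestion to change the PR.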
Collaborator Author

@jeffdaily Is there any way to make this hack work on ROCm? Looks like in https://github.com/ROCmSoftwarePlatform/cub-hip/blob/hip_port_1.7.4/cub/util_type.cuh#L1102-L1194, there is no specialization for __half, so I think specializing for c10::BFloat16 won't work either.

Collaborator Author

OK, I was looking at an obsolete repo. The new repo does have a __half specialization, but ROCm CI is still failing:

May 13 02:15:01 In file included from /var/lib/jenkins/workspace/aten/src/ATen/native/hip/Randperm.hip:6:
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:89:18: error: explicit specialization of non-template struct 'FpLimits'
May 13 02:15:01 struct ::hipcub::FpLimits<c10::BFloat16>
May 13 02:15:01                  ^       ~~~~~~~~~~~~~~~
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:89:18: error: no struct named 'FpLimits' in namespace 'hipcub'
May 13 02:15:01 struct ::hipcub::FpLimits<c10::BFloat16>
May 13 02:15:01        ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:89:18: error: cannot define or redeclare 'FpLimits' here because namespace 'detail' does not enclose namespace 'hipcub'
May 13 02:15:01 struct ::hipcub::FpLimits<c10::BFloat16>
May 13 02:15:01        ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:30: error: explicit specialization of non-template struct 'NumericTraits'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01                              ^            ~~~~~~~~~~~~~~~
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:30: error: no struct named 'NumericTraits' in namespace 'hipcub'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01                    ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:30: error: cannot define or redeclare 'NumericTraits' here because namespace 'detail' does not enclose namespace 'hipcub'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01                    ~~~~~~~~~~^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:70: error: unknown template name 'BaseTraits'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01                                                                      ^
May 13 02:15:01 /var/lib/jenkins/workspace/aten/src/ATen/hip/cub.cuh:102:91: error: no member named 'FLOATING_POINT' in namespace 'hipcub'
May 13 02:15:01 template <> struct ::hipcub::NumericTraits<c10::BFloat16>: ::hipcub::BaseTraits<::hipcub::FLOATING_POINT, true, false, unsigned short, c10::BFloat16> {};
May 13 02:15:01                                                                                 ~~~~~~~~~~^
May 13 02:15:01 8 errors generated when compiling for gfx900.
May 13 02:15:01 CMake Error at torch_hip_generated_Randperm.hip.o.cmake:192 (message):
May 13 02:15:01   Error generating file
May 13 02:15:01   /var/lib/jenkins/workspace/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/hip/./torch_hip_generated_Randperm.hip.o
May 13 02:15:01 

Collaborator

@zasdfgbnm, can you please let us know how the compilation problem was fixed for the ROCm build?

Collaborator

It's not solved; bfloat16 is not compiled for ROCm.
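
A minimal sketch of what "not compiled for ROCm" can look like in general: the specialization for the 16-bit type is wrapped in a preprocessor guard, so builds that define the guard macro never see it and fall back to the primary template. Everything here is hypothetical (`EXAMPLE_USE_ROCM`, `ExampleBFloat16`, `example_sort::RadixSortTraits`, `can_radix_sort`); the actual guard and macro used in this PR are not shown in the conversation.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 16-bit floating-point type; not c10::BFloat16.
struct ExampleBFloat16 { std::uint16_t bits; };

namespace example_sort {

// Primary template: a key type is not radix-sortable unless a
// specialization says otherwise.
template <typename T>
struct RadixSortTraits { static constexpr bool supported = false; };

template <>
struct RadixSortTraits<float> { static constexpr bool supported = true; };

#if !defined(EXAMPLE_USE_ROCM)
// Only provide the bfloat16 specialization where the backend sort library
// is known to accept it. EXAMPLE_USE_ROCM is a made-up macro name.
template <>
struct RadixSortTraits<ExampleBFloat16> { static constexpr bool supported = true; };
#endif

}  // namespace example_sort

// Callers can branch on this at compile time and pick a fallback path
// for unsupported key types.
template <typename T>
constexpr bool can_radix_sort() {
    return example_sort::RadixSortTraits<T>::supported;
}
```

Without `EXAMPLE_USE_ROCM` defined, `can_radix_sort<ExampleBFloat16>()` is true; defining the macro before this code makes it false while `float` stays supported.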

Collaborator

Okay, that makes sense. It's okay to skip bfloat16 radix sort for ROCm. We'll look into adding bf16 support for radix sort into hipcub/rocprim.

Collaborator

Internal issue to track adding bf16 support for radix sort into hipcub/rocprim: SWDEV-288399

@zasdfgbnm requested a review from ngimel May 13, 2021 00:45
@agolynski added the triaged label May 13, 2021
@ngimel
Collaborator

ngimel commented May 13, 2021

Can you also please remove fillSliceWithIndex_kernel from SortingCommon.cuh? I don't think it's used anywhere.

@zasdfgbnm deleted the sort-bfloat16 branch May 27, 2021 16:48
deniskokarev pushed a commit to deniskokarev/pytorch that referenced this pull request Jun 9, 2021
Summary: Pull Request resolved: pytorch#58196

Reviewed By: anjali411

Differential Revision: D28721364

Pulled By: ngimel

fbshipit-source-id: 0785f7100fb76d69da7a73022c7d2eb43c91fa6e
facebook-github-bot pushed a commit that referenced this pull request Jun 23, 2021
Summary:
Fixes #56176 via #58196

CC zasdfgbnm ngimel ptrblck

Pull Request resolved: #59977

Reviewed By: mrshenli

Differential Revision: D29315018

Pulled By: ngimel

fbshipit-source-id: 0a87e7f155a97225fc6b2ec5dc0dc38a23156b41
facebook-github-bot referenced this pull request Feb 14, 2022
Summary:
Related to https://github.com/pytorch/pytorch/pull/58196

cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH amathews-amd

Pull Request resolved: #71226

Reviewed By: malfet

Differential Revision: D34152115

Pulled By: seemethere

fbshipit-source-id: 53841c91976bdb5a0002362f22a54ec23aa2f78f
pytorchmergebot referenced this pull request Feb 14, 2022
(cherry picked from commit 963027c)
cyyever referenced this pull request in cyyever/pytorch_private Feb 15, 2022
(cherry picked from commit 963027c7f28cf20e1c4e5722eb62b5629e735a8e)
facebook-github-bot pushed a commit that referenced this pull request Feb 23, 2022
Summary:
The changes add support for dtype BF16 for sort operator in ROCm.

Relates - #58196

Relanding the change - #71226

jeffdaily jithunnair-amd dllehr-amd Please review this PR.

Pull Request resolved: #72854

Reviewed By: zou3519

Differential Revision: D34284313

Pulled By: malfet

fbshipit-source-id: abcfea84ea53874008d56416425849e990ebf15b
pytorchmergebot pushed a commit that referenced this pull request Feb 23, 2022
(cherry picked from commit e9e7e3e)
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 3, 2022
(cherry picked from commit e9e7e3e0472b726ff2fd5f115962d3c835fb33db)
Labels: cla signed, Merged, open source, triaged
7 participants