[CUDA] `at::native::countRadixUsingMask` misuses `__activemask` intrinsic #98157
Labels
module: cuda
Related to torch.cuda, and CUDA support in general
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🐛 Describe the bug
Within `at::native::countRadixUsingMask`, the `__activemask` intrinsic is used to conduct a warp vote (`__ballot_sync(__activemask(), vote)`) between the threads counting the distribution of the radix within the input data. Since at least CUDA 9, `__activemask` should not be used to determine which threads are on the same execution path, as detailed in this NVIDIA blog post. Because there is no guarantee that all threads on the same execution path are in the same active thread group, the distribution counts can be off, resulting in a wrong assumption of unique results in `at::native::findPattern`, which leads to a data race. This can affect the `topk`, `kthvalue`, and `median` operators, since all three rely on `at::native::radixSelect`. The issue is hard to reproduce: even though CUDA does not guarantee that threads on the same execution path operate within the same active thread group, in practice this divergence is rare.
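To illustrate the failure mode, here is a minimal sketch (not the actual PyTorch kernel; function names are hypothetical) contrasting the problematic `__ballot_sync(__activemask(), ...)` pattern with a safer variant that derives the mask from program logic:

```cuda
// Problematic: __activemask() returns the set of threads that happen to
// be converged at this exact instruction, not the set of threads that
// are logically on the same execution path. Two threads on the same
// path may sit in different active groups, so each group's ballot sees
// only a subset of the votes and the resulting counts diverge.
__device__ int countVotesBuggy(bool vote) {
    unsigned mask = __activemask();  // may under-approximate the path's threads
    unsigned ballot = __ballot_sync(mask, vote);
    return __popc(ballot);
}

// Safer: derive the membership mask from the program logic itself. If
// every thread of the warp reaches this call site (no divergence at
// the call), the full-warp mask 0xffffffff is correct; non-participation
// is expressed through the vote value rather than through divergence.
__device__ int countVotesFixed(bool participates, bool vote) {
    unsigned ballot = __ballot_sync(0xffffffff, participates && vote);
    return __popc(ballot);
}
```

With the fixed variant, all 32 threads synchronize on the ballot, so every thread observes the same complete vote count regardless of how the scheduler has grouped them.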
Example crash when using `topk`
Versions
Affects all versions since at least #17544
cc @ngimel