Bool Tensor for CUDA #18166
Conversation
```
static void apply(Tensor& dst, const Tensor& src) {
  CUDA_tensor_apply2<dst_T, bool>(
      dst, src, [] __device__(dst_T& dst_val, const bool& src_val) {
        dst_val = static_cast<dst_T>(
            static_cast<native::inter_copy_type_t<dst_T>>(src_val));
      });
}
```
Err, was this specialization necessary because __ldg doesn't have a bool overload?
That might be true, but it's pretty shocking to me. It's definitely better if we keep __ldg here. Maybe this means we should read it in as a uint8_t, cast it to bool, and then do the final static cast to destination?
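For concreteness, a minimal sketch of that suggestion (hypothetical; `load_bool_as_byte` is an illustrative name, not code from this PR) could look like:

```
#include <cstdint>

// Hypothetical sketch of the suggestion above, not the merged code:
// route the load through a uint8_t-typed __ldg (which does have an
// overload), then normalize to bool before casting to the destination
// type. The real kernel would additionally go through
// native::inter_copy_type_t as in the diff.
template <typename dst_T>
__device__ void load_bool_as_byte(dst_T& dst_val, const bool* src_ptr) {
  // Assumes bool occupies exactly one byte -- see the concern raised
  // later in this thread.
  uint8_t raw = __ldg(reinterpret_cast<const uint8_t*>(src_ptr));
  dst_val = static_cast<dst_T>(raw != 0);  // any nonzero byte reads as true
}
```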
cc @ngimel
Should we just implement __ldg for bool like we did for Half?
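For reference, a hedged sketch of what such an overload might look like, modeled on the approach the comment describes for Half (this is not code from the PR):

```
// Hypothetical: a bool overload of __ldg, analogous in spirit to the
// one provided for Half. It loads the underlying byte through the
// existing unsigned char overload and normalizes the result.
__device__ __forceinline__ bool __ldg(const bool* ptr) {
  return __ldg(reinterpret_cast<const unsigned char*>(ptr)) != 0;
}
```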
I'd probably wait until there is another case.
I don't think there's a guarantee that sizeof(bool) == sizeof(uint8_t), so you cannot read it as uint8_t?
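That is technically correct: the C++ standard leaves sizeof(bool) implementation-defined (it is 1 on the platforms CUDA targets, but that is not guaranteed). A compile-time guard, sketched below and not part of the PR, would make the byte-reinterpretation trick fail loudly rather than misread memory silently:

```
#include <cstdint>

// Hypothetical guard: if a platform ever made bool wider than a byte,
// this turns a silent misread into a compile error.
static_assert(sizeof(bool) == sizeof(uint8_t),
              "bool copy path assumes bool is exactly one byte");
```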
Patch seems basically reasonable. However, there are some major issues.
@pytorchbot retest this please

1 similar comment

@pytorchbot retest this please

@pytorchbot retest this please

1 similar comment

@pytorchbot retest this please
Summary: Pull Request resolved: pytorch/pytorch#18166
ghimport-source-id: a8e2ba2d966e49747a55701c4f6863c5e24d6f14

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18166 Bool Tensor for CUDA**
* #18165 Resolved comments from Bool Tensor for CPU PR

This PR enables bool tensor creation and some basic operations for the CUDA backend. It is part of the Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
   a) CPU [Done]
   b) CUDA [This PR]
3. Tensor Conversions.
4. Tensor Indexing.
5. Tensor Operations.
6. Back compatibility related changes.

Change: Enable bool tensors in CUDA with the following operations: torch.zeros, torch.tensor, torch.ones, torch.rand/rand_like/randint/randint_like, torch.full, torch.full_like, torch.empty, torch.empty_like.

Tested via unit tests and local scripts.

Differential Revision: D14605104

fbshipit-source-id: b7d7340a7d70edd03a109222d271e68becba762c
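As an illustration of what the change enables, here is a small sketch exercising a few of the listed factory functions on CUDA via the C++ (libtorch) API. It is illustrative only, not part of the PR, and assumes a CUDA-enabled PyTorch build with a visible GPU:

```
#include <torch/torch.h>
#include <iostream>

int main() {
  // Bool tensor factory functions on the CUDA backend, per the summary.
  auto opts = torch::dtype(torch::kBool).device(torch::kCUDA);
  auto zeros = torch::zeros({3}, opts);       // all false
  auto ones  = torch::ones({2, 2}, opts);     // all true
  auto full  = torch::full({4}, true, opts);  // filled with a bool scalar
  auto empty = torch::empty({5}, opts);       // contents uninitialized
  std::cout << zeros << "\n" << ones << "\n" << full << std::endl;
}
```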
This pull request has been merged in b832b99.
Summary: Pull Request resolved: #18505
ghimport-source-id: f3c9b92

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18505 [WIP] Added numpy conversion**
* #18166 Bool Tensor for CUDA

Differential Revision: D14646403

fbshipit-source-id: 79d39d692c778ce1981c1d35b1c33e3d93111041
Summary: Pull Request resolved: #18583
ghimport-source-id: 2b19414

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18583 Added indexing for bool tensors and bool Indices**
* #18505 Added numpy conversion
* #18166 Bool Tensor for CUDA

This PR enables bool tensor indexing and indexing with bool indices. It is part of the Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
   a) CPU [Done]
   b) CUDA [In review]
3. Tensor Conversions. [In review]
4. Tensor Indexing. [This PR]
5. Tensor Operations.
6. Back compatibility related changes.

TODO: as a follow-up, we should move the nonzero method from TH to ATen to make the code cleaner.

Change:
```
v = torch.tensor([True, False, True], dtype=torch.bool)
boolIndices = torch.tensor([True, False, False], dtype=torch.bool)
v[boolIndices]
-> tensor([True], dtype=torch.bool)

v = torch.randn(5, 7, 3)
boolIndices = torch.tensor([True, False, True, True, False], dtype=torch.bool)
v[boolIndices]
-> tensor([[[ 0.5885, -0.3322,  0.7388],
            [ 1.1182,  0.7808, -1.1492],
            [-0.7952,  0.5255, -0.0251],
            [ 0.7128,  0.8099,  1.2689],
            [-0.7018, -1.4733, -0.3732],
            [ 0.4503,  0.4986, -1.1605],
            [ 0.3348, -1.3767, -0.2976]],

           [[-2.0303, -0.4720, -0.1448],
            [-0.1914, -0.6821,  2.0061],
            [-1.0420, -0.1872, -0.3438],
            [ 1.7587, -0.4183, -0.7577],
            [ 1.0094, -0.1950, -0.2430],
            [ 0.1174,  0.3308, -0.5700],
            [ 0.1110, -0.2714,  1.3006]],

           [[-0.1946, -1.4747, -0.4650],
            [-1.0567,  1.0110, -0.2809],
            [ 0.3729, -0.5699,  0.0815],
            [-0.7733, -0.8316,  0.1674],
            [ 1.2000, -0.3745, -1.1679],
            [ 1.7105,  0.9851, -0.1907],
            [-1.1077,  0.2086, -0.0548]]])
```

Differential Revision: D14673403

fbshipit-source-id: 2b88ec2c7eb26a4f5ef64f8707fb68068d476fc9
Stack from ghstack:

This PR enables bool tensor creation and some basic operations for the CUDA backend. It is part of the Bool Tensor feature implementation work. The whole plan looks like this:
1. Storage Implementation [Done]
2. Tensor Creation.
   a) CPU [Done]
   b) CUDA [This PR]
3. Tensor Conversions.
4. Tensor Indexing.
5. Tensor Operations.
6. Back compatibility related changes.

Change:
Enable bool tensors in CUDA with the following operations: torch.zeros, torch.tensor, torch.ones, torch.rand/rand_like/randint/randint_like, torch.full, torch.full_like, torch.empty, torch.empty_like.

Tested via unit tests and local scripts.

Differential Revision: D14605104