
Fix Error with torch.flip() for cuda tensors when dims=() #50325

Closed
wants to merge 4 commits

Conversation

@dheerajgattupalli (Contributor) commented Jan 9, 2021

Fixes #49982

The method flip_check_errors, which is called from the CUDA file, had a condition that threw an exception when the dims size was <= 0. Changed that to < 0 and added a separate condition that returns from the method when the size is equal to zero; the early return is needed because, after that point, the method performs checks that expect a non-empty dims.

Also removed the comment and condition that were written to point to this issue.
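
For context, a minimal Python repro of the behavior this PR fixes (a sketch only; the tensor shape is arbitrary, and the CUDA call runs only when a GPU is available):

import torch

a = torch.ones((4, 3, 2, 2))

# dims=() has always been accepted on CPU and simply returns an unflipped copy
torch.flip(a, dims=())

# Before this fix, the same call on a CUDA tensor raised IndexError (see #49982);
# with the early return in flip_check_errors it behaves like the CPU path.
if torch.cuda.is_available():
    torch.flip(a.cuda(), dims=())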

@mruberry @kshitij12345 please review this once

@facebook-github-bot (Contributor) commented Jan 9, 2021

💊 CI failures summary and remediations

As of commit 9c1ed86 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your pull requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

This comment has been revised 8 times.

@dheerajgattupalli (Contributor, Author) commented Jan 9, 2021

Hi,
Is it OK if I make one more commit to fix the tab issue, or is it better to close this and create a new, cleaner PR?

I think the test_tensorexpr.py file was also updated, so I have to remove that change too. Sorry, maybe this needs to be closed; I will try to create a cleaner PR.

@kshitij12345 (Collaborator) left a comment

Hi @dheerajgattupalli,

Changes look good.

I think the test_tensorexpr.py file was also updated, so I have to remove that change too. Sorry, maybe this needs to be closed; I will try to create a cleaner PR.

I don't think you need to make any changes in test_tensorexpr.py (maybe I could be wrong).
It is fine to push fixes in this PR itself. No worries about that.

I forgot to mention that you'll also have to update the code here.

def sample_inputs_flip(op_info, device, dtype, requires_grad):
    tensors = (
        make_tensor((S, M, S), device, dtype, low=None, high=None, requires_grad=requires_grad),
        make_tensor((S, 0, M), device, dtype, low=None, high=None, requires_grad=requires_grad)
    )
    dims = ((0, 1, 2), (0,), (0, 2), (-1,))
    # On CUDA, `dims=()` errors out with IndexError
    # Reference: https://github.com/pytorch/pytorch/issues/49982
    if device == 'cpu':
        dims = dims + ((),)  # type: ignore
    samples = [SampleInput(tensor, kwargs={'dims': dim}) for tensor, dim in product(tensors, dims)]
    return samples
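
For illustration, a sketch of what the updated sample function might look like once dims=() works on both devices (not the exact final diff; it just folds the empty-dims case into the dims tuple, as suggested further below, and reuses the helpers from the snippet above):

def sample_inputs_flip(op_info, device, dtype, requires_grad):
    tensors = (
        make_tensor((S, M, S), device, dtype, low=None, high=None, requires_grad=requires_grad),
        make_tensor((S, 0, M), device, dtype, low=None, high=None, requires_grad=requires_grad)
    )
    # dims=() is now valid on CUDA as well, so it is included unconditionally
    dims = ((0, 1, 2), (0,), (0, 2), (-1,), ())
    return [SampleInput(tensor, kwargs={'dims': dim}) for tensor, dim in product(tensors, dims)]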

After fixing that part, you can run the flip test in test_ops.py.

aten/src/ATen/native/TensorTransformations.h (outdated review thread, resolved)
@dheerajgattupalli (Contributor, Author)

Hi @kshitij12345,

Ignore the part about test_tensorexpr.py; it's not needed here. I was confusing this with another issue.

I made the changes you mentioned, and the quick-checks test now passes as well. Hopefully everything is correct now.

Thanks!

@codecov bot commented Jan 9, 2021

Codecov Report

Merging #50325 (9c1ed86) into master (d4c1684) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master   #50325      +/-   ##
==========================================
- Coverage   80.71%   80.71%   -0.01%     
==========================================
  Files        1904     1904              
  Lines      206686   206684       -2     
==========================================
- Hits       166830   166827       -3     
- Misses      39856    39857       +1     

@kshitij12345 (Collaborator) left a comment

LGTM! All the CI tests are passing, too. A few minor updates.

@mruberry will review and shepherd the PR.

@@ -10,8 +10,11 @@ namespace at {
namespace native {

static inline void flip_check_errors(int64_t total_dims, int64_t flip_dims_size, IntArrayRef dims) {
if (flip_dims_size==0){

Collaborator:
nit: Formatting if (flip_dims_size == 0) {

# Reference: https://github.com/pytorch/pytorch/issues/49982
if device == 'cpu':
    dims = dims + ((),)  # type: ignore
dims = dims + ((),)  # type: ignore

Collaborator:
You can just do:
dims = ((0, 1, 2), (0,), (0, 2), (-1,), ())

@mruberry self-requested a review on January 10, 2021, 10:33.
@mruberry added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Jan 10, 2021.

@mstfbl (Collaborator) commented Jan 10, 2021

I'm wondering, is there a specific reason why flip_check_errors(int64_t total_dims, int64_t flip_dims_size, IntArrayRef dims) is called in Tensor flip_cuda(const Tensor& self, IntArrayRef dims) (line 74), but not in Tensor flip_cpu(const Tensor& self, IntArrayRef dims)? It doesn't make sense to me why we're not verifying the number of axes in dim when calling Tensor.flip(dim) on CPU.

// Flip tensor given a list of dims
Tensor flip_cuda(const Tensor& self, IntArrayRef dims) {
  auto in_tensor = self;
  const int64_t flip_dims_size = dims.size(), total_dims = in_tensor.dim(), N = in_tensor.numel();
  flip_check_errors(total_dims, flip_dims_size, dims);
  int64_t block_size = 512;
  dim3 dim_block(block_size);
  dim3 dim_grid((N + block_size - 1) / block_size);
  auto out_tensor = at::empty_like(in_tensor, LEGACY_CONTIGUOUS_MEMORY_FORMAT);
  if (out_tensor.numel() == 0) {
    return out_tensor;
  }

Tensor flip_cpu(const Tensor& self, IntArrayRef dims) {
  auto in_tensor = self;
  const int64_t total_dims = in_tensor.dim();
  auto flip_dims_b = at::dim_list_to_bitset(dims, total_dims);
  Tensor out_tensor = at::empty_like(in_tensor, LEGACY_CONTIGUOUS_MEMORY_FORMAT);
  // create contiguous strides for input tensor
  auto stride_contiguous_v = std::vector<int64_t>(total_dims);
  for (int64_t i = total_dims - 1; i >= 0; i--) {
    if (i == total_dims - 1) {
      stride_contiguous_v[i] = 1;
    } else {
      stride_contiguous_v[i] = std::max<int64_t>(in_tensor.size(i + 1), 1) * stride_contiguous_v[i + 1];
    }
  }

If this omission is a mistake, I suggest flip_check_errors is called in Tensor flip_cpu as well. This should be an easy change:

Tensor flip_cpu(const Tensor& self, IntArrayRef dims) {
  auto in_tensor = self;
  const int64_t total_dims = in_tensor.dim(), flip_dims_size = dims.size();
  auto flip_dims_b = at::dim_list_to_bitset(dims, total_dims);
  Tensor out_tensor = at::empty_like(in_tensor, LEGACY_CONTIGUOUS_MEMORY_FORMAT);
  flip_check_errors(total_dims, flip_dims_size, dims);

  // create contiguous strides for input tensor
  auto stride_contiguous_v = std::vector<int64_t>(total_dims);
  for (int64_t i = total_dims - 1; i >= 0; i--) {
    if (i == total_dims - 1) {
      stride_contiguous_v[i] = 1;
    } else {
      stride_contiguous_v[i] = std::max<int64_t>(in_tensor.size(i + 1), 1) * stride_contiguous_v[i + 1];
    }
  }

CC @mruberry @kshitij12345 @dheerajgattupalli

@dheerajgattupalli (Contributor, Author) commented Jan 10, 2021

Hi @mstfbl,

Yeah, I agree: adding the flip_check_errors call in the CPU version as well would make it more consistent. I ran the available tests for torch.flip with that change and it didn't cause any issues. I will push it if @kshitij12345 and @mruberry also agree.

@kshitij12345 (Collaborator)

@mstfbl Good question! Thanks for looking into it.

@dheerajgattupalli Thanks for trying the change.

From the sample below, we can see that the CPU side of the code does similar checks.

>>> import torch
>>> a = torch.ones((4,3,2,2))
>>> torch.flip(a, (0, 0))  # CPU Repeated Dims
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dim 0 appears multiple times in the list of dims
>>> torch.flip(a.cuda(), (0, 0)) # CUDA Repeated Dims
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: dims has duplicates, original flip dims size=2, but unique flip dims size=1
>>> torch.flip(a, (0, 7))  # CPU Dim out-of-range
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: Dimension out of range (expected to be in range of [-4, 3], but got 7)
>>> torch.flip(a.cuda(), (0, 7)) # CUDA Dim out-of-range
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: The max flip dims out of range, got max flip dims=7

Besides reporting in a different format, both do the same checks.

On the CPU side, the at::dim_list_to_bitset function performs the relevant checks and also returns the bitset for the flip dims:

static inline std::bitset<dim_bitset_size> dim_list_to_bitset(IntArrayRef dims, int64_t ndims) {
  TORCH_CHECK(ndims <= (int64_t) dim_bitset_size, "only tensors with up to ", dim_bitset_size, " dims are supported");
  std::bitset<dim_bitset_size> seen;
  for (size_t i = 0; i < dims.size(); i++) {
    size_t dim = maybe_wrap_dim(dims[i], ndims);
    TORCH_CHECK(!seen[dim], "dim ", dim, " appears multiple times in the list of dims");
    seen[dim] = true;
  }
  return seen;
}

Also note that the test has relevant cases for both devices:

# not allow flip on the same dim more than once
self.assertRaises(RuntimeError, lambda: data.flip(0, 1, 1))
# not allow empty list as input
self.assertRaises(TypeError, lambda: data.flip())
# not allow size of flip dim > total dims
self.assertRaises(IndexError, lambda: data.flip(0, 1, 2, 3))
# not allow dim > max dim
self.assertRaises(IndexError, lambda: data.flip(3))

So I think it is fine the way it is.

@kshitij12345 (Collaborator) left a comment

LGTM! Great job!

@mruberry will review and give the binding approval.

@@ -10,8 +10,11 @@ namespace at {
namespace native {

static inline void flip_check_errors(int64_t total_dims, int64_t flip_dims_size, IntArrayRef dims) {

Collaborator:
flip_check_errors used to be called from flip_cpu, too. I believe it was accidentally removed by https://github.com/pytorch/pytorch/pull/13344/files. Could we add it back to flip_cpu?

Collaborator:

Oh, I see the above discussion suggests these checks may be redundant on CPU and we don't need them. Thanks, @kshitij12345, for pointing out that they're already tested for.

@@ -10,8 +10,11 @@ namespace at {
namespace native {

static inline void flip_check_errors(int64_t total_dims, int64_t flip_dims_size, IntArrayRef dims) {
if (flip_dims_size==0) {

Collaborator:
if (flip_dims_size == 0) { - spaces around the == operator

Contributor (Author):
Should I add this? The pull request is already approved; a new commit will not cause any problem, right?

Collaborator:
It's OK, you don't need to change it.

@mruberry (Collaborator) left a comment

Thank you for fixing this issue, @dheerajgattupalli, and thank you for reviewing it, @kshitij12345.

@facebook-github-bot (Contributor) left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor):
@mruberry merged this pull request in 314351d.

Labels
cla signed, Merged, open source, triaged
Projects
None yet
Development
Successfully merging this pull request may close these issues:
[bug] torch.flip: IndexError for dims=() on CUDA but works on CPU
6 participants