
scatter_ throwing a RuntimeError #40359

Closed
alfoudari opened this issue Jun 21, 2020 · 9 comments

alfoudari commented Jun 21, 2020

🐛 Bug

I'm trying to turn a tensor into three different tensors for a CNN with three channels. The following error came up while testing.

To Reproduce

Steps to reproduce the behavior:

(Pdb) y
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0.])
(Pdb) (y == 0).nonzero()
tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6],
        [7],
        [8]])
(Pdb) y.scatter_(0, (y == 0).nonzero(), 1)
*** RuntimeError: size.size() == stride.size() INTERNAL ASSERT FAILED at /pytorch/aten/src/ATen/native/Resize.h:96, please report a bug to PyTorch.

Expected behavior

A tensor with 1 at every index.

Environment

$ python collect_env.py
Collecting environment information...
PyTorch version: 1.5.0
Is debug build: No
CUDA used to build PyTorch: 10.2

OS: Arch Linux
GCC version: (Arch Linux 9.3.0-1) 9.3.0
CMake version: version 3.17.1

Python version: 3.8
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.5.0
[conda] Could not collect

Additional context

So squeeze solves it; however, I'm leaving this issue open since an internal assert failed:

>>> y = torch.zeros([9])
>>> y.scatter(0, (y == 0).nonzero().squeeze(), 1)
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1.])
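
As an aside, the same expected result can be had without scatter_ at all; a minimal sketch using boolean mask assignment (equivalent for this case, not a fix for the assert itself):

>>> y = torch.zeros([9])
>>> y[y == 0] = 1
>>> y
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1.])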
ngimel (Collaborator) commented Jun 22, 2020

cc @nikitaved - is something missing in the shape checks? We should not be getting an internal assert.

nikitaved (Collaborator) commented Jun 22, 2020

I cannot reproduce this on master. This is what I see:

In [1]: import torch

In [2]: y = torch.zeros([9])

In [3]: y.scatter_(0, (y == 0).nonzero(), 1)
/home/nik/miniconda3/envs/pytorch-cuda-dev/bin/ipython:1: UserWarning: This overload of nonzero is deprecated:
        nonzero()
Consider using one of the following signatures instead:
        nonzero(*, bool as_tuple) (Triggered internally at  ../torch/csrc/utils/python_arg_parser.cpp:766.)
  #!/home/nik/miniconda3/envs/pytorch-cuda-dev/bin/python
Out[3]: tensor([1., 1., 1., 1., 1., 1., 1., 1., 1.])
Collecting environment information...
PyTorch version: 1.6.0a0+04cc47d
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 18.04.3 LTS
GCC version: (crosstool-NG 1.23.0.450-d54ae) 7.3.0
CMake version: version 3.16.3

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: TITAN RTX
GPU 1: TITAN RTX

Nvidia driver version: 440.33.01
cuDNN version: /usr/local/cuda-10.2.89/targets/x86_64-linux/lib/libcudnn.so.7

Versions of relevant libraries:
[pip] numpy==1.18.1
[pip] torch==1.6.0a0+04cc47d
[conda] magma-cuda101             2.5.1                         1    pytorch
[conda] mkl                       2019.4                      243  
[conda] mkl-include               2019.5                      281    conda-forge
[conda] numpy                     1.18.1           py36h95a1406_0    conda-forge
[conda] torch                     1.6.0a0+04cc47d           dev_0    <develop>
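
As a side note, the deprecation warning in the transcript above can be avoided by passing as_tuple explicitly; a small sketch (both forms are current API):

In [4]: idx = torch.nonzero(y == 0, as_tuple=False)   # explicit form, shape [9, 1], no warning

In [5]: idx_1d = (y == 0).nonzero(as_tuple=True)[0]   # tuple form gives a 1-d index directly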

zou3519 (Contributor) commented Jun 23, 2020

I can confirm this bug exists on 1.5.0 and 1.5.1 but appears to be fixed on master, so I am closing this issue.

zou3519 closed this as completed Jun 23, 2020
ngimel (Collaborator) commented Jun 23, 2020

There's still something wrong with the shape checking, it seems. The documentation says that the number of dimensions should be the same for src and idx, yet in this case idx has 2 dimensions (as a result of .nonzero()), src is 1-d, and yet somehow it still works. @nikitaved can you please double-check it?
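
A minimal illustration of the mismatch being described (shapes as reported in this thread):

>>> y = torch.zeros([9])       # self is 1-d
>>> idx = (y == 0).nonzero()   # idx is 2-d, shape [9, 1]
>>> y.scatter_(0, idx, 1)      # violates the documented same-dims rule, yet succeeds on master
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1.])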

nikitaved (Collaborator) commented Jun 23, 2020

@ngimel, it works in this case because the shape check iterates only over self's dimensions. So, yeah, it requires a fix, I guess.
Shall we actually allow index/src to have one dimension more than self if that extra dimension has size 1?
But then the case self.dim() == 1, src.dim() == index.dim() == 2 with src.size(0) == index.size(0) == 1 should be allowed as well, I think. So scatter/gather seem to process only the first self.dim() dimensions of index and src.
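
Roughly, a hypothetical Python-level sketch of the check being described (illustrative only, not the actual ATen code):

def scatter_shape_check(self_t, dim, index):
    # Iterating only over the first self_t.dim() dimensions means the
    # trailing size-1 dimension of a [9, 1] index is never compared,
    # so it passes against a 1-d self.
    for d in range(self_t.dim()):
        if d != dim and index.size(d) > self_t.size(d):
            raise RuntimeError(f"index size exceeds self size at dim {d}")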

ngimel (Collaborator) commented Jun 23, 2020

I'm not sure what exactly we should allow, but in any case it should be documented, and it also should be more or less general. E.g. in the cases you describe, why stop at one extra dimension and not allow [N, 1, 1]? Also, can singleton dimensions be appended only, or appended and prepended? Seems easier to me to just disallow all questionable cases, with good error messages :-)

nikitaved (Collaborator) commented Jun 23, 2020

OK, it is much easier to fix the dim to be exactly the same for all inputs; otherwise we could allow broadcasting if needed, but that would change the documentation. All right, just a simple fix will do :)

ngimel (Collaborator) commented Jun 23, 2020

I'm a bit wary about allowing general broadcasting - it would require some thought to get right. E.g. for in-place scatter self cannot be broadcast, so values/indices should be broadcastable to self, right? For gather it's probably enough to require indices and self to be broadcastable to a common shape. Do you remember if there were enhancement requests to enable broadcasting for scatter/gather ops? In any case, imo it makes sense for now to bring the behavior into agreement with the documentation, and discuss broadcasting separately.
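
To make the in-place constraint concrete, a hedged sketch (explicit expand is what a user can do today, not a proposed broadcasting API):

>>> y = torch.zeros(3, 4)                         # self's shape is fixed in-place
>>> idx = torch.zeros(1, 4, dtype=torch.long)     # would need broadcasting to (3, 4)
>>> src = torch.ones(1, 4)
>>> y.scatter_(0, idx.expand(3, 4), src.expand(3, 4))   # explicit expand works today
tensor([[1., 1., 1., 1.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])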

nikitaved (Collaborator) commented

No, I do not remember any broadcasting requests, but I do remember expectations that index/src, for example, do broadcast over self in scatter. Sure, I will make it work as the documentation says.
