
Mysterious Tensor Indexing Problem #22013

Open
adodge opened this issue Jun 20, 2019 · 8 comments
Labels
high priority · module: advanced indexing (Related to x[i] = y, index functions) · module: error checking (Bugs related to incorrect/lacking error checking) · module: numpy (Related to numpy support, and also numpy compatibility of our operators) · module: ux · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

@adodge

adodge commented Jun 20, 2019

🐛 Bug

Indexing into a tensor with a 2-D list of indices fails when the list has fewer than 32 elements, but works once it has 32 or more.

To Reproduce

import torch

def index_test(n):
    M = torch.tensor([0.]*n) # Trivial example for illustrative purposes
    idxes = [(a,) for a in range(n)]
    return M[idxes]

index_test(31) # Note that this fails
index_test(32) # But this works

n=31 fails with:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-169-222ec1878948> in <module>
      6     return M[idxes]
      7 
----> 8 index_test(31)

<ipython-input-169-222ec1878948> in index_test(n)
      4     M = torch.tensor([0.]*n) # Trivial example for illustrative purposes
      5     idxes = [(a,) for a in range(n)]
----> 6     return M[idxes]
      7 
      8 index_test(31)

IndexError: too many indices for tensor of dimension 1

Expected behavior

I expected this indexing to work the same for any number of indices. This is problematic in my actual code as the size of the data varies.

Environment

PyTorch version: 1.1.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: Could not collect

Python version: 3.6
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip3] numpy==1.16.4
[pip3] pytorch-pretrained-bert==0.6.2
[pip3] torch==1.1.0
[conda] Could not collect

Additional context

cc @ezyang @gchanan @zou3519 @mruberry @rgommers @heitorschueroff

@umanwizard
Contributor

Does M[torch.tensor(idxes)] do what you want?

@adodge
Author

adodge commented Jun 20, 2019

Aha! Indeed it does. Thank you! That's an excellent workaround.

I'm tempted to say this is still a bug.

Indexing by native Python lists should either be equivalent to indexing by a tensor or be disallowed entirely. At the very least, the error message should mention that indexing by native Python lists is unsupported.
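For reference, here is the suggested workaround applied to the original repro (a sketch; the shape shown assumes standard advanced-indexing semantics):

```python
import torch

n = 31
M = torch.tensor([0.] * n)
idxes = [(a,) for a in range(n)]

# M[idxes] raises IndexError for n < 32; converting the index list
# to a tensor makes the behavior uniform for any n:
result = M[torch.tensor(idxes)]
print(result.shape)  # torch.Size([31, 1])
```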

@mrshenli mrshenli added module: advanced indexing Related to x[i] = y, index functions module: operators labels Jun 20, 2019
@umanwizard umanwizard added triage review triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Jun 21, 2019
@umanwizard
Contributor

In particular it does seem that we should have the same behavior regardless of whether the list has more or fewer than 32 elements.

But I don't know the exact PyTorch indexing rules -- will discuss with the rest of the team and update this issue accordingly.

@colesbury
Member

Lists with 31 or fewer elements are interpreted as tuples in some cases. This is to match NumPy's behavior, which in turn is for backwards compatibility reasons. NumPy recently added a warning for this case. We should probably do the same.

static bool treatSequenceAsTuple(PyObject* index) {
  if (PyTuple_Check(index)) {
    return true;
  }
  if (!PySequence_Check(index)) {
    return false;
  }
  // This uses a heuristics from NumPy for determining whether to treat
  // non-tuple sequences as if they were a tuple. From the NumPy code comments:
  //
  // "At this point, we're left with a non-tuple, non-array, sequence:
  //  typically, a list. We use some somewhat-arbitrary heuristics from here
  //  onwards to decided whether to treat that list as a single index, or a
  //  list of indices. Backwards compatibility only takes effect for short
  //  sequences - otherwise we treat it like any other scalar."
  auto n = PySequence_Size(index);
  if (n < 0) {
    // Negative size indicates a Python error in the PySequence_Size call.
    PyErr_Clear();
    return false;
  }
  if (n >= 32) {
    return false;
  }
  for (Py_ssize_t i = 0; i < n; i++) {
    auto obj = THPObjectPtr{PySequence_GetItem(index, i)};
    if (!obj.get()) {
      PyErr_Clear();
      return false;
    }
    if (THPVariable_Check(obj.get()) || PySequence_Check(obj.get()) || PySlice_Check(obj.get())) {
      return true;
    }
    if (obj.get() == Py_Ellipsis || obj.get() == Py_None) {
      return true;
    }
  }
  return false;
}
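The heuristic above is observable from Python: a plain list of ints never satisfies it (no nested sequences, slices, or tensors), while a short list of tuples does (a sketch, matching the behavior reported in this issue):

```python
import torch

M = torch.zeros(3)

# A plain list of ints contains no sequences/slices/tensors,
# so it is treated as an index array:
print(M[[0, 1]])  # tensor([0., 0.])

# A short list of tuples trips the heuristic and behaves like
# M[(0,), (1,)] -- two indices into a 1-D tensor:
try:
    M[[(0,), (1,)]]
except IndexError as e:
    print(e)
```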

NumPy warns:

FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
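The two unambiguous spellings from that warning behave very differently, which is why NumPy asks the user to pick one explicitly (a sketch):

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)
seq = [0, 1]

# Tuple: a multidimensional index -- selects the scalar arr[0, 1]
print(arr[tuple(seq)])      # 1

# Array: a fancy index along axis 0 -- selects rows 0 and 1
print(arr[np.array(seq)])   # [[0 1 2]
                            #  [3 4 5]]
```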

@ezyang
Contributor

ezyang commented Jun 24, 2019

cc @VitalyFedyunin

@ezyang
Contributor

ezyang commented Jun 24, 2019

We should definitely improve the error message here.

@mruberry mruberry added module: ux module: error checking Bugs related to incorrect/lacking error checking module: numpy Related to numpy support, and also numpy compatibility of our operators and removed module: operators (deprecated) labels Oct 10, 2020
@maPiche

maPiche commented Jun 27, 2022

I am surprised this bug is still silently active:

def bernoulli_i(p, size):
    return (torch.rand(size) < p).int()

c32 = bernoulli_i(0.5, 32)
print(c32)
idx2 = [x == 1 for x in c32]
print(idx2.count(True))
print(c32[idx2])

c31 = bernoulli_i(0.5, 31)
print(c31)
idx1 = [x == 1 for x in c31]
print(idx1.count(True))
print(c31[idx1])

returns an empty tensor (`tensor([], size=(0, s))`) for an index list created with size s < 32
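A boolean tensor mask sidesteps the heuristic entirely, since the index is then a single tensor rather than a Python list of 0-d tensors (a sketch of the intent above):

```python
import torch

def bernoulli_i(p, size):
    return (torch.rand(size) < p).int()

c31 = bernoulli_i(0.5, 31)
mask = c31 == 1    # boolean tensor of shape (31,), not a Python list
print(c31[mask])   # the selected ones, regardless of size < 32
```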

@v-liuwei

v-liuwei commented Dec 3, 2022

On version 1.12.1+cu116, this bug still exists:

To reproduce:

torch.tensor([1])[[torch.tensor(0)] * 31]  # raises an IndexError
torch.tensor([1])[[torch.tensor(0)] * 32]  # works fine
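As before, collapsing the Python list into a single index tensor makes both cases behave the same (a sketch):

```python
import torch

t = torch.tensor([1])
idx = [torch.tensor(0)] * 31

# torch.stack turns the list of 0-d tensors into one shape-(31,)
# index tensor, which works identically for any length:
print(t[torch.stack(idx)])  # tensor of 31 ones
```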
