Bug in Fancy/Boolean Indexing with nested lists #2702
Fancy or Boolean indexing on a Series has two strange behaviors. My examples only show the behavior with Fancy indexing, but it's the same for Boolean indexing.
LHS vs RHS length
I would have expected an error, similar to what I get with slice indexing
An even odder behavior is when you have too few items in the RHS
It seems to be using something like itertools.cycle which seems very arbitrary to me
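To make the comparison concrete, here is a minimal pure-Python sketch of what "cycling" a too-short RHS would look like. It only illustrates the behavior described above; it is not taken from the pandas or NumPy source, and `cycled_fill` is a hypothetical helper name.

```python
from itertools import cycle, islice

def cycled_fill(values, n):
    """Repeat `values` end to end until exactly `n` items are produced."""
    return list(islice(cycle(values), n))

# Two RHS values stretched over five target positions
filled = cycled_fill([10, 20], 5)
print(filled)  # [10, 20, 10, 20, 10]
```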
This may seem like a strange use of pandas, but I need to store Python lists
Very strange. It's like it flattens the input first.
I know in numpy the array constructor would make a distinction between these two inputs, so maybe that's the reason for the difference, but I still don't see why ndarrays are being flattened.
I can work around the issue by converting the RHS to a 1-D array and passing that in.
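A sketch of that workaround on a plain object-dtype ndarray, assuming the goal is to store each Python list as a single element. `as_object_1d` is a hypothetical helper, not a pandas or NumPy function; filling the array one slot at a time keeps NumPy from interpreting the nested list as a 2-D array.

```python
import numpy as np

def as_object_1d(items):
    """Pack each item of `items` into one slot of a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # scalar-index assignment stores the list itself
    return out

target = np.empty(4, dtype=object)    # object arrays start out filled with None
rhs = as_object_1d([[1, 2], [3, 4]])  # shape (2,), each element is a list
target[[0, 2]] = rhs                  # fancy assignment, RHS length matches the index
```

After this, `target[0]` and `target[2]` hold the two lists intact, and the untouched slots remain `None`.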
Slice indexing doesn't have this problem at all
My Question: Are these behaviors a bug or a "feature"? I think Fancy/Boolean indexing should operate the same as slice indexing -- i.e. check for matching lengths and don't auto-convert to numpy array.
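The behavior I am asking for could be sketched roughly like this. `assign_strict` is a hypothetical helper written for illustration only, not an existing pandas or NumPy API: it checks lengths up front and assigns element by element, so the RHS items are never converted to an array.

```python
import numpy as np

def assign_strict(container, positions, values):
    """Assign values[i] to container[positions[i]], requiring equal lengths."""
    positions = list(positions)
    values = list(values)
    if len(positions) != len(values):
        raise ValueError(
            f"cannot assign {len(values)} values to {len(positions)} positions"
        )
    for pos, val in zip(positions, values):
        container[pos] = val  # each value is stored as-is, lists included

data = np.empty(3, dtype=object)
assign_strict(data, [0, 1], [[1, 2], [3, 4]])   # lengths match: lists stored whole

try:
    assign_strict(data, [0, 1, 2], [[1, 2]])    # too few values: rejected
except ValueError as exc:
    error = str(exc)
```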
You're right. I just validated the same bugs on a plain ndarray. Do you think there is any value in raising this issue on a NumPy forum?
Thanks for looking into these corner cases. Pandas just keeps getting better and I find myself using it more and more when dealing with any non-trivial dataset.
It is easy to make all of these act the same; it is just an extension in
so this is good (#2745)
else it is converted to a