API: allow the iloc indexer to run off the end and not raise IndexError (GH6296) #6299

Merged (1 commit) on Feb 8, 2014

Contributor

@jreback jreback commented Feb 7, 2014

closes #6296

Here's the behavior.
A single out-of-bounds indexer still raises an IndexError (consistent with the way ix/loc would work).

In [1]: df = DataFrame(np.random.randn(5,2),columns=list('AB'))

In [2]: df
Out[2]: 
          A         B
0  1.105982  0.309531
1 -0.619102 -0.002967
2 -0.817371  0.270714
3  1.569206  1.595204
4  0.276390 -1.316232

[5 rows x 2 columns]

In [3]: df.iloc[[4,5,6]]
Out[3]: 
         A         B
4  0.27639 -1.316232

[1 rows x 2 columns]

In [4]: df.iloc[4:6]
Out[4]: 
         A         B
4  0.27639 -1.316232

[1 rows x 2 columns]

In [5]: df.iloc[:,2:3]
Out[5]: 
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3, 4]

[5 rows x 0 columns]

In [6]: df.iloc[:,1:3]
Out[6]: 
          B
0  0.309531
1 -0.002967
2  0.270714
3  1.595204
4 -1.316232

[5 rows x 1 columns]

In [7]: df.iloc[10]
IndexError: 

jreback added a commit that referenced this pull request Feb 8, 2014
API: allow the iloc indexer to run off the end and not raise IndexError (GH6296)
@jreback jreback merged commit f4bcfd4 into pandas-dev:master Feb 8, 2014
@immerrr
Contributor

immerrr commented Feb 22, 2014

I've hit an issue with code from this ticket when debugging #6370 and I might have an objection.

I agree that slices should behave as they do outside pandas: those that extend beyond the container's bounds should be silently clipped, something along the lines of (UPD: fixed the code a bit)

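start, stop, step = s.start, s.stop, s.step
length = len(obj)

# Assumes a positive step and non-None bounds.
if start < 0:
    # A negative bound counts from the end, hence length + start.
    start = max(length + start, 0)
elif start > length:
    start = length

if stop < 0:
    stop = max(length + stop, 0)
elif stop > length:
    stop = length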

(there's actually a slice.indices(len(obj)) function which does exactly that, but that's not the point).
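The slice.indices call mentioned above can be tried directly. This is a minimal sketch; clip_slice is a hypothetical wrapper name used only for illustration, and the only real API involved is the built-in slice.indices method.

```python
def clip_slice(s, length):
    """Return (start, stop, step) clipped to a container of `length` items.

    Hypothetical helper for illustration; it only delegates to the
    built-in slice.indices, which handles None, negative, and
    out-of-range bounds.
    """
    return s.indices(length)


data = list(range(5))

# A slice that runs off the end is clipped to the container length.
start, stop, step = clip_slice(slice(4, 6), len(data))
print(start, stop, step)  # 4 5 1

# The clipped triple can be fed to range() to get the valid positions.
print([data[i] for i in range(start, stop, step)])  # [4]
```

Unlike the hand-written version, this also covers missing (None) bounds and negative steps.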

The point is that silently dropping invalid integer indexers, as in df.iloc[[1000, 5000, 10000]], might be counter-intuitive to people who come from the numpy world (it is for me, at least), just as it was counter-intuitive for people with a non-pandas Python background to find that slicing raised IndexError on out-of-bounds start/stop values.

I've read that this ticket helped with #6301. Is there a way to keep only the slice bounding and drop the integer index bounding without causing a regression there?

@immerrr
Contributor

immerrr commented Feb 22, 2014

Ok, I've fixed the error in my code, but the question remains for the sake of API harmonization.

@jreback
Contributor Author

jreback commented Feb 22, 2014

What does numpy / python do with an out-of-bounds indexer in a list?
I know a single out-of-bounds element in a python list raises IndexError, but does it support multiple indexers?

FYI, loc does raise a KeyError if you try something like this (with the index labels, of course).

@immerrr
Contributor

immerrr commented Feb 22, 2014

> what does numpy / python do with an out of bounds indexer in a list?

In [1]: np.arange(100)[[101]]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-1-16b50cf7e847> in <module>()
----> 1 np.arange(100)[[101]]

IndexError: index 101 is out of bounds for size 100

> FYI loc does raise a KeyError if u try something like this ( with the index labels of course)

This makes perfect sense, since it's not the indexing operation that fails but rather the index lookup.
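For the plain-Python half of the question, built-in lists can be checked the same way. The sketch below uses a small hypothetical helper, try_index, that just reports which exception a lookup raises; everything else is standard list behavior.

```python
items = list(range(5))

# Slices that run past the end are clipped silently.
clipped = items[4:6]    # [4]
empty = items[10:20]    # []


def try_index(container, key):
    """Return the name of the exception raised by container[key], or None.

    Hypothetical helper, used here only to show the outcomes side by side.
    """
    try:
        container[key]
    except (IndexError, TypeError) as exc:
        return type(exc).__name__
    return None


# A single out-of-bounds position raises IndexError.
scalar_error = try_index(items, 10)

# Built-in lists do not accept a list of positions at all.
list_error = try_index(items, [4, 5, 6])

print(clipped, empty, scalar_error, list_error)
```

So plain Python clips slices, rejects a single out-of-bounds position, and has no list-of-positions lookup to compare against; numpy is the closer reference point for that last case.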

@jreback
Contributor Author

jreback commented Feb 22, 2014

@immerrr I think you are right: a slicing operation can run past the bounds, but specific index lookups (positional ones, in this case) should fail. OK.
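The rule agreed on here can be sketched in plain Python. This is only an illustration of the semantics under discussion, not the pandas implementation; positional_take is a hypothetical name, and it works on any sequence rather than on a DataFrame.

```python
def positional_take(values, key):
    """Select items by position: slices are clipped, explicit positions are checked.

    Hypothetical helper illustrating the agreed semantics; it is not
    how pandas implements iloc.
    """
    n = len(values)

    if isinstance(key, slice):
        # Slices are clipped to the valid range and never raise.
        return [values[i] for i in range(*key.indices(n))]

    # A single integer or a list of integers names specific positions,
    # so every one of them must exist.
    positions = [key] if isinstance(key, int) else list(key)
    for pos in positions:
        if not -n <= pos < n:
            raise IndexError(f"position {pos} is out of bounds for size {n}")

    picked = [values[pos] for pos in positions]
    return picked[0] if isinstance(key, int) else picked


rows = ["r0", "r1", "r2", "r3", "r4"]

print(positional_take(rows, slice(4, 6)))  # ['r4']
print(positional_take(rows, [0, 4]))       # ['r0', 'r4']

try:
    positional_take(rows, [4, 5, 6])
except IndexError as exc:
    print("IndexError:", exc)
```

The design choice is that a slice describes a range, which can sensibly be empty or shortened, while a list of positions names specific elements, so a missing one is treated as an error.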

@jreback
Contributor Author

jreback commented Feb 22, 2014

@immerrr fixed up by #6443, good catch

@immerrr
Contributor

immerrr commented Feb 22, 2014

Awesome, thanks!

Labels
API Design Indexing Related to indexing on series/frames, not to indexes themselves
Development

Successfully merging this pull request may close these issues.

iloc errors when end is greater than length
2 participants