MultiIndex loc for value not in index #4443

Closed
hayd opened this issue Aug 2, 2013 · 8 comments
Labels
Indexing Related to indexing on series/frames, not to indexes themselves

Comments

Contributor

hayd commented Aug 2, 2013

example:

In [11]: s = pd.Series({(0, 0): 1, (0, 1): 2, (0, 3): 3, (1, 0): 1, (1, 2): 4, (3, 0): 5})

In [12]: s.loc[(1, 0):]
Out[12]:
(1, 0)    1
(1, 2)    4
(3, 0)    5
dtype: int64

In [13]: s.loc[(1, 1):]
KeyError: 'start bound [(1, 1)] is not the [index]'

Should this work? cc @jreback

From SO question.

Contributor

jreback commented Aug 2, 2013

This looks legit: you are asking for a starting tuple with no stopping tuple, so you get the rest of the index, and it correctly starts and stops including the endpoints (if provided)

In [4]: s.loc[(1,0):]
Out[4]: 
(1, 0)    1
(1, 2)    4
(3, 0)    5
dtype: int64

In [5]: s
Out[5]: 
(0, 0)    1
(0, 1)    2
(0, 3)    3
(1, 0)    1
(1, 2)    4
(3, 0)    5
dtype: int64

In [6]: s.loc[(1,0):(1,2)]
Out[6]: 
(1, 0)    1
(1, 2)    4
dtype: int64

In [7]: s.loc[(0,3):(1,2)]
Out[7]: 
(0, 3)    3
(1, 0)    1
(1, 2)    4
dtype: int64

Contributor Author

hayd commented Aug 2, 2013

@jreback but don't we expect:

In [13]: s.loc[(1, 1):]
Out[13]:
(1, 2)    4
(3, 0)    5
dtype: int64

rather than KeyError?

Contributor

jreback commented Aug 2, 2013

nope....that's according to spec, the endpoints must be included

I get that, since there is a defined ordering, this should work (and it might in the case of datetimes, where lookup is 'approximate'), e.g. s['2000'] would start at the first datetime of 2000 (even if it's much later)

but s.loc['2000'] would not

I think OP is best off doing this using a multi-index; a list of tuples is essentially (but not exactly) the same
and in this case, I suppose you could argue it's a bug....

In [15]: s2 = s.copy()

In [16]: s2.index = pd.MultiIndex.from_tuples(s.index)

In [17]: s2
Out[17]: 
0  0    1
   1    2
   3    3
1  0    1
   2    4
3  0    5
dtype: int64

In [18]: s2[(1,0):]
Out[18]: 
1  0    1
   2    4
3  0    5
dtype: int64

In [19]: s2[(1,1):]
Out[19]: 
1  2    4
3  0    5
dtype: int64
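
The result in Out[19] follows from the lexicographic ordering of the tuples: (1, 1) is not a key, but it has a well-defined position between (1, 0) and (1, 2). As a rough illustration in plain Python (not pandas internals; `keys`, `values`, and `slice_from` are hypothetical names chosen for this sketch), a binary search finds that position:

```python
from bisect import bisect_left

# Same keys and values as the Series in the example above, kept sorted.
keys = [(0, 0), (0, 1), (0, 3), (1, 0), (1, 2), (3, 0)]
values = [1, 2, 3, 1, 4, 5]


def slice_from(start):
    """Return (key, value) pairs from the first key >= start onward."""
    # Tuples compare lexicographically, so bisect_left works on them
    # directly and does not need `start` to be present in `keys`.
    pos = bisect_left(keys, start)
    return list(zip(keys[pos:], values[pos:]))


print(slice_from((1, 0)))  # start bound is an existing key
print(slice_from((1, 1)))  # start bound is missing, the search still succeeds
```

Both calls return a well-defined tail of the data, which is the behaviour the sorted MultiIndex shows above; a plain index of tuples that requires an exact match has no such fallback.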

Contributor Author

hayd commented Aug 2, 2013

Ha! So the error is that this totally isn't a MultiIndex. Sorry

hayd closed this as completed Aug 2, 2013
Contributor Author

hayd commented Aug 2, 2013

is it weird that this gives an error when using loc?

In [92]: s[:(1,1)]
Out[92]: 
0  0    1
   1    2
   3    3
1  0    1
dtype: int64

In [93]: s.loc[:(1,1)]
KeyError: 'stop bound [(1, 1)] is not in the [index]'

Contributor

jreback commented Aug 2, 2013

that's correct
If you specify an endpoint in a slice it must be included in the index

Contributor Author

hayd commented Aug 2, 2013

not sure I follow why this works without loc but not with loc...

Contributor

jreback commented Aug 2, 2013

loc is very strict about this, while I think getitem just searches. I think that's the intent, though maybe these should be the same
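
A minimal sketch of the two behaviours described here, in plain Python rather than pandas code. The function names `strict_slice_to` and `searching_slice_to` are made up for illustration, and this is only a model of the difference, not how pandas implements either path:

```python
from bisect import bisect_right

# Sorted keys matching the index used throughout this thread.
keys = [(0, 0), (0, 1), (0, 3), (1, 0), (1, 2), (3, 0)]


def strict_slice_to(stop):
    """Model of the strict behaviour: the stop bound must be an existing key."""
    if stop not in keys:
        raise KeyError(f"stop bound [{stop}] is not in the [index]")
    # Label slices include the endpoint, hence the + 1.
    return keys[: keys.index(stop) + 1]


def searching_slice_to(stop):
    """Model of the lenient behaviour: search for where stop would sort."""
    # bisect_right keeps an exact match inside the slice (inclusive stop).
    return keys[: bisect_right(keys, stop)]


print(searching_slice_to((1, 1)))  # [(0, 0), (0, 1), (0, 3), (1, 0)]
try:
    strict_slice_to((1, 1))
except KeyError as exc:
    print("KeyError:", exc)
```

When the bound is an existing key the two functions agree; they only diverge for a missing bound, which is the case in In [92] and In [93].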
