API: Use different exception type when MultiIndex indexing fails due to index not being lexsorted #11897

Closed
Dr-Irv opened this Issue Dec 24, 2015 · 7 comments

Contributor

Dr-Irv commented Dec 24, 2015

I'd like to be able to separately trap errors in indexing that are due to the lack of the MultiIndex being lexsorted. Right now, a KeyError is raised, and the only way to tell if the error was due to the lack of a lexsort versus a missing key is to look at the text of the message.

So I'd like to avoid code like this:

try:
    subdf = df.loc['A':'D','Vals']
except KeyError as ker:
    if ker.args[0].startswith("MultiIndex Slicing requires the index to be fully lexsorted"):
        print("Need to handle fact index not sorted")
    else:
        print("Need to handle fact that key was missing")

I'd rather write something like:

try:
    subdf = df.loc['A':'D','Vals']
except UnsortedIndexError:
    print("Need to handle fact index not sorted")
except KeyError:
    print("Need to handle fact that key was missing")

So could the designers accept a change where a different error is raised in case the index is not lexsorted? If so, I'll implement it.

Contributor

jreback commented Dec 24, 2015

pls show a complete example uptop

Contributor

Dr-Irv commented Dec 24, 2015

Here is an example:

import pandas as pd

# Build a MultiIndex whose tuples are deliberately not in lexsorted order
mi = pd.MultiIndex.from_tuples([('z','a'),('x','a'),('y','b'),('x','b'),('y','a'),('z','b')], names=['one','two'])
df = pd.DataFrame([[i,10*i] for i in range(6)], index=mi, columns=['one','two'])
print(df.loc(axis=0)['z',:])

The last line fails with:
KeyError: 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0)'

Now consider where we do the sorting:

df.sort_index(inplace=True)
print(df.loc(axis=0)['z',:])
print(df.loc(axis=0)['q',:])

The second line works because things are now sorted, but the last line fails with:
KeyError: 'q'

So I'd like to be able to surround all of the above code with ways of detecting the 2 different errors, which will make debugging easier.
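For readers unfamiliar with the "lexsort depth (0)" part of the message above, here is a rough pure-Python sketch of the idea: the depth is the number of leading levels by which the index tuples are already in sorted order. The helper name `lexsort_depth` is an assumption made for illustration and is not how pandas computes it internally.

```python
def lexsort_depth(tuples):
    """Return how many leading levels the tuples are sorted by (0 if none).

    Illustrative helper only; not the pandas implementation.
    """
    if not tuples:
        return 0
    n_levels = len(tuples[0])
    # Try the deepest prefix first, then progressively shorter ones.
    for k in range(n_levels, 0, -1):
        prefixes = [t[:k] for t in tuples]
        if all(a <= b for a, b in zip(prefixes, prefixes[1:])):
            return k
    return 0


# The same tuples as in the example above.
keys = [('z', 'a'), ('x', 'a'), ('y', 'b'), ('x', 'b'), ('y', 'a'), ('z', 'b')]
print(lexsort_depth(keys))          # original order
print(lexsort_depth(sorted(keys)))  # after sorting
```

With the original order the first level already goes from 'z' to 'x', so no prefix is sorted; after sorting, both levels are in order.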

Member

shoyer commented Dec 25, 2015

I like this idea. To preserve backwards compatibility, we might make it a KeyError subclass. Note that there are also some cases where indexing non-multiindexes can raise a KeyError because the index is unsorted (notably, when slicing).
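A minimal sketch of why a KeyError subclass keeps existing code working, assuming a toy lookup in place of real MultiIndex slicing. The class body and the helpers `slice_lookup`, `classify`, and `legacy_classify` are hypothetical and exist only to show the exception hierarchy.

```python
class UnsortedIndexError(KeyError):
    """Hypothetical exception: the lookup needs sorted keys."""


def slice_lookup(keys, key):
    """Toy stand-in for slicing: insists on sorted keys, then finds the key."""
    if list(keys) != sorted(keys):
        raise UnsortedIndexError("keys must be sorted before slicing")
    if key not in keys:
        raise KeyError(key)
    return keys.index(key)


def classify(keys, key):
    """New-style caller: the subclass must be caught before KeyError."""
    try:
        slice_lookup(keys, key)
    except UnsortedIndexError:
        return "unsorted"
    except KeyError:
        return "missing"
    return "found"


def legacy_classify(keys, key):
    """Old-style caller that only knows about KeyError keeps working."""
    try:
        slice_lookup(keys, key)
    except KeyError:
        return "error"
    return "found"


print(classify(['z', 'x', 'y'], 'z'))         # unsorted keys
print(classify(['x', 'y', 'z'], 'q'))         # sorted keys, missing key
print(legacy_classify(['z', 'x', 'y'], 'z'))  # still caught as KeyError
```

The order of the except clauses matters: if `except KeyError` came first, it would also catch the subclass and the more specific handler would never run.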

jreback added this to the Next Major Release milestone Dec 26, 2015

Contributor

jreback commented Dec 26, 2015

would for sure have to be a KeyError for back-compat. (ideally this should be a ValueError, but that boat has sailed).

Contributor

Dr-Irv commented Dec 28, 2015

@jreback So should I create the UnsortedIndexError code, test it, and do the pull request?

jreback modified the milestone: 0.18.1, Next Major Release Mar 17, 2016

Contributor

jreback commented Mar 17, 2016

@Dr-Irv interested in working on this?

Contributor

Dr-Irv commented Mar 17, 2016

Yes, but it might be a while, as I've got some other pressing issues.

jreback modified the milestone: 0.18.1, 0.18.2 Apr 26, 2016

jorisvandenbossche modified the milestone: 0.20.0, 0.19.0 Aug 21, 2016

Dr-Irv referenced this issue Nov 29, 2016

Merged

ENH: Introduce UnsortedIndexError GH11897 #14762

0 of 3 tasks complete

Dr-Irv added a commit to Dr-Irv/pandas that referenced this issue Nov 29, 2016

Dr-Irv ENH: #11897 make lint work. ERR: #14491 change error message 5f7eee1

Dr-Irv added a commit to Dr-Irv/pandas that referenced this issue Dec 6, 2016

Dr-Irv ENH: Introduce UnsortedIndexError #11897
ERR: Change error message #14491

ENH: #11897 make lint work. ERR: #14491 change error message

ERR: #14491 fix test for error message

fixes based on jreback feedback

fix indent issue

Doc fixes

Fixes per jreback comments
f33093b

Dr-Irv added a commit to Dr-Irv/pandas that referenced this issue Dec 7, 2016

Dr-Irv ENH: Introduce UnsortedIndexError #11897
ERR: Change error message #14491
0947982

Dr-Irv added a commit to Dr-Irv/pandas that referenced this issue Dec 9, 2016

Dr-Irv ENH: Introduce UnsortedIndexError #11897
ERR: Change error message #14491
76b6434