
ENH: drop_level argument for xs #4180

Merged
merged 1 commit into from Aug 27, 2013

3 participants

@hayd
Python for Data member
hayd commented Jul 9, 2013

As discussed here, xs usually removes the level you are accessing. This commit lets you explicitly say you don't want that (and want to keep the same levels in the result).

In [4]: df = DataFrame({'day': {0: 'sat', 1: 'sun'},
                        'flavour': {0: 'strawberry', 1: 'strawberry'},
                        'sales': {0: 10, 1: 12},
                        'year': {0: 2008, 1: 2008}}).set_index(['year', 'flavour', 'day'])

In [5]: df
Out[5]:
                     sales
year flavour    day
2008 strawberry sat     10
                sun     12

In [6]: df.xs('sat', level='day')
Out[6]:
                 sales
year flavour
2008 strawberry     10

In [7]: df.xs('sat', level='day', drop_level=False)
Out[7]:
                     sales
year flavour    day
2008 strawberry sat     10

In [8]: df.xs([2008, 'sat'], level=['year', 'day'], drop_level=False)
Out[8]:
                     sales
year flavour    day
2008 strawberry sat     10
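
For anyone who wants to reproduce the session above, here is a minimal self-contained sketch. It assumes a reasonably recent pandas imported as `pd`, and it passes the multi-level key as a tuple rather than a list, since newer pandas versions no longer accept a list there. The variable names are only illustrative.

```python
import pandas as pd

# Same toy frame as in the session above, indexed by three levels.
df = pd.DataFrame({
    'day': ['sat', 'sun'],
    'flavour': ['strawberry', 'strawberry'],
    'sales': [10, 12],
    'year': [2008, 2008],
}).set_index(['year', 'flavour', 'day'])

# Default behaviour: the selected level ('day') is removed from the result.
dropped = df.xs('sat', level='day')

# With drop_level=False the result keeps all three index levels.
kept = df.xs('sat', level='day', drop_level=False)

# Selecting on several levels at once, again keeping every level.
multi = df.xs((2008, 'sat'), level=['year', 'day'], drop_level=False)

print(list(dropped.index.names))
print(list(kept.index.names))
print(list(multi.index.names))
```

The values selected are the same in all three cases; only the shape of the resulting index differs.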
@hayd
Python for Data member
hayd commented Jul 9, 2013

oh dear, screwed something up in this, need to fix.

@cpcloud
Python for Data member
cpcloud commented Jul 21, 2013

@hayd should be marked for 0.13?

@hayd
Python for Data member
hayd commented Aug 27, 2013

@jreback merge? or do you have a better arg name?

@jreback
jreback commented Aug 27, 2013

hmmm.....and we don't want to make drop_level a public function....

could we do this thru drop? maybe like this: (same impl)

df.drop('sat', level='day')

?

@hayd
Python for Data member
hayd commented Aug 27, 2013

This makes xs more like a filter (atm the behaviour is to drop the selected level, this adds the option not to).

@jreback
jreback commented Aug 27, 2013

this is adding a modification option to xs, something it doesn't do now; that's my hesitation

what about adding drop_level, be very explicit?

@hayd
Python for Data member
hayd commented Aug 27, 2013

ah ok, yeah drop_level is better

@hayd
Python for Data member
hayd commented Aug 27, 2013

Hmmm giving def _drop_levels a drop_level was stupid, this is a bad way to do this. Need to reconsider.

@jreback
jreback commented Aug 27, 2013

no...I meant in core/generic.py make a method called drop_levels with same args as xs, label, axis, level and just do what you are doing now

@hayd
Python for Data member
hayd commented Aug 27, 2013

ah, true and this isn't really a cross section either... I dunno actually I think this makes sense as an option for xs...just viewing the data in slightly different way, with superfluous index level.

I guess you can hack select to do this:

In [10]: df.select(lambda x: x[2] == 'sat')
Out[10]:
                     sales
year flavour    day
2008 strawberry sat     10

Maybe could just add this api to select? (add level argument)

df.select(self, crit, axis=0, level=None)

It looks to see if crit is callable: if it is, current behaviour; if not, it looks at level and does an xs without dropping the level (the result we're talking about).
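
To make "xs without dropping the level" concrete, the same result can be sketched with a plain boolean mask on the level's labels. This is not what the PR implements and not the select API proposed above; it is only an illustration using `Index.get_level_values`, and like select it scans every label on that level.

```python
import pandas as pd

# Same toy frame as in the PR description.
df = pd.DataFrame({
    'day': ['sat', 'sun'],
    'flavour': ['strawberry', 'strawberry'],
    'sales': [10, 12],
    'year': [2008, 2008],
}).set_index(['year', 'flavour', 'day'])

# Boolean mask over the 'day' level; True where the label equals 'sat'.
mask = df.index.get_level_values('day') == 'sat'

# Row filtering keeps the full MultiIndex, so no level is dropped.
filtered = df[mask]

print(filtered)
```

The filtered frame still carries year, flavour and day in its index, which is the shape the drop_level=False option is meant to give.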

@jreback
jreback commented Aug 27, 2013

would shy away from select, it's a linear search...let's go back to xs...

how about ''drop=...''? (rather than drop_selected_level?)

@hayd
Python for Data member
hayd commented Aug 27, 2013

I liked your drop_level (as a compromise :p), I think it's clearer what you're dropping then. Ok, have to clean up this code though, it looks like I just threw this together without thinking about back-end semantics. I think it's much easier than my awful hack.

@hayd
Python for Data member
hayd commented Aug 27, 2013

@jreback logic is now much improved.

@jreback
jreback commented Aug 27, 2013

looks good..

@hayd hayd merged commit 66c4a05 into pydata:master Aug 27, 2013