API change in groupby head and tail #6533

Merged: 2 commits, Mar 6, 2014

Contributor

@hayd hayd commented Mar 3, 2014

BUG/API: groupby head and tail act like filter, since they don't aggregate; fixes column selection

Breaking API change to groupby head and tail: these never produced aggregated output, so they should not have the group labels as the index; they should always filter. (The old behaviour was a legacy from when this was g.apply(lambda x: x.head(n)).)

        as_index : boolean, default True
            For aggregated output, return object with group labels as the
            index. Only relevant for DataFrame input. as_index=False is
            effectively "SQL-style" grouped output

Also fixes the column selection for head and tail
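
For illustration, here is a minimal sketch of the behaviour described above, assuming a pandas version that includes this change. The frame and the column names `A` and `B` are made up for the example and do not come from the PR.

```python
import pandas as pd

# Hypothetical toy frame; column names are illustrative only.
df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [10, 20, 30, 40]})
g = df.groupby("A")

# head/tail filter rows: the result keeps the original row index
# rather than using the group labels as the index.
first_rows = g.head(1)
last_rows = g.tail(1)

# as_index is only relevant for aggregated output, so it should
# make no difference to head.
first_rows_sql = df.groupby("A", as_index=False).head(1)

# Column selection is respected: only the selected column comes back.
only_b = g[["B"]].head(1)

# For contrast, a true aggregation is indexed by the group labels.
summed = g.sum()

print(first_rows)
print(only_b)
print(summed)
```

The contrast with `g.sum()` is the point: an aggregation returns one row per group labelled by the group key, while `head` and `tail` return a subset of the original rows.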

part of #5755
related #6524
part of #5264.

cc @jreback @TomAugspurger

@hayd
Contributor Author

hayd commented Mar 3, 2014

Note: we'd definitely need to change some documentation if we were to do this, but I thought I should put the change out there to discuss.

@hayd hayd added API Design and removed API Design labels Mar 4, 2014
@hayd hayd added this to the 0.14.0 milestone Mar 4, 2014
@jreback
Contributor

jreback commented Mar 5, 2014

@hayd this is good

why don't you add a doc change in v0.14.0 (and in groupby.rst) for this change, then #5264 can expand on that?

I edited the top to show this closes #5755 (only), others are related, right?

@TomAugspurger

@hayd
Contributor Author

hayd commented Mar 5, 2014

@jreback it's a partial close of #5755 (I'll make a checklist on that; some of it is also up for discussion, and IMO we should clean it up in one hit in this coming release). OK, cool, I'll trawl through the docs to see what needs changing and add it twice in the release docs.

@TomAugspurger
Contributor

Yeah this looks good. I'm looking forward to more consistency in groupby :)

@hayd
Contributor Author

hayd commented Mar 5, 2014

Updated... Something has broken the build (first commit passed travis and second only changes docs)!

@jreback
Contributor

jreback commented Mar 5, 2014

hmm... Travis seems flaky right now... wait before merging this; let's see if we can fix Travis

@jreback
Contributor

jreback commented Mar 5, 2014

looks like the whl files are not being transported properly... (though I just accessed the server OK)... maybe a network issue from Travis?

@jreback
Contributor

jreback commented Mar 5, 2014

@hayd

shouldn't this level be named? (e.g. it's the original index of the frame)

http://stackoverflow.com/questions/22210865/python-assign-values-to-first-observation-of-each-group-in-dataframe/22210998#22210998

maybe make a first_index() function?

@jreback
Contributor

jreback commented Mar 5, 2014

@hayd I realize that this is a case where as_index=False actually is useful!

@hayd
Contributor Author

hayd commented Mar 5, 2014

@jreback haha, but you won't need to with this PR :) Ah, the unnamed level is from the original index (which had no name), no?

It's pretty easy to write with cumcount:

g.first_index == df[g.cumcount() == 0] ?
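
A rough sketch of that cumcount idea, assuming a pandas version where `cumcount` and the filter-style `head` are both available. Note that `first_index` is only the name floated in this thread, not an existing pandas method, and the toy frame below is invented for the example.

```python
import pandas as pd

# Hypothetical toy frame; column names are illustrative only.
df = pd.DataFrame({"A": ["x", "x", "y", "y", "y"], "B": [0, 1, 2, 3, 4]})
g = df.groupby("A")

# cumcount numbers the rows within each group, starting at 0.
position = g.cumcount()

# Rows whose within-group position is 0 are the first row of each group.
first_via_cumcount = df[position == 0]

# With head acting as a filter, this selects the same rows.
first_via_head = g.head(1)

print(first_via_cumcount)
print(first_via_cumcount.equals(first_via_head))
```

Both selections keep the original row labels, which is what makes assigning back into `df` for those rows straightforward.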

(s390x ?!!)

@jreback
Contributor

jreback commented Mar 5, 2014

I thought he said mainframe. that's something the sparc guys can't even get numpy to run on, yet they want pandas to run ...meh

@hayd
Contributor Author

hayd commented Mar 5, 2014

@jreback do you want me to rebase/amend? I want to do the nth stuff on top of this, so it would be good to merge. (I tried amending before but it failed.)

@jreback
Contributor

jreback commented Mar 5, 2014

go ahead... you can prob just merge this... (not sure why Travis is still acting up)

hayd added a commit that referenced this pull request Mar 6, 2014
API change in groupby head and tail
@hayd hayd merged commit 1bab0a2 into pandas-dev:master Mar 6, 2014
@hayd hayd deleted the groupby_head branch March 6, 2014 00:21
@hayd
Contributor Author

hayd commented Mar 6, 2014

we'll see...

@hayd
Contributor Author

hayd commented Mar 6, 2014

yeah, build still failing.
