I posted this issue on SO but was hoping to get a more detailed explanation. I have a df that has an `(id, date)` `MultiIndex`. I would like to resample the df within each `id`. Intuitively I thought the code should look something like:
This raises an exception, and the working answer suggests that I keep only `date` as the index, leaving `id` as a column, then group by that column and resample. Namely, instead of `df.set_index(["id", "date"]).sort_index()`, just `df.set_index("date").sort_index()`, and then everything else should work as is. I'm a bit confused about why my original attempt failed. My understanding of `groupby` is that it creates an object that holds the grouping information, and all the methods on that object respect it by operating on each group only. Specifically, the index or column that the df is grouped on will be absent from the group subframe passed to any methods and will only be used to concatenate the results together. Therefore, I expected that `resample` would receive a df with only a `date` index rather than a `MultiIndex`. Am I thinking about this completely incorrectly?
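For reference, here is a minimal sketch of the two approaches described above. The toy frame, the `value` column, the daily `"D"` frequency, and the `sum` aggregation are all assumptions made for illustration; they are not taken from the original post.

```python
import pandas as pd

# Hypothetical toy data: two ids with irregularly spaced dates.
df = pd.DataFrame({
    "id": ["a", "a", "b", "b"],
    "date": pd.to_datetime(
        ["2020-01-01", "2020-01-03", "2020-01-01", "2020-01-02"]
    ),
    "value": [1, 2, 3, 4],
})

# Attempt 1: (id, date) MultiIndex, group on the id level, then resample.
indexed = df.set_index(["id", "date"]).sort_index()
try:
    indexed.groupby(level="id").resample("D").sum()
    failed = False
except Exception as exc:  # resample rejects the MultiIndex it is handed
    failed = True
    print(type(exc).__name__, exc)

# Attempt 2 (the working answer): only date in the index, id kept as a column.
result = (
    df.set_index("date")
      .sort_index()
      .groupby("id")["value"]
      .resample("D")
      .sum()
)
print(result)
```

In the second form each group has a plain datetime index, so `resample` can bin it, and the result comes back indexed by `(id, date)`.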
Looks like this issue is not specific to `resample`; it also affects any subsequently chained function that does not accept a `MultiIndex`. I guess the implementation of `groupby` retains row labels as they are because that offers the most information and clarity to the user when working with the groups after `groupby`.
One possible question is whether it would be necessary to implement a `pandas.MultiIndex.droplevel` step when grouping on a `MultiIndex`. (Is this what you are looking for?) That seems unnecessary: as the working answer you mentioned has already shown, you can flexibly group on the values of a column rather than setting them into the index.
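A quick way to see the retained row labels is to iterate over the groups and inspect each subframe's index. This is only a sketch on a made-up frame (the data and the `value` column are assumptions):

```python
import pandas as pd

# Hypothetical toy data with an (id, date) MultiIndex.
df = pd.DataFrame({
    "id": ["a", "a", "b", "b"],
    "date": pd.to_datetime(
        ["2020-01-01", "2020-01-03", "2020-01-01", "2020-01-02"]
    ),
    "value": [1, 2, 3, 4],
})
indexed = df.set_index(["id", "date"]).sort_index()

index_types = {}
level_names = {}
for key, group in indexed.groupby(level="id"):
    # Each group keeps the full MultiIndex, including the grouped level.
    index_types[key] = type(group.index).__name__
    level_names[key] = list(group.index.names)

print(index_types)
print(level_names)
```

So a method chained after `groupby(level="id")` still sees both index levels, which is why anything that expects a plain datetime index complains.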
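If the frame already carries the `(id, date)` `MultiIndex`, one way to get back to the column-based grouping is to move `id` out of the index first. The sketch below assumes a toy frame, a `value` column, a daily frequency, and a `sum` aggregation, none of which come from the original post:

```python
import pandas as pd

# Hypothetical toy data with an (id, date) MultiIndex.
df = pd.DataFrame({
    "id": ["a", "a", "b", "b"],
    "date": pd.to_datetime(
        ["2020-01-01", "2020-01-03", "2020-01-01", "2020-01-02"]
    ),
    "value": [1, 2, 3, 4],
})
indexed = df.set_index(["id", "date"]).sort_index()

# Turn the id level back into a column, leaving a plain date index,
# then group on that column and resample within each group.
out = (
    indexed.reset_index("id")
           .groupby("id")["value"]
           .resample("D")
           .sum()
)
print(out)
```

This keeps the grouping on column values, as in the working answer, without needing any special handling of index levels inside `groupby`.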