ENH: Preserve Index in Pandas Groupby Rolling on Datetime #35551
Comments
@nrcjea001 what's your suggested fix here? And is your expected output right? Why isn't there a |
No bug fix here, rather a suggestion. When the index is not a datetime and we are rolling on a datetime column, it would be nice to preserve both the index and the date column when performing a rolling groupby. A workaround is to use a groupby apply with a rolling function, but this is considerably slower. May I suggest that the rolling groupby preserve the actual dataframe index when rolling on a datetime column rather than a datetime index.
I'm running into a similar issue. I'm doing a Not what I expected, given that I explicitly said |
You don't actually need the index preserved to get what you need. You can add whatever columns you need in the agg function and apply a function that keeps the last row. From your example, the solution would be: df1 = df.groupby('group').rolling('1D',on='date').agg({'column1': sum, 'column2': lambda row: row[-1]}) I do agree there should be a better way to keep the link between the original data frame and the new data frame produced by the window aggregation. This solution also won't work if the current row is not within the window range. Maybe a special aggregation function to preserve the current row in the window?
Is your feature request related to a problem?
A problem arises when rolling on a datetime column in which some dates are the same. The groupby rolling function does not preserve the original index, so when dates repeat within a group it is impossible to tell which index value of the original dataframe a result row belongs to.
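A minimal sketch of the ambiguity, using made-up data (the frame, its index labels 10 to 14, and the values are assumptions for illustration, not the example from this issue). The exact index layout of the groupby rolling result may differ between pandas versions, so the sketch only prints it rather than asserting its shape.

```python
import pandas as pd

# Hypothetical data: rows 10 and 11 share both the group and the date.
df = pd.DataFrame(
    {
        "group": ["A", "A", "B", "A", "B"],
        "date": pd.to_datetime(
            ["2018-01-01", "2018-01-01", "2018-01-01", "2018-01-02", "2018-01-03"]
        ),
        "column1": [1.0, 2.0, 8.0, 4.0, 16.0],
    },
    index=[10, 11, 12, 13, 14],
)

# Rolling 1-day sum per group, windowed on the "date" column.
rolled = df.groupby("group").rolling("1D", on="date")["column1"].sum()
print(rolled)  # inspect the index; its layout depends on the pandas version

# (group, date) is not a unique key, so it cannot identify the source row.
ambiguous = df.duplicated(["group", "date"], keep=False)
print(df[ambiguous])
```

Rows 10 and 11 get different rolling sums, yet both carry the key ("A", 2018-01-01), so a merge on group and date cannot tell them apart.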
Describe the solution you'd like
Ideally we wish to preserve the original index so that it can be merged back onto the original dataframe.
Current Code
Describe alternatives you've considered
Current workaround as follows, but considerably slower
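The original snippet is not reproduced here. As a rough sketch of the same idea, under assumed data, each group can be rolled separately with a plain `DataFrame.rolling`, which keeps the row index. An explicit per-group loop stands in for the groupby apply mentioned above; the frame and the `column1_rolling` name are hypothetical.

```python
import pandas as pd

# Hypothetical data: rows 10 and 11 share both the group and the date.
df = pd.DataFrame(
    {
        "group": ["A", "A", "B", "A", "B"],
        "date": pd.to_datetime(
            ["2018-01-01", "2018-01-01", "2018-01-01", "2018-01-02", "2018-01-03"]
        ),
        "column1": [1.0, 2.0, 8.0, 4.0, 16.0],
    },
    index=[10, 11, 12, 13, 14],
)

# Roll each group on its own; dates must be sorted within each group.
pieces = [
    g.rolling("1D", on="date")["column1"].sum()
    for _, g in df.groupby("group")
]
rolled = pd.concat(pieces).sort_index()

# Index alignment attaches each rolling sum to its source row.
result = df.assign(column1_rolling=rolled)
print(result)
```

Because the result carries the original index, it aligns back onto the dataframe without any merge on date. The cost is one rolling call per group, which is why this route is slower than a native groupby rolling.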