Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: dropping nuisance columns in rolling aggregations #42834

Merged
merged 6 commits into from
Aug 10, 2021

Conversation

jbrockmendel
Copy link
Member

we did this in 1.3 for DataFrame reductions and GroupBy reductions. AFAIK this is the last place we use this pattern.

cc @mroeschke

@jbrockmendel jbrockmendel added Nuisance Columns Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply Window rolling, ewma, expanding labels Jul 31, 2021
expected = df.ewm(halflife=1.0, min_periods=min_periods).mean()
with tm.assert_produces_warning(FutureWarning, match="nuisance columns"):
# GH#42738
result = df.ewm(halflife=halflife, min_periods=min_periods, times=times).mean()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should probably add a deprecation here too that allows times to be a string of a column label (which is already kinda flawed because a column label isn't always a string). A column of times will always be a nuisance column I think

if isinstance(self.times, str):

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be done separately? this is less obvious to me

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, yes.

result = ewm.mean(engine="numba", engine_kwargs=engine_kwargs)
expected = ewm.mean(engine="cython")

# TODO: why only in these cases?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because column "A" is string dtype and in the non-group case there's an attempt to agg over the column, while in the groupby case, the grouping column should be dropped internally before hand

@@ -159,6 +159,7 @@ Deprecations
- Deprecated treating ``numpy.datetime64`` objects as UTC times when passed to the :class:`Timestamp` constructor along with a timezone. In a future version, these will be treated as wall-times. To retain the old behavior, use ``Timestamp(dt64).tz_localize("UTC").tz_convert(tz)`` (:issue:`24559`)
- Deprecated ignoring missing labels when indexing with a sequence of labels on a level of a MultiIndex (:issue:`42351`)
- Creating an empty Series without a dtype will now raise a more visible ``FutureWarning`` instead of a ``DeprecationWarning`` (:issue:`30017`)
- Deprecated dropping of nuisance columns in :class:`Rolling` aggregations (:issue:`42738`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to include Expanding and EWM as well.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on board. @mroeschke suggestions + rebase

@jbrockmendel
Copy link
Member Author

comments addressed + rebased + green

if 0 != len(new_mgr.items) != len(mgr.items):
# GH#42738 ignore_failures dropped nuisance columns
warnings.warn(
"Dropping of nuisance columns in rolling operations "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you list the columns that are problematic in the error message?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated + green

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Nuisance Columns Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply Window rolling, ewma, expanding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEPR: dropping nuisance columns in BaseWindow._apply
3 participants