Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Remove unnecessary kwargs in groupby ops #50483

Closed
wants to merge 16 commits into from
Closed

DEPR: Remove unnecessary kwargs in groupby ops #50483

wants to merge 16 commits into from

Conversation

ramvikrams
Copy link
Contributor

@ramvikrams ramvikrams commented Dec 29, 2022

"""
Cumulative sum for each group.

Returns
-------
Series or DataFrame
"""
nv.validate_groupby_func("cumsum", args, kwargs, ["numeric_only", "skipna"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CI is failing because of this. You can't simply delete the values, you should provide an empty tuple and an empty dictionary instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes will do

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am getting this error Error: The runner has received a shutdown signal...' in GHA. GH 45651 , could you help on this.

@datapythonista datapythonista added Groupby Deprecate Functionality to remove in pandas labels Dec 29, 2022
@datapythonista datapythonista changed the title gh-50407 Depr: unnecessary kwargs in groupby ops DEPR: Remove unnecessary kwargs in groupby ops Dec 29, 2022
Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add an entry in the whatsnew notes for this change.

@ramvikrams
Copy link
Contributor Author

We should add an entry in the whatsnew notes for this change.

Will do

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! We should only be deprecating at this point. The arguments can only be removed after a deprecation period. The removal won't happen until we begin working on pandas 3.0.

@ramvikrams
Copy link
Contributor Author

Thanks for the PR! We should only be deprecating at this point. The arguments can only be removed after a deprecation period. The removal won't happen until we begin working on pandas 3.0.

Thanks I will do the changes

@ramvikrams
Copy link
Contributor Author

Thanks for the PR! We should only be deprecating at this point. The arguments can only be removed after a deprecation period. The removal won't happen until we begin working on pandas 3.0.

I looked over the net but could not find out how to deprecate this, can you please explain. I thought that we gad to remove the arguments when I opened the PR.

@datapythonista
Copy link
Member

The idea is that in the next version we start showing warnings to the users in the next version, to let them know if they are passing the arguments that they will be removed, and some versions later we will actually remove them, once they had the time to update their code.

I'd personally still fix this PR with the removals first. The parameters are not veing used, but I wonder if there was any reason we are misssing to add them in the first place. So, it'll be good to see that nothing in our test suite breaks when removing them, before we apply the deprecation warnings.

For the warnings, check at closed PRs with the Deprecate label, and find one about arguments. We have a decorator to do it, but I'm in my phone and I fon't remember the exact name, something like deprecate_arg probably.

@ramvikrams
Copy link
Contributor Author

thanks i'll look for closed pr with deprecate_arg

@rhshadrach
Copy link
Member

@datapythonista - these arguments are to support passing groupby objects to NumPy; #47836 and #48277

@ramvikrams
Copy link
Contributor Author

@datapythonista - these arguments are to support passing groupby objects to NumPy; #47836 and #48277

so should we add this maybe_warn_args_and_kwargs in all the methods, as the deprecation message.

@datapythonista
Copy link
Member

Thanks @rhshadrach for the info, I wasn't aware of that, good to know.

@ramvikrams deprecate_nonkeyword_arguments is the decorator we need I think.

@ramvikrams
Copy link
Contributor Author

Thanks @rhshadrach for the info, I wasn't aware of that, good to know.

@ramvikrams deprecate_nonkeyword_arguments is the decorator we need I think.

Yeah I'll push a commit in 30 mins

@ramvikrams
Copy link
Contributor Author

I think it's done

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes look good except I'm not seeing cummin / cummax and this needs tests.

For tests, you want to invoke the function and ensure the appropriate warning is raised. Use with tm.assert_produces_warning(FutureWarning, match="..."): with the appropriate warning message filled in.

@@ -735,6 +735,7 @@ Removal of prior version deprecations/changes
- Removed the deprecated argument ``line_terminator`` from :meth:`DataFrame.to_csv` (:issue:`45302`)
- Removed the deprecated argument ``label`` from :func:`lreshape` (:issue:`30219`)
- Arguments after ``expr`` in :meth:`DataFrame.eval` and :meth:`DataFrame.query` are keyword-only (:issue:`47587`)
- Deprecate argument ``**kwargs`` from :meth:`cummax`, :meth:`cummin`, :meth:`cumsum`, :meth:`cumprod`, :meth:`take` and :meth:`skew` (:issue:`50483`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently in the section of deprecations that we are enforcing for 2.0. Instead, it should be in the new deprecations that are introduced in 2.0. That's one section up from this.

Can you add "unused" before "argument" to indicate to users we're not removing any functionality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes surely i'll do the changes

@@ -155,3 +165,17 @@ def test_class():
)
with tm.assert_produces_warning(FutureWarning, match=msg):
Foo().baz("qux", "quox")


def test_deprecated_keywords():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is for testing the decorator itself, these tests should go in pandas/tests/groupby/test_function.

"and kwargs will be deprecated for these functions"
)
with tm.assert_produces_warning(FutureWarning, match=msg):
assert cummax
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert [statement] evaluates the truthiness of the statement. For example, [] will evaluate to False (and hence raise if used in an assert) whereas [1] will evaluate to True. Functions are always truthy - they evaluate to True.

Instead, this needs to call the function itself, e.g.

df = DataFrame({'a': [1, 1, 2], 'b': [3, 4, 5]})
gb = df.groupby('a')
gb.cummax(kwargs=1)

Copy link
Contributor Author

@ramvikrams ramvikrams Jan 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh thanks I'll do the changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i did the changes but still the tests are failing, don't know why

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments

Comment on lines +1607 to +1612
assert gb.cummax(kwargs=1)
assert gb.cummin(kwargs=1)
assert gb.cumsum(args=1, kwargs=1)
assert gb.cumprod(args=1, kwargs=1)
assert gb.skew(kwargs=1)
assert gb.take(kwargs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert is used to assert that a certain condition is true. Do you want to assert something here, or just call these functions?

Also, what will happen with this test if cummax produces the warning, but the other functions don't? Seems like the test will pass, even if the changes don't do what we want them to do. You probably want to use pytest parametrization, with the function name as a parameter, then you can retrieve the function with getattr(gb, func_name).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want to use pytest parametrization, with the function name as a parameter, then you can retrieve the function with getattr(gb, func_name).

i'll try this way

@@ -893,6 +894,9 @@ def fillna(
return result

@doc(Series.take.__doc__)
@deprecate_nonkeyword_arguments(
version="3.0", allowed_args=["self", "axis", "indices"]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having another look, looks like deprecate_nonkeyword_arguments doesn't deprecate the argument, but makes it usable only via a keyword argument (e.g. foo(1) would produce the warning, but foo(val=1) wouldn't). If that's the case, doesn't seem like we've got a decorator to simply deprecate an argument. In that case, you'll have to produce it manually.

with tm.assert_produces_warning(FutureWarning, match=msg):
df = DataFrame({"a": [1, 1, 2], "b": [3, 4, 5]})
gb = df.groupby("a")
assert gb.cummax(kwargs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using kwargs as the sample keyword argument is a bit misleading. Something like gb.cummax(incorrect_argument=)` would probably be clearer.

gb = df.groupby("a")
assert gb.cummax(kwargs=1)
assert gb.cummin(kwargs=1)
assert gb.cumsum(args=1, kwargs=1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've got this wrong. This is not testing args at all, to test if the *args deprecation is working you need to use gb.cumsum(1, incorrect_argument=1). And I'd recommend splitting this into two separate tests, otherwise you will only test that one of them works, not that both work.

I'd recommend that you make sure you understand well what the * and ** do in a function signature like def foo(my_arg, *args, **kwargs):

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Feb 10, 2023
@datapythonista
Copy link
Member

@ramvikrams do you want to keep working on this PR?

@ramvikrams
Copy link
Contributor Author

@ramvikrams do you want to keep working on this PR?

i think it's better I close it

@ramvikrams ramvikrams closed this Feb 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Groupby Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEPR: Unnecessary kwargs in groupby ops
3 participants