Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API/DEPR: DataFrame/Series.replace is too complex. #33302

Open
simonjayhawkins opened this issue Apr 5, 2020 · 6 comments
Open

API/DEPR: DataFrame/Series.replace is too complex. #33302

simonjayhawkins opened this issue Apr 5, 2020 · 6 comments
Labels
Deprecate Functionality to remove in pandas Needs Discussion Requires discussion from core team before further action replace replace method

Comments

@simonjayhawkins
Copy link
Member

in #17673 (comment) @jorisvandenbossche wrote

Underlying reason is that this function of course can do way too many things at the same time (or the same things in too many different ways) ... (orthogonal to this, we could maybe also think if certain functionality could be moved into its own function).

opened this issue to discuss.

@simonjayhawkins simonjayhawkins added API Design Deprecate Functionality to remove in pandas Needs Discussion Requires discussion from core team before further action labels Apr 5, 2020
@simonjayhawkins
Copy link
Member Author

simonjayhawkins commented Apr 5, 2020

the current behaviour when value is None is surprising from an end-user point of view #19998, #33298

#19998 (comment) @reidy-p wrote

These are good suggestions. I think it would be good if the method parameter is only used when explicitly called rather than by default because it's quite easy to get unexpected behaviour at present as you have shown. If it doesn't break the API substantially I would prefer if the behaviour of s.replace('a', None) is the same as s.replace({'a': None}).

@simonjayhawkins
Copy link
Member Author

could the regex parameter be deprecated and use str.replace instead? DataFrames don't have a str accessor, but this is a user request see #31850

@simonjayhawkins simonjayhawkins added this to the 2.0 milestone Apr 5, 2020
@jbrockmendel
Copy link
Member

+1 for simplifying

@jbrockmendel
Copy link
Member

After #33279 the follow-up removes the filter kwarg from Block.replace, which lets us simplify the implementatin of NDFrame.replace. Doesnt affect the API, but pertinent.

@jbrockmendel
Copy link
Member

I'm leaning towards deprecating the ffill/bfill behavior with value=None (in part bc it has hidden bugs bc Series._replace_single uses .values instead of ._values).

In my first attempt at deprecating I thought we could tell the user to do obj.replace(to_replace, np.nan).ffill(), but that actually messes up in cases where obj already has NAs that we dont want to fill.

@mroeschke mroeschke removed this from the 2.0 milestone Jan 13, 2023
@jbrockmendel
Copy link
Member

I propose the following, which we can do any subset of:

  1. deprecate the "method" keyword in replace
  2. deprecate the value=lib.no_default case (possibly excepting dict-like to_replace)
  3. deprecate the regex keyword and introduce a method for that. Or maybe there is something users can do with a string accessor to avoid the need for a new method?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Needs Discussion Requires discussion from core team before further action replace replace method
Projects
None yet
Development

No branches or pull requests

3 participants