REGR: DataFrame.replace when the replacement value was explicitly None #46404

Merged: 3 commits, Mar 19, 2022

simonjayhawkins
Member

@simonjayhawkins added the Regression (Functionality that used to work in a prior pandas version) and replace (replace method) labels Mar 17, 2022
@simonjayhawkins added this to the 1.4.2 milestone Mar 17, 2022
@simonjayhawkins changed the title from "REGR: DataFrame.replace when the replacement value was explitly None" to "REGR: DataFrame.replace when the replacement value was explicitly None" Mar 17, 2022
Contributor

jreback commented Mar 18, 2022

cool, looks like a conflict

@jreback merged commit a875c23 into pandas-dev:main Mar 19, 2022
Contributor

jreback commented Mar 19, 2022

thanks @simonjayhawkins

Contributor

jreback commented Mar 19, 2022

@meeseeksdev backport 1.4.x

@simonjayhawkins deleted the replace-list-with-None-value branch March 19, 2022 18:45
@simonjayhawkins
Member Author

@meeseeksdev backport 1.4.x

@@ -777,6 +777,13 @@ def _replace_coerce(
mask=mask,
)
else:
if value is None:
Member

why here instead of in 'replace'?

Member Author

Because the recursion error occurs on main, and we normally fix backports by opening a PR against main and then backporting, rather than opening one against the backport branch directly.

Also, we would split the blocks in Block.replace which we didn't do on 1.3.5 and the regression fix restores previous behavior for now, see #45601 (comment).

I think if we do move this to replace after the recursion is fixed, we could also backport that as a bug fix, if we think the block splitting is desirable for consistency on 1.4.x.

None handling is also slightly different in Block.replace than for a list-like, so I suspect it would need some other changes, which I'm happy to do as a follow-up on master.

This PR was a very targeted regression fix as a suitable backport for 1.4.x.

Member

that makes sense, thanks.

@@ -661,6 +661,20 @@ def test_replace_simple_nested_dict_with_nonexistent_value(self):
        result = df.replace({"col": {-1: "-", 1: "a", 4: "b"}})
        tm.assert_frame_equal(expected, result)

    def test_replace_NA_with_None(self):
Member

In all of the relevant examples, both the value being replaced and the replacement are NA. Are these the only affected cases?

Member Author

IIUC, for a list-like to_replace, None is treated explicitly at the moment, whereas if using a scalar None, the behavior is different in some cases. My understanding is that users are therefore using a dictionary to get the explicit replacement behavior. To make these consistent, we would need to deprecate this?
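
For readers following along, here is a minimal sketch of the dictionary form mentioned above, where None is an explicit replacement value. It borrows its scenario from the name of the test added in this PR (test_replace_NA_with_None); the body of that test is not shown in this conversation, so the frame, the column name `value`, and the helper `replace_na_with_none` are assumptions made for illustration. It also assumes a pandas version that includes this fix (the PR is milestoned for 1.4.2).

```python
import pandas as pd


def replace_na_with_none():
    # Hypothetical helper, not part of pandas or of this PR.
    # Build a nullable-integer column that holds one missing value (pd.NA).
    df = pd.DataFrame({"value": [42, None]}).astype({"value": "Int64"})
    # Dictionary form: each key is a value to replace and each dict value is
    # its replacement, so None here is given explicitly as the replacement.
    return df.replace({pd.NA: None})


result = replace_na_with_none()
# Inspect the values and dtype of the column after the replacement.
print(result["value"].tolist(), result["value"].dtype)
```

On affected 1.4.x releases, the linked issue reports that this kind of replacement had no effect, which is the regression this PR targets.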

Labels
Regression (Functionality that used to work in a prior pandas version), replace (replace method)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Pandas 1.4.0 - pd.NaT can not be replaced.
BUG: Replacing pd.NA by None has no effect
3 participants