DataFrame.replace only replaces the first occurrence of replacement pattern #6689

Closed
fonnesbeck opened this issue Mar 22, 2014 · 8 comments · Fixed by #6820
Labels: Bug, Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)

@fonnesbeck

This is best explained by a screenshot:

[Screenshot: bad_replace]

I'm running a pretty recent build of pandas ('0.13.1-213-gc174c3d') on Python 2.7.5 on OS X 10.9.2.

@fonnesbeck changed the title from "DataFrame replace only replaces the first occurrence of replacement pattern" to "DataFrame.replace only replaces the first occurrence of replacement pattern" on Mar 22, 2014
@dsm054
Contributor

dsm054 commented Mar 22, 2014

Feels like a changing-dtype issue, or maybe a mixup of values and keys (True == 1 and False == 0):

>>> import pandas as pd
>>> df = pd.DataFrame({"a": [True, False, True]})
>>> df
       a
0   True
1  False
2   True

[3 rows x 1 columns]
>>> df.replace({True: "Y", False: "N"})
   a
0  Y
1  N
2  Y

[3 rows x 1 columns]
>>> df.replace({"a": {True: "Y", False: "N"}})
      a
0     N
1     Y
2  True

[3 rows x 1 columns]
>>> df.astype(object).replace({"a": {True: "Y", False: "N"}})
   a
0  Y
1  N
2  Y

[3 rows x 1 columns]

but I've never fully understood the intended semantics of replace.
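The `True == 1` and `False == 0` equivalence mentioned above is ordinary Python behavior, independent of pandas. The sketch below is only an illustration of how boolean and integer keys collide in a plain dict. It is not the pandas code path, and the names `mapping`, `bool_map`, and `by_position` are made up for the example.

```python
# bool is a subclass of int, so True/1 and False/0 compare equal
# and hash identically.
assert True == 1 and False == 0
assert hash(True) == hash(1) and hash(False) == hash(0)

# As dict keys they collide: the later value overwrites the earlier one,
# while the first key object (the bool) is kept.
mapping = {True: "Y", False: "N", 1: "one", 0: "zero"}
print(mapping)  # {True: 'one', False: 'zero'}

# Looking up integer positions 0, 1, 2 in a bool-keyed dict therefore
# hits the False and True entries for 0 and 1, and misses for 2.
bool_map = {True: "Y", False: "N"}
by_position = [bool_map.get(i, "unmatched") for i in range(3)]
print(by_position)  # ['N', 'Y', 'unmatched']
```

The N, Y, unmatched pattern has the same shape as the nested-dict output above (N, Y, True), which is at least consistent with a keys-versus-positions mixup.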

@dsm054
Contributor

dsm054 commented Mar 22, 2014

Yeah, it looks like the keys of the inner dict are being interpreted as indices to match, not values, which explains the Y/N, N/Y swap above. For example:

>>> df = pd.DataFrame({"a": [True, False, True]})
>>> df.replace({"a": {0: "zero", 1: "one", 2: "two"}})
      a
0  zero
1   one
2   two

[3 rows x 1 columns]
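For anyone who only needs to relabel a boolean column, one possible way to sidestep the nested-dict form is `Series.map`, which looks up each value in the mapping. This is a minimal sketch, not something tested against the build reported in this issue, and the names `labels` and `relabeled` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"a": [True, False, True]})

# Map each value in column "a" through the dict; matching is by value,
# not by row position.
labels = {True: "Y", False: "N"}
relabeled = df["a"].map(labels)

print(relabeled.tolist())  # ['Y', 'N', 'Y']
```

Note that `map` returns NaN for values missing from the dict, whereas `replace` leaves unmatched values untouched, so the two are not interchangeable in general.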

@cpcloud
Member

cpcloud commented Mar 23, 2014

Are you guys running master? I fixed a similar bug somewhat recently.

@dsm054
Contributor

dsm054 commented Mar 23, 2014

@cpcloud: I am, at least.

@cpcloud
Member

cpcloud commented Mar 23, 2014

Okay, thanks, I'll take a look. Looks like a dtype issue. @fonnesbeck thanks for the report.

@cpcloud self-assigned this on Mar 28, 2014
@cpcloud added this to the 0.14.0 milestone on Mar 28, 2014
@cpcloud
Member

cpcloud commented Apr 6, 2014

@jreback Is there a way to select and set a block by name? Something like df._data['a'] = df._data['a'].some_method()?

@cpcloud
Member

cpcloud commented Apr 6, 2014

I want to operate on a block, in place or out of place, but see the changes reflected in the whole block manager.

@cpcloud
Member

cpcloud commented Apr 6, 2014

Never mind.
