@julianste
Contributor

@julianste julianste commented May 6, 2022

In the example, all negative values in a group should be replaced by the mean of the remaining values in that group. The output in the linked Stack Overflow question has it right: https://stackoverflow.com/questions/14760757/replacing-values-with-groupby-means

We have to negate the first argument of `DataFrame.where` here, because that argument is the condition under which a value stays the same, which in this example should be g >= 0.

pandas `df.where` behavior is pretty weird when coming from the Spark world :)
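
To illustrate the point about `where`, here is a minimal sketch. The frame, the column names `A` and `B`, and the helper names `replace_buggy` and `replace_fixed` are made up for this comment and are not necessarily what the docs example uses. It only relies on `where` keeping values where the condition is True and substituting the second argument elsewhere.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the docs):
# group 1 contains one negative value, group 2 contains none.
df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [1, -1, 1, 2]})


def replace_buggy(g):
    # Passes the "is negative" mask directly, so where() KEEPS the
    # negative values and replaces everything else with the mean.
    mask = g < 0
    return g.where(mask, g[~mask].mean())


def replace_fixed(g):
    # Negated mask: keep the non-negative values and replace the
    # negative ones with the mean of the non-negative values.
    mask = g < 0
    return g.where(~mask, g[~mask].mean())


buggy = df.groupby("A")["B"].transform(replace_buggy)
fixed = df.groupby("A")["B"].transform(replace_fixed)

print(buggy.tolist())
print(fixed.tolist())
```

With the negated mask, the -1 in group 1 becomes the mean of the remaining values in that group, and group 2 is left unchanged.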

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Negate first argument of `DataFrame.where`, since in pandas this must be the condition where the value should stay the same.
@mroeschke mroeschke added the Docs label May 7, 2022
@mroeschke mroeschke added this to the 1.5 milestone May 7, 2022
@mroeschke mroeschke merged commit 01650a8 into pandas-dev:main May 7, 2022
@mroeschke
Member

Thanks @julianste

yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Negate first argument of `DataFrame.where`, since in pandas this must be the condition where the value should stay the same.