Where method on DataFrames can't seem to take series as conditions #9558

Closed
max-sixty opened this issue Feb 26, 2015 · 5 comments
Labels
Error Reporting Incorrect or improved errors from pandas Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate

Comments

@max-sixty
Contributor

I want to use where to set values across a DataFrame where a Series meets a condition.

For example:

import pandas as pd
df=pd.DataFrame({'a':[1,2,3], 'b':[4,5,6]})

But this

df.where(df['a']==2,0)

returns an unedited DataFrame.
Whereas this returns just the middle row, as you'd expect:

df[df['a']==2]

Supplying axis=0 or axis=1 doesn't help. Am I doing something wrong? Or is this unintended behavior?

Versions
python: 2.7.6.final.0
pandas: 0.14.1

@TomAugspurger
Contributor

I believe the cond argument to df.where has to be the same shape as the original df.
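A minimal sketch of what a same-shape condition could look like for the example above. The names row_cond, cond and result are illustrative only, and building the mask from a dict of Series is just one way to do it, not necessarily what pandas does internally:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Row-wise condition: one boolean per row.
row_cond = df['a'] == 2

# Repeat the row-wise condition under every column so that
# cond has the same index, columns and shape as df.
cond = pd.DataFrame({col: row_cond for col in df.columns})

# Keep values where cond is True, replace everything else with 0.
result = df.where(cond, 0)
print(result)
```

With this mask only the middle row keeps its values; the other rows are replaced with 0.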

@max-sixty
Contributor Author

@TomAugspurger, that's correct.

Is that desired behavior though? I would think that broadcasting across an index (in the same way as the final example broadcasts) would be desired.
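For reference, one way to get the row-wise effect without .where() is a .loc assignment on a copy. This is only a sketch of an alternative, not a claim about how .where() should behave; the name out is illustrative:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Work on a copy so the original frame is left untouched.
out = df.copy()

# Select the rows where the condition is NOT met and set
# every column in those rows to 0.
out.loc[df['a'] != 2, :] = 0
print(out)
```

Here the boolean Series selects whole rows, so the replacement applies across all columns of each selected row.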

@jreback
Contributor

jreback commented Feb 27, 2015

@MaximilianR the reason for this is that .where() is shape preserving; in other words, it puts NaN wherever the condition is not met.

df[df['a']==2] can change the shape of the result, e.g. it can drop rows.
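A small sketch of that difference, assuming a condition that has already been expanded to the frame's shape (the names cond, masked and filtered are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
row_cond = df['a'] == 2

# Same-shape boolean frame built from the row-wise condition.
cond = pd.DataFrame({col: row_cond for col in df.columns})

# .where() keeps the original shape and fills non-matching cells with NaN.
masked = df.where(cond)

# Boolean indexing drops the rows where the condition is False.
filtered = df[row_cond]

print(masked.shape)    # (3, 2)
print(filtered.shape)  # (1, 2)
```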

FYI, df.where(df['a']==2) raises (with an odd error message), so we probably need to do some more checking for shape compatibility (right now it just does a .reindex()).

@jreback jreback added the Error Reporting Incorrect or improved errors from pandas label Feb 27, 2015
@jreback jreback added this to the 0.17.0 milestone Feb 27, 2015
@jreback jreback added the Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate label Feb 27, 2015
@max-sixty
Contributor Author

@jreback Thanks!

@max-sixty
Contributor Author

This was fixed by #10283. Closing.


3 participants