API: New callable indexing makes storing functions in a Series difficult #13299

Closed
evanpw opened this issue May 26, 2016 · 8 comments · Fixed by #13516
Labels
API Design
Indexing: Related to indexing on series/frames, not to indexes themselves
Regression: Functionality that used to work in a prior pandas version

Comments

@evanpw
Contributor

evanpw commented May 26, 2016

Example of something that previously worked (before 7bbd031):

>>> import pandas as pd
>>> s = pd.Series([lambda x: x] * 10)
>>> s[s.index > 5] = lambda x: x + 1

But now the second line tries to apply the function on the right-hand side to the Series rather than assigning it to the selected elements (and raises an exception). This is very counterintuitive when using __setitem__ rather than calling Series.where directly.

@jreback
Contributor

jreback commented May 26, 2016

It is quite odd to store callables in a Series.

What is the purpose of doing that?

@jreback
Contributor

jreback commented May 27, 2016

hmm, this is simple to fix (we just don't evaluate a callable for other in .where).

@sinhrks do we have a valid use of other as a callable?

jreback added the Indexing and API Design labels May 27, 2016
jreback added this to the 0.18.2 milestone May 27, 2016
@evanpw
Contributor Author

evanpw commented May 27, 2016

We're choosing a function based on some columns in a DataFrame, and then applying them on other column values / external data. Since there might be a different function for each row, it's convenient to store them in the same place.
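A rough sketch of that use case, written without pandas so it stays self-contained. A list of dicts stands in for the DataFrame, and every name here (double, negate, kind, value, func) is made up for illustration rather than taken from the original code:

```python
# Plain-Python stand-in for the use case: pick a function per row from one
# column, then apply it to another column. All names are illustrative.
def double(x):
    return 2 * x


def negate(x):
    return -x


rows = [
    {"kind": "a", "value": 1},
    {"kind": "b", "value": 2},
    {"kind": "a", "value": 3},
]

# Choose a function for each row based on the "kind" column...
for row in rows:
    row["func"] = double if row["kind"] == "a" else negate

# ...then apply each row's stored function to that row's value.
results = [row["func"](row["value"]) for row in rows]
print(results)  # [2, -2, 6]
```

Keeping the chosen function next to the data it applies to is what makes a column of callables convenient here.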

@jreback
Contributor

jreback commented May 27, 2016

@evanpw I think you might be able to just remove the _maybe_callable(other...) and this will work. I'm not really sure of the original purpose of testing whether other is callable (i.e. it works, but IMHO it is not that useful).

If you want to give this a shot and see what it looks like, that would be great.

@evanpw
Contributor Author

evanpw commented May 27, 2016

Yes, deleting this line from where in generic.py fixes the problem for me:

other = com._apply_if_callable(other, self)
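The effect of that line can be sketched without pandas. The helper below is a simplified stand-in written for illustration (its body is an assumption about the behavior, not code copied from pandas), and a plain list stands in for the Series of functions:

```python
def apply_if_callable(maybe_callable, obj):
    """Hypothetical stand-in: call the argument on obj if it is callable."""
    if callable(maybe_callable):
        return maybe_callable(obj)
    return maybe_callable


# Stand-in for the Series of stored functions.
functions = [lambda x: x] * 3


def replacement(x):
    """The function the user wants to *store*, not to call."""
    return x + 1


# Because the replacement is callable, the helper calls it on the container
# instead of passing it through, and `container + 1` fails.
try:
    apply_if_callable(replacement, functions)
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

# Non-callable values pass through untouched.
passthrough = apply_if_callable(5, functions)
```

With the helper call removed, the replacement function would reach the assignment unchanged, which matches the fix described above.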

@sinhrks
Member

sinhrks commented May 28, 2016

The usage in my mind was chaining like:

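import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1, 2], 'B': [1, 2, 3, 4]})
# Both the condition and the replacement are callables, evaluated on the grouped sums.
df.groupby('A').sum().where(lambda x: x < 5, lambda x: -x)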
#    B
# A   
# 1  4
# 2 -6

I think that usage is more likely than storing callables.

CC: @TomAugspurger

@evanpw
Contributor Author

evanpw commented May 28, 2016

I think that makes sense when calling where directly. What about making it configurable, and turned off when using __setitem__?
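One way to picture that proposal is the toy model below. It works on plain lists, and the call_other flag is a hypothetical name invented for this sketch, not an existing pandas parameter:

```python
def where(values, cond, other, call_other=True):
    """Toy `where` over a list: keep values where cond holds, else use other.

    `call_other` is a hypothetical flag used only in this sketch.
    """
    mask = cond(values) if callable(cond) else cond
    if call_other and callable(other):
        other = other(values)          # evaluate to a list of replacements
    else:
        other = [other] * len(values)  # treat as a scalar, even if callable
    return [v if keep else o for v, keep, o in zip(values, mask, other)]


def setitem(values, mask, value):
    """Toy masked assignment: store `value` as-is where mask is True."""
    # Assignment replaces where the mask is True, so invert it for `where`.
    keep = [not m for m in mask]
    return where(values, keep, value, call_other=False)


# Direct call: both callables are evaluated, as in the chaining example.
chained = where([4, 6], lambda xs: [x < 5 for x in xs], lambda xs: [-x for x in xs])
print(chained)  # [4, -6]

# Assignment path: the function is stored, not called.
inc = lambda x: x + 1
stored = setitem([abs, abs, abs], [False, False, True], inc)
print(stored[2] is inc)  # True
```

The direct call keeps the chaining behavior, while the assignment path never evaluates the right-hand side.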

@jreback
Contributor

jreback commented May 29, 2016

Yes, I think that is a bug.

jorisvandenbossche added the Regression label Jul 1, 2016
4 participants