Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: deprecate raise_on_error in .where/.mask in favor of errors= #17744

Merged
merged 1 commit into from
Oct 5, 2017

Conversation

jreback
Copy link
Contributor

@jreback jreback commented Oct 2, 2017

closes #14968

@jreback jreback added the Deprecate Functionality to remove in pandas label Oct 2, 2017
@jreback jreback added this to the 0.21.0 milestone Oct 2, 2017
Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR itself looks good, added some minor comments.

But more in general, can you actually give a use case ? I mean a small example where you see the effect of the keyword?
I haven't been able to make one, and it also seems you didn't have to change any test (the one test you changed, there the keyword has no effect at all). Just wondering, probably I am missing something, but if not I would rather deprecate completely without adding alternative.

return the results
errors : str, {'raise', 'ignore'}, default 'ignore'
'raise' : pass the error to the higher level
'ignore' : evaluate the op with and return the results
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it was like this before, but for me the phrase "evaluate the op with" is not really clear what it means

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah a bunch of copy-paste


if raise_on_error is not None:
warnings.warn(
"raise_on_error is deprecated in favor of errors='raise'",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 'raise' is only for setting it to True, not False, so I would mention both options

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh right

assert_series_equal(result, expected)

result = s.where([True, False],
Timestamp('20130101', tz='US/Eastern'),
raise_on_error=True)
errors='ignore')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between both of the above cases ?

(also , the errors has no effect)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right the kw is pretty much ignored in .where (that's part of the problem); its used in other routines. This is pretty much a mess internally, though making progress.

@jreback
Copy link
Contributor Author

jreback commented Oct 2, 2017

But more in general, can you actually give a use case ? I mean a small example where you see the effect of the keyword?
I haven't been able to make one, and it also seems you didn't have to change any test (the one test you changed, there the keyword has no effect at all). Just wondering, probably I am missing something, but if not I would rather deprecate completely without adding alternative.

this really doesn't change anything; just had to fix some actual calls that causes the deprecation warning. The point of this is make things consistent. In the future I actually do want to change the defaults I think (and make 'coerce') possible. This is just conforming things a bit.

@jorisvandenbossche
Copy link
Member

I know this PR is not changing things, it is just converting the previous situation to a new more consistent one. That is fine. But if it is really a useless keyword, it is also an opportunity to clean things up, and to not just translate the mistakes (?) of the previous keyword to the new one.

Maybe I am missing something, but so my question still is: is there a case errors='ignore' actually does something?
If it currently has no effect, we shouldn't add the new errors keyword, but just deprecate the old raise_on_errors. Or if adding the new one, have errors='raise' as the only option.

@jreback
Copy link
Contributor Author

jreback commented Oct 2, 2017

Maybe I am missing something, but so my question still is: is there a case errors='ignore' actually does something?
If it currently has no effect, we shouldn't add the new errors keyword, but just deprecate the old raise_on_errors. Or if adding the new one, have errors='raise' as the only option.

We DO need this, it is just not fully working ATM; it was turned off a while back. We need the option to raise, or ignore where the dtype is actually changing (rather than just do it, or in some cases coerce), this is a very common choice which we do in lots of places.

I don't think its a big deal to add the kw. Sure I can prob only allow 'raise' for now. Though we DO need it in a couple of places.

@codecov
Copy link

codecov bot commented Oct 2, 2017

Codecov Report

Merging #17744 into master will decrease coverage by 0.04%.
The diff coverage is 84.61%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17744      +/-   ##
==========================================
- Coverage   91.24%    91.2%   -0.05%     
==========================================
  Files         163      163              
  Lines       49962    49973      +11     
==========================================
- Hits        45588    45576      -12     
- Misses       4374     4397      +23
Flag Coverage Δ
#multiple 89% <84.61%> (-0.03%) ⬇️
#single 40.25% <30.76%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/panel.py 96.94% <ø> (ø) ⬆️
pandas/core/computation/expressions.py 0% <ø> (ø) ⬆️
pandas/core/frame.py 97.74% <100%> (-0.1%) ⬇️
pandas/core/series.py 94.89% <100%> (ø) ⬆️
pandas/core/sparse/frame.py 94.78% <100%> (ø) ⬆️
pandas/core/ops.py 91.76% <75%> (ø) ⬆️
pandas/core/generic.py 92.03% <81.81%> (-0.07%) ⬇️
pandas/core/internals.py 94.37% <83.33%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.38% <0%> (-1.82%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3fae8dd...552437b. Read the comment docs.

@jreback
Copy link
Contributor Author

jreback commented Oct 3, 2017

this should be much cleaner now. final thoughts...... @jorisvandenbossche @TomAugspurger

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for doing this now, if it makes things easier in the future.

- ``raise`` : allow exceptions to be raised
- ``ignore`` : suppress exceptions. On error return original object

values :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like an unfinished docstring here.

@jorisvandenbossche
Copy link
Member

I am fine with having the keyword, I would just update the actual docstring (of the user facing where and mask) to not be misleading (as the current explanation is not correct).

I didn't follow the earlier discussion (#16821), but just to clarify: what is the idea with errors keyword in the future? You say you would like to change this? But to what?
Because the linked PR actually changed it to not error anymore if a non-matching dtype value is inserted, but to convert to object.

@jreback
Copy link
Contributor Author

jreback commented Oct 5, 2017

I didn't follow the earlier discussion (#16821), but just to clarify: what is the idea with errors keyword in the future? You say you would like to change this? But to what?
Because the linked PR actually changed it to not error anymore if a non-matching dtype value is inserted, but to convert to object.

well the problem is we are effectively always errors='coerce' now (even though we don't call it that), and never errors='raise```, eg. say you do a .where()(or lots of other ops like.fillna, .setitemetc.) on something which would change the dtype, e.g. assign aTimestamp`` to an integer array.

Now we just up-convert to object, but providing an option to coerce or raise is palable, and in fact you really want the default to be raise (and actually do it rather than up-converting), which is almost always an accident)

@jreback
Copy link
Contributor Author

jreback commented Oct 5, 2017

updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEPR:change DataFrame.where(..,raise_on_error) -> errors
3 participants