Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: deprecate raise_on_error in .where/.mask in favor of errors= #17744

Merged
merged 1 commit into from Oct 5, 2017

Conversation

@jreback
Copy link
Contributor

commented Oct 2, 2017

closes #14968

@jreback jreback added the Deprecate label Oct 2, 2017

@jreback jreback added this to the 0.21.0 milestone Oct 2, 2017

@jsexauer jsexauer referenced this pull request Oct 2, 2017
0 of 89 tasks complete
@jorisvandenbossche
Copy link
Member

left a comment

The PR itself looks good, added some minor comments.

But more in general, can you actually give a use case ? I mean a small example where you see the effect of the keyword?
I haven't been able to make one, and it also seems you didn't have to change any test (the one test you changed, there the keyword has no effect at all). Just wondering, probably I am missing something, but if not I would rather deprecate completely without adding alternative.

return the results
errors : str, {'raise', 'ignore'}, default 'ignore'
'raise' : pass the error to the higher level
'ignore' : evaluate the op with and return the results

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Oct 2, 2017

Member

I know it was like this before, but for me the phrase "evaluate the op with" is not really clear what it means

This comment has been minimized.

Copy link
@jreback

jreback Oct 2, 2017

Author Contributor

yeah a bunch of copy-paste


if raise_on_error is not None:
warnings.warn(
"raise_on_error is deprecated in favor of errors='raise'",

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Oct 2, 2017

Member

the 'raise' is only for setting it to True, not False, so I would mention both options

This comment has been minimized.

Copy link
@jreback

jreback Oct 2, 2017

Author Contributor

oh right

assert_series_equal(result, expected)

result = s.where([True, False],
Timestamp('20130101', tz='US/Eastern'),
raise_on_error=True)
errors='ignore')

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Oct 2, 2017

Member

What is the difference between both of the above cases ?

(also , the errors has no effect)

This comment has been minimized.

Copy link
@jreback

jreback Oct 2, 2017

Author Contributor

right the kw is pretty much ignored in .where (that's part of the problem); its used in other routines. This is pretty much a mess internally, though making progress.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Oct 2, 2017

But more in general, can you actually give a use case ? I mean a small example where you see the effect of the keyword?
I haven't been able to make one, and it also seems you didn't have to change any test (the one test you changed, there the keyword has no effect at all). Just wondering, probably I am missing something, but if not I would rather deprecate completely without adding alternative.

this really doesn't change anything; just had to fix some actual calls that causes the deprecation warning. The point of this is make things consistent. In the future I actually do want to change the defaults I think (and make 'coerce') possible. This is just conforming things a bit.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Oct 2, 2017

I know this PR is not changing things, it is just converting the previous situation to a new more consistent one. That is fine. But if it is really a useless keyword, it is also an opportunity to clean things up, and to not just translate the mistakes (?) of the previous keyword to the new one.

Maybe I am missing something, but so my question still is: is there a case errors='ignore' actually does something?
If it currently has no effect, we shouldn't add the new errors keyword, but just deprecate the old raise_on_errors. Or if adding the new one, have errors='raise' as the only option.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Oct 2, 2017

Maybe I am missing something, but so my question still is: is there a case errors='ignore' actually does something?
If it currently has no effect, we shouldn't add the new errors keyword, but just deprecate the old raise_on_errors. Or if adding the new one, have errors='raise' as the only option.

We DO need this, it is just not fully working ATM; it was turned off a while back. We need the option to raise, or ignore where the dtype is actually changing (rather than just do it, or in some cases coerce), this is a very common choice which we do in lots of places.

I don't think its a big deal to add the kw. Sure I can prob only allow 'raise' for now. Though we DO need it in a couple of places.

@codecov

This comment has been minimized.

Copy link

commented Oct 2, 2017

Codecov Report

Merging #17744 into master will decrease coverage by 0.04%.
The diff coverage is 84.61%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #17744      +/-   ##
==========================================
- Coverage   91.24%    91.2%   -0.05%     
==========================================
  Files         163      163              
  Lines       49962    49973      +11     
==========================================
- Hits        45588    45576      -12     
- Misses       4374     4397      +23
Flag Coverage Δ
#multiple 89% <84.61%> (-0.03%) ⬇️
#single 40.25% <30.76%> (-0.08%) ⬇️
Impacted Files Coverage Δ
pandas/core/panel.py 96.94% <ø> (ø) ⬆️
pandas/core/computation/expressions.py 0% <ø> (ø) ⬆️
pandas/core/frame.py 97.74% <100%> (-0.1%) ⬇️
pandas/core/series.py 94.89% <100%> (ø) ⬆️
pandas/core/sparse/frame.py 94.78% <100%> (ø) ⬆️
pandas/core/ops.py 91.76% <75%> (ø) ⬆️
pandas/core/generic.py 92.03% <81.81%> (-0.07%) ⬇️
pandas/core/internals.py 94.37% <83.33%> (ø) ⬆️
pandas/io/gbq.py 25% <0%> (-58.34%) ⬇️
pandas/plotting/_converter.py 63.38% <0%> (-1.82%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3fae8dd...552437b. Read the comment docs.

@jreback jreback force-pushed the jreback:where branch from 8b0773f to 8f6a72b Oct 3, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Oct 3, 2017

this should be much cleaner now. final thoughts...... @jorisvandenbossche @TomAugspurger

@TomAugspurger
Copy link
Contributor

left a comment

+1 for doing this now, if it makes things easier in the future.

- ``raise`` : allow exceptions to be raised
- ``ignore`` : suppress exceptions. On error return original object
values :

This comment has been minimized.

Copy link
@TomAugspurger

TomAugspurger Oct 3, 2017

Contributor

Looks like an unfinished docstring here.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Oct 4, 2017

I am fine with having the keyword, I would just update the actual docstring (of the user facing where and mask) to not be misleading (as the current explanation is not correct).

I didn't follow the earlier discussion (#16821), but just to clarify: what is the idea with errors keyword in the future? You say you would like to change this? But to what?
Because the linked PR actually changed it to not error anymore if a non-matching dtype value is inserted, but to convert to object.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Oct 5, 2017

I didn't follow the earlier discussion (#16821), but just to clarify: what is the idea with errors keyword in the future? You say you would like to change this? But to what?
Because the linked PR actually changed it to not error anymore if a non-matching dtype value is inserted, but to convert to object.

well the problem is we are effectively always errors='coerce' now (even though we don't call it that), and never errors='raise```, eg. say you do a .where()(or lots of other ops like.fillna, .setitemetc.) on something which would change the dtype, e.g. assign aTimestamp`` to an integer array.

Now we just up-convert to object, but providing an option to coerce or raise is palable, and in fact you really want the default to be raise (and actually do it rather than up-converting), which is almost always an accident)

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Oct 5, 2017

updated.

@jreback jreback force-pushed the jreback:where branch from 8f6a72b to 9a68746 Oct 5, 2017

@jreback jreback force-pushed the jreback:where branch from 9a68746 to 552437b Oct 5, 2017

@jreback jreback merged commit 7a57b83 into pandas-dev:master Oct 5, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

ghost pushed a commit to reef-technologies/pandas that referenced this pull request Oct 16, 2017

@jreback jreback referenced this pull request Jun 30, 2019
141 of 141 tasks complete
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.