BUG: DataFrame.replace with out of bound datetime causing RecursionError #22108

Merged: 7 commits, Aug 1, 2018

@minggli minggli commented Jul 28, 2018

By default, the convert keyword is True in the replace methods of Block and ObjectBlock. When values in a DataFrame are replaced, each block is processed separately, and conversion happens along the way if convert=True. An OutOfBoundsDatetime (a subclass of ValueError) raised by lib.maybe_convert_objects caused Block.replace and ObjectBlock.replace to form an infinite recursive loop.

As mentioned in #20380, setting convert=False (a private keyword, not exposed publicly) solves the issue at hand, but there appear to be other use cases that expect it to be True by default. For now, this PR simply wraps a try...except block around the lib.maybe_convert_objects call in the cast module. The downside is that it silently catches every ValueError raised in maybe_convert_objects.
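
For readers who want to see the kind of input involved, here is a minimal sketch. The column names and values are illustrative assumptions, not copied from this PR's diff; the only requirement is that the frame holds a datetime beyond the nanosecond-resolution Timestamp range (which ends in the year 2262). On a pandas version that includes this fix, the call should return normally; on affected versions it is the sort of call that hit the RecursionError.

```python
from datetime import datetime

import pandas as pd

# Illustrative frame (assumed values): 3017-12-20 is later than
# pd.Timestamp.max, so it cannot be stored as datetime64[ns].
df = pd.DataFrame({"dt": [datetime(3017, 12, 20)], "str": ["foo"]})

# Replace an unrelated string value. The datetime column is not the
# target of the replacement, but it is still processed along the way.
result = df.replace("foo", "bar")

print(result["str"].tolist())
```

The replacement itself never touches the datetime column, which is what made the failure surprising: the problem came from the conversion step, not from the values being replaced.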


codecov bot commented Jul 29, 2018

Codecov Report

Merging #22108 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #22108      +/-   ##
==========================================
+ Coverage   92.06%   92.07%   +0.01%     
==========================================
  Files         170      170              
  Lines       50705    50693      -12     
==========================================
- Hits        46680    46675       -5     
+ Misses       4025     4018       -7

Flag         Coverage Δ
#multiple    90.48% <100%> (+0.01%) ⬆️
#single      42.31% <50%> (ø) ⬆️

Impacted Files                     Coverage Δ
pandas/core/internals/blocks.py    94.45% <100%> (-0.02%) ⬇️
pandas/core/dtypes/cast.py         88.58% <100%> (+0.05%) ⬆️
pandas/core/tools/datetimes.py     84.78% <0%> (-0.44%) ⬇️
pandas/io/sas/sas_xport.py         90.23% <0%> (-0.05%) ⬇️
pandas/io/formats/html.py          88.81% <0%> (-0.04%) ⬇️
pandas/core/arrays/interval.py     92.33% <0%> (-0.03%) ⬇️
pandas/core/sparse/frame.py        94.78% <0%> (-0.02%) ⬇️
pandas/core/indexes/interval.py    94.11% <0%> (-0.02%) ⬇️
pandas/core/nanops.py              95.12% <0%> (-0.01%) ⬇️
... and 15 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0b7a08b...cb2a3c6.

@jreback added the Bug and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Jul 29, 2018
@jreback added this to the 0.24.0 milestone Jul 29, 2018

@jreback jreback left a comment

lgtm. can you add a whatsnew, bug fix in 0.24.0, reshaping

    try:
        values = lib.maybe_convert_objects(values,
                                           convert_datetime=datetime)
    except ValueError:
        pass
Contributor

can you add a comment on when the ValueError is hit

Contributor Author

commented, used OutOfBoundsDatetime to be more clear on the error
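
The reasoning behind narrowing the except clause can be shown with a small standalone sketch. Everything here is hypothetical: OutOfBoundsError stands in for pandas' OutOfBoundsDatetime, and convert_objects and soft_convert are toy functions, not the code in the cast module. The point is only that catching the specific subclass leaves other ValueErrors visible.

```python
class OutOfBoundsError(ValueError):
    """Hypothetical stand-in for pandas' OutOfBoundsDatetime."""


def convert_objects(values):
    """Toy converter: parses years, rejecting any later than 2262."""
    converted = []
    for value in values:
        year = int(value)  # raises a plain ValueError on malformed input
        if year > 2262:
            raise OutOfBoundsError(f"year {year} is out of bounds")
        converted.append(year)
    return converted


def soft_convert(values):
    """Convert if possible; return the input unchanged if out of bounds."""
    try:
        return convert_objects(values)
    except OutOfBoundsError:
        # Only the expected failure is swallowed here. Any other
        # ValueError still propagates to the caller.
        return values


print(soft_convert(["2018"]))
print(soft_convert(["2018", "3017"]))
```

With a bare except ValueError, a call such as soft_convert(["abc"]) would also be silenced, which is the downside noted in the PR description.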

datetime(2018, 7, 28),
datetime(2018, 5, 28)])}),
datetime(2018, 5, 28), datetime(2018, 7, 28),
DataFrame({'datetime64': Index([datetime(2018, 7, 28)] * 3)})),
Contributor

can you add a datetime64 w/tz here as well (if it's an easy add)

Contributor Author

sure! 👍
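
As a rough illustration of what a timezone-aware case exercises, the sketch below replaces one tz-aware timestamp with another. The column name, dates, and the choice of UTC are assumptions made for the example; the test parameters actually added to the PR are not shown in this conversation.

```python
import pandas as pd

# Illustrative tz-aware column (assumed values); UTC keeps the example simple.
df = pd.DataFrame({"when": pd.date_range("2018-07-26", periods=3, tz="UTC")})

old = pd.Timestamp("2018-07-27", tz="UTC")
new = pd.Timestamp("2018-07-28", tz="UTC")

# Replace one tz-aware value with another; the column should stay tz-aware.
result = df.replace(old, new)

print(result)
```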


minggli commented Jul 31, 2018

@jreback items actioned. additional feedback welcome.

@jreback jreback merged commit 57c7daa into pandas-dev:master Aug 1, 2018

jreback commented Aug 1, 2018

thanks @minggli keep them coming!

@minggli minggli deleted the bugfix/replace_recursion_dt branch August 1, 2018 07:21

minggli commented Aug 1, 2018

Thanks for reviewing!

Successfully merging this pull request may close these issues.

RecursionError in DataFrame.replace with Out of Bounds Datetime
2 participants