BUG: silent overflow in DateTimeArray subtraction #22508

Merged
merged 5 commits into pandas-dev:master on Aug 31, 2018

5 participants
@shengpu1126
Contributor

commented Aug 26, 2018

  • closes #22492
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@shengpu1126

Contributor Author

commented Aug 26, 2018

Hi @jbrockmendel , thanks for the follow-up.

  • With the suggested change, it now throws OverflowError: Overflow in int64 addition. Is this desired behavior?
  • Where should I add the tests? Maybe under tests/arrays/test_datetimelike.py?
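
To make the behavior in the first bullet concrete, here is a minimal pure-Python sketch of a checked int64 subtraction that raises instead of silently wrapping. The helper names (checked_sub_int64, sub_int64_arrays) and the error message are hypothetical and chosen for illustration; this is not the pandas implementation.

```python
# Illustrative sketch only; not pandas internals.
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def checked_sub_int64(left, right):
    """Subtract two integers, raising if the result does not fit in int64."""
    result = left - right  # Python ints are unbounded, so this is exact
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError("Overflow in int64 subtraction")
    return result


def sub_int64_arrays(lefts, rights):
    """Element-wise checked subtraction over two equal-length sequences."""
    return [checked_sub_int64(a, b) for a, b in zip(lefts, rights)]


# One in-range pair and one pair whose difference exceeds INT64_MAX.
try:
    sub_int64_arrays([10, INT64_MAX], [3, -1])
    raised = False
except OverflowError:
    raised = True
```

The point of the sketch is only that a single out-of-range element makes the whole array operation raise, rather than returning a wrapped value for that element.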
@codecov


commented Aug 26, 2018

Codecov Report

❗️ No coverage uploaded for pull request base (master@fa47b8d).
The diff coverage is 100%.


@@            Coverage Diff            @@
##             master   #22508   +/-   ##
=========================================
  Coverage          ?   92.03%           
=========================================
  Files             ?      169           
  Lines             ?    50780           
  Branches          ?        0           
=========================================
  Hits              ?    46737           
  Misses            ?     4043           
  Partials          ?        0
Flag         Coverage Δ
#multiple    90.44% <100%> (?)
#single      42.25% <100%> (?)

Impacted Files                     Coverage Δ
pandas/core/arrays/datetimes.py    95.52% <100%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fa47b8d...cfaf42b.

@gfyoung

Member

commented Aug 26, 2018

@shengpu1126 : You should make the changes so that it gets the expected output that you reported in the original issue. And yes, that does seem like a good place to put the test. Don't forget a whatsnew entry too.

@jbrockmendel

Member

commented Aug 26, 2018

@gfyoung it isn’t that easy. If you look at the scalar op in the original example, it returns a stdlib timedelta, not a pd.Timedelta.

@shengpu1126 I think raising OverflowError is correct; will double-check.
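
A small stdlib-only sketch of why the scalar case is awkward: a datetime.timedelta can hold a span that does not fit in 64-bit nanoseconds. The two dates are arbitrary values picked for illustration, and total_nanoseconds is a hypothetical helper, not part of any library.

```python
from datetime import datetime

INT64_MAX = 2**63 - 1


def total_nanoseconds(delta):
    """Exact nanosecond count of a timedelta, using integer arithmetic only."""
    seconds = delta.days * 86_400 + delta.seconds
    return seconds * 10**9 + delta.microseconds * 1_000


# Two dates roughly 584 years apart (illustrative values).
early = datetime(1677, 9, 22)
late = datetime(2262, 4, 11)

span = late - early  # a stdlib timedelta; no overflow at this size
fits_in_int64_ns = abs(total_nanoseconds(span)) <= INT64_MAX
```

Here fits_in_int64_ns is False: the stdlib type represents the span without trouble, while a nanosecond count stored in an int64 cannot.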

@gfyoung

Member

commented Aug 26, 2018

it isn’t that easy. If you look at the scalar op in the original example, it returns a stdlib timedelta, not a pd.Timedelta.

@jbrockmendel : Ah, I see. My response was motivated by the fact that you didn't comment on @shengpu1126's expected output in the original issue, which I took to be tacit agreement on the expected behavior for patching this bug.

@jbrockmendel

Member

commented Aug 26, 2018

Test will go in tests.arithmetic.test_datetime64

@pep8speaks


commented Aug 28, 2018

Hello @shengpu1126! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on August 30, 2018 at 15:56 Hours UTC
@@ -582,6 +582,7 @@ Datetimelike
- Bug in :class:`DataFrame` comparisons against ``Timestamp``-like objects failing to raise ``TypeError`` for inequality checks with mismatched types (:issue:`8932`,:issue:`22163`)
- Bug in :class:`DataFrame` with mixed dtypes including ``datetime64[ns]`` incorrectly raising ``TypeError`` on equality comparisons (:issue:`13128`,:issue:`22163`)
- Bug in :meth:`DataFrame.eq` comparison against ``NaT`` incorrectly returning ``True`` or ``NaN`` (:issue:`15697`,:issue:`22163`)
- Bug in :class:`DatetimeIndex` subtraction that incorrectly failed to raise `OverflowError` (:issue:22492, :issue:22508)

@jreback

jreback Aug 29, 2018

Contributor

can you add a little more here, e.g. when this fails to raise

@@ -1570,6 +1570,27 @@ def test_datetimeindex_sub_timestamp_overflow(self):
            with pytest.raises(OverflowError):
                dtimin - variant

    def test_datetimeindex_sub_datetimeindex_overflow(self):
        dtimax = pd.to_datetime(['now', pd.Timestamp.max])

@jreback

jreback Aug 29, 2018

Contributor

can you add a comment with the issue number

@jreback jreback added this to the 0.24.0 milestone Aug 29, 2018

@jreback
Contributor

left a comment

minor comments. ping on green (or at least green except for the 3.6/3.5 builds, which are currently failing due to an unrelated error). so you might want to wait a little bit to rebase.

@@ -582,6 +582,7 @@ Datetimelike
- Bug in :class:`DataFrame` comparisons against ``Timestamp``-like objects failing to raise ``TypeError`` for inequality checks with mismatched types (:issue:`8932`,:issue:`22163`)
- Bug in :class:`DataFrame` with mixed dtypes including ``datetime64[ns]`` incorrectly raising ``TypeError`` on equality comparisons (:issue:`13128`,:issue:`22163`)
- Bug in :meth:`DataFrame.eq` comparison against ``NaT`` incorrectly returning ``True`` or ``NaN`` (:issue:`15697`,:issue:`22163`)
- Bug in :class:`DatetimeIndex` subtraction that incorrectly failed to raise `OverflowError` when the difference between :class:`Timestamp`s (a :class:`Timedelta` object) exceeds `int64` limit (:issue:`22492`, :issue:`22508`)

@jreback

jreback Aug 30, 2018

Contributor

this is now confusing. let's go back to the prior one;

@shengpu1126

shengpu1126 Aug 30, 2018

Author Contributor

sorry about that!
For the record, it was mentioned here that pd.Timedelta values are represented as nanoseconds using 64-bit integers, and I believe that's the source of the potential overflow.
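
The arithmetic behind that remark can be sketched in plain Python. This assumes only that values are signed 64-bit nanosecond counts; wrap_int64 is a hypothetical helper that mimics two's-complement wraparound, and the year length of 365.25 days is an approximation used for the estimate.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def wrap_int64(value):
    """Reduce an arbitrary Python int to its two's-complement int64 value."""
    return (value - INT64_MIN) % 2**64 + INT64_MIN


# Largest span a signed 64-bit nanosecond count can hold, in years.
NS_PER_YEAR = 365.25 * 86_400 * 10**9
max_years = INT64_MAX / NS_PER_YEAR  # a little over 292 years

# Difference between the two extremes of the representable range.
true_diff = INT64_MAX - (-INT64_MAX)  # exact value: 2**64 - 2
wrapped_diff = wrap_int64(true_diff)  # what unchecked int64 arithmetic yields
```

The exact difference needs 65 bits, so an unchecked int64 subtraction wraps around to a small negative number, which is the kind of silent wrong answer the PR title refers to.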

            dtimax - ts_neg

        expected = pd.Timestamp.max.value - ts_pos[1].value
        res = dtimax - ts_pos

@jreback

jreback Aug 30, 2018

Contributor

use result (rather than res) in all occurrences

        assert res[1].value == expected

        with pytest.raises(OverflowError):
            dtimin - ts_pos

@jreback

jreback Aug 30, 2018

Contributor

can you add a comment or 2 to delineate the various cases

@jreback jreback merged commit 8c128a3 into pandas-dev:master Aug 31, 2018

5 of 6 checks passed

ci/circleci: py36_locale Your tests failed on CircleCI
ci/circleci: py27_compat Your tests passed on CircleCI!
ci/circleci: py35_ascii Your tests passed on CircleCI!
ci/circleci: py36_locale_slow Your tests passed on CircleCI!
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@jreback

Contributor

commented Aug 31, 2018

thanks!

Sup3rGeo added a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
