Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: avoid overflow in Bday generate_range, closes #24252 #26651

Merged
merged 14 commits into from Jun 21, 2019

Conversation

@jbrockmendel
Copy link
Member

commented Jun 4, 2019

  • closes #24252
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@codecov

This comment has been minimized.

Copy link

commented Jun 4, 2019

Codecov Report

Merging #26651 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26651      +/-   ##
==========================================
- Coverage   91.87%   91.87%   -0.01%     
==========================================
  Files         174      174              
  Lines       50692    50696       +4     
==========================================
+ Hits        46575    46576       +1     
- Misses       4117     4120       +3
Flag Coverage Δ
#multiple 90.41% <100%> (ø) ⬆️
#single 41.8% <50%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 96.7% <100%> (+0.01%) ⬆️
pandas/io/gbq.py 78.94% <0%> (-10.53%) ⬇️
pandas/core/frame.py 97% <0%> (-0.12%) ⬇️
pandas/util/testing.py 90.91% <0%> (+0.1%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c07d71d...e200f32. Read the comment docs.

@codecov

This comment has been minimized.

Copy link

commented Jun 4, 2019

Codecov Report

Merging #26651 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #26651      +/-   ##
==========================================
- Coverage   91.87%   91.86%   -0.02%     
==========================================
  Files         180      180              
  Lines       50746    50751       +5     
==========================================
- Hits        46623    46621       -2     
- Misses       4123     4130       +7
Flag Coverage Δ
#multiple 90.45% <100%> (-0.01%) ⬇️
#single 41.09% <33.33%> (-0.1%) ⬇️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 96.36% <100%> (-0.33%) ⬇️
pandas/io/gbq.py 88.88% <0%> (-11.12%) ⬇️
pandas/core/frame.py 96.89% <0%> (-0.12%) ⬇️
pandas/util/testing.py 90.94% <0%> (+0.1%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cfd65e9...b06440c. Read the comment docs.

@topper-123

This comment has been minimized.

Copy link
Contributor

commented Jun 4, 2019

This doesn't solve the overflow type:

>>> date = pd.Timestamp.max.floor("D").to_pydatetime().date()  # datetime.date(2262, 4, 11)
>>> freq = 'B'
>>> pd.date_range(date, periods=2, freq=freq)  # see periods=2
OverflowError: int too big to convert

I would guess this should be a OutOfBoundsDatetime and not a OverflowError?

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 4, 2019

Yah, it should probably be an OutOfBoundsDatetime instead of OverflowError. Will update.

@jreback jreback added this to the 0.25.0 milestone Jun 6, 2019

Show resolved Hide resolved pandas/tseries/offsets.py
ts = convert_to_tsobject(ts_input, tz, unit, 0, 0, nanosecond or 0)
try:
ts = convert_to_tsobject(ts_input, tz, unit, 0, 0, nanosecond or 0)
except OverflowError:

This comment has been minimized.

Copy link
@jreback

jreback Jun 6, 2019

Contributor

can you update the doc-string of convert_ot_tsobject. Why are we not catching the OverflowError inside?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 7, 2019

Author Member

I put this here because I wanted to do this error-catching in python-space instead of c-space so that convert_to_tsobject could have better optimizations, but it looks like convert_to_tsobject already has some raising cases, so this can be moved there. will update.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 8, 2019

lgtm. ping when green.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 10, 2019

code & tests lgtm. can you add a whatsnew. ping on green.

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 18, 2019

ping

@jreback
Copy link
Contributor

left a comment

small comment, otherwise lgtm.

@@ -24,6 +24,7 @@ from pandas._libs.tslibs.conversion cimport (
from pandas._libs.tslibs.nattype cimport NPY_NAT, c_NaT as NaT
from pandas._libs.tslibs.np_datetime cimport (
check_dts_bounds, npy_datetimestruct, dt64_to_dtstruct)
from pandas._libs.tslibs.np_datetime import OutOfBoundsDatetime

This comment has been minimized.

Copy link
@jreback

jreback Jun 18, 2019

Contributor

is this used here?

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Jun 19, 2019

Author Member

Nope, removed.

I'm holding out hope that one day we'll be able to lint these files.

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 19, 2019

In one Travis build I'm seeing an error I also get locally on many branches:

_______ TestDataFrameConstructors.test_constructor_maskedrecarray_dtype ________
[gw1] linux -- Python 3.7.3 /home/travis/miniconda3/envs/pandas-dev/bin/python
self = <pandas.tests.frame.test_constructors.TestDataFrameConstructors object at 0x7fa6ace49f98>
    def test_constructor_maskedrecarray_dtype(self):
        # Ensure constructor honors dtype
        data = np.ma.array(
            np.ma.zeros(5, dtype=[('date', '<f8'), ('price', '<f8')]),
            mask=[False] * 5)
>       data = data.view(ma.mrecords.mrecarray)
E       AttributeError: module 'numpy.ma' has no attribute 'mrecords'

Any idea what this is all about?

@TomAugspurger

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

Not sure, but numpy.ma.mrecords doesn't seem to be imported by default. Would need import numpy.ma.mrecords before using ma.mrecords.

@jbrockmendel

This comment has been minimized.

Copy link
Member Author

commented Jun 20, 2019

Looks like we import ma.mrecords as a runtime import in the DataFrame constructor, so this ma.mrecords will not exist if we run these tests before that runtime import gets hit. (I'm guessing we started randomizing test order recently?). Just added the import in the appropriate place in the test so it shouldn't matter anymore.

ts = <int64_t>ts
except OverflowError:
# GH#26651 re-raise as OutOfBoundsDatetime
raise OutOfBoundsDatetime

This comment has been minimized.

Copy link
@TomAugspurger

TomAugspurger Jun 20, 2019

Contributor

Is including the bad ts in the exception message useful here? I imagine it's good for Timestamp(too_big_int), but not so much for other inputs (if they hit this).

@jreback jreback merged commit 388d22c into pandas-dev:master Jun 21, 2019

12 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
pandas-dev.pandas Build #20190620.29 succeeded
Details
pandas-dev.pandas (Checks) Checks succeeded
Details
pandas-dev.pandas (Docs) Docs succeeded
Details
pandas-dev.pandas (Linux py35_compat) Linux py35_compat succeeded
Details
pandas-dev.pandas (Linux py36_locale_slow) Linux py36_locale_slow succeeded
Details
pandas-dev.pandas (Linux py36_locale_slow_old_np) Linux py36_locale_slow_old_np succeeded
Details
pandas-dev.pandas (Linux py37_locale) Linux py37_locale succeeded
Details
pandas-dev.pandas (Linux py37_np_dev) Linux py37_np_dev succeeded
Details
pandas-dev.pandas (Windows py36_np15) Windows py36_np15 succeeded
Details
pandas-dev.pandas (Windows py37_np141) Windows py37_np141 succeeded
Details
pandas-dev.pandas (macOS py35_macos) macOS py35_macos succeeded
Details
@jreback

This comment has been minimized.

Copy link
Contributor

commented Jun 21, 2019

thanks @jbrockmendel nice patch!

@jbrockmendel jbrockmendel deleted the jbrockmendel:oob branch Jun 21, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.