Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG/TST: Add more timezone fixtures and use is_utc more consistently #23807

Merged
merged 25 commits into from Nov 23, 2018

Conversation

Projects
None yet
4 participants
@mroeschke
Copy link
Member

commented Nov 20, 2018

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Was discussed a while back with @jreback to add more timezones to the tz_aware_fixture, added some less tested ones like:

  • dateutil.tz.tzlocal
  • pytz.FixedOffset

This uncovered some bugs with tzlocal especially near DST transitions.

Additionally, using is_utc more consistently for UTC checking across the codebase and no longer passing the string 'UTC' around. Instead the UTC timezone object (i.e. pytz.UTC or dateutil.tz.tzutc()) is passed along. However, for single value conversion to/from UTC, pytz.UTC is used by default.

@pep8speaks

This comment has been minimized.

Copy link

commented Nov 20, 2018

Hello @mroeschke! Thanks for submitting the PR.

@mroeschke mroeschke changed the title BUG/TST: Add more timezone fixtures and use is_utc more consistently [WIP] BUG/TST: Add more timezone fixtures and use is_utc more consistently Nov 20, 2018

@@ -1248,6 +1248,8 @@ Timezones
- Bug when indexing a :class:`Series` with a DST transition (:issue:`21846`)
- Bug in :meth:`DataFrame.resample` and :meth:`Series.resample` where an ``AmbiguousTimeError`` or ``NonExistentTimeError`` would raise if a timezone aware timeseries ended on a DST transition (:issue:`19375`, :issue:`10117`)
- Bug in :meth:`DataFrame.drop` and :meth:`Series.drop` when specifying a tz-aware Timestamp key to drop from a :class:`DatetimeIndex` with a DST transition (:issue:`21761`)
- Bug in :class:`DatetimeIndex` constructor where :class:`NaT` and ``dateutil.tz.tzlocal`` would raise an ``OutOfBoundsDatetime`` error (:issue:`23807`)

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Nov 20, 2018

Member

nitpick: I'm guessing this happens for users for whom tzlocal is UTC-N but not those for whom tzlocal is UTC+N (@jorisvandenbossche can you check this when convenient?)

This comment has been minimized.

Copy link
@mroeschke

mroeschke Nov 21, 2018

Author Member

This was actually because NaT wasn't handled in tz_localize_to_utc so NaT was converted to its i8 value which was less than Timestamp.min

https://github.com/pandas-dev/pandas/pull/23807/files#diff-66d289312fd0b02e7721bf45fb281797R911

@@ -664,7 +670,8 @@ cdef inline int64_t[:] _tz_convert_dst(int64_t[:] values, tzinfo tz,


cdef inline int64_t _tz_convert_tzlocal_utc(int64_t val, tzinfo tz,
bint to_utc=True):
bint to_utc=True,
bint from_utc=False):

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Nov 20, 2018

Member

Isn't from_utc just equal to not to_utc?

This comment has been minimized.

Copy link
@mroeschke

mroeschke Nov 21, 2018

Author Member

Hmm it appears that way in general. I can try to remove the new keyword arg for not to_utc

@@ -354,7 +354,8 @@ def unique_nulls_fixture(request):


TIMEZONES = [None, 'UTC', 'US/Eastern', 'Asia/Tokyo', 'dateutil/US/Pacific',
'dateutil/Asia/Singapore']
'dateutil/Asia/Singapore', tzutc(), tzlocal(), FixedOffset(300),
FixedOffset(0), FixedOffset(-300)]

This comment has been minimized.

Copy link
@jbrockmendel

jbrockmendel Nov 20, 2018

Member

Brainstorm:

  • non-round numbers for FixedOffset
  • we have a small number of tests with tzinfos from psycopg2.tz and matplotlib (not sure where in the matplotlib namespace)
  • random subset of pytz.all_timezones (592 of them)
  • equivalent from dateutil.tz (not sure off the top of my head how to access that, but if we figure it out I'll make a PR there)

This comment has been minimized.

Copy link
@jreback

jreback Nov 20, 2018

Contributor

how much additional time do the tests take with these added fixtures? how many more tests

This comment has been minimized.

Copy link
@mroeschke

mroeschke Nov 21, 2018

Author Member

This comment has been minimized.

Copy link
@mroeschke

mroeschke Nov 21, 2018

Author Member

We can add these in a follow up and further discuss.

@@ -8,6 +8,7 @@ from numpy cimport int64_t, int32_t, ndarray
cnp.import_array()

import pytz
from dateutil.tz import tzutc

This comment has been minimized.

Copy link
@jreback

jreback Nov 20, 2018

Contributor

why are you switching this?

This comment has been minimized.

Copy link
@mroeschke

mroeschke Nov 21, 2018

Author Member

No switching, this is just used here for timezone conversion between tzlocal and tzutc (making sure they are both dateutil timezones so they play nice) https://github.com/pandas-dev/pandas/pull/23807/files/de3c83bb946a7791d29a18db5f004228b24c9b02#diff-66d289312fd0b02e7721bf45fb281797R705

This comment has been minimized.

Copy link
@jreback

jreback Nov 23, 2018

Contributor

ok

@@ -354,7 +354,8 @@ def unique_nulls_fixture(request):


TIMEZONES = [None, 'UTC', 'US/Eastern', 'Asia/Tokyo', 'dateutil/US/Pacific',
'dateutil/Asia/Singapore']
'dateutil/Asia/Singapore', tzutc(), tzlocal(), FixedOffset(300),
FixedOffset(0), FixedOffset(-300)]

This comment has been minimized.

Copy link
@jreback

jreback Nov 20, 2018

Contributor

how much additional time do the tests take with these added fixtures? how many more tests

@@ -409,8 +409,8 @@ def test_to_datetime_tz_pytz(self, cache):
hour=3, minute=0))],
dtype=object)
result = pd.to_datetime(arr, utc=True, cache=cache)
expected = DatetimeIndex(['2000-01-01 08:00:00+00:00',

This comment has been minimized.

Copy link
@jreback

jreback Nov 20, 2018

Contributor

I find this surprising that you change this

This comment has been minimized.

Copy link
@mroeschke

mroeschke Nov 21, 2018

Author Member

Thanks, yeah this was the wrong fix. Addressed

@jreback jreback added the Timezones label Nov 21, 2018

@mroeschke mroeschke changed the title [WIP] BUG/TST: Add more timezone fixtures and use is_utc more consistently BUG/TST: Add more timezone fixtures and use is_utc more consistently Nov 21, 2018

Matt Roeschke added some commits Nov 21, 2018

@codecov

This comment has been minimized.

Copy link

commented Nov 22, 2018

Codecov Report

Merging #23807 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #23807      +/-   ##
==========================================
+ Coverage   92.28%   92.28%   +<.01%     
==========================================
  Files         161      161              
  Lines       51500    51501       +1     
==========================================
+ Hits        47528    47529       +1     
  Misses       3972     3972
Flag Coverage Δ
#multiple 90.68% <100%> (ø) ⬆️
#single 42.31% <100%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/tseries/offsets.py 96.98% <ø> (ø) ⬆️
pandas/core/dtypes/dtypes.py 95.61% <100%> (ø) ⬆️
pandas/tseries/frequencies.py 97.08% <100%> (+0.01%) ⬆️
pandas/core/arrays/datetimes.py 98.51% <100%> (ø) ⬆️
pandas/io/pytables.py 92.3% <0%> (-0.05%) ⬇️
pandas/io/formats/printing.py 93.61% <0%> (+0.53%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e405a10...15ba5fb. Read the comment docs.

@jreback jreback added this to the 0.24.0 milestone Nov 23, 2018

@jreback

This comment has been minimized.

Copy link
Contributor

commented Nov 23, 2018

looks good. does this add a lot more testing? e.g. now that we are having a bigger tz matrix (and use this fixture a fair amount)

@mroeschke

This comment has been minimized.

Copy link
Member Author

commented Nov 23, 2018

Looking at two, 3.6 travis build, we will go from 36028 tests to 37606 tests (783.49 seconds to 931.50 seconds).

@jreback jreback merged commit 94ce05d into pandas-dev:master Nov 23, 2018

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
pandas-dev.pandas Build #20181122.3 succeeded
Details
@jreback

This comment has been minimized.

Copy link
Contributor

commented Nov 23, 2018

thanks! of course if this does close any existing issues, pls update them.

@mroeschke mroeschke deleted the mroeschke:add_more_timezones branch Nov 23, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.