
DateRange.tz_normalize(tz) problem when tz=pytz.utc #969

@ijmcf

Hello

I believe there is a problem when using DateRange.tz_normalize(tz) when the tz is pytz.utc (but not other pytz timezones):

I create a DateRange using a DateOffset that is Week(weekday=1), and then use the built-in DateRange methods tz_localize and tz_normalize to account for time zones. However, I get something weird if the time zone I use is pytz.UTC:

dr = pandas.DateRange(start=start, end=end, offset=pandas.core.datetools.Week(weekday=1))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: None
[2010-04-13 00:00:00, ..., 2012-03-06 00:00:00]
length: 100

Now localize:

drl = dr.tz_localize(pytz.timezone('US/Eastern'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: US/Eastern
[2010-04-13 00:00:00-04:00, ..., 2012-03-06 00:00:00-05:00]
length: 100

And normalize:

drl.tz_normalize(pytz.timezone('UTC'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: UTC
[2010-04-13 00:00:00+00:00, ..., 2012-03-06 00:00:00+00:00]
length: 100

Bzzzt. The time zone info has changed, but the times have remained the same!

But if I use any other timezone, it works:

drl.tz_normalize(pytz.timezone('Europe/London'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: Europe/London
[2010-04-13 05:00:00+01:00, ..., 2012-03-06 05:00:00+00:00]
length: 100

Or even with 'Etc/UTC':

drl.tz_normalize(pytz.timezone('Etc/UTC'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: Etc/UTC
[2010-04-13 04:00:00+00:00, ..., 2012-03-06 05:00:00+00:00]
length: 100

Why does the DateRange.tz_normalize() method fail with 'UTC' but work with other timezones (including 'Etc/UTC')? I looked at the code for DateRange.tz_normalize(), and it's a very simple invocation of pytz tz.normalize().

But the pytz documentation (which isn't easy to parse, IMO) suggests that when the target is UTC, normalize() alone is insufficient. For example:

import datetime
import pytz

dt = datetime.datetime(2012, 3, 26, 12, 0)
dtl = pytz.timezone('US/Eastern').localize(dt)
datetime.datetime(2012, 3, 26, 12, 0, tzinfo=<DstTzInfo 'US/Eastern' EDT-1 day, 20:00:00 DST>)
print(pytz.timezone('Etc/UTC').normalize(dtl))  # as expected
2012-03-26 16:00:00+00:00
print(pytz.timezone('UTC').normalize(dtl))  # bzzzt: tzinfo changed, time did not
2012-03-26 12:00:00+00:00
print(dtl.astimezone(pytz.timezone('UTC')))  # as expected
2012-03-26 16:00:00+00:00
print(pytz.timezone('UTC').normalize(dtl.astimezone(pytz.timezone('UTC'))))  # as expected
2012-03-26 16:00:00+00:00

I don't know why this behavior of normalize() should be limited to the canonical pytz UTC timezone (versus e.g. 'Etc/UTC'), but it appears to be because pytz.utc is a specialized singleton whose class behaves differently from the other pytz timezone classes.
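
The output above is consistent with normalize() on the UTC singleton swapping the tzinfo label without shifting the clock fields, whereas astimezone() keeps the instant and recomputes the fields. Here is a minimal sketch of that distinction using only the standard library. The fixed-offset EDT zone is an assumed stand-in for US/Eastern, and replace(tzinfo=...) is used to mimic the observed behavior; it is not taken from the pytz source.

```python
from datetime import datetime, timedelta, timezone

# Assumed stand-in for US/Eastern during daylight saving time (UTC-4).
EDT = timezone(timedelta(hours=-4), "EDT")

dtl = datetime(2012, 3, 26, 12, 0, tzinfo=EDT)

# Relabeling: swaps the tzinfo, leaves the clock fields alone,
# so the result names a different instant.
relabeled = dtl.replace(tzinfo=timezone.utc)

# Converting: keeps the instant, recomputes the clock fields.
converted = dtl.astimezone(timezone.utc)

print(relabeled)  # 2012-03-26 12:00:00+00:00
print(converted)  # 2012-03-26 16:00:00+00:00
```

Aware datetimes compare by instant, so `converted == dtl` holds while `relabeled == dtl` does not, which gives a quick way to check whether a conversion preserved the underlying times.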

I would suggest that DateRange.tz_normalize() should use either dt.astimezone(pytz.utc) or pytz.utc.normalize(dt.astimezone(pytz.utc)) at line 417.
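
A rough sketch of the suggested approach, outside of pandas: convert each element with astimezone() instead of relying on normalize() alone. The helper name convert_dates and the fixed-offset EDT zone are hypothetical and chosen for illustration; this is not the DateRange implementation.

```python
from datetime import datetime, timedelta, timezone


def convert_dates(dates, tz):
    """Convert aware datetimes to tz, preserving the instant each one names."""
    converted = []
    for dt in dates:
        if dt.tzinfo is None:
            # astimezone() on a naive value would assume system local time.
            raise ValueError("naive datetime: localize before converting")
        converted.append(dt.astimezone(tz))
    return converted


# Assumed stand-in for US/Eastern during daylight saving time (UTC-4).
EDT = timezone(timedelta(hours=-4), "EDT")

# Three weekly Tuesdays at midnight, mirroring the start of the range above.
weekly = [datetime(2010, 4, 13, tzinfo=EDT) + timedelta(weeks=i) for i in range(3)]
in_utc = convert_dates(weekly, timezone.utc)

# The instants must be unchanged; only the representation differs.
assert all(before == after for before, after in zip(weekly, in_utc))
print(in_utc[0])  # 2010-04-13 04:00:00+00:00
```

With pytz timezones the same call should apply, since the astimezone() example in the report gives the expected result for pytz.utc.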

Iain
