Description
Hello
I believe there is a problem with DateRange.tz_normalize(tz) when tz is pytz.utc (but not with other pytz timezones):
I create a DateRange using a DateOffset that is Week(weekday=1), and then use the built-in DateRange methods tz_localize and tz_normalize to account for time zones. However, I get something weird if the time zone I use is pytz.UTC:
dr = pandas.DateRange(start=start, end=end, offset=pandas.core.datetools.Week(weekday=1))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: None
[2010-04-13 00:00:00, ..., 2012-03-06 00:00:00]
length: 100
Now localize:
drl = dr.tz_localize(pytz.timezone('US/Eastern'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: US/Eastern
[2010-04-13 00:00:00-04:00, ..., 2012-03-06 00:00:00-05:00]
length: 100
And normalize:
drl.tz_normalize(pytz.timezone('UTC'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: UTC
[2010-04-13 00:00:00+00:00, ..., 2012-03-06 00:00:00+00:00]
length: 100
Bzzzt. The time zone info has changed, but the times have remained the same!
But if I use any other timezone, it works:
drl.tz_normalize(pytz.timezone('Europe/London'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: Europe/London
[2010-04-13 05:00:00+01:00, ..., 2012-03-06 05:00:00+00:00]
length: 100
Or even with 'Etc/UTC':
drl.tz_normalize(pytz.timezone('Etc/UTC'))
<class 'pandas.core.daterange.DateRange'>
offset: <1 Week: kwds={'weekday': 1}, weekday=1>, tzinfo: Etc/UTC
[2010-04-13 04:00:00+00:00, ..., 2012-03-06 05:00:00+00:00]
length: 100
Why does the DateRange.tz_normalize() method fail with 'UTC' but work with other timezones (including 'Etc/UTC')? I looked at the code for DateRange.tz_normalize(), and it's a very simple invocation of pytz tz.normalize().
But the pytz documentation (which isn't easy to parse, IMO) suggests that when the target is UTC, normalize() alone is insufficient. For example:
dt = datetime.datetime(2012, 3, 26, 12, 0)
dtl = pytz.timezone('US/Eastern').localize(dt)
datetime.datetime(2012, 3, 26, 12, 0, tzinfo=<DstTzInfo 'US/Eastern' EDT-1 day, 20:00:00 DST>)
print pytz.timezone('Etc/UTC').normalize(dtl)  # as expected
2012-03-26 16:00:00+00:00
print pytz.timezone('UTC').normalize(dtl)  # bzzzt
2012-03-26 12:00:00+00:00
print dtl.astimezone(pytz.timezone('UTC'))  # as expected
2012-03-26 16:00:00+00:00
print pytz.timezone('UTC').normalize(dtl.astimezone(pytz.timezone('UTC')))  # as expected
2012-03-26 16:00:00+00:00
I don't know why this behavior of normalize() is limited to the canonical pytz UTC timezone (versus e.g. 'Etc/UTC'), but it appears to be because pytz.utc is a specialized singleton class that behaves differently from the other tz classes.
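The 12:00 result above looks like what you get when a datetime's tzinfo is swapped without shifting the clock time. Here is a minimal standard-library sketch of that distinction. It does not use pytz and makes no claim about pytz internals; the fixed UTC-4 offset is an assumption standing in for US/Eastern on that date.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-4 offset stands in for US/Eastern (EDT) on 2012-03-26.
EDT = timezone(timedelta(hours=-4), "EDT")

dtl = datetime(2012, 3, 26, 12, 0, tzinfo=EDT)

# Relabeling: only the tzinfo changes, the clock time stays at 12:00.
relabeled = dtl.replace(tzinfo=timezone.utc)

# Converting: the same instant expressed in UTC, so the clock time moves to 16:00.
converted = dtl.astimezone(timezone.utc)

print(relabeled.isoformat())  # 2012-03-26T12:00:00+00:00
print(converted.isoformat())  # 2012-03-26T16:00:00+00:00
```

Only the converted value still compares equal to the original instant; the relabeled one is four hours off, which matches the discrepancy in the output above.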
I would suggest that DateRange.tz_normalize() should use either dt.astimezone(pytz.utc) or pytz.utc.normalize(dt.astimezone(pytz.utc)) at line 417.
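As a rough sketch of that suggestion applied element by element: to_timezone below is a hypothetical helper, not pandas code. It converts with astimezone() first and only then calls normalize() if the target timezone provides one. The fixed-offset timezones are assumptions standing in for US/Eastern, so the example runs without pytz.

```python
from datetime import datetime, timedelta, timezone


def to_timezone(dt, tz):
    """Hypothetical helper: convert an aware datetime to tz, then normalize if tz supports it."""
    if dt.tzinfo is None:
        raise ValueError("naive datetime: localize it first")
    converted = dt.astimezone(tz)
    # pytz timezones expose normalize(); stdlib tzinfo objects do not.
    normalize = getattr(tz, "normalize", None)
    return normalize(converted) if normalize is not None else converted


# Assumption: fixed offsets stand in for US/Eastern in summer (EDT) and winter (EST).
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

dates = [
    datetime(2010, 4, 13, 0, 0, tzinfo=EDT),
    datetime(2012, 3, 6, 0, 0, tzinfo=EST),
]

shifted = [to_timezone(d, timezone.utc) for d in dates]
for d in shifted:
    print(d.isoformat())
```

With these inputs the UTC hours come out as 04:00 and 05:00, the same endpoints shown in the 'Etc/UTC' output earlier.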
Iain