Add notion of pytz time zone normalization #18595

Closed
pganssle opened this issue Dec 1, 2017 · 1 comment · Fixed by #20510
Labels
Clean Timezones Timezone data dtype

Comments

@pganssle
Contributor

pganssle commented Dec 1, 2017

Though part of the solution to #18523 will be to introduce a new time zone comparison operation, there's another, less pressing issue that's also brought up there:

import pandas as pd

idx1 = pd.date_range('2011-01-01', periods=3, freq='H', tz='Europe/Paris')
idx2 = pd.date_range(start=idx1[0], end=idx1[-1], freq='H')

print(idx1.tz)    # <DstTzInfo 'Europe/Paris' LMT+0:09:00 STD>
print(idx2.tz)   # <DstTzInfo 'Europe/Paris' CET+1:00:00 STD>

The problem comes from the fact that pytz's localize() attaches a concrete, offset-specific tzinfo object to each datetime, and across offsets these objects don't compare equal to one another (and aren't the same instance as one another). As such, when you pull idx1[0] and idx1[-1], you get concrete timestamps, each of which has the <DstTzInfo 'Europe/Paris' CET+1:00:00 STD> zone attached, so the new DatetimeIndex picks up that concrete instance, which is not consistent with the abstract notion of time zone carried by idx1. At the series level, it makes sense to use a "normalized" version of pytz zones, so I recommend adding a function like this:

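import pytz

def tz_normalize(tzi):
    # Map an offset-specific pytz instance back to its canonical zone object;
    # is_pytz_zone is a helper that would still need to be written.
    if is_pytz_zone(tzi):
        return pytz.timezone(str(tzi))
    else:
        return tzi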
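
To make the behavior concrete, here is a minimal, self-contained sketch using only pytz and the standard library. The is_pytz_zone helper is not defined anywhere in this issue; the version below is an assumption (it treats any pytz.tzinfo.BaseTzInfo instance with a zone name as a pytz zone) and is not an existing pandas function.

```python
from datetime import datetime, timezone

import pytz
from pytz.tzinfo import BaseTzInfo


def is_pytz_zone(tzi):
    # Assumed helper: a named pytz zone is a BaseTzInfo with a non-None .zone.
    return isinstance(tzi, BaseTzInfo) and getattr(tzi, "zone", None) is not None


def tz_normalize(tzi):
    # Look the zone up by name to get pytz's canonical instance for it.
    if is_pytz_zone(tzi):
        return pytz.timezone(str(tzi))
    return tzi


paris = pytz.timezone("Europe/Paris")

# localize() attaches offset-specific tzinfo instances (CET in winter, CEST in summer).
winter = paris.localize(datetime(2011, 1, 1))
summer = paris.localize(datetime(2011, 7, 1))

print(winter.tzinfo is summer.tzinfo)  # the two instances differ
print(tz_normalize(winter.tzinfo) is tz_normalize(summer.tzinfo))  # both map back to one zone

# Non-pytz tzinfo objects pass through unchanged.
print(tz_normalize(timezone.utc) is timezone.utc)
```

Looking the zone up by name works because pytz.timezone() returns the same cached object for a given zone name, so normalized zones can be compared by identity.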

You can solve the above problem either by normalizing the time zone on construction or by changing .tz from an attribute to a property:

import warnings

@property
def tz(self):
    # Always expose the canonical zone, never an offset-specific instance.
    return tz_normalize(self._tz)

@tz.setter
def tz(self, value):
    msg = ('Directly setting DatetimeIndex.tz is deprecated; instead, you '
           'should use tz_localize() or tz_convert() as appropriate')
    warnings.warn(msg, FutureWarning, stacklevel=2)
    self._tz = value

You'll note that in my property implementation, I've demonstrated that this will also allow you to enforce that people use the proper interface for changing index time zones.
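
As a rough illustration of that pattern outside pandas, the sketch below uses a hypothetical TzHolder class as a stand-in for the index; the class name, the normalize argument, and the choice of FutureWarning are all assumptions made for the example, not pandas API.

```python
import warnings
from datetime import timedelta, timezone


class TzHolder:
    """Hypothetical stand-in for an index class; not pandas code."""

    def __init__(self, tz, normalize=lambda tz: tz):
        self._tz = tz
        # normalize would be something like tz_normalize; identity by default.
        self._normalize = normalize

    @property
    def tz(self):
        # Reads always go through the normalizer.
        return self._normalize(self._tz)

    @tz.setter
    def tz(self, value):
        # Writes still work, but steer callers toward the proper interface.
        warnings.warn(
            "Directly setting .tz is deprecated; use tz_localize() or "
            "tz_convert() as appropriate",
            FutureWarning,
            stacklevel=2,
        )
        self._tz = value


holder = TzHolder(timezone.utc)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    holder.tz = timezone(timedelta(hours=1))  # triggers the deprecation warning

print(len(caught), caught[0].category.__name__)
print(holder.tz)
```

Keeping the setter functional while it warns gives callers a deprecation period before direct assignment is removed outright.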

Issue raised per discussion with @jreback and @MridulS at PyData NYC sprints.

@jreback jreback added this to the 0.22.0 milestone Dec 1, 2017
@jreback
Contributor

jreback commented Dec 1, 2017

thanks @pganssle would love a PR which implements this!
