Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make DateOffset immutable #21341

Merged
merged 8 commits into from Jun 21, 2018
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v0.24.0.txt
Expand Up @@ -150,7 +150,7 @@ Timezones
Offsets
^^^^^^^

-
- :class:`DateOffset` objects are now immutable. Attempting to alter one of these will now raise ``AttributeError`` (:issue:`21341`)
-
-

Expand Down
16 changes: 13 additions & 3 deletions pandas/_libs/tslibs/offsets.pyx
Expand Up @@ -304,6 +304,15 @@ class _BaseOffset(object):
_day_opt = None
_attributes = frozenset(['n', 'normalize'])

def __init__(self, n=1, normalize=False):
n = self._validate_n(n)
object.__setattr__(self, "n", n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would just set these as regular attributes, I doubt this actually makes a signfiicant perf difference.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand. These are regular attributes, but the PR overrides __setattr__ so we have to use object.__setattr__.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh sorry you are right.

can you set these as cdef readonly? (and maybe type them) or not yet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really really wish I could, but until someone smarter than me looks at #18824#18224 cdef class causes pickle test failures.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel what pickle issue are you referring to in that change? That just looks like a master tracker but not sure the exact issue contained within

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd thanks for catching that, should be #18224. Will fix above.

object.__setattr__(self, "normalize", normalize)
object.__setattr__(self, "_cache", {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is _cache for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't create this explicitly in __init__ then @cache_readonly lookups will try to create it, which will raise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you put this on the class? (or is that really weird)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pd.DateOffset._cache = {}
off = pd.DateOffset(n=1)
hour = pd.offsets.BusinessHour(n=2)

>>> off._cache is hour._cache
True

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right so we can't use the @cache_readonly decorator here? (or is that what you are doing by pre-emptively setting the cache)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_readonly.__get__ checks for a _cache attribute and creates one if it does not exist. Creating it in __init__ ensures that one exists, so a new one does not have to be created (since attempting to do so would raise)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. this is necessary in order to retain the existing usages of cache_readonly on DateOffset subclasses.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok makes sense


def __setattr__(self, name, value):
raise AttributeError("DateOffset objects are immutable.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standard approach is usually to use private attribute like _n and provide access via a properties. Is there a reason why that wouldn't make sense here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would absolutely make sense, but since micro-benchmarks are a concern I prefer to avoid the property overhead. The best-case would be the implementation in #18824#18224, but I can't figure out the pickle errors there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, seems reasonable to me.


@property
def kwds(self):
# for backwards-compatibility
Expand Down Expand Up @@ -395,13 +404,14 @@ class _BaseOffset(object):
kwds = {key: odict[key] for key in odict if odict[key]}
state.update(kwds)

self.__dict__ = state
self.__dict__.update(state)

if 'weekmask' in state and 'holidays' in state:
calendar, holidays = _get_calendar(weekmask=self.weekmask,
holidays=self.holidays,
calendar=None)
self.calendar = calendar
self.holidays = holidays
object.__setattr__(self, "calendar", calendar)
object.__setattr__(self, "holidays", holidays)

def __getstate__(self):
"""Return a pickleable state"""
Expand Down
8 changes: 8 additions & 0 deletions pandas/tests/tseries/offsets/test_offsets.py
Expand Up @@ -234,6 +234,14 @@ class TestCommon(Base):
'Nano': Timestamp(np_datetime64_compat(
'2011-01-01T09:00:00.000000001Z'))}

def test_immutable(self, offset_types):
# GH#21341 check that __setattr__ raises
offset = self._get_offset(offset_types)
with pytest.raises(AttributeError):
offset.normalize = True
with pytest.raises(AttributeError):
offset.n = 91

def test_return_type(self, offset_types):
offset = self._get_offset(offset_types)

Expand Down
118 changes: 53 additions & 65 deletions pandas/tseries/offsets.py
Expand Up @@ -23,7 +23,6 @@
ApplyTypeError,
as_datetime, _is_normalized,
_get_calendar, _to_dt64,
_determine_offset,
apply_index_wraps,
roll_yearday,
shift_month,
Expand Down Expand Up @@ -192,11 +191,14 @@ def __add__(date):
normalize = False

def __init__(self, n=1, normalize=False, **kwds):
self.n = self._validate_n(n)
self.normalize = normalize
BaseOffset.__init__(self, n, normalize)

self._offset, self._use_relativedelta = _determine_offset(kwds)
self.__dict__.update(kwds)
off, use_rd = liboffsets._determine_offset(kwds)
object.__setattr__(self, "_offset", off)
object.__setattr__(self, "_use_relativedelta", use_rd)
for key in kwds:
val = kwds[key]
object.__setattr__(self, key, val)

@apply_wraps
def apply(self, other):
Expand Down Expand Up @@ -446,9 +448,9 @@ def __init__(self, weekmask, holidays, calendar):
# following two attributes. See DateOffset._params()
# holidays, weekmask

self.weekmask = weekmask
self.holidays = holidays
self.calendar = calendar
object.__setattr__(self, "weekmask", weekmask)
object.__setattr__(self, "holidays", holidays)
object.__setattr__(self, "calendar", calendar)


class BusinessMixin(object):
Expand Down Expand Up @@ -480,9 +482,8 @@ class BusinessDay(BusinessMixin, SingleConstructorOffset):
_attributes = frozenset(['n', 'normalize', 'offset'])

def __init__(self, n=1, normalize=False, offset=timedelta(0)):
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "_offset", offset)

def _offset_str(self):
def get_str(td):
Expand Down Expand Up @@ -578,9 +579,11 @@ class BusinessHourMixin(BusinessMixin):

def __init__(self, start='09:00', end='17:00', offset=timedelta(0)):
# must be validated here to equality check
self.start = liboffsets._validate_business_time(start)
self.end = liboffsets._validate_business_time(end)
self._offset = offset
start = liboffsets._validate_business_time(start)
object.__setattr__(self, "start", start)
end = liboffsets._validate_business_time(end)
object.__setattr__(self, "end", end)
object.__setattr__(self, "_offset", offset)

@cache_readonly
def next_bday(self):
Expand Down Expand Up @@ -807,8 +810,7 @@ class BusinessHour(BusinessHourMixin, SingleConstructorOffset):

def __init__(self, n=1, normalize=False, start='09:00',
end='17:00', offset=timedelta(0)):
self.n = self._validate_n(n)
self.normalize = normalize
BaseOffset.__init__(self, n, normalize)
super(BusinessHour, self).__init__(start=start, end=end, offset=offset)


Expand Down Expand Up @@ -837,9 +839,8 @@ class CustomBusinessDay(_CustomMixin, BusinessDay):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "_offset", offset)

_CustomMixin.__init__(self, weekmask, holidays, calendar)

Expand Down Expand Up @@ -898,9 +899,8 @@ class CustomBusinessHour(_CustomMixin, BusinessHourMixin,
def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None,
start='09:00', end='17:00', offset=timedelta(0)):
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "_offset", offset)

_CustomMixin.__init__(self, weekmask, holidays, calendar)
BusinessHourMixin.__init__(self, start=start, end=end, offset=offset)
Expand All @@ -914,9 +914,7 @@ class MonthOffset(SingleConstructorOffset):
_adjust_dst = True
_attributes = frozenset(['n', 'normalize'])

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
__init__ = BaseOffset.__init__

@property
def name(self):
Expand Down Expand Up @@ -995,9 +993,8 @@ class _CustomBusinessMonth(_CustomMixin, BusinessMixin, MonthOffset):

def __init__(self, n=1, normalize=False, weekmask='Mon Tue Wed Thu Fri',
holidays=None, calendar=None, offset=timedelta(0)):
self.n = self._validate_n(n)
self.normalize = normalize
self._offset = offset
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "_offset", offset)

_CustomMixin.__init__(self, weekmask, holidays, calendar)

Expand Down Expand Up @@ -1074,18 +1071,18 @@ class SemiMonthOffset(DateOffset):
_attributes = frozenset(['n', 'normalize', 'day_of_month'])

def __init__(self, n=1, normalize=False, day_of_month=None):
BaseOffset.__init__(self, n, normalize)

if day_of_month is None:
self.day_of_month = self._default_day_of_month
object.__setattr__(self, "day_of_month",
self._default_day_of_month)
else:
self.day_of_month = int(day_of_month)
object.__setattr__(self, "day_of_month", int(day_of_month))
if not self._min_day_of_month <= self.day_of_month <= 27:
msg = 'day_of_month must be {min}<=day_of_month<=27, got {day}'
raise ValueError(msg.format(min=self._min_day_of_month,
day=self.day_of_month))

self.n = self._validate_n(n)
self.normalize = normalize

@classmethod
def _from_name(cls, suffix=None):
return cls(day_of_month=suffix)
Expand Down Expand Up @@ -1291,9 +1288,8 @@ class Week(DateOffset):
_attributes = frozenset(['n', 'normalize', 'weekday'])

def __init__(self, n=1, normalize=False, weekday=None):
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "weekday", weekday)

if self.weekday is not None:
if self.weekday < 0 or self.weekday > 6:
Expand Down Expand Up @@ -1421,10 +1417,9 @@ class WeekOfMonth(_WeekOfMonthMixin, DateOffset):
_attributes = frozenset(['n', 'normalize', 'week', 'weekday'])

def __init__(self, n=1, normalize=False, week=0, weekday=0):
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday
self.week = week
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "weekday", weekday)
object.__setattr__(self, "week", week)

if self.weekday < 0 or self.weekday > 6:
raise ValueError('Day must be 0<=day<=6, got {day}'
Expand Down Expand Up @@ -1493,9 +1488,8 @@ class LastWeekOfMonth(_WeekOfMonthMixin, DateOffset):
_attributes = frozenset(['n', 'normalize', 'weekday'])

def __init__(self, n=1, normalize=False, weekday=0):
self.n = self._validate_n(n)
self.normalize = normalize
self.weekday = weekday
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "weekday", weekday)

if self.n == 0:
raise ValueError('N cannot be 0')
Expand Down Expand Up @@ -1553,11 +1547,11 @@ class QuarterOffset(DateOffset):
# startingMonth vs month attr names are resolved

def __init__(self, n=1, normalize=False, startingMonth=None):
self.n = self._validate_n(n)
self.normalize = normalize
BaseOffset.__init__(self, n, normalize)

if startingMonth is None:
startingMonth = self._default_startingMonth
self.startingMonth = startingMonth
object.__setattr__(self, "startingMonth", startingMonth)

def isAnchored(self):
return (self.n == 1 and self.startingMonth is not None)
Expand Down Expand Up @@ -1679,11 +1673,10 @@ def onOffset(self, dt):
return dt.month == self.month and dt.day == self._get_offset_day(dt)

def __init__(self, n=1, normalize=False, month=None):
self.n = self._validate_n(n)
self.normalize = normalize
BaseOffset.__init__(self, n, normalize)

month = month if month is not None else self._default_month
self.month = month
object.__setattr__(self, "month", month)

if self.month < 1 or self.month > 12:
raise ValueError('Month must go from 1 to 12')
Expand Down Expand Up @@ -1776,12 +1769,11 @@ class FY5253(DateOffset):

def __init__(self, n=1, normalize=False, weekday=0, startingMonth=1,
variation="nearest"):
self.n = self._validate_n(n)
self.normalize = normalize
self.startingMonth = startingMonth
self.weekday = weekday
BaseOffset.__init__(self, n, normalize)
object.__setattr__(self, "startingMonth", startingMonth)
object.__setattr__(self, "weekday", weekday)

self.variation = variation
object.__setattr__(self, "variation", variation)

if self.n == 0:
raise ValueError('N cannot be 0')
Expand Down Expand Up @@ -1976,13 +1968,12 @@ class FY5253Quarter(DateOffset):

def __init__(self, n=1, normalize=False, weekday=0, startingMonth=1,
qtr_with_extra_week=1, variation="nearest"):
self.n = self._validate_n(n)
self.normalize = normalize
BaseOffset.__init__(self, n, normalize)

self.weekday = weekday
self.startingMonth = startingMonth
self.qtr_with_extra_week = qtr_with_extra_week
self.variation = variation
object.__setattr__(self, "startingMonth", startingMonth)
object.__setattr__(self, "weekday", weekday)
object.__setattr__(self, "qtr_with_extra_week", qtr_with_extra_week)
object.__setattr__(self, "variation", variation)

if self.n == 0:
raise ValueError('N cannot be 0')
Expand Down Expand Up @@ -2129,9 +2120,7 @@ class Easter(DateOffset):
_adjust_dst = True
_attributes = frozenset(['n', 'normalize'])

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
self.normalize = normalize
__init__ = BaseOffset.__init__

@apply_wraps
def apply(self, other):
Expand Down Expand Up @@ -2177,11 +2166,10 @@ class Tick(SingleConstructorOffset):
_attributes = frozenset(['n', 'normalize'])

def __init__(self, n=1, normalize=False):
self.n = self._validate_n(n)
BaseOffset.__init__(self, n, normalize)
if normalize:
raise ValueError("Tick offset with `normalize=True` are not "
"allowed.") # GH#21427
self.normalize = normalize

__gt__ = _tick_comp(operator.gt)
__ge__ = _tick_comp(operator.ge)
Expand Down