Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG/TST/REF: Datetimelike Arithmetic Methods #23215

Merged
merged 49 commits into from Oct 28, 2018
Merged
Show file tree
Hide file tree
Changes from 47 commits
Commits
Show all changes
49 commits
Select commit Hold shift + click to select a range
dc2280f
Make arithmetic code dispatch less redundant, fix datetime64 addition…
jbrockmendel Oct 17, 2018
37728ff
remove unnecessary nat_new
jbrockmendel Oct 17, 2018
15ad0a6
Fix interpretation of NaT
jbrockmendel Oct 17, 2018
94f1745
move test
jbrockmendel Oct 17, 2018
bf5e2fb
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 17, 2018
3bdf104
whatsnew, fix dropped timezone
jbrockmendel Oct 17, 2018
982ea30
remove comment
jbrockmendel Oct 17, 2018
d046038
Add GH references
jbrockmendel Oct 17, 2018
a0c1a85
remove unused imports
jbrockmendel Oct 18, 2018
33d82b3
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 18, 2018
8a7c249
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 18, 2018
8e57fd8
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 18, 2018
b932121
dummy commit to force CI
jbrockmendel Oct 18, 2018
9f3b18d
Fix bug in adding DateOffset to PeriodIndex, Series, Frame
jbrockmendel Oct 19, 2018
7a8232e
Dummy commit to force CI
jbrockmendel Oct 19, 2018
a743f74
revert tracebackhide
jbrockmendel Oct 19, 2018
0693196
Merge branch 'arith' of https://github.com/jbrockmendel/pandas into a…
jbrockmendel Oct 19, 2018
29d91af
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 19, 2018
af4872e
comment and reversion
jbrockmendel Oct 19, 2018
6707032
revert to simpler version
jbrockmendel Oct 20, 2018
18ef26d
Move overriding of addsub_int_array to PeriodArray
jbrockmendel Oct 20, 2018
60f7f8d
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 23, 2018
9372423
correct docstrings
jbrockmendel Oct 23, 2018
2a6268e
comments and assertions
jbrockmendel Oct 23, 2018
777f4d9
More explicit names for array/scalar add/sub methods
jbrockmendel Oct 23, 2018
ee885c8
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 23, 2018
5f231b2
oo optimization fixup
jbrockmendel Oct 23, 2018
0214a9e
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 23, 2018
fbee9f5
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 24, 2018
f7cf3d9
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 24, 2018
d799d8e
Dummy commit to force CI
jbrockmendel Oct 25, 2018
82df39c
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 25, 2018
8cf614b
change default fill_value for maybe_mask_results
jbrockmendel Oct 25, 2018
a45734a
Merge branch 'arith' of https://github.com/jbrockmendel/pandas into a…
jbrockmendel Oct 25, 2018
fb007cb
Make docstring extra explicit
jbrockmendel Oct 25, 2018
1d3fd48
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 25, 2018
aecfef7
prettify docstrings
jbrockmendel Oct 25, 2018
dade955
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 26, 2018
a6eb01c
fixup super usage
jbrockmendel Oct 26, 2018
acf1f74
flesh out TODO comment
jbrockmendel Oct 26, 2018
e89e3ef
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 26, 2018
d306277
move test for moved method
jbrockmendel Oct 26, 2018
f6e4073
Fixup name
jbrockmendel Oct 26, 2018
9146a72
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 27, 2018
4e4b9ed
change NotImplementedError to TypeError
jbrockmendel Oct 27, 2018
f35b3b6
Fix add_delta_tdi return type; docstrings
jbrockmendel Oct 27, 2018
87e07ef
Merge branch 'arith' of https://github.com/jbrockmendel/pandas into a…
jbrockmendel Oct 27, 2018
343ee30
Merge branch 'master' of https://github.com/pandas-dev/pandas into arith
jbrockmendel Oct 28, 2018
0466b9c
docstring edit, rename
jbrockmendel Oct 28, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
6 changes: 4 additions & 2 deletions doc/source/whatsnew/v0.24.0.txt
Expand Up @@ -1021,6 +1021,7 @@ Datetimelike
- Bug in :class:`PeriodIndex` where adding or subtracting a :class:`timedelta` or :class:`Tick` object produced incorrect results (:issue:`22988`)
- Bug in :func:`date_range` when decrementing a start date to a past end date by a negative frequency (:issue:`23270`)
- Bug in :func:`DataFrame.combine` with datetimelike values raising a TypeError (:issue:`23079`)
- Bug in :class:`PeriodIndex` with attribute ``freq.n`` greater than 1 where adding a :class:`DateOffset` object would return incorrect results (:issue:`23215`)

Timedelta
^^^^^^^^^
Expand All @@ -1032,7 +1033,8 @@ Timedelta
- Bug in :class:`TimedeltaIndex` incorrectly allowing indexing with ``Timestamp`` object (:issue:`20464`)
- Fixed bug where subtracting :class:`Timedelta` from an object-dtyped array would raise ``TypeError`` (:issue:`21980`)
- Fixed bug in adding a :class:`DataFrame` with all-`timedelta64[ns]` dtypes to a :class:`DataFrame` with all-integer dtypes returning incorrect results instead of raising ``TypeError`` (:issue:`22696`)

- Bug in :class:`TimedeltaIndex` where adding a timezone-aware datetime scalar incorrectly returned a timezone-naive :class:`DatetimeIndex` (:issue:`23215`)
- Bug in :class:`TimedeltaIndex` where adding ``np.timedelta64('NaT')`` incorrectly returned an all-`NaT` :class:`DatetimeIndex` instead of an all-`NaT` :class:`TimedeltaIndex` (:issue:`23215`)

Timezones
^^^^^^^^^
Expand Down Expand Up @@ -1062,7 +1064,7 @@ Offsets

- Bug in :class:`FY5253` where date offsets could incorrectly raise an ``AssertionError`` in arithmetic operatons (:issue:`14774`)
- Bug in :class:`DateOffset` where keyword arguments ``week`` and ``milliseconds`` were accepted and ignored. Passing these will now raise ``ValueError`` (:issue:`19398`)
-
- Bug in adding :class:`DateOffset` with :class:`DataFrame` or :class:`PeriodIndex` incorrectly raising ``TypeError`` (:issue:`23215`)

Numeric
^^^^^^^
Expand Down
4 changes: 2 additions & 2 deletions pandas/_libs/tslibs/offsets.pyx
Expand Up @@ -344,8 +344,8 @@ class _BaseOffset(object):
return {name: kwds[name] for name in kwds if kwds[name] is not None}

def __add__(self, other):
if getattr(other, "_typ", None) in ["datetimeindex",
"series", "period"]:
if getattr(other, "_typ", None) in ["datetimeindex", "periodindex",
"series", "period", "dataframe"]:
# defer to the other class's implementation
return other + self
try:
Expand Down
88 changes: 52 additions & 36 deletions pandas/core/arrays/datetimelike.py
Expand Up @@ -221,7 +221,7 @@ def hasnans(self):
""" return if I have any nans; enables various perf speedups """
return self._isnan.any()

def _maybe_mask_results(self, result, fill_value=None, convert=None):
def _maybe_mask_results(self, result, fill_value=iNaT, convert=None):
"""
Parameters
----------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update the doc-string for fill_value

Expand All @@ -246,27 +246,6 @@ def _maybe_mask_results(self, result, fill_value=None, convert=None):
result[self._isnan] = fill_value
return result

def _nat_new(self, box=True):
"""
Return Array/Index or ndarray filled with NaT which has the same
length as the caller.

Parameters
----------
box : boolean, default True
- If True returns a Array/Index as the same as caller.
- If False returns ndarray of np.int64.
"""
result = np.zeros(len(self), dtype=np.int64)
result.fill(iNaT)
if not box:
return result

attribs = self._get_attributes_dict()
if not is_period_dtype(self):
attribs['freq'] = None
return self._simple_new(result, **attribs)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only used once outside of tests. Much less verbose to inline.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hah I just made a comment on PeriodArray about re-using this.


# ------------------------------------------------------------------
# Frequency Properties/Methods

Expand Down Expand Up @@ -346,22 +325,56 @@ def _validate_frequency(cls, index, freq, **kwargs):
# ------------------------------------------------------------------
# Arithmetic Methods

def _add_datelike(self, other):
def _add_datetimelike_scalar(self, other):
# Overriden by TimedeltaArray
raise TypeError("cannot add {cls} and {typ}"
.format(cls=type(self).__name__,
typ=type(other).__name__))

def _sub_datelike(self, other):
raise com.AbstractMethodError(self)
_add_datetime_arraylike = _add_datetimelike_scalar

def _sub_datetimelike_scalar(self, other):
# Overridden by DatetimeArray
assert other is not NaT
raise TypeError("cannot subtract a datelike from a {cls}"
.format(cls=type(self).__name__))

_sub_datetime_arraylike = _sub_datetimelike_scalar

def _sub_period(self, other):
return NotImplemented
# Overriden by PeriodArray
raise TypeError("cannot subtract Period from a {cls}"
.format(cls=type(self).__name__))
jorisvandenbossche marked this conversation as resolved.
Show resolved Hide resolved

def _add_offset(self, offset):
raise com.AbstractMethodError(self)

def _add_delta(self, other):
return NotImplemented
"""
Add a timedelta-like, Tick or TimedeltaIndex-like object
to self, yielding an int64 numpy array

Parameters
----------
delta : {timedelta, np.timedelta64, Tick,
TimedeltaIndex, ndarray[timedelta64]}

Returns
-------
result : ndarray[int64]

Notes
-----
The result's name is set outside of _add_delta by the calling
method (__add__ or __sub__), if necessary (i.e. for Indexes).
"""
if isinstance(other, (Tick, timedelta, np.timedelta64)):
new_values = self._add_delta_td(other)
elif is_timedelta64_dtype(other):
# ndarray[timedelta64] or TimedeltaArray/index
new_values = self._add_delta_tdi(other)

return new_values

def _add_delta_td(self, other):
"""
Expand All @@ -371,16 +384,15 @@ def _add_delta_td(self, other):
inc = delta_to_nanoseconds(other)
new_values = checked_add_with_arr(self.asi8, inc,
arr_mask=self._isnan).view('i8')
if self.hasnans:
new_values[self._isnan] = iNaT
new_values = self._maybe_mask_results(new_values)
return new_values.view('i8')

def _add_delta_tdi(self, other):
"""
Add a delta of a TimedeltaIndex
return the i8 result view
"""
if not len(self) == len(other):
if len(self) != len(other):
raise ValueError("cannot add indices of unequal length")

if isinstance(other, np.ndarray):
Expand All @@ -407,7 +419,9 @@ def _add_nat(self):

# GH#19124 pd.NaT is treated like a timedelta for both timedelta
# and datetime dtypes
return self._nat_new(box=True)
result = np.zeros(len(self), dtype=np.int64)
result.fill(iNaT)
return self._shallow_copy(result, freq=None)

def _sub_nat(self):
"""Subtract pd.NaT from self"""
Expand Down Expand Up @@ -441,7 +455,7 @@ def _sub_period_array(self, other):
.format(dtype=other.dtype,
cls=type(self).__name__))

if not len(self) == len(other):
if len(self) != len(other):
raise ValueError("cannot subtract arrays/indices of "
"unequal length")
if self.freq != other.freq:
Expand Down Expand Up @@ -473,6 +487,8 @@ def _addsub_int_array(self, other, op):
-------
result : same class as self
"""
# _addsub_int_array is overriden by PeriodArray
assert not is_period_dtype(self)
assert op in [operator.add, operator.sub]

if self.freq is None:
Expand Down Expand Up @@ -613,7 +629,7 @@ def __add__(self, other):
# specifically _not_ a Tick
result = self._add_offset(other)
elif isinstance(other, (datetime, np.datetime64)):
result = self._add_datelike(other)
result = self._add_datetimelike_scalar(other)
elif lib.is_integer(other):
# This check must come after the check for np.timedelta64
# as is_integer returns True for these
Expand All @@ -628,7 +644,7 @@ def __add__(self, other):
result = self._addsub_offset_array(other, operator.add)
elif is_datetime64_dtype(other) or is_datetime64tz_dtype(other):
# DatetimeIndex, ndarray[datetime64]
return self._add_datelike(other)
return self._add_datetime_arraylike(other)
elif is_integer_dtype(other):
result = self._addsub_int_array(other, operator.add)
elif is_float_dtype(other):
Expand Down Expand Up @@ -671,7 +687,7 @@ def __sub__(self, other):
# specifically _not_ a Tick
result = self._add_offset(-other)
elif isinstance(other, (datetime, np.datetime64)):
result = self._sub_datelike(other)
result = self._sub_datetimelike_scalar(other)
elif lib.is_integer(other):
# This check must come after the check for np.timedelta64
# as is_integer returns True for these
Expand All @@ -688,7 +704,7 @@ def __sub__(self, other):
result = self._addsub_offset_array(other, operator.sub)
elif is_datetime64_dtype(other) or is_datetime64tz_dtype(other):
# DatetimeIndex, ndarray[datetime64]
result = self._sub_datelike(other)
result = self._sub_datetime_arraylike(other)
elif is_period_dtype(other):
# PeriodIndex
result = self._sub_period_array(other)
Expand Down
103 changes: 40 additions & 63 deletions pandas/core/arrays/datetimes.py
@@ -1,5 +1,5 @@
# -*- coding: utf-8 -*-
from datetime import datetime, timedelta, time
from datetime import datetime, time
import warnings

import numpy as np
Expand All @@ -21,7 +21,6 @@
is_object_dtype,
is_datetime64tz_dtype,
is_datetime64_dtype,
is_timedelta64_dtype,
ensure_int64)
from pandas.core.dtypes.dtypes import DatetimeTZDtype
from pandas.core.dtypes.missing import isna
Expand Down Expand Up @@ -76,11 +75,12 @@ def f(self):

if field in self._object_ops:
result = fields.get_date_name_field(values, field)
result = self._maybe_mask_results(result)
result = self._maybe_mask_results(result, fill_value=None)
jbrockmendel marked this conversation as resolved.
Show resolved Hide resolved

else:
result = fields.get_date_field(values, field)
result = self._maybe_mask_results(result, convert='float64')
result = self._maybe_mask_results(result, fill_value=None,
convert='float64')

return result

Expand Down Expand Up @@ -424,11 +424,21 @@ def _assert_tzawareness_compat(self, other):
# -----------------------------------------------------------------
# Arithmetic Methods

def _sub_datelike_dti(self, other):
"""subtraction of two DatetimeIndexes"""
if not len(self) == len(other):
def _sub_datetime_arraylike(self, other):
"""subtract DatetimeArray/Index or ndarray[datetime64]"""
if len(self) != len(other):
raise ValueError("cannot add indices of unequal length")

if isinstance(other, np.ndarray):
jorisvandenbossche marked this conversation as resolved.
Show resolved Hide resolved
assert is_datetime64_dtype(other)
other = type(self)(other)

if not self._has_same_tz(other):
# require tz compat
raise TypeError("{cls} subtraction must have the same "
"timezones or no timezones"
.format(cls=type(self).__name__))

self_i8 = self.asi8
other_i8 = other.asi8
new_values = checked_add_with_arr(self_i8, -other_i8,
Expand Down Expand Up @@ -456,74 +466,41 @@ def _add_offset(self, offset):

return type(self)(result, freq='infer')

def _sub_datelike(self, other):
def _sub_datetimelike_scalar(self, other):
# subtract a datetime from myself, yielding a ndarray[timedelta64[ns]]
if isinstance(other, (DatetimeArrayMixin, np.ndarray)):
if isinstance(other, np.ndarray):
# if other is an ndarray, we assume it is datetime64-dtype
other = type(self)(other)
if not self._has_same_tz(other):
# require tz compat
raise TypeError("{cls} subtraction must have the same "
"timezones or no timezones"
.format(cls=type(self).__name__))
result = self._sub_datelike_dti(other)
elif isinstance(other, (datetime, np.datetime64)):
assert other is not NaT
other = Timestamp(other)
if other is NaT:
return self - NaT
assert isinstance(other, (datetime, np.datetime64))
jorisvandenbossche marked this conversation as resolved.
Show resolved Hide resolved
assert other is not NaT
other = Timestamp(other)
if other is NaT:
return self - NaT

if not self._has_same_tz(other):
# require tz compat
elif not self._has_same_tz(other):
raise TypeError("Timestamp subtraction must have the same "
"timezones or no timezones")
else:
i8 = self.asi8
result = checked_add_with_arr(i8, -other.value,
arr_mask=self._isnan)
result = self._maybe_mask_results(result,
fill_value=iNaT)
else:
raise TypeError("cannot subtract {cls} and {typ}"
.format(cls=type(self).__name__,
typ=type(other).__name__))
raise TypeError("Timestamp subtraction must have the same "
"timezones or no timezones")

i8 = self.asi8
result = checked_add_with_arr(i8, -other.value,
arr_mask=self._isnan)
result = self._maybe_mask_results(result)
return result.view('timedelta64[ns]')

def _add_delta(self, delta):
"""
Add a timedelta-like, DateOffset, or TimedeltaIndex-like object
to self.
Add a timedelta-like, Tick, or TimedeltaIndex-like object
to self, yielding a new DatetimeArray

Parameters
----------
delta : {timedelta, np.timedelta64, DateOffset,
other : {timedelta, np.timedelta64, Tick,
TimedeltaIndex, ndarray[timedelta64]}

Returns
-------
result : same type as self

Notes
-----
The result's name is set outside of _add_delta by the calling
method (__add__ or __sub__)
result : DatetimeArray
"""
from pandas.core.arrays import TimedeltaArrayMixin

if isinstance(delta, (Tick, timedelta, np.timedelta64)):
new_values = self._add_delta_td(delta)
elif is_timedelta64_dtype(delta):
if not isinstance(delta, TimedeltaArrayMixin):
delta = TimedeltaArrayMixin(delta)
new_values = self._add_delta_tdi(delta)
else:
new_values = self.astype('O') + delta

tz = 'UTC' if self.tz is not None else None
result = type(self)(new_values, tz=tz, freq='infer')
if self.tz is not None and self.tz is not utc:
result = result.tz_convert(self.tz)
return result
new_values = dtl.DatetimeLikeArrayMixin._add_delta(self, delta)
return type(self)(new_values, tz=self.tz, freq='infer')
jbrockmendel marked this conversation as resolved.
Show resolved Hide resolved

# -----------------------------------------------------------------
# Timezone Conversion and Localization Methods
Expand Down Expand Up @@ -904,7 +881,7 @@ def month_name(self, locale=None):

result = fields.get_date_name_field(values, 'month_name',
locale=locale)
result = self._maybe_mask_results(result)
result = self._maybe_mask_results(result, fill_value=None)
jbrockmendel marked this conversation as resolved.
Show resolved Hide resolved
return result

def day_name(self, locale=None):
Expand Down Expand Up @@ -940,7 +917,7 @@ def day_name(self, locale=None):

result = fields.get_date_name_field(values, 'day_name',
locale=locale)
result = self._maybe_mask_results(result)
result = self._maybe_mask_results(result, fill_value=None)
return result

@property
Expand Down