New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

delegate (most) datetimelike Series arithmetic ops to DatetimeIndex #18817

Closed
wants to merge 10 commits into
base: master
from

Conversation

Projects
None yet
4 participants
@jbrockmendel
Member

jbrockmendel commented Dec 18, 2017

DatetimeIndex arithmetic ops do overflow checks that Series does not. Instead of re-implementing those checks (which I tried and its a PITA) this PR just delegates the appropriate operations to the DatetimeIndex implementation.

There are a couple of things that are missing from the DatetimeIndex implementations, e.g. the int-array case added to indexes.datetimelike. If there's consensus that having a single implementation for these ops is the way to go, I'll port the other cases over in their own PRs.

If/when these implementations get merged, ops._TimeOp can be trimmed/removed.

  • closes #12534
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
@pep8speaks

This comment has been minimized.

pep8speaks commented Dec 18, 2017

Hello @jbrockmendel! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on December 19, 2017 at 16:11 Hours UTC
@codecov

This comment has been minimized.

codecov bot commented Dec 18, 2017

Codecov Report

❗️ No coverage uploaded for pull request base (master@07d8c2d). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff            @@
##             master   #18817   +/-   ##
=========================================
  Coverage          ?    91.6%           
=========================================
  Files             ?      154           
  Lines             ?    51452           
  Branches          ?        0           
=========================================
  Hits              ?    47135           
  Misses            ?     4317           
  Partials          ?        0
Flag Coverage Δ
#multiple 89.47% <100%> (?)
#single 40.8% <61.29%> (?)
Impacted Files Coverage Δ
pandas/core/ops.py 90.56% <100%> (ø)
pandas/core/indexes/datetimelike.py 97.16% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 07d8c2d...ab75018. Read the comment docs.

@@ -107,7 +107,7 @@ def test_shift(self):
# incompat tz
s2 = Series(date_range('2000-01-01 09:00:00', periods=5,
tz='CET'), name='foo')
pytest.raises(ValueError, lambda: s - s2)
pytest.raises(TypeError, lambda: s - s2)

This comment has been minimized.

@jreback

jreback Dec 18, 2017

Contributor

this is an API change

This comment has been minimized.

@jbrockmendel

jbrockmendel Dec 18, 2017

Member

as in needs to be noted in whatsnew or needs to be avoided?

return all(isinstance(x, ABCDateOffset) for x in arr_or_obj)
return False
def _is_offset(arr_or_obj):

This comment has been minimized.

@jreback

jreback Dec 18, 2017

Contributor

better to move this to pandas.core.dtypes.common and de-privatize

This comment has been minimized.

@jreback

jreback Dec 18, 2017

Contributor

and add tests

This comment has been minimized.

@jbrockmendel

jbrockmendel Dec 18, 2017

Member

Agreed on all three points.

@@ -664,6 +665,12 @@ def __add__(self, other):
return self.shift(other)
elif isinstance(other, (Index, datetime, np.datetime64)):
return self._add_datelike(other)
elif (isinstance(self, DatetimeIndex) and
isinstance(other, np.ndarray) and other.size == 1 and

This comment has been minimized.

@jreback

jreback Dec 18, 2017

Contributor

huh? I believe we already check is_integer_dtype(other). this check is much too specific

This comment has been minimized.

@jbrockmendel

jbrockmendel Dec 18, 2017

Member

Under the status quo adding np.array(1, dtype=np.int64) to a DatetimeIndex behaves like adding Timedelta(nanosecond=1). Series raises (and a Series test fails unless this is caught here).

I'm open to ideas for how to make this less hacky.

This comment has been minimized.

@jbrockmendel

jbrockmendel Dec 18, 2017

Member

An alternative would be to -- for the time being -- check for this case in the Series op and avoid dispatching to DatetimeIndex.

@@ -715,6 +726,29 @@ def wrapper(left, right, name=name, na_op=na_op):
if isinstance(right, ABCDataFrame):
return NotImplemented
if _is_offset(right):

This comment has been minimized.

@jreback

jreback Dec 18, 2017

Contributor

this logic does not belong here. this is what happens in Timeop.

This comment has been minimized.

@jbrockmendel

jbrockmendel Dec 18, 2017

Member

this logic does not belong here. this is what happens in Timeop.

Yes. This is preventing this case from being dispatched to DatetimeIndex so that it continues to be handled by TimeOp for now. DatetimeIndex makes a mess of this case (try it...), but I decided to fix that separately.

@jbrockmendel jbrockmendel referenced this pull request Dec 18, 2017

Merged

implement is_offsetlike #18823

2 of 4 tasks complete
@jreback

this needs some work. let's clear the decks and you can revist

@@ -198,6 +198,7 @@ Other API Changes
- Rearranged the order of keyword arguments in :func:`read_excel()` to align with :func:`read_csv()` (:issue:`16672`)
- :func:`pandas.merge` now raises a ``ValueError`` when trying to merge on incompatible data types (:issue:`9780`)
- :func:`wide_to_long` previously kept numeric-like suffixes as ``object`` dtype. Now they are cast to numeric if possible (:issue:`17627`)

This comment has been minimized.

@jreback

jreback Dec 23, 2017

Contributor

move to 0.23.0

@jbrockmendel

This comment has been minimized.

Member

jbrockmendel commented Dec 28, 2017

The other arithmetic PRs are a couple steps away from really being ready for this. The only part of this that is separately actionable (and on which I need input) is for where (and I guess if) to fix the mismatched behavior in DatetimeIndex.__add__ and __sub__

rng = pd.date_range('2000-01-01 09:00', freq='H', periods=2, tz='US/Pacific')
ser = pd.Series(rng)
other = np.array(1, dtype=np.int64)

>>> ser + other
[...]
TypeError: incompatible type for a datetime/timedelta operation [__add__]

>>> rng + other
DatetimeIndex(['2000-01-01 17:00:00.000000001-08:00', '2000-01-01 18:00:00.000000001-08:00'], dtype='datetime64[ns, US/Pacific]', freq=None)

>>> other + rng
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'DatetimeIndex'

This PR addressed that here and here, but got some pushback because it was (admittedly) ugly. Alternative suggestions requested.

Closing this PR, will (eventually) open another one with reduced scope.

@jbrockmendel jbrockmendel deleted the jbrockmendel:fix12534 branch Feb 11, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment