implement truediv, rtruediv directly in TimedeltaArray; tests #23829

jbrockmendel · 2018-11-21T02:46:24Z

Follow-up to #23642, implements __rdiv__ in TimedeltaArray, plus missing tests.

After this will be a PR that does the same for floordiv+rfloordiv.

closes BUG: TDI / TDI should work #22631
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

Fixes:
tdi / np.timedelta64('NaT')
tdi / tdi (#22631)
tdi / object_dtyped
object_dtyped / tdi

pep8speaks · 2018-11-21T02:46:28Z

Hello @jbrockmendel! Thanks for updating the PR.

There are no PEP8 issues in the file pandas/core/arrays/timedeltas.py !
There are no PEP8 issues in the file pandas/core/indexes/base.py !
There are no PEP8 issues in the file pandas/core/indexes/timedeltas.py !
There are no PEP8 issues in the file pandas/tests/arithmetic/test_timedelta64.py !

Comment last updated on November 29, 2018 at 02:49 Hours UTC

codecov · 2018-11-21T04:58:11Z

Codecov Report

Merging #23829 into master will decrease coverage by <.01%.
The diff coverage is 92.1%.

@@            Coverage Diff             @@
##           master   #23829      +/-   ##
==========================================
- Coverage   92.31%   92.31%   -0.01%     
==========================================
  Files         161      161              
  Lines       51513    51578      +65     
==========================================
+ Hits        47554    47613      +59     
- Misses       3959     3965       +6

Flag	Coverage Δ
#multiple	`90.71% <92.1%> (ø)`	⬆️
#single	`42.37% <13.15%> (-0.05%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexes/base.py	`96.32% <66.66%> (-0.17%)`	⬇️
pandas/core/indexes/timedeltas.py	`89.6% <86.66%> (+0.07%)`	⬆️
pandas/core/arrays/timedeltas.py	`95.95% <96.36%> (-0.31%)`	⬇️
pandas/io/formats/printing.py	`93.61% <0%> (+0.53%)`	⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 982c169...d72bf90. Read the comment docs.

gfyoung · 2018-11-21T09:02:10Z

pandas/tests/arithmetic/test_timedelta64.py

+        rng = tm.box_expected(rng, box_with_array)
+        for obj in [mismatched, mismatched[:2]]:
+            # one shorter, one longer
+            for other in [obj, np.array(obj), pd.Index(obj)]:


Is it possible to parameterize this? Potentially via flags to indicate what transformation to perform on obj before testing for the error?

Definitely the outer-loop can be parameterized (just use [1, 2, 3, 4] and [1, 2]), unless there's a really strong reason for using mismatched and mismatched[:2].

It's possible, but I'm trying to push back a little bit against over-parametrization. The pytest setup cost is non-trivial

jorisvandenbossche · 2018-11-21T09:15:52Z

pandas/core/arrays/timedeltas.py

+
+        elif is_timedelta64_dtype(other):
+            # let numpy handle it
+            return self._data / other


Do we need to ensure other is a np.ndarray and not TimedeltaArray here? (meaning, extract the numpy array out of the TimedeltaArray)

No. self._data.__div__(other) would return NotImplemented if other were a TimedeltaArray. This PR includes a test that covers TDA/TDA.

Yes, but that is another level of redirection, while we know this will happen and can directly do the correct thing here?
(I suppose is_timedelta64_dtype only passes through those two cases of ndarray or TimedeltaArray?)

I suppose is_timedelta64_dtype only passes through those two cases of ndarray or TimedeltaArray?

Yes, since we exclude Series and Index at the start.

Yes, but that is another level of redirection, while we know this will happen and can directly do the correct thing here?

I guess we could replace other with getattr(other, "_data", other) (or if/when TimedeltaArray gets an __array__ method, just np.array(other), which would be prettier)

In the (hopefully not too distant) future, TimedeltaArray will no longer be an Index. In that case, would we want to explicitly grab the ._data out of it and proceed?

And what's the return type here? Does this need to be wrapped in a type(self) so that we return a TimedeltaArray?

And what's the return type here?

float-dtyped ndarray
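
The exchange above can be sketched roughly as follows. This is a minimal illustration, not the pandas implementation: `ToyTimedeltaArray` is a hypothetical stand-in that keeps its values in `_data`, unwraps the other operand with `getattr(other, "_data", other)` as suggested in the thread, and then lets NumPy do the division. The result is a plain float64 ndarray.

```python
import numpy as np


class ToyTimedeltaArray:
    """Hypothetical stand-in for TimedeltaArray: wraps an m8[ns] ndarray."""

    def __init__(self, values):
        self._data = np.asarray(values, dtype="m8[ns]")

    def __truediv__(self, other):
        # Unwrap another wrapper explicitly instead of relying on
        # ndarray.__truediv__ returning NotImplemented for it.
        other = getattr(other, "_data", other)
        if isinstance(other, np.ndarray) and other.dtype.kind == "m":
            # timedelta64 / timedelta64: let numpy handle it
            return self._data / other
        return NotImplemented


left = ToyTimedeltaArray(np.array([2, 4, 6], dtype="m8[D]"))
right = ToyTimedeltaArray(np.array([1, 2, 3], dtype="m8[D]"))
result = left / right
print(type(result).__name__, result.dtype)
```

Because the ratio of two durations is unitless, there is nothing to wrap back into a timedelta container here.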

jorisvandenbossche · 2018-11-21T09:19:44Z

pandas/core/arrays/timedeltas.py

+            # let numpy handle it
+            return self._data / other
+
+        elif is_object_dtype(other):


Do we need to allow this?
I would be fine with raising a TypeError here.

(I first wanted to say: can't we dispatch that to numpy, thinking that numpy object dtype would handle that, but they raise a TypeError)

I can see why this would be first on the chopping block if we had to support fewer cases. Is there a compelling reason not to handle this case?

You can also turn around the question :) Is there a compelling reason to handle this case?

It's just an extra case to support. And eg, we could discuss whether this should return object dtype data or timedelta, as you are inferring now? Looking at Series behaviour with int64 and object integers, it actually returns object. For datetimes it now raises. So at least, our support is at the moment not very consistent.

Is there a compelling reason to handle this case?

Because a selling point of pandas is that things Just Work? Because the code and tests are already written, so the marginal cost is \approx zero?

our support is at the moment not very consistent

Fair enough. If a goal is to make things more consistent (which I'm +1 on BTW) then we're probably not going to go around and start breaking the places where it currently is supported.

i agree with @jbrockmendel here
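
For reference, the NumPy behaviour mentioned in this thread can be reproduced with a small sketch that uses only NumPy and the standard library (variable names are illustrative). Dividing a timedelta64 array by an object-dtype array is rejected by NumPy itself, so the case has to be handled explicitly if it is to be supported at all.

```python
import numpy as np
from datetime import timedelta

tds = np.array([1, 2, 3], dtype="m8[s]")
objs = np.array([timedelta(seconds=1)] * 3, dtype=object)

try:
    tds / objs
except TypeError as err:
    # numpy has no division loop for (timedelta64, object) operands
    numpy_error = err
else:
    numpy_error = None

print(type(numpy_error).__name__)
```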

jorisvandenbossche · 2018-11-21T09:26:48Z

pandas/core/indexes/timedeltas.py

+            if isinstance(other, Index):
+                # TimedeltaArray defers, so we need to unwrap
+                oth = other._values
+            result = TimedeltaArray.__truediv__(self, oth)


Can we do here something like result = self._values / oth instead of calling the dunder method? (I don't really know the difference in practice, but it looks a bit cleaner)

I'll double-check. Usually my reasoning for using the dunder methods is to preserve NotImplemented semantics.
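
The difference between the two spellings can be shown with a toy class. This is a sketch under the assumption of a hypothetical `Meters` type, unrelated to pandas: calling the dunder directly hands back `NotImplemented`, which the caller can inspect or forward, while the operator form converts an unhandled `NotImplemented` into a `TypeError`.

```python
class Meters:
    """Hypothetical toy type, used only to show NotImplemented semantics."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        if isinstance(other, Meters):
            return self.value / other.value
        # Signal "not handled here"; Python may then try other.__rtruediv__.
        return NotImplemented


m = Meters(6.0)

# Calling the dunder directly returns the sentinel untouched.
direct = Meters.__truediv__(m, "oops")

# The operator form raises once every candidate has returned NotImplemented.
try:
    m / "oops"
except TypeError:
    operator_raised = True
else:
    operator_raised = False

print(direct is NotImplemented, operator_raised)
```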

jreback · 2018-11-21T19:41:18Z

pandas/core/arrays/timedeltas.py

+
+        elif is_object_dtype(other):
+            result = [self[n] / other[n] for n in range(len(self))]
+            result = np.array(result)


actually this is really close to what soft_convert_objects does.

That isn't clear to me. soft_convert_objects doesn't call lib.infer_dtype or any analogue.

you are essentially re-implementing it. i would rather not do that.

I'm not clear on what you have in mind. Something like:

    if lib.infer_dtype(result) == 'timedelta':
        result = soft_convert_objects(result, timedelta=True, coerce=False)
        return type(self)(result)
    return result

?

Fair enough, will change

changed. this ends up changing the behavior of the DataFrame test case, but that's largely driven by the fact that DataFrame([NaT]) gets inferred as datetime64[ns]

what is changed here? shouldn't is_object_type result in a TypeError or a NotImplemented?

There are two independent questions that have been asked about the object-dtype case:

1. Should we just raise TypeError instead, or should we handle it so it Just Works (the latter being what this PR does)?

2. Given that we handle this case, do we try to infer the output dtype or just return object dtype? This PR originally did the former, then changed to do the latter following discussion.

ok this is fine, you are returning object dtype (which is consistent with how we do for Series now)
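
The outcome of this thread (operate elementwise, do not infer the output dtype) can be sketched with NumPy alone. `truediv_object` is a hypothetical helper written for this illustration, not a pandas function; it mirrors the list-comprehension approach in the diff but always returns an object-dtype array, even when every element happens to be a float.

```python
import numpy as np


def truediv_object(values, other):
    """Hypothetical helper: divide elementwise and keep object dtype."""
    result = np.empty(len(values), dtype=object)
    for n in range(len(values)):
        result[n] = values[n] / other[n]
    # Deliberately no dtype inference on the result.
    return result


tds = np.array([2, 4, 6], dtype="m8[s]")
other = np.array(
    [np.timedelta64(1, "s"), np.timedelta64(2, "s"), np.timedelta64(3, "s")],
    dtype=object,
)
result = truediv_object(tds, other)
print(result.dtype, list(result))
```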

jbrockmendel · 2018-11-22T04:06:57Z

Next round implementing floordiv, mod, divmod, and reversed ops is almost ready. Any preference between follow-up vs adding to this PR?

gfyoung · 2018-11-25T02:06:39Z

Any preference between follow-up vs adding to this PR?

How many more things do you need to follow-up on? And how straightforward are they to handle? Probably would lean towards merging this and follow-up later, but not sure whether some of the remaining comments are blockers...

jreback · 2018-11-27T02:31:17Z

pandas/core/arrays/timedeltas.py

+
+        elif is_object_dtype(other):
+            result = [self[n] / other[n] for n in range(len(self))]
+            result = np.array(result)


ok reversing here. i agree with @jorisvandenbossche we don't infer object dtypes in ops (only in the constructor), so this seems reasonable

In [2]: pd.Series([1,2, 3]) / pd.Series([1,1,1], dtype=object)
Out[2]:
0    1
1    2
2    3
dtype: object

jreback · 2018-11-27T02:32:04Z

pandas/core/arrays/timedeltas.py

+            # let numpy handle it
+            return other / self._data
+
+        elif is_object_dtype(other):


or can raise NotImplemented here? does that work?

I think that might be fragile; it might depend on having __array__ implemented. Either way, better to make it explicit than to rely on the numpy implementation.

same comment as above

ok, can you add a comment here (and above), where we do the operation but do not infer the output type (just for posterity), otherwise this PR lgtm. ping on green.

jbrockmendel · 2018-11-27T18:53:40Z

I think all comments have been addressed

jorisvandenbossche

One remaining question/comment about the testing of the last change for object dtype.

For the rest looks good to me!

jorisvandenbossche · 2018-11-27T21:28:41Z

pandas/tests/arithmetic/test_timedelta64.py

            result = tdser / vector.astype(object)
+            expected = [tdser[n] / vector[n] for n in range(len(tdser))]
+            expected = tm.box_expected(expected, xbox)


Wouldn't this convert the expected list to a DatetimeArray, while the expected result is an object ndarray? (so it's not really testing that?)

No. In the case where xbox is tm.to_array (the only case that could conceivably give a DatetimeArray), tm.to_array(any_list) returns np.array(that_list).

Ah, OK, all a bit opaque .. (I checked what get_upcast_box and box_expected do, but not box_with_array :-))

Yah, I'm hoping to simplify some of it, and ideally even get rid of box_expected, but it'll be a while before that's feasible.

jreback · 2018-11-28T17:49:33Z

doc/source/whatsnew/v0.24.0.rst

@@ -1227,7 +1227,7 @@ Timedelta
 - Bug in :class:`TimedeltaIndex` where adding a timezone-aware datetime scalar incorrectly returned a timezone-naive :class:`DatetimeIndex` (:issue:`23215`)
 - Bug in :class:`TimedeltaIndex` where adding ``np.timedelta64('NaT')`` incorrectly returned an all-`NaT` :class:`DatetimeIndex` instead of an all-`NaT` :class:`TimedeltaIndex` (:issue:`23215`)
 - Bug in :class:`Timedelta` and :func:`to_timedelta()` have inconsistencies in supported unit string (:issue:`21762`)
-
+- Bug in :class:`TimedeltaIndex` division where dividing by another :class:`TimedeltaIndex` raised ``TypeError`` instead of returning a :class:`Float64Index` (:issue:`23829`)


i think you need the referenced issue number here. do you need to mention the other issue that is referenced at the top of the PR?

jreback · 2018-11-28T17:50:47Z

pandas/core/arrays/timedeltas.py

+
+        elif is_object_dtype(other):
+            result = [self[n] / other[n] for n in range(len(self))]
+            result = np.array(result)


what is changed here? shouldn't is_object_type result in a TypeError or a NotImplemented?

jreback · 2018-11-28T17:50:59Z

pandas/core/arrays/timedeltas.py

+            # let numpy handle it
+            return other / self._data
+
+        elif is_object_dtype(other):


same comment as above

jbrockmendel · 2018-11-28T20:00:10Z

Travis fail is unrelated flake8-rst problem

jbrockmendel · 2018-11-29T03:58:09Z

ping

jreback · 2018-11-29T12:48:29Z

thanks!

TomAugspurger · 2018-12-11T17:25:52Z

Is the difference in behavior between TimedeltaIndex / timedelta64('NaT') and TimedeltaIndex / pd.NaT deliberate?

In [14]: rng = pd.timedelta_range(start='1D', periods=10, freq='D')

In [15]: rng / np.timedelta64('NaT')
Out[15]: Float64Index([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan], dtype='float64')

In [16]: rng / pd.NaT
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-0bf699fa1965> in <module>
----> 1 rng / pd.NaT

~/sandbox/pandas-alt/pandas/core/indexes/timedeltas.py in __truediv__(self, other)
    259             # TimedeltaArray defers, so we need to unwrap
    260             oth = other._values
--> 261         result = TimedeltaArray.__truediv__(self, oth)
    262         return wrap_arithmetic_op(self, other, result)
    263

~/sandbox/pandas-alt/pandas/core/arrays/timedeltas.py in __truediv__(self, other)
    364         elif lib.is_scalar(other):
    365             # assume it is numeric
--> 366             result = self._data / other
    367             freq = None
    368             if self.freq is not None:

TypeError: ufunc true_divide cannot use operands with types dtype('<m8[ns]') and dtype('O')

jbrockmendel · 2018-12-11T17:53:13Z

Is the difference in behavior between TimedeltaIndex / timedelta64('NaT') and TimedeltaIndex / pd.NaT deliberate?

Yes. timedelta64("NaT") is unambiguously timedelta-like, whereas pd.NaT is both datetime-like and timedelta-like (and defaults to datetime-like).
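
The "unambiguously timedelta-like" half of this answer can be checked with NumPy alone. In the sketch below (illustrative variable names; the `pd.NaT` side is left out because it depends on pandas internals), a NumPy NaT scalar carries a dtype kind, so a timedelta64 NaT is distinguishable from a datetime64 NaT, and dividing by it yields an all-NaN float array.

```python
import numpy as np

td_nat = np.timedelta64("NaT")
dt_nat = np.datetime64("NaT")

# Both are "not a time", but the dtype kind tells them apart:
# "m" is timedelta64, "M" is datetime64.
kinds = (td_nat.dtype.kind, dt_nat.dtype.kind)

tds = np.array([1, 2, 3], dtype="m8[D]")
result = tds / td_nat  # timedelta64 / timedelta64-NaT

print(kinds, result.dtype, result)
```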

TomAugspurger · 2018-12-11T18:02:44Z

Thanks for the clarification.

jbrockmendel added 2 commits November 20, 2018 18:31

implement truediv, rtruediv directly in TimedeltaArray; tests

6ec1f08

test tdi/tdi specifically

4c2cc59

more checks and test cases

bd2ee96

gfyoung added Numeric Operations Arithmetic, Comparison, and Logical operations Timedelta Timedelta data type labels Nov 21, 2018

gfyoung reviewed Nov 21, 2018

gfyoung reviewed Nov 21, 2018

gfyoung reviewed Nov 21, 2018

jorisvandenbossche reviewed Nov 21, 2018

jreback added this to the 0.24.0 milestone Nov 21, 2018

jreback requested changes Nov 21, 2018

jreback removed this from the 0.24.0 milestone Nov 21, 2018

jbrockmendel added 5 commits November 21, 2018 09:20

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

3275dd9

…ests

dont define _override_div_mod_methods, matches for pytest.raises

79901f5

change comment

adea273

whatsnew, GH references

da9f743

error msg py3 compat

ba9e490

jreback requested changes Nov 21, 2018

jbrockmendel added 2 commits November 21, 2018 13:23

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

10bb49b

…ests

flake8 fixup, raise directly

8f276ae

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

7d56da9

…ests

jbrockmendel mentioned this pull request Nov 24, 2018

Implement mul, floordiv, mod, divmod, and reversed directly in TimedeltaArray #23885

Merged

4 tasks

jbrockmendel added 2 commits November 26, 2018 09:24

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

6097789

…ests

sidestep object conversion

2037be8

jreback requested changes Nov 27, 2018

jbrockmendel added 2 commits November 26, 2018 18:52

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

ffedf35

…ests

dont case result when operating against object dtype

2fc44aa

jorisvandenbossche reviewed Nov 27, 2018

jbrockmendel mentioned this pull request Nov 28, 2018

REF/TST: Fix remaining DatetimeArray with DateOffset arithmetic ops #23789

Merged

4 tasks

jreback added this to the 0.24.0 milestone Nov 28, 2018

jreback requested changes Nov 28, 2018

jbrockmendel added 2 commits November 28, 2018 10:52

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

cd4ff57

…ests

another GH reference

641ad20

jbrockmendel added 6 commits November 28, 2018 15:22

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

e0d696f

…ests

comment

7d9e677

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

dfc7af4

…ests

Fixup rebase mixup, un-skip part of a test that isnt broken after all

55cad6b

Merge branch 'master' of https://github.com/pandas-dev/pandas into gt…

d21ae78

…ests

flake8 fixup

d72bf90

jreback approved these changes Nov 29, 2018

jreback merged commit d887927 into pandas-dev:master Nov 29, 2018

jbrockmendel deleted the gtests branch November 29, 2018 14:55

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

implement truediv, rtruediv directly in TimedeltaArray; tests (pandas…

d3b194a

…-dev#23829)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

implement truediv, rtruediv directly in TimedeltaArray; tests (pandas…

ce8ac04

…-dev#23829)
