ENH: Add Series.dt.total_seconds GH #10817 #10939

Merged
merged 1 commit into from Sep 2, 2015

Contributor

sjdenny commented Aug 30, 2015

Implements a Series.dt.total_seconds method for timedelta64 Series.

closes #10817

Contributor

jreback commented Aug 30, 2015

does timedelta.total_seconds() provide fractional seconds as well?

Contributor

sjdenny commented Aug 30, 2015

Yes, fractional seconds are included:

In [1]: s = Series(pd.timedelta_range('1day',periods=5,freq='1ms'))

In [2]: s
Out[2]: 
0          1 days 00:00:00
1   1 days 00:00:00.001000
2   1 days 00:00:00.002000
3   1 days 00:00:00.003000
4   1 days 00:00:00.004000
dtype: timedelta64[ns]

In [3]: s.dt.total_seconds
Out[3]: 
0    86400.000
1    86400.001
2    86400.002
3    86400.003
4    86400.004
dtype: float64
Contributor

jreback commented Aug 30, 2015

not what I meant

does the actual Python timedelta.total_seconds have fractions (I think yes), just confirming

Contributor

sjdenny commented Aug 30, 2015

Ah, I understand. Yes, it does:

In [38]: t = timedelta(days=1,microseconds=40)

In [39]: t.total_seconds()
Out[39]: 86400.00004

jreback added this to the 0.17.0 milestone Aug 31, 2015

@jreback jreback commented on an outdated diff Aug 31, 2015

pandas/tseries/tests/test_timedeltas.py
@@ -944,6 +945,27 @@ def test_fields(self):
tm.assert_series_equal(s.dt.days,Series([1,np.nan],index=[0,1]))
tm.assert_series_equal(s.dt.seconds,Series([10*3600+11*60+12,np.nan],index=[0,1]))
+
+ def test_total_seconds(self):
@jreback

jreback Aug 31, 2015

Contributor

add a test in test_series/test_dt_namespace_accessor as well (it's like the one you have for test Series)

@jreback jreback commented on an outdated diff Aug 31, 2015

pandas/tseries/tests/test_timedeltas.py
@@ -944,6 +945,27 @@ def test_fields(self):
tm.assert_series_equal(s.dt.days,Series([1,np.nan],index=[0,1]))
tm.assert_series_equal(s.dt.seconds,Series([10*3600+11*60+12,np.nan],index=[0,1]))
+
+ def test_total_seconds(self):
+ # test index
+ rng = timedelta_range('1 days, 10:11:12.100123456', periods=2, freq='s')
+ expt = [1*86400+10*3600+11*60+12+100123456./1e9,1*86400+10*3600+11*60+13+100123456./1e9]
+ assert_allclose(rng.total_seconds, expt, atol=1e-10, rtol=0)
+
+ # test Series
+ s = Series(rng)
+ s_expt = Series(expt,index=[0,1])
+ tm.assert_series_equal(s.dt.total_seconds,s_expt)
@jreback

jreback Aug 31, 2015

Contributor

can you add a test for the scalar as well (even though we know it works as it's a sub-class of timedelta), I don't think it's tested. e.g. Timedelta(....).total_seconds().

Note that this is a method, so maybe we should make total_seconds (on a TimedeltaIndex) a method as well (even though it's a bit odd, for consistency with the scalar)?

Indeed, didn't see that, but +1 making it a method instead of a property for consistency with datetime.timedelta/pd.Timedelta

Further, @sjdenny can you add a whatsnew notice (see doc/source/whatsnew/v0.17.0.txt, somewhere in 'other enhancements')

Contributor

sjdenny commented Aug 31, 2015

In extending the (nanosecond-precision) tests, I've come across this behaviour:

In [18]: td = pd.Timedelta(nanoseconds=50)

In [19]: td.nanoseconds
Out[19]: 50

In [20]: td.total_seconds()
Out[20]: 0.0

In [21]: td.total_seconds() == 0.0
Out[21]: True

In [22]: td2 = pd.Timedelta(microseconds=50)

In [23]: td2.total_seconds()
Out[23]: 5e-05

It appears (e.g. here) that timedelta.total_seconds() aims for microsecond accuracy only. Is this the behaviour we want to reproduce in Series.dt.total_seconds()? Nanosecond precision appears preferable, but then the scalar and Series functions would have slightly different behaviour.

It should be consistent between both in any case, but maybe we can also see the Timedelta behaviour as a bug?

I suspect that this method is just inherited from datetime.timedelta, so maybe we will have to override it ourselves in the subclass.

BTW, the reason for this difference is that datetime.timedelta does not support nanoseconds, while pd.Timedelta does
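To make the resolution gap concrete, here is a minimal stdlib-only sketch. It does not use pandas; the 50 ns duration is written as 0.05 microseconds for datetime.timedelta, and the variable names are illustrative only.

```python
from datetime import timedelta

# datetime.timedelta stores whole microseconds, so a 50 ns duration
# (0.05 microseconds) is rounded away when the object is constructed.
td = timedelta(microseconds=0.05)
stdlib_seconds = td.total_seconds()

# Holding the same duration as an integer nanosecond count and scaling
# by 1e-9 keeps the sub-microsecond part.
nanoseconds = 50
ns_seconds = nanoseconds * 1e-9

print(stdlib_seconds)  # 0.0
print(ns_seconds)      # a small positive float, roughly 5e-08
```

This matches what the session above shows for the scalar: the inherited method only sees microsecond-resolution fields.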

Contributor

jreback commented Aug 31, 2015

the pandas functions should work with ns precision in all cases

so need to override the scalar function as well

@sjdenny This should probably be added somewhere here: https://github.com/pydata/pandas/blob/master/pandas/tslib.pyx#L2415 (I think the self.value will give you the nanoseconds, and then you can use the same approach)
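A rough sketch of the override idea, assuming a subclass that keeps the full duration as an integer nanosecond count in a `value` attribute. `NanoDelta` is a hypothetical stand-in written against the stdlib only; it is not the actual code in tslib.pyx.

```python
from datetime import timedelta


class NanoDelta(timedelta):
    """Hypothetical timedelta subclass that remembers nanoseconds."""

    def __new__(cls, nanoseconds=0):
        # The base class can only hold whole microseconds.
        self = super().__new__(cls, microseconds=nanoseconds // 1000)
        # Keep the exact duration separately, as an integer ns count.
        self.value = nanoseconds
        return self

    def total_seconds(self):
        # Override: compute from the ns count, not the base-class fields.
        return 1e-9 * self.value


d = NanoDelta(nanoseconds=50)
print(timedelta.total_seconds(d))  # inherited behaviour: 0.0
print(d.total_seconds())           # overridden: roughly 5e-08
```

With the override in place, the scalar and the array-based computation can both derive seconds from the same nanosecond integer.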

@jreback jreback commented on an outdated diff Aug 31, 2015

doc/source/whatsnew/v0.17.0.txt
@@ -176,6 +176,10 @@ Other enhancements
- ``pandas.tseries.offsets`` larger than the ``Day`` offset can now be used with with ``Series`` for addition/subtraction (:issue:`10699`). See the :ref:`Documentation <timeseries.offsetseries>` for more details.
+- ``pd.Series`` of type timedelta64 has new method .dt.total_seconds() returning the duration of the timedelta in seconds (:issue: `10817`)
@jreback

jreback Aug 31, 2015

Contributor

use double-backticks around timedelta64 and .dt.total_seconds()

@jreback jreback commented on an outdated diff Aug 31, 2015

pandas/tests/test_series.py
@@ -142,6 +142,7 @@ def compare(s, name):
for s in [Series(timedelta_range('1 day',periods=5),index=list('abcde')),
Series(timedelta_range('1 day 01:23:45',periods=5,freq='s')),
Series(timedelta_range('2 days 01:23:45.012345',periods=5,freq='ms'))]:
+ #assert False
@jreback

jreback Aug 31, 2015

Contributor

leftover?

@jreback jreback commented on an outdated diff Aug 31, 2015

pandas/tseries/tests/test_timedeltas.py
@@ -945,6 +946,31 @@ def test_fields(self):
tm.assert_series_equal(s.dt.days,Series([1,np.nan],index=[0,1]))
tm.assert_series_equal(s.dt.seconds,Series([10*3600+11*60+12,np.nan],index=[0,1]))
+ def test_total_seconds(self):
+ # test index
@jreback

jreback Aug 31, 2015

Contributor

add the issue number as a comment here

Contributor

jreback commented Aug 31, 2015

I think also add the method .total_seconds() on the NaTType object (returns np.nan) for compat. (and pls add a test as well).

sjdenny changed the title from Add Series.dt.total_seconds GH #10817 to ENH: Add Series.dt.total_seconds GH #10817 Sep 1, 2015

@jreback jreback commented on an outdated diff Sep 1, 2015

pandas/tseries/tdi.py
@@ -391,6 +391,17 @@ def f(x):
result = result.astype('int64')
return result
+ def total_seconds(self):
+ """ Total duration of each element expressed in seconds. """
+ values = self.asi8
+ hasnans = self.hasnans
+ result = 1e-9 * values
+ if hasnans:
@jreback

jreback Sep 1, 2015

Contributor

use result = self._maybe_mask_results(result) (you can remove the hasnans as well)
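For readers unfamiliar with the masking step, a pure-Python sketch of what it accomplishes: convert int64 nanosecond values to float seconds and turn missing entries into NaN. It assumes missing values are stored as the minimum int64 sentinel; `INAT` and `total_seconds_from_ns` are illustrative names, and this is not the body of `_maybe_mask_results`.

```python
import math

# Assumed sentinel for a missing timedelta in int64 storage (minimum int64).
INAT = -2**63


def total_seconds_from_ns(values):
    """Convert integer nanosecond counts to float seconds, NaN where missing."""
    return [math.nan if v == INAT else 1e-9 * v for v in values]


one_day_ns = 86400 * 10**9
result = total_seconds_from_ns([one_day_ns, INAT])
print(result)
```

Without the mask, the sentinel would be scaled like any other integer and show up as a large negative number of seconds instead of NaN.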

@jreback jreback added a commit that referenced this pull request Sep 2, 2015

@jreback jreback Merge pull request #10939 from sjdenny/series_total_seconds
ENH: Add Series.dt.total_seconds GH #10817
582eb17

@jreback jreback merged commit 582eb17 into pandas-dev:master Sep 2, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Contributor

jreback commented Sep 2, 2015

@sjdenny awesome job!

@sjdenny Thanks a lot!

sjdenny deleted the sjdenny:series_total_seconds branch Sep 2, 2015
