ENH: TimeDeltaIndex and corresponding scalar #3009

jreback · 2013-03-11T02:00:02Z

disallow operations in Series that are invalid and that now create timedelta64 (e.g. dt + dt; only dt - dt?), see BUG/API: Fix operating with timedelta64/pd.offsets on rhs of a datelike series/index #4534
create the TimeDeltaIndex object, inheriting from Int64Index, see ENH: create Timedelta scalar / TimedeltaIndex #8184
create the associated scalar (an np.timedelta64 wrapper), like Timestamp (Timedelta?); a direct wrapper that forwards methods might work, internally represented as timedelta64 (or a view?) as an array, see ENH: create Timedelta scalar / TimedeltaIndex #8184
move pretty printing to the new class (from the function in lib), done??? see Weird Datetime Behaviour #3002, ENH: support min/max on timedelta64[ns] Series #2990, and ENH: timedelta should write correctly to csv #4378
need a Timedelta64 Block, with to_native_types (for printing), PR ENH: GH3371 support timedelta fillna #4684
read_csv to have a parse_timedelta to automatically apply the dtype, see API: add parse_timedelta kw for read_csv #4650 and ENH: GH3371 support timedelta fillna #4684; moved to ENH: read_csv to have a parse_timedelta kw to automatically parse into timedelta64 #8185
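
To illustrate the first item, here is a minimal sketch of the dt - dt versus dt + dt distinction. It assumes a recent pandas release; the variable names are made up for the example, and the behaviour at the time this issue was opened may have differed.

```python
import pandas as pd

start = pd.Series(pd.to_datetime(["2013-01-01", "2013-01-02"]))
end = pd.Series(pd.to_datetime(["2013-01-03", "2013-01-05"]))

# dt - dt is meaningful: the result is a timedelta64 Series.
elapsed = end - start

# dt + dt has no sensible meaning; recent pandas raises TypeError.
try:
    start + end
    addition_rejected = False
except TypeError:
    addition_rejected = True

print(elapsed)
print("dt + dt rejected:", addition_rejected)
```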

cpcloud · 2013-04-17T22:28:45Z

This would be totally awesome, especially for any kind of experimental data.

jreback · 2013-04-17T22:32:41Z

there is already quite a lot of support, see the time delta section (in time series part of docs)
this is just an extension to make it easier to do some things
e.g. the index works, but it's an Int64Index, so a bit non-intuitive

cpcloud · 2013-07-27T14:06:19Z

ha! didn't realize i had seen this already

hughesadam87 · 2014-07-28T22:25:37Z

For reference (as #7640 was not the appropriate place to share), here's a hacked implementation of datetime --> timeinterval.

http://nbviewer.ipython.org/github/hugadams/pyuvvis/blob/master/examples/Notebooks/intervals.ipynb

I'd imagine the TimeDeltaIndex should supplant the need for such an API entirely.

jreback · 2014-09-05T01:22:14Z

Almost have this working.

In [1]: pd.TimedeltaIndex(range(5),unit='s')
Out[1]: 
<class 'pandas.tseries.tdi.TimedeltaIndex'>
[00:00:00, ..., 00:00:04]
Length: 5, Freq: None

In [2]: pd.TimedeltaIndex(range(5),unit='d')
Out[2]: 
<class 'pandas.tseries.tdi.TimedeltaIndex'>
[00:00:00, ..., 4 days, 00:00:00]
Length: 5, Freq: None

I think I need something better for the repr, because unlike other formats, commas CAN be in a single element, so it gets confusing.

In [3]: pd.TimedeltaIndex(range(5),unit='d')+pd.offsets.Hour(1)
Out[3]: 
<class 'pandas.tseries.tdi.TimedeltaIndex'>
[01:00:00, ..., 4 days, 01:00:00]
Length: 5, Freq: None

any thoughts

@jorisvandenbossche @cpcloud @hayd @TomAugspurger
cc @shoyer

shoyer · 2014-09-05T01:26:16Z

@jreback Sweet! Is there a Timedelta scalar or are you just reusing np.timedelta64? Is there some support for missing values like NaT? (Maybe you have a PR or branch which answers these questions?)

jreback · 2014-09-05T01:50:56Z

@shoyer just created a Timedelta scalar (pretty much like a Period, but no freq), just a .value really. I need it mainly to box (otherwise you get silly np.timedelta64 when displaying which is ugly).

In [1]: (pd.TimedeltaIndex(range(5),unit='d')+pd.offsets.Hour(1)).tolist()
Out[1]: 
[Timedelta('01:00:00'),
 Timedelta('1 days, 01:00:00'),
 Timedelta('2 days, 01:00:00'),
 Timedelta('3 days, 01:00:00'),
 Timedelta('4 days, 01:00:00')]

NaT should work as well.
not really any tests yet though :)

https://github.com/jreback/pandas/tree/tdi
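
A minimal sketch of the boxing described above, assuming a recent pandas release rather than the tdi branch: tolist() hands back Timedelta scalars, the underlying storage is still numpy timedelta64, and a missing entry shows up as NaT.

```python
import pandas as pd

idx = pd.to_timedelta([0, 1, 2, 3, 4], unit="D") + pd.offsets.Hour(1)

boxed = idx.tolist()    # list of pandas Timedelta scalars
raw = idx.to_numpy()    # underlying numpy timedelta64 array

# None is converted to NaT when building the index.
with_missing = pd.to_timedelta(["1 days", None])

print(boxed)
print(raw.dtype)
print(with_missing.isna())
```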

jreback · 2014-09-05T02:02:06Z

API issue:

should a Timedelta scalar be more timedelta like or np.timedelta64 like?

mainly this is for the constructor. should it be as close to timedelta as possible so that they are 'effectively' interchangeable (but with Timedelta improving on the oddities), much like Timestamp does?

hayd · 2014-09-05T02:20:27Z

If you can get back from the string to the timedelta, then maybe stringify the repr to remove comma ambiguity:

In [2]: pd.TimedeltaIndex(range(5),unit='d')
Out[2]: 
<class 'pandas.tseries.tdi.TimedeltaIndex'>
["00:00:00", ..., "4 days, 00:00:00"]
Length: 5, Freq: None

+1 This looks great!
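
A rough sketch of the quoting idea. quoted_summary is a hypothetical helper written for this example, not a pandas function, and the exact element strings depend on the pandas version in use. The last line checks the round trip from string back to Timedelta that the suggestion relies on.

```python
import pandas as pd


def quoted_summary(index):
    """Hypothetical helper: quote the first and last elements of an index."""
    first, last = str(index[0]), str(index[-1])
    return f'["{first}", ..., "{last}"]'


idx = pd.to_timedelta([0, 1, 2, 3, 4], unit="D")
summary = quoted_summary(idx)

# The quoted string should parse back to the same Timedelta.
round_tripped = pd.Timedelta(str(idx[-1]))

print(summary)
print(round_tripped == idx[-1])
```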

shoyer · 2014-09-05T03:00:49Z

I assume that you will be able to query using any string that can be handled by pd.to_timedelta? e.g., s.loc['1 day']?

RE: timedelta like or np.timedelta64 like: I think Timestamp sets a clear precedent (even though I'm not sure that was the right decision -- there is something to be said for being similar to a numpy scalar, which Timestamp is not) so I would make Timedelta more timedelta like.

jreback · 2014-09-05T03:04:05Z

yep that's pretty easy

a bit trickier (but not a lot) is partial string indexing

eg s.loc['1 day':]

is effectively >=1 day and < 2 days

cpcloud · 2014-09-05T04:09:42Z

... s.loc['1 day':]
is effectively >=1 day and < 2 days

why wouldn't that be ">= 1 day"?
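
For reference, a small check of open-ended string slicing. This assumes a recent pandas release, where the string is parsed as the lower bound and no upper bound is applied; the 2014 development branch may have behaved differently. The sample data is invented for the example.

```python
import pandas as pd

index = pd.to_timedelta([0, 12, 24, 36, 60], unit="h")
s = pd.Series(range(5), index=index)

# Open-ended slice: everything from 1 day onward, including 2 days 12:00:00.
tail = s.loc["1 day":]

print(tail)
```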

jorisvandenbossche · 2014-09-05T07:11:19Z

For the repr, another option is to use ; to separate? But maybe quoting is better (; is not used for anything else)

jorisvandenbossche · 2014-09-05T07:12:58Z

@jreback What would both look like (what would the differences be) if Timedelta were more like datetime.timedelta or np.timedelta64?

And would a timedelta64 series element also return this? (like you now also get Timestamps from a datetime64 series)?

jreback · 2014-09-05T14:18:58Z

So I created Timedelta as an extension type almost exactly like Timestamp (it has a c-extension and a shadow python class). This makes it a sub-class of timedelta, which is nice, and it also has a numpy-like value, which makes it fast. Like Timestamp, though, it's not actually stored in an Index or Series; rather, the underlying data is (which is how it is currently).

In [2]: from pandas import Timedelta

In [3]: Timedelta('10 days, 00:00:10')
Out[3]: Timedelta('10 days, 00:00:10')

In [4]: type(Timedelta('10 days, 00:00:10'))
Out[4]: pandas.tslib.Timedelta

In [5]: Timedelta(days=10,milliseconds=10*1000)
Out[5]: Timedelta('10 days, 00:00:10')

In [6]: Timedelta('nat')
Out[6]: Timedelta('NaT')

In [7]: Timedelta(10,unit='d')
Out[7]: Timedelta('10 days, 00:00:00')

In [8]: isinstance(Timedelta(10,unit='d'),timedelta)
Out[8]: True

In [10]: pd.to_timedelta(range(5),unit='h')
Out[10]: 
<class 'pandas.tseries.td.TimedeltaIndex'>
[00:00:00, ..., 04:00:00]
Length: 5, Freq: None

In [11]: Series(pd.to_timedelta(range(5),unit='h'))
Out[11]: 
0   00:00:00
1   01:00:00
2   02:00:00
3   03:00:00
4   04:00:00
dtype: timedelta64[ns]

In [12]: list(Series(pd.to_timedelta(range(5),unit='h')))
Out[12]: 
[Timedelta('0 days, 00:00:00'),
 Timedelta('0 days, 01:00:00'),
 Timedelta('0 days, 02:00:00'),
 Timedelta('0 days, 03:00:00'),
 Timedelta('0 days, 04:00:00')]
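
The constructor forms in the session above can be checked against each other with a short script. This is a sketch assuming a recent pandas release; the string is written without the comma, since the accepted string formats may differ between versions.

```python
from datetime import timedelta

import pandas as pd

from_string = pd.Timedelta("10 days 00:00:10")
from_keywords = pd.Timedelta(days=10, milliseconds=10 * 1000)
from_unit = pd.Timedelta(10, unit="D")
missing = pd.Timedelta("nat")

# Timedelta subclasses datetime.timedelta, so the two compare directly.
print(from_string == from_keywords)
print(isinstance(from_unit, timedelta), from_unit == timedelta(days=10))
print(pd.isna(missing))
```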

The only API change in this entire thing really is that to_timedelta will now return a TimedeltaIndex rather than a Series by default (which makes it consistent with to_datetime as well).
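
A quick sketch of that return-type change, assuming a recent pandas release: list-like input to to_timedelta yields a TimedeltaIndex, just as to_datetime yields a DatetimeIndex, and wrapping the result in Series recovers a timedelta64 column.

```python
import pandas as pd

as_index = pd.to_timedelta([0, 1, 2, 3, 4], unit="h")
stamps = pd.to_datetime(["2014-09-05", "2014-09-06"])

# Wrap explicitly when a Series is wanted.
as_series = pd.Series(as_index)

print(type(as_index).__name__, type(stamps).__name__)
print(as_series.dtype)
```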

Still need more tests / integration.

jreback · 2014-09-05T14:40:28Z

PR is now #8184, further comments can go there

jreback mentioned this issue Mar 16, 2013

Accessing fields in datetime64 Series #2315

Closed

jreback mentioned this issue Mar 26, 2013

ENH add time to DatetimeIndex #3180

Closed

hayd mentioned this issue Jul 5, 2013

API: Series/DTIndex arithmetic with timedeltas and offsets should work the same #4134

Closed

jreback mentioned this issue Jul 27, 2013

ENH: timedelta should write correctly to csv #4378

Closed

This was referenced Sep 12, 2013

ENH/CLN: support enhanced timedelta64 operations/conversions #4822

Merged

API: add parse_timedelta kw for read_csv #4650

Closed

jreback modified the milestones: 0.15.0, 0.14.0 Feb 15, 2014

shoyer mentioned this issue May 16, 2014

Allow datetime.timedelta coordinates. pydata/xarray#55

Closed

shoyer mentioned this issue Jun 11, 2014

time delta index #7425

Closed

shoyer mentioned this issue Jul 28, 2014

Proposal: New Index type for binned data (IntervalIndex) #7640

Closed

jreback mentioned this issue Sep 5, 2014

ENH: create Timedelta scalar / TimedeltaIndex #8184

Merged

jreback modified the milestones: 0.15.0, 0.15.1 Sep 5, 2014

jreback mentioned this issue Sep 5, 2014

ENH: read_csv to have a parse_timedelta kw to automatically parse into timedelta64 #8185

Closed

This was referenced Sep 6, 2014

PERF: improve to_timedelta perf for string-like #6755

Closed

ENH: Add tz_localize and tz_convert to Series.dt methods #8187

Closed

jreback closed this as completed in #8184 Sep 13, 2014
