ENH: enable pd.cut to handle i8 convertibles #14714
Comments
jreback added the Difficulty Novice, Effort Low, Enhancement, Groupby, and Timedelta labels on Nov 22, 2016
jreback added this to the Next Major Release milestone on Nov 22, 2016
jreback changed the title from "ENH: enable pd.cut to handle timedeltas" to "ENH: enable pd.cut to handle i8 convertibles" on Nov 22, 2016
jreback added the Timeseries label on Nov 22, 2016

@jreback In the above example, should the return type of the final object be 'timedelta64' (the dtype of the original input)?

Yes, it should be the original dtype.

@jreback Does it currently return strings? I got the output below by printing the type of each member of the category object returned.

Yes, it returns strings.

@jreback I am a bit confused now. As part of this enhancement we first need to convert to a dtype that cut can handle (timedelta64[ms]) and then return the dtype we originally started from (timedelta64[ns]). The objects returned will still be strings, but will they be strings composed of the object types that we initially passed to cut (timedelta64[ns])?

Yes, this is a bit tricky. I think you:

@jreback Thanks, this is what I was thinking.

IF we had an

@jreback Is there an error in the way I am making this round trip? I start with s, which is timedelta64[ns]:

    s = pd.Series(pd.to_timedelta(np.random.randint(0,100,size=10),unit='ms')).sort_values()

I then convert using the astype conversion used earlier here, and convert back to timedelta64[ns] using another astype conversion. But this conversion does not retain the data, and r is just a list with no time information.

    python data_conversion.py

no, you will always convert to
then

@jreback Does this code block emulate the change we want to make?

    import pandas as pd

output

    1 00:00:00.012000

@jreback Should calling s.astype('timedelta64[ns]') on a series s of pandas.Timedelta objects return a series of numpy timedelta64 objects with the time in seconds? Currently s.astype('timedelta64[ns]') returns pandas.Timedelta objects instead of the numpy equivalent.

We always return pandas objects (for timedelta / datetime).

@jreback Is it a good idea to use infer_dtype (from pandas.lib) to test whether the object we are calling cut on is of type datetime or timedelta, to decide whether we need to convert to timedelta64 or datetime64?

No, use needs_i8_conversion.

This was referenced Nov 25, 2016
jorisvandenbossche closed this in #14737 on Dec 3, 2016
aileronajay referenced this issue on Dec 22, 2016: allowing datetime and timedelta datatype in pd cut bins #14798 (Closed)
jreback added a commit that referenced this issue on Dec 22, 2016: aileronajay + jreback, 3e4f839
ShaharBental added a commit to ShaharBental/pandas that referenced this issue on Dec 26, 2016: aileronajay + ShaharBental, 4c75674

jreback commented Nov 22, 2016 (edited)
So this should work for timedeltas AND datetimes.
Should be straightforward: detect an i8 convertible, turn it into i8, do the cut, then turn it back to the original dtype.
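
The four steps in that comment can be sketched roughly as follows. This is only an illustration, not the change that was merged for this issue: `cut_i8` is a hypothetical helper name, the `dtype.kind` test is an assumed stand-in for pandas' internal `needs_i8_conversion` check, and the sketch handles only timezone-naive data with no missing values (NaT).

```python
import numpy as np
import pandas as pd


def cut_i8(values, bins):
    """Hypothetical helper: bin naive datetime64/timedelta64 data via int64."""
    s = pd.Series(values)
    orig_dtype = s.dtype
    # 1. Detect an i8 convertible. Assumption: checking the numpy dtype kind
    #    ('m' = timedelta64, 'M' = datetime64) stands in for the internal check.
    if not (isinstance(orig_dtype, np.dtype) and orig_dtype.kind in "mM"):
        raise TypeError("expected naive datetime64 or timedelta64 data")
    # 2. Turn into i8: reinterpret the values as integer counts of the unit.
    i8 = s.to_numpy().view("i8")
    # 3. Do the cut on plain integers, asking for bin codes and bin edges.
    codes, edges = pd.cut(i8, bins, labels=False, retbins=True)
    # 4. Turn the (float) bin edges back into the original dtype.
    edges = np.round(edges).astype("i8").view(orig_dtype)
    return codes, edges


s = pd.Series(pd.to_timedelta([10, 20, 30, 40, 50, 60], unit="ms"))
codes, edges = cut_i8(s, 3)
print(codes)        # integer bin index for each element of s
print(edges.dtype)  # same dtype as s
```

Returning integer codes plus converted edges keeps the sketch small; building interval labels in the original dtype, as discussed in the thread, would be a further step on top of the converted edges.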