allowing datetime and timedelta datatype in pd cut bins #14798

Closed

Contributor

aileronajay commented Dec 4, 2016 edited by jreback

xref #14714, follow-on to #14737

Contributor

aileronajay commented Dec 4, 2016

The change is currently WIP; I will add tests and other changes, @jorisvandenbossche

codecov-io commented Dec 5, 2016 edited

Current coverage is 84.64% (diff: 88.88%)

Merging #14798 into master will increase coverage by <.01%

@@             master     #14798   diff @@
==========================================
  Files           144        144          
  Lines         51021      51030     +9   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43188      43196     +8   
- Misses         7833       7834     +1   
  Partials          0          0          

Powered by Codecov. Last update f79bc7a...82bffa1

@@ -313,6 +313,18 @@ def test_datetime_cut(self):
result, bins = cut(data, 3, retbins=True)
tm.assert_series_equal(Series(result), expected)
+ def test_datetime_bin(self):
+ data = [np.datetime64('2012-12-13'), np.datetime64('2012-12-15')]
@jreback

jreback Dec 5, 2016

Contributor

add the issue number as a comment

@aileronajay

aileronajay Dec 22, 2016 edited

Contributor

@jreback I don't think there is an open issue for this change; @jorisvandenbossche proposed it when I was changing cut to allow datetime and timedelta data types

pandas/tools/tests/test_tile.py
@@ -313,6 +313,18 @@ def test_datetime_cut(self):
result, bins = cut(data, 3, retbins=True)
tm.assert_series_equal(Series(result), expected)
+ def test_datetime_bin(self):
+ data = [np.datetime64('2012-12-13'), np.datetime64('2012-12-15')]
+ bins = [np.datetime64('2012-12-12'), np.datetime64('2012-12-14'),
@jreback

jreback Dec 5, 2016

Contributor

use pd.Timestamp(....) instead of direct np.datetime64

@jreback

jreback Dec 5, 2016 edited

Contributor

actually you probably want to test with datetime.datetime, Timestamp, and np.datetime64. Just put them in a loop, something like:

data = ['2012-12-12', '2012-12-14']

# each converter turns a date string into one of the supported bin types
for conv in [lambda v: Timestamp(v).to_pydatetime(), Timestamp, np.datetime64]:
    bins = [conv(v) for v in data]

also test
bins = pd.to_datetime(data)

these should all work, because internally you need to wrap a Timestamp converter around each of the bins (if dtype == M8) or a Timedelta converter (if dtype == m8)
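
A minimal sketch of the check suggested above, written against the public `pd.cut` API rather than the PR's test file. It assumes a pandas version that already includes this change (0.20.0 or later); the names `edges`, `to_pydatetime`, and `codes_by_source` are illustrative and do not come from the PR.

```python
import numpy as np
import pandas as pd

# Two dates to bin, and three bin edges given as plain strings.
data = pd.Series(pd.to_datetime(["2012-12-13", "2012-12-15"]))
edges = ["2012-12-12", "2012-12-14", "2012-12-16"]


def to_pydatetime(value):
    """Convert a date string to a datetime.datetime (illustrative helper)."""
    return pd.Timestamp(value).to_pydatetime()


# Build the bins with each supported scalar type and record the bin codes.
codes_by_source = {}
for conv in [to_pydatetime, pd.Timestamp, np.datetime64]:
    bins = [conv(v) for v in edges]
    result = pd.cut(data, bins=bins)
    codes_by_source[conv.__name__] = result.cat.codes.tolist()

# A DatetimeIndex should be accepted as bins as well.
index_result = pd.cut(data, bins=pd.to_datetime(edges))
codes_by_source["to_datetime"] = index_result.cat.codes.tolist()

print(codes_by_source)
```

If the bins are handled consistently, every entry in `codes_by_source` holds the same codes regardless of which type was used to build the edges.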

@aileronajay

aileronajay Dec 22, 2016

Contributor

@jreback made the changes; I added a new method in tile.py to handle the time-type bins
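
For readers following along, the kind of conversion discussed in the review could look roughly like the sketch below. This is not the method added to tile.py in this PR; `wrap_time_bins` is a hypothetical name, and the dtype check via `pd.Index` is an assumption made for illustration.

```python
import numpy as np
import pandas as pd


def wrap_time_bins(bins):
    """Hypothetical helper: normalize time-like bin edges to pandas scalars.

    Datetime-like edges (dtype kind "M") become pd.Timestamp, timedelta-like
    edges (dtype kind "m") become pd.Timedelta, anything else is returned
    unchanged as a list.
    """
    idx = pd.Index(bins)
    if idx.dtype.kind == "M":
        return [pd.Timestamp(b) for b in idx]
    if idx.dtype.kind == "m":
        return [pd.Timedelta(b) for b in idx]
    return list(bins)


datetime_bins = wrap_time_bins(
    [np.datetime64("2012-12-12"), np.datetime64("2012-12-14")]
)
timedelta_bins = wrap_time_bins(
    [np.timedelta64(1, "D"), np.timedelta64(2, "D")]
)
numeric_bins = wrap_time_bins([0, 5, 10])
```

Routing everything through `pd.Index` lets pandas infer the dtype, so `datetime.datetime`, `Timestamp`, and `np.datetime64` edges all take the same branch.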

jorisvandenbossche added this to the 0.20.0 milestone Dec 6, 2016

@aileronajay can you update this?

Contributor

aileronajay commented Dec 14, 2016

@jorisvandenbossche I am caught up with some other work right now; I should be able to make these changes next week

aileronajay added some commits Dec 4, 2016

@aileronajay aileronajay allowing datetime and timedelta datatype in pd cut bins 355e569
@aileronajay aileronajay added test for datetime bin type ac919cf
@aileronajay aileronajay added method for time type bins in pd cut and modified tests 82bffa1

jreback closed this in 3e4f839 Dec 22, 2016

Contributor

jreback commented Dec 22, 2016

thanks!

@ShaharBental ShaharBental added a commit to ShaharBental/pandas that referenced this pull request Dec 26, 2016

@aileronajay @ShaharBental aileronajay + ShaharBental ENH: allowing datetime and timedelta datatype in pd cut bins
xref #14714, follow-on to #14737

Author: Ajay Saxena <aileronajay@gmail.com>

Closes #14798 from aileronajay/cut_timetype_bin and squashes the following commits:

82bffa1 [Ajay Saxena] added method for time type bins in pd cut and modified tests
ac919cf [Ajay Saxena] added test for datetime bin type
355e569 [Ajay Saxena]  allowing datetime and timedelta datatype in pd cut bins
4c75674