BUG: DatetimeIndex.__getitem__ with boolean Index mask with False raises TypeError #22533

Closed
PH82 opened this issue Aug 29, 2018 · 3 comments · Fixed by #22852
Labels: Bug; Datetime (Datetime data dtype); Indexing (Related to indexing on series/frames, not to indexes themselves)
Comments

@PH82
PH82 commented Aug 29, 2018

import pandas as pd
import pytz
import datetime
import functools


def isNotWeekendIfEndBeforeStart (idxDateTime, startTime, endTime):
    if idxDateTime.dayofweek < 4: return True
    if idxDateTime.dayofweek == 5: return False
    if idxDateTime.dayofweek == 4: return idxDateTime.time() <= endTime if endTime else True
    if idxDateTime.dayofweek == 6: return idxDateTime.time() > startTime if startTime else False
    return True

if __name__ == '__main__':
    # pd.show_versions()
    # print(pd.date_range(start=datetime.datetime(2016, 3, 18, 0, 0), end=datetime.datetime(2016, 3, 22, 0, 0), freq='D',
    #                     tz='UTC', closed='right'))
    nytz = pytz.timezone('America/New_York')
    startTimeBkt = nytz.localize(datetime.datetime(2018,6,1,22))
    endTimeBkt = nytz.localize(datetime.datetime(2018,7,31,10,15))

    print(startTimeBkt)
    print(endTimeBkt)

    rng = pd.date_range(start=startTimeBkt, end=endTimeBkt, freq='900S', tz=nytz, closed='right')
    #print(rng)

    startCut=datetime.time(17, tzinfo=nytz)
    endCut=datetime.time(15, tzinfo=nytz)
    print(startCut)
    print(endCut)

    removeWeekends = functools.partial(isNotWeekendIfEndBeforeStart, startTime=startCut, endTime=endCut)
    rng = rng[rng.map(removeWeekends)]
    print(rng)

The code above creates a DatetimeIndex of 15-minute intervals and then removes every entry that falls within the weekend (here the weekend boundaries are set by a specific time of day rather than by midnight). This works with pandas 0.19, but after upgrading, a NumPy error is raised. This occurs on both Python 2.7 and Python 3.6.
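For reference, the same weekend rule can also be written as a vectorized boolean mask, which avoids the `map` step entirely. The following is a minimal sketch, not the reporter's code: the shortened date range, the names `end_minutes`, `start_minutes`, and `keep`, and the use of minutes since midnight in place of `datetime.time` comparisons are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the two-month range in the report (assumption).
rng = pd.date_range(
    start="2018-06-01 22:15",
    end="2018-06-04 10:15",
    freq="15min",
    tz="America/New_York",
)

end_minutes = 15 * 60    # Friday cut-off, 15:00 local time
start_minutes = 17 * 60  # Sunday restart, 17:00 local time

# Plain NumPy arrays of local day-of-week and minutes since midnight.
dow = np.asarray(rng.dayofweek)
minutes = np.asarray(rng.hour) * 60 + np.asarray(rng.minute)

# Monday-Thursday: keep. Friday: keep up to the cut-off.
# Saturday: drop. Sunday: keep strictly after the restart time.
keep = (
    (dow < 4)
    | ((dow == 4) & (minutes <= end_minutes))
    | ((dow == 6) & (minutes > start_minutes))
)

filtered = rng[keep]
print(filtered)
```

Because `keep` is a plain NumPy boolean array, indexing with it does not depend on how pandas treats an `Index` used as a mask.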

Errors:

Pandas 0.20.3

Traceback (most recent call last):
  File "PandasTest.py", line 41, in <module>
    rng = rng[rng.map(removeWeekends)]
  File "%PY_HOME%\lib\site-packages\pandas\core\indexes\datetimelike.py", line 296, in __getitem__
    result = getitem(key)
IndexError: only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices

Pandas 0.21.1

Traceback (most recent call last):
  File "PandasTest.py", line 41, in <module>
    rng = rng[rng.map(removeWeekends)]
  File "%PY_HOME%\lib\site-packages\pandas\core\indexes\datetimelike.py", line 279, in __getitem__
    key = lib.maybe_booleans_to_slice(key.view(np.uint8))
  File "%PY_HOME%\lib\site-packages\numpy\core\_internal.py", line 365, in _view_is_safe
    raise TypeError("Cannot change data-type for object array.")
TypeError: Cannot change data-type for object array.

Pandas 0.22.0

Traceback (most recent call last):
  File "PandasTest.py", line 41, in <module>
    rng = rng[rng.map(removeWeekends)]
  File "%PY_HOME%\lib\site-packages\pandas\core\indexes\datetimelike.py", line 279, in __getitem__
    key = lib.maybe_booleans_to_slice(key.view(np.uint8))
  File "%PY_HOME%\lib\site-packages\numpy\core\_internal.py", line 365, in _view_is_safe
    raise TypeError("Cannot change data-type for object array.")
TypeError: Cannot change data-type for object array.

Pandas 0.23.4

Traceback (most recent call last):
  File "PandasTest.py", line 41, in <module>
    rng = rng[rng.map(removeWeekends)]
  File "%PY_HOME%\lib\site-packages\pandas\core\indexes\datetimelike.py", line 411, in __getitem__
    key = lib.maybe_booleans_to_slice(key.view(np.uint8))
  File "%PY_HOME%\lib\site-packages\numpy\core\_internal.py", line 365, in _view_is_safe
    raise TypeError("Cannot change data-type for object array.")
TypeError: Cannot change data-type for object array.
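
The last three tracebacks all fail inside `key.view(np.uint8)`. The NumPy-only sketch below reproduces that kind of failure in isolation. It assumes the mask reached that line as an object-dtype array, which the error message suggests but the report does not confirm. The names `bool_key`, `obj_key`, and `view_error` are illustrative only.

```python
import numpy as np

# A genuine boolean array can be reinterpreted byte for byte as uint8.
bool_key = np.array([True, False])
as_uint8 = bool_key.view(np.uint8)

# The same values stored as Python objects cannot be reinterpreted,
# so NumPy refuses the view and raises TypeError.
obj_key = np.array([True, False], dtype=object)
try:
    obj_key.view(np.uint8)
    view_error = None
except TypeError as exc:
    view_error = exc

print(as_uint8)
print(type(view_error).__name__, view_error)
```

The exact wording of the TypeError message may differ between NumPy releases.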

@mroeschke
Member

mroeschke commented Aug 29, 2018

Here's a similar reproducible example:

In [27]: pd.__version__
Out[27]: '0.24.0.dev0+523.gbf6763458'

In [28]: dti = pd.DatetimeIndex(['2012-01-01'])

# Interestingly works for True
In [29]: dti[pd.Index([True])]
Out[29]: DatetimeIndex(['2012-01-01'], dtype='datetime64[ns]', freq=None)

In [30]: dti[pd.Index([False])]
TypeError: Cannot change data-type for object array.

Investigations and PRs welcome!

@mroeschke changed the title from "Regression: DateTimeIndex Map Function Returns numpy Error" to "BUG: DatetimeIndex.__getitem__ with boolean mask with False raises TypeError" on Aug 29, 2018
@mroeschke added the Bug, Datetime (Datetime data dtype), and Indexing (Related to indexing on series/frames, not to indexes themselves) labels on Aug 29, 2018
@mroeschke changed the title from "BUG: DatetimeIndex.__getitem__ with boolean mask with False raises TypeError" to "BUG: DatetimeIndex.__getitem__ with boolean Index mask with False raises TypeError" on Aug 29, 2018
@mroeschke
Member

Note that this works if you convert the Index returned by map to a NumPy boolean array:

In [45]: rng[rng.map(removeWeekends).values.astype(bool)]
Out[45]:
DatetimeIndex(['2018-06-03 17:15:00-04:00', '2018-06-03 17:30:00-04:00',
               '2018-06-03 17:45:00-04:00', '2018-06-03 18:00:00-04:00',
               '2018-06-03 18:15:00-04:00', '2018-06-03 18:30:00-04:00',
               '2018-06-03 18:45:00-04:00', '2018-06-03 19:00:00-04:00',
               '2018-06-03 19:15:00-04:00', '2018-06-03 19:30:00-04:00',
               ...
               '2018-07-31 08:00:00-04:00', '2018-07-31 08:15:00-04:00',
               '2018-07-31 08:30:00-04:00', '2018-07-31 08:45:00-04:00',
               '2018-07-31 09:00:00-04:00', '2018-07-31 09:15:00-04:00',
               '2018-07-31 09:30:00-04:00', '2018-07-31 09:45:00-04:00',
               '2018-07-31 10:00:00-04:00', '2018-07-31 10:15:00-04:00'],
              dtype='datetime64[ns, America/New_York]', length=3941, freq=None)
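
The workaround above should carry over to any mask held in an `Index`. Here is a small self-contained sketch of the pattern, using a hypothetical `is_weekday` predicate and a one-week daily range in place of the reporter's data. `np.asarray(..., dtype=bool)` is used here as an equivalent of `.values.astype(bool)`.

```python
import numpy as np
import pandas as pd

# One week of daily timestamps, Friday 1 June to Thursday 7 June 2018.
rng = pd.date_range("2018-06-01", periods=7, freq="D")


def is_weekday(ts):
    """Hypothetical predicate: True for Monday through Friday."""
    return ts.dayofweek < 5


# map() returns an Index of booleans; its dtype depends on the pandas version.
mask_index = rng.map(is_weekday)

# Converting to a NumPy boolean array gives a mask that plain
# boolean indexing accepts regardless of the Index dtype.
mask_array = np.asarray(mask_index, dtype=bool)

filtered = rng[mask_array]
print(filtered)
```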

@sinhrks
Member

sinhrks commented Sep 27, 2018

This also affects other dtypes. I will send a fix.

pd.Index([1])[pd.Index([False])]
# IndexError: arrays used as indices must be of integer (or boolean) type
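
On a pandas release that includes the fix linked above, one way to check the behaviour across dtypes is to compare selection with a boolean `Index` mask against selection with the equivalent NumPy boolean array. This is only a sketch: the helper `check_boolean_index_mask` and the chosen cases are hypothetical and are not taken from the pandas test suite.

```python
import numpy as np
import pandas as pd


def check_boolean_index_mask(index, mask):
    """Hypothetical helper: does an Index mask select like an ndarray mask?"""
    expected = index[np.array(mask, dtype=bool)]
    result = index[pd.Index(mask)]
    return result.equals(expected)


# A few index dtypes, each tried with a mixed mask and an all-False mask.
cases = [
    pd.Index([1, 2]),
    pd.Index(["a", "b"]),
    pd.DatetimeIndex(["2012-01-01", "2012-01-02"]),
]
masks = [[False, True], [False, False]]

results = {
    (type(idx).__name__, str(idx.dtype), tuple(mask)): check_boolean_index_mask(idx, mask)
    for idx in cases
    for mask in masks
}

for case, ok in results.items():
    print(case, ok)
```

On an affected version, the calls with a `False` entry would be expected to raise instead of returning, as in the examples in this thread.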
