BUG: error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex #11679

Closed
lopezco opened this issue Nov 23, 2015 · 7 comments

@lopezco
commented Nov 23, 2015

Additional example / repro in #13908

Hello,

I ran into a problem using the data provided here: data.txt

When I tried to use .loc on my DateTimeIndex, I got the following error:

KeyError: "['2015-03-01T02:00:00.000000000+0100'] not in index"

Note that the error occurs only if the DateTimeIndex contains a DST transition time.

While debugging I've found that the error comes from the _convert_to_indexer method at:

# indexing.py
mask = check == -1
if mask.any():
    raise KeyError('%s not in index' % objarr[mask])  # <-- error raised here
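
The check above can be mimicked with the public Index.get_indexer method, which returns -1 for every key it cannot find. This is only a sketch of the idea, not the actual pandas internals; the helper name find_missing and the sample timestamps are made up for illustration.

```python
import pandas as pd


def find_missing(index, keys):
    """Return the keys that are absent from index (hypothetical helper)."""
    keys = pd.Index(keys)
    positions = index.get_indexer(keys)  # -1 marks a key that was not found
    mask = positions == -1
    return keys[mask]


# Assumed sample data: three hourly UTC timestamps
index = pd.date_range("2015-03-01", periods=3, freq="60min", tz="UTC")
# One key that exists and one that falls between two index entries
keys = [index[0], index[0] + pd.Timedelta(minutes=30)]

missing = find_missing(index, keys)
if len(missing):
    print("%s not in index" % list(missing))
```

A key only counts as found when it matches an index entry exactly, so any key that does not match its original entry exactly would end up in the mask.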

Here is a copy-paste example using the attached data.txt:

import pandas as pd

data = pd.read_csv("data.txt", parse_dates={'time': ['Date']})
data.set_index('time', inplace=True)
data.index = data.index.tz_localize('Europe/Paris', ambiguous='infer').tz_convert('UTC')

# Assign through ever-longer index slices until .loc raises
for i in range(1, len(data)):
    try:
        data.loc[data.index[:i], 'value'] = -1
    except Exception as e:
        print i, e
        break
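
For readers without data.txt, here is a minimal sketch of the same loop on synthetic data. The date range, the hourly frequency and the column values are assumptions, not the reporter's data, so it may not hit exactly the same code path. It is written for Python 3 and a recent pandas, where the loop should complete without a KeyError.

```python
import pandas as pd

# Synthetic stand-in for data.txt (assumption): hourly Europe/Paris times
# spanning the 2015-03-29 spring-forward transition, converted to UTC.
paris = pd.date_range("2015-03-28 22:00", periods=8, freq="60min", tz="Europe/Paris")
data = pd.DataFrame({"value": range(len(paris))}, index=paris.tz_convert("UTC"))
data.index.name = "time"

failures = []
for i in range(1, len(data)):
    try:
        # Same pattern as the report: assign through a slice of the index itself
        data.loc[data.index[:i], "value"] = -1
    except KeyError as e:
        failures.append((i, e))
        break

print(failures if failures else "no KeyError raised")
```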

Another important fact: the error occurs with pandas versions 0.17.0 and 0.17.1. However, I managed to make it work with an older version of pandas (0.16.2).

I hope this is explicit enough.

UPDATE

In[19]: pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.19.0-33-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 7.1.2
setuptools: 18.5
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.1
statsmodels: None
IPython: 4.0.0
sphinx: None
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: None
tables: None
numexpr: 2.4.6
matplotlib: 1.5.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None
@jreback

Contributor

commented Nov 23, 2015

pls show a copy-pastable example, and pd.show_versions()

@lopezco lopezco changed the title from "error in in _convert_to_indexer while using .loc for DST changing times (with tz-aware DateTimeIndex)" to "error in in _convert_to_indexer while using .loc with a DataFrame tha has DST changing times in its DateTimeIndex" Nov 23, 2015

@briandavidgreen

commented Nov 23, 2015

I'm experiencing a similar issue, specifically when using a DateTimeIndex as a major_axis in a panel. Somewhere inside the pandas indexing, the dtype of a time-zone aware DateTimeIndex is being stripped away. Copy-paste example:
import pandas as pd

##########################################################################################
#Works
axis_0 = ['A','B','C']
axis_1 = pd.DatetimeIndex(start='2015-11-04',end='2015-11-05',freq='120T')
axis_2 = ['d','e','f']

normal_panel = pd.Panel(items = axis_0, major_axis = axis_1, minor_axis = axis_2)

normal_panel.ix[:,:,'d'] = 1.
normal_panel.ix[:,:,'e'] = normal_panel.ix[:,:,'d']

print normal_panel.ix[:,:,'e']

##########################################################################################
#Doesn't work
axis_0 = ['A','B','C']
axis_1 = pd.DatetimeIndex(start='2015-11-04',end='2015-11-05',freq='120T',tz='US/Central')
axis_2 = ['d','e','f']

tz_panel = pd.Panel(items = axis_0, major_axis = axis_1, minor_axis = axis_2)

tz_panel.ix[:,:,'d'] = 1.
tz_panel.ix[:,:,'e'] = tz_panel.ix[:,:,'d']

print tz_panel.ix[:,:,'e']
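
To illustrate what "stripped away" can look like, here is a small sketch using only the index from the example above. It assumes the time zone is lost by going through the raw numpy values, which is one plausible route and not a confirmed diagnosis of where pandas drops it. The variable names raw, stripped and restored are made up for illustration.

```python
import pandas as pd

axis_1 = pd.date_range("2015-11-04", "2015-11-05", freq="120min", tz="US/Central")

raw = axis_1.values               # plain numpy datetime64 array, stored as UTC
stripped = pd.DatetimeIndex(raw)  # rebuilt from raw values: tz information is gone
restored = stripped.tz_localize("UTC").tz_convert("US/Central")

print(axis_1.dtype, raw.dtype, stripped.tz)
```

Once the index has been reduced to naive values like this, its entries no longer carry the US/Central offset unless the time zone is reattached explicitly, as done for restored.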

@briandavidgreen

commented Nov 23, 2015

INSTALLED VERSIONS

commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None

pandas: 0.17.0
nose: 1.3.7
pip: 7.1.2
setuptools: 18.4
Cython: 0.23.4
numpy: 1.10.1
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 4.0.0
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.4.3
openpyxl: 2.2.6
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.7.7
lxml: 3.4.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.9
pymysql: None
psycopg2: None

@briandavidgreen

commented Nov 23, 2015

Similarly, I am able to reproduce joseRLC's issue when slicing on the major_axis:

tz_panel.ix[:,tz_panel.major_axis[0:1],'d'] = tz_panel.ix[:,tz_panel.major_axis[0:1],'e']

KeyError Traceback (most recent call last)
in ()
----> 1 tz_panel.ix[:,tz_panel.major_axis[0:1],'d'] = tz_panel.ix[:,tz_panel.major_axis[0:1],'e']

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in __setitem__(self, key, value)
112
113 def __setitem__(self, key, value):
--> 114 indexer = self._get_setitem_indexer(key)
115 self._setitem_with_indexer(indexer, value)
116

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in _get_setitem_indexer(self, key)
104
105 if isinstance(key, tuple) and not self.ndim < len(key):
--> 106 return self._convert_tuple(key, is_setter=True)
107
108 try:

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in _convert_tuple(self, key, is_setter)
153 else:
154 for i, k in enumerate(key):
--> 155 idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
156 keyidx.append(idx)
157 return tuple(keyidx)

C:\Anaconda2\lib\site-packages\pandas\core\indexing.pyc in _convert_to_indexer(self, obj, axis, is_setter)
1119 mask = check == -1
1120 if mask.any():
-> 1121 raise KeyError('%s not in index' % objarr[mask])
1122
1123 return _values_from_object(indexer)

KeyError: "['2015-11-04T00:00:00.000000000-0600'] not in index"
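
Panel and .ix have been removed from later pandas releases, so the snippet above cannot be run there as written. A rough equivalent for checking the same slice assignment is sketched below with a single DataFrame standing in for one Panel item; that substitution is an assumption, and it does not exercise the Panel code path from the traceback.

```python
import pandas as pd

# Assumption: one DataFrame plays the role of a single Panel item (e.g. 'A')
major_axis = pd.date_range("2015-11-04", "2015-11-05", freq="120min", tz="US/Central")
frame = pd.DataFrame(0.0, index=major_axis, columns=["d", "e", "f"])
frame["e"] = 1.0

# Same pattern as the failing line: select rows with a slice of the tz-aware index
first = frame.index[0:1]
frame.loc[first, "d"] = frame.loc[first, "e"]

print(frame.head(3))
```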

@lopezco lopezco changed the title from "error in in _convert_to_indexer while using .loc with a DataFrame tha has DST changing times in its DateTimeIndex" to "error in in _convert_to_indexer while using .loc with DateTimeIndex" Nov 24, 2015

@briandavidgreen

commented Nov 24, 2015

@joseRLC the issue title should probably say "...with tz_aware DateTimeIndex". The issue does not occur for non-tz_aware indices.

@lopezco lopezco changed the title from "error in in _convert_to_indexer while using .loc with DateTimeIndex" to "error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex" Nov 24, 2015

@lopezco

Author

commented Nov 24, 2015

@briandavidgreen you're right. Sorry!

@jreback jreback added this to the 0.18.1 milestone Mar 1, 2016

@jreback jreback modified the milestones: 0.18.2, 0.18.1 Apr 25, 2016

@jreback jreback added the Indexing label Aug 4, 2016

@jreback jreback changed the title from "error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex" to "BUG: error in in _convert_to_indexer while using .loc with tz-aware DateTimeIndex" Aug 4, 2016

@jorisvandenbossche jorisvandenbossche modified the milestones: Next Major Release, 0.19.0 Aug 21, 2016

@jreback

Contributor

commented Jun 28, 2017

Originally I linked #13908, which has an example, to this one. It seems fixed on master, so if someone wants to check whether this issue is closable as well, that would be great.

@mroeschke mroeschke referenced this issue Jun 23, 2018

Merged

TST: Clean old timezone issues PT2 #21612

10 of 10 tasks complete

@jreback jreback modified the milestones: Next Major Release, 0.24.0 Jun 25, 2018
