BUG: The query method does not support index or column named datetime #35595

Open
esse-byte opened this issue Aug 7, 2020 · 3 comments
Labels
Bug expressions pd.eval, query

Comments

@esse-byte

In [1]: import pandas as pd
In [2]: index = pd.MultiIndex.from_product([['T0', 'T1'], ['A', 'B']], names=['datetime', 'count'])
In [3]: frame = pd.DataFrame({'isnull': range(len(index))}, index=index)
In [4]: frame.query('count == "A"') # works
Out[4]: 
                isnull
datetime count        
T0       A           0
T1       A           2

In [5]: frame.query('isnull < 2') # works
Out[5]: 
                isnull
datetime count        
T0       A           0
         B           1

In [6]: frame.query('datetime == "T0"') # fails with KeyError
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
D:\home\tools\Anaconda\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

...

D:\home\tools\Anaconda\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: False

In [7]: index2 = pd.MultiIndex.from_product([pd.date_range('20200101', '20200202'), ['A', 'B']], names=['datetime', 'count'])
In [8]: frame2 = pd.DataFrame({'isnull': range(len(index2))}, index=index2)
In [9]: frame2.query('datetime == "2020-01-01"') # fails with TypeError
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
D:\home\tools\Anaconda\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:

...

D:\home\tools\Anaconda\lib\site-packages\pandas\core\indexes\datetimes.py in get_loc(self, key, method, tolerance)
    721 
    722             try:
--> 723                 stamp = Timestamp(key)
    724                 if stamp.tzinfo is not None and self.tz is not None:
    725                     stamp = stamp.tz_convert(self.tz)

pandas\_libs\tslibs\timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()
pandas\_libs\tslibs\conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()
TypeError: Cannot convert input [False] of type <class 'numpy.bool_'> to Timestamp

Problem description

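The query method fails when the query expression refers to an index level named datetime. With string values in that level it raises KeyError: False; with timestamp values it raises TypeError: Cannot convert input [False] of type <class 'numpy.bool_'> to Timestamp.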

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.8.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 10, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Chinese (Simplified)_China.936

pandas : 1.0.5
numpy : 1.18.5
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 49.2.0.post20200714
Cython : None
pytest : None
hypothesis : None
sphinx : 3.1.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.16.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.2.2
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : None
tables : 3.6.1
tabulate : 0.8.3
xarray : None
xlrd : 1.2.0
xlwt : None
xlsxwriter : None
numba : 0.50.1

esse-byte added the Bug and Needs Triage labels Aug 7, 2020
@samukweku
Contributor

Does it work if you set the engine argument to python?

@esse-byte
Author

> Does it work if you set the engine argument to python?

It does not work either.
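
For anyone who wants to check this on their own installation, here is a minimal sketch that runs the failing query from the report under both engines and records what happens. The `outcomes` dictionary is only an illustrative helper, and the result may differ between pandas versions or when numexpr is not installed, so no particular output is assumed.

```python
import pandas as pd

# Same frame as in the report: an index level named "datetime".
index = pd.MultiIndex.from_product(
    [["T0", "T1"], ["A", "B"]], names=["datetime", "count"]
)
frame = pd.DataFrame({"isnull": range(len(index))}, index=index)

# Try the query with each engine and record success or the exception type.
outcomes = {}
for engine in ("numexpr", "python"):
    try:
        frame.query('datetime == "T0"', engine=engine)
        outcomes[engine] = "ok"
    except Exception as exc:  # broad on purpose: we only want to report it
        outcomes[engine] = type(exc).__name__

print(outcomes)
```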

@samukweku
Contributor

My guess is that query is confusing the name with Python's datetime module. If you use a different name for the level, it works. Selecting the rows directly with frame[frame.index.get_level_values("datetime") == "T0"] also works.
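
The two workarounds mentioned above can be sketched as follows. This is only an illustration built on the frame from the report; the temporary level name `ts` is an arbitrary choice, not something pandas requires.

```python
import pandas as pd

# Same frame as in the report: an index level named "datetime".
index = pd.MultiIndex.from_product(
    [["T0", "T1"], ["A", "B"]], names=["datetime", "count"]
)
frame = pd.DataFrame({"isnull": range(len(index))}, index=index)

# Workaround 1: build a boolean mask from the index level, bypassing query.
mask = frame.index.get_level_values("datetime") == "T0"
by_mask = frame[mask]

# Workaround 2: temporarily rename the level ("ts" is a hypothetical name),
# run the query against the new name, then restore the original name.
renamed = frame.rename_axis(index={"datetime": "ts"})
by_query = renamed.query('ts == "T0"').rename_axis(index={"ts": "datetime"})

print(by_mask)
print(by_query)
```

Both approaches select the T0 rows without ever passing the name datetime to the expression parser.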

jbrockmendel added the expressions (pd.eval, query) label and removed the Needs Triage label Sep 2, 2020
jbrockmendel changed the title from "The query method does not support index or column named datetime" to "BUG: The query method does not support index or column named datetime" Sep 2, 2020