BUG: pytables index expressions fail in Python 3.9 #37217

Closed
rebecca-palmer opened this issue Oct 18, 2020 · 4 comments
Labels: Bug, Closing Candidate, IO HDF5, Python 3.9


@rebecca-palmer
Contributor

  • [x] I have checked that this issue has not already been reported. (Checked here; it has already been reported in Debian.)

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [x] (optional) I have confirmed this bug exists on the master branch of pandas. (Though with a mix of Debian and pip dependencies, as pip doesn't have them all for 3.9 yet.)


Code Sample, a copy-pastable example

The tests in tests/io/pytables/test_store.py, or

import pandas as pd
from pandas.io.pytables import HDFStore

s1 = HDFStore("tmp1.h5", "w")
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["A", "B", "C"])
s1.append("d1", df, data_columns=["B"])
# The subscript df.index[0] inside the where string triggers the failure on 3.9.
df2 = s1.select("d1", "index>df.index[0]")
print(type(df2.index[0]))

Problem description

In Python 3.9, HDFStore.select fails if its where expression contains an index expression (a subscript such as df.index[0]), with a traceback like this:

self = <DatetimeArray>
['2000-01-03 00:00:00', '2000-01-04 00:00:00', '2000-01-05 00:00:00',
 '2000-01-06 00:00:00', '2000-01...2-08 00:00:00',
 '2000-02-09 00:00:00', '2000-02-10 00:00:00', '2000-02-11 00:00:00']
Length: 30, dtype: datetime64[ns]
key = 4

    def __getitem__(self, key):
        if lib.is_integer(key):
            # fast-path
            result = self._ndarray[key]
            if self.ndim == 1:
                return self._box_func(result)
            return self._from_backing_data(result)
    
        key = extract_array(key, extract_numpy=True)
        key = check_array_indexer(self, key)
>       result = self._ndarray[key]
E       IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

pandas/core/arrays/_mixins.py:200: IndexError

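The debugger shows that key is a pandas.core.computation.pytables.Constant, whereas in Python 3.8 (where this works) it is a plain int. The underlying cause may be that Python 3.9 replaced ast.Index with bare values in subscript nodes, so the extra unwrapping step that ast.Index used to trigger no longer happens.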
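The AST change suspected above can be checked with the standard library alone. The following is a minimal sketch, not pandas code: the helper name subscript_index_node is hypothetical, and it assumes Python 3.8 or newer. It compares the class name instead of referencing ast.Index directly, because that class is deprecated from 3.9 onward.

```python
import ast


def subscript_index_node(source):
    """Return the node that holds the index of a subscript expression.

    Hypothetical helper for illustration; not part of pandas.
    """
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Subscript):
        raise ValueError("expected a subscript expression")
    slice_node = node.slice
    # Python 3.8 wraps a simple index in an ast.Index node; 3.9+ stores
    # the value node directly in Subscript.slice.
    if type(slice_node).__name__ == "Index":
        return slice_node.value
    return slice_node


index_node = subscript_index_node("df.index[0]")
print(type(index_node).__name__, index_node.value)
```

On 3.8 the helper goes through the Index branch and on 3.9 it does not, which is consistent with a visitor that relied on visiting ast.Index no longer getting a chance to unwrap the value.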

The CI may have missed this because it skips optional dependencies on 3.9 (to avoid having to build them).

Possible fix

Warning: not fully tested.

--- a/pandas/core/computation/pytables.py
+++ b/pandas/core/computation/pytables.py

@@ -429,6 +429,10 @@ class PyTablesExprVisitor(BaseExprVisito
             value = value.value
         except AttributeError:
             pass
+        try:
+            slobj = slobj.value
+        except AttributeError:
+            pass
 
         try:
             return self.const_type(value[slobj], self.env)
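The idea behind the patch can be sketched in isolation: unwrap the subscript the same way the value is already unwrapped, before indexing. This is an illustrative toy under stated assumptions, not the pandas implementation; the Constant class here is a hypothetical stand-in for pandas.core.computation.pytables.Constant, and unwrap and subscript are made-up names.

```python
class Constant:
    """Hypothetical stand-in for a wrapped constant term (not the pandas class)."""

    def __init__(self, value):
        self.value = value


def unwrap(obj):
    """Return obj.value if obj is a wrapper, otherwise obj unchanged."""
    try:
        return obj.value
    except AttributeError:
        return obj


def subscript(value, slobj):
    """Index value by slobj, unwrapping both operands first."""
    return unwrap(value)[unwrap(slobj)]


# Works whether or not either operand is wrapped.
print(subscript(Constant([10, 20, 30]), Constant(1)))
print(subscript([10, 20, 30], 1))
```

The try/except form mirrors the style of the surrounding code in the diff; an explicit isinstance check would be an alternative if only one wrapper type needs handling.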

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.9.0.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.0-11-amd64
Version : #1 SMP Debian 4.19.146-1 (2020-09-17)
machine : x86_64
processor :
byteorder : little
LC_ALL : C
LANG : C
LOCALE : None.None

pandas : 0+unknown
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.1.1
setuptools : 50.3.0
Cython : 0.29.21
pytest : 6.1.1
hypothesis : 5.32.1
sphinx : 3.2.1
blosc : 1.9.2
feather : None
xlsxwriter : 1.1.2
lxml.etree : 4.5.2
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.18.1
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.2.1
fsspec : 0.8.4
fastparquet : None
gcsfs : 0.7.1
matplotlib : 3.3.2
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : 0.5.1
scipy : 1.5.2
sqlalchemy : 1.3.19
tables : 3.6.1
tabulate : 0.8.7
xarray : None
xlrd : 1.1.0
xlwt : 1.3.0
numba : None

@rebecca-palmer added the Bug and Needs Triage labels Oct 18, 2020
@fangchenli added the IO HDF5 and Python 3.9 labels and removed the Needs Triage label Oct 18, 2020
raspbian-autopush pushed a commit to raspbian-packages/pandas that referenced this issue Oct 26, 2020
ast.Index has been replaced by a bare value, so we need to do the
conversion from Constant to int

Author: Rebecca N. Palmer <rebecca_palmer@zoho.com>
Bug-Debian: https://bugs.debian.org/972015
Forwarded: pandas-dev/pandas#37217


Gbp-Pq: Name python39_compat.patch
raspbian-autopush pushed a commit to raspbian-packages/pandas that referenced this issue Nov 7, 2020
@simonjayhawkins
Member

This might have been fixed by #38041. cc @jbrockmendel ?

@simonjayhawkins added this to the 1.1.5 milestone Nov 25, 2020
@jbrockmendel
Member

I expect so; that's the same exception message I was getting locally.

@simonjayhawkins
Member

> The tests in tests/io/pytables/test_store.py, or

Since pytables was added to the test environment in #38041 and the tests are passing, I don't think we need to add any tests to close this issue.

@simonjayhawkins added the Closing Candidate label Nov 25, 2020
@jreback
Contributor

jreback commented Nov 28, 2020

closed via #38061

@jreback closed this as completed Nov 28, 2020
raspbian-autopush pushed a commit to raspbian-packages/pandas that referenced this issue Dec 11, 2020
raspbian-autopush pushed a commit to raspbian-packages/pandas that referenced this issue Dec 20, 2020
raspbian-autopush pushed a commit to raspbian-packages/pandas that referenced this issue Jan 15, 2021