Unspecific error message when setting singular index with np dtype #33017

Closed
JoElfner opened this issue Mar 25, 2020 · 1 comment · Fixed by #33026
Labels
Bug Constructors Series/DataFrame/Index/pd.array Constructors Error Reporting Incorrect or improved errors from pandas Index Related to the Index class or subclasses
@JoElfner
Contributor

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

some_np_array = np.array([43, 56])

# Unspecific error message 'TypeError: len() of unsized object'
# (the index here is a NumPy scalar, not a Python int):
pd.DataFrame([4324, 345], index=some_np_array[0])
pd.Series([4324, 345], index=some_np_array[0])

# Specific error message 'TypeError: Index(...) must be called with a collection of some kind, 5 was passed':
pd.DataFrame([4324, 345], index=5)
pd.Series([4324, 345], index=5)

Problem description

When a non-collection value such as a plain Python int is passed as an index, pandas raises a helpful error message of the following form:
TypeError: Index(...) must be called with a collection of some kind, 5 was passed

When the non-collection value is a NumPy scalar instead (for example an element taken from an array), pandas raises an unspecific and misleading error:
TypeError: len() of unsized object

Expected Output

The same specific error message, even when the single value passed as the index has a NumPy type such as np.float64 or np.int64.
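
One way to picture the expected behavior is a check that rejects anything zero-dimensional before converting it to an array. This is only a sketch: `ensure_collection` is a hypothetical helper written for this illustration, it is not part of pandas, and it is not meant to represent the change made in the linked fix.

```python
import numpy as np


def ensure_collection(data):
    """Hypothetical helper (not pandas API): reject scalars with a specific message."""
    # np.ndim() is 0 for Python scalars and for NumPy scalars alike,
    # so both kinds of input take the same error path.
    if np.ndim(data) == 0:
        raise TypeError(
            "Index(...) must be called with a collection of some kind, "
            f"{data!r} was passed"
        )
    return np.asarray(data)


some_np_array = np.array([43, 56])

# Both a Python int and a NumPy scalar are rejected with the same wording.
for bad_index in (5, some_np_array[0]):
    try:
        ensure_collection(bad_index)
    except TypeError as exc:
        print(exc)

# A real collection passes through unchanged.
print(ensure_collection([43, 56]))
```

Checking dimensionality first means the outcome no longer depends on whether the scalar happens to expose `__array__`.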

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.7.final.0
python-bits : 64
OS : Windows
OS-release : 7
machine : AMD64
processor : Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder : little
LC_ALL : None
LANG : en
LOCALE : None.None

pandas : 1.0.3
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.1.1.post20200323
Cython : 0.29.15
pytest : 5.4.1
hypothesis : 5.5.4
sphinx : 2.4.0
blosc : None
feather : None
xlsxwriter : 1.2.8
lxml.etree : 4.5.0
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.2
fastparquet : None
gcsfs : None
lxml.etree : 4.5.0
matplotlib : 3.1.3
numexpr : 2.7.1
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.1
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.15
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.8
numba : 0.48.0

@dsaxton
Member

dsaxton commented Mar 25, 2020

I suppose it could raise a better error message (one that likely shouldn't change just because the value happens to come from inside a numpy array). The current behavior arises because even a scalar such as np.array([1, 2])[0] has an __array__ attribute, so the scalar check gets skipped: https://github.com/pandas-dev/pandas/blob/master/pandas/core/indexes/base.py#L398
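
The mechanism described in this comment can be reproduced with NumPy alone. The snippet below is a minimal sketch and does not call any pandas internals; it only shows that a NumPy scalar exposes `__array__`, that converting it yields a zero-dimensional array, and that `len()` on such an array raises the unspecific error from the report.

```python
import numpy as np

value = np.array([43, 56])[0]  # a NumPy integer scalar, not a Python int

# A NumPy scalar has __array__, while a plain Python int does not.
numpy_scalar_has_array = hasattr(value, "__array__")
python_int_has_array = hasattr(5, "__array__")

# Converting the scalar gives a zero-dimensional array ...
as_array = np.asarray(value)
ndim = as_array.ndim

# ... and len() of a zero-dimensional array is what fails.
try:
    len(as_array)
    message = None
except TypeError as exc:
    message = str(exc)

print(numpy_scalar_has_array, python_int_has_array, ndim, message)
```

Any code path that branches on `__array__` before testing for scalars will therefore send NumPy scalars down the array route, where the first `len()` call fails.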

@jreback jreback added Bug Constructors Series/DataFrame/Index/pd.array Constructors Error Reporting Incorrect or improved errors from pandas Index Related to the Index class or subclasses labels Mar 26, 2020
@jreback jreback added this to the 1.1 milestone Mar 26, 2020