Too generic Exception in multindex reindex raise statement #21770

Closed
glyg opened this issue Jul 6, 2018 · 4 comments · Fixed by #21780
Labels: Error Reporting, Indexing, MultiIndex

glyg (Contributor) commented Jul 6, 2018

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (1, 1), (2, 2)])
a = pd.Series(np.arange(4), index=idx)
new_idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (2, 2)])
a.reindex(new_idx)
(...)  
 2094                 else:
-> 2095                     raise Exception("cannot handle a non-unique multi-index!")
   2096 
   2097         if not isinstance(target, MultiIndex):

Exception: cannot handle a non-unique multi-index!

Problem description

class MultiIndex(Index):
    (...)
    def reindex(self, target, method=None, level=None, limit=None,
        tolerance=None):
        (...)
                else:
                    raise Exception("cannot handle a non-unique multi-index!")

In MultiIndex.reindex, this exception is too generic and therefore hard to catch selectively. I think it should be an IndexError.
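
To illustrate why a bare Exception is awkward for callers, here is a minimal sketch in plain Python. The functions `reindex_generic`, `reindex_specific`, `handle_generic` and `handle_specific` are hypothetical stand-ins written for this example, not pandas code; they only reproduce the duplicate-label check and the error message quoted above.

```python
# Hypothetical stand-ins for illustration; neither function is part of pandas.
def reindex_generic(labels):
    """Mimic the reported behaviour: a bare Exception on duplicate labels."""
    if len(set(labels)) != len(labels):
        raise Exception("cannot handle a non-unique multi-index!")
    return list(labels)


def reindex_specific(labels):
    """Same check, but raising the narrower type proposed in this report."""
    if len(set(labels)) != len(labels):
        raise IndexError("cannot handle a non-unique multi-index!")
    return list(labels)


def handle_generic(labels):
    """The caller has to catch everything and inspect the message text."""
    try:
        return reindex_generic(labels)
    except Exception as err:
        if "non-unique" not in str(err):
            raise  # unrelated failure: let it propagate
        return None


def handle_specific(labels):
    """The caller can name the exception type directly."""
    try:
        return reindex_specific(labels)
    except IndexError:
        return None


duplicated = [(0, 0), (1, 1), (1, 1), (2, 2)]
generic_result = handle_generic(duplicated)
specific_result = handle_specific(duplicated)
```

Both handlers return None for the duplicated labels, but only the second does so without matching on the message string, which would break silently if the wording ever changed.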

Expected Output

raise IndexError("cannot handle a non-unique multi-index!")

I can do a very minimal PR to fix this if needed.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-24-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8

pandas: 0.23.1
pytest: 3.6.2
pip: 10.0.1
setuptools: 39.2.0
Cython: 0.28.3
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: 3.4.4
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

gfyoung added the Indexing and Error Reporting labels Jul 6, 2018

gfyoung (Member) commented Jul 6, 2018

I might go with NotImplementedError (unless there's a reason this functionality shouldn't be supported at some point when a solid implementation is developed).

glyg (Contributor, Author) commented Jul 6, 2018

Digging a bit deeper, an Exception is also raised for the same reason in get_indexer. As I understand the matter, you can't reindex from a duplicate index because there's no way to choose which of the identically indexed rows the new index should pick.
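
As a sketch of that ambiguity, the snippet below reuses the data from the code sample, checks `Index.is_unique`, and resolves the duplicates explicitly before reindexing. Keeping the first row per label is an assumption made for this example; it is one possible policy, not something pandas chooses on the caller's behalf.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (1, 1), (2, 2)])
a = pd.Series(np.arange(4), index=idx)
new_idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (2, 2)])

# (1, 1) labels two rows, so the index is not unique.
has_duplicates = not a.index.is_unique

# Assumption: keeping the first row for each repeated label is acceptable.
deduplicated = a[~a.index.duplicated(keep="first")]

# With a unique source index, reindexing is well defined.
result = deduplicated.reindex(new_idx)
```

Passing `keep="last"` instead would select the other row labelled (1, 1), which is exactly the decision reindex cannot make by itself.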

glyg (Contributor, Author) commented Jul 6, 2018

Same situation with a simple Index raises a ValueError:

idx = pd.Index([0, 1, 1, 2])
a = pd.Series(np.arange(4), index=idx)
new_idx = pd.Index([0, 1, 2])
a.reindex(new_idx)
~/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
   3557         # trying to reindex on an axis with duplicates
   3558         if not self.is_unique and len(indexer):
-> 3559             raise ValueError("cannot reindex from a duplicate axis")
   3560 
   3561     def reindex(self, target, method=None, level=None, limit=None,

ValueError: cannot reindex from a duplicate axis

So maybe ValueError, to be consistent?
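
A minimal sketch of what that consistency would give callers, using the flat Index case shown above. The helper `try_reindex` is hypothetical and written only for this example.

```python
import numpy as np
import pandas as pd


def try_reindex(series, new_index):
    """Hypothetical helper: return the reindexed Series, or None when
    pandas raises ValueError because the existing axis has duplicates."""
    try:
        return series.reindex(new_index)
    except ValueError:
        return None


# Flat Index with a duplicate label: reindex raises ValueError.
flat = pd.Series(np.arange(4), index=pd.Index([0, 1, 1, 2]))
flat_result = try_reindex(flat, pd.Index([0, 1, 2]))

# Unique index: reindex succeeds and the helper passes the result through.
unique = pd.Series([10, 20], index=pd.Index([0, 1]))
unique_result = try_reindex(unique, pd.Index([1, 0]))
```

With the behaviour reported in this issue, the MultiIndex case would escape this helper as a bare Exception; if it raised ValueError as well, the same `except` clause would cover both index types.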

gfyoung (Member) commented Jul 6, 2018

Sounds reasonable to me.
