Too generic Exception in MultiIndex reindex raise statement #21770

Closed
glyg opened this issue Jul 6, 2018 · 4 comments

@glyg
Contributor

commented Jul 6, 2018

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (1, 1), (2, 2)])
a = pd.Series(np.arange(4), index=idx)
new_idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (2, 2)])
a.reindex(new_idx)
(...)  
 2094                 else:
-> 2095                     raise Exception("cannot handle a non-unique multi-index!")
   2096 
   2097         if not isinstance(target, MultiIndex):

Exception: cannot handle a non-unique multi-index!

Problem description

class MultiIndex(Index):
    (...)
    def reindex(self, target, method=None, level=None, limit=None,
        tolerance=None):
        (...)
                else:
                    raise Exception("cannot handle a non-unique multi-index!")

In MultiIndex.reindex, this exception is too generic, so callers cannot easily catch it selectively. I think it should be an IndexError.
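To illustrate why the generic class is awkward for callers, here is a minimal pure-Python sketch. The functions `reindex_generic`, `reindex_specific` and `call_with_bug` are hypothetical stand-ins, not pandas code; they only mimic the raise statement above. The point is that `except Exception` also swallows unrelated errors, whereas a narrower class lets them surface.

```python
def reindex_generic(unique):
    # Hypothetical stand-in for the current raise statement.
    if not unique:
        raise Exception("cannot handle a non-unique multi-index!")
    return "reindexed"


def reindex_specific(unique):
    # Hypothetical stand-in for the proposed raise statement.
    if not unique:
        raise IndexError("cannot handle a non-unique multi-index!")
    return "reindexed"


def call_with_bug(func):
    # Simulates an unrelated programming error in the caller:
    # the required argument is forgotten, so Python raises TypeError.
    return func()


def handle_generic():
    # The only way to catch a bare Exception is to catch everything,
    # so the unrelated TypeError is silently swallowed as well.
    try:
        return call_with_bug(reindex_generic)
    except Exception as err:
        return "swallowed " + type(err).__name__


def handle_specific():
    # Only the duplicate-index error is caught; the TypeError propagates.
    try:
        return call_with_bug(reindex_specific)
    except IndexError:
        return "duplicate index"


print(handle_generic())  # swallowed TypeError
try:
    handle_specific()
except TypeError as err:
    print("bug surfaced:", type(err).__name__)  # bug surfaced: TypeError
```

With the narrower class, a handler written for the duplicate-index case no longer hides genuine bugs in the surrounding code.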

Expected Output

raise IndexError("cannot handle a non-unique multi-index!")

I can do a very minimal PR to fix this if needed.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-24-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: fr_FR.UTF-8

pandas: 0.23.1
pytest: 3.6.2
pip: 10.0.1
setuptools: 39.2.0
Cython: 0.28.3
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: 3.4.4
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@gfyoung

Member

commented Jul 6, 2018

I might go with NotImplementedError (unless there's a reason this functionality shouldn't be supported at some point when a solid implementation is developed).

gfyoung added the MultiIndex label Jul 6, 2018

@glyg

Contributor Author

commented Jul 6, 2018

Digging a bit deeper, an Exception is also raised for the same reason in get_indexer. As I understand the matter, you can't reindex from a duplicate index because there is no way to decide which of the identically indexed rows the new index should pick.
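The ambiguity can be seen on the series from the original report. This is only a sketch: dropping duplicates with `keep="first"` is one arbitrary way to resolve it, chosen here for illustration, not something the library does for you.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (1, 1), (2, 2)])
a = pd.Series(np.arange(4), index=idx)

# The label (1, 1) appears twice, so the index is not unique.
print(idx.is_unique)  # False

# Both rows share the same label; reindex has no rule for picking one.
dupes = a[a.index.duplicated(keep=False)]
print(dupes.tolist())  # [1, 2]

# Making the choice explicit (here: keep the first occurrence)
# removes the ambiguity, and reindex then succeeds.
deduped = a[~a.index.duplicated(keep="first")]
new_idx = pd.MultiIndex.from_tuples([(0, 0), (1, 1), (2, 2)])
result = deduped.reindex(new_idx)
print(result.tolist())  # [0, 1, 3]
```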

@glyg

Contributor Author

commented Jul 6, 2018

The same situation with a plain Index raises a ValueError:

idx = pd.Index([0, 1, 1, 2])
a = pd.Series(np.arange(4), index=idx)
new_idx = pd.Index([0, 1, 2])
a.reindex(new_idx)
~/anaconda3/envs/py36/lib/python3.6/site-packages/pandas/core/indexes/base.py in _can_reindex(self, indexer)
   3557         # trying to reindex on an axis with duplicates
   3558         if not self.is_unique and len(indexer):
-> 3559             raise ValueError("cannot reindex from a duplicate axis")
   3560 
   3561     def reindex(self, target, method=None, level=None, limit=None,

ValueError: cannot reindex from a duplicate axis

So maybe ValueError, to be consistent?
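A sketch of what that consistency buys the caller: a single narrow handler. `safe_reindex` is a hypothetical helper written for this comment, not a pandas function. It is shown on the plain Index case, which raises ValueError as in the traceback above; whether the MultiIndex case is caught the same way depends on which exception class MultiIndex.reindex raises.

```python
import numpy as np
import pandas as pd


def safe_reindex(series, target):
    # Hypothetical helper: return None instead of raising
    # when the source axis contains duplicate labels.
    try:
        return series.reindex(target)
    except ValueError:
        return None


# Duplicate labels on a plain Index: ValueError is caught.
flat = pd.Series(np.arange(4), index=pd.Index([0, 1, 1, 2]))
print(safe_reindex(flat, pd.Index([0, 1, 2])))  # None

# A unique index is reindexed normally.
unique = pd.Series([10, 20], index=pd.Index([0, 1]))
print(safe_reindex(unique, pd.Index([1, 0])).tolist())  # [20, 10]
```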

@gfyoung

Member

commented Jul 6, 2018

Sounds reasonable to me.
