Description
As a result of #23980, the following code raises a ValueError since 0.25:
Code Sample, a copy-pastable example if possible
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 10), (2, 12), (4, 14)])
pd.cut([5, 6], bins=ii)
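For anyone who wants to confirm the behaviour locally, here is a minimal sketch, assuming pandas 0.25 or later. It only checks that the bins are reported as overlapping and that `pd.cut` raises; the `raised` flag is just an illustrative variable.

```python
import pandas as pd

ii = pd.IntervalIndex.from_tuples([(0, 10), (2, 12), (4, 14)])

# The bins overlap, which is the condition the check added in #23980 tests for.
print(ii.is_overlapping)

try:
    pd.cut([5, 6], bins=ii)
    raised = False
except ValueError as err:
    # On pandas >= 0.25 this branch is expected to run.
    raised = True
    print(f"ValueError: {err}")
```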
Problem description
Before #23980, an IntervalIndex with overlapping intervals could be used as bins. cut would return every Interval that contains each input value, which in my view is the correct result.
In #23980 it was stated that this doesn't make sense in the context of cut. Unfortunately I missed the discussion over there (there really was none). I argue that by raising a ValueError we unnecessarily remove a valid feature: I frequently use cut as a more versatile replacement for pd.rolling when I need overlapping custom windows of unequal size.
If there is a smarter way to do this, I am happy to learn about it. Otherwise we should at least offer an option to use overlapping indices in cut. I would therefore recommend raising a warning instead of an error here:
pandas/pandas/core/reshape/tile.py
Lines 247 to 249 in 0fd888c
elif isinstance(bins, IntervalIndex):
    if bins.is_overlapping:
        raise ValueError("Overlapping IntervalIndex is not accepted.")
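To make the proposal concrete, here is a rough sketch of what a warning in place of the error could look like. `check_interval_bins` is a hypothetical standalone function written for illustration, not the actual pandas code, and the warning text and category are assumptions.

```python
import warnings

import pandas as pd


def check_interval_bins(bins):
    """Hypothetical stand-in for the overlap check in tile.py (not pandas code)."""
    if isinstance(bins, pd.IntervalIndex) and bins.is_overlapping:
        # Warn instead of raising, so callers can still use overlapping bins.
        warnings.warn(
            "Overlapping IntervalIndex passed as bins; "
            "a value may fall into several intervals.",
            UserWarning,
            stacklevel=2,
        )


ii = pd.IntervalIndex.from_tuples([(0, 10), (2, 12), (4, 14)])

# Record warnings so the behaviour can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_interval_bins(ii)
```

Users who rely on overlapping bins could then silence the warning with the standard `warnings` filters.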
Expected Output
Possibly raise a warning (I am still not sure this is necessary) and return:
[(0, 10], (2, 12], (4, 14], (0, 10], (2, 12], (4, 14]]
Categories (3, interval[int64]): [(0, 10] < (2, 12] < (4, 14]]
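Until this is resolved, one possible workaround is to compute the matches without cut. The sketch below uses NumPy broadcasting to find every interval that contains each value. `intervals_containing` is a hypothetical helper, not a pandas function, and it assumes right-closed intervals (the default for `from_tuples`).

```python
import numpy as np
import pandas as pd


def intervals_containing(values, intervals):
    """Return (value_pos, matches): for each value, every interval containing it.

    Hypothetical helper, not part of pandas. Assumes right-closed intervals.
    """
    values = np.asarray(values)
    left = intervals.left.to_numpy()
    right = intervals.right.to_numpy()
    # mask[i, j] is True when values[i] lies in (left[j], right[j]].
    mask = (values[:, None] > left) & (values[:, None] <= right)
    # np.nonzero walks the mask row by row, so matches are grouped per value.
    value_pos, interval_pos = np.nonzero(mask)
    return value_pos, intervals.take(interval_pos)


ii = pd.IntervalIndex.from_tuples([(0, 10), (2, 12), (4, 14)])
value_pos, matches = intervals_containing([5, 6], ii)
print(list(matches))
```

For the example above this yields the same six intervals, in the same order, as the expected output. Note that the mask grows with the number of values times the number of intervals, so this may not suit very large inputs.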