BUG: IntervalIndex constructor inconsistencies #18421

Closed
jschendel opened this Issue Nov 22, 2017 · 1 comment

jschendel commented Nov 22, 2017

Code Sample, a copy-pastable example if possible

  1. IntervalIndex constructor ignores closed parameter for purely NA data:
In [3]: pd.IntervalIndex([np.nan], closed='both')
Out[3]:
IntervalIndex([nan]
              closed='right',
              dtype='interval[float64]')

In [4]: pd.IntervalIndex([np.nan, np.nan], closed='neither')
Out[4]:
IntervalIndex([nan, nan]
              closed='right',
              dtype='interval[float64]')

This only occurs on master; it appears to be an over-correction resulting from the fix in #18340.


  2. IntervalIndex also ignores closed when it conflicts with how the input data is closed:
In [6]: ivs = [pd.Interval(0, 1, closed='both'), pd.Interval(10, 20, closed='both')]

In [7]: pd.IntervalIndex(ivs, closed='neither')
Out[7]:
IntervalIndex([[0, 1], [10, 20]]
              closed='both',
              dtype='interval[int64]')

The behavior above occurs on master, and is a result of #18340. Prior to that change the opposite behavior occurred: intervals were always coerced to match the closed specified by the constructor. The constructor should probably raise when the specified closed conflicts with the closed inferred from the data.


  3. Inconsistent dtype for empty IntervalIndex depending on the method of construction:
In [2]: pd.IntervalIndex([]).dtype
Out[2]: interval[object]

In [3]: pd.IntervalIndex.from_intervals([]).dtype
Out[3]: interval[object]

In [4]: pd.IntervalIndex.from_breaks([]).dtype
Out[4]: interval[float64]

In [5]: pd.IntervalIndex.from_tuples([]).dtype
Out[5]: interval[float64]

In [6]: pd.IntervalIndex.from_arrays([], []).dtype
Out[6]: interval[float64]

Expected Output

  1. IntervalIndex constructor should not ignore the closed parameter for purely NA data (since it can't infer closed from the input data).

  2. IntervalIndex should raise when given conflicting closed vs. inferred closed from data.

  3. IntervalIndex should have the same dtype for empty data regardless of the method of construction. It's not immediately clear to me which dtype should be used, but my feeling is interval[object] since that's the behavior of the constructor/from_intervals.
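
Expected behaviors 1 and 2 above could be sketched roughly as follows. This is a pure-Python illustration of the proposed rule only, not pandas code: `Iv` is a hypothetical stand-in for `pd.Interval`, and `resolve_closed` and its `default` argument are names made up for this example.

```python
import math
from typing import NamedTuple, Optional, Sequence

VALID_CLOSED = {"left", "right", "both", "neither"}


class Iv(NamedTuple):
    """Hypothetical stand-in for pd.Interval (illustration only)."""
    left: float
    right: float
    closed: str = "right"


def _is_na(x) -> bool:
    # Treat None and float NaN as missing values.
    return x is None or (isinstance(x, float) and math.isnan(x))


def resolve_closed(data: Sequence, closed: Optional[str] = None,
                   default: str = "right") -> str:
    """Pick the `closed` value for a sequence of intervals and NA values.

    Hypothetical helper: honours `closed` when nothing can be inferred
    from the data, and raises when `closed` conflicts with the data.
    """
    if closed is not None and closed not in VALID_CLOSED:
        raise ValueError(f"invalid option for 'closed': {closed}")

    # Collect the closed side of every non-NA interval in the input.
    inferred = {iv.closed for iv in data if not _is_na(iv)}

    if len(inferred) > 1:
        raise ValueError("intervals must all be closed on the same side")

    if not inferred:
        # Purely NA (or empty) data: nothing to infer, so use the
        # parameter if given (expected behavior 1).
        return closed if closed is not None else default

    (data_closed,) = inferred
    if closed is not None and closed != data_closed:
        # Conflict between parameter and data (expected behavior 2).
        raise ValueError(
            f"conflicting values for closed: constructor got {closed!r}, "
            f"inferred from data {data_closed!r}"
        )
    return data_closed
```

With this sketch, `resolve_closed([float("nan")], closed="both")` returns `'both'`, while `resolve_closed([Iv(0, 1, "both"), Iv(10, 20, "both")], closed="neither")` raises `ValueError` instead of silently picking one side.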


jreback commented Nov 22, 2017

  3. should be object