
BUG: Series construction with EA dtype and index but no data fails #33846

Merged (29 commits) May 2, 2020

Conversation

@simonjayhawkins (Member) commented Apr 28, 2020

@simonjayhawkins simonjayhawkins added Bug ExtensionArray Extending pandas with custom dtypes or arrays. Constructors Series/DataFrame/Index/pd.array Constructors labels Apr 28, 2020
@jbrockmendel (Member):

Maybe a test in tests.extension?

@simonjayhawkins (Member, Author):

> Maybe a test in tests.extension?

Series construction passes through `_from_sequence`, but this PR is not changing the EAs directly, since the EA interface doesn't require `_from_sequence` to accept a scalar; see @jorisvandenbossche's comment in #26469 (comment):

> The problem is that the None / np.nan is eventually passed to the `_from_sequence` method, which expects a list-like. I don't think we should let people fix their `_from_sequence`, but we should rather make sure we never pass that to it.

So not sure what tests could be added to the extension tests, since we have a code check rule:

> Check for the following code in the extension array base tests: tm.assert_frame_equal and tm.assert_series_equal

and this PR is about Series construction.

Could maybe add more parameterisation to the Series test? Do we have a fixture for internal EA dtypes?
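A parametrised check along those lines might look like this; a sketch only — the dtype list is illustrative, and the real suite would use a shared fixture rather than an inline list:

```python
import pandas as pd

# Hypothetical extra coverage: loop over a few internal EA dtypes; the
# real extension-test suite would parametrise via a fixture instead.
def check_no_data_with_index(dtype):
    # Construction with an index but no data should give an all-NA Series.
    result = pd.Series(index=[1, 2, 3], dtype=dtype)
    assert len(result) == 3
    assert result.isna().all()

for dtype in ["Int64", "boolean", "category"]:
    check_no_data_with_index(dtype)
```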

@simonjayhawkins (Member, Author):

> so not sure what tests could be added to extension tests since we have a code check rule: "Check for the following code in the extension array base tests: tm.assert_frame_equal and tm.assert_series_equal"

Ah, the check is about using `tm` instead of `self`.

Will add tests to the extension tests and remove this IntegerArray test from pandas/tests/series/test_constructors.py.

@jorisvandenbossche (Member):

The errors seem to indicate that we pass the scalar to the ExtensionArray constructor. But it should maybe be the responsibility of the Series constructor to convert a scalar to something like `[scalar] * len` before passing it to the EA constructor?
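A minimal sketch of that suggestion using the public API (illustrative only, not the PR's actual code path):

```python
import pandas as pd

# Broadcast the scalar to a list of the index's length before handing it
# to the EA constructor, rather than passing the bare scalar through to
# _from_sequence.
scalar, index = pd.NA, [1, 2, 3]
arr = pd.array([scalar] * len(index), dtype="Int64")  # list-like: accepted
s = pd.Series(arr, index=index)
```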

@jbrockmendel (Member):

> something like [scalar] * len before passing it to the EA constructor

That's what I was thinking, yeah.

@simonjayhawkins (Member, Author) commented Apr 29, 2020

> > something like [scalar] * len before passing it to the EA constructor
>
> That's what I was thinking, yeah.

That would be inconsistent with how non-EA Series are constructed. Are we OK with that?

IIUC, sanitize_array broadcasts the single-element array to match the index.
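For reference, this is the non-EA broadcasting behaviour being described — a scalar with an index is broadcast to the index length:

```python
import pandas as pd

# A plain scalar with an index is broadcast to match the index.
s = pd.Series(5, index=[1, 2, 3], dtype="int64")
```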

The traceback in the issue OP is no longer relevant; on master:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-bed90de02f57> in <module>
----> 1 pd.Series(pd.NA, index=[1, 2, 3], dtype="Int64")

c:\users\simon\pandas\pandas\core\series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    327                     data = data.copy()
    328             else:
--> 329                 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
    330 
    331                 data = SingleBlockManager.from_array(data, index)

c:\users\simon\pandas\pandas\core\construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
    452         raise TypeError("Set type is unordered")
    453     else:
--> 454         subarr = _try_cast(data, dtype, copy, raise_cast_failure)
    455 
    456     # scalar like, GH

c:\users\simon\pandas\pandas\core\construction.py in _try_cast(arr, dtype, copy, raise_cast_failure)
    535         # DatetimeTZ case needs to go through maybe_cast_to_datetime
    536         array_type = dtype.construct_array_type()._from_sequence
--> 537         subarr = array_type(arr, dtype=dtype, copy=copy)
    538         return subarr
    539 

c:\users\simon\pandas\pandas\core\arrays\integer.py in _from_sequence(cls, scalars, dtype, copy)
    353     @classmethod
    354     def _from_sequence(cls, scalars, dtype=None, copy: bool = False) -> "IntegerArray":
--> 355         return integer_array(scalars, dtype=dtype, copy=copy)
    356 
    357     @classmethod

c:\users\simon\pandas\pandas\core\arrays\integer.py in integer_array(values, dtype, copy)
    144     TypeError if incompatible types
    145     """
--> 146     values, mask = coerce_to_array(values, dtype=dtype, copy=copy)
    147     return IntegerArray(values, mask)
    148 

c:\users\simon\pandas\pandas\core\arrays\integer.py in coerce_to_array(values, dtype, mask, copy)
    219         inferred_type = lib.infer_dtype(values, skipna=True)
    220         if inferred_type == "empty":
--> 221             values = np.empty(len(values))
    222             values.fill(np.nan)
    223         elif inferred_type not in [

TypeError: len() of unsized object

@simonjayhawkins (Member, Author):

Not directly related to this PR, but one way to avoid `_from_sequence` and the constraints of the EA API might be to make the `pd.array` constructor consistent with numpy. I haven't looked into this, so not sure at this stage whether the same constraints would apply.

>>> import numpy as np
>>> import pandas as pd
>>>
>>> pd.__version__
'1.1.0.dev0+1422.gb8c895429'
>>>
>>> np.array(1, dtype="int64")
array(1, dtype=int64)
>>>
>>> pd.array(1, dtype="Int64")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\pandas\pandas\core\construction.py", line 270, in array
    raise ValueError(msg)
ValueError: Cannot pass scalar '1' to 'pandas.array'.
>>>

@simonjayhawkins (Member, Author):

> That would be inconsistent with how non-EA series are constructed. Are we OK with that?

Actually, at the moment this 'fix' is not 100% consistent.

For `pd.Series(5, index=[1, 2, 3], dtype="int")`, the scalar value 5 is passed to `_try_cast`, and then on L563 `construct_1d_ndarray_preserving_na` is called with the scalar value. The return value is actually a 0-d array, contrary to the function name.

>>> pd.core.dtypes.cast.construct_1d_ndarray_preserving_na(5, np.dtype("int"))
array(5)

In the current 'fix' a 1-d EA is returned from `_try_cast`, so it's not 100% consistent with non-EA types. So maybe #33846 (comment) may need to be fixed first.

L538 could be changed to continue, and a `construct_1d_pdarray_preserving_na` created to duplicate the behaviour of `construct_1d_ndarray_preserving_na` for EA types. What do you think?
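The inconsistency can be sketched with plain numpy (this mimics, but is not, the pandas internals):

```python
import numpy as np

# For a scalar, the non-EA path effectively produces a 0-d array
# (as construct_1d_ndarray_preserving_na does, despite its name)...
subarr = np.array(5, dtype="int64")
assert subarr.ndim == 0

# ...which sanitize_array later broadcasts to the index length; np.full
# stands in for that broadcast step here.
out = np.full(3, subarr.item(), dtype=subarr.dtype)
```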

@jorisvandenbossche (Member):

> >>> pd.array(1, dtype="Int64")

`pd.array` is strictly 1-D only, while your numpy example (`np.array(1, dtype="int64")`) is creating a 0-D array. So for this case we don't need consistency with numpy, IMO.

@simonjayhawkins (Member, Author):

> > something like [scalar] * len before passing it to the EA constructor
>
> That's what I was thinking, yeah.

`_try_cast` returns an array. With non-EA dtypes it's a 0-d array, and then in sanitize_array the `if getattr(subarr, "ndim", 0) == 0:` branch broadcasts the scalar to the same length as the index, if passed.

Whereas at the moment this 'fix' returns a 1-d array (we don't want to implement 0-d arrays for EA, #33846 (comment)), and then in

    elif subarr.ndim == 1:
        if index is not None:

            # a 1-element ndarray
            if len(subarr) != len(index) and len(subarr) == 1:
                subarr = construct_1d_arraylike_from_scalar(
                    subarr[0], len(index), subarr.dtype
                )

we create the correct-length array from the single-element EA array returned from `_try_cast`.
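That single-element branch can be sketched with public API calls (illustrative only; `pd.array` stands in for the internal helper):

```python
import pandas as pd

# The 'fix' returns a length-1 EA from _try_cast for a scalar...
subarr = pd.array([1], dtype="Int64")
index = [1, 2, 3]

# ...and the 1-element branch broadcasts it to the index length;
# pd.array stands in for construct_1d_arraylike_from_scalar here.
if len(subarr) != len(index) and len(subarr) == 1:
    subarr = pd.array([subarr[0]] * len(index), dtype=subarr.dtype)
```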

So if this fix is not suitable, I see a few other options:

  1. return a scalar from `_try_cast` and then do additional checks on the return type in sanitize_array
  2. pass the index through to `_try_cast` so that we can apply the suggested [scalar] * len before passing it to the EA constructor
  3. avoid the call to `_try_cast` for a scalar and an EA type.

I think that union return types (1) should be avoided, passing additional keywords (2) violates the single responsibility principle, and additional special casing (3) should be avoided.

Any other suggestions?

@simonjayhawkins (Member, Author):

Another constructor case fixed by this branch, for consistency with numpy types:

>>> pd.Series(1, index=['foo'], dtype="int")
foo    1
dtype: int32
>>>
>>> pd.Series(1, index=['foo'], dtype="Int64")
foo    1
dtype: Int64
>>>

On master:

>>> pd.Series(1, index=['foo'], dtype="int")
foo    1
dtype: int32
>>>
>>> pd.Series(1, index=['foo'], dtype="Int64")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\pandas\pandas\core\series.py", line 329, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "C:\Users\simon\pandas\pandas\core\construction.py", line 454, in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
  File "C:\Users\simon\pandas\pandas\core\construction.py", line 537, in _try_cast
    subarr = array_type(arr, dtype=dtype, copy=copy)
  File "C:\Users\simon\pandas\pandas\core\arrays\integer.py", line 355, in _from_sequence
    return integer_array(scalars, dtype=dtype, copy=copy)
  File "C:\Users\simon\pandas\pandas\core\arrays\integer.py", line 146, in integer_array
    values, mask = coerce_to_array(values, dtype=dtype, copy=copy)
  File "C:\Users\simon\pandas\pandas\core\arrays\integer.py", line 244, in coerce_to_array
    raise TypeError("values must be a 1D list-like")
TypeError: values must be a 1D list-like
>>>

So could add some more test cases.

@@ -533,6 +533,10 @@ def _try_cast(
if isinstance(dtype, ExtensionDtype) and dtype.kind != "M":
# create an extension array from its dtype
# DatetimeTZ case needs to go through maybe_cast_to_datetime

if lib.is_scalar(arr):
Comment (Member, Author):

We could look into how to identify a collection that could be considered a 'scalar' for some EA, e.g. JSONDtype, although I think that's out of scope for the issue this PR attempts to fix (i.e. IntegerArray, where the scalars are scalars).

Comment (Contributor):

Rather than this I would call `construct_1d_arraylike_from_scalar`, but I wouldn't do it right here; rather on L453, e.g. add an `elif is_scalar(data)` branch.

Comment (Member, Author):

That's option 3 in #33846 (comment). Do this just for EA types and keep the code path the same for non-EA types?

Comment (Contributor):

No, this will work generically.

Comment (Member, Author):

I'm getting a few failures in pandas/tests/series/test_constructors.py. I'll push the change anyway and use the CI to see what else fails while I investigate.

Comment (Contributor):

kk

@pytest.mark.xfail(reason="GH-26469")
def test_series_constructor_scalar_with_one_element_index(self, data):
# TypeError: data type not understood
super().test_series_constructor_scalar_with_one_element_index(data)
Comment (Member, Author):

There is some special casing for datetime in the Series construction. Although fixing this could also be considered out of scope for the issue that this PR attempts to close, I could look into it further and maybe raise a separate issue if it's not fixed here.

Comment (Member):

Opening a separate issue is fine for me as well, either way

Comment (Member, Author):

This is not failing with the changes as they stand at the moment. The previous 'fix' was inside `if isinstance(dtype, ExtensionDtype) and dtype.kind != "M"` and hence not applicable to DatetimeTZDtype.

@@ -151,6 +151,16 @@ def test_array_from_scalars(self, data):
# ValueError: PandasArray must be 1-dimensional.
super().test_array_from_scalars(data)

@skip_nested
def test_series_constructor_scalar_with_index(self, data):
# ValueError: Length of passed values is 1, index implies 3.
Comment (Member, Author):

For the object dtype, the scalar is a tuple, so this failure is related to #33846 (comment).

@@ -732,7 +732,7 @@ ExtensionArray
^^^^^^^^^^^^^^

- Fixed bug where :meth:`Series.value_counts` would raise on empty input of ``Int64`` dtype (:issue:`33317`)
-
- Fixed bug in :class:`Series` construction with EA dtype and index but no data fails (:issue:`26469`)
Comment (Member):

or scalar data?

expected = pd.Series([scalar] * 3, index=[1, 2, 3], dtype=dtype)
self.assert_series_equal(result, expected)

def test_series_constructor_scalar_with_one_element_index(self, data):
Comment (Member):

I would maybe just combine this one with the test above (both are about scalar with index) ?

(and one test less to override in the subclasses)


@pytest.mark.xfail(reason="GH-26469", strict=False)
def test_series_constructor_no_data_with_index(self, data, na_value):
# ValueError: Cannot convert non-finite values (NA or inf) to integer
super().test_series_constructor_no_data_with_index(data, na_value)
Comment (Member):

Do you know why sparse is failing?

Comment (Member, Author):

Hmm, so much for the xfail; it's no longer failing. It was only failing on one of the fill values before, hence the strict=False.

@pytest.mark.xfail(reason="GH-26469")
def test_series_constructor_no_data_with_index(self, data, na_value):
# pyarrow.lib.ArrowInvalid: only handle 1-dimensional arrays
super().test_series_constructor_no_data_with_index(data, na_value)
Comment (Member):

Do you know why it is failing for this dtype?

Comment (Member, Author):

The traceback is:

_______________________________________________ TestConstructors.test_series_constructor_no_data_with_index ________________________________________________ 

self = <pandas.tests.extension.arrow.test_bool.TestConstructors object at 0x000001E9918B7B50>
dtype = <pandas.tests.extension.arrow.arrays.ArrowBoolDtype object at 0x000001E9918B79A0>, na_value = None

    def test_series_constructor_no_data_with_index(self, dtype, na_value):
>       result = pd.Series(index=[1, 2, 3], dtype=dtype)

pandas\tests\extension\base\constructors.py:37:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _  
pandas\core\series.py:283: in __init__
    data, index = self._init_dict(data, index, dtype)
pandas\core\series.py:372: in _init_dict
    s = create_series_with_explicit_dtype(
pandas\core\construction.py:629: in create_series_with_explicit_dtype
    return Series(
pandas\core\series.py:329: in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
pandas\core\construction.py:459: in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
pandas\core\construction.py:542: in _try_cast
    subarr = array_type(arr, dtype=dtype, copy=copy)
pandas\tests\extension\arrow\arrays.py:82: in _from_sequence
    return cls.from_scalars(scalars)
pandas\tests\extension\arrow\arrays.py:72: in from_scalars
    arr = pa.chunked_array([pa.array(np.asarray(values))])
pyarrow\array.pxi:265: in pyarrow.lib.array
    ???
pyarrow\array.pxi:80: in pyarrow.lib._ndarray_to_array
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _  

>   ???
E   pyarrow.lib.ArrowInvalid: only handle 1-dimensional arrays

pyarrow\error.pxi:84: ArrowInvalid

Hmm, this seems to be because `lib.is_scalar` doesn't recognise `pa.NULL` as a scalar, so it gets passed to sanitize_array unchanged:

>>> import pyarrow as pa
>>>
>>> pd._libs.lib.is_scalar(pa.NULL)
False
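A purely hypothetical duck-typed check (not what the PR does) that would also treat sentinel objects like pa.NULL as scalar-like:

```python
import numpy as np

# Hypothetical scalar-likeness check: array-likes are judged by ndim,
# everything else by whether it is sized. Sentinel objects (like
# pa.NULL) that lib.is_scalar misses would fall through to the
# un-sized branch.
def is_scalar_like(obj):
    ndim = getattr(obj, "ndim", None)
    if ndim is not None:
        return ndim == 0          # 0-d arrays count as scalar-like
    return not hasattr(obj, "__len__")

checks = [
    is_scalar_like(None),              # scalar-like
    is_scalar_like(5),                 # scalar-like
    is_scalar_like(np.array(1)),       # 0-d array: scalar-like
    is_scalar_like([1, 2, 3]),         # sized: not scalar-like
    is_scalar_like(np.array([1, 2])),  # 1-d array: not scalar-like
]
```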

Comment (Member, Author):

@jorisvandenbossche not sure if you want this fixed here. Will raise a separate issue for this case in the meantime.

Comment (Member):

Yeah, that's fine, we don't really support this dtype anyway, it's only to test certain things


# RecursionError: maximum recursion depth exceeded in comparison
super().test_series_constructor_scalar_na_with_index(dtype, na_value)

@pytest.mark.xfail(reason="GH-26469")
Comment (Contributor):

Can you add a more informative message?

@jreback (Contributor) commented Apr 30, 2020

Hmm, as an aside, `lib.is_scalar` does not consider a numpy 0-d array to be a scalar.

On master:

>>> pd.Series(np.array(1, dtype="int64"))
0    1
dtype: object
>>>
>>> pd.Series(np.array(1, dtype="int64"), index=[1, 2, 3])
1    1
2    1
3    1
dtype: int64
>>>
>>> pd.Series(np.array(1, dtype="int64"), index=[1, 2, 3], dtype="Int64")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\pandas\pandas\core\series.py", line 329, in __init__
    data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
  File "C:\Users\simon\pandas\pandas\core\construction.py", line 427, in sanitize_array
    subarr = _try_cast(data, dtype, copy, raise_cast_failure)
  File "C:\Users\simon\pandas\pandas\core\construction.py", line 537, in _try_cast
    subarr = array_type(arr, dtype=dtype, copy=copy)
  File "C:\Users\simon\pandas\pandas\core\arrays\integer.py", line 355, in _from_sequence
    return integer_array(scalars, dtype=dtype, copy=copy)
  File "C:\Users\simon\pandas\pandas\core\arrays\integer.py", line 146, in integer_array
    values, mask = coerce_to_array(values, dtype=dtype, copy=copy)
  File "C:\Users\simon\pandas\pandas\core\arrays\integer.py", line 244, in coerce_to_array
    raise TypeError("values must be a 1D list-like")
TypeError: values must be a 1D list-like
>>>

The first case results in an object dtype (bug in master?). This PR doesn't fix the last case to make it consistent with case 2.

Correct, these need to be unwrapped (easiest); e.g. your check I think will work like this:

if lib.is_scalar(data) or getattr(data, "ndim", None) == 0:
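A sketch of that unwrapping step (variable names illustrative):

```python
import numpy as np

# Unwrap a 0-d numpy array to a plain Python scalar so the existing
# scalar-broadcast path in the Series constructor can handle it.
data = np.array(1, dtype="int64")
if getattr(data, "ndim", None) == 0:
    data = data.item()
```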

@jreback jreback added this to the 1.1 milestone Apr 30, 2020
@@ -150,6 +150,21 @@ def test_from_dtype(self, data):
# construct from our dtype & string dtype
pass

@pytest.mark.xfail(reason="GH-26469")
Comment (Contributor):

Should these be a new issue?

Comment (Member, Author):

A few checks to go, but I think we need a discussion on when to allow a collection to be treated as a scalar. So yes, will probably raise an issue for this.

Comment (Contributor):

kk, and just flip the references to that; otherwise LGTM. Ping on green.

Comment (Member, Author):

xref #33900 and #33901

try:
    data = maybe_cast_to_datetime(data, dtype)
except TypeError:
    pass
Comment (Member):

Do you know for what case it raises? As I would think that maybe_cast_to_datetime is not supposed to raise (or if it raises, it's an error that should bubble up to the user), so that might need a fix there?

Comment (Member, Author):

> Do you know for what case it raises?

It raises for a string dtype with a string that satisfies:

>>> pd.core.dtypes.common.is_datetime64_dtype("M")
True

The scalar used in the extension tests is random and is sometimes, say, "M"; therefore a separate test has been added.

> As I would think that maybe_cast_to_datetime is not supposed to raise (or if it raises, it's an error that should bubble up to the user), so that might need a fix there?

It raises `TypeError: Cannot cast datetime64 to string`:

elif is_datetime64_dtype(value) and not is_datetime64_dtype(dtype):
    if is_object_dtype(dtype):
        if value.dtype != DT64NS_DTYPE:
            value = value.astype(DT64NS_DTYPE)
        ints = np.asarray(value).view("i8")
        return tslib.ints_to_pydatetime(ints)
    # we have a non-castable dtype that was passed
    raise TypeError(f"Cannot cast datetime64 to {dtype}")

I suspect that maybe_cast_to_datetime shouldn't raise in this case; it seems to contradict the 'maybe' of `maybe_cast_to_datetime`.

Comment (Member):

I would say that's a bug in maybe_cast_to_datetime. We should not check the actual value to see if it is a datetime64 dtype, but only check if that value (e.g. because it is an array) has a dtype.
(That's related to passing the actual dtype vs the array to the is_..._dtype functions, something we have been discussing lately.)

Can you see if this fixes it for you:

--- a/pandas/core/dtypes/cast.py
+++ b/pandas/core/dtypes/cast.py
@@ -1384,7 +1384,7 @@ def maybe_cast_to_datetime(value, dtype, errors: str = "raise"):
                         pass
 
         # coerce datetimelike to object
-        elif is_datetime64_dtype(value) and not is_datetime64_dtype(dtype):
+        elif is_datetime64_dtype(getattr(value, "dtype", None)) and not is_datetime64_dtype(dtype):
             if is_object_dtype(dtype):
                 if value.dtype != DT64NS_DTYPE:
                     value = value.astype(DT64NS_DTYPE)

Comment (Member, Author):

> Can you see if this fixes it for you:

Thanks.

Comment on lines +41 to +44
# GH 33559 - empty index
result = pd.Series(index=[], dtype=dtype)
expected = pd.Series([], index=pd.Index([], dtype="object"), dtype=dtype)
self.assert_series_equal(result, expected)
Comment (Member, Author):

This PR appears to also close #33559.

The index discrepancy is consistent with non-EA types:

>>> import pandas as pd
>>> pd.__version__
'1.1.0.dev0+1446.g1c88e6aff'
>>> pd.Series(dtype="int64", index=[]).index
Index([], dtype='object')
>>>
>>> pd.Series(dtype="int64").index
Index([], dtype='object')
>>>
>>> pd.Series([], dtype="int64").index
RangeIndex(start=0, stop=0, step=1)
>>>

Comment (Member):

Yes, there is another PR trying to clean this up

@jreback (Contributor) commented May 2, 2020

LGTM. Can merge on green.
