BUG: invalid construction of a Series with dtype=str #19853

Closed
jamesqo opened this Issue Feb 23, 2018 · 3 comments

jamesqo commented Feb 23, 2018

pd.Series('', dtype=str, index=range(1000))

raises a ValueError with the following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\series.py", line 266, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 4402, in __init__
    fastpath=True)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 2957, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 2082, in __init__
    placement=placement, **kwargs)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\internals.py", line 111, in __init__
    raise ValueError('Wrong number of dimensions')
ValueError: Wrong number of dimensions

Would it be possible to fix the behavior to initialize the series to '' (or at least provide a clearer message)?

Contributor

ZhuBaohe commented Feb 23, 2018

pd.Series('', dtype=object, index=range(1000))
That works. pandas stores strings using the 'object' dtype.
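
For reference, a minimal sketch of this workaround. It assumes a pandas installation where passing dtype=object with a scalar and an index broadcasts the scalar to every row; the variable name `s` is only illustrative.

```python
import pandas as pd

# Broadcast the scalar '' across a 1000-element index.
# dtype=object is spelled out explicitly instead of dtype=str.
s = pd.Series('', dtype=object, index=range(1000))

print(len(s))            # number of rows created from the index
print(s.dtype)           # dtype actually stored on the Series
print((s == '').all())   # True when every element is the empty string
```

The check on the last line is just a quick way to confirm that every row received the scalar.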

Contributor

jreback commented Feb 23, 2018

This takes a different path in master. We pretty much treat str as object, so this is a construction bug.

In [4]: pd.Series('',  index=range(1000), dtype=str)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-3bd08f17c610> in <module>()
----> 1 pd.Series('',  index=range(1000), dtype=str)

~/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    237             else:
    238                 data = _sanitize_array(data, index, dtype, copy,
--> 239                                        raise_cast_failure=True)
    240 
    241                 data = SingleBlockManager(data, index, fastpath=True)

~/pandas/pandas/core/series.py in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   3260         # GH 16605
   3261         # If not empty convert the data to dtype
-> 3262         if not isna(data).all():
   3263             data = np.array(data, dtype=dtype, copy=False)
   3264 

AttributeError: 'bool' object has no attribute 'all'

@jamesqo, there is no reason to specify a dtype here, as it will be inferred as object dtype anyway (str, as I said above, is pretty much an alias for object dtype).

A PR to fix this is welcome.
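
As a rough illustration of the suggestion to drop the dtype argument, the sketch below lets pandas infer the dtype from the scalar. The inferred dtype is printed rather than assumed: it was object on pandas versions from the time of this issue, and newer releases may infer a dedicated string dtype instead.

```python
import pandas as pd

# No dtype argument: pandas infers it from the scalar value.
s = pd.Series('', index=range(1000))

print(s.dtype)                   # inferred dtype (version dependent)
print(len(s), (s == '').all())   # row count, and whether every value is ''
```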

jreback added this to the Next Major Release milestone Feb 23, 2018

jreback changed the title from '"Wrong number of dimensions" when trying to initialize Series with empty string' to 'BUG: invalid construction of a Series with dtype=str' Feb 23, 2018

Contributor

jreback commented Feb 23, 2018

@jamesqo, note that filling a Series with the string '' like this doesn't have much utility. pandas has a full suite of string operations that are all NaN-aware.
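
A small sketch of what NaN-aware means in practice, using the `.str` accessor on a Series with a missing value. The sample data and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# A string Series with a missing entry in the middle.
s = pd.Series(['apple', np.nan, 'Cherry'])

# String methods skip the missing value and propagate it to the result
# instead of raising an error.
upper = s.str.upper()
lengths = s.str.len()

print(upper.tolist())
print(lengths.tolist())
print(upper.isna().tolist())   # marks which positions are still missing
```

Because missing values pass through these operations untouched, there is usually no need to pre-fill a Series with '' as a placeholder.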

jreback modified the milestones: Next Major Release, 0.23.0 Mar 19, 2018
