BUG: invalid construction of a Series with dtype=str #19853

jamesqo opened this Issue Feb 23, 2018 · 3 comments



jamesqo commented Feb 23, 2018

pd.Series('', dtype=str, index=range(1000))

throws a ValueError with the following message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\", line 266, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\", line 4402, in __init__
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\", line 2957, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\", line 2082, in __init__
    placement=placement, **kwargs)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\", line 111, in __init__
    raise ValueError('Wrong number of dimensions')
ValueError: Wrong number of dimensions

Would it be possible to fix the behavior to initialize the series to '' (or at least provide a clearer message)?




ZhuBaohe commented Feb 23, 2018

pd.Series('', dtype=object, index=range(1000))
This works. pandas stores strings using the 'object' dtype.
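
A minimal sketch of this workaround, assuming a pandas version where broadcasting a scalar with an explicit object dtype works as described above (the variable name `s` is illustrative):

```python
import pandas as pd

# Broadcast the empty string across the index, explicitly asking for object dtype.
s = pd.Series('', dtype=object, index=range(1000))

print(s.dtype)   # the dtype that was requested
print(len(s))    # one entry per index label
```

Every element of `s` is the empty string, so `(s == '').all()` can be used as a quick sanity check.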




jreback commented Feb 23, 2018

This takes a different code path in master. We pretty much treat str as object, so this is a construction bug.

In [4]: pd.Series('',  index=range(1000), dtype=str)
AttributeError                            Traceback (most recent call last)
<ipython-input-4-3bd08f17c610> in <module>()
----> 1 pd.Series('',  index=range(1000), dtype=str)

~/pandas/pandas/core/ in __init__(self, data, index, dtype, name, copy, fastpath)
    237             else:
    238                 data = _sanitize_array(data, index, dtype, copy,
--> 239                                        raise_cast_failure=True)
    241                 data = SingleBlockManager(data, index, fastpath=True)

~/pandas/pandas/core/ in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
   3260         # GH 16605
   3261         # If not empty convert the data to dtype
-> 3262         if not isna(data).all():
   3263             data = np.array(data, dtype=dtype, copy=False)

AttributeError: 'bool' object has no attribute 'all'
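
The traceback above suggests that `isna` returns a plain `bool` when given a scalar, which has no `.all()` method. The following is a sketch of that behavior and of one possible way to guard against it. The helper `all_missing` is hypothetical and is not the fix that went into pandas.

```python
import numpy as np
import pandas as pd

# For a scalar input, isna returns a single boolean, not an array.
scalar_mask = pd.isna('')

# For an array input, isna returns a boolean array, which does have .all().
array_mask = pd.isna(np.array(['', 'x'], dtype=object))


def all_missing(data):
    """Hypothetical helper: True if every element of data is missing.

    np.all accepts both a scalar boolean and a boolean array, so it
    avoids calling .all() on a plain bool.
    """
    return bool(np.all(pd.isna(data)))


print(all_missing(''))                                      # scalar, not missing
print(all_missing(np.array([np.nan, None], dtype=object)))  # array, all missing
```

Wrapping the check in `np.all` is only one option; converting the scalar to an array first would work as well.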

@jamesqo there is no reason to specify a dtype here as this will be inferred to object dtype anyhow (str as I said above is pretty much an alias for object dtype).

a PR to fix is welcome.

@jreback jreback added this to the Next Major Release milestone Feb 23, 2018

@jreback jreback changed the title from "Wrong number of dimensions" when trying to initialize Series with empty string to BUG: invalid construction of a Series with dtype=str Feb 23, 2018




jreback commented Feb 23, 2018

@jamesqo note that setting the string '' like this doesn't have much utility. pandas has a full suite of string operations that are all NaN aware.
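
As a rough illustration of what "NaN aware" means here, the sketch below uses the `.str` accessor on a small object-dtype Series with a missing entry (the sample data is made up for this example):

```python
import pandas as pd

# A small object-dtype Series with one missing value.
s = pd.Series(['a', None, 'c'], dtype=object)

# String methods skip the missing entry instead of raising.
upper = s.str.upper()
print(upper.isna().tolist())   # which positions are still missing

# If empty strings are really needed, they can be filled in afterwards.
filled = s.fillna('')
print(filled.tolist())
```

Leaving the values missing and filling them only where needed keeps "no value" distinguishable from "empty string".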

@jreback jreback modified the milestones: Next Major Release, 0.23.0 Mar 19, 2018
