BUG: ValueError is mistakenly raised if a numpy array is assigned to a pd.Series of dtype=object and both have the same length #37748

Closed
2 of 3 tasks
nocluebutalotofit opened this issue Nov 10, 2020 · 2 comments · Fixed by #38266
Labels: Bug, Indexing (related to indexing on series/frames, not to indexes themselves)
Milestone: 1.2

Comments

nocluebutalotofit commented Nov 10, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd
import numpy as np

pd.__version__ #  '1.1.3'
pdseries = pd.Series(index=[1,2,3,4], dtype=object)
pdseries.loc[1] = np.zeros(100)  # this works fine
pdseries.loc[3] = np.zeros(4)     # this raises a value error because len(pdseries)==len(np.zeros(4))

TypeError: only size-1 arrays can be converted to Python scalars
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/Users/daniel/.conda/envs/production_system/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 2878, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    pdseries.loc[3] = np.zeros(4)
  File "/Users/daniel/.conda/envs/production_system/lib/python3.7/site-packages/pandas/core/indexing.py", line 670, in __setitem__
    iloc._setitem_with_indexer(indexer, value)
  File "/Users/daniel/.conda/envs/production_system/lib/python3.7/site-packages/pandas/core/indexing.py", line 1802, in _setitem_with_indexer
    self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
  File "/Users/daniel/.conda/envs/production_system/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 534, in setitem
    return self.apply("setitem", indexer=indexer, value=value)
  File "/Users/daniel/.conda/envs/production_system/lib/python3.7/site-packages/pandas/core/internals/managers.py", line 406, in apply
    applied = getattr(b, f)(**kwargs)
  File "/Users/daniel/.conda/envs/production_system/lib/python3.7/site-packages/pandas/core/internals/blocks.py", line 887, in setitem
    values = values.astype(arr_value.dtype, copy=False)
ValueError: setting an array element with a sequence.

Problem description

It is possible to assign (numpy) arrays to elements of a pandas.Series of dtype=object. Unfortunately, when the array has the same length as the Series, a ValueError is raised.

How can one avoid this error?
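
One possible workaround, sketched here under the assumption that the arrays can be collected before the Series is built: fill a NumPy object array directly and wrap it in a Series, so the `.loc` setter is never involved. The helper name `build_object_series` and its arguments are hypothetical and are not part of pandas.

```python
import numpy as np
import pandas as pd


def build_object_series(index, arrays_by_label):
    """Hypothetical helper: build an object Series without using the .loc setter."""
    index = list(index)
    # Start from an object ndarray filled with NaN, mirroring the empty Series.
    values = np.full(len(index), np.nan, dtype=object)
    for label, arr in arrays_by_label.items():
        # Integer indexing on a 1-D object ndarray stores the array as one element.
        values[index.index(label)] = arr
    return pd.Series(values, index=index)


pdseries = build_object_series([1, 2, 3, 4], {1: np.zeros(100), 3: np.zeros(4)})
print(pdseries.loc[3])
```

This avoids the failing code path rather than fixing it, so it only helps when the Series can be constructed in one go.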

Expected Output

The interesting thing is that the assignment takes place as expected:
In[42]: pdseries
Out[42]:
1 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ...
2 NaN
3 [0.0, 0.0, 0.0, 0.0]
4 NaN

One might argue that a warning could be useful, but an error is misleading and tricky to debug.
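
To check whether an installed pandas version is affected, a small probe along the following lines may help. It is only a sketch: `probe_loc_assignment` is a hypothetical name, and the probe simply reports whether the assignment raised and whether the array ended up stored under the label, which are the two observations described above.

```python
import numpy as np
import pandas as pd


def probe_loc_assignment(series_len, array_len):
    """Hypothetical probe: assign one array to one label and report what happened."""
    ser = pd.Series(index=range(1, series_len + 1), dtype=object)
    error = None
    try:
        ser.loc[1] = np.zeros(array_len)
    except (TypeError, ValueError) as exc:
        # Keep the exception so the caller can inspect it instead of crashing.
        error = exc
    stored = ser.loc[1]
    # True only if the element now holds an array of the requested length.
    stored_ok = isinstance(stored, np.ndarray) and stored.shape == (array_len,)
    return error, stored_ok


for array_len in (100, 4):
    error, stored_ok = probe_loc_assignment(4, array_len)
    print(array_len, type(error).__name__ if error else "no error", stored_ok)
```

On an affected version, the second line of output would be expected to name an exception for the length-4 array; on an unaffected version both lines should report no error.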

Output of pd.show_versions()

INSTALLED VERSIONS

commit : db08276
python : 3.7.8.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.1.3
numpy : 1.19.2
pytz : 2020.1
dateutil : 2.8.1
pip : 20.2.4
setuptools : 49.6.0.post20201009
Cython : 0.29.21
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : 2.8.6 (dt dec pq3 ext lo64)
jinja2 : 2.11.2
IPython : 5.8.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.3.2
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.2.1
sqlalchemy : 1.3.20
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

nocluebutalotofit added the Bug and Needs Triage (issue that has not been reviewed by a pandas team member) labels Nov 10, 2020
nocluebutalotofit changed the title from "BUG: ValueError is mistakenly raise if a numpy array is assigned to a pd.Series of dtype=object and both have the same length" to "BUG: ValueError is mistakenly raised if a numpy array is assigned to a pd.Series of dtype=object and both have the same length" Nov 10, 2020
@nocluebutalotofit
Author

With pd.__version__ '1.0.5' the bug does not occur.

@ma3da
Contributor

ma3da commented Nov 26, 2020

xref #37486

jreback added this to the 1.2 milestone Dec 4, 2020
jreback added the Indexing (related to indexing on series/frames, not to indexes themselves) label and removed the Needs Triage (issue that has not been reviewed by a pandas team member) label Dec 4, 2020