ValueError adding boolean dataframe to panel fails in 0.16.2 but not 0.15.2 #11014

Closed
edboring opened this issue Sep 7, 2015 · 3 comments
Labels
Indexing Related to indexing on series/frames, not to indexes themselves Regression Functionality that used to work in a prior pandas version

edboring commented Sep 7, 2015

In pandas 0.16.2, assigning a boolean DataFrame (created by pd.notnull) to a new minor-axis label of a Panel fails with "ValueError: could not broadcast input array from shape...". The same code succeeds in pandas 0.15.2.

>>> import pandas as pd
>>> import numpy as np
>>> 
>>> # create a test panel
... dates = pd.date_range('20150101', periods=6)
>>> df1 = pd.DataFrame([0,-.1,.2,np.nan,.23,np.nan], index=dates, columns=list('A'))
>>> df2 = pd.DataFrame([.3,np.nan,.3,np.nan,.6,.1], index=dates, columns=list('A'))
>>> data = {'Item1' : df1, 'Item2': df2}
>>> panel = pd.Panel(data)
>>> 
>>> # check the shape
... panel.loc[:, :, 'A'].shape
(6, 2)
>>> 
>>> # create a dataframe with boolean values indicating not null data values
... notnull = pd.notnull(panel.loc[:, :, 'A'])
>>> notnull.shape
(6, 2)
>>> 
>>> # attempt to store it in the panel
... panel.loc[:, :, 'NotNull'] = notnull
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/Users/edb/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 115, in __setitem__
    self._setitem_with_indexer(indexer, value)
  File "/Users/edb/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 445, in _setitem_with_indexer
    setter(item, v)
  File "/Users/edb/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 411, in setter
    s._data = s._data.setitem(indexer=pi, value=v)
  File "/Users/edb/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 2483, in setitem
    return self.apply('setitem', **kwargs)
  File "/Users/edb/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 2459, in apply
    applied = getattr(b, f)(**kwargs)
  File "/Users/edb/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 611, in setitem
    values[indexer] = value
ValueError: could not broadcast input array from shape (2) into shape (6)
>>> 
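
The final frame of the traceback is a plain NumPy assignment, `values[indexer] = value`. As a minimal sketch of that failure mode only (the array names are hypothetical, and it is an assumption, not something confirmed in this thread, that the (6, 2) frame ends up being consumed along the wrong axis), writing a length-2 array into a length-6 target raises the same kind of error:

```python
import numpy as np

# Hypothetical stand-in for one column of the block being written to:
# six rows, matching the six dates in the panel above.
values = np.zeros(6)

# A length-2 piece, as if the (6, 2) boolean frame had been split along
# the wrong axis before assignment (an assumption for illustration).
wrong_axis_piece = np.array([True, False])

try:
    values[:] = wrong_axis_piece
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# Splitting along the other axis yields length-6 pieces, which fit.
# These are the not-null flags for Item1 in the example above.
right_axis_piece = np.array([True, True, True, False, True, False])
values[:] = right_axis_piece
print(values)
```

Newer NumPy releases may format the shapes in the message slightly differently from the `(2)` and `(6)` shown in the traceback.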

pd.show_versions() output for the failing version:

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.2
nose: 1.3.7
Cython: 0.22.1
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 3.2.0
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.2
pytz: 2015.4
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 1.0.0
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: None
>>>

pd.show_versions() output for the working version:

>>> pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.9.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.2
nose: 1.3.4
Cython: 0.22
numpy: 1.9.2
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 3.0.0
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.4.1
pytz: 2015.2
bottleneck: None
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.4.3
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: 0.6.7
lxml: 3.4.2
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
rpy2: None
sqlalchemy: 0.9.9
pymysql: None
psycopg2: None
>>> 
edboring changed the title from "ValueError adding boolean dataframe to panel fails in 0.16.2 but not 0.15.1" to "ValueError adding boolean dataframe to panel fails in 0.16.2 but not 0.15.2" on Sep 7, 2015
Contributor

jreback commented Sep 7, 2015

OK, this looks buggy. The logic in there is quite complicated but pretty well tested; something must have gone awry.

jreback added the Indexing and Regression labels on Sep 7, 2015
jreback added this to the Next Major Release milestone on Sep 7, 2015

jreback commented Sep 7, 2015

cc @evanpw as you are now an expert at fixing some of these issues!

jreback commented Sep 7, 2015

closed by #11021

jreback closed this as completed on Sep 7, 2015