Odd behavior when broadcasting between DataFrame with MultiIndex and DataFrame with Index #19606

@Dr-Irv

Code Sample, a copy-pastable example if possible

import pandas as pd

# d1 has a two-level MultiIndex (A, B); d2 has a flat Index named A.
d1 = pd.DataFrame({'C' : [i+1 for i in range(6)], 'D' : [(i+1)*10 for i in range(6)]},
                  index=pd.MultiIndex.from_product([[1,2],[10,20,30]], names=['A','B']))
d2 = pd.DataFrame({'E': [100,200]}, index=pd.Index([1,2], name='A'))
print(d1+d2)  # Gives odd result (all NaN)
print(d1.C + d2) # Gives error (ValueError)
print(d1.C + d2.E) # Works!

Problem description

I am trying to broadcast the values in the DataFrame d2 across d1, matching on the index level that the two share by name and values (i.e., the index level 'A'). The expression d1+d2 gives this surprising result:

       C   D   E
A B             
1 10 NaN NaN NaN
  20 NaN NaN NaN
  30 NaN NaN NaN
2 10 NaN NaN NaN
  20 NaN NaN NaN
  30 NaN NaN NaN
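
One likely contributor to the all-NaN frame is that DataFrame + DataFrame aligns on column labels as well as on the index, and d1 (columns C, D) shares no column label with d2 (column E). The sketch below is a simplified illustration, not the reporter's code: the frames `left` and `right` are hypothetical and use identical flat indexes, so that only the column alignment is in play.

```python
import pandas as pd

# Hypothetical frames with the same index but disjoint column labels.
left = pd.DataFrame({'C': [1, 2]}, index=[1, 2])
right = pd.DataFrame({'E': [100, 200]}, index=[1, 2])

# Addition takes the union of the column labels; a column that exists
# on only one side has nothing to be added to, so it is filled with NaN.
total = left + right
print(total)
```

Even with perfectly matching indexes, every cell here is NaN, which suggests that the missing values in d1+d2 need not come from the MultiIndex alignment at all.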

The expression d1.C + d2 gives the error:

ValueError: cannot join with no level specified and no overlapping names

The final expression d1.C + d2.E gives the expected result:

A  B 
1  10    101
   20    102
   30    103
2  10    204
   20    205
   30    206
dtype: int64

Expected Output

Since broadcasting works when adding the two Series, I would have expected d1+d2 to give this result, or possibly to raise an error:

        C    D
A B           
1 10  101  110
  20  102  120
  30  103  130
2 10  204  240
  20  205  250
  30  206  260
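
As a possible workaround, the table above can be produced by expanding d2.E to one value per row of d1 and then adding it along the rows. This is only a sketch: the names `a_values`, `e_per_row` and `result` are introduced here for illustration, and `to_numpy()` assumes a pandas version newer than the 0.22.0 reported below.

```python
import pandas as pd

d1 = pd.DataFrame(
    {'C': [i + 1 for i in range(6)], 'D': [(i + 1) * 10 for i in range(6)]},
    index=pd.MultiIndex.from_product([[1, 2], [10, 20, 30]], names=['A', 'B']),
)
d2 = pd.DataFrame({'E': [100, 200]}, index=pd.Index([1, 2], name='A'))

# Look up E once per row of d1, keyed by the values of index level 'A'.
a_values = d1.index.get_level_values('A')
e_per_row = pd.Series(d2['E'].reindex(a_values).to_numpy(), index=d1.index)

# axis=0 aligns the Series with d1's rows rather than with its columns.
result = d1.add(e_per_row, axis=0)
print(result)
```

Because `e_per_row` carries exactly the same MultiIndex as d1, the addition does not depend on how pandas joins a MultiIndex with a flat Index.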

I don't see why the first result happens at all. Either the correct addition should happen, or an error should be raised.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.22.0
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels: Bug; Indexing (related to indexing on series/frames, not to indexes themselves); MultiIndex; Numeric Operations (arithmetic, comparison, and logical operations)