Description
I came across some weird behavior when assigning through .loc on a DataFrame with a MultiIndex in pandas 0.17.1:
import pandas as pd

lvl0 = range(2)
lvl1 = range(3)
index = pd.MultiIndex.from_product([lvl0, lvl1])
columns = ['A', 'B']
df1 = pd.DataFrame(data=1, columns=columns, index=index)
df2 = pd.DataFrame(data=2, index=lvl1, columns=['A'])

# this returns a DataFrame without a MultiIndex
print(df1.loc[0, df2.columns])

# but when you try the matching assignment, it doesn't work
df1.loc[0, df2.columns] = df2
print(df1)
I debugged the pandas package a little and I think I know what is going on. Reading df1.loc[0,df2.columns] goes through a getitem method, but the assignment df1.loc[0,df2.columns]=df2 goes through a setitem method. If you set a breakpoint in indexing.py in _NDFrameIndexer._get_setitem_indexer, you can see that the key being looked up in the index of the DataFrame is (0,['A']). This key cannot be found, since A is a column rather than an index label. The result is that nothing is assigned.
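To make the getitem/setitem split concrete, here is a minimal sketch in plain Python. It is not pandas code: RecordingIndexer is a hypothetical stand-in for an indexer like .loc, and it only shows that Python hands the same tuple key to two different methods, so each method is free to interpret that key differently.

```python
class RecordingIndexer:
    """Hypothetical stand-in for an indexer such as .loc (not pandas code)."""

    def __init__(self):
        self.calls = []

    def __getitem__(self, key):
        # Called for a read: obj[key]
        self.calls.append(("getitem", key))
        return key

    def __setitem__(self, key, value):
        # Called for a write: obj[key] = value
        self.calls.append(("setitem", key, value))


loc = RecordingIndexer()
cols = ["A"]

loc[0, cols]           # read path, key arrives as the tuple (0, ['A'])
loc[0, cols] = "df2"   # write path, same tuple key, different method

for call in loc.calls:
    print(call)
```

Both calls receive the key (0, ['A']). Whether that tuple is treated as (row label, column list) or as a single MultiIndex key is entirely up to the method that receives it, which is where the read and write paths can diverge.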
I think this is a bug because setitem and getitem behave inconsistently: if df1.loc[0,df2.columns] returns a DataFrame without a MultiIndex (which isn't a copy), then assignment through the same key should be possible.