Multiindex with single level no longer allows selecting columns #29749

Closed
darindillon opened this issue Nov 20, 2019 · 0 comments · Fixed by #38026
Labels: Indexing (Related to indexing on series/frames, not to indexes themselves), MultiIndex
darindillon commented Nov 20, 2019

This works fine in pandas 0.22.0 but fails in 0.25.3. It appears a bug may have been introduced.

I have some old code that accidentally created a MultiIndex with a single level instead of a regular index, but everything worked fine with pandas 0.22.0. In newer versions of pandas, however, that MultiIndex breaks pandas' ability to select columns and gives an extremely misleading error message. In this case I don't need the MultiIndex, so I can just remove it, but I think this code ought to work (and it did work in 0.22.0).

import numpy as np
import pandas as pd

names = ['FirstColumn', 'SecondColumn']
data = np.array([[5,6],[7,8]])
df = pd.DataFrame(data, columns = [names]) #Bug: the brackets around "[names]" create a
#MultiIndex, but that was unintentional. (It's still legal, though.)
#But "df.head()" and "df.describe()" both look normal, so you can't see that anything is wrong.

#All of the following work as expected in 0.22.0 but give misleading errors in 0.25.3.
#They all work with a regular index, but the single-level MultiIndex in 0.25.3 rejects them.
df['FirstColumn'] #ERROR! 
df.FirstColumn #ERROR! 
df.loc[:,'FirstColumn'] #ERROR! 

#This works, but it shouldn't be necessary to do this. The above syntax is more standard.
df.xs('FirstColumn', axis=1, level=0)

The failing statements raise a misleading error, "only integer scalar arrays can be converted to a scalar index", even though the code looks like it ought to work and did work in earlier versions of pandas. Can we make the standard ways of selecting columns work with a single-level MultiIndex?
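For anyone hitting the same thing, here is a minimal sketch of the workaround mentioned above (dropping the unneeded MultiIndex). The helper name `flatten_single_level_columns` is hypothetical and not part of pandas; it assumes the columns are a MultiIndex with exactly one level and leaves any other frame unchanged.

```python
import numpy as np
import pandas as pd


def flatten_single_level_columns(frame):
    """Return a frame whose single-level MultiIndex columns become a plain Index.

    Hypothetical helper for illustration only; it is not a pandas API.
    Frames with regular or multi-level columns are returned unchanged.
    """
    cols = frame.columns
    if isinstance(cols, pd.MultiIndex) and cols.nlevels == 1:
        frame = frame.copy()
        # get_level_values(0) yields an ordinary Index holding the level-0 labels
        frame.columns = cols.get_level_values(0)
    return frame


names = ['FirstColumn', 'SecondColumn']
data = np.array([[5, 6], [7, 8]])
df = pd.DataFrame(data, columns=[names])  # nested list, as in the report

# Inspect the columns directly; head()/describe() do not make the MultiIndex obvious
print(type(df.columns).__name__, df.columns.nlevels)

flat = flatten_single_level_columns(df)
print(flat['FirstColumn'].tolist())
```

After flattening, `flat['FirstColumn']`, `flat.FirstColumn`, and `flat.loc[:, 'FirstColumn']` go through the regular `Index` path, so they should not depend on how a given pandas version handles a single-level MultiIndex.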
