MultiIndex.dropna() does not always drop NANs #19387

Closed
jzwinck opened this issue Jan 25, 2018 · 2 comments

@jzwinck
Contributor

commented Jan 25, 2018

A MultiIndex "label" of -1 means NaN (though I have found no documentation of this). For example:

>>> pd.MultiIndex.from_arrays([['a', 'a'], ['x', np.nan]])
MultiIndex(levels=[['a'], ['x']],
           labels=[[0, 0], [0, -1]])
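
As a quick check of the -1 convention, here is a minimal sketch that reads the integer positions back from the index. The attribute is `labels` in the pandas version used in this report; the fallback to `codes` is an assumption about newer releases, not something taken from this report.

```python
import numpy as np
import pandas as pd

mi = pd.MultiIndex.from_arrays([['a', 'a'], ['x', np.nan]])

# Older pandas exposes the integer positions as `labels`; newer releases
# are assumed to expose them as `codes`, so try that name first.
codes = mi.codes if hasattr(mi, "codes") else mi.labels

# Positions into the second level; -1 marks the missing value.
second_level_codes = [int(c) for c in codes[1]]
print(second_level_codes)  # [0, -1]
```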

A MultiIndex can also be constructed with NaN values in its levels:

>>> pd.MultiIndex(levels=[['a'], ['x', np.nan]], labels=[[0, 0], [0, 1]])
MultiIndex(levels=[['a'], ['x', nan]],
           labels=[[0, 0], [0, 1]])

MultiIndex.dropna() works for the first case, but does nothing for the second:

>>> pd.MultiIndex(levels=[['a'], ['x', np.nan]], labels=[[0,0], [0,1]]).dropna()
MultiIndex(levels=[['a'], ['x', nan]],
           labels=[[0, 0], [0, 1]])

It appears that MultiIndex.dropna() drops only rows whose label is -1, not rows whose label points at a level value that is itself NaN. It should drop both kinds of rows, so the result should be:

MultiIndex(levels=[['a'], ['x']],
           labels=[[0], [0]])
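
Until dropna() handles both cases, one possible workaround is to filter on the values of each level rather than on the labels. The sketch below is only an illustration: `drop_rows_with_nan` is a hypothetical helper name, not a pandas function, and it is demonstrated on the from_arrays index because the constructor keyword for the second case (`labels=` here) may differ between pandas versions.

```python
import numpy as np
import pandas as pd


def drop_rows_with_nan(mi):
    """Return mi without the rows in which any level holds a missing value.

    Hypothetical helper: it looks at the values of each level, so it does
    not depend on whether NaN is stored as a -1 label or inside the levels.
    """
    keep = np.ones(len(mi), dtype=bool)
    for i in range(mi.nlevels):
        # get_level_values() materialises the values of level i for every row.
        keep &= ~np.asarray(pd.isnull(mi.get_level_values(i)))
    return mi[keep]


mi = pd.MultiIndex.from_arrays([['a', 'a'], ['x', np.nan]])
cleaned = drop_rows_with_nan(mi)
print(list(cleaned))  # [('a', 'x')]
```

Because the mask is built from values, the same call is intended to cover an index built with NaN inside its levels, though the unused NaN entry may remain in the levels of the result.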

I am using Pandas 0.20.3, NumPy 1.13.1, and Python 3.5.

@jreback

Contributor

commented Jan 25, 2018

I believe this is fixed in 0.22 or master.
Please give it a try.

@howsiwei

Contributor

commented May 15, 2019

@jreback I just checked and it's fixed in neither 0.22 nor master.
