DataFrame.fillna corrupts columns with duplicated names #12344

Closed
hantusk opened this Issue Feb 16, 2016 · 3 comments


hantusk commented Feb 16, 2016

# pandas version 0.17.1
import numpy as np
import pandas as pd

df = pd.DataFrame({'Same': 1.0, ' Same': np.nan, '  Same': np.nan}, index=[0, 1, 2])
# Stripping whitespace leaves all three columns named 'Same'
df.columns = [c.strip() for c in df.columns]
df.iloc[:, 2]  # Returns all 1.0

# Positional assignment that should only touch column 0
df.iloc[:, 0] = df.iloc[:, 0].fillna(df.iloc[:, 1])

df.iloc[:, 2]  # Column 2 is corrupted and now returns all NaN
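
The report above can be turned into a small positional check. This is only a sketch: `make_frame` and `untouched_column_survives` are hypothetical names introduced here, not pandas functions, and the frame is built from a list of rows so that the column layout does not depend on how a given pandas version orders dict keys.

```python
import numpy as np
import pandas as pd


def make_frame():
    # Three columns that share one label: positions 0 and 1 are all NaN,
    # position 2 holds 1.0 (the layout described in the report).
    data = [[np.nan, np.nan, 1.0]] * 3
    return pd.DataFrame(data, columns=["Same", "Same", "Same"], index=[0, 1, 2])


def untouched_column_survives(df):
    # Hypothetical helper: snapshot column 2, run the assignment from the
    # report, then compare column 2 positionally with the snapshot.
    before = df.iloc[:, 2].copy()
    df.iloc[:, 0] = df.iloc[:, 0].fillna(df.iloc[:, 1])
    after = df.iloc[:, 2]
    return before.equals(after)


df = make_frame()
# A result of False would indicate the corruption described in this issue.
print(untouched_column_survives(df))
```

Comparing by position with `iloc` rather than by label matters here, because selecting `df['Same']` returns all three columns at once.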

jreback added this to the Next Major Release milestone Feb 16, 2016

Contributor

jreback commented Feb 16, 2016

Hmm, that appears to be the case. Pull requests to investigate are welcome. Duplicate column names should work, though support for them is not as complete as for the more standard case of unique names.
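
Until duplicate names are handled reliably, one possible workaround is to make the labels unique before assigning. The sketch below is an assumption about how that might look, not something proposed in this thread: `with_unique_columns` is a hypothetical helper, and its suffix scheme (`Same`, `Same.1`, `Same.2`) could collide with a column that already carries such a name.

```python
import numpy as np
import pandas as pd


def with_unique_columns(df):
    # Hypothetical helper: return a copy in which the second and later
    # occurrences of a label get a numeric suffix (Same, Same.1, Same.2).
    seen = {}
    new_names = []
    for name in df.columns:
        count = seen.get(name, 0)
        new_names.append(name if count == 0 else "{}.{}".format(name, count))
        seen[name] = count + 1
    out = df.copy()
    out.columns = new_names
    return out


df = pd.DataFrame([[np.nan, np.nan, 1.0]] * 3, columns=["Same"] * 3)
print(df.columns.is_unique)  # False

safe = with_unique_columns(df)
# With unique labels, ordinary label-based assignment targets one column only.
safe["Same"] = safe["Same"].fillna(safe["Same.1"])
print(list(safe.columns))
```

Checking `df.columns.is_unique` first makes it cheap to decide whether the renaming step is needed at all.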

Member

gfyoung commented Feb 22, 2016

@jreback: Where would the potentially offending code be located?

Contributor

jreback commented Feb 22, 2016

core/generic

gfyoung added a commit to gfyoung/pandas that referenced this issue Mar 6, 2016

BUG: Allow assignment by indexing with duplicate column names
Closes gh-12344.
7265d29

jreback modified the milestone: 0.18.0, Next Major Release Mar 6, 2016

jreback closed this in a174898 Mar 6, 2016
