engine='python' df.to_csv() duplicate columns doesn't work #3457

Closed
ghost opened this issue Apr 25, 2013 · 7 comments
Labels
Bug IO Data IO issues that don't fit into a more specific label

ghost commented Apr 25, 2013

Example from 0.10.1; the same behavior carries over to to_csv(engine='python') in 0.11.

Note that the first data column duplicates the second, which it shouldn't.

Corollary #3454

from pandas.util.testing import makeCustomDataframe as mkdf

N = 10
df = mkdf(N, 3)
df.columns = ['a', 'a', 'b']  # duplicate column label 'a'
path = "/tmp/k.csv"
df.to_csv(path)
!cat "/tmp/k.csv"
R0,a,a,b
R_l0_g0,R0C1,R0C1,R0C2
R_l0_g1,R1C1,R1C1,R1C2
R_l0_g10,R2C1,R2C1,R2C2
R_l0_g2,R3C1,R3C1,R3C2
R_l0_g3,R4C1,R4C1,R4C2
R_l0_g4,R5C1,R5C1,R5C2
R_l0_g5,R6C1,R6C1,R6C2
R_l0_g6,R7C1,R7C1,R7C2
R_l0_g7,R8C1,R8C1,R8C2
R_l0_g8,R9C1,R9C1,R9C2
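
A minimal sketch of a round-trip check for the behavior above, assuming a reasonably recent pandas. The small frame and the helper name `written_rows` are stand-ins for illustration, not the `mkdf` frame from the report and not part of pandas:

```python
import csv
import io

import pandas as pd


def written_rows(frame):
    """Write `frame` with to_csv into memory and return the parsed CSV rows."""
    buf = io.StringIO()
    frame.to_csv(buf)
    return list(csv.reader(io.StringIO(buf.getvalue())))


# Stand-in frame with a duplicated column label 'a'.
df = pd.DataFrame(
    [["R0C0", "R0C1", "R0C2"], ["R1C0", "R1C1", "R1C2"]],
    index=["r0", "r1"],
    columns=["a", "a", "b"],
)

rows = written_rows(df)
header, first_row = rows[0], rows[1]

# With the bug described above, the first data column would repeat the
# second one (R0C1, R0C1, R0C2) instead of keeping its own values.
print(header)
print(first_row)
```

If the two 'a' columns are written correctly, the first data row keeps three distinct values rather than repeating the second column.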

jreback commented Apr 25, 2013

added #3459, should close this one and #3455

In [5]: df.to_csv('/tmp/dups')

In [6]: !cat "/tmp/dups"
R0,a,a,b
R_l0_g0,R0C0,R0C1,R0C2
R_l0_g1,R1C0,R1C1,R1C2
R_l0_g2,R2C0,R2C1,R2C2
R_l0_g3,R3C0,R3C1,R3C2
R_l0_g4,R4C0,R4C1,R4C2
R_l0_g5,R5C0,R5C1,R5C2
R_l0_g6,R6C0,R6C1,R6C2
R_l0_g7,R7C0,R7C1,R7C2
R_l0_g8,R8C0,R8C1,R8C2
R_l0_g9,R9C0,R9C1,R9C2

In [8]: pd.read_csv('/tmp/dups',index_col=0)
Out[8]: 
            a   a.1     b
R0                       
R_l0_g0  R0C0  R0C1  R0C2
R_l0_g1  R1C0  R1C1  R1C2
R_l0_g2  R2C0  R2C1  R2C2
R_l0_g3  R3C0  R3C1  R3C2
R_l0_g4  R4C0  R4C1  R4C2
R_l0_g5  R5C0  R5C1  R5C2
R_l0_g6  R6C0  R6C1  R6C2
R_l0_g7  R7C0  R7C1  R7C2
R_l0_g8  R8C0  R8C1  R8C2
R_l0_g9  R9C0  R9C1  R9C2

jreback commented Apr 25, 2013

FYI, I'm not sure there is a way to make read_csv not rename the duplicate column to 'a.1'. Specifying names=[] doesn't work. Is this a bug?
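
For reference, a small sketch of the renaming and one possible workaround: read the header row separately and reassign the labels after parsing. This assumes a reasonably recent pandas; the inline CSV text is a shortened stand-in for /tmp/dups, and reassigning `df.columns` is just one option, not something proposed in this thread:

```python
import csv
import io

import pandas as pd

# Shortened stand-in for the file written above.
csv_text = (
    "R0,a,a,b\n"
    "R_l0_g0,R0C0,R0C1,R0C2\n"
    "R_l0_g1,R1C0,R1C1,R1C2\n"
)

df = pd.read_csv(io.StringIO(csv_text), index_col=0)
mangled = list(df.columns)  # the second 'a' comes back as 'a.1'

# Read the raw header with the csv module and put the original labels back,
# skipping the first field, which is the index name.
raw_header = next(csv.reader(io.StringIO(csv_text)))
df.columns = raw_header[1:]
restored = list(df.columns)

print(mangled)
print(restored)
```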

ghost commented Apr 25, 2013

It may not be clean, but I think it's by design. It probably predates dupe column support:
a quick way to allow data with dupe columns from an external source to be imported into pandas.

Not a big deal IMO.

jreback commented Apr 25, 2013

closed by #3459

jreback closed this as completed Apr 25, 2013
ghost reopened this Apr 25, 2013

ghost commented Apr 25, 2013

I believe not. Forgot the engine='python' bit. Leave it raising a warning/exception?

jreback commented Apr 25, 2013

Ahh, I see about engine='python'. Why don't we just raise on that, seeing as we fixed the regular engine?
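
A rough sketch of what "just raise" could look like. The helper name `ensure_unique_columns` and the choice of `NotImplementedError` are assumptions for illustration only, not the actual pandas code; the one real pandas call used is `Index.is_unique`:

```python
import pandas as pd


def ensure_unique_columns(frame):
    """Hypothetical guard: refuse to write frames with duplicate column labels."""
    if not frame.columns.is_unique:
        raise NotImplementedError(
            "duplicate column labels are not supported on this code path"
        )


dup = pd.DataFrame([[1, 2, 3]], columns=["a", "a", "b"])
ok = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "c"])

ensure_unique_columns(ok)  # unique labels pass silently

try:
    ensure_unique_columns(dup)
    raised = False
except NotImplementedError:
    raised = True

print(raised)
```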

ghost commented Apr 25, 2013

Fine by me. #3458 does that

ghost closed this as completed Apr 25, 2013