.to_excel() cuts off columns #10982

Closed
balzer82 opened this Issue Sep 3, 2015 · 2 comments

balzer82 commented Sep 3, 2015

Today I discovered a strange behavior: when I write a DataFrame with .to_excel(), it cuts off columns. Compared with the output of .to_csv() or .head() on the same DataFrame, the last 8 columns are missing.

You can reproduce this by downloading Features.pkl from here and then:

import pandas as pd
df = pd.read_pickle('Features.pkl')
df.head()  # note the last 8 columns
df.to_excel('Features.xlsx', index=False, header=False)
# open the Excel file: the last 8 columns are missing
# with .to_csv() they are present

Funny part: if you run df.ix[:,-71:].to_excel('Features.xlsx', index=False, header=False), one of the missing columns appears. If you run df.ix[:,-70:].to_excel('Features.xlsx', index=False, header=False), two appear, and so on...

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8

pandas: 0.16.2
nose: 1.3.7
Cython: 0.22.1
numpy: 1.9.2
scipy: 0.16.0
statsmodels: 0.6.1
IPython: 3.2.0
sphinx: 1.3.1
patsy: 0.3.0
dateutil: 2.4.2
pytz: 2015.4
bottleneck: 1.0.0
tables: 3.2.0
numexpr: 2.4.3
matplotlib: 1.4.3
openpyxl: 1.8.6
xlrd: 0.9.3
xlwt: 1.0.0
xlsxwriter: 0.7.3
lxml: 3.4.4
bs4: 4.3.2
html5lib: None
httplib2: 0.9
apiclient: None
sqlalchemy: 1.0.5
pymysql: None
psycopg2: None

dsm054 (Contributor) commented Sep 3, 2015

In an odd coincidence, I think this is another Excel-dup-column issue:

>>> len(df.columns)
380
>>> len(set(df.columns))
372

and there are eight columns missing. By changing the window into df, you change which columns (if any) are duplicated.
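The check above can be reproduced without Features.pkl. The following is a minimal sketch using a small made-up DataFrame (the column names `a`, `b`, `c` are placeholders, not taken from the original file); it assumes a pandas version in which `Index.duplicated` accepts the `keep` argument.

```python
import pandas as pd

# Hypothetical stand-in for Features.pkl: four columns, one name repeated.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "b", "a", "c"])

n_total = len(df.columns)        # all column labels, duplicates included
n_unique = len(set(df.columns))  # distinct labels only

# keep=False marks every occurrence of a repeated label, not just the later ones.
dup_mask = df.columns.duplicated(keep=False)
dup_names = sorted(set(df.columns[dup_mask]))

# Slicing changes which labels collide: the last two columns are 'a' and 'c',
# so this window contains no duplicates at all.
window = df.iloc[:, -2:]
window_is_unique = window.columns.is_unique

print(n_total, n_unique, dup_names, window_is_unique)
```

A gap between `n_total` and `n_unique` equal to the number of missing columns, as in the 380 versus 372 case, is a strong hint that duplicate labels are involved.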

balzer82 commented Sep 4, 2015

That's it! There are columns with exactly the same name at this position [-70:]. So to_excel does not export duplicated column names?
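The thread does not settle whether to_excel is meant to handle duplicated names. One possible workaround, sketched here under that uncertainty, is to make the column labels unique before exporting. `make_unique` is a hypothetical helper written for this example, not a pandas function, and the sample DataFrame is made up.

```python
import pandas as pd


def make_unique(names):
    """Hypothetical helper: append .1, .2, ... to repeated labels.

    Assumption: no existing label already looks like 'name.1';
    this sketch does not guard against that kind of collision.
    """
    seen = {}
    result = []
    for name in names:
        count = seen.get(name, 0)
        result.append(name if count == 0 else "{}.{}".format(name, count))
        seen[name] = count + 1
    return result


# Made-up data with one label repeated three times.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "b", "a", "a"])

df_unique = df.copy()
df_unique.columns = make_unique(df.columns)

print(list(df_unique.columns))
# With unique labels in place, the export from the report could be retried:
# df_unique.to_excel('Features.xlsx', index=False, header=False)
```

The data itself is untouched; only the labels change, so the number of columns going into the export stays the same.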
