MultiIndex DataFrame to_csv() terminates python process #26303

Closed
miburk opened this issue May 7, 2019 · 2 comments

Comments

@miburk

commented May 7, 2019

Code Sample, a copy-pastable example if possible

import traceback
import pandas as pd

# index is actually a MultiIndex
index = pd.Index([(1,), (2,), (3,)])
data = pd.DataFrame([[1, 2, 3]], columns=index)
data = data.reindex(columns=[(1,), (3,)])

try:
    # This call fails with a TypeError.
    data.to_csv('crash.csv')
except TypeError as err:
    traceback.print_exc()

# This print seems to be essential to trigger an immediate crash of the process
# in the following .to_csv call. The crash happens also if the preceding
# try/except block is removed.
print(data)
data.to_csv('crash.csv')

Problem description

The Python code above shows two, maybe three errors.
(1) The first call to DataFrame.to_csv raises a TypeError. This is uninformative at best. Printing the DataFrame works, so it should be possible to export it to CSV.
(2) The second call to DataFrame.to_csv terminates Python with exit code -1073741819 (0xC0000005).
(3) The print call seems to trigger this crash, even though I would expect print not to alter the printed data structure.
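
For readers wondering how the two exit-code spellings in (2) relate: the negative number is the same 32-bit value read as a signed integer, and 0xC0000005 is the Windows status code for an access violation (a segfault). A minimal sketch of the conversion follows; the helper name to_unsigned_32 is made up for illustration and is not part of pandas or the standard library.

```python
def to_unsigned_32(exit_code: int) -> int:
    """Reinterpret a signed 32-bit exit code as an unsigned 32-bit value."""
    # Masking with 0xFFFFFFFF keeps the low 32 bits, which maps negative
    # values onto their two's-complement unsigned equivalent.
    return exit_code & 0xFFFFFFFF


# Exit code reported in this issue.
code = to_unsigned_32(-1073741819)
print(hex(code))  # 0xc0000005
```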

The Windows event log shows that the error occurred in lib\site-packages\pandas\_libs\writers.cp36-win_amd64.pyd.

Expected Output

Because the DataFrame is printable on the console, I would expect a successful call of to_csv().
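
Until a fix is available, one possible workaround is to flatten the one-level MultiIndex columns into a plain Index before reindexing and exporting. This is only a sketch under the assumption that losing the tuple wrapping of the labels is acceptable; it is not the fix from the linked PR, and the helper name flatten_single_level_columns is hypothetical. It writes to an in-memory buffer instead of crash.csv so it can be run without touching the disk.

```python
import io

import pandas as pd


def flatten_single_level_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose one-level MultiIndex columns become a plain Index.

    Hypothetical helper for illustration only; frames with other kinds of
    columns are returned unchanged.
    """
    if isinstance(frame.columns, pd.MultiIndex) and frame.columns.nlevels == 1:
        frame = frame.copy()
        frame.columns = frame.columns.get_level_values(0)
    return frame


# Same construction as in the report: a list of 1-tuples yields a MultiIndex.
index = pd.Index([(1,), (2,), (3,)])
data = pd.DataFrame([[1, 2, 3]], columns=index)

# Flatten first, then reindex with plain labels instead of 1-tuples.
flat = flatten_single_level_columns(data).reindex(columns=[1, 3])

buffer = io.StringIO()
flat.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The sketch deliberately avoids calling to_csv on the reindexed MultiIndex frame, since that is the call that crashes on the affected version.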

Output of pd.show_versions()

Python Version: Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 03:37:03) [MSC v.1900 64 bit (AMD64)] on win32

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.24.2
pytest: None
pip: 10.0.1
setuptools: 39.1.0
Cython: None
numpy: 1.15.4
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.0.1
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.6.1
xlrd: 1.2.0
xlwt: None
xlsxwriter: None
lxml.etree: 4.3.0
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@WillAyd

Member

commented May 8, 2019

Strange indeed, though this segfaulted for me during the try...except block.

Investigation and PRs are always welcome!

@SummerGram


commented May 10, 2019

I will do this. Assign it to me please.

@mahepe mahepe referenced this issue May 12, 2019

Merged

BUG: Fix MultiIndex DataFrame to_csv() segfault #26355

4 of 4 tasks complete

mahepe pushed a commit to mahepe/pandas that referenced this issue May 12, 2019

mahepe pushed a commit to mahepe/pandas that referenced this issue May 13, 2019

@jreback jreback modified the milestones: Contributions Welcome, 0.25.0 May 16, 2019

jreback added a commit that referenced this issue May 16, 2019
