df.to_csv ignores compression when provided with a file handle #21227

sbarlock opened this issue May 28, 2018 · 2 comments



commented May 28, 2018

Code at issue

with open(outputfile, "w") as file_handle:
    df2.to_csv(file_handle, compression='gzip')

Problem description

The written file is not gzip compressed.

Expected Output

The output should be a gzip-compressed CSV file, similar to what is obtained when using:



commented May 29, 2018

Maybe related to #21144



commented May 29, 2018

>>> import os
>>> import pandas as pd
>>> from pandas import *
>>> pd.__version__
>>> df = DataFrame(100 * [[123, 234, 435]])
>>> with open('test_compressed', 'w') as fh:
...     df.to_csv(fh, compression='gzip')
>>> fh_size = os.path.getsize('test_compressed')
>>> df.to_csv('test_compressed', compression='gzip')
>>> f_size = os.path.getsize('test_compressed')
>>> os.remove('test_compressed')
>>> assert fh_size == f_size
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>

This looks like existing behaviour dating back to version 0.20 or earlier.

Actually, the documentation of compression= says:

compression : string, optional

A string representing the compression to use in the output file. Allowed values are ‘gzip’, ‘bz2’, ‘zip’, ‘xz’. This input is only used when the first argument is a filename.

So this is not currently supported, but perhaps it could be a new use case?
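In the meantime, a possible workaround is to open the gzip stream yourself and hand that handle to the writer, so the compression happens in the file object rather than in to_csv. This is only a minimal sketch: the helper names write_gzipped and is_gzip are made up for illustration, and the demo uses a plain string writer so it runs without pandas. With a DataFrame it should be enough to pass df.to_csv as the callback.

```python
import gzip
import os
import tempfile


def write_gzipped(write_text, path):
    """Open `path` as a gzip text stream and let `write_text` fill it.

    `write_text` is any callable taking a writable text handle,
    e.g. a bound `df.to_csv` (hypothetical helper, not a pandas API).
    """
    # newline="" leaves line endings to whatever the writer emits
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        write_text(handle)


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as raw:
        return raw.read(2) == b"\x1f\x8b"


# Stand-in for the CSV text that a DataFrame would produce
rows = "\n".join(["123,234,435"] * 100) + "\n"
path = os.path.join(tempfile.mkdtemp(), "test_compressed.csv.gz")
write_gzipped(lambda handle: handle.write(rows), path)

# With pandas, assuming a DataFrame named df:
#     write_gzipped(df.to_csv, "test_compressed.csv.gz")
```

Checking the first two bytes with is_gzip is a quick way to confirm whether the file written through a handle was actually compressed, which is more direct than comparing file sizes.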
