ENH: DataFrame.to_csv support for "compression='gzip'" #7615
Would accept a pull-request to limit this.
jreback
added CSV API Design
labels
Jun 30, 2014
jreback
added this to the
Someday
milestone
Jun 30, 2014
mmautner
commented
Jun 30, 2014
I think this is out of scope for Pandas; just use this: https://docs.python.org/2/library/gzip.html. Please close.
francescomalandrino
commented
Jun 30, 2014
But compression='gzip' is accepted (and enacted) in pd.read_csv, which is why I assumed to_csv would behave the same.
mmautner
commented
Jun 30, 2014
The way you initially phrased the issue suggested that you were just guessing at keyword arguments. 'compression' isn't a documented argument, so I don't think your confusion is shared by many. You're welcome to submit a pull-request; I don't feel religious about this at all.
francescomalandrino
commented
Jun 30, 2014
Sorry, what I meant is:
There is no error in the documentation and both (1) and (2) make sense to me. Closing on the grounds that I won't be fixing it myself, and it's probably not a proper bug.
francescomalandrino
closed this
Jun 30, 2014
mmautner
commented
Jun 30, 2014
Thanks! I definitely didn't mean to antagonize you. Agreed that it's an unfortunate inconsistency.
jorisvandenbossche
changed the title from
DataFrame.to_csv accepts but does not enact "compression='gzip'" to ENH: DataFrame.to_csv support for "compression='gzip'"
Jul 1, 2014
jorisvandenbossche
added Enhancement and removed API Design
labels
Jul 1, 2014
jorisvandenbossche
reopened this
Jul 1, 2014
Would we want this feature if someone implemented it? If so, can we leave it open, marked as an enhancement proposal?
msevrens
commented
Aug 22, 2014
I would also like to_csv to have the same functionality as from_csv.
rmorgans
referenced
this issue
Mar 1, 2015
Open
API: read_csv,from_csv/to_csv keyword consistency #9568
+1 for a compression argument. In Python 3.4, I use the following workaround:
with gzip.open('path_to_file', 'wt') as write_file:
    data_frame.to_csv(write_file)
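The gzip workaround above could be wrapped into a small helper, sketched here. The name write_gzipped_csv is hypothetical, and the only assumption about frame is that it exposes a to_csv method that accepts an open text handle, as a DataFrame does.

```python
import gzip


def write_gzipped_csv(frame, path):
    """Write `frame` as gzip-compressed CSV (hypothetical helper).

    `frame` is assumed to be a DataFrame, or any object whose
    to_csv method accepts an open text-mode file handle.
    """
    # 'wt' opens the gzip stream in text mode; newline="" leaves the
    # line endings produced by to_csv untouched.
    with gzip.open(path, "wt", encoding="utf-8", newline="") as handle:
        frame.to_csv(handle)
```

With a DataFrame named data_frame, write_gzipped_csv(data_frame, 'test.csv.gz') should produce a file that gzip.open or pd.read_csv(..., compression='gzip') can read back.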
@dhimmel If you're interested in putting in the work, I think we're still open to a PR to add this feature.
@shoyer, okay, I will keep this in mind. I have a bit to learn first.
This was referenced Oct 2, 2015
jreback
modified the milestone: 0.17.1, Someday
Oct 5, 2015
jreback
closed this
in #11219
Oct 12, 2015
rbrito
commented
Oct 20, 2015
Thank you so much for implementing this! Beyond the aesthetics and fixing the asymmetry between read and write, this is a huge improvement for people like me.
jsmedmar
commented
Jun 30, 2016
This did not work for me: the output file isn't compressed. I'm using Pandas 0.18.1.
@jsmedmar could you open a new issue demonstrating the problem? Thanks.
indera
commented
Oct 18, 2016
@jsmedmar I see the "compression" argument is properly documented and it is working. One confusing thing is that if you run the following code
you get a compressed file, but opening it in vim automatically decompresses it, so to verify that compression happened, use the "head" command:
$ head test.csv.gz
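A check that does not depend on any viewer is to read the first two bytes of the file: gzip streams start with the magic bytes 0x1f 0x8b. The following is a minimal sketch; looks_gzipped is a hypothetical helper name, not part of pandas.

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    # Read raw bytes so that nothing decompresses the file transparently.
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC
```

Calling looks_gzipped('test.csv.gz') on a plain-text CSV that merely carries a .gz extension returns False, which is the situation described in the original report.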
francescomalandrino
commented
Jun 30, 2014
The DataFrame.to_csv method seems to accept a "compression" named parameter:
However, the file it creates is not compressed at all:
francesco@i3 ~/Desktop $ cat test.csv.gz
,a,b
0,0,1
1,2,3
2,4,5
3,6,7
4,8,9
How about either (i) actually implementing compression, or at least (ii) raising an error? The current behavior is confusing.