Fix writing compressed CSVs in python3 #188

Merged
cpcloud merged 1 commit into blaze:master from cpcloud:compression-encoding on Apr 30, 2015

2 participants
@cpcloud
Member

cpcloud commented Apr 30, 2015

closes #187

@cpcloud cpcloud self-assigned this Apr 30, 2015

@cpcloud cpcloud added this to the 0.3.3 milestone Apr 30, 2015

@cpcloud cpcloud added the bug label Apr 30, 2015

@cpcloud

Member

cpcloud commented Apr 30, 2015

@TomAugspurger look ok to you?

@@ -91,7 +92,10 @@ def append_dataframe_to_csv(c, df, dshape=None, **kwargs):
    encoding=kwargs.get('encoding', c.encoding)
    if ext(c.path) in compressed_open:
        f = compressed_open[ext(c.path)](c.path, mode='a')
        kwargs = dict(mode='at')
        if sys.version_info[0] >= 3:

@TomAugspurger

TomAugspurger Apr 30, 2015

sys.version_info is a named tuple, so you can do sys.version_info.major if you think that's cleaner

@cpcloud

cpcloud Apr 30, 2015

Member

alas, we have to support python 2.6, which doesn't implement sys.version_info as a namedtuple:

Python 2.6.9 |Continuum Analytics, Inc.| (unknown, Jan 10 2014, 13:33:57)
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.version_info
(2, 6, 9, 'final', 0)
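
A minimal sketch of the index-based check discussed above. Positional access works whether `sys.version_info` is a plain tuple (as on Python 2.6) or a named tuple (2.7 and later). The helper name `is_python3` is hypothetical and is not part of odo.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True on Python 3 or later, using only positional access."""
    # Index 0 holds the major version in both the plain-tuple and the
    # named-tuple form; the .major attribute only exists from 2.7 onward.
    return version_info[0] >= 3


# A plain tuple, like the one Python 2.6 reports, is handled the same way.
print(is_python3((2, 6, 9, 'final', 0)))
print(is_python3(sys.version_info))
```

The trade-off is readability: `version_info[0]` is slightly more cryptic than `.major`, but it is the only spelling that runs on every interpreter the project targets.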

        f.write('a,1\nb,2\nc,3')
    with tmpfile('.csv.gz') as gfn:
        result = odo(fn, gfn)
        assert odo(result, list) == odo(fn, list)

@TomAugspurger

TomAugspurger Apr 30, 2015

Do these need to be seeked back to the start? I don't know how odo handles opened files. Just worried you may be comparing 2 empty lists.

@cpcloud

cpcloud Apr 30, 2015

Member

We're going straight from the filename like your example, so there's nothing to rewind.

@cpcloud

cpcloud Apr 30, 2015

Member

In other words, we don't open the file handle until we're ready to convert. After we convert, the file handle is closed. We can call convert over and over without worrying about having a ton of files open.
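
The open-late, close-early pattern described here can be sketched with the standard library alone. This is not odo's implementation: `write_gz_csv` and `read_gz_csv` are hypothetical helpers that take a filename, open their own handle, and close it before returning, so nothing needs rewinding between calls. Opening the gzip file in text mode with an explicit encoding mirrors the Python 3 concern of this pull request.

```python
import csv
import gzip
import os
import tempfile


def write_gz_csv(path, rows, encoding='utf-8'):
    """Write rows to a gzip-compressed CSV; the handle lives only in this call."""
    # Text mode ('wt') lets csv.writer write str objects on Python 3.
    with gzip.open(path, mode='wt', encoding=encoding, newline='') as f:
        csv.writer(f).writerows(rows)


def read_gz_csv(path, encoding='utf-8'):
    """Read every row back; a freshly opened handle starts at the beginning."""
    with gzip.open(path, mode='rt', encoding=encoding, newline='') as f:
        return list(csv.reader(f))


rows = [['a', '1'], ['b', '2'], ['c', '3']]
fd, path = tempfile.mkstemp(suffix='.csv.gz')
os.close(fd)  # only the name is needed; each helper opens its own handle
try:
    write_gz_csv(path, rows)
    first = read_gz_csv(path)
    second = read_gz_csv(path)  # repeated reads need no seek(0)
finally:
    os.remove(path)
```

Because no handle outlives the call that opened it, the comparison in the test cannot accidentally read from an exhausted file.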

@TomAugspurger

TomAugspurger Apr 30, 2015

Gotcha. I read tmpfile as a TemporaryFile from the standard lib. Looks good!
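
The distinction behind the confusion can be sketched as follows. `tempfile.TemporaryFile` returns an open handle whose position must be rewound before reading, whereas a filename-yielding helper hands out only a path. The `tmpfile` below is a hypothetical stand-in written for illustration, assuming the helper yields a name and cleans up on exit; it is not odo's actual code.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def tmpfile(extension=''):
    """Yield a temporary filename and remove the file on exit (illustrative only)."""
    fd, path = tempfile.mkstemp(suffix=extension)
    os.close(fd)  # hand out the name, not an open handle
    try:
        yield path
    finally:
        if os.path.exists(path):
            os.remove(path)


# With an open handle, the position stays at the end after writing.
with tempfile.TemporaryFile(mode='w+') as handle:
    handle.write('a,1\nb,2\nc,3')
    unrewound = handle.read()  # reads from the current position
    handle.seek(0)
    rewound = handle.read()

# With a filename, every open() starts at the beginning.
with tmpfile('.csv') as fn:
    with open(fn, 'w') as f:
        f.write('a,1\nb,2\nc,3')
    with open(fn) as f:
        from_name = f.read()
```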

@cpcloud

Member

cpcloud commented Apr 30, 2015

cool, going to give it a go on Windows and will merge if that's successful

@cpcloud

Member

cpcloud commented Apr 30, 2015

looks good on Windows, merging

cpcloud added a commit that referenced this pull request Apr 30, 2015

Merge pull request #188 from cpcloud/compression-encoding
Fix writing compressed CSVs in python3

@cpcloud cpcloud merged commit 00cdf74 into blaze:master Apr 30, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@cpcloud cpcloud deleted the cpcloud:compression-encoding branch Apr 30, 2015