Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

delimiter in CSV writer should support unicode #1949

Merged
merged 4 commits into from Jan 15, 2014

Conversation

taldcroft
Copy link
Member

I think the following should probably be made to work:

In [6]: t.write('test.csv', format='ascii', delimiter=u',')
ERROR: TypeError: "delimiter" must be an 1-character string [astropy.io.ascii.core]
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-1f963bd3edbd> in <module>()
----> 1 t.write('test.csv', format='ascii', delimiter=u',')

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/table/table.pyc in write(self, *args, **kwargs)
   1732         passed through to the underlying data reader (e.g. `~astropy.io.ascii.ui.write`).
   1733         """
-> 1734         io_registry.write(self, *args, **kwargs)
   1735 
   1736     def copy(self, copy_data=True):

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/io/registry.pyc in write(data, *args, **kwargs)
    356 
    357     writer = get_writer(format, data.__class__)
--> 358     writer(data, *args, **kwargs)
    359 
    360 

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/io/ascii/connect.pyc in write_asciitable(table, filename, **kwargs)
     23 def write_asciitable(table, filename, **kwargs):
     24     from .ui import write
---> 25     return write(table, filename, **kwargs)
     26 
     27 io_registry.register_writer('ascii', Table, write_asciitable)

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/io/ascii/ui.pyc in write(table, output, format, Writer, **kwargs)
    293     Writer = _get_format_class(format, Writer, 'Writer')
    294     writer = get_writer(Writer=Writer, **kwargs)
--> 295     lines = writer.write(table)
    296 
    297     # Write the lines to output

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/io/ascii/core.pyc in write(self, table)
    914         # Write header and data to lines list
    915         lines = []
--> 916         self.header.write(lines)
    917         self.data.write(lines)
    918 

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/io/ascii/core.pyc in write(self, lines)
    414                                        itertools.cycle(self.write_spacer_lines)):
    415                 lines.append(spacer_line)
--> 416             lines.append(self.splitter.join([x.name for x in self.cols]))
    417 
    418     @property

/Users/tom/Library/Python/2.7/lib/python/site-packages/astropy-0.4.dev6923-py2.7-macosx-10.8-x86_64.egg/astropy/io/ascii/core.pyc in join(self, vals)
    294                                          quotechar=self.quotechar,
    295                                          quoting=self.quoting,
--> 296                                          lineterminator='',
    297                                          )
    298         self.csv_writer_out.seek(0)

TypeError: "delimiter" must be an 1-character string

I found this in a script that was importig unicode_literals from __future__

cc @taldcroft

@taldcroft
Copy link
Member

@astrofrog - code attached.

I've taken the kludgey step of copying two key test files and putting a future unicode_literals import at the top. As discussed previously with @mdboom for astropy.table, there isn't any totally obvious way to do this cleanly, through perhaps there is some clever solution possible. This isn't a future-proof solution, but it does demonstrate that io.ascii should now work with unicode literals in place.

The main issue was related csv which is explicitly not unicode-aware, according to the docs. I also found a mistake in the way an input was tested for str-ness.

@mdboom
Copy link
Contributor

mdboom commented Jan 15, 2014

This looks good to me. Note that we aren't really "supporting unicode" here -- this will still fail with non-ascii characters -- but that's a pretty corner case -- hopefully no one is using unicode delimiters. Ideally, we would encode the delimiter etc. in the same encoding as whatever lines is in, but I don't think we (can) know what that is, so this is probably good enough.

Needs a CHANGES entry, but other than that I think this is good to merge. Let's just leave the duplicated test_*_unicode files in there for now, and I will take them out again as part of #1962.

taldcroft added a commit that referenced this pull request Jan 15, 2014
Allow passing unicode delimiters in io.ascii read and write
@taldcroft taldcroft merged commit 4210065 into astropy:master Jan 15, 2014
@taldcroft taldcroft deleted the ascii/uni-lit branch January 15, 2014 16:41
taldcroft added a commit that referenced this pull request Feb 12, 2014
Allow passing unicode delimiters in io.ascii read and write
taldcroft added a commit that referenced this pull request Feb 12, 2014
Allow passing unicode delimiters in io.ascii read and write
@embray
Copy link
Member

embray commented Feb 12, 2014

I merged this into the v0.3.x branch, and several of these tests are failing there. Not sure why yet but am investigating--any ideas?

@taldcroft
Copy link
Member

No obvious ideas. Is there a link to the test log we can look at?

embray added a commit that referenced this pull request Feb 12, 2014
…ifferences likely due to slight implementation differences between the two versions. Now all tests are passing except some of those introduced by #1949.  Still trying to understand the root of that problem.
@embray
Copy link
Member

embray commented Feb 12, 2014

I just fixed a few other small issues in the release branch, and narrowed it down to the tests from this PR that aren't working. Travis should have a build log in a bit.

@embray
Copy link
Member

embray commented Feb 12, 2014

Here we go: https://travis-ci.org/astropy/astropy/jobs/18745796

The doctest failures are unrelated.

@embray
Copy link
Member

embray commented Feb 14, 2014

Nevermind, I understand the issue now: The original version of this PR, as you wrote, added test_read_unicode.py and test_write_unicode.py as copies of the non-unicode_literals versions of those test modules (with a few other slight tweaks). Then the PR was backported those new test modules were added directly to the v0.3.x branch, even though they tested for features/fixes that are unique to the master branch. I redid the merge of this PR by using the versions of test_read.py and test_write.py from the v0.3.x branch and applying the appropriate changes in new versions of test_read_unicode.py and test_write_unicode.py.

taldcroft added a commit that referenced this pull request Feb 14, 2014
Allow passing unicode delimiters in io.ascii read and write
embray added a commit that referenced this pull request Feb 14, 2014
…ifferences likely due to slight implementation differences between the two versions. Now all tests are passing except some of those introduced by #1949.  Still trying to understand the root of that problem.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants