Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed a bug that caused newlines on Windows to be \r\r\n in ASCII tables #8659

Merged
merged 2 commits into from May 3, 2019

Conversation

astrofrog
Copy link
Member

@astrofrog astrofrog commented May 3, 2019

This is what happens when I decide to develop on Windows for fun!

On Windows:

import os

with open('test', 'w') as f:
    f.write(os.linesep.join(['1','2','3']))

with open('test', 'rb') as f:
    print(repr(f.read()))

returns:

b'1\r\r\n2\r\r\n3'

Fun! This is because by default Python will 'helpfully' translate \n to \r\n on Windows, so if one already uses os.linesep (\r\n), \r\n gets translated to \r\r\n. Don't ask. This causes #5126.

There are two possible solutions to this - either we always only use \n internally for newlines, or we continue to use os.linesep but we need to then open files with newline='' which disables any auto-translation.

This PR implements the latter. Note that this will still produce the incorrect output if users open files themselves without specifying newline= in the open() call, but if we use the first approach above, we will produce the incorrect output if users pass in other kinds of file-like objects which don't auto-translate newlines (e.g. StringIO).

Fixes #5126
Fixes #5299

@astrofrog astrofrog requested a review from taldcroft May 3, 2019 15:21
@astrofrog astrofrog added this to the v3.2 milestone May 3, 2019
@astrofrog astrofrog requested a review from pllim May 3, 2019 15:23
@codecov
Copy link

codecov bot commented May 3, 2019

Codecov Report

Merging #8659 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #8659   +/-   ##
=======================================
  Coverage   86.95%   86.95%           
=======================================
  Files         399      399           
  Lines       59329    59329           
  Branches     1100     1100           
=======================================
  Hits        51587    51587           
  Misses       7101     7101           
  Partials      641      641
Impacted Files Coverage Δ
astropy/io/ascii/ui.py 96.94% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5f4e3e...9677bf4. Read the comment docs.

Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept looks fine, just some nits.

with open(filename, 'r') as f:
content = f.read()

assert content.splitlines() == ['col', 'a', 'b', 'c']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It takes looking at the docs of splitlines to be sure this test is meaningful. I.e. I think that a\r\r\nb would split into ['a', '', 'b'] but I would not bet on it without trying. What about being more explicit and reading in binary mode and testing against 'col{0}a{0}b{0}c'.format(os.linesep).encode('ascii').

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitlines does split into ['a', '', 'b', '', 'c'] (the test fails before the fix) but I'll go with the more explicit test you mention.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it turns out the open in the test also needs a newline='' as it was automatically changing newlines to just \n!

@astrofrog astrofrog requested a review from taldcroft May 3, 2019 17:44
Copy link
Member

@taldcroft taldcroft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@taldcroft taldcroft merged commit 4166b6a into astropy:master May 3, 2019
bsipocz pushed a commit that referenced this pull request May 6, 2019
Fixed a bug that caused newlines on Windows to be \r\r\n in ASCII tables
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

wrong line terminator in csv and ecsv on windows issue with newline on Windows when writing ASCII table
2 participants