Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Fix line_terminator option #23608

Closed
wants to merge 8 commits into from

Conversation

BenRussert
Copy link

@BenRussert BenRussert commented Nov 9, 2018

  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff

@pep8speaks
Copy link

pep8speaks commented Nov 9, 2018

Hello @BenRussert! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on January 16, 2019 at 06:09 Hours UTC

@BenRussert
Copy link
Author

This line won't ever reach 'os.linesep' if the default argument is '\n'. I noticed the issue because windows to_csv, and to_clipboard write \n when the docs indicate the default windows \r\n

I've only tried the tests locally on ubuntu, but I wonder if this will break tests for people running on windows if \n is hardcoded in the test. (I'll look later if this makes sense)

@codecov
Copy link

codecov bot commented Nov 10, 2018

Codecov Report

Merging #23608 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23608   +/-   ##
=======================================
  Coverage   92.38%   92.38%           
=======================================
  Files         166      166           
  Lines       52358    52358           
=======================================
  Hits        48373    48373           
  Misses       3985     3985
Flag Coverage Δ
#multiple 90.81% <ø> (ø) ⬆️
#single 43.07% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/io/formats/csvs.py 98.22% <ø> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 33f91d8...9897a49. Read the comment docs.

@gfyoung gfyoung added Output-Formatting __repr__ of pandas objects, to_string IO CSV read_csv, to_csv labels Nov 11, 2018
@gfyoung
Copy link
Member

gfyoung commented Nov 11, 2018

@BenRussert : Good start! Definitely going to need some tests.

@jreback
Copy link
Contributor

jreback commented Dec 3, 2018

@BenRussert can you add some tests for this

@BenRussert
Copy link
Author

I've added tests @gfyoung and @jreback, sorry for the delay!

I also found an issue! This closes #11720

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor nits on my end. FYI if you merge master going forward you shouldn't need to force push changes which makes tracking easier

pandas/tests/io/formats/test_to_csv.py Outdated Show resolved Hide resolved
pandas/tests/io/formats/test_to_csv.py Outdated Show resolved Hide resolved
pandas/tests/io/formats/test_to_csv.py Outdated Show resolved Hide resolved
@BenRussert
Copy link
Author

See also #20353, #11720. to_csv issue was fixed in #21406, but the issue remains for windows users in to_clipboard.

@@ -30,7 +30,7 @@ class CSVFormatter(object):
def __init__(self, obj, path_or_buf=None, sep=",", na_rep='',
float_format=None, cols=None, header=True, index=True,
index_label=None, mode='w', nanRep=None, encoding=None,
compression='infer', quoting=None, line_terminator='\n',
compression='infer', quoting=None, line_terminator=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to need a doc-string versionchanged, and an update

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On closer look, #24290 covered this. In fact, the same PR may have corrected the entire issue as long as the CSVFormatter class is always called with line_terminator=None.

I made a minor docs update to reflect that PR, let me know if you had something else in mind.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reference issue is off? in any event, your change looks fine.

@jreback
Copy link
Contributor

jreback commented Dec 14, 2018

is there an associated issue?

@BenRussert
Copy link
Author

is there an associated issue?

#11720 describes the behavior (which may have been fixed by #24290). There's also at least one or two other issues out there that can be closed.

@jreback
Copy link
Contributor

jreback commented Jan 5, 2019

can you merge master

@BenRussert
Copy link
Author

can you merge master

@jreback DId you want me to push a merge commit? Or rebase and force?

@jreback
Copy link
Contributor

jreback commented Jan 12, 2019

the former

@@ -561,3 +561,20 @@ def test_to_csv_compression(self, compression_only,
result = pd.read_csv(path, index_col=0,
compression=read_compression)
tm.assert_frame_equal(result, df)


def test_csv_formatter_line_terminator_default(monkeypatch):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rather than this, can you just modify the appropriate tests to pass in ['\n', os.linesep, None] as parameters (we might already have this)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

modify the appropriate tests to pass

@jreback There isn't an existing test of the CSVFormatter directly.

The intent of the PR is to ensure that if no line_terminator is specified, the default is os.linesep. Currently, CSVFormatter must be called with line_terminator=None or else \n will be used.

When I opened this PR, I was looking at v0.24.3, so I didn't realize the bug had been fixed in #21406 by calling the CSVFormatter class with line_terminator=None. I thought I might as well continue the PR so future devs don't have to remember to call CSVFormatter(line_terminator=None) if they expect the platform specific os.linesep behavior. Since most devs use \n platforms, it would be easy to miss this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this test is necessary and slightly confusing, can you remove.

@@ -30,7 +30,7 @@ class CSVFormatter(object):
def __init__(self, obj, path_or_buf=None, sep=",", na_rep='',
float_format=None, cols=None, header=True, index=True,
index_label=None, mode='w', nanRep=None, encoding=None,
compression='infer', quoting=None, line_terminator='\n',
compression='infer', quoting=None, line_terminator=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reference issue is off? in any event, your change looks fine.

@@ -561,3 +561,20 @@ def test_to_csv_compression(self, compression_only,
result = pd.read_csv(path, index_col=0,
compression=read_compression)
tm.assert_frame_equal(result, df)


def test_csv_formatter_line_terminator_default(monkeypatch):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think this test is necessary and slightly confusing, can you remove.

@WillAyd
Copy link
Member

WillAyd commented Jan 16, 2019

@BenRussert can you update your OP to reflect which issue this addresses? Somewhat confused reading through the comments as to what problem this solves

@BenRussert
Copy link
Author

I'll just close and say that #24290 fixed what I was going after. The default argument vs calling with None is what confused me.

@BenRussert BenRussert closed this Jan 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants