Windows test failure in io.jl #604

Closed
tkelman opened this issue May 14, 2014 · 8 comments

Comments

@tkelman
Contributor

tkelman commented May 14, 2014

I think something is parsing incorrectly in test/data/comments/before_after_data.csv, though oddly it's sometimes a BoundsError and sometimes an error message with numbers of rows and columns from findcorruption.

  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.3.0-prerelease+3019 (2014-05-13 03:06 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit 00db10b (1 day old master)
|__/                   |  x86_64-w64-mingw32

julia> Pkg.installed("DataFrames")
v"0.5.4"

julia> using DataFrames

julia> df1 = readtable(joinpath(Pkg.dir("DataFrames"),"test/data/comments/before_after_data.csv"), allowcomments = true)
ERROR: Saw 7 rows, 3 columns and 25 fields
 * Line 1 has 6 columns

 in error at error.jl:21
 in findcorruption at C:\Users\Tony\.julia\DataFrames\src\dataframe\io.jl:680

julia> include(joinpath(Pkg.dir("DataFrames"),"test","io.jl"))
ERROR: BoundsError()
 in findcorruption at C:\Users\Tony\.julia\DataFrames\src\dataframe\io.jl:663
while loading C:\Users\Tony\.julia\DataFrames\test\io.jl, in expression starting on line 104

julia> df1 = readtable(joinpath(Pkg.dir("DataFrames"),"test/data/comments/before_after_data.csv"), allowcomments = true)
ERROR: Saw 7 rows, 3 columns and 25 fields
 * Line 1 has 6 columns

 in error at error.jl:21
 in findcorruption at C:\Users\Tony\.julia\DataFrames\src\dataframe\io.jl:680

julia> include(joinpath(Pkg.dir("DataFrames"),"test","io.jl"))
Warning: replacing module TestIO
ERROR: Saw 7 rows, 3 columns and 25 fields
 * Line 1 has 6 columns

 in error at error.jl:21
 in findcorruption at C:\Users\Tony\.julia\DataFrames\src\dataframe\io.jl:680
while loading C:\Users\Tony\.julia\DataFrames\test\io.jl, in expression starting on line 104
@tkelman
Contributor Author

tkelman commented Jun 14, 2014

Bump. This is still a problem on DataFrames 0.5.5. Does anyone who understands the readtable parser have access to a Windows machine to test on?

@johnmyleswhite
Contributor

Would be great to find somebody with a Windows machine and interest in working on this. The readtable parser is not that hard and I can walk somebody through it.

@tkelman
Contributor Author

tkelman commented Jun 16, 2014

I'm happy to help work through where in the code this is happening - last time I tried digging into readtable on my own I didn't get far though.

But more importantly, I think I've figured out the underlying cause and how you should be able to reproduce it. There's some interaction between comment parsing and newlines. If I run dos2unix on the following files, then the tests pass:

test/data/comments/before_after_data.csv
test/data/skiplines/skipfront.csv

You should hopefully be able to reproduce the failure on Linux or Mac by running unix2dos on those same files (should be available under a dos2unix homebrew formula).
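For anyone without dos2unix/unix2dos at hand, the line-ending conversion described above can be approximated in a few lines of Python. This is only a rough sketch of what those tools do to a text file, not their actual implementation; `to_crlf`, `to_lf` and `rewrite_file` are hypothetical helper names introduced here for illustration.

```python
import re


def to_crlf(data: bytes) -> bytes:
    """Approximate unix2dos: turn bare LF into CRLF, leaving existing CRLF alone."""
    # Match a "\n" that is not already preceded by "\r".
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


def to_lf(data: bytes) -> bytes:
    """Approximate dos2unix: turn every CRLF pair into a single LF."""
    return data.replace(b"\r\n", b"\n")


def rewrite_file(path: str, convert) -> None:
    """Rewrite a file in place using one of the converters above (hypothetical helper)."""
    # Binary mode avoids any platform-specific newline translation.
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(convert(data))
```

Calling `rewrite_file("test/data/comments/before_after_data.csv", to_crlf)` from the package directory should then give a file with Windows line endings, which is the condition reported to trigger the failure.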

@johnmyleswhite
Contributor

Thanks, that sounds like it's related to another bug people have raised in the past.

@garborg
Contributor

garborg commented Jan 12, 2015

Appveyor isn't having trouble with this. Fixed, probably as of #669.

garborg closed this as completed Jan 12, 2015
@tkelman
Contributor Author

tkelman commented Jan 12, 2015

Actually, we changed the default line-endings setting in the version of Git that we bundle with the Windows binaries, so AppVeyor is checking out the test files with Unix line endings. io.jl still fails if you run unix2dos on the test data files, because "multi-\nline\ntext" does not equal "multi-\r\nline\r\ntext".
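The mismatch is easy to see outside of DataFrames. The sketch below uses Python's standard `csv` module purely as a stand-in parser; it says nothing about how readtable is implemented. The sample record and the `normalize_newlines` helper are assumptions made for illustration.

```python
import csv
import io

# One CSV record whose first field is quoted and spans three lines,
# as it would look in a file saved with Windows (CRLF) line endings.
crlf_text = '"multi-\r\nline\r\ntext",1\r\n'


def first_field(text: str) -> str:
    """Parse the text as CSV and return the first field of the first record."""
    # newline="" keeps io.StringIO from translating line endings,
    # so the parser sees the CRLF pairs exactly as stored.
    reader = csv.reader(io.StringIO(text, newline=""))
    return next(reader)[0]


def normalize_newlines(value: str) -> str:
    """Hypothetical helper: collapse CRLF pairs inside a field to LF."""
    return value.replace("\r\n", "\n")


field = first_field(crlf_text)
expected = "multi-\nline\ntext"

# The embedded line breaks keep their "\r", so a literal comparison fails
# unless one side is normalized first.
print(field == expected)
print(normalize_newlines(field) == expected)
```

Whether the right fix is to normalize line endings inside quoted fields or to make the test expectation independent of line endings is a design choice for the maintainers; the sketch only shows why the two strings differ.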

garborg reopened this Jan 12, 2015
@garborg
Contributor

garborg commented Jan 12, 2015

Thanks for checking -- I thought that had been fixed for a long time. I was hoping no one would have to touch the readtable code until CSVReaders.jl was ready to replace it, but if this is an easy fix, definitely worth fixing and testing.

@tkelman
Contributor Author

tkelman commented Jan 12, 2015

The error's a lot less intimidating than it was 6 months ago, so that's an improvement.

garborg added a commit that referenced this issue Jan 30, 2015