BoundsError() when using readtable #689

Closed
tomrod opened this issue Sep 23, 2014 · 8 comments

Comments

tomrod commented Sep 23, 2014

I get this error when trying to run DataFrames' readtable function:


BoundsError()
while loading In[4], in expression starting on line 1

in findcorruption at /home/ogre/.julia/v0.3/DataFrames/src/dataframe/io.jl:674
in readtable! at /home/ogre/.julia/v0.3/DataFrames/src/dataframe/io.jl:742
in readtable at /home/ogre/.julia/v0.3/DataFrames/src/dataframe/io.jl:823
in readtable at /home/ogre/.julia/v0.3/DataFrames/src/dataframe/io.jl:890


Using version:
Version 0.3.0 (2014-08-20 20:43 UTC)
http://julialang.org release
x86_64-linux-gnu

Appears similar to #586

File is CSV, separator = ',', and charset=us-ascii

enca -L None output:

7bit ASCII characters
CRLF line terminators

tkelman commented Sep 23, 2014

May also be related to #604 - does the file have any comments in it? Does anything change if you run dos2unix on the file?

tomrod commented Sep 23, 2014

Nothing changes if I use dos2unix or fromdos commands to convert (though it does get rid of the pesky CRLF characters).

I'm now checking whether there is a viable workaround: reading the table in with readdlm and feeding the result into a DataFrame.

tomrod commented Sep 23, 2014

Ah. I found the root issue.

Some of my rows had length 96, some had 92. Removing the 92-length rows resulted in a clean read-in (this was done after running dos2unix, so I have no idea whether that conversion also fixed another bug).
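
For anyone hitting the same BoundsError, a quick pre-check along these lines can locate the offending rows before calling readtable. This is only a minimal sketch using Base Julia in current syntax, not part of DataFrames; the function name `ragged_rows` is hypothetical, and it assumes a plain split on the separator (no quoted fields containing the separator) with the header on the first line.

```julia
# Hypothetical helper (not a DataFrames function): report lines whose field
# count differs from the header's. Assumes no quoted fields contain `sep`.
function ragged_rows(io::IO; sep::Char = ',')
    bad = Tuple{Int,Int}[]          # (line number, field count) of mismatched rows
    expected = 0                    # field count taken from the header line
    for (lineno, line) in enumerate(eachline(io))
        n = length(split(line, sep))
        if lineno == 1
            expected = n
        elseif n != expected
            push!(bad, (lineno, n))
        end
    end
    return expected, bad
end

# Small in-memory example: the third line is one field short.
expected, bad = ragged_rows(IOBuffer("a,b,c\n1,2,3\n4,5\n"))
```

For a file on disk, something like `open(ragged_rows, "data.csv")` should work, where the file name is a placeholder.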

johnmyleswhite commented

Just to clarify: was your data corrupt or did readtable fail to read clean data?

tomrod commented Sep 24, 2014

It failed to replace missing data at the ends of lines with NA or another missing value. I can send the data if you would like to try a replication.

johnmyleswhite commented

I think I'm good. It looked something like this:

a,b,c
1,2,3
4,5

Rather than like this:

a,b,c
1,2,3
4,5,
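
The difference between the two forms above is only the trailing separator, so a file of the first kind can be turned into the second by padding short rows. The following is a rough sketch under stated assumptions, not something readtable does: `pad_short_rows` is a hypothetical name, the header is assumed to carry the full field count, and the naive split does not handle quoted fields that contain the separator.

```julia
# Hypothetical helper: append empty trailing fields to rows that are shorter
# than the header. Naive split on `sep`; quoted fields are not handled.
function pad_short_rows(text::AbstractString; sep::Char = ',')
    lines = split(chomp(text), '\n')
    expected = length(split(lines[1], sep))      # field count from the header
    padded = map(lines) do line
        line = rstrip(line, '\r')                # tolerate CRLF line terminators
        n_missing = expected - length(split(line, sep))
        n_missing > 0 ? line * string(sep)^n_missing : line
    end
    return join(padded, '\n') * "\n"
end

fixed = pad_short_rows("a,b,c\n1,2,3\n4,5\n")
```

Whether an empty trailing field is then read as NA depends on the reader, so the padded output is worth checking before relying on it.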

tomrod commented Sep 25, 2014

That is correct.

johnmyleswhite commented

Ok. I think I'll close this then, since the data isn't fully valid. In the future, we'll need to give better diagnostics when we fail to read a line from an input file.
