[Regression] fread can no longer read all files from the extended test suite #2285

Closed
st-pasha opened this issue Aug 6, 2017 · 1 comment


st-pasha commented Aug 6, 2017

Testing the latest fread.c against the "large test suite" of files shows problems with the following files:

fread("h2o-3/smalldata/jira/pubdev_2455.csv")
  Internal error: Last field of last field should select quote rule 2

fread("h2o-3/smalldata/jira/pubdev_2336.csv")
  Internal error: Last field of last field should select quote rule 2

fread("h2o-3/smalldata/glm_test/prostate_cat_train.csv")
  Line 290 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 9 fields but found 8: <<380       0  69   R2     b     a   1.90 20.70       >>

fread("h2o-3/smalldata/glm_test/prostate_cat_test.csv")
  Line 90 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 9 fields but found 8: <<378       1  76   R2     b     a   5.5 53.9       >>

fread("h2o-3/smalldata/glm_test/abcd.csv")
  Line 4 has too few fields when detecting types. Use fill=TRUE to pad with NA. Expecting 6 fields but found 5: <<1 1 0 1 1 >>

All of these files were read without errors by the previous version of fread.c.

st-pasha commented Aug 7, 2017

Here's a minified test case for these:

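require(data.table)

# 9999.1: columns separated by two spaces, no newline at end of file
cat("A  B  C\n1  2  3\n4  5  6", file=f<-tempfile())
data.table:::test(9999.1, fread(f), data.table(A=c(1L,4L), B=c(2L,5L), C=c(3L,6L)))

# 9999.2: last field of the last line is empty, no newline at end of file
cat("A,B,C\n1,2,3\n4,5,", file=f<-tempfile())
data.table:::test(9999.2, fread(f), data.table(A=c(1L,4L), B=c(2L,5L), C=c(3L,NA)))

# 9999.3: quoted fields mixed with empty fields; reading the file should match reading the same text directly
t = '"b","bc8d5",\n"c",,"2f685"\n"d",,\n,"cdfb9",\n'
cat(t, file=f<-tempfile())
data.table:::test(9999.3, fread(f), fread(t))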
