fread read error “Expected sep (' ') but..” with partially quoted field #1079

Closed
geotheory opened this issue Mar 13, 2015 · 8 comments

@geotheory

fread fails on a tab-separated file in which some fields are only partially quoted. To illustrate (str_works, which strips the quotes, confirms that the embedded quotes are the cause):

str_fails = 'L1\tsome\tunquoted\tstuff\nL2\tsome\t"half" quoted\tstuff\nL3\tthis\t"should work"\tok thought'
str_works = gsub('"', '', str_fails)
fread(str_works, sep='\t', header=F, skip=0L)
#    V1   V2          V3         V4
#1: L1 some    unquoted      stuff
#2: L2 some half quoted      stuff
#3: L3 this should work ok thought
fread(str_fails, sep='\t', header=F, skip=0L)
# Error in fread(str_fails, sep = "\t", header = F, skip = 0L) : 
#   Expected sep (' ') but 
# ' ends field 3 on line 1 when detecting types: L2 some    "half" quoted   stuff

Would it be possible to add a quote argument to mitigate this, or alternatively to detect embedded quotes automatically during quote classification?

@arunsrinivasan arunsrinivasan self-assigned this Sep 7, 2015
@arunsrinivasan arunsrinivasan added this to the v1.9.6 milestone Sep 7, 2015
@arunsrinivasan
Member

Thanks for the report. quote = "" is implemented. Either use that, or wrap your file with "" around character columns so that quote = "\"" can then read it properly. See ?fread quote argument.
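
A minimal sketch of the quote = "" suggestion, applied to the input from the original report. It assumes an installed data.table whose fread() accepts a quote argument (check ?fread, since availability of the argument depends on the version); the name dt is only illustrative.

```r
library(data.table)

# Same tab-separated input as in the report: field 3 on the second
# line is only partially quoted.
str_fails <- 'L1\tsome\tunquoted\tstuff\nL2\tsome\t"half" quoted\tstuff\nL3\tthis\t"should work"\tok thought'

# Assumption: this fread() supports quote=. Setting quote = "" disables
# quote handling, so each field is read literally and any double
# quotes stay in the values.
dt <- fread(str_fails, sep = "\t", header = FALSE, quote = "")
print(dt)
```

With quoting disabled the quote characters remain part of the field values, so strip them afterwards (for example with gsub) if they are not wanted.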

@geotheory
Author

I see the example above produces no error in the latest version. Thanks for that. I'm confused by "quote = "" is implemented" though: I get unused argument (quote = ..) for everything I've tried. Do you mean it's a valid argument for the function? I can't see any reference to it in the documentation.

@arunsrinivasan
Member

I can only recommend uninstalling with remove.packages(), reinstalling, and trying again. Have a look at the Readme and the commit history if it's confusing.

@geotheory
Author

Cheers

@arunsrinivasan
Member

Removed the quote argument in favour of a better (slightly more robust) fix for quotes. Please upgrade and test.

@geotheory
Author

Well, it works on my example. Is this package supposed to be installable from GitHub? When I try, I get the following error on library():

Loading required package: data.table
Error in get(method, envir = home) :
  lazy-load database '/Users/user/Documents/R/library/3.1/data.table/R/data.table.rdb' is corrupt
In addition: Warning messages:
1: In .registerS3method(fin[i, 1], fin[i, 2], fin[i, 3], fin[i, 4], :
  restarting interrupted promise evaluation
2: In get(method, envir = home) :
  restarting interrupted promise evaluation
3: In get(method, envir = home) : internal error -3 in R_decompress1

@jangorecki
Member

@geotheory the latest version was published on CRAN just yesterday. The GitHub version is now the same as the CRAN one, so you can install the most recent version from CRAN. As the package binaries may not have been built on CRAN yet, you can use install.packages("data.table", type="source"). PS. The inability to install the package from GitHub should be filed as a separate issue, including steps to reproduce (as there are many ways to install packages), session info, etc.

@geotheory
Author

OK, thanks again.
