Control-Z as EOF #1612

Closed
daroczig opened this issue Mar 25, 2016 · 7 comments

@daroczig daroczig commented Mar 25, 2016

Some CSV files generated on MS-DOS/Windows can have ^Z (Control-Z, byte 0x1A) as the end-of-file character, e.g. https://www.treasury.gov/ofac/downloads/sdn.csv, which results in an error when calling fread:

Expected sep (',') but new line, EOF (or other non printing character) ends field 1 on line 6 when detecting types: ^Z

Removing that character from the end of the file resolves the problem.
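
A minimal base R sketch of that manual fix, for anyone who wants to script it. The helper name strip_trailing_ctrl_z is hypothetical (it is not part of data.table), and the sketch assumes the file fits in memory and that the only unwanted bytes are 0x1A at the very end:

```r
# Hypothetical helper, not part of data.table: write a copy of `path`
# without any trailing Ctrl-Z (0x1A) bytes and return the copy's path.
strip_trailing_ctrl_z <- function(path) {
  # Read the whole file as raw bytes so no encoding conversion happens.
  bytes <- readBin(path, what = "raw", n = file.size(path))
  end <- length(bytes)
  # Walk backwards over the trailing run of 0x1A bytes.
  while (end > 0 && bytes[end] == as.raw(0x1a)) {
    end <- end - 1
  }
  cleaned <- tempfile(fileext = ".csv")
  writeBin(bytes[seq_len(end)], cleaned)
  cleaned
}

# Small demonstration on a made-up file that ends with ^Z.
demo <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("a,b\n1,2\n"), as.raw(0x1a)), demo)
cleaned <- strip_trailing_ctrl_z(demo)
```

The returned path could then be passed to fread, e.g. fread(cleaned), on versions that still reject the trailing ^Z.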

Session info:

> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 15.10

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.4

loaded via a namespace (and not attached):
[1] magrittr_1.5   plyr_1.8.3     tools_3.2.2    reshape2_1.4.1 Rcpp_0.12.3   
[6] stringi_1.0-1  stringr_1.0.0  chron_2.3-47  

I can also reproduce this problem with the most recent dev version of data.table, at commit 6f58f5c.

Member

@jangorecki jangorecki commented Mar 25, 2016

Some awk or sed should be a good workaround for now.

Author

@daroczig daroczig commented Mar 25, 2016

Yeah, as said, "removing that character from the end of the file resolves the problem" :) But I thought it was worth reporting, as others might have the very same issue. Not high-priority for sure.

@skanskan skanskan commented Jul 7, 2017

It would be great if fread could remove it automatically.

mattdowle added a commit that referenced this issue Jul 7, 2017
@mattdowle mattdowle added this to the v1.10.6 milestone Jul 7, 2017

@skanskan skanskan commented Jul 8, 2017

How can we install version 1.10.6?
I think
install.packages("data.table", type = "source", repos = "http://Rdatatable.github.io/data.table")
would install version 1.10.5

Contributor

@st-pasha st-pasha commented Jul 9, 2017

Version 1.10.6 hasn't been released yet. There is only 1.10.4 on CRAN, and the "dev" version (1.10.5) -- which is based on the master branch in this repo.

@skanskan skanskan commented Jul 9, 2017

I said it because I read "mattdowle added this to the v1.10.6 milestone". I thought we could try it in the dev version in some way.

Member

@mattdowle mattdowle commented Jul 13, 2017

@skanskan When the last number is odd, that's the dev release. v1.10.5 will be renamed v1.10.6 when it is released to CRAN. Otherwise we all get confused when we grab the dev at different times. Only even-numbered versions can be obtained from CRAN, and each one is a guaranteed checkpoint.
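
The odd/even convention described above can be checked with a few lines of base R. This is only an illustrative sketch: is_dev_version is a hypothetical helper, not a data.table function, and it assumes the convention holds exactly as stated in this thread (odd last component means dev, even means a CRAN release).

```r
# Hypothetical helper: TRUE when the last component of a version string
# is odd, which per the convention in this thread marks a dev build.
is_dev_version <- function(version) {
  # package_version() parses "1.10.5" into its integer components.
  parts <- unlist(unclass(package_version(version)))
  last <- parts[length(parts)]
  last %% 2L == 1L
}

is_dev_version("1.10.5")  # TRUE: odd last number, dev build
is_dev_version("1.10.4")  # FALSE: even last number, CRAN release
```

With data.table installed, the same check could be applied to as.character(packageVersion("data.table")).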
