fread: showProgress must be 0 or 1 #1111

Closed
dlebauer opened this Issue Apr 10, 2015 · 14 comments


What I expected

fread("foo.csv") to import a data.table

What I did

  1. Installed the most recent version (from GitHub)
  2. Created a simple CSV file
  3. Tried to read the CSV file

devtools::install_github("Rdatatable/data.table")
library(data.table)
write.csv(1, "tmp.csv")
fread("tmp.csv", verbose = TRUE)

What happened:

# Error in fread("tmp.csv", verbose = TRUE) : 
#   showProgress must be 0 or 1, currently
> sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.5

loaded via a namespace (and not attached):
[1] chron_2.3-45 tools_3.1.1 

What works

Works normally on Ubuntu 14.04 / R 3.1.0

arunsrinivasan Apr 12, 2015

Member

What version of Windows is this?

dlebauer Apr 12, 2015

Windows 8.1

arunsrinivasan Apr 12, 2015

Member

Could you try uninstalling RCurl and data.table with remove.packages(), reinstalling both, and trying again? I am not able to reproduce this error. :-(

dlebauer Apr 28, 2015

@arunsrinivasan Sorry it took a while to get back to this, but the error persists even after uninstalling and reinstalling RCurl (from CRAN) and data.table (from this repository, as above).

arunsrinivasan Apr 29, 2015

Member

I'm not quite sure how to reproduce this.

ericsgagnon Jun 18, 2015

I just moved to 64-bit Windows 7 and, after installing the latest versions, I get the same error, but only when I run the 64-bit version of R. The same code works in the 32-bit version:

64-bit: showProgress Error

> library(data.table)
data.table 1.9.5  For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
> write.csv(1, "tmp.csv")
> fread("tmp.csv", verbose = TRUE)
Error in fread("tmp.csv", verbose = TRUE) : 
  showProgress must be 0 or 1, currently
>
>
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.5

loaded via a namespace (and not attached):
[1] chron_2.3-45

32-bit: works as expected

> library(data.table)
data.table 1.9.5  For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
> write.csv(1, "tmp.csv")
> fread("tmp.csv", verbose = TRUE)
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.000000 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... ','
Detected 2 columns. Longest stretch was from line 1 to line 2
Starting data input on line 1 (either column names or first row of data). First 10 characters: "","x"
All the fields on line 1 are character fields. Treating as the column names.
Count of eol: 2 (including 1 at the end)
Count of sep: 1
nrow = MIN( nsep [1] / ncol [2] -1, neol [2] - nblank [1] ) = 1
Type codes (   first 5 rows): 41
Type codes: 41 (after applying colClasses and integer64)
Type codes: 41 (after applying drop or select (if supplied)
Allocating 2 column slots (2 - 0 dropped)
Read 1 rows. Exactly what was estimated and allocated up front
   0.006s ( 86%) Memory map (rerun may be quicker)
   0.001s ( 14%) sep and header detection
   0.000s (  0%) Count rows (wc -l)
   0.000s (  0%) Column type detection (first, middle and last 5 rows)
   0.000s (  0%) Allocation of 1x2 result (xMB) in RAM
   0.000s (  0%) Reading data
   0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
   0.000s (  0%) Coercing data already read in type bumps (if any)
   0.000s (  0%) Changing na.strings to NA
   0.007s        Total
   V1 x
1:  1 1
>
>
> sessionInfo()
R version 3.2.0 (2015-04-16)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.5

loaded via a namespace (and not attached):
[1] chron_2.3-45

mattdowle Jun 18, 2015

Member

How odd. Upped priority. Thanks for great report.


@mattdowle mattdowle added this to the v1.9.6 milestone Jun 18, 2015

mattdowle Jun 19, 2015

Member

Can't reproduce either. My best guess is some corruption in the 32-bit/64-bit DLLs on your Windows box. It has happened before: I've seen odd problems like this when an older version of a DLL gets loaded with newer R package code.

  1. Reboot to utterly remove all usage of all DLLs even from zombie processes.
  2. Start 64bit R only.
  3. remove.packages("data.table")
  4. install.packages("data.table")
  5. test.data.table()
  6. Try fread again.

I looked at the code as well and can't see any problems. When data.table starts it creates a global option. Type the following. This is what I get and is correct.

> require(data.table)
> getOption("datatable.showProgress")
[1] 1
> storage.mode(getOption("datatable.showProgress"))
[1] "integer"

When you call fread it fetches that variable :

fread = function(..., showProgress=getOption("datatable.showProgress"), ...)

then all it does is pass that down to C level, and wraps with as.integer() to be on the safe side :

ans = .Call(Creadfile, ..., as.integer(showProgress))

Because that argument is the last one, I think some mismatch in DLL versions has happened, i.e. it is somehow calling an old DLL that has fewer arguments. I know this sounds like Windows bashing but it's all I can think of as I've seen it before.

Regardless, I've added extra tracing at C level to home in on the problem :

// Extra tracing for apparent Windows problem: https://github.com/Rdatatable/data.table/issues/1111
    if (!isInteger(showProgressArg)) error("showProgress is not type integer but type '%s'. Please report.", type2char(TYPEOF(showProgressArg)));
    if (LENGTH(showProgressArg)!=1) error("showProgress is not length 1 but length %d. Please report.", LENGTH(showProgressArg));
    int showProgress = INTEGER(showProgressArg)[0];
    if (showProgress!=0 && showProgress!=1)
        error("showProgress is not 0 or 1 but %d. Please report.", showProgress);    

Closing for now but please let us know whether either the reboot-purge-reinstall or the new tracing reveals anything.
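
For anyone who wants to check their own session before reinstalling, the three C-level tracing checks above can be mirrored in plain R. This is only a sketch: `check_show_progress` is a hypothetical helper written for illustration, not part of data.table, and it returns the diagnostic as a string rather than raising an error.

```r
# R-level mirror of the C tracing checks (illustrative helper, not data.table code).
check_show_progress <- function(x) {
  # Check 1: the value must be stored as an integer, not double/logical/NULL.
  if (!is.integer(x)) {
    return(sprintf("not type integer but type '%s'", typeof(x)))
  }
  # Check 2: it must be a single value.
  if (length(x) != 1L) {
    return(sprintf("not length 1 but length %d", length(x)))
  }
  # Check 3: it must be exactly 0 or 1 (NA fails this check too).
  if (is.na(x) || !(x %in% c(0L, 1L))) {
    return(sprintf("not 0 or 1 but %s", format(x)))
  }
  "ok"
}

check_show_progress(1L)  # "ok"
check_show_progress(1)   # "not type integer but type 'double'"
```

With data.table attached, `check_show_progress(getOption("datatable.showProgress"))` would report on the option that fread picks up by default. If that reports "ok" while fread still fails, the value is presumably being lost between R and C, which is consistent with the DLL-mismatch guess above.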


@mattdowle mattdowle closed this Jun 19, 2015

mattdowle added a commit that referenced this issue Jun 19, 2015

ericsgagnon Jun 26, 2015

Sorry for the delayed follow-up. I have another Windows system that had the same issue (Win 8.1 64-bit; the remainder of the setup was the same). System reboots didn't help, but uninstalling and reinstalling data.table (again, from GitHub) twice on each machine resolved the issue. Unfortunately, I have no output to relay as everything seems to work properly. If you'd like me to repost any system info (or additional system info), please let me know. Thanks for data.table in general and the help on this one, Matt!

Eric

HughParsonage Aug 17, 2015

Member

I had the same problem, but only after updating to the development version 1.9.5. The error persists even if showProgress=1 is given explicitly.

R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.9.5

loaded via a namespace (and not attached):
[1] chron_2.3-47

jangorecki Aug 17, 2015

Member

@HughParsonage did you try the approach that helped ericsgagnon resolve the issue? I would recommend closing all R sessions before reinstalling in a clean session. It may be that some sessions were not closed correctly, so the process manager may be useful to confirm that. Reinstalling in a clean session should then be enough. I wonder whether using drat and sharing compiled libraries would solve such issues.

ladida771 Oct 9, 2015

I had exactly the same error, and what solved the problem was closing all R sessions running on the computer before installing data.table (as suggested by jangorecki).
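
One way to see which compiled library the current session has loaded, before and after such a reinstall, is to list the loaded DLLs with base R's `getLoadedDLLs()`. The helper below is a hypothetical sketch, and the name pattern is an assumption, since the registered DLL name may differ between data.table versions.

```r
# Return the on-disk paths of loaded DLLs whose registered name matches `pattern`.
# (Illustrative helper built on base R's getLoadedDLLs(); not part of data.table.)
loaded_dll_paths <- function(pattern) {
  dlls <- getLoadedDLLs()
  hits <- grep(pattern, names(dlls), value = TRUE)
  # unclass() so that `$` does plain list access rather than a native symbol lookup.
  vapply(hits, function(nm) unclass(dlls[[nm]])$path, character(1))
}

# Assumed pattern: matches "datatable", "data_table" or "data.table".
loaded_dll_paths("data.?table")
```

If a path is returned in a session where data.table was supposed to be freshly installed, comparing it with the output of `find.package("data.table")` may show whether the session is using the library that was just installed.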

ProfFancyPants Oct 20, 2015

I see the same error when using 1.9.7. I downgraded to 1.9.6 and it works fine. My best guess is that the cause is a number of description lines at the top of the file, akin to a report.

Report: Funding Source Bucket Report        
Affiliate:  XXXXXXXXX       
Date Range: DDDD-DD-DD - DDDD-DD-DD     
Direct/Contracted Services: Contract Services       
XX/XXX Services:    XX Services     
Population: (all)       
Rate Set:   XXXNN XXXXXXXXX CONTRACT RATES      

PROVIDER NAME   PROVIDER ID CASE #  XXX PROCEDURE CODE
# The table...

where "PROVIDER NAME"... is the start of the actual table.
If I save the .csv without the intro label lines, so that the first line of the file is the first provider line, then fread works perfectly.
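
As an alternative to editing the file by hand, the preamble can be skipped programmatically. The sketch below uses base R only and a made-up, tab-separated file that imitates the layout above; the rows and the assumption that the header line contains the text "PROVIDER NAME" are illustrative, not taken from the real report.

```r
# Hypothetical report-style file: preamble lines, a blank line, then the table.
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "Report:\tFunding Source Bucket Report",
  "Affiliate:\tXXXXXXXXX",
  "",
  "PROVIDER NAME\tPROVIDER ID\tCASE #",
  "Alpha\t1\t100",
  "Beta\t2\t200"
), path)

# Locate the first line that contains the header text, then parse from there.
lines <- readLines(path)
header_row <- grep("PROVIDER NAME", lines, fixed = TRUE)[1]
tbl <- read.delim(
  text = paste(lines[header_row:length(lines)], collapse = "\n"),
  check.names = FALSE,       # keep "PROVIDER NAME" rather than "PROVIDER.NAME"
  stringsAsFactors = FALSE
)
```

fread's documented `skip` argument also accepts a string to search for (for example `skip = "PROVIDER NAME"`), which should be the more direct route, although whether it avoids the error reported here has not been verified.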

ritaItaly Jan 10, 2017

Hi,
I am getting a similar error on Unix (I have no control over R there):

monte_carlo_thresholds <- fread("/mapr/mapr03r/analytic_users/uitmp/ecomm/projs/research/releasability/thresholds.dat",sep="\t")
Error: isLOGICAL(showProgress) is not TRUE

I checked:

getOption("datatable.showProgress")
[1] 1
storage.mode(getOption("datatable.showProgress"))
[1] "integer"
sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS release 6.3 (Final)

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] data.table_1.10.0

loaded via a namespace (and not attached):
[1] tools_3.3.1
