Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Misc] dev fread considerably slower (10x) than current CRAN fread #2107

Closed
HughParsonage opened this issue Apr 8, 2017 · 10 comments
Closed

[Misc] dev fread considerably slower (10x) than current CRAN fread #2107

HughParsonage opened this issue Apr 8, 2017 · 10 comments

Comments

@HughParsonage
Copy link
Member

@HughParsonage HughParsonage commented Apr 8, 2017

I am experiencing slow speeds with the new version of fread.

CRAN version

library(data.table)

system.time({all_polygons <- lapply(dir(pattern = "\\.csv$"), fread)})

Read 4394767 rows and 7 (of 7) columns from 0.222 GB file in 00:00:08
Read 1379467 rows and 7 (of 7) columns from 0.068 GB file in 00:00:03
Read 1407213 rows and 7 (of 7) columns from 0.069 GB file in 00:00:03
Read 1834016 rows and 7 (of 7) columns from 0.091 GB file in 00:00:04
user system elapsed
19.20 0.08 19.28

Dev version (same code), installed today

Read 4394767 rows x 7 columns from 0.222GB file in 01:30.924 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Read 1379467 rows x 7 columns from 0.068GB file in 00:28.938 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Read 416937 rows x 7 columns from 0.020GB file in 00:08.594 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Read 1407213 rows x 7 columns from 0.069GB file in 00:29.412 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Read 1834016 rows x 7 columns from 0.091GB file in 00:38.512 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Read 1028604 rows x 7 columns from 0.051GB file in 00:21.361 wall clock time (can be slowed down by any other open apps even if seemingly idle)
user system elapsed
2543.06 1.08 223.93

Similarly, with the DT example provided in ?fread:

CRAN

system.time({fread("DT.csv")})
user system elapsed
1.67 0.00 1.73

Dev:

Read 1000000 rows x 6 columns from 0.050GB file in 00:15.523 wall clock time (can be slowed down by any other open apps even if seemingly idle)
user system elapsed
177.96 0.01 15.53

I did have other programs open. Is it expected that having any program open would result in such a reduction in speed?

sessionInfo()

R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)


With verbose = TRUE. First 1.10.4, then 1.10.5

> library(data.table)
data.table 1.10.4
  The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
  Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
  Release notes, videos and slides: http://r-datatable.com
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                      
[5] LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] tools_3.3.3
> fread("DT.csv", verbose = TRUE)
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.049546 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... ','
Detected 6 columns. Longest stretch was from line 1 to line 30
Starting data input on line 1 (either column names or first row of data). First 10 characters: a,b,c,d,e,
All the fields on line 1 are character fields. Treating as the column names.
Count of eol: 1000001 (including 1 at the end)
Count of sep: 5000000
nrow = MIN( nsep [5000000] / (ncol [6] -1), neol [1000001] - endblanks [1] ) = 1000000
Type codes (point  0): 113431
Type codes (point  1): 113431
Type codes (point  2): 113431
Type codes (point  3): 113431
Type codes (point  4): 113431
Type codes (point  5): 113431
Type codes (point  6): 113431
Type codes (point  7): 113431
Type codes (point  8): 113431
Type codes (point  9): 113431
Type codes (point 10): 113431
Type codes: 113431 (after applying colClasses and integer64)
Type codes: 113431 (after applying drop or select (if supplied)
Allocating 6 column slots (6 - 0 dropped)
Read 1000000 rows. Exactly what was estimated and allocated up front
   0.000s (  0%) Memory map (rerun may be quicker)
   0.000s (  0%) sep and header detection
   0.148s (  9%) Count rows (wc -l)
   0.016s (  1%) Column type detection (100 rows at 10 points)
   0.098s (  6%) Allocation of 1000000x6 result (xMB) in RAM
   1.408s ( 84%) Reading data
   0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
   0.000s (  0%) Coercing data already read in type bumps (if any)
   0.001s (  0%) Changing na.strings to NA
   1.671s        Total
           a   b          c    d           e   f
      1: 899 707  1.3258750  baz -0.45269512 145
      2: 435  NA  0.3934530  foo         Inf 707
      3: 770 875  0.6292662             -Inf 574
      4: 566 515         NA quux -0.60283779 421
      5: 982 737  0.6028758       1.10495280 663
     ---                                        
 999996: 776 115 -0.1153838 quux  0.63482317 983
 999997: 313 926 -0.8554561  qux  0.05537117 261
 999998: 251 242  0.6270674  foo  1.71461292 340
 999999: 614 722  1.0111390  bar  1.60265119 668
1000000: 684 399  1.3670426 quux -1.33576393 811

(Restart R.)

remove.packages("data.table")

install.packages("data.table", type = "source",
+                  repos = "http://Rdatatable.github.io/data.table") # Then install devel version
trying URL 'http://Rdatatable.github.io/data.table/src/contrib/data.table_1.10.5.tar.gz'
Content type 'application/octet-stream' length 3094360 bytes (3.0 MB)
downloaded 3.0 MB

* installing *source* package 'data.table' ...
** libs

*** arch - i386
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c assign.c -o assign.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c between.c -o between.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c bmerge.c -o bmerge.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c chmatch.c -o chmatch.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c dogroups.c -o dogroups.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c fastmean.c -o fastmean.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c fcast.c -o fcast.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c fmelt.c -o fmelt.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c forder.c -o forder.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c frank.c -o frank.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c fread.c -o fread.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c fsort.c -o fsort.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c fwrite.c -o fwrite.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c gsumm.c -o gsumm.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c ijoin.c -o ijoin.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c init.c -o init.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c inrange.c -o inrange.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c openmp-utils.c -o openmp-utils.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c quickselect.c -o quickselect.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c rbindlist.c -o rbindlist.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c reorder.c -o reorder.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c shift.c -o shift.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c subset.c -o subset.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c transpose.c -o transpose.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c uniqlist.c -o uniqlist.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c vecseq.c -o vecseq.o
c:/Rtools/mingw_32/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O3 -Wall  -std=gnu99 -mtune=core2 -c wrappers.c -o wrappers.o
c:/Rtools/mingw_32/bin/gcc -shared -s -static-libgcc -o data.table.dll tmp.def assign.o between.o bmerge.o chmatch.o dogroups.o fastmean.o fcast.o fmelt.o forder.o frank.o fread.o fsort.o fwrite.o gsumm.o ijoin.o init.o inrange.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o subset.o transpose.o uniqlist.o vecseq.o wrappers.o -fopenmp -Ld:/Compiler/gcc-4.9.3/local330/lib/i386 -Ld:/Compiler/gcc-4.9.3/local330/lib -LC:/R/R-33~1.3/bin/i386 -lR
mv data.table.dll datatable.dll
if [ "Windows_NT" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id datatable.dll datatable.dll; fi
installing to C:/R/R-3.3.3/library/data.table/libs/i386

*** arch - x64
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c assign.c -o assign.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c between.c -o between.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c bmerge.c -o bmerge.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c chmatch.c -o chmatch.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c dogroups.c -o dogroups.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c fastmean.c -o fastmean.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c fcast.c -o fcast.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c fmelt.c -o fmelt.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c forder.c -o forder.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c frank.c -o frank.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c fread.c -o fread.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c fsort.c -o fsort.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c fwrite.c -o fwrite.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c gsumm.c -o gsumm.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c ijoin.c -o ijoin.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c init.c -o init.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c inrange.c -o inrange.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c openmp-utils.c -o openmp-utils.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c quickselect.c -o quickselect.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c rbindlist.c -o rbindlist.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c reorder.c -o reorder.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c shift.c -o shift.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c subset.c -o subset.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c transpose.c -o transpose.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c uniqlist.c -o uniqlist.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c vecseq.c -o vecseq.o
c:/Rtools/mingw_64/bin/gcc  -I"C:/R/R-33~1.3/include" -DNDEBUG     -I"d:/Compiler/gcc-4.9.3/local330/include"  -fopenmp   -O2 -Wall  -std=gnu99 -mtune=core2 -c wrappers.c -o wrappers.o
c:/Rtools/mingw_64/bin/gcc -shared -s -static-libgcc -o data.table.dll tmp.def assign.o between.o bmerge.o chmatch.o dogroups.o fastmean.o fcast.o fmelt.o forder.o frank.o fread.o fsort.o fwrite.o gsumm.o ijoin.o init.o inrange.o openmp-utils.o quickselect.o rbindlist.o reorder.o shift.o subset.o transpose.o uniqlist.o vecseq.o wrappers.o -fopenmp -Ld:/Compiler/gcc-4.9.3/local330/lib/x64 -Ld:/Compiler/gcc-4.9.3/local330/lib -LC:/R/R-33~1.3/bin/x64 -lR
mv data.table.dll datatable.dll
if [ "Windows_NT" != "Windows_NT" ] && [ `uname -s` = 'Darwin' ]; then install_name_tool -id datatable.dll datatable.dll; fi
installing to C:/R/R-3.3.3/library/data.table/libs/x64
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded
*** arch - i386
*** arch - x64
* DONE (data.table)

The downloaded source packages are inC:\Users\hughp\AppData\Local\Temp\RtmpEDgMmt\downloaded_packages> library(data.table)

> fread("DT.csv", verbose = TRUE)
Parameter na.strings == <<NA>>
None of the 1 na.strings are numeric (such as '-9999').
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 0.049546 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 starting: <<a,b,c,d,e,f>>
Detecting sep ...
  sep==','(ascii 44)  with 101 lines of 6 fields using quote rule 0
Detected 6 columns on line 1. This line is either column names or first data row (first 30 chars): <<a,b,c,d,e,f>>
All the fields on line 1 are character fields. Treating as the column names.
Number of sampling jump points  = 101 because 53199178 bytes from row 1 to eof / (2 * 5275 jump0size) == 5042
Type codes (jump 000)    : 224542  Quote rule 0
Type codes (jump 100)    : 224542  Quote rule 0
=====
 Sampled 10049 rows (handled \n inside quoted fields) at 101 jump points including middle and very end
 Bytes from first data row on line 2 to the end of last row: 53199178
 Line length: mean=53.20 sd=1.44 min=36 max=59
 Estimated nrow: 53199178 / 53.20 = 999979
 Initial alloc = 1099977 rows (999979 + 10%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
=====
Type codes (colClasses)  : 224542
Type codes (drop|select) : 224542
Allocating 6 column slots (6 - 0 dropped)
Reading 60 chunks of 0.846MB (16666 rows) using 12 threads
Read 1000000 rows x 6 columns from 0.050GB file in 00:15.595 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Final type counts
         0 : drop     
         0 : logical  
         3 : integer  
         0 : integer64
         2 : double   
         1 : character
Thread buffers were grown 0 times (if all 12 threads each grew once, this figure would be 12)
=============================
   0.000s (  0%) Memory map
   0.000s (  0%) sep, ncol and header detection
   0.034s (  0%) Column type detection using 10049 sample rows
   0.103s (  1%) Allocation of 1000000 rows x 6 cols (0.037GB) plus 0.007GB of temporary buffers
  15.458s ( 99%) Reading data
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
  15.595s        Total
           a   b          c    d           e   f
      1: 899 707  1.3258750  baz -0.45269512 145
      2: 435  NA  0.3934530  foo         Inf 707
      3: 770 875  0.6292662             -Inf 574
      4: 566 515         NA quux -0.60283779 421
      5: 982 737  0.6028758       1.10495280 663
     ---                                        
 999996: 776 115 -0.1153838 quux  0.63482317 983
 999997: 313 926 -0.8554561  qux  0.05537117 261
 999998: 251 242  0.6270674  foo  1.71461292 340
 999999: 614 722  1.0111390  bar  1.60265119 668
1000000: 684 399  1.3670426 quux -1.33576393 811
@scottstanfield
Copy link

@scottstanfield scottstanfield commented Apr 9, 2017

Jumping in: I couldn't tell if you were using DT version 1.10.5 or older. I had the same issue with performance in 1.10.4 and removing the installing from github fixed it.

library(data.table)
print(version)
print(sessionInfo())

dt <- fread('bigfile.csv', stringsAsFactors=T)

On my 36 core machine with NVME, I read a 3.6M row CSV with ~300 columns in 24 seconds with 1.10.5. With 1.10.4, pre-parallel (I think), it took 2m 42 seconds.

@mattdowle
Copy link
Member

@mattdowle mattdowle commented Apr 9, 2017

@HughParsonage Hm. Interesting. No it's not expected to be that slow even with other apps open. Please run with verbose=TRUE and post full output. Also if it's Windows try rebooting and installing clean. Sometimes different versions of .dll can get mixed up on Windows. Something somewhere is going wrong.

@HughParsonage
Copy link
Member Author

@HughParsonage HughParsonage commented Apr 9, 2017

Updated OP to include full output with verbose.

@scottstanfield
Copy link

@scottstanfield scottstanfield commented Apr 9, 2017

I'm just lurking on this issue...I did find 1.10.5 on Ubuntu 16.04 with R 3.3.3 'Another Canoe' is working well for me now.

I'm using this tiny benchmark.r to test perf everytime I update. The https://gist.github.com/scottstanfield/2e1e71f32d397309d0faf6a32d49e91c

@mattdowle
Copy link
Member

@mattdowle mattdowle commented Apr 9, 2017

Thanks for verbose output. All looks good in terms of compile and threads kicking in. All I can see is that the file is small at 50MB, then it's chunking into 60 pieces given to 12 threads. Each thread only gets 5 pieces to do. 5 < 12. Maybe it's sticking when that's the case. If so it's a bug in the logic and not to do Windows but this data size and nThread combination. I should be able to reproduce on Linux. Can you send me one of the files you see problem? What is the machine as well please (I can look it up online to see if it has 12MB of L2 cache?)

@HughParsonage
Copy link
Member Author

@HughParsonage HughParsonage commented Apr 9, 2017

I'm happy to give you the files, but in the meantime the issue reproduces also with n=1e8: 148.252s vs 1560.475s. Verbose output below.

I'm running an i7-6800K machine @ 3.40 GHz with 128 GB RAM. Hard-drive is a Samsung SSD 850 EVO 1TB.

CRAN

> set.seed(1)
> n=1e8
> DT = data.table( a=sample(1:1000,n,replace=TRUE),
+                  b=sample(1:1000,n,replace=TRUE),
+                  c=rnorm(n),
+                  d=sample(c("foo","bar","baz","qux","quux"),n,replace=TRUE),
+                  e=rnorm(n),
+                  f=sample(1:1000,n,replace=TRUE) )
> DT[2,b:=NA_integer_]
> DT[4,c:=NA_real_]
> DT[3,d:=NA_character_]
> DT[5,d:=""]
> DT[2,e:=+Inf]
> DT[3,e:=-Inf]
> fwrite(DT, "DT.csv")
                                                                                                                                     
> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] tools_3.3.3
> fread("DT.csv", verbose = TRUE)
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 4.954535 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 after skip or autostart
This line is the autostart and not blank so searching up for the last non-blank ... line 1
Detecting sep ... ','
Detected 6 columns. Longest stretch was from line 1 to line 30
Starting data input on line 1 (either column names or first row of data). First 10 characters: a,b,c,d,e,
All the fields on line 1 are character fields. Treating as the column names.
Count of eol: 100000001 (including 1 at the end)
Count of sep: 500000000
nrow = MIN( nsep [500000000] / (ncol [6] -1), neol [100000001] - endblanks [1] ) = 100000000
Type codes (point  0): 113431
Type codes (point  1): 113431
Type codes (point  2): 113431
Type codes (point  3): 113431
Type codes (point  4): 113431
Type codes (point  5): 113431
Type codes (point  6): 113431
Type codes (point  7): 113431
Type codes (point  8): 113431
Type codes (point  9): 113431
Type codes (point 10): 113431
Type codes: 113431 (after applying colClasses and integer64)
Type codes: 113431 (after applying drop or select (if supplied)
Allocating 6 column slots (6 - 0 dropped)
Read 100000000 rows and 6 (of 6) columns from 4.955 GB file in 00:02:29
Read 100000000 rows. Exactly what was estimated and allocated up front
   0.000s (  0%) Memory map (rerun may be quicker)
   0.000s (  0%) sep and header detection
   5.239s (  4%) Count rows (wc -l)
   0.004s (  0%) Column type detection (100 rows at 10 points)
   0.939s (  1%) Allocation of 100000000x6 result (xMB) in RAM
 141.995s ( 96%) Reading data
   0.000s (  0%) Allocation for type bumps (if any), including gc time if triggered
   0.000s (  0%) Coercing data already read in type bumps (if any)
   0.075s (  0%) Changing na.strings to NA
 148.252s        Total
             a   b          c    d          e   f
        1: 266 593  0.3101614 quux -0.9514252 500
        2: 373  NA  1.6320205  bar        Inf 253
        3: 573 754 -0.5334998            -Inf 451
        4: 909 965         NA  baz -0.1766243 740
        5: 202 654 -0.4235275      -0.4945436 494
       ---                                       
 99999996: 127 100  0.9068599  foo -1.1319779 481
 99999997: 201 344 -3.1681951 quux  0.0405684 119
 99999998: 682 864  0.1459793  bar -0.7597580 145
 99999999: 580 361 -0.9512592  baz -1.1614999 971
100000000: 664 793  0.1269717  foo -0.6184478 252

Dev version

> library(data.table)
data.table 1.10.5 IN DEVELOPMENT built 2017-04-04 00:37:37 UTC; travis
  The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
  Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
  Release notes, videos and slides: http://r-datatable.com
> fread("DT.csv", verbose = TRUE)
Parameter na.strings == <<NA>>
None of the 1 na.strings are numeric (such as '-9999').
Input contains no \n. Taking this to be a filename to open
File opened, filesize is 4.954535 GB.
Memory mapping ... ok
Detected eol as \r\n (CRLF) in that order, the Windows standard.
Positioned on line 1 starting: <<a,b,c,d,e,f>>
Detecting sep ...
  sep==','(ascii 44)  with 101 lines of 6 fields using quote rule 0
Detected 6 columns on line 1. This line is either column names or first data row (first 30 chars): <<a,b,c,d,e,f>>
All the fields on line 1 are character fields. Treating as the column names.
Number of sampling jump points  = 101 because 5319891523 bytes from row 1 to eof / (2 * 5274 jump0size) == 504350
Type codes (jump 000)    : 224542  Quote rule 0
Type codes (jump 100)    : 224542  Quote rule 0
=====
 Sampled 10049 rows (handled \n inside quoted fields) at 101 jump points including middle and very end
 Bytes from first data row on line 2 to the end of last row: 5319891523
 Line length: mean=53.18 sd=1.46 min=34 max=60
 Estimated nrow: 5319891523 / 53.18 = 100028797
 Initial alloc = 110031677 rows (100028797 + 10%) using bytes/max(mean-2*sd,min) clamped between [1.1*estn, 2.0*estn]
=====
Type codes (colClasses)  : 224542
Type codes (drop|select) : 224542
Allocating 6 column slots (6 - 0 dropped)
Reading 5076 chunks of 0.999MB (19706 rows) using 12 threads
Read 100000000 rows x 6 columns from 4.955GB file in 26:00.475 wall clock time (can be slowed down by any other open apps even if seemingly idle)
Final type counts
         0 : drop     
         0 : logical  
         3 : integer  
         0 : integer64
         2 : double   
         1 : character
Thread buffers were grown 0 times (if all 12 threads each grew once, this figure would be 12)
=============================
   0.001s (  0%) Memory map
   0.000s (  0%) sep, ncol and header detection
   0.032s (  0%) Column type detection using 10049 sample rows
   0.706s (  0%) Allocation of 100000000 rows x 6 cols (3.689GB) plus 0.009GB of temporary buffers
1559.737s (100%) Reading data
   0.000s (  0%) Rereading 0 columns due to out-of-sample type exceptions
1560.475s        Total
             a   b          c    d          e   f
        1: 266 593  0.3101614 quux -0.9514252 500
        2: 373  NA  1.6320205  bar        Inf 253
        3: 573 754 -0.5334998            -Inf 451
        4: 909 965         NA  baz -0.1766243 740
        5: 202 654 -0.4235275      -0.4945436 494
       ---                                       
 99999996: 127 100  0.9068599  foo -1.1319779 481
 99999997: 201 344 -3.1681951 quux  0.0405684 119
 99999998: 682 864  0.1459793  bar -0.7597580 145
 99999999: 580 361 -0.9512592  baz -1.1614999 971
100000000: 664 793  0.1269717  foo -0.6184478 252



@HughParsonage
Copy link
Member Author

@HughParsonage HughParsonage commented Apr 10, 2017

I was able to reproduce the issue on a different Windows machine running i7 4770K @3.40GHz. I got 1.4s vs 12.1s for the example in the ?fread.

@mattdowle
Copy link
Member

@mattdowle mattdowle commented Apr 15, 2017

Attempt 1 did not work. No need to test.

@mattdowle
Copy link
Member

@mattdowle mattdowle commented Apr 15, 2017

Ok attempt 2 worked. I borrowed a Windows 8.1 laptop and have confirmed.
Please retest and confirm ok now. I see 97s down to 1.2s
To be sure, please download the .zip from AppVeyor.

data.table 1.10.5 IN DEVELOPMENT built 2017-04-15 11:11:18 UTC; appveyor
install.packages("https://ci.appveyor.com/api/buildjobs/1txx94ruv769jc42/artifacts/data.table_1.10.5.zip", repos=NULL)

@mattdowle mattdowle added this to the v1.10.6 milestone Apr 15, 2017
@HughParsonage
Copy link
Member Author

@HughParsonage HughParsonage commented Apr 15, 2017

Confirmed. I get 0.28s for the n=1e6 example and 3.77s for the n=1e8 example. Absolutely ridiculous speeds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants