Reading a very large Feather file over the network is much slower than reading an RDS file. Is this a bug, or am I doing something wrong?
In this reproducible example, I wrote 524 thousand rows to a local folder and to a network folder. In the local folder Feather is very quick, but on the network folder read_feather took 139.51 s while readRDS took 1.11 s!
Can you help me?
library(feather)
data=mtcars;for (i in 1:14) {data=rbind(data,data)}
nrow(data) #524 K rows
#> [1] 524288
##### LOCAL ####
r='c:/temp/dataTest.rds'
f='c:/temp/dataTest.feather'
## Saving RDS on Local
system.time(saveRDS(data,r))
#> user system elapsed
#> 1 0 1
## Reading RDS on Local
system.time(readRDS(r))
#> user system elapsed
#> 0.44 0.02 0.45
## Saving Feather on Local
system.time(feather::write_feather(data,f))
#> user system elapsed
#> 0.02 0.01 0.03
## Reading Feather on Local
system.time(feather::read_feather(f))
#> user system elapsed
#> 0.00 0.03 0.05
file.remove(r,f)
#> [1] TRUE TRUE
##### NETWORK ####
r='//server/folder/dataTest.rds'
f='//server/folder/dataTest.feather'
## Saving RDS on Network
system.time(saveRDS(data,r))
#> user system elapsed
#> 1.08 0.05 1.33
## Reading RDS on Network
system.time(readRDS(r))
#> user system elapsed
#> 0.42 0.00 1.11
## Saving Feather on Network
system.time(feather::write_feather(data,f))
#> user system elapsed
#> 0.01 0.05 0.60
## Reading Feather on Network
system.time(feather::read_feather(f))
#> user system elapsed
#> 0.02 0.20 139.51
file.remove(r,f)
#> [1] TRUE TRUE
Created on 2018-06-29 by the reprex package (v0.2.0).
Unfortunately, the memory mapping flag is not exposed in the R API right now. I don't have a timeline for a fix, but a PR would be welcome. In the meantime, I suggest you copy files locally before reading them on this particular network. I'm sorry for the inconvenience.
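The copy-locally workaround suggested above could be sketched roughly as follows. This is only a minimal sketch: the helper name `read_feather_via_local_copy` is hypothetical and not part of the feather package, and it assumes the local temporary directory has enough free space for the file. It uses only base R (`tempfile`, `file.copy`, `unlink`) plus `feather::read_feather`.

```r
library(feather)

# Hypothetical helper (not part of the feather package): copy a Feather file
# from a slow location to a local temporary file, then read the local copy.
read_feather_via_local_copy <- function(path) {
  local_copy <- tempfile(fileext = ".feather")
  # Remove the temporary copy when the function exits, even on error.
  on.exit(unlink(local_copy), add = TRUE)

  # file.copy() returns FALSE on failure instead of raising an error.
  if (!file.copy(path, local_copy, overwrite = TRUE)) {
    stop("Could not copy '", path, "' to a local temporary file")
  }

  feather::read_feather(local_copy)
}
```

With the paths from the example above, this would be called as `data <- read_feather_via_local_copy('//server/folder/dataTest.feather')`. Whether it actually helps depends on how quickly the network share serves one sequential copy of the file, so it is worth timing with `system.time()` before relying on it.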