Using capture.output with large objects can be really slow. In some console testing, the time to encode grows non-linearly with the number of rows:
system.time( zz <- capture.output(write.table(data.frame(a=1:10000), sep="\t")) )
#    user  system elapsed
#     0.2     0.0     0.2
system.time( zz <- capture.output(write.table(data.frame(a=1:20000), sep="\t")) )
#    user  system elapsed
#    0.84    0.00    0.84
system.time( zz <- capture.output(write.table(data.frame(a=1:30000), sep="\t")) )
#    user  system elapsed
#    2.08    0.00    2.08
system.time( zz <- capture.output(write.table(data.frame(a=1:40000), sep="\t")) )
#    user  system elapsed
#    4.20    0.00    4.21
system.time( zz <- capture.output(write.table(data.frame(a=1:50000), sep="\t")) )
#    user  system elapsed
#    7.70    0.02    7.75
Whereas using temporary files is quite a bit faster:
tf<-"foo.txt"
system.time( {
write.table(data.frame(a=1:10000), sep="\t", file=tf)
zz<- paste(readLines(tf), collapse="\r\n")
writeClipboard(zz)
})
# user system elapsed # 0.03 0.00 0.03
system.time( {
write.table(data.frame(a=1:50000), sep="\t", file=tf)
zz<- paste(readLines(tf), collapse="\r\n")
writeClipboard(zz)
})
# user system elapsed # 0.13 0.00 0.13
And pasting into Excel works as expected in a fraction of the time.
Even going to the extreme row-count of Excel 2013/2016 (allowing for a header row):
system.time({
  write.table(data.frame(a=1:1048575), sep="\t", file=tf)
  zz <- paste(readLines(tf), collapse="\r\n")
  writeClipboard(zz)
})
#    user  system elapsed
#    3.72    0.08    3.89
(I don't want to try that with capture.output, though given its superlinear progression I imagine it would take on the order of 700 seconds, if it completed at all.)
BTW: I'm testing this with R 3.2.5 on 64-bit Windows 10, so I don't know whether, or how much, this affects other platforms.
Oh, interesting problem! I hadn't expected clipr to be used for such large payloads, but then again, why not? The temp-file solution looks elegant enough (I'd use the tempfile function to create the file path, though).
capture.output is also used on OS X and X11-like systems, so let me take a look at the behavior there. If there are no adverse performance effects from writing to a temp file on those platforms, I'll implement it.
Yes, I assumed tempfile would be the preferred method over my hard-coded filename.
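For concreteness, here's a minimal sketch of that approach as a standalone helper (a hypothetical function, not clipr's actual implementation; Windows-only, since it relies on utils::writeClipboard):
write_clip_via_tempfile <- function(x, ...) {
  # write to a throwaway temp file instead of capturing console output
  tf <- tempfile(fileext = ".txt")
  on.exit(unlink(tf), add = TRUE)
  write.table(x, file = tf, sep = "\t", ...)
  # read the lines back and push them to the clipboard with CRLF endings
  writeClipboard(paste(readLines(tf), collapse = "\r\n"))
}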
It's always interesting to see how others use your packages in ways you had not imagined. I'm frequently trying to copy things between R and Excel, at times pushing Excel's limits. I was pointed to your package by @alistaire on StackOverflow, and that's when I learned about utils::readClipboard and family. I had my own home-grown function using read.delim("clipboard", ...), but it would fail instantly with large data, so I had been resorting to intermediary CSV files. Even if I don't replace my home-grown utilities with clipr, looking at your code is helping me improve my own. Thanks!