str() is slow for R6 objects containing large vectors #159

Closed
kevinushey opened this issue Sep 26, 2018 · 6 comments


@kevinushey kevinushey commented Sep 26, 2018

library(R6)
Big <- R6Class("Big", public = list(data = rnorm(1E7)))
big <- Big$new()
system.time(str(big))
#> Classes 'Big', 'R6' <Big>
#>   Public:
#>     clone: function (deep = FALSE) 
#>     data: -0.594100136328869 0.104513371816434 0.531537555504934 - ...
#>    user  system elapsed 
#>  19.191   0.417  19.630

See also: rstudio/rstudio#3544

Could R6 subset large vectors before calling str()?


@gaborcsardi gaborcsardi commented Sep 26, 2018

Seems like str() calls format(), and the format.R6 method is slow.


@wch wch commented Sep 26, 2018

It looks like most of the time is spent in this paste() call:

R6/R/print.R

Line 114 in 1c1f425

else if (is.atomic(obj)) trim(paste(as.character(obj), collapse = " "))

Given that the subsequent trim() call uses the default width of 60 characters, I suppose it would be OK to do something like:

  paste(as.character(head(obj, 60)), collapse = " ")

Do you guys see anything that I'm overlooking?
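
To make the idea concrete, here is a minimal sketch comparing the two approaches on a numeric vector. `trim_sketch()` is a hypothetical stand-in for R6's internal trim() (the real helper may differ), and `format_atomic_full()` / `format_atomic_head()` are illustrative names, not functions from the package.

```r
# Hypothetical stand-in for R6's internal trim(); the real helper may differ.
# Strings longer than n characters are cut and suffixed with " ...".
trim_sketch <- function(str, n = 60) {
  if (nchar(str) > n) paste(substr(str, 1, n - 4), "...") else str
}

# Current pattern: convert and paste the whole vector, then trim.
format_atomic_full <- function(obj) {
  trim_sketch(paste(as.character(obj), collapse = " "))
}

# Proposed pattern: keep only the first n elements before converting.
format_atomic_head <- function(obj, n = 60) {
  trim_sketch(paste(as.character(head(obj, n)), collapse = " "), n)
}

x <- rnorm(1e6)
system.time(full <- format_atomic_full(x))
system.time(short <- format_atomic_head(x))

# Each numeric element contributes at least one character, so the first
# 60 elements always cover the 60-character display width.
identical(full, short)
```

For numeric input the two results should match, because only the first 60 characters ever reach the output. A character vector made up of empty strings would be an edge case where 60 elements do not fill the width.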


@gaborcsardi gaborcsardi commented Sep 26, 2018

I think head() is a good idea. But it only works for objects with a [ method, so you would need some fallback, maybe just format(x) or the class of the object, like <environment>:

❯ head(environment())
Error in x[seq_len(n)] : object of type 'environment' is not subsettable

❯ head(ps::ps_handle())
Error in x[seq_len(n)] : object of type 'externalptr' is not subsettable

❯ format(environment())
[1] "<environment: R_GlobalEnv>"

❯ format(ps::ps_handle())
[1] "<ps::ps_handle> PID=18088, NAME=R, AT=2018-09-26 20:00:02"
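
A rough sketch of such a fallback, assuming it is acceptable to show only the class name when head() fails. `preview()` is a hypothetical helper for illustration, not part of R6.

```r
# Hypothetical helper: try the head() route first and fall back to a
# "<class>" label for objects that cannot be subset.
preview <- function(x, n = 60) {
  tryCatch(
    paste(as.character(head(x, n)), collapse = " "),
    error = function(e) paste0("<", class(x)[1], ">")
  )
}

preview(rnorm(5))   # first elements pasted together
preview(new.env())  # head() errors, so the class label is used instead
```

Using format(x) in the error handler instead of the class label would be the other option mentioned above.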


@gaborcsardi gaborcsardi commented Sep 26, 2018

Btw, there are some possibly related changes in R-3.5.1-patched: it has a workaround for print(runif(1e7)) taking a long time, i.e. it subsets using getOption("max.print") before doing the formatting. Maybe it is worth checking whether format() was also fixed there, not just print(). OTOH we want to fix older R versions as well...
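
The general shape of that kind of workaround, as a sketch only: this is not the actual code from R-patched, and `format_capped()` is a hypothetical name.

```r
# Hypothetical illustration: cap the vector at getOption("max.print")
# elements before formatting, so the cost no longer grows with length(x).
format_capped <- function(x, cap = getOption("max.print", 99999L)) {
  format(head(x, cap))
}

x <- runif(1e6)
length(format_capped(x))  # at most getOption("max.print") elements
```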


@wch wch commented Sep 26, 2018

But head() should work for the is.atomic() case, right? For the other cases you listed, it would already print them out in different ways.


@gaborcsardi gaborcsardi commented Sep 26, 2018

Yes, for is.atomic() it should work fine, I think.


@wch wch closed this in cec3820 Sep 26, 2018