str() is slow for R6 objects containing large vectors #159

Closed
kevinushey opened this issue Sep 26, 2018 · 6 comments

Comments

@kevinushey

commented Sep 26, 2018

library(R6)
Big <- R6Class("Big", public = list(data = rnorm(1E7)))
big <- Big$new()
system.time(str(big))
#> Classes 'Big', 'R6' <Big>
#>   Public:
#>     clone: function (deep = FALSE) 
#>     data: -0.594100136328869 0.104513371816434 0.531537555504934 - ...
#>    user  system elapsed 
#>  19.191   0.417  19.630

See also: rstudio/rstudio#3544

Could R6 subset large vectors before calling str()?

@gaborcsardi

Member

commented Sep 26, 2018

Seems like str calls format and the format.R6 method is slow.

@wch

Member

commented Sep 26, 2018

It looks like most of the time is spent in this paste() call:

R6/R/print.R

Line 114 in 1c1f425

else if (is.atomic(obj)) trim(paste(as.character(obj), collapse = " "))

Given that the subsequent trim() call uses the default width of 60 characters, I suppose it would be OK to do something like:

  paste(as.character(head(obj, 60)), collapse = " ")

Do you guys see anything that I'm overlooking?
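
A minimal sketch of that idea, for illustration only. `trim_sketch` and `format_atomic_sketch` are hypothetical stand-ins, not R6's actual internals, and the real trim() in R6/R/print.R may behave differently. The point is that only the first 60 elements are converted to character, so the cost no longer grows with the length of the vector:

```r
# Hypothetical stand-in for R6's internal trim(): cut a string to `n`
# characters and mark the truncation with " ...".
trim_sketch <- function(str, n = 60) {
  if (nchar(str) > n) paste(substr(str, 1, n - 4), "...") else str
}

# Format an atomic vector on one line, converting only the leading elements.
# For numeric vectors every element contributes at least one character plus
# a separator, so elements past the first `width` cannot appear in the
# trimmed output anyway.
format_atomic_sketch <- function(obj, width = 60) {
  stopifnot(is.atomic(obj))
  trim_sketch(paste(as.character(head(obj, width)), collapse = " "), width)
}

x <- rnorm(1e5)

# Old approach: convert everything, then trim.
full <- trim_sketch(paste(as.character(x), collapse = " "))

# Sketched approach: subset first, then convert and trim.
short <- format_atomic_sketch(x)

identical(full, short)
```

One caveat with this sketch: a character vector made mostly of empty strings could paste to fewer than 60 characters from its first 60 elements, so the two approaches are only guaranteed to agree when each element is at least one character wide.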

@gaborcsardi

Member

commented Sep 26, 2018

I think head() is a good idea. But it only works for objects with a [ method, so you would need some fallback, maybe just format(x) or the class of the object, like <environment>:

❯ head(environment())
Error in x[seq_len(n)] : object of type 'environment' is not subsettable

❯ head(ps::ps_handle())
Error in x[seq_len(n)] : object of type 'externalptr' is not subsettable

❯ format(environment())
[1] "<environment: R_GlobalEnv>"

❯ format(ps::ps_handle())
[1] "<ps::ps_handle> PID=18088, NAME=R, AT=2018-09-26 20:00:02"
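
One way such a fallback could look, as a rough sketch. `head_or_format` is a hypothetical helper name, not something that exists in R6; it simply tries head() and falls back to format() when the object is not subsettable:

```r
# Hypothetical helper: take the first `n` elements when the object supports
# subsetting, otherwise fall back to format() of the whole object.
head_or_format <- function(x, n = 60) {
  tryCatch(
    head(x, n),
    error = function(e) format(x)
  )
}

# Subsettable input: returns the leading elements unchanged.
head_or_format(1:100, 3)

# Non-subsettable input: head() errors, so the format() result is returned.
head_or_format(globalenv())
```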

@gaborcsardi

Member

commented Sep 26, 2018

By the way, there are some possibly related changes in R-3.5.1-patched: it has a workaround for print(runif(1e7)) taking a long time, i.e. it subsets using getOption("max.print") before doing the formatting. It may be worth checking whether format() was also fixed there, not just print(). On the other hand, we want to fix older R versions as well...
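
The subset-before-formatting idea can be sketched as follows. This only illustrates the general approach described above and is not R's actual implementation; `cap_then_format` is a hypothetical name:

```r
# Hypothetical helper: cap the number of elements before the expensive
# character conversion. `limit` defaults to the session's max.print option.
cap_then_format <- function(x, limit = getOption("max.print")) {
  format(head(x, limit))
}

# Only the first 10 of the 1e6 values are formatted here.
out <- cap_then_format(runif(1e6), limit = 10)
length(out)
```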

@wch

Member

commented Sep 26, 2018

But head() should work for the is.atomic() case, right? For the other cases you listed, it would already print them out in different ways.

@gaborcsardi

Member

commented Sep 26, 2018

Yes, for is.atomic() it should work fine, I think.

@wch wch closed this in cec3820 Sep 26, 2018
