
Returning matrices from generated quantities is slow #519

Closed
bletham opened this issue Apr 24, 2018 · 11 comments

Comments

@bletham

bletham commented Apr 24, 2018

Summary:

Returning medium-sized matrices from generated quantities is extremely slow.

Description:

I originally raised this in stan-dev/stan#2516. I'm doing some computation in generated quantities in which I produce a bunch of samples from already-fitted model parameters. To do this with a point estimate, I am running rstan::optimizing initialized at the point estimate, and with iter=0. This does the computation I expect, but is extremely slow if the thing being generated in generated quantities is a medium-sized matrix.

The optimization finishes immediately with "Optimization terminated normally: Maximum number of iterations hit", but then there is a long delay before the command actually returns, and the delay scales with the matrix size: about 5 s for a 3000x100 matrix, about 10 s for 3000x200, and so on.

Optimizing the model below runs instantly in cmdstan, suggesting the issue is in the interface. It also returns very quickly in pystan.

This may be related to #464. Is there a workaround here? 3000x200 didn't seem to me like that large a matrix to be causing 10 s of I/O, especially relative to the amount of data produced by MCMC.

Reproducible Steps:

library(rstan)
model_code = '
data {
  int N;
  vector[N] y;
}

parameters {
  real theta;
}

model {
  y ~ normal(theta, 1);
}

generated quantities {
  matrix[3000, 200] A;
  for (j in 1:200) {
    for (i in 1:3000) {
      A[i, j] = 1;
    }
  }

  print("finished gq");
}
'
model = stan_model(model_code=model_code)

dat <- list(N=20, y=rnorm(20, 2))

t1 = Sys.time()
res <- optimizing(
  model, data = dat, init = function(){list(theta=2)}, iter=0
)
print(Sys.time() - t1)  # 10s on my machine

RStan Version:

2.17.3

R Version:

3.3.3

Operating System:

Debian

@bletham

bletham commented Apr 24, 2018

For one additional piece of context, we are trying to move all of the prediction for prophet into Stan (currently the model is fit in Stan but predictions are done in R/py). Part of the prediction is a simulation of future trend changes, and it is returning the results of that simulation that is the bottleneck.

@bob-carpenter

Thanks for exploring further and moving this over. We want to make this fast because we're about to roll out standalone generated quantities evaluation, so that you can take posterior draws and rerun them with new generated quantities blocks.

@bletham

bletham commented Apr 25, 2018

That functionality would be extremely useful, really excited to hear about it!

@maverickg

At least a couple of copies are made in this case (see https://github.com/stan-dev/rstan/blob/develop/rstan/rstan/inst/include/rstan/stan_fit.hpp#L535).

However, that might not be the main reason it takes unexpectedly long. Right now, generating the names for the parameters takes quite some time. For example,

> print(Sys.time() - t1)  # 10s on my machine
Time difference of 14.22466 secs

> system.time( rstan:::flat_one_par('a', c(3000, 200)))
   user  system elapsed
  8.166   0.056   8.225

@sakrejda

seq_array_ind loops over all the indexes (at a glance, I think it does), so that would be slow. It could easily use R's `gl` function instead.
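As a sketch of that suggestion, the index grid can be built without an explicit loop; `make_index_grid` below is a hypothetical stand-in for `seq_array_ind`, using base R's `expand.grid` (rather than `gl`) to enumerate the indexes in one vectorized call:

```r
# Hypothetical vectorized replacement for seq_array_ind: build the full
# index grid for dimensions d without looping over every element.
# expand.grid varies its first argument fastest (column-major order), so
# for row-major order we reverse the dimensions and then the columns.
make_index_grid <- function(d, col_major = FALSE) {
  dims <- lapply(d, seq_len)
  g <- if (col_major) expand.grid(dims) else rev(expand.grid(rev(dims)))
  unname(as.matrix(g))
}
```

For d = c(3000, 200) this yields the same 600000 x 2 index matrix in a single vectorized allocation instead of 600000 loop iterations.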

@seanjtaylor

Hey @bob-carpenter and @bgoodri, do you think you could estimate how long this will take to fix? Two weeks vs. two months vs. six months would make a big difference for us. We'd like to plan the next Prophet release, and this is a big decision point for us (whether to push our prediction into generated quantities or keep it in Python/R).

@bgoodri

bgoodri commented Apr 30, 2018 via email

@bob-carpenter

bob-carpenter commented May 1, 2018 via email

@bgoodri

bgoodri commented May 1, 2018 via email

@maverickg

seq_array_ind is not the main reason. The string conversion in R takes quite some time.

> rstan:::flat_one_par
function (n, d, col_major = FALSE)
{
    if (0 == length(d))
        return(n)
    nameidx <- seq_array_ind(d, col_major)
    names <- apply(nameidx, 1, function(x) paste(n, "[", paste(x,
        collapse = ","), "]", sep = ""))
    as.vector(names)
}
> system.time(nameidx <- rstan:::seq_array_ind(d, FALSE))
   user  system elapsed
  0.758   0.007   0.765
> n <- 'a'
> system.time(names <- apply(nameidx, 1, function(x) paste(n, "[", paste(x,
+         collapse = ","), "]", sep = "")))
   user  system elapsed
  7.329   0.021   7.352

I will see if using the names returned from c++ code is faster.
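As an aside, the slow step in the timing above is the row-wise apply() with paste(); since paste() and paste0() recycle over whole vectors, the same names can be built in one vectorized call. `flat_names` below is a hypothetical sketch of that replacement:

```r
# Hypothetical vectorized version of the name-building step in
# flat_one_par: replace the per-row apply() with column-wise paste(),
# which operates on whole vectors at once.
flat_names <- function(n, nameidx) {
  cols <- split(nameidx, col(nameidx))         # one vector per index column
  inner <- do.call(paste, c(cols, sep = ","))  # "i,j,..." for every row
  paste0(n, "[", inner, "]")
}
```

The per-element work then happens inside C-level string code rather than in 600000 R function calls.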

@maverickg

This helps a lot: 73f4ba7

But in the above case, it still needs 2-3 seconds.
