
Returning matrices from generated quantities is slow #519

Closed
bletham opened this issue Apr 24, 2018 · 11 comments

Comments

@bletham

bletham commented Apr 24, 2018

Summary:

Returning medium-sized matrices from generated quantities is extremely slow.

Description:

I originally raised this in stan-dev/stan#2516. I'm doing some computation in generated quantities in which I produce a bunch of samples from already-fitted model parameters. To do this with a point estimate, I am running rstan::optimizing initialized at the point estimate, and with iter=0. This does the computation I expect, but is extremely slow if the thing being generated in generated quantities is a medium-sized matrix.

The optimization finishes immediately with "Optimization terminated normally: Maximum number of iterations hit", but then there is a long delay before the command actually returns, and the delay scales with the matrix size: about 5 s for a 3000x100 matrix, about 10 s for 3000x200, and so on.

Optimizing the model below runs instantly in cmdstan, suggesting the issue is in the interface. It also returns very quickly in pystan.

This may be related to #464. Is there a workaround here? 3000x200 didn't seem to me like that large a matrix to be causing 10 s of I/O, especially relative to the amount of data produced by MCMC.

Reproducible Steps:

library(rstan)
model_code = '
data {
  int N;
  vector[N] y;
}

parameters {
  real theta;
}

model {
  y ~ normal(theta, 1);
}

generated quantities {
  matrix[3000, 200] A;
  for (j in 1:200) {
    for (i in 1:3000) {
      A[i, j] = 1;
    }
  }

  print("finished gq");
}
'
model = stan_model(model_code=model_code)

dat <- list(N=20, y=rnorm(20, 2))

t1 = Sys.time()
res <- optimizing(
  model, data = dat, init = function(){list(theta=2)}, iter=0
)
print(Sys.time() - t1)  # 10s on my machine

RStan Version:

2.17.3

R Version:

3.3.3

Operating System:

Debian

@bletham

bletham commented Apr 24, 2018

For one additional piece of context, we are trying to move all of the prediction for prophet into Stan (currently the model is fit in Stan but predictions are done in R/py). Part of the prediction is a simulation of future trend changes, and it is returning the results of that simulation that is the bottleneck.

@bob-carpenter

Thanks for exploring further and moving this over. We want to make this fast because we're about to roll out standalone generated quantities evaluation, so that you can take posterior draws and rerun them with new generated quantities blocks.

@bletham

bletham commented Apr 25, 2018

That functionality would be extremely useful, really excited to hear about it!

@maverickg

At least a couple of copies are made in this case (see https://github.com/stan-dev/rstan/blob/develop/rstan/rstan/inst/include/rstan/stan_fit.hpp#L535).

However, that might not be the main reason it takes unexpectedly long. Right now, generating the names for the parameters takes quite some time. For example,

> print(Sys.time() - t1)  # 10s on my machine
Time difference of 14.22466 secs

> system.time( rstan:::flat_one_par('a', c(3000, 200)))
   user  system elapsed
  8.166   0.056   8.225

@sakrejda

seq_array_ind loops over all the indexes (at a glance, I think it does), so that would be slow. It could easily use R's `gl` function instead.
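As a sketch of that suggestion, the index grid can be built without an explicit loop; `make_index_grid` below is a hypothetical stand-in for `seq_array_ind`, using base R's `expand.grid` (rather than `gl`) to enumerate the indexes in one vectorized call:

```r
# Hypothetical vectorized replacement for seq_array_ind: build the full
# index grid for dimensions d without looping over every element.
# expand.grid varies its first argument fastest (column-major order), so
# for row-major order we reverse the dimensions and then the columns.
make_index_grid <- function(d, col_major = FALSE) {
  dims <- lapply(d, seq_len)
  g <- if (col_major) expand.grid(dims) else rev(expand.grid(rev(dims)))
  unname(as.matrix(g))
}
```

For d = c(3000, 200) this yields the same 600000 x 2 index matrix in a single vectorized allocation instead of 600000 loop iterations.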

@seanjtaylor

Hey @bob-carpenter and @bgoodri, do you think you could estimate how long this will take to fix? Two weeks vs. two months vs. six months would make a big difference for us. We'd like to plan the next Prophet release, and this is a big decision point for us (whether to push our prediction into generated quantities or keep it in Python/R).

@bgoodri

bgoodri commented Apr 30, 2018 via email

@bob-carpenter

bob-carpenter commented May 1, 2018 via email

@bgoodri

bgoodri commented May 1, 2018 via email

@maverickg

seq_array_ind is not the main reason. The string conversion in R takes quite some time.

> rstan:::flat_one_par
function (n, d, col_major = FALSE)
{
    if (0 == length(d))
        return(n)
    nameidx <- seq_array_ind(d, col_major)
    names <- apply(nameidx, 1, function(x) paste(n, "[", paste(x,
        collapse = ","), "]", sep = ""))
    as.vector(names)
}
> system.time(nameidx <- rstan:::seq_array_ind(d, FALSE))
   user  system elapsed
  0.758   0.007   0.765
> n <- 'a'
> system.time(names <- apply(nameidx, 1, function(x) paste(n, "[", paste(x,
+         collapse = ","), "]", sep = "")))
   user  system elapsed
  7.329   0.021   7.352

I will see if using the names returned from c++ code is faster.
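As an aside, the slow step in the timing above is the row-wise apply() with paste(); since paste() and paste0() recycle over whole vectors, the same names can be built in one vectorized call. `flat_names` below is a hypothetical sketch of that replacement:

```r
# Hypothetical vectorized version of the name-building step in
# flat_one_par: replace the per-row apply() with column-wise paste(),
# which operates on whole vectors at once.
flat_names <- function(n, nameidx) {
  cols <- split(nameidx, col(nameidx))         # one vector per index column
  inner <- do.call(paste, c(cols, sep = ","))  # "i,j,..." for every row
  paste0(n, "[", inner, "]")
}
```

The per-element work then happens inside C-level string code rather than in 600000 R function calls.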

@maverickg

This helps a lot: 73f4ba7

But in the above case, it still needs 2-3 seconds.
