Optimization fails for long simplexes #2273

jmaattaswap · 2017-04-13T09:40:54Z

Summary:

Optimization for a regression model with a 5000-element simplex parameter does not work as expected. A fraction of the last elements of the simplex get values very close to zero.

Description:

I encountered the problem with a different model, but was able to come up with a simpler example that can be used to reproduce it:

Consider a regression model y ~ normal(X*w, sigma), where w is a simplex and X is an NxD matrix with e.g. N=10000, D=5000. Fitting simulated data to the model, Stan ends up setting a small but non-negligible fraction of the last elements of w to values very close to zero.

Reproducible Steps:

The following R program reproduces the issue on my MacBook Pro, OS X 10.11.6, R 3.3.3, RStan 2.14.2. (I doubt the issue is specific to RStan, so I did not attempt to reproduce the issue with e.g. cmdstan.)

library(rstan)

stan_src = "
  data {
    int<lower=1> N;
    int<lower=1> D;
    matrix[N, D] X;
    vector[N] y;
  }
  parameters {
    simplex[D] w;
    real<lower=0> sigma;
  }
  model {
    y ~ normal(X * w, sigma);
  }
"

N <- 10000
D <- 5000
w <- ifelse(runif(D) < 0.5, runif(D, min = 1, max = 2), 0)
w <- w / sum(w)
X <- matrix(rnorm(N * D), nrow = N)
y <- c(X %*% w) + rnorm(N, mean = 0, sd = 0.1)

sm <- stan_model(model_code = stan_src)
fit <- optimizing(sm, list(N = N, D = D, X = X, y = y), iter = 10000, refresh = 10)

w_hat <- fit$par[grep('^w\\[', names(fit$par))]
plot(log(w_hat))

Current Output:

The log-plot of the estimated values of w clearly shows the issue:

Expected Output:

Wouldn't have expected the sharp drop at the end of the plot.

The text was updated successfully, but these errors were encountered:

jmaattaswap · 2017-04-13T13:02:18Z

Looks like this can be solved by setting tol_rel_grad and tol_rel_obj to zeros and doing loads of more iterations. The defaults for the convergence options have always worked for me before, so it didn't immediately occur to me to consider them as culprits.

(Still, I find the sequential structure in the plot above fascinating—stick-breaking in action?)

Closing the issue.

bob-carpenter · 2017-04-13T19:12:36Z

Thanks for reporting. I think it's worth mentioning your example in the doc: #2265 (comment) It may arise from the way we build simplexes. It should be that an all-zero init leads to a symmetric simplex, but if you get non-zero inits, I'm pretty sure they're no longer symmetric.

jmaattaswap closed this as completed Apr 13, 2017

bob-carpenter mentioned this issue Apr 13, 2017

next manual, 2.15++ #2265

Closed

seantalts modified the milestone: v2.15.0 Apr 14, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimization fails for long simplexes #2273

Optimization fails for long simplexes #2273

jmaattaswap commented Apr 13, 2017

jmaattaswap commented Apr 13, 2017

bob-carpenter commented Apr 13, 2017 via email

Optimization fails for long simplexes #2273

Optimization fails for long simplexes #2273

Comments

jmaattaswap commented Apr 13, 2017

Summary:

Description:

Reproducible Steps:

Current Output:

Expected Output:

jmaattaswap commented Apr 13, 2017

bob-carpenter commented Apr 13, 2017 via email