Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization fails for long simplexes #2273

Closed
jmaattaswap opened this issue Apr 13, 2017 · 2 comments
Closed

Optimization fails for long simplexes #2273

jmaattaswap opened this issue Apr 13, 2017 · 2 comments
Milestone

Comments

@jmaattaswap
Copy link

Summary:

Optimization for a regression model with a 5000-element simplex parameter does not work as expected. A fraction of the last elements of the simplex get values very close to zero.

Description:

I encountered the problem with a different model, but was able to come up with a simpler example that can be used to reproduce it:

Consider a regression model y ~ normal(X*w, sigma), where w is a simplex and X is an NxD matrix with e.g. N=10000, D=5000. Fitting simulated data to the model, Stan ends up setting a small but non-negligible fraction of the last elements of w to values very close to zero.

Reproducible Steps:

The following R program reproduces the issue on my MacBook Pro, OS X 10.11.6, R 3.3.3, RStan 2.14.2. (I doubt the issue is specific to RStan, so I did not attempt to reproduce the issue with e.g. cmdstan.)

library(rstan)

stan_src = "
  data {
    int<lower=1> N;
    int<lower=1> D;
    matrix[N, D] X;
    vector[N] y;
  }
  parameters {
    simplex[D] w;
    real<lower=0> sigma;
  }
  model {
    y ~ normal(X * w, sigma);
  }
"

N <- 10000
D <- 5000
w <- ifelse(runif(D) < 0.5, runif(D, min = 1, max = 2), 0)
w <- w / sum(w)
X <- matrix(rnorm(N * D), nrow = N)
y <- c(X %*% w) + rnorm(N, mean = 0, sd = 0.1)

sm <- stan_model(model_code = stan_src)
fit <- optimizing(sm, list(N = N, D = D, X = X, y = y), iter = 10000, refresh = 10)

w_hat <- fit$par[grep('^w\\[', names(fit$par))]
plot(log(w_hat))

Current Output:

The log-plot of the estimated values of w clearly shows the issue:

image

Expected Output:

Wouldn't have expected the sharp drop at the end of the plot.

@jmaattaswap
Copy link
Author

Looks like this can be solved by setting tol_rel_grad and tol_rel_obj to zeros and doing loads of more iterations. The defaults for the convergence options have always worked for me before, so it didn't immediately occur to me to consider them as culprits.

(Still, I find the sequential structure in the plot above fascinating—stick-breaking in action?)

Closing the issue.

@bob-carpenter
Copy link
Contributor

bob-carpenter commented Apr 13, 2017 via email

@seantalts seantalts modified the milestone: v2.15.0 Apr 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants