Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move predictions to Stan #501

Closed
bletham opened this issue Apr 24, 2018 · 5 comments
Closed

Move predictions to Stan #501

bletham opened this issue Apr 24, 2018 · 5 comments

Comments

@bletham
Copy link
Contributor

bletham commented Apr 24, 2018

Right now we fit the model inside Stan but then make predictions in R and Py.

It would be beneficial for reducing code duplication across R and Py versions to do the predictions in Stan, including the simulation of future trend changes. It would probably also be faster. This Stan code does that:

functions {
  matrix get_changepoint_matrix(vector t, vector t_change, int T, int S) {
    // Assumes t and t_change are sorted.
    matrix[T, S] A;
    row_vector[S] a_row;
    int cp_idx;

    // Start with an empty matrix.
    A = rep_matrix(0, T, S);
    a_row = rep_row_vector(0, S);
    cp_idx = 1;

    // Fill in each row of A.
    for (i in 1:T) {
      while ((cp_idx <= S) && (t[i] >= t_change[cp_idx])) {
        a_row[cp_idx] = 1;
        cp_idx += 1;
      }
      A[i] = a_row;
    }
    return A;
  }

  // Logistic trend functions

  vector logistic_gamma(real k, real m, vector delta, vector t_change, int S) {
    vector[S] gamma;  // adjusted offsets, for piecewise continuity
    vector[S + 1] k_s;  // actual rate in each segment
    real m_pr;

    // Compute the rate in each segment
    k_s[1] = k;
    for (i in 1:S) {
      k_s[i + 1] = k_s[i] + delta[i];
    }

    // Piecewise offsets
    m_pr = m; // The offset in the previous segment
    for (i in 1:S) {
      gamma[i] = (t_change[i] - m_pr) * (1 - k_s[i] / k_s[i + 1]);
      m_pr = m_pr + gamma[i];  // update for the next segment
    }
    return gamma;
  }
  
  vector logistic_trend(
    real k, real m, vector delta, vector t, vector cap, matrix A,
    vector t_change, int S
  ) {
    vector[S] gamma;

    gamma = logistic_gamma(k, m, delta, t_change, S);
    return cap ./ (1 + exp(-(k + A * delta) .* (t - (m + A * gamma))));
  }

  // Linear trend function

  vector linear_trend(
    real k, real m, vector delta, vector t, matrix A, vector t_change
  ) {
    return (k + A * delta) .* t + (m + A * (-t_change .* delta));
  }
  
  // Helper for getting appropriate trend
  
  vector get_trend(
    real k, real m, vector delta, vector t, vector cap, matrix A,
    vector t_change, int S, int trend_indicator
  ) {
    if (trend_indicator == 0) {
      return linear_trend(k, m, delta, t, A, t_change);
    } else {
      return logistic_trend(k, m, delta, t, cap, A, t_change, S);
    }
  }
}

data {
  int T;                    // Number of time periods
  int<lower=1> K;           // Number of regressors
  vector[T] t;              // Time
  vector[T] cap;            // Capacities for logistic trend
  vector[T] y;              // Time series
  int S;                    // Number of changepoints
  vector[S] t_change;       // Times of trend changepoints
  matrix[T, K] X;           // Regressors
  vector[K] sigmas;         // Scale on seasonality prior
  real<lower=0> tau;        // Scale on changepoints prior
  int trend_indicator;      // 0 for linear, 1 for logistic

  int T_pred;               // Number of prediction time periods
  vector[T_pred] t_pred;    // Times for predictions
  vector[T_pred] cap_pred;  // Predictive capacities
  matrix[T_pred, K] X_pred; // Predictive features
  int n_samp;               // Number of samples for trend change uncertainty
  int S_pred;               // Upper bound on number of future changepoints
}

transformed data {
  matrix[T, S] A;
  A = get_changepoint_matrix(t, t_change, T, S);
}

parameters {
  real k;                   // Base trend growth rate
  real m;                   // Trend offset
  vector[S] delta;          // Trend rate adjustments
  real<lower=0> sigma_obs;  // Observation noise
  vector[K] beta;           // Regressor coefficients
}

model {
  // Priors
  k ~ normal(0, 5);
  m ~ normal(0, 5);
  delta ~ double_exponential(0, tau);
  sigma_obs ~ normal(0, 0.1);
  beta ~ normal(0, sigmas);

  // Likelihood
  y ~ normal(
    get_trend(k, m, delta, t, cap, A, t_change, S, trend_indicator) + X * beta,
    sigma_obs
  );
}

generated quantities {
  // Make predictions.
  vector[T_pred] y_hat;
  vector[T_pred] seas_hat;
  vector[T_pred] trend_hat;
  matrix[T_pred, S] A_pred;
  matrix[T_pred, n_samp] trend_samples;
  vector[S + S_pred] t_change_sim;
  vector[S + S_pred] delta_sim;
  real lambda;
  matrix[T_pred, S + S_pred] A_sim;

  if (T_pred > 0) {
    // Get the main estimate
    seas_hat = X_pred * beta;

    A_pred = get_changepoint_matrix(t_pred, t_change, T_pred, S);
    trend_hat = get_trend(
      k, m, delta, t_pred, cap_pred, A_pred, t_change, S, trend_indicator
    );

    y_hat = trend_hat + seas_hat;

    // Estimate uncertainty with the generative model

    // Set the first S changepoints as fitted
    for (i in 1:S) {
      t_change_sim[i] = t_change[i];
      delta_sim[i] = delta[i];
    }
    // Get the Laplace scale
    lambda = mean(fabs(delta)) + 1e-8;

    for (i in 1:n_samp) {
      if (S_pred > 0) {
         Sample new changepoints from a Poisson process with rate S
         Sample changepoint deltas from Laplace(lambda)
        t_change_sim[S + 1] = 1 + exponential_rng(S);
        for (j in (S + 2):(S + S_pred)) {
          t_change_sim[j] = t_change_sim[j - 1] + exponential_rng(S);
        }
        for (j in (S + 1): (S + S_pred)) {
          delta_sim[j] = double_exponential_rng(0, lambda);
        }
      }
      // Compute trend with these changepoints
      A_sim = get_changepoint_matrix(t_pred, t_change_sim, T_pred, S + S_pred);
      trend_samples[:, i] = get_trend(
        k, m, delta_sim, t_pred, cap_pred, A_sim, t_change_sim, S + S_pred,
        trend_indicator
      );
    }
  }
}

Integrating this relies on being able to return the future trends from Stan to R, which is blocked on stan-dev/rstan#519.

@alok
Copy link

alok commented Jul 13, 2018

Is this still blocked?

@bletham
Copy link
Contributor Author

bletham commented Jul 17, 2018

Hopefully not, the upstream issue should have been fixed but I haven't yet checked to see if the data transfer is fast enough now for the scale we need. (Note that the fix is merged in github but not pushed to CRAN so testing will need to install rstan from github).

@adrfantini
Copy link
Contributor

Is this still planned or has this been implemented?

@bletham
Copy link
Contributor Author

bletham commented Sep 28, 2019

@adrfantini #865 is the PR working on this, though it ran into some blockers a few months ago that we still need to figure out the best way around (still comes back to transferring large matrices from Stan into R being really slow). So there's still work to be done.

@bletham
Copy link
Contributor Author

bletham commented Feb 21, 2020

In light of the change to support multiple stan backends (pystan and cmdstanpy), I think our previous strategy of trying to use stan's GQ block to try and handle predictions is going to be more complicated. I no longer think it will be worth the effort to try to move predictions into Stan (especially given the difficulties that we have already run into in trying to do this).

There are a few features that this would have made easier (trend that saturates on one end, Poisson link function). I think we'll be better off just biting the bullet and implementing those in Stan/R/Py and have switches the way we do for linear vs. logistic trend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants