Update Gaussian Process Models in Stan Reference Manual #16

Closed
drezap opened this issue Aug 14, 2018 · 2 comments
drezap commented Aug 14, 2018

Summary:

There are some GP models in the Example Models/Gaussian Processes section of the Reference Manual that are invalid, and the section also needs to be updated for some of the new kernels.

Description:

I will make a list of things I've noticed. This is in no way meant to be comprehensive, and I will update this post as I go:

  1. The joint distribution for the unobserved y is contained in the total covariance matrix. On pages 259/260, we have something like the following:
transformed data {
  real delta = 1e-9;
  int<lower=1> N = N1 + N2;
  real x[N];
  for (n1 in 1:N1) x[n1] = x1[n1];
  for (n2 in 1:N2) x[N1 + n2] = x2[n2];
}

This does not make sense. We simply need two x vectors: x and x_pred, where x_pred contains the out-of-sample prediction points. If we take

generated quantities {
  vector[N2] y2;
  for (n2 in 1:N2)
    y2[n2] = normal_rng(f[N1 + n2], sigma);
}

then we generate predictions for indices greater than N1 that are essentially just normal random variates, and we are incorporating nothing we've approximated in the model. Another note: since we have a Gaussian likelihood, we do not need the latent f and can instead use y directly. We only need the latent f in generated quantities when the likelihood is non-Gaussian.

Instead, we use matrix algebra (i.e., the posterior predictive mean function and posterior predictive variance), and then the data and generated quantities blocks can look the same for all models, something like this (for ARD/separate length scales):

data {
  int<lower=1> N;
  int<lower=1> D;
  vector[D] x[N];
  int<lower=0,upper=1> y[N];

  int<lower=1> N_pred;
  vector[D] x_pred[N_pred];
}
parameters {
  real<lower=0> magnitude;
  real<lower=0> length_scale[D];
  vector[N] eta;
}

assuming we generate the posterior predictive correctly; there is an example below.
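As a cross-check of the matrix algebra being proposed (not Stan code from the manual — just a numpy sketch under the assumptions of a squared-exponential kernel, 1-D inputs, and Gaussian noise; the helper names are hypothetical):

```python
import numpy as np

def sq_exp_cov(x1, x2, magnitude, length_scale):
    """Squared-exponential kernel: m^2 * exp(-(x - x')^2 / (2 l^2))."""
    d = x1[:, None] - x2[None, :]
    return magnitude**2 * np.exp(-0.5 * (d / length_scale)**2)

def gp_posterior(x, y, x_pred, magnitude, length_scale, sigma):
    """Posterior predictive mean and covariance for a GP with Gaussian noise."""
    K = sq_exp_cov(x, x, magnitude, length_scale) + sigma**2 * np.eye(len(x))
    k_star = sq_exp_cov(x, x_pred, magnitude, length_scale)   # K(x, x*)
    k_ss = sq_exp_cov(x_pred, x_pred, magnitude, length_scale)
    mean = k_star.T @ np.linalg.solve(K, y)                   # K*' K^-1 y
    cov = k_ss - k_star.T @ np.linalg.solve(K, k_star)        # K** - K*' K^-1 K*
    return mean, cov

x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 2.0, 1.5])
# Predicting at an observed input: the mean is pulled toward the observation
# and the variance collapses, which draws from normal_rng(f, sigma) alone
# would not reproduce for new inputs.
mean, cov = gp_posterior(x, y, np.array([1.0]), 1.0, 1.0, 0.1)
```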

  2. I'm also keen on generating out-of-sample and in-sample predictions in my generated quantities block. For a binary classifier, assuming we've generated the latent f* properly (using f*, following GPML notation), this is as follows:
generated quantities {
  vector[N_pred] f_pred = gp_pred_rng(x_pred, f, x, magnitude, length_scale);
  int y_pred[N_pred];
  int y_pred_in[N];
  
  for (n in 1:N) y_pred_in[n] = bernoulli_logit_rng(f[n]); // in sample prediction
  for (n in 1:N_pred) y_pred[n] = bernoulli_logit_rng(f_pred[n]); // out of sample predictions
}
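The step from the latent f to class draws is just an inverse logit followed by a Bernoulli draw; a minimal numpy sketch of what bernoulli_logit_rng does per element (the f values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

def bernoulli_logit_rng(f):
    """Draw y ~ Bernoulli(inv_logit(f)) elementwise."""
    p = 1.0 / (1.0 + np.exp(-np.asarray(f)))  # inverse logit
    return (rng.random(p.shape) < p).astype(int)

f = np.array([-10.0, 0.0, 10.0])
draws = bernoulli_logit_rng(f)
# f = -10 gives p near 0, f = 10 gives p near 1
```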
  3. We also need to note that the posterior predictive depends on the likelihood or noise model we're assuming, and also on the covariance function. For example, in the binary classifier (logit) example, we only need the mean function. (Also note, I'm using the mean function without noisy predictions):
functions {
  vector gp_pred_rng(vector[] x_pred,
                     vector y1, vector[] x,
                     real magnitude, real[] length_scale) {
    int N = rows(y1);
    int N_pred = size(x_pred);
    vector[N_pred] f2;
    {
      matrix[N, N] K = gp_exp_quad_cov(x, magnitude, length_scale)
                       + diag_matrix(rep_vector(1e-9, N)); // jitter, as with delta above
      matrix[N, N] L_K = cholesky_decompose(K);
      vector[N] L_K_div_y1 = mdivide_left_tri_low(L_K, y1);
      vector[N] K_div_y1 = mdivide_right_tri_low(L_K_div_y1', L_K)';
      matrix[N, N_pred] k_x_x_pred = gp_exp_quad_cov(x, x_pred, magnitude, length_scale);
      f2 = (k_x_x_pred' * K_div_y1);
    }
    return f2;
  }
}
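The two triangular solves in the function above are just a numerically stable way of forming K^-1 y1; a quick numpy check of that identity on a generic SPD matrix (values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
K = A @ A.T + 4.0 * np.eye(4)        # a generic SPD matrix
y = rng.standard_normal(4)

L = np.linalg.cholesky(K)            # K = L L'
# mdivide_left_tri_low(L_K, y1) computes L \ y; the second solve applies L' \ .
a = np.linalg.solve(L, y)            # L \ y
K_div_y = np.linalg.solve(L.T, a)    # L' \ (L \ y)  ==  K^-1 y
```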

This wasn't as organized as I'd hoped, but it hits on some points.

Reproducible Steps:

If you copy and paste some of the notation in the Stan manual and plot the in sample predictive distribution, you will see what I'm talking about.

Current Version:

v2.18.0


drezap commented Aug 14, 2018

Hi -

I'm trying to locate Example Models section 18 Gaussian Process Models. I've gone through some of the files in stan/src/docs/reference-manual/ sequentially, but I've had no luck. Where can I find this section so I can do a pull request? Thanks!


rtrangucci commented Aug 16, 2018

@drezap great point on points 2 and 3 for the prediction function, we should add this code to the manual. Just a note, the code in the user guide for the predictions using the latent functions isn't wrong, it's just not as efficient as it could be, as you rightly point out. Re the form for Gaussian models see pages 152 to 154 in the new guide. I wrote the section in the user guide to be more pedagogical, but I can see an argument for not including any inefficient code in the manual, even if it's used as a building block for later more efficient code.

@mitzimorris mitzimorris transferred this issue from stan-dev/stan Dec 23, 2018
@mitzimorris mitzimorris added this to the 2.18.++ milestone Jan 25, 2019
@drezap drezap closed this as completed Jul 3, 2019