Some of the GP models in the Example Models/Gaussian Process section of the Reference Manual are invalid, and they also need to be updated for some of the new kernels.
Description:
I will make a list of things I've noticed; it is in no way meant to be comprehensive, and I will update this post as I go:
The joint distribution for the unobserved y is contained in the total covariance matrix. On pages 259 and 260, we have something like the following:
transformed data {
  real delta = 1e-9;
  int<lower=1> N = N1 + N2;
  real x[N];
  for (n1 in 1:N1) x[n1] = x1[n1];
  for (n2 in 1:N2) x[N1 + n2] = x2[n2];
}
This does not make sense. We simply need two x vectors: x and x_pred, where x_pred holds the out-of-sample input locations. If we take
generated quantities {
  vector[N2] y2;
  for (n2 in 1:N2)
    y2[n2] = normal_rng(f[N1 + n2], sigma);
}
then we generate predictions for indices greater than N1 that are essentially just normal random variates, and we incorporate nothing of what we've estimated in the model. Another note: since we have a Gaussian likelihood, we do not need the latent f and can use y directly. We only need the latent f in generated quantities when the likelihood is non-Gaussian.
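To illustrate marginalizing out the latent f: with a Gaussian likelihood, the noise variance can be added directly to the covariance diagonal and y modeled directly. A minimal sketch of the model block, assuming an exponentiated-quadratic kernel; the hyperparameter names (magnitude, length_scale, sigma) and priors here are illustrative, not the manual's:

```
model {
  // marginal covariance: k(x, x) + sigma^2 * I, so no latent f is needed
  matrix[N, N] K = cov_exp_quad(x, magnitude, length_scale)
                   + diag_matrix(rep_vector(square(sigma), N));
  matrix[N, N] L_K = cholesky_decompose(K);

  // illustrative priors on the hyperparameters
  magnitude ~ normal(0, 1);
  length_scale ~ inv_gamma(5, 5);
  sigma ~ normal(0, 1);

  y ~ multi_normal_cholesky(rep_vector(0, N), L_K);
}
```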
Instead, we use matrix algebra (i.e., the posterior predictive mean and posterior predictive variance); then the data and generated quantities blocks can look the same for all models, assuming we generate the posterior predictive correctly. There is an example below (for ARD/separate length scales).
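For reference, here is one way the gp_pred_rng used below could be written as a user-defined function. This is a sketch only: it assumes an exponentiated-quadratic kernel with a single length scale for simplicity, and conditions on the latent f (so there is no sigma^2 term; for the Gaussian-likelihood case one would condition on y and add square(sigma) to the diagonal of K):

```
functions {
  vector gp_pred_rng(real[] x_pred, vector f, real[] x,
                     real magnitude, real length_scale) {
    int N = rows(f);
    int N_pred = size(x_pred);
    vector[N_pred] f_pred;
    {
      matrix[N, N] K;
      matrix[N, N] L_K;
      vector[N] K_div_f;
      matrix[N, N_pred] k_x_x_pred;
      vector[N_pred] f_pred_mu;
      matrix[N, N_pred] v_pred;
      matrix[N_pred, N_pred] cov_f_pred;
      real delta = 1e-9;  // jitter so the predictive covariance stays positive definite

      // K = k(x, x) + delta * I; conditioning on latent f, so no sigma^2 term
      K = cov_exp_quad(x, magnitude, length_scale)
          + diag_matrix(rep_vector(delta, N));
      L_K = cholesky_decompose(K);
      // K^{-1} f via two triangular solves against the Cholesky factor
      K_div_f = mdivide_left_tri_low(L_K, f);
      K_div_f = mdivide_right_tri_low(K_div_f', L_K)';
      k_x_x_pred = cov_exp_quad(x, x_pred, magnitude, length_scale);
      f_pred_mu = k_x_x_pred' * K_div_f;  // posterior predictive mean
      v_pred = mdivide_left_tri_low(L_K, k_x_x_pred);
      cov_f_pred = cov_exp_quad(x_pred, magnitude, length_scale)
                   - v_pred' * v_pred
                   + diag_matrix(rep_vector(delta, N_pred));  // posterior predictive covariance
      f_pred = multi_normal_rng(f_pred_mu, cov_f_pred);
    }
    return f_pred;
  }
}
```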
I'm also keen on generating both out-of-sample and in-sample predictions in my generated quantities block. For a binary classifier, assuming we've generated the latent f_pred properly (this is f* in GPML notation), that looks as follows:
generated quantities {
  vector[N_pred] f_pred = gp_pred_rng(x_pred, f, x, magnitude, length_scale);
  int y_pred[N_pred];
  int y_pred_in[N];
  for (n in 1:N) y_pred_in[n] = bernoulli_logit_rng(f[n]);         // in-sample predictions
  for (n in 1:N_pred) y_pred[n] = bernoulli_logit_rng(f_pred[n]);  // out-of-sample predictions
}
We should also note that the posterior predictive depends on the likelihood or noise model we're assuming, as well as on the covariance function. For example, in the binary classifier (logit) example, we only need the mean function (note also that I'm using the mean function without noisy predictions).
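A sketch of a mean-function-only helper for the non-Gaussian case; the name gp_pred_mean is hypothetical (to distinguish it from the sampling version gp_pred_rng used above), and again an exponentiated-quadratic kernel with a single length scale is assumed:

```
functions {
  // Hypothetical helper: posterior predictive mean of the latent function only,
  // with no predictive noise and no draw from the predictive covariance.
  vector gp_pred_mean(real[] x_pred, vector f, real[] x,
                      real magnitude, real length_scale) {
    int N = rows(f);
    int N_pred = size(x_pred);
    matrix[N, N] K;
    matrix[N, N] L_K;
    vector[N] K_div_f;
    matrix[N, N_pred] k_x_x_pred;
    real delta = 1e-9;  // jitter for numerical stability

    K = cov_exp_quad(x, magnitude, length_scale)
        + diag_matrix(rep_vector(delta, N));
    L_K = cholesky_decompose(K);
    // K^{-1} f via two triangular solves
    K_div_f = mdivide_left_tri_low(L_K, f);
    K_div_f = mdivide_right_tri_low(K_div_f', L_K)';
    k_x_x_pred = cov_exp_quad(x, x_pred, magnitude, length_scale);
    return k_x_x_pred' * K_div_f;  // mean function only
  }
}
```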
I'm trying to locate Example Models section 18 Gaussian Process Models. I've gone through some of the files in stan/src/docs/reference-manual/ sequentially, but I've had no luck. Where can I find this section so I can do a pull request? Thanks!
@drezap great point on points 2 and 3 for the prediction function, we should add this code to the manual. Just a note, the code in the user guide for the predictions using the latent functions isn't wrong, it's just not as efficient as it could be, as you rightly point out. Re the form for Gaussian models see pages 152 to 154 in the new guide. I wrote the section in the user guide to be more pedagogical, but I can see an argument for not including any inefficient code in the manual, even if it's used as a building block for later more efficient code.
This wasn't as organized as I'd hoped, but it hits on some points.
Reproducible Steps:
If you copy and paste some of the code from the Stan manual and plot the in-sample predictive distribution, you will see what I'm talking about.
Current Version:
v2.18.0