-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bug] The Added Loss term for InducingKernel seems flipped in sign? #1657
Comments
Hi @maharjun - I see you've closed this issue? How is it resolved? |
It's because flipping the sign caused the errors to become pathological. Upon further investigation, I realize that It's because there are two errors that cancel each other out. The initialization of the InducingPointKernelAddedLossTerm in inducing_point_kernel.py flips the variational distribution and the prior distribution arguments which makes the above sign convention work. I'm opening this issue again so that this double error can be fixed. |
I think the current version is a bit confusing because of implicitly reversing the order of subtraction. Would the following be less confusing? def loss(self, *params):
prior_covar = self.prior_dist.lazy_covariance_matrix
variational_covar = self.variational_dist.lazy_covariance_matrix
diag = prior_covar.diag() - variational_covar.diag()
shape = prior_covar.shape[:-1]
noise_diag = self.likelihood._shaped_noise_covar(shape, *params).diag()
return -0.5 * (diag / noise_diag).sum() # As per Eq. 9 in original paper, we are subtracting this term (instead of adding as the class name "AddedLossTerm" suggests) |
@patel-zeel this seems reasonable - if you want to put up a PR for it. |
🐛 Bug
This is the current code for InducingPointKernelAddedLossTerm.loss
From what I see, this "loss term" is added into the mll that is returned by the
ExactMarginalLogLikelihood
class. This in itself is misleading as the loss is usually the negative of the mll.Moreover, the variational negative loss used to evaluate inducing points is given below
As can be seen, the above is the expression for the pseudo-mll that is maximized when optimizing the inducing points. in this, the component of
InducingPointKernelAddedLossTerm
is negative to the value that is being added into the loss.This is quite likely a significant bug. Please fix (just invert the sign of
diag
above)The text was updated successfully, but these errors were encountered: