
[Bug] The Added Loss term for InducingKernel seems flipped in sign? #1657

Closed
maharjun opened this issue Jun 15, 2021 · 4 comments · Fixed by #1670


maharjun commented Jun 15, 2021

🐛 Bug

    def loss(self, *params):
        prior_covar = self.prior_dist.lazy_covariance_matrix
        variational_covar = self.variational_dist.lazy_covariance_matrix
        # Difference of the two covariance diagonals, scaled by the noise
        diag = prior_covar.diag() - variational_covar.diag()
        shape = prior_covar.shape[:-1]
        noise_diag = self.likelihood._shaped_noise_covar(shape, *params).diag()
        # Note the positive sign here -- this is the quantity in question
        return 0.5 * (diag / noise_diag).sum()

This is the current code for InducingPointKernelAddedLossTerm.loss.

From what I can see, this "loss term" is added into the mll returned by the ExactMarginalLogLikelihood class. This is in itself misleading, as a loss is usually the negative of the mll.
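
For reference, this is roughly how added loss terms enter the objective (a paraphrased sketch of ExactMarginalLogLikelihood.forward, not the verbatim source):

res = output.log_prob(target)  # the exact log marginal likelihood
for added_loss_term in self.model.added_loss_terms():
    res = res.add(added_loss_term.loss(*params))  # each "loss" is *added* to the mll

So whatever loss() returns goes into the objective with a positive sign.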

Moreover, the variational objective that is maximized when optimizing the inducing points (Eq. 9 of Titsias, 2009) is:

F_V(X_m) = log N(y | 0, σ²I + Q_nn) − (1 / (2σ²)) tr(K_nn − Q_nn),    where Q_nn = K_nm K_mm⁻¹ K_mn

As can be seen, this is the pseudo-mll that is maximized when optimizing the inducing points. In this bound, the term corresponding to InducingPointKernelAddedLossTerm appears with the opposite sign to the value that the code adds into the mll.

This is quite likely a significant bug. Please fix it (just invert the sign of diag above).
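
For concreteness, here is a quick toy check (an illustrative sketch; the RBF kernel, inducing-point choice, and jitter are arbitrary, not from the library) that the diagonal of K_nn − Q_nn is nonnegative, so the trace correction above can only be ≤ 0:

import torch
from gpytorch.kernels import RBFKernel

x = torch.linspace(0, 1, 20).unsqueeze(-1)  # 20 training inputs
z = x[::5]                                  # 4 "inducing" inputs taken from x
kern = RBFKernel()

with torch.no_grad():
    K_nn = kern(x, x).evaluate()
    K_nm = kern(x, z).evaluate()
    K_mm = kern(z, z).evaluate()
    # Nystrom approximation: Q_nn = K_nm K_mm^{-1} K_mn (jitter for stability)
    Q_nn = K_nm @ torch.inverse(K_mm + 1e-6 * torch.eye(4)) @ K_nm.T

print((K_nn.diag() - Q_nn.diag()).min())  # nonnegative, up to jitter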

maharjun added the bug label Jun 15, 2021
gpleiss (Member) commented Jun 22, 2021

Hi @maharjun - I see you've closed this issue? How is it resolved?

maharjun (Author) commented

I closed it because flipping the sign caused the errors to become pathological.

Upon further investigation, I realized that this is because there are two errors that cancel each other out: the initialization of InducingPointKernelAddedLossTerm in inducing_point_kernel.py flips the variational-distribution and prior-distribution arguments, which makes the above sign convention work. I'm reopening this issue so that the double error can be fixed.
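
To spell out the cancellation (a toy sketch with made-up numbers standing in for the kernel diagonals):

import torch

K_diag = torch.tensor([1.0, 1.0, 1.0])  # exact kernel diagonal k(x_i, x_i)
Q_diag = torch.tensor([0.7, 0.9, 0.8])  # Nystrom approximation diagonal
noise = 0.1

# The correct Eq. 9 correction: subtract the scaled trace of (K - Q)
correct = -0.5 * ((K_diag - Q_diag) / noise).sum()

# What the code effectively computes: a "+" sign, but with the prior and
# variational arguments swapped at construction, so the subtraction is
# also reversed
actual = 0.5 * ((Q_diag - K_diag) / noise).sum()

assert torch.isclose(correct, actual)  # the two sign errors cancel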

patel-zeel (Contributor) commented Oct 18, 2021

I think the current version is a bit confusing because it implicitly reverses the order of subtraction. Would the following be less confusing?

def loss(self, *params):
    prior_covar = self.prior_dist.lazy_covariance_matrix
    variational_covar = self.variational_dist.lazy_covariance_matrix
    diag = prior_covar.diag() - variational_covar.diag()
    shape = prior_covar.shape[:-1]
    noise_diag = self.likelihood._shaped_noise_covar(shape, *params).diag()
    # As per Eq. 9 in the original paper, this term is subtracted from the mll
    # (rather than added, as the class name "AddedLossTerm" suggests)
    return -0.5 * (diag / noise_diag).sum()
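
With this version, the subtraction order (prior minus variational) matches the math and the sign lives in the return statement, so the constructor call in inducing_point_kernel.py can pass the two distributions in their natural order instead of relying on the swapped arguments to cancel the sign.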

gpleiss (Member) commented Oct 23, 2021

@patel-zeel this seems reasonable. Feel free to put up a PR for it.
