
[Bug] SingleTaskGP's wrong gradients when batch_size = 1 #279

Closed
yeahrmek opened this issue Sep 27, 2019 · 3 comments
Labels: bug (Something isn't working)

🐛 Bug

When I try to calculate the gradient of the loss w.r.t. the input, I get a different result on every run if I pass a single point to the GP.

To reproduce

import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_model
from gpytorch.mlls import ExactMarginalLogLikelihood

# Training data: 100 random 1-d points with a quadratic target
X = torch.randn(100, 1)
y = X.sum(dim=1, keepdim=True)**2

gp = SingleTaskGP(X, y)

mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_model(mll)

x_test = torch.randn(1, 1)
x_test.requires_grad_(True)

gp.eval()

# Calculate the gradient w.r.t. the same input point 5 times
for _ in range(5):
    loss = gp(x_test).mean.sum()
    loss.backward()

    print(x_test.grad)
    # Re-create the leaf tensor so gradients don't accumulate across iterations
    x_test = x_test.detach()
    x_test.requires_grad_(True)

The output looks like this:

tensor([[0.0597]])
tensor([[-4.8402]])
tensor([[-6.9655e+37]])
tensor([[-2.6707e+17]])
tensor([[nan]])

Expected Behavior

I expect the same result in each iteration. The bug appears only when I evaluate the gradient at a single input point. If I use a batch size greater than 1:

x_test = torch.randn(2, 1)
x_test.requires_grad_(True)

gp.eval()

# Calculate the gradient w.r.t. the same input point 5 times
for _ in range(5):
    loss = gp(x_test).mean.sum()
    loss.backward()

    print(x_test.grad)
    x_test = x_test.detach()
    x_test.requires_grad_(True)

the output is correct:

tensor([[2.2270],
        [2.7313]])
tensor([[2.2270],
        [2.7313]])
tensor([[2.2270],
        [2.7313]])
tensor([[2.2270],
        [2.7313]])
tensor([[2.2270],
        [2.7313]])

System information

  • Botorch==0.1.3
  • GPyTorch==0.3.5
  • PyTorch==1.2.0
  • OS: Ubuntu 19.04
yeahrmek added the bug label Sep 27, 2019
yeahrmek (Author) commented:

Looks like there is a problem with MaternKernel. At least when I change it to RBFKernel, everything works correctly.
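
A minimal sketch of that workaround, for anyone stuck on the affected versions: swap the model's covariance module for an RBF kernel before fitting. The ScaleKernel(RBFKernel(...)) replacement below is an assumption on my part (it drops the priors that SingleTaskGP normally attaches to its default Matern kernel), not code posted in the thread.

from gpytorch.kernels import RBFKernel, ScaleKernel

gp = SingleTaskGP(X, y)
# Replace the default Matern kernel with an RBF kernel (no priors) before fitting
gp.covar_module = ScaleKernel(RBFKernel(ard_num_dims=X.shape[-1]))

mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_model(mll)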

Balandat (Contributor) commented Sep 27, 2019:

Yeah, there was a bug in PyTorch's cdist function. Try running this on the latest GPyTorch master; that should fix this (by not using torch.cdist at all).
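
(To pick up that fix before a release, one would typically install GPyTorch directly from its master branch, e.g. pip install --upgrade git+https://github.com/cornellius-gp/gpytorch.git; the exact command is an assumption here, not quoted from the thread.)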

yeahrmek (Author) commented:

On gpytorch master it works correctly, thank you!
