
Expose jitter #2136

Merged: 13 commits merged into cornellius-gp:master on Sep 15, 2022
Conversation

hughsalimbeni (Contributor)

  • Exposes jitter in variational_strategy (a usage sketch follows below)
  • Added a test that compares the variational and exact GPs with a high-precision comparison. This is not possible without setting the jitter to a very small value.

One small issue is that the previous behaviour added both 1e-3 and 1e-4 in various places. I've found only one test that breaks with 1e-4 for both, so I've set that one to 1e-3 via the newly exposed jitter_val arg.

I'm happy to do the other variational strategies and to move the defaults to the settings.
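
A minimal sketch of what the exposed argument looks like from the user's side, assuming the keyword is jitter_val on gpytorch.variational.VariationalStrategy as described above; the model class, kernel, and value shown are illustrative, and the merged default may differ.

```python
import torch
import gpytorch


class SVGPModel(gpytorch.models.ApproximateGP):
    def __init__(self, inducing_points):
        variational_distribution = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0)
        )
        variational_strategy = gpytorch.variational.VariationalStrategy(
            self,
            inducing_points,
            variational_distribution,
            learn_inducing_locations=True,
            jitter_val=1e-8,  # previously hard-coded; exposed by this PR
        )
        super().__init__(variational_strategy)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )


model = SVGPModel(torch.randn(16, 1))
```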

@jacobrgardner (Member) left a comment

lgtm other than the lint errors and the import question!

@gpleiss (Member) commented Sep 14, 2022

Thanks for this, @hughsalimbeni! If you wouldn't mind also making these changes to the other variational strategies, that would be great!

@hughsalimbeni (Contributor, Author)

In the last commit I found one more hard-coded jitter in the grid interpolation strategy, but this time it breaks with anything smaller than 1e-3, so I've left it explicit for now.

The default is now changed from 1e-3 to 1e-6 (or 1e-8 for float64), which is quite a large change. Might this cause problems for some users?
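
For context, a dtype-dependent default of this kind could be expressed roughly as below; this only illustrates the values described (1e-6 for single precision, 1e-8 for float64) and is not the actual code in the PR.

```python
import torch


def default_jitter(dtype: torch.dtype) -> float:
    # Illustrative only: smaller jitter for double precision, larger for float32.
    return 1e-8 if dtype == torch.float64 else 1e-6
```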

@gpleiss (Member) commented Sep 14, 2022

> The default is now changed from 1e-3 to 1e-6 (or 1e-8 for float64), which is quite a large change. Might this cause problems for some users?

I imagine this should be okay - we had 1e-3 or 1e-4 in our ExactGP code until we moved to the context manager, and (AFAIK) it hasn't introduced too many numerical issues. We can always tighten this (or make a separate setting for variational models) if we see trouble down the line.

Thanks, @hughsalimbeni!
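
The context manager referred to here is the existing cholesky_jitter setting; a minimal usage sketch follows, with an illustrative value rather than the library default.

```python
import gpytorch

# Temporarily override the jitter added during Cholesky decompositions;
# the value 1e-4 below is only an example.
with gpytorch.settings.cholesky_jitter(1e-4):
    ...  # exact GP training / prediction code goes here
```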

@gpleiss enabled auto-merge (squash) September 14, 2022 23:48
auto-merge was automatically disabled September 15, 2022 07:48

Head branch was pushed to by a user without write access

@hughsalimbeni (Contributor, Author)

Odd that the test_decoupled_svgp_regression test failed, as it was fine on my machine. I've changed it to 1e-4. Hope it's okay now.

@jacobrgardner (Member)

It's also interesting that it's failing against PyTorch master only, not PyTorch's current stable release. Maybe 1e-6 is problematically small for variational GPs in fp32, and we should make a settings.variational_cholesky_jitter with a default of 1e-4? For what it's worth, I've empirically noticed that K_uu matrices can be artificially numerically sensitive with learned inducing locations, since optimization can sometimes put a few inducing points essentially directly on top of each other.
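
A sketch of how the proposed setting could be used, assuming it follows the same context-manager pattern as cholesky_jitter; the name variational_cholesky_jitter comes from this comment, and the exact signature is an assumption.

```python
import gpytorch

# Hypothetical usage of the proposed setting with the suggested 1e-4 value.
with gpytorch.settings.variational_cholesky_jitter(1e-4):
    ...  # variational GP training loop goes here
```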

@gpleiss (Member) commented Sep 15, 2022

I'm going to add that setting, and then merge.

@gpleiss enabled auto-merge (squash) September 15, 2022 14:19
@gpleiss merged commit 8b2d304 into cornellius-gp:master Sep 15, 2022