Implementing the most-likely heteroskedastic GP described in #180 (#250)

Open · wants to merge 4 commits into main

Conversation

mc-robinson

Hi @Balandat, here is a very initial draft of the PR related to #180.

A couple of things:

  • Normalization and standardization are really important for this function, but I am having trouble checking these requirements without the added batch dimension (check_standardization requires it). I'll probably need your help here, since it is unclear to me how to enforce the batch shape: this is a function, not a class extending a model with _set_dimensions.

  • I have not yet implemented tests, and may again need your expertise here. It's a bit confusing since this is written as a function and not as a model, so the usual model test cases don't apply directly. I should also mention that none of the test cases I used for benchmarking in #180 had multidimensional outputs, so that may need to be checked.

Sorry this is such an incomplete PR, but hopefully it gives you a draft function to work with, which you can then extend. Happy to help in any way I can. Also, I'll start drafting the text for a notebook tutorial, as suggested in #180.

@facebook-github-bot added the CLA Signed label on Aug 26, 2019
@Balandat
Contributor

Thanks @mc-robinson, this is a great start. Will take a closer look in the next couple of days and help out with the shape challenges.

Re the test: this should be pretty straightforward, since the model is just a standard heteroskedastic GP - so we can just write test cases for the model fitting and then do a very basic sanity check on the model. From a purely mechanical point of view this is the same as testing any other function; see e.g. this test case.
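
As a rough illustration, a sketch of such a test (assumptions: `fit_most_likely_heteroskedastic_gp` is a hypothetical name standing in for whatever helper this PR ends up exposing; `BotorchTestCase` is botorch's standard test base class):

```python
import torch
from botorch.utils.testing import BotorchTestCase

# Hypothetical import path; stands in for whatever module this PR adds.
from botorch.models.most_likely_heteroskedastic import fit_most_likely_heteroskedastic_gp


class TestMostLikelyHeteroskedasticGP(BotorchTestCase):
    def test_fit_and_sanity_check(self):
        torch.manual_seed(0)
        train_X = torch.rand(20, 1)
        train_Y = torch.sin(6 * train_X) + 0.1 * torch.randn_like(train_X)
        # Fit via the (hypothetical) helper, then do a basic sanity check:
        # the posterior mean should match the targets in shape and roughly
        # track them in value.
        model = fit_most_likely_heteroskedastic_gp(train_X, train_Y)
        posterior = model.posterior(train_X)
        self.assertEqual(posterior.mean.shape, train_Y.shape)
        self.assertLess((posterior.mean - train_Y).abs().mean().item(), 0.5)
```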

@facebook-github-bot
Contributor

@shubhankaray has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@matthewcarbone

@Balandat I'm digging up a really old PR here, but it seemed like the right place to ask after a bit of searching for the right topic. It appears that this heteroskedastic-noise GP is actually implemented (botorch.models.gp_regression.HeteroskedasticSingleTaskGP), and as such this PR can be closed. Is this correct? Just trying to make sure I'm using the right model. Thanks!

@CompRhys
Contributor

CompRhys commented Mar 8, 2023

> @Balandat I'm digging up a really old PR here, but it seemed like the right place to ask after a bit of searching for the right topic. It appears that this heteroskedastic-noise GP is actually implemented (botorch.models.gp_regression.HeteroskedasticSingleTaskGP), and as such this PR can be closed. Is this correct? Just trying to make sure I'm using the right model. Thanks!

Not @Balandat, but I have been looking at this today as well, so I'll chime in. Short answer: no. The HeteroskedasticSingleTaskGP in BoTorch requires a y variance for each input, whereas this PR attempts to implement the most likely heteroskedastic GP model, which takes just the inputs x and y and fits the noise GP using an EM loop. If you have uncertainty or precision measurements, the class that is already implemented is what you want; if not, this PR implements a model that doesn't require the variances as inputs, although I believe there is some bug/API issue preventing it from being merged.
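
For concreteness, a minimal sketch of that EM loop (assumptions: botorch's SingleTaskGP, HeteroskedasticSingleTaskGP, and fit_gpytorch_mll APIs; the helper name is the same hypothetical one used in the test sketch above; estimating variances from squared residuals of the posterior mean is a simplification of the sampling-based estimate in Kersting et al.'s original method):

```python
import torch
from botorch.fit import fit_gpytorch_mll
from botorch.models import HeteroskedasticSingleTaskGP, SingleTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood


def fit_most_likely_heteroskedastic_gp(train_X, train_Y, num_iters=5):
    # Step 0: fit an ordinary homoskedastic GP to (x, y).
    model = SingleTaskGP(train_X, train_Y)
    fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
    for _ in range(num_iters):
        # E-step (roughly): estimate pointwise noise levels from the
        # squared residuals of the current model at the training inputs.
        with torch.no_grad():
            residuals = train_Y - model.posterior(train_X).mean
        observed_var = residuals.pow(2).clamp_min(1e-6)
        # M-step (roughly): refit a heteroskedastic GP whose noise model
        # is itself a GP trained on the estimated variances.
        model = HeteroskedasticSingleTaskGP(train_X, train_Y, observed_var)
        fit_gpytorch_mll(ExactMarginalLogLikelihood(model.likelihood, model))
    return model
```

As noted later in this thread, fitting the heteroskedastic model itself can be unstable, which is part of why this PR stalled.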

@matthewcarbone

@CompRhys ah of course. This is what happens when I comment on PRs after just waking up. Thanks for clarifying!

@Balandat
Contributor

Balandat commented Mar 8, 2023

> I believe there is some bug/API issue preventing it from being merged

I don't think there is an API issue; basically there were some issues with the fitting stability (and there may be bugs). We'd need to get back to this and spend some time on it, but it has been hard to find that time given our other priorities. If anyone needs this model, it would be great if they could take a stab at it; we're very happy to provide feedback and input on it!

@CompRhys
Contributor

CompRhys commented Mar 8, 2023

#861 is the potential API issue I think is related: when trying to piece this together, it links to issues where the noise is still logged internally, and this has some impact because the inverse transform is not applied.

@CompRhys
Contributor

CompRhys commented Mar 8, 2023

> If anyone needs this model, it would be great if they could take a stab at it; we're very happy to provide feedback and input on it!

Would you want it as in this PR, i.e. as a utility function, as opposed to a class with an overridden fit method?

@matthewcarbone

I'm trying to get approval to contribute upstream to BoTorch (re: CLA), so if I end up getting the green light, this is definitely something I'd be interested in helping with. We'll see what happens 👍

@sgbaird
Contributor

sgbaird commented Mar 9, 2023

Hey @CompRhys! Great to see you here! I was reading through a recent article from Milad Abolhasani and Keith Brown (DOI: 10.1557/s43577-023-00482-y), which led me back to heteroskedastic noise. Here's a brief snippet:

> While [Bayesian optimization] has gained a great deal of traction in guiding experiments, Noack and Reyes discuss the limitations of these off-the-shelf tools and how they should be modified to improve progress. Specifically, key simplifying assumptions of most GPR approaches are that points that are equidistant in parameter space are equally correlated and each location in parameter space is expected to feature the same amount of noise.

The article from Noack and Reyes was an interesting read. My main takeaways were that non-stationary kernels are especially useful for materials science problems but make finding optimal solutions non-trivial due to the large number of hyperparameters, and (relevant to this PR) that shifting toward a heteroskedastic treatment of noise is useful.

@matthewcarbone

@sgbaird If you'd like, I can ask Kris [Reyes] about this next time I meet with him, we chat ~weekly 😁

I'm slowly being convinced that kernel flexibility is important. That said, I don't know how broadly useful very flexible or niche kernels are for the larger community. Definitely going to read that paper though, thanks for sharing.

@Balandat
Contributor

> Would you want it as in this PR, i.e. as a utility function, as opposed to a class with an overridden fit method?

Hmm, so our classes don't actually have a fit method themselves; we have moved to multiple-dispatch-based fitting. So you could register a custom fitting method for this model, as is done e.g. here: https://github.com/pytorch/botorch/blob/main/botorch/fit.py#LL286-L286C2

So there are two options here:

  1. Implement a MostLikelyHeteroskedasticGP model class and hook it up to the custom fitting strategy using multiple dispatch (a rough sketch of this hookup follows the list below). This is a bit of a weird setup, though - currently, models that are fit are modified in place during fitting, and they don't change type. The most likely heteroskedastic GP is really just a way of generating a HeteroskedasticSingleTaskGP object; it's just generated somewhat differently. I guess the way you could do this is to subclass MostLikelyHeteroskedasticGP from HeteroskedasticSingleTaskGP, change the constructor to not take in train_Yvar, and then, in the custom fitting method, generate a sequence of intermediate models to estimate the variances and load the (estimated) train_Yvar into the final model.
  2. Just use a helper function as in this PR. This should be quite a bit simpler and not cause any issues/challenges with the class structure. I suggest going with this approach as a first pass (we can always rethink it once we have a working model).
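
A rough sketch of what option 1's dispatch hookup could look like (assumptions: the FitGPyTorchMLL dispatcher in botorch/fit.py keys on the (mll, likelihood, model) types, as it does at the time of writing; MostLikelyHeteroskedasticGP is the hypothetical class described above):

```python
from botorch.fit import FitGPyTorchMLL
from botorch.models import HeteroskedasticSingleTaskGP
from gpytorch.mlls import ExactMarginalLogLikelihood


class MostLikelyHeteroskedasticGP(HeteroskedasticSingleTaskGP):
    """Hypothetical subclass whose constructor would not require train_Yvar."""


@FitGPyTorchMLL.register(ExactMarginalLogLikelihood, object, MostLikelyHeteroskedasticGP)
def _fit_most_likely_heteroskedastic(mll, _, __, **kwargs):
    # Run the EM-style loop from the helper sketched earlier in this thread
    # to estimate the pointwise variances, load the estimated train_Yvar
    # into mll.model, and then delegate to the standard fitting path.
    ...
    return mll
```

Option 2 sidesteps all of this, which is presumably why it is the suggested first pass.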

@Balandat
Contributor

> I'm slowly being convinced that kernel flexibility is important. That said, I don't know how broadly useful very flexible or niche kernels are for the larger community. Definitely going to read that paper though, thanks for sharing.

Yeah, I think this is a good point. If you're a domain expert and understand the problem well, customizing kernels, priors, etc., can be very helpful for improving performance on the problem at hand. But this increases complexity and the number of failure modes, so there's a tradeoff in how many specialized/flexible kernels we want to support in general-purpose software.
