
Arc Kernel for Conditional Spaces #1023

Open
BCJuan opened this issue Jan 16, 2020 · 7 comments

Comments

@BCJuan
Contributor

BCJuan commented Jan 16, 2020

🚀 Feature Request

Introduction

I am working on Bayesian optimization with Gaussian processes for hyperparameter tuning of ML models. I am an indirect user of your library (through Ax and BoTorch). Many ML models have conditional configuration spaces, and I understand that Gaussian processes do not directly support conditional spaces. However, the article Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces defines a kernel that would be useful in such spaces.

Motivation

It would be useful to implement this kernel for use in downstream libraries. This would allow the optimization of conditional configuration spaces simply by using this kernel (if I have understood the article correctly).

In the case where conditionality is related to hierarchy, this would solve issues such as this one. It would also be a nice attempt at addressing a limitation consistently attributed to GPs.

Pitch

In a rough and simplistic way, the kernel consists of a cylindrical embedding of each input dimension fed into a standard RBF kernel, and then a product of the per-dimension kernels to build the final kernel. I attach the images of the description found in the GitHub repository of the project (a rough sketch of the construction follows below).

[Images from the original authors' repository showing the arc kernel definition: arc, arc1, arc2]
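For reference, here is a rough sketch of the construction as I read it from the paper (my notation, which may differ from the original): each active dimension x_i, rescaled to [0, 1], is embedded onto an arc of a circle,

g_i(x_i) = \omega_i \, (\sin(\pi \rho_i x_i), \; \cos(\pi \rho_i x_i)),

with radius \omega_i > 0 and angle parameter \rho_i \in [0, 1]; a dimension that is inactive for a given input maps to a fixed point, so inputs that differ only in irrelevant parameters look identical to the kernel. The final kernel is then a product over dimensions of a Euclidean kernel applied to the embedded points, e.g.

k(x, x') = \prod_i k_{RBF}(g_i(x_i), g_i(x'_i)).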

I have written an implementation myself:

from math import pi

import torch
from gpytorch.constraints import Interval, Positive
from gpytorch.kernels import Kernel
from gpytorch.kernels.rbf_kernel import postprocess_rbf


class ArcKernel(Kernel):

    def __init__(self, num_parameters, **kwargs):
        super(ArcKernel, self).__init__(has_lengthscale=True, **kwargs)

        # angle parameter of the embedding, constrained to (0, 1)
        self.register_parameter(
            name="raw_angle",
            parameter=torch.nn.Parameter(torch.randn(1, 1, self.ard_num_dims))
            )
        angle_constraint = Interval(0, 1)
        self.register_constraint("raw_angle", angle_constraint)

        # radius parameter of the embedding, constrained to be positive
        self.register_parameter(
            name="raw_radius",
            parameter=torch.nn.Parameter(torch.randn(1, 1, self.ard_num_dims))
            )
        radius_constraint = Positive()
        self.register_constraint("raw_radius", radius_constraint)

        self.num_parameters = num_parameters

    def embedding(self, x):
        # map each (lengthscale-normalized) dimension onto an arc of a circle;
        # note that using raw_angle/raw_radius directly bypasses the registered
        # constraints (the constrained values would come from e.g.
        # self.raw_angle_constraint.transform(self.raw_angle))
        x_ = x.div(self.lengthscale)
        x_s = self.raw_radius * torch.sin(pi * self.raw_angle * x_)
        x_c = self.raw_radius * torch.cos(pi * self.raw_angle * x_)  # cos, not sin
        x_ = torch.cat((x_s, x_c), dim=-1).squeeze(0)
        return x_

    def forward(self, x1, x2, diag=False, **params):
        # RBF kernel evaluated on the embedded inputs
        x1_, x2_ = self.embedding(x1), self.embedding(x2)
        return self.covar_dist(x1_, x2_, square_dist=True, diag=diag,
                               dist_postprocess_func=postprocess_rbf,
                               postprocess=True, **params)

And for calling it, it would be a composition of ProductStructureKernel, ScaleKernel, and the newly built ArcKernel (which already bakes in the RBF part via postprocess_rbf).
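A minimal sketch of that composition, assuming the ArcKernel defined above (per-dimension evaluation via ProductStructureKernel may need extra care in a real implementation):

from gpytorch.kernels import ProductStructureKernel, ScaleKernel

num_parameters = 5  # hypothetical: number of hyperparameters being tuned
covar_module = ScaleKernel(
    ProductStructureKernel(
        ArcKernel(num_parameters),
        num_dims=num_parameters,
    )
)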

I can put up a pull request myself if the request is accepted. However, I do not have enough knowledge to guarantee that this kernel works exactly as explained in the paper. I have made my implementation but have not tested it in any experiment. If desired, I could also do the testing, but I would need some guidance.

Thank you very much in advance.

P.S.: You can find the information summarized in the repository of the original authors, under the latex folder. I also attach images from that source.

@BCJuan changed the title from "Arc Kernel for Hierarchical Spaces" to "Arc Kernel for Conditional Spaces" on Jan 16, 2020
@gpleiss
Member

gpleiss commented Jan 16, 2020

Sure - this seems like it could be useful to have! We're definitely open to a PR :)

@jacobrgardner
Member

@BCJuan If you put up a pull request, my recommendation for the implementation as you've written it would be to have ArcKernel take a base_kernel in __init__ and then use that in forward rather than directly assuming an RBF kernel, e.g.

def forward(self, x1, x2, diag=False, **params):
    x1_, x2_ = self.embedding(x1), self.embedding(x2)
    return self.base_kernel(x1_, x2_, diag=diag, **params)

See for example how ScaleKernel is implemented. This way we could support e.g. Matern base kernels (or, as the paper points out, your "favorite Euclidean covariance"):

base_kernel = MaternKernel(nu=2.5)
base_kernel.raw_lengthscale.requires_grad_(False)  # Don't learn base lengthscale since ArcKernel has one
covar_module = ArcKernel(base_kernel, num_parameters)
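(For completeness, the constructor for this variant might look something like the sketch below; this is just an illustration of the base_kernel idea, not a final API:)

def __init__(self, base_kernel, num_parameters, **kwargs):
    super().__init__(has_lengthscale=True, **kwargs)
    self.base_kernel = base_kernel  # evaluated on the embedded inputs in forward
    self.num_parameters = num_parameters
    # ... register raw_angle / raw_radius as before ...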

@BCJuan
Contributor Author

BCJuan commented Jan 16, 2020

Thank you.

I am writing it up properly and testing it so that the PR makes sense: adding setters, the possibility of priors, etc.

I will post as soon as I have the kernel in decent shape for a commit and it has passed some tests. I greatly welcome any other recommendations.

@Balandat
Collaborator

This is great, @BCJuan. Please let us know over at botorch if you need any help with hooking this up with the acquisition functions.

@BCJuan
Contributor Author

BCJuan commented Jan 17, 2020

Hi,

I have finally cleaned my implementation. Now I am able to make the PR.

I have reproduced the [Simple GP Regression](https://gpytorch.readthedocs.io/en/latest/examples/01_Exact_GPs/Simple_GP_Regression.html) example with the kernel and it seems to work fine. I can upload the notebooks, or something similar, as you wish.
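For context, the model setup is essentially the tutorial's with the ArcKernel swapped in (a sketch; train_x and train_y as in the tutorial):

import gpytorch

class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            ArcKernel(num_parameters=train_x.size(-1))
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)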

My only worry is about the kernel size definition. Right now it is simply a vector with the number of dimensions, but maybe it would have to be something like

        self.register_parameter(
            name="raw_angle",
            parameter=torch.nn.Parameter(torch.zeros(*self.batch_shape, 1, self.ard_num_dims))
            )

@Balandat Thank you. Indeed, I have tried to implement it in the BoTorch with Ax tutorial, but there are numerical problems. The following warning appears:

/home/kostal/anaconda3/envs/deep/lib/python3.7/site-packages/gpytorch/utils/cholesky.py:42: RuntimeWarning: A not p.d., added jitter of 1e-08 to the diagonal
  warnings.warn(f"A not p.d., added jitter of {jitter_new} to the diagonal", RuntimeWarning)

And finally the error:

RuntimeError: Lapack Error syev : 2 off-diagonal elements didn't converge to zero at /opt/conda/conda-bld/pytorch_1556653215914/work/aten/src/TH/generic/THTensorLapack.cpp:296

Also, the angle constraint, which should be `Interval(0, 1)`, is now `Positive()` due to an error:

RuntimeError: value cannot be converted to type int64_t without overflow: inf

@Balandat
Collaborator

Happy to help with debugging on the botorch end. Once your PR here is up, can you share the full code so I can reproduce?

@BCJuan
Contributor Author

BCJuan commented Jan 25, 2020

Of course, @Balandat. I have changed the course of development a little, though. I will write the tests for the kernel in GPyTorch first, and once it is clear that it works properly, I will move on to BoTorch. Maybe it was bold to go directly to BoTorch without checking in GPyTorch first. Nevertheless, it would be great to test it in BoTorch and see the difference in conditional spaces with respect to, for example, the Matern kernel. If there is anything I can do, please do not hesitate to ask. Thank you!
