Skip to content

VISH#3

Open
drsaadhamid wants to merge 30 commits into
developfrom
sgpr
Open

VISH#3
drsaadhamid wants to merge 30 commits into
developfrom
sgpr

Conversation

@drsaadhamid
Copy link
Copy Markdown
Collaborator

No description provided.

@drsaadhamid drsaadhamid requested a review from vdutor May 11, 2021 14:05
@vdutor vdutor changed the title Add moment matched log transform, example of setting q_mu and q_sqrt equivalent to SGPR, and an attempt to get lengthscale gradients working for the ChordMatern kernel VISH Jun 18, 2021
Comment thread gspheres/kernels/chord_matern.py Outdated

v = integrate.quad(integrand, -1.0, 1.0)[0]
return v * omega_d / C_1
@tf.custom_gradient
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you write down in math what the gradients look like? That said, I don't think we need the gradients of funk hecke wrt the lengthscale and variance. For the lengthscale you can simply multiply the inputs X before evaluating the spherical harmonics and the variance can be accounted for by multiplying the eigenvalues with it.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The gradient wrt the variance is just the funk hecke integral.
The gradient wrt the lengthscale is the funk hecke integral with the shape function replaced by the derivative of the shape function wrt to the lengthscale, i.e. v \int d/dl s(t, l) f(t) dt, where f(t) is the product of the Gegenbauer polynomial and the weighting function wrt which the Gegenbauers are orthogonal and v is the variance.

Comment thread gspheres/kernels/chord_matern.py Outdated
values.append(tf.reshape(v, shape=[-1]))
return tf.concat(values, axis=0)

def verify_eigenvalue_cache(self):
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need this functionality. Minor: maybe a more descriptive name would be clear_eigenvalue_cache_onchange?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid having to recompute funk-hecke integrals. A call to the eigenvalues method is made every time the Kuu matrix is required, which means every time a prediction is required. If we didn't cache the eigenvalues then we will do a lot of redundant computation whilst optimising the acquisition function, since we need to recompute the posterior variance at every step.

Comment thread gspheres/kernels/chord_matern.py Outdated

def K_diag(self, X: TensorType) -> tf.Tensor:
""" Approximate the true kernel by an inner product between feature functions. """
# TODO: Refactor in terms of truncated Mercer decomposition.
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would you want to do that?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that the posterior variance at the data locations goes down to the likelihood variance. This is necessary for the acquisition function to work properly. The variance of the unwarped GP is combination of the posterior mean and posterior variance of the warped GP. If the posterior variance doesn't get sufficiently small, the acquisition function will keep selecting the peak of the mean function. (You can see this for yourself if you run the experiment but change line 38 of the config file vish.yaml from "spherical_matern" to "chord_matern".)

I appreciate that this issue is specific to using VISH for PI, so maybe you don't want it changed in this repo? If so, we could fork it, or just work in a different branch? We could also just keep using the SphericalMatern kernel, which is now working fine.

Comment thread gspheres/vish/__init__.py Outdated
from gspheres.vish.mixed_variables import MixedFeatures


def map_to_sphere(X: np.ndarray, bias: Union[int, float]) -> np.ndarray:
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: I would place this in a utils.py.

@@ -0,0 +1,200 @@
import tensorflow as tf
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: rename file to covariances.py?

Comment thread gspheres/vish/eigenfunction_variables.py Outdated
Comment thread gspheres/vish/eigenfunction_variables.py
Comment thread gspheres/vish/eigenfunction_variables.py
Comment thread gspheres/vish/mixed_variables.py Outdated
@@ -0,0 +1,66 @@
import gpflow.kernels
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you still using this? If not, best to remove the file from the PR.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not using it anymore; will remove.

Comment thread scripts/vish.py Outdated
@@ -0,0 +1,94 @@
import numpy as np
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would refactor some of this code and have it as an integration test too in tests.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

@drsaadhamid
Copy link
Copy Markdown
Collaborator Author

I added a model for SGPR in vish/__init__.py. You were right that simply setting the optimal variational parameters for the SVGP model wasn't a good idea because the matrix that needs to be inverted is not well conditioned. "Rotating" (more of a scaling since Kuu is diagonal) by the cholesky decomposition of Kuu does help in this regard.

I've also added a script to compare the posterior variances at data points between a TruncatedChordMatern kernel and a ChordMatern kernel. The difference between the two is that the prior variance is computed using a truncated mercer decomposition for the TruncatedChordMatern, and using the ordinary Matern equations for the ChordMatern. When the likelihood variance is allowed to be optimised the posterior variance at data points is below the likelihood variance for all kernels. (I've added a pytest that ensures this in tests/test_vish_gpr.py.) However, we need to keep the likelihood variance small if we want to use the VISH model with the uncertainty sampling acquisition function or we will keep making acquisitions at the same location. If we fix it to 0.01 or below (in practice we'll want it quite small, say 1e-4), then notice that the posterior variance at data points stays below the likelihood variance for the TruncatedChordMatern, but not for the ChordMatern.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants