stable gaussian process classification with global kernel #316

adjidieng · 2016-10-29T00:22:16Z

This provides a more stable GPC code...The changes are the following:

added a kernel() method to the model wrapper that uses the existing multivariate.rbf() method to compute the kernel matrix for the whole data
the kernel has been stabilized by adding a negligible number in its diagonal...
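
A minimal NumPy sketch of the diagonal stabilization described above. The helper name `add_jitter` and the value `1e-6` are assumptions for illustration; the PR description does not state which value is used.

```python
import numpy as np


def add_jitter(K, jitter=1e-6):
    """Return a copy of K with `jitter` added to every diagonal entry."""
    K = np.asarray(K, dtype=float)
    return K + jitter * np.eye(K.shape[0])


# Identical inputs give a rank-deficient kernel matrix (all entries equal).
K = np.ones((3, 3))
K_stable = add_jitter(K)

# With the jitter in place the matrix is positive definite,
# so a Cholesky factorization can be computed.
L = np.linalg.cholesky(K_stable)
```

The jitter changes each diagonal entry only slightly, which is why it is usually treated as negligible relative to the kernel values themselves.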

Note: this uses the benchmark crabs dataset from UCI. The data is shuffled...

dustinvtran · 2016-10-29T18:02:27Z

cool! could we have a version of this that uses the modeling language? currently, the examples/ directory is trying to consist of scripts that only use the modeling language. for those using model wrappers, we prepend the filename, e.g., with tf_ to get tf_gp_classification.py.

dustinvtran · 2016-11-07T16:19:10Z

i added a gp_classification.py file, which uses Edward's language. this could help you get started with my comment above.

adjidieng · 2016-11-09T21:25:56Z

@dustinvtran : yes this will be helpful. Thx! I was having "cannot copy z" errors with Edward language...So I will submit the initial code I have that does not use Edward language as tf_gpc.py and add yours as gpc.py into one PR. Will get back to you this weekend.

adjidieng · 2016-11-15T00:49:38Z

@dustinvtran : this is ready to merge...

dustinvtran · 2016-11-15T02:42:47Z

edward/util/tensorflow.py

@@ -310,6 +310,44 @@ def multivariate_rbf(x, y=0.0, sigma=1.0, l=1.0):
      tf.exp(-1.0 / (2.0 * tf.pow(l, 2.0)) * tf.reduce_sum(tf.pow(x - y, 2.0)))


+def multivariate_rbf_kernel(x, sigma=1.0, l=1.0):


when a function is added to the codebase in util, it needs a unit test.

also, in general, do we plan on supporting kernel functions in edward.util as a long term thing? or do you think it makes sense to leave them as part of the example scripts?

multivariate_rbf_kernel() might be needed for any model with an RBF kernel, such as GP or Cox.

in the long term i think we might want to implement other kernels and not just RBF. Probably not in util but in a kernel.py file as is done in GPflow. Then in the example scripts we can simply do:

from ed.kernel import rbf, linear
K = rbf(x)
K_lin = linear(x)

i will move it to the example scripts for now.

dustinvtran · 2016-11-15T02:43:30Z

edward/util/tensorflow.py

+def multivariate_rbf_kernel(x, sigma=1.0, l=1.0):
+  """
+  computes the rbf kernel for the whole data x
+  Args:


following our convention, can we convert this docstring style to be NumPy?
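
For reference, a NumPy-style docstring for this function could look roughly like the sketch below. The body is a pure-Python stand-in, not the TensorFlow code from the PR, and the parameter descriptions (including `sigma` entering as `sigma**2`) are assumptions based on the existing `multivariate_rbf` signature.

```python
import math


def multivariate_rbf_kernel(x, sigma=1.0, l=1.0):
    """Compute the RBF kernel matrix for a whole data set.

    Parameters
    ----------
    x : list of list of float
        Data set with N rows, one data point per row.
    sigma : float, optional
        Standard deviation of the kernel (assumed to enter as sigma**2).
    l : float, optional
        Length scale of the kernel.

    Returns
    -------
    list of list of float
        N x N matrix whose (i, j) entry is the kernel between rows i and j.
    """
    def rbf(a, b):
        # Squared Euclidean distance between two data points.
        sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
        return sigma ** 2 * math.exp(-sq_dist / (2.0 * l ** 2))

    return [[rbf(xi, xj) for xj in x] for xi in x]
```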

dustinvtran · 2016-11-15T02:45:03Z

edward/util/tensorflow.py

+  """
+  N = x.get_shape()[0]
+  mat = []
+  for i in range(N):


is there any way we can convert the multivariate rbf to be vectorized in some way, and not have to loop over every entry in the matrix? i've found this to be the computational bottleneck in our GP experiments.

looking at GPflow's kernels.py, i couldn't find out how (or if) they vectorize this operation.

yes i agree. i think this is what is slowing it down. GPflow seems to be using a custom _slice() method on their input. I will have to look into that. they don't use a loop.
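
One common way to avoid the double loop is to expand the squared distance as ||a||^2 + ||b||^2 - 2 a·b and compute all pairs with a single matrix product. The NumPy sketch below illustrates the idea; it is not taken from this PR or from GPflow, and the function names and the `sigma**2` scaling are assumptions. The same identity should carry over to TensorFlow using `tf.reduce_sum` and `tf.matmul`.

```python
import numpy as np


def rbf_kernel_loop(x, sigma=1.0, l=1.0):
    """Reference version: one kernel evaluation per matrix entry."""
    n = x.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            sq_dist = np.sum((x[i] - x[j]) ** 2)
            K[i, j] = sigma ** 2 * np.exp(-sq_dist / (2.0 * l ** 2))
    return K


def rbf_kernel_vectorized(x, sigma=1.0, l=1.0):
    """Same kernel matrix, computed without Python-level loops."""
    sq_norms = np.sum(x ** 2, axis=1)
    # Pairwise squared distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    # Rounding can leave tiny negative values; clip them to zero.
    sq_dists = np.maximum(sq_dists, 0.0)
    return sigma ** 2 * np.exp(-sq_dists / (2.0 * l ** 2))


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
K_loop = rbf_kernel_loop(x, sigma=1.5, l=0.7)
K_vec = rbf_kernel_vectorized(x, sigma=1.5, l=0.7)
```

Both functions should agree up to floating-point rounding; the vectorized one does the pairwise work inside a single matrix multiplication instead of N^2 separate calls.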

dustinvtran · 2016-11-15T02:46:38Z

examples/gpc.py

+X_train = df[:, 1:][permutation]
+y_train = df[:, 0][permutation]
+
+print("pre-computing the kernel matrix...")


Can you elaborate on what you mean by "pre-computing the kernel matrix"?

i meant compute K externally and not inside the model definition...

got it. i guess that makes sense for the model wrapper. everything is already "pre-computed" in the native language.

dustinvtran · 2016-11-15T02:48:09Z

examples/tf_gpc.py

+y = df[:, 0][permutation]
+
+print("computing the kernel matrix...")
+K = multivariate_rbf_kernel(


if we're building this as part of the model, it would be nice to prevent global scoping for classes and have K be an argument to GaussianProcess. self.N and self.n_vars for example could also be inferred from the shape of K.

yes passing K as an argument is also an option. will add that.
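
A rough sketch of what passing K as an argument could look like. This `GaussianProcess` class is hypothetical and omits the model's log-probability code; setting `n_vars` equal to `N` assumes one latent function value per data point.

```python
import numpy as np


class GaussianProcess:
    """Hypothetical model wrapper that receives a precomputed kernel matrix."""

    def __init__(self, K):
        K = np.asarray(K, dtype=float)
        if K.ndim != 2 or K.shape[0] != K.shape[1]:
            raise ValueError("K must be a square matrix")
        self.K = K
        # Sizes are inferred from K instead of being read from globals.
        self.N = K.shape[0]
        # Assumption: one latent variable per data point.
        self.n_vars = self.N


model = GaussianProcess(np.eye(4))
```

With this shape, the script computes K once and hands it to the constructor, so the class no longer depends on module-level variables.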

dustinvtran · 2016-11-15T02:48:43Z

Thanks for the update! It looks great. Comments above.

adjidieng · 2016-11-17T03:57:35Z

@dustinvtran : i am not sure why the python3.4 check is failing. this pr is ready once that is sorted out.

dawenl · 2016-11-17T04:13:43Z

That's a legacy issue which has already been fixed in #324.

adjidieng · 2016-11-17T04:17:25Z

@dawenl : that's great! thanks

dustinvtran · 2016-11-17T04:23:31Z

cool! since there's already a gp_classification.py, can you replace that script with these lines of code? (preferably keeping the filename gp_classification.py)

dustinvtran · 2017-04-27T03:17:42Z

Improved via #596.

adjidieng added the models label Nov 15, 2016

dustinvtran reviewed Nov 15, 2016

adjidieng added 9 commits November 16, 2016 23:40

stable gaussian process classification with global kernel

25ed329

stable goc with edward language

4777575

stable gpc with tensorflow

d196fc5

complying to pep8 for gpc.py

bea1479

complying to pep8 for tf_gpc.py

3d7c5cb

added probit link and wrapper around multivariate_rbf ...complying to pep8

a667f55

gpc with dustins comments non-vectorized kernel

8431e15

took out rbf wrapper off util

1f1cc29

took out tf_gpc.py....redundant...refer to tf_gp_classification.py

f277adc

adjidieng force-pushed the gpc branch from 155d183 to f277adc Compare November 17, 2016 04:59

copied gpc.py into gp_classification.py

0ff84f6

dustinvtran force-pushed the master branch from 93ef0c7 to 1266444 Compare November 18, 2016 20:09

dustinvtran force-pushed the master branch 2 times, most recently from a5ac146 to de5c24c Compare January 21, 2017 16:31

dustinvtran closed this Apr 27, 2017

dustinvtran deleted the gpc branch April 27, 2017 03:17
