Deep-Kernel-GP

Dependencies

The package depends on numpy and scipy.linalg. The examples also use matplotlib and scikit-learn.

Introduction

Instead of learning a mapping X --> Y directly with a neural network or with GP regression, we learn the composed mapping X --> Z --> Y, where the first step is performed by a neural network and the second by a GP regression algorithm.

This lets us use GP regression on data where it may not be fair to assume that y(x) is a Gaussian process whose covariance is given by one of the standard covariance functions. For instance, we can learn functions with image pixels as inputs, or functions whose length scale varies with the input.

The parameters of the neural network are trained by maximizing the log marginal likelihood implied by z(x_train) and y_train.
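The training objective can be sketched in plain numpy. This is a minimal illustration and not the package's own code: the function names, the one-parameter stand-in for the network, and the choice to treat the noise `s` as a variance added to the diagonal are all assumptions made for the example.

```python
import numpy as np

def rbf_cov(z, v=1.0, s=1e-2):
    """Covariance v*exp(-0.5*|z_i - z_j|**2), plus noise s on the diagonal (assumed form)."""
    sq_dist = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    return v * np.exp(-0.5 * sq_dist) + s * np.eye(z.shape[0])

def log_marginal_likelihood(z, y, v=1.0, s=1e-2):
    """log p(y | z) for a zero-mean GP, computed via a Cholesky factorization."""
    n = z.shape[0]
    L = np.linalg.cholesky(rbf_cov(z, v, s))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    # sum(log(diag(L))) equals 0.5 * log(det(K))
    return float(-0.5 * y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2 * np.pi))

def features(x, w):
    """Hypothetical one-parameter 'network': z = tanh(w * x)."""
    return np.tanh(w * x)

# Synthetic data, only to show that the objective depends on the network parameter w.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(20, 1))
y_train = np.sin(3.0 * x_train[:, 0])
for w in (0.1, 1.0, 3.0):
    print(w, log_marginal_likelihood(features(x_train, w), y_train))
```

In the real setting, the gradient of this quantity with respect to the network weights would be passed to the optimizer; the loop above only evaluates the objective at a few values of `w`.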

Deep Kernel Learning - A.G. Wilson et al.

Using Deep Belief Nets to Learn Covariance Kernels for Gaussian Processes - R. Salakhutdinov and G. Hinton

Examples

Basic usage follows a scikit-learn-like API:

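# Import paths are assumed from the dknet package layout; adjust if they differ.
from dknet import NNRegressor
from dknet.layers import Dense, CovMat
from dknet.optimizers import Adam, SciPyMin

# Network z(x): one hidden tanh layer followed by a 1-d linear output.
layers = []
layers.append(Dense(32, activation='tanh'))
layers.append(Dense(1))
# The final layer turns z into a covariance matrix using an RBF kernel.
layers.append(CovMat(kernel='rbf'))

opt = Adam(1e-3)  # or opt = SciPyMin('l-bfgs-b')

# Full-batch training: batch_size equals the number of training points.
gp = NNRegressor(layers, opt=opt, batch_size=x_train.shape[0], maxiter=1000, gp=True, verbose=True)
gp.fit(x_train, y_train)
y_pred, std = gp.predict(x_test)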

The example creates a mapping z(x), where both x and z are 1d vectors, using a neural network with one hidden layer. The CovMat layer creates a covariance matrix from z using the covariance function v*exp(-0.5*|z1-z2|**2) with an added noise term, where the scale v and the noise level are learned during training.

After training, they are available as gp.layers[-1].var and gp.layers[-1].s_alpha, respectively. The gp.fast_forward() function can be used to extract the z(x) function; it skips the last layer, which produces an array of size [batch_size, batch_size].
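Given the learned features z and the two kernel parameters, the GP prediction can be reproduced by hand. The sketch below is an illustration under assumptions: `gp_predict` and `rbf` are hypothetical helper names, `s` is treated as a noise variance added to the diagonal, and the returned standard deviation is that of the latent function (without observation noise). The arrays stand in for features that would come from the network.

```python
import numpy as np

def rbf(za, zb, v):
    """Cross-covariance v*exp(-0.5*|za_i - zb_j|**2) between two feature sets."""
    sq_dist = np.sum((za[:, None, :] - zb[None, :, :]) ** 2, axis=-1)
    return v * np.exp(-0.5 * sq_dist)

def gp_predict(z_train, y_train, z_test, v, s):
    """Posterior mean and latent standard deviation of a zero-mean GP on features z."""
    K = rbf(z_train, z_train, v) + s * np.eye(len(z_train))
    K_star = rbf(z_test, z_train, v)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    mean = K_star @ alpha
    w = np.linalg.solve(L, K_star.T)
    var = v - np.sum(w ** 2, axis=0)  # prior variance minus explained variance
    return mean, np.sqrt(np.maximum(var, 0.0))

# Placeholder features and targets (not output of the package).
z_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([0.0, 1.0, 0.0])
z_test = np.array([[0.5], [1.5], [100.0]])
mean, std = gp_predict(z_train, y_train, z_test, v=1.0, s=1e-2)
print(mean, std)
```

Far away from all training features (the last test point), the mean falls back to zero and the standard deviation approaches sqrt(v).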

Learning a function with varying length scale

In the example.py script, deep kernel learning (DKL) is used to learn from samples of the function sin(64(x+0.5)**4).

Learning this function with a neural network alone would be hard, since it can be challenging to fit rapidly oscillating functions with neural networks. Learning the function using GP regression with a squared exponential covariance function would also be suboptimal, since we need to commit to one fixed length scale. Unless we have a lot of samples, we would be forced to give up precision on the slowly varying part of the function.
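To make the varying length scale concrete, the following sketch samples the target function and evaluates its local angular frequency, which is the derivative of the phase 64*(x+0.5)**4. The input range, the number of samples, and the helper names are assumptions for illustration and are not taken from example.py.

```python
import numpy as np

def target(x):
    """The function from the example: sin(64*(x+0.5)**4)."""
    return np.sin(64.0 * (x + 0.5) ** 4)

def local_frequency(x):
    """Derivative of the phase 64*(x+0.5)**4, i.e. the local angular frequency."""
    return 256.0 * (x + 0.5) ** 3

# Assumed sampling setup: 100 points drawn uniformly from [-1, 1].
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(100, 1))
y_train = target(x_train[:, 0])

# The oscillation rate changes by orders of magnitude across the input range.
for x in (-0.4, 0.0, 0.5, 1.0):
    print(x, local_frequency(np.array([x]))[0])
```

Near x = -0.5 the function is almost flat, while towards x = 1 it oscillates very quickly, so a single fixed length scale cannot suit both regions.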

Figure: DKL prediction.

Figure: z(x) function learned by the neural network.

We see that DKL solves the problem quite nicely, given the limited data. We also see that for x < -0.5 the standard deviation predicted by the DKL model does not capture the prediction error.
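One way to quantify whether the predicted standard deviation captures the error is to measure empirical coverage: the fraction of test points whose absolute error lies within a chosen number of standard deviations. The helper below is a hypothetical sketch, and the arrays are made-up illustrative numbers, not results from the model.

```python
import numpy as np

def coverage(y_true, y_pred, std, n_std=2.0):
    """Fraction of points with |y_true - y_pred| <= n_std * std."""
    return float(np.mean(np.abs(y_true - y_pred) <= n_std * std))

# Made-up values standing in for x_test, the true function, and (y_pred, std).
x = np.array([-0.9, -0.7, -0.3, 0.0, 0.4, 0.8])
y_true = np.array([0.50, 0.10, 0.00, -0.76, 0.30, -0.20])
y_pred = np.array([0.10, 0.45, 0.01, -0.75, 0.28, -0.25])
std = np.array([0.05, 0.05, 0.05, 0.05, 0.05, 0.05])

# Compare coverage on the two regions of the input space.
left = x < -0.5
print("x < -0.5 :", coverage(y_true[left], y_pred[left], std[left]))
print("x >= -0.5:", coverage(y_true[~left], y_pred[~left], std[~left]))
```

For a well-calibrated Gaussian predictive distribution, roughly 95% of points should fall within two standard deviations; a much lower value in one region indicates overconfident uncertainty estimates there.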
