# Prior selection
In this notebook we illustrate how to select a prior for a Gaussian process.

The prior is the value the Gaussian process falls back on when its uncertainty is too large. It can be any model of the data, and it is learned jointly with the Gaussian process during training.
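To make this fallback behaviour concrete, here is the predictive mean of textbook exact Gaussian process regression with a prior mean function $m$, a kernel $k$ and a noise variance $\sigma^2$. These symbols are introduced here for illustration, and this is the standard formulation rather than a description of tabularGP's internals, which may differ:

$$
\mu(x_*) = m(x_*) + k(x_*, X)\,\bigl[K(X, X) + \sigma^2 I\bigr]^{-1}\,\bigl(y - m(X)\bigr)
$$

When $x_*$ is far from the training inputs $X$ and the kernel decays with distance, $k(x_*, X)$ tends to zero and the prediction reduces to the prior, $\mu(x_*) \approx m(x_*)$.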

In [1]:
from fastai.tabular.all import *
from tabularGP import tabularGP_learner
from tabularGP.prior import *

## Data

We build a regression problem on a 1000-row sample of the Adult dataset:

In [2]:
path = untar_data(URLs.ADULT_SAMPLE)
df = pd.read_csv(path/'adult.csv').sample(1000)
procs = [FillMissing, Normalize, Categorify]

In [3]:
cat_names = ['workclass', 'education', 'marital-status', 'occupation', 'relationship', 'race']
cont_names = ['education-num', 'fnlwgt']
dep_var = 'age'

In [4]:
data = TabularDataLoaders.from_df(df, path, procs=procs, cat_names=cat_names, cont_names=cont_names, y_names=dep_var)

## Priors

The default prior is a constant (`ConstantPrior`). It tends to be a solid choice across problems, as long as the problem is stationary:

In [5]:
learn = tabularGP_learner(data, prior=ConstantPrior)
learn.fit_one_cycle(5, max_lr=1e-3)

epoch,train_loss,valid_loss,time
0,6.999245,6.444979,00:04
1,10.029132,10.245059,00:02
2,9.672515,9.264366,00:02
3,9.207201,9.275622,00:02
4,8.874594,9.348218,00:02


The simplest prior is the `ZeroPrior`, which returns zero for every input. It is only recommended if you have prior knowledge of your output domain and know that zero should be the default value:

In [6]:
learn = tabularGP_learner(data, prior=ZeroPrior)
learn.fit_one_cycle(5, max_lr=1e-3)

epoch,train_loss,valid_loss,time
0,7.948129,7.006125,00:02
1,10.382019,9.145267,00:02
2,9.810264,8.853174,00:02
3,9.34489,8.924518,00:02
4,9.031573,9.014092,00:02


We also provide a linear prior (`LinearPrior`), which is useful when you know that your output is non-stationary and follows a trend:

In [7]:
learn = tabularGP_learner(data, prior=LinearPrior)
learn.fit_one_cycle(5, max_lr=1e-3)

epoch,train_loss,valid_loss,time
0,9.789579,7.680765,00:02
1,8.720916,6.786974,00:02
2,8.115565,6.657002,00:02
3,7.76941,6.658643,00:02
4,7.571187,6.686262,00:02
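The difference between zero, constant and linear priors can be seen on a toy problem. The sketch below is a minimal, dependency-free exact Gaussian process with an RBF kernel. It is not tabularGP's implementation, and every name in it (`rbf`, `solve`, `gp_mean`, the three lambdas) is hypothetical. It fits data that follows a linear trend, then predicts both near and far from the training points:

```python
import math

# Illustrative sketch only: hypothetical helpers, not part of tabularGP.

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def gp_mean(x_train, y_train, x_new, prior, noise=1e-2):
    """Posterior mean of an exact GP whose prior mean is the callable `prior`."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    # The GP only has to explain what the prior does not.
    residuals = [y - prior(x) for x, y in zip(x_train, y_train)]
    alpha = solve(K, residuals)
    correction = sum(rbf(x_new, x) * a for x, a in zip(x_train, alpha))
    return prior(x_new) + correction

# Toy data following the trend y = 2x + 1.
x_train = [0.0, 1.0, 2.0, 3.0]
y_train = [2.0 * x + 1.0 for x in x_train]

priors = {
    "zero": lambda x: 0.0,
    "constant": lambda x: 4.0,          # mean of y_train
    "linear": lambda x: 2.0 * x + 1.0,  # the true trend
}

near = {name: gp_mean(x_train, y_train, 1.0, p) for name, p in priors.items()}
far = {name: gp_mean(x_train, y_train, 20.0, p) for name, p in priors.items()}

for name in priors:
    print(f"{name}: near={near[name]:.2f}, far={far[name]:.2f}")
```

Close to the training data all three priors give almost the same prediction, because the data dominates. Far from it, each prediction collapses to its prior, so only the linear prior keeps following the trend.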


## Transfer learning

Since the prior can be an arbitrary model (as long as its input and output types are compatible), nothing stops us from building our Gaussian process on top of a prior model. That model might be a Gaussian process trained for a similar output, or a different type of model trained for the same output type.

Here is a deep neural network trained on the same task as our Gaussian process:

In [8]:
learn_dnn = tabular_learner(data, layers=[200,100])
learn_dnn.fit_one_cycle(5, max_lr=1e-3)

epoch,train_loss,valid_loss,time
0,1756.064087,1494.598022,00:00
1,1705.314209,1468.095947,00:00
2,1630.018311,1421.468384,00:00
3,1533.601562,1369.066528,00:00
4,1460.467773,1313.670288,00:00


We can now pass the trained model to the `prior` argument of our builder:

In [9]:
learn = tabularGP_learner(data, prior=learn_dnn)
learn.fit_one_cycle(5, max_lr=1e-3)

epoch,train_loss,valid_loss,time
0,7.707906,6.73713,00:02
1,7.474609,6.405858,00:02
2,7.281069,6.397628,00:02
3,7.120369,6.418474,00:02
4,9.726858,10.190261,00:02


Note that, by default, the prior is frozen when transferring knowledge. Let's unfreeze it now that the rest of the Gaussian process is trained:

In [10]:
learn.unfreeze(prior=True)
learn.fit_one_cycle(5, max_lr=1e-3)

epoch,train_loss,valid_loss,time
0,9.013882,8.64404,00:02
1,8.379719,8.049027,00:02
2,7.875585,7.847621,00:02
3,7.499548,7.844793,00:02
4,7.267066,7.893331,00:02
