## Gaussian process classification
This is a minimal example script that applies [Gaussian process](https://scikit-learn.org/stable/modules/gaussian_process.html) classification to the Titanic data set.
In [Gaussian process classification](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessClassifier.html) a [kernel function](https://scikit-learn.org/stable/modules/gaussian_process.html#gp-kernels) (or [covariance](https://en.wikipedia.org/wiki/Covariance) function) defines a prior over a latent function: it sets how strongly the latent values at two inputs are expected to be correlated, and so shapes both the prior and the posterior. The latent function is then passed through a link function (logistic in scikit-learn) to give class probabilities. The name comes from the Gaussian process prior, under which any finite set of latent function values jointly follows a (multivariate) [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution).
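
As a rough illustration of how a kernel shapes the prior, here is a minimal NumPy sketch. It assumes the standard squared-exponential (RBF) form, k(x, x') = exp(-||x - x'||² / (2ℓ²)). The helper name `rbf_kernel`, the three input points and the random seed are made up for this example and are not part of the notebook.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0):
    """Squared-exponential (RBF) covariance between the rows of A and B."""
    A = np.atleast_2d(A)
    B = np.atleast_2d(B)
    # squared Euclidean distance between every pair of rows
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

# three illustrative 1-D inputs: the first two are close, the third is far away
X = np.array([[0.0], [0.5], [3.0]])
K = rbf_kernel(X, X)

# draw five samples of the latent function from the zero-mean GP prior
rng = np.random.default_rng(0)
f = rng.multivariate_normal(mean=np.zeros(len(X)), cov=K, size=5)

# squash the latent values through the logistic function to get probabilities
p = 1.0 / (1.0 + np.exp(-f))

print(K.round(3))
print(p.round(3))
```

Nearby inputs get a covariance close to 1, so their latent values (and hence their class probabilities) tend to move together, while distant inputs are almost independent under the prior.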

```python
import pandas as pd
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

#===========================================================================
# read in the data
#===========================================================================
train_data = pd.read_csv('../input/titanic/train.csv')
test_data  = pd.read_csv('../input/titanic/test.csv')

#===========================================================================
# select some features of interest ("ay, there's the rub", Shakespeare)
#===========================================================================
features = ["Pclass", "Sex", "SibSp", "Parch"]

#===========================================================================
# for the features that are categorical we use pd.get_dummies:
# "Convert categorical variable into dummy/indicator variables."
# (here only "Sex" is categorical; numeric columns pass through unchanged)
#===========================================================================
X_train       = pd.get_dummies(train_data[features])
y_train       = train_data["Survived"]
final_X_test  = pd.get_dummies(test_data[features])
# make sure the test columns match the training columns, in the same order
final_X_test  = final_X_test.reindex(columns=X_train.columns, fill_value=0)

#===========================================================================
# create the kernel: a constant (amplitude) kernel times an RBF kernel;
# both 1.0 values are starting points that are tuned during the fit
#===========================================================================
kernel = 1.0 * RBF(1.0)

#===========================================================================
# create the classifier
#===========================================================================
classifier = GaussianProcessClassifier(kernel=kernel)

#===========================================================================
# fit the classifier to the training data
#===========================================================================
classifier.fit(X_train, y_train)

#===========================================================================
# use the model to predict 'Survived' for the test data
#===========================================================================
predictions = classifier.predict(final_X_test)

#===========================================================================
# write out CSV submission file
#===========================================================================
output = pd.DataFrame({'PassengerId': test_data.PassengerId,
                       'Survived': predictions})
output.to_csv('submission.csv', index=False)
```
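
The script above only writes hard 0/1 predictions. As a hedged sketch of what else the fitted model offers, the example below runs the same steps on a tiny made-up table (the eight rows in `toy` are invented for illustration and are not Titanic records). It then looks at the class probabilities from `predict_proba` and at the optimised kernel stored in `kernel_`. Passing `dtype=float` to `pd.get_dummies` is an addition in this sketch to keep the indicator columns numeric.

```python
import numpy as np
import pandas as pd
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# made-up passengers with the same columns as the notebook's feature list
toy = pd.DataFrame({
    "Pclass":   [1, 3, 2, 3, 1, 3, 2, 1],
    "Sex":      ["female", "male", "female", "male",
                 "male", "male", "female", "female"],
    "SibSp":    [0, 1, 0, 0, 1, 0, 1, 0],
    "Parch":    [0, 0, 1, 0, 0, 2, 0, 1],
    "Survived": [1, 0, 1, 0, 0, 0, 1, 1],
})

# "Sex" becomes two indicator columns; the numeric columns are left alone
X_toy = pd.get_dummies(toy[["Pclass", "Sex", "SibSp", "Parch"]], dtype=float)
y_toy = toy["Survived"]

clf = GaussianProcessClassifier(kernel=1.0 * RBF(1.0), random_state=0)
clf.fit(X_toy, y_toy)

# one row per passenger; columns follow the order of clf.classes_
proba = clf.predict_proba(X_toy)

print(list(X_toy.columns))
print(clf.kernel_)          # kernel after hyperparameter optimisation
print(proba.round(2))
```

With so few rows the optimiser may emit a convergence warning; the point here is only the shape of the workflow, not the quality of the fit.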

### Kernel classes available in scikit-learn:
* [CompoundKernel](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.CompoundKernel.html): Kernel which is composed of a set of other kernels.
* [ConstantKernel](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.ConstantKernel.html): Constant kernel.
* [DotProduct](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.DotProduct.html): Dot-Product kernel.
* [Exponentiation](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Exponentiation.html): The Exponentiation kernel takes one base kernel and a scalar parameter and raises the kernel to that power (`kernel ** exponent`).
* [ExpSineSquared](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.ExpSineSquared.html): Exp-Sine-Squared kernel (aka periodic kernel).
* [Hyperparameter](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Hyperparameter.html): Not a kernel itself, but a kernel hyperparameter’s specification in the form of a namedtuple.
* [Kernel](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Kernel.html): Base class for all kernels.
* [Matern](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Matern.html): The class of Matern kernels is a generalization of the radial-basis function kernel (aka squared-exponential kernel).
* [PairwiseKernel](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.PairwiseKernel.html): Wrapper for kernels in sklearn.metrics.pairwise.
* [Product](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Product.html): The Product kernel takes two kernels and multiplies them (`k1 * k2`).
* [RationalQuadratic](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.RationalQuadratic.html): The RationalQuadratic kernel can be seen as a scale mixture (an infinite sum) of radial-basis function kernels with different characteristic length scales.
* [RBF](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.RBF.html): Radial-basis function kernel (aka squared-exponential kernel).
* [Sum](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Sum.html): The Sum kernel takes two kernels and adds them (`k1 + k2`).
* [WhiteKernel](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.WhiteKernel.html): The main use-case of the White kernel is as part of a sum-kernel where it explains the noise of the signal as independently and identically normally-distributed.
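
The `1.0 * RBF(1.0)` expression used in the script is itself a composition of two of the kernels listed here. The short sketch below shows how the arithmetic operators build `Product` and `Sum` objects. The `noise_level=0.1` value and the three input points are arbitrary choices for illustration only.

```python
import numpy as np
from sklearn.gaussian_process.kernels import (
    RBF, ConstantKernel, Product, Sum, WhiteKernel,
)

# scalar * kernel builds a Product of a ConstantKernel and the RBF kernel
kernel = 1.0 * RBF(1.0)
print(type(kernel).__name__, "=",
      type(kernel.k1).__name__, "*", type(kernel.k2).__name__)

# kernel + kernel builds a Sum; here a WhiteKernel adds a noise term
noisy = kernel + WhiteKernel(noise_level=0.1)
print(type(noisy).__name__)

# a kernel object is callable and returns the covariance matrix of its inputs
X = np.array([[0.0], [1.0], [2.0]])
K = kernel(X)
K_noisy = noisy(X)

# the WhiteKernel only contributes to the diagonal
print((K_noisy - K).round(3))
```

Because every composite is again a kernel, the result can be passed straight to `GaussianProcessClassifier(kernel=...)` in place of the RBF-based kernel used in the script.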

For more information on which choice to make see [The Kernel Cookbook: 'Advice on Covariance functions'](https://www.cs.toronto.edu/~duvenaud/cookbook/), by David Duvenaud.

## Related reading
* [Gaussian process](https://en.wikipedia.org/wiki/Gaussian_process) on Wikipedia
* [Gaussian Processes](https://scikit-learn.org/stable/modules/gaussian_process.html) on Scikit-learn
* [GaussianProcessRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html) on Scikit-learn
* [Carl Edward Rasmussen and Christopher K. I. Williams "Gaussian Processes for Machine Learning"](http://www.gaussianprocess.org/gpml/) (book website)
* [Gaussian Process Regression Models](https://www.mathworks.com/help/stats/gaussian-process-regression-models.html) on MathWorks

### Related notebooks:
* [Gaussian process regression and classification](https://www.kaggle.com/residentmario/gaussian-process-regression-and-classification) by Aleksey Bilogur
* [Feature Engineering with Gaussian Process](https://www.kaggle.com/kenmatsu4/feature-engineering-with-gaussian-process) by kenmatsu4