[MRG+2] Allowing Gaussian process kernels on structured data - updated #15557
This PR is a from-scratch refresh of #13954 (with all previous modifications incorporated) based on the latest master branch.
Made the Gaussian process regressor and classifier compatible with generic kernels on arbitrary data. Specifically:
…base `Kernel` class. Removed `VectorKernelMixin`, which is redundant in the presence of a default `requires_vector_input()` method.
@NicolasHug Hi Nicolas, my primary motivation for opening this PR was a need to perform Gaussian process regression on an ensemble of graphs. While the PR intends to bridge scikit-learn's GPR module to data including and beyond variable-length sequences, the actual changes involve no more than allowing non-vectorial data to pass through the GP regressors/classifiers without being touched at all.
Thanks to the kernel trick, as discussed in the original issue, the logic of computing the kernel matrix from samples is delegated to the kernel, which is user-supplied for sequence and other generic data. As such, I would argue that this adds very little burden to the development and maintenance of the GP module and does not disrupt the API or future development, while greatly extending the applicability of the module.
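To make the delegation argument concrete, here is a minimal, library-independent sketch in plain Python. It is not scikit-learn's implementation and does not use the API from this PR. The names `sequence_kernel`, `gram`, `solve`, and `gp_predict_mean` are hypothetical, and the character-matching kernel is only an assumed example of a user-supplied kernel. The point is that the regression code only ever sees the kernel matrix, so the samples can be variable-length strings (or graphs) and are never converted to vectors.

```python
def sequence_kernel(s1, s2, baseline=0.5):
    """Hypothetical user-supplied kernel on strings of any length.

    Sums 1.0 for every matching character pair and `baseline` otherwise.
    """
    return sum(1.0 if c1 == c2 else baseline for c1 in s1 for c2 in s2)


def gram(kernel, A, B):
    """Kernel matrix between two lists of arbitrary samples."""
    return [[kernel(a, b) for b in B] for a in A]


def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_predict_mean(kernel, X_train, y_train, X_test, noise=1e-6):
    """GP posterior mean: K(X*, X) (K(X, X) + noise * I)^-1 y.

    The samples are only ever passed to `kernel`; nothing here inspects them.
    """
    K = gram(kernel, X_train, X_train)
    for i in range(len(K)):
        K[i][i] += noise
    weights = solve(K, y_train)
    K_star = gram(kernel, X_test, X_train)
    return [sum(k * w for k, w in zip(row, weights)) for row in K_star]


# Variable-length sequences used directly as training samples.
X_train = ["AGCT", "AGC", "AACT", "TAA"]
y_train = [1.0, 1.0, 2.0, 3.0]
X_test = ["AGCTT", "TA"]

train_mean = gp_predict_mean(sequence_kernel, X_train, y_train, X_train)
test_mean = gp_predict_mean(sequence_kernel, X_train, y_train, X_test)
```

Swapping in a graph kernel would only change the function passed as `kernel`; the regression routine itself would stay the same, which is the sense in which the change to the GP module is small.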