# Kernel Functions

For implementations of different kernel functions, see the [scikit-learn kernel reference](https://scikit-learn.org/stable/modules/classes.html#module-sklearn.gaussian_process.kernels). For guidance on how to choose a kernel, see [the kernel cookbook](https://www.cs.toronto.edu/~duvenaud/cookbook/).

## Import packages

In [1]:
import numpy as np
from sklearn.gaussian_process.kernels import RBF
from sklearn.gaussian_process.kernels import Matern
from scipy.spatial.distance import cdist

## Gaussian RBF kernel

The Gaussian RBF kernel is defined as

\begin{equation}
K(x,x') := \operatorname{exp}\left(-\frac{\|x-x'\|_2^2}{2\sigma^2}\right),
\end{equation}

where $\sigma > 0$ is called the length-scale parameter. An important fact to note is that the Gaussian RBF kernel is a limiting case of the Matérn kernel: as the Matérn smoothness parameter $\nu \to \infty$, the Matérn kernel becomes identical to the Gaussian RBF kernel. See [here](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.RBF.html#sklearn.gaussian_process.kernels.RBF) and the references therein for further details.
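
As a quick sanity check of the definition above, the following sketch evaluates the formula directly and compares it with scikit-learn's `RBF`. The data, the value of `sigma`, and the variable names are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.gaussian_process.kernels import RBF

# Illustrative data: 5 points in 3 dimensions (an assumption for this sketch)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
sigma = 1.5  # length-scale parameter

# Squared Euclidean distances between all pairs of rows
sq_dists = cdist(X, X, metric="sqeuclidean")

# Direct evaluation of exp(-||x - x'||^2 / (2 sigma^2))
K_manual = np.exp(-sq_dists / (2.0 * sigma**2))

# The same matrix from scikit-learn
K_sklearn = RBF(length_scale=sigma)(X)

print(np.allclose(K_manual, K_sklearn))
```

If the two matrices agree, `length_scale` in scikit-learn plays the role of $\sigma$ in the formula.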



## Matérn 2.5 kernel

The class of Matérn kernels is a generalization of the RBF kernel, see [here](https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Matern.html#sklearn.gaussian_process.kernels.Matern). It has an additional parameter $\nu > 0$ which controls the smoothness of the resulting function: the smaller $\nu$, the less smooth the approximated function is. Here $\nu=2.5$ corresponds to twice differentiable functions. The kernel is defined as

\begin{equation}
K(x,x') := \frac{1}{\Gamma(\nu) \, 2^{\nu - 1}} \left(\frac{\sqrt{2\nu}}{\rho} \|x-x'\|_2\right)^\nu K_{\nu}\left(\frac{\sqrt{2\nu}}{\rho} \|x-x'\|_2\right),
\end{equation}

where $\Gamma(\cdot)$ is the Gamma function, $K_\nu(\cdot)$ is the modified Bessel function of the second kind, and $\rho >0$ is the length-scale parameter.
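
For $\nu = 2.5$ the Bessel expression reduces to the closed form $\left(1 + z + z^2/3\right)e^{-z}$ with $z = \sqrt{5}\,\|x-x'\|_2/\rho$. The sketch below checks both the general formula (using `scipy.special.kv` and `scipy.special.gamma`) and the closed form against scikit-learn's `Matern`. The data and variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.special import gamma, kv
from sklearn.gaussian_process.kernels import Matern

# Illustrative data (an assumption for this sketch)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
rho, nu = 1.5, 2.5

d = cdist(X, X)                  # Euclidean distances
z = np.sqrt(2.0 * nu) * d / rho  # scaled distances

# Closed form valid for nu = 2.5
K_closed = (1.0 + z + z**2 / 3.0) * np.exp(-z)

# General Bessel form, evaluated only where z > 0;
# at z = 0 the kernel takes its limiting value 1.
K_bessel = np.ones_like(z)
mask = z > 0
K_bessel[mask] = z[mask] ** nu * kv(nu, z[mask]) / (gamma(nu) * 2.0 ** (nu - 1.0))

K_sklearn = Matern(length_scale=rho, nu=nu)(X)

print(np.allclose(K_closed, K_sklearn), np.allclose(K_bessel, K_sklearn))
```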

## Laplacian kernel

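Setting $\nu=0.5$ in the class of Matérn kernels gives the exponential kernel $K(x,x') = \operatorname{exp}\left(-\|x - x'\|_2/\rho\right)$, which is the kernel used under the name Laplacian in the code below. It is closely related to the Laplacian kernel, which has the expression, see [here](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.laplacian_kernel.html),

\begin{equation}
K(x,x') = \operatorname{exp}\left(-\gamma\|x - x'\|_1\right),
\end{equation}

where $\gamma>0$ plays the role of an inverse length-scale. The two kernels coincide for one-dimensional inputs when $\gamma = 1/\rho$; in higher dimensions they differ, because one uses the $\ell_2$ norm and the other the $\ell_1$ norm. Both correspond to less smooth functions than the Matérn $2.5$ or the RBF family.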
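
The relationship between the Matérn kernel with $\nu = 0.5$ and the $\ell_1$ Laplacian kernel can be checked numerically. The sketch below uses `sklearn.metrics.pairwise.laplacian_kernel` with `gamma = 1 / rho`; the data and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.gaussian_process.kernels import Matern
from sklearn.metrics.pairwise import laplacian_kernel

rng = np.random.default_rng(0)
rho = 2.0  # length-scale; gamma = 1 / rho below

# Two-dimensional inputs: Matern(nu=0.5) uses the Euclidean (l2) distance
X2 = rng.normal(size=(6, 2))
K_matern = Matern(length_scale=rho, nu=0.5)(X2)
K_exp_l2 = np.exp(-cdist(X2, X2, metric="euclidean") / rho)
K_lap_l1 = laplacian_kernel(X2, gamma=1.0 / rho)

matches_l2_in_2d = np.allclose(K_matern, K_exp_l2)
matches_l1_in_2d = np.allclose(K_matern, K_lap_l1)

# One-dimensional inputs: l1 and l2 distances agree, so the kernels coincide
X1 = rng.normal(size=(6, 1))
matches_l1_in_1d = np.allclose(
    Matern(length_scale=rho, nu=0.5)(X1),
    laplacian_kernel(X1, gamma=1.0 / rho),
)

print(matches_l2_in_2d, matches_l1_in_2d, matches_l1_in_1d)
```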

## Choice of kernel

The kernels above represent functions of varying smoothness. Although many other kernels are used in practice, we use only these three here: the Gaussian RBF, the Matérn $2.5$, and the Laplacian. For further details on the class of kernels used in Gaussian processes, see [here](https://scikit-learn.org/stable/modules/gaussian_process.html#gp-kernels). Whichever kernel you choose, try several values of the length-scale parameter. It is also good practice to look at a scatterplot of the features against the target variable, which gives an idea of which kernel may be more appropriate.
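
To get a feel for the length-scale, one option is to look at how the average similarity between distinct points changes as it varies. The sketch below is only an illustration under assumed data and an assumed grid of length-scales; the `kernels` dictionary and `mean_similarity` are hypothetical names.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, Matern

# Illustrative data: 20 points in the square [-1, 1]^2 (an assumption)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(20, 2))

# Factories for the three kernels discussed above
kernels = {
    "RBF": lambda ls: RBF(length_scale=ls),
    "Matern": lambda ls: Matern(length_scale=ls, nu=2.5),
    "Laplacian": lambda ls: Matern(length_scale=ls, nu=0.5),
}

length_scales = [0.1, 1.0, 10.0]
off_diag = ~np.eye(X.shape[0], dtype=bool)  # mask that excludes the diagonal

mean_similarity = {}
for name, make_kernel in kernels.items():
    # Mean kernel value over all pairs of distinct points, per length-scale
    mean_similarity[name] = [
        make_kernel(ls)(X)[off_diag].mean() for ls in length_scales
    ]
    print(name, np.round(mean_similarity[name], 3))
```

A very small length-scale makes distinct points nearly uncorrelated, while a very large one makes all points look alike; useful values typically lie between these extremes.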

In [2]:
# Complete the function below that evaluates the kernel matrix K and G for a particular choice among the three kernels

def kernel_choice(X_train, X_test=None, kernel_type='RBF', length_scale=1.0):
    
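    """Return the kernel matrix K on X_train and, if X_test is given, the matrix G.

    K has shape (n_samples_train, n_samples_train).
    G has shape (n_samples_test, n_samples_train).
    """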
    
    # Select the appropriate kernel based on the specified kernel_type
    if kernel_type == 'RBF':
        kernel = RBF(length_scale=length_scale)
    elif kernel_type == 'Matern':
        kernel = Matern(length_scale=length_scale, nu=2.5)
    elif kernel_type == 'Laplacian':
        # Matern with nu=0.5 is the exponential kernel exp(-||x - x'||_2 / length_scale);
        # it is used here as the Laplacian kernel
        kernel = Matern(length_scale=length_scale, nu=0.5)
    else:
        raise ValueError("Invalid kernel type. Choose 'RBF', 'Matern', or 'Laplacian'.")
        
    # Compute the kernel matrix K
    K = kernel(X_train)
    
    # If X_test is provided, compute the kernel matrix G between X_test and X_train
    if X_test is not None:
        G = kernel(X_test, X_train)
        return K, G
    else:
        return K
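
The shapes of the two matrices follow a simple convention: `K` compares training points with each other, and `G` compares each test point with each training point. The sketch below illustrates this with the RBF kernel called directly, so that it runs on its own; the data are an assumption.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# Illustrative data (an assumption for this sketch)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 3))
X_test = rng.normal(size=(4, 3))

kernel = RBF(length_scale=1.0)

K = kernel(X_train)          # train vs. train
G = kernel(X_test, X_train)  # test vs. train; G[i, j] = k(X_test[i], X_train[j])

print(K.shape, G.shape)
print(np.allclose(K, K.T), np.allclose(np.diag(K), 1.0))
```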

## Convex combination of kernels

In [3]:
# Complete the function below that evaluates the kernel matrix K and G for any convex combination of the three kernels

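def convex_kernel(X_train, X_test, weights, length_scales):
    """Return K and G for a convex combination of the three kernels.

    weights and length_scales are dicts keyed by kernel type,
    e.g. {'RBF': 0.5, 'Matern': 0.3, 'Laplacian': 0.2}.
    """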
    # Verify that the weights add to 1 and are non-negative
    if not np.isclose(sum(weights.values()), 1.0):
        raise ValueError("Weights should sum to 1.")
    if any(w < 0 for w in weights.values()):
        raise ValueError("Weights should be non-negative.")
    
    K_combined = np.zeros((X_train.shape[0], X_train.shape[0]))
    G_combined = np.zeros((X_test.shape[0], X_train.shape[0]))

    for kernel_type, weight in weights.items():
        K, G = kernel_choice(X_train, X_test, kernel_type=kernel_type, length_scale=length_scales[kernel_type])
        K_combined += weight * K
        G_combined += weight * G

    return K_combined, G_combined
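
A weighted sum of kernel matrices can also be obtained from scikit-learn's kernel operators, where multiplying a kernel by a number and adding kernels yields a new kernel object. The sketch below compares the two routes and checks that the combined matrix stays positive semidefinite. The weights, length-scales, and data are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, Matern

# Illustrative data and parameters (assumptions for this sketch)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 3))
X_test = rng.normal(size=(4, 3))
weights = {"RBF": 0.5, "Matern": 0.3, "Laplacian": 0.2}
length_scales = {"RBF": 1.0, "Matern": 2.0, "Laplacian": 0.5}

base = {
    "RBF": RBF(length_scale=length_scales["RBF"]),
    "Matern": Matern(length_scale=length_scales["Matern"], nu=2.5),
    "Laplacian": Matern(length_scale=length_scales["Laplacian"], nu=0.5),
}

# Route 1: weighted sum of the individual kernel matrices
K_sum = sum(w * base[name](X_train) for name, w in weights.items())
G_sum = sum(w * base[name](X_test, X_train) for name, w in weights.items())

# Route 2: build one combined kernel object with scikit-learn's operators
combined = (
    weights["RBF"] * base["RBF"]
    + weights["Matern"] * base["Matern"]
    + weights["Laplacian"] * base["Laplacian"]
)
K_alg = combined(X_train)
G_alg = combined(X_test, X_train)

# A convex combination of positive semidefinite matrices is positive semidefinite
min_eig = np.linalg.eigvalsh(K_sum).min()

print(np.allclose(K_sum, K_alg), np.allclose(G_sum, G_alg), min_eig > -1e-10)
```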

## Product of kernels

In [4]:
# Complete the function below that evaluates the kernel matrices K and G for the product of the three kernels

def product_kernel(X_train, X_test, length_scales):
    """
    Compute the kernel matrix K and matrix G for the product of the three kernels.

    Parameters:
    X_train (numpy.ndarray): Training data points, shape (n_samples_train, n_features)
    X_test (numpy.ndarray): Test data points, shape (n_samples_test, n_features)
    length_scales (dict): Length-scale parameters for each kernel, e.g., {'RBF': 1.0, 'Matern': 1.0, 'Laplacian': 1.0}

    Returns:
    tuple: Product kernel matrices K and G
    """
    # Initialize the product matrices
    K_product = np.ones((X_train.shape[0], X_train.shape[0]))
    G_product = np.ones((X_test.shape[0], X_train.shape[0]))

    for kernel_type in ['RBF', 'Matern', 'Laplacian']:
        K, G = kernel_choice(X_train, X_test, kernel_type=kernel_type, length_scale=length_scales[kernel_type])
        K_product *= K
        G_product *= G

    return K_product, G_product
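
The elementwise product of kernel matrices is again a valid kernel matrix (by the Schur product theorem), and scikit-learn exposes the same operation through the `*` operator on kernel objects. The following sketch compares both routes; the length-scales and data are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, Matern

# Illustrative data and parameters (assumptions for this sketch)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(8, 3))
X_test = rng.normal(size=(4, 3))

rbf = RBF(length_scale=1.0)
matern = Matern(length_scale=2.0, nu=2.5)
exponential = Matern(length_scale=0.5, nu=0.5)  # used as the Laplacian kernel

# Route 1: elementwise product of the individual kernel matrices
K_rbf = rbf(X_train)
K_manual = K_rbf * matern(X_train) * exponential(X_train)
G_manual = rbf(X_test, X_train) * matern(X_test, X_train) * exponential(X_test, X_train)

# Route 2: a single product kernel object
product = rbf * matern * exponential
K_alg = product(X_train)
G_alg = product(X_test, X_train)

# The product matrix remains positive semidefinite
min_eig = np.linalg.eigvalsh(K_manual).min()

print(np.allclose(K_manual, K_alg), np.allclose(G_manual, G_alg), min_eig > -1e-10)
```

Because every factor takes values in $(0, 1]$, the product is never larger than any single factor, so the product kernel decays at least as fast as each of its components.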

## Writing kernel_functions.py

In [5]:
%%writefile kernel_functions.py
import numpy as np
from sklearn.gaussian_process.kernels import RBF
from sklearn.gaussian_process.kernels import Matern
from scipy.spatial.distance import cdist
def kernel_choice(X_train, X_test=None, kernel_type='RBF', length_scale=1.0):
    
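    """Return the kernel matrix K on X_train and, if X_test is given, the matrix G.

    K has shape (n_samples_train, n_samples_train).
    G has shape (n_samples_test, n_samples_train).
    """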
    
    # Select the appropriate kernel based on the specified kernel_type
    if kernel_type == 'RBF':
        kernel = RBF(length_scale=length_scale)
    elif kernel_type == 'Matern':
        kernel = Matern(length_scale=length_scale, nu=2.5)
    elif kernel_type == 'Laplacian':
        # Matern with nu=0.5 is the exponential kernel exp(-||x - x'||_2 / length_scale);
        # it is used here as the Laplacian kernel
        kernel = Matern(length_scale=length_scale, nu=0.5)
    else:
        raise ValueError("Invalid kernel type. Choose 'RBF', 'Matern', or 'Laplacian'.")
        
    # Compute the kernel matrix K
    K = kernel(X_train)
    
    # If X_test is provided, compute the kernel matrix G between X_test and X_train
    if X_test is not None:
        G = kernel(X_test, X_train)
        return K, G
    else:
        return K

# Complete the function below that evaluates the kernel matrix K and G for any convex combination of the three kernels

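def convex_kernel(X_train, X_test, weights, length_scales):
    """Return K and G for a convex combination of the three kernels.

    weights and length_scales are dicts keyed by kernel type,
    e.g. {'RBF': 0.5, 'Matern': 0.3, 'Laplacian': 0.2}.
    """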
    # Verify that the weights add to 1 and are non-negative
    if not np.isclose(sum(weights.values()), 1.0):
        raise ValueError("Weights should sum to 1.")
    if any(w < 0 for w in weights.values()):
        raise ValueError("Weights should be non-negative.")
    
    K_combined = np.zeros((X_train.shape[0], X_train.shape[0]))
    G_combined = np.zeros((X_test.shape[0], X_train.shape[0]))

    for kernel_type, weight in weights.items():
        K, G = kernel_choice(X_train, X_test, kernel_type=kernel_type, length_scale=length_scales[kernel_type])
        K_combined += weight * K
        G_combined += weight * G

    return K_combined, G_combined

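def product_kernel(X_train, X_test, length_scales):
    """Return K and G for the product of the three kernels.

    length_scales is a dict keyed by kernel type,
    e.g. {'RBF': 1.0, 'Matern': 1.0, 'Laplacian': 1.0}.
    """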
    # Initialize the product matrices
    K_product = np.ones((X_train.shape[0], X_train.shape[0]))
    G_product = np.ones((X_test.shape[0], X_train.shape[0]))

    for kernel_type in ['RBF', 'Matern', 'Laplacian']:
        K, G = kernel_choice(X_train, X_test, kernel_type=kernel_type, length_scale=length_scales[kernel_type])
        K_product *= K
        G_product *= G

    return K_product, G_product

Writing kernel_functions.py
