# Practice 5

Suppose you have three data points $x_1 = (1, 2)$, $x_2 = (3, 1)$ and $x_3 = (0, 1)$, and the kernel function
$k(x, y) = (\langle x, y \rangle + 2)^2$.


In [1]:
import numpy as np
from itertools import combinations

def k(x,y):
    return (np.dot(x,y) + 2)**2

x1 = np.array([1, 2])
x2 = np.array([3, 1])
x3 = np.array([0, 1])

data = [x1,x2,x3]

## Exercise 1

What is the dimension of the feature space $F$ induced by the kernel, and what is the kernel-induced feature map $\Phi : X \to F$?

$$
\begin{array}{ccl}
    \langle \Phi(x), \Phi(y) \rangle_F & = & k(x,y) \\
                     & = & (\langle x,y \rangle + 2)^2 \\
                     & = & \langle x,y \rangle^2 + 4\langle x,y \rangle + 4 \\
                     & = & (x_1y_1 + x_2y_2)^2 + 4(x_1y_1 + x_2y_2) + 4 \\
                     & = & x_1^2y_1^2 + 2x_1y_1x_2y_2 + x_2^2y_2^2 + 4x_1y_1 + 4x_2y_2 + 4 \\
                     & = & \langle (x_1^2 , x_2^2 , \sqrt{2}x_1x_2, 2x_1, 2x_2, 2) , (y_1^2 , y_2^2 , \sqrt{2}y_1y_2, 2y_1, 2y_2, 2) \rangle
\end{array}
$$

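Here $x_1, x_2$ and $y_1, y_2$ denote the coordinates of $x$ and $y$, not the data points defined above. Reading off the two factors of the last inner product, we can define $\Phi$ as: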

$$
\begin{array}{cccc}
    \Phi : & \mathbb{R}^2 & \rightarrow & \mathbb{R}^6 \\
           & (x_1 , x_2) & \mapsto & (x_1^2 , x_2^2 , \sqrt{2}x_1x_2, 2x_1, 2x_2, 2)
\end{array}
$$

With this choice of $\Phi$, $F = \mathbb{R}^6$, so the feature space has dimension $6$.
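
As a cross-check on the dimension, the explicit map has one coordinate per monomial of degree at most 2 in the two input variables. The sketch below assumes the standard count $\binom{n+d}{d}$ for a polynomial kernel $(\langle x, y \rangle + c)^d$ with $c > 0$ on $\mathbb{R}^n$. The helper name `poly_feature_dim` is illustrative and not part of the exercise.

```python
from math import comb

def poly_feature_dim(n, d):
    """Number of monomials of degree <= d in n variables (assumed feature count)."""
    return comb(n + d, d)

# Two input variables, kernel degree 2, as in this practice.
dim = poly_feature_dim(2, 2)
print(dim)
```

For $n = 2$ and $d = 2$ this agrees with the six coordinates of $\Phi$ listed above.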

In [2]:
phi = lambda x: np.array([x[0]**2, x[1]**2, np.sqrt(2)*x[0]*x[1], 2*x[0], 2*x[1], 2])

for x,y in combinations([0,1,2], 2):
    x1_ = data[x]
    x2_ = data[y]
    assert np.allclose(k(x1_, x2_), np.dot(phi(x1_), phi(x2_))), f"Test failed for x{x+1} and x{y+1}"
    print(f"Test passed for x{x+1} and x{y+1}: k(x{x+1},x{y+1}) = {k(x1_, x2_)} = <Φ(x{x+1}), Φ(x{y+1})>")

Test passed for x1 and x2: k(x1,x2) = 49 = <Φ(x1), Φ(x2)>
Test passed for x1 and x3: k(x1,x3) = 16 = <Φ(x1), Φ(x3)>
Test passed for x2 and x3: k(x2,x3) = 9 = <Φ(x2), Φ(x3)>


## Exercise 2

Calculate $\langle \Phi(x_1), \Phi(x_2) \rangle_F$ in the feature space.
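
One way to check the value by hand, assuming the explicit map $\Phi(x) = (x_1^2, x_2^2, \sqrt{2}x_1x_2, 2x_1, 2x_2, 2)$ from Exercise 1, is to evaluate both feature vectors for the data points $x_1 = (1, 2)$ and $x_2 = (3, 1)$ and take their inner product in $\mathbb{R}^6$:

$$
\begin{array}{rcl}
    \Phi(x_1) & = & (1, 4, 2\sqrt{2}, 2, 4, 2) \\
    \Phi(x_2) & = & (9, 1, 3\sqrt{2}, 6, 2, 2) \\
    \langle \Phi(x_1), \Phi(x_2) \rangle_F & = & 9 + 4 + 12 + 12 + 8 + 4 = 49
\end{array}
$$

This agrees with $k(x_1, x_2) = (\langle x_1, x_2 \rangle + 2)^2 = (5 + 2)^2 = 49$, so the kernel gives the same number without constructing $\Phi$.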

In [3]:
d_prod = k(x1,x2)
print("<Φ(x1), Φ(x2)> = k(x1,x2) = ", d_prod)

<Φ(x1), Φ(x2)> = k(x1,x2) =  49


## Exercise 3

Calculate the distance between all the data points in the feature space.
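
The distance in feature space can be written entirely in terms of kernel evaluations by expanding the squared norm with the bilinearity of the inner product:

$$
\begin{array}{rcl}
    \| \Phi(x) - \Phi(y) \|_F^2 & = & \langle \Phi(x) - \Phi(y), \Phi(x) - \Phi(y) \rangle_F \\
                                & = & \langle \Phi(x), \Phi(x) \rangle_F - 2\langle \Phi(x), \Phi(y) \rangle_F + \langle \Phi(y), \Phi(y) \rangle_F \\
                                & = & k(x,x) - 2k(x,y) + k(y,y)
\end{array}
$$

The distance itself is the square root of this quantity.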

In [4]:
# Squared feature-space distance via the kernel: ||Φ(x) - Φ(y)||² = k(x,x) + k(y,y) - 2k(x,y)
sq_dist = lambda x,y : k(x,x) + k(y,y) - 2*k(x,y)
data = [x1, x2, x3]
for x,y in combinations([0,1,2], 2):
    print(f'K(x{x+1}, x{y+1}) = {k(data[x], data[y])}')
    print(f'|| Φ(x{x+1}) - Φ(x{y+1}) ||² = {sq_dist(data[x], data[y])}')
    print(f'd(Φ(x{x+1}),Φ(x{y+1})) =  √{sq_dist(data[x], data[y])} = {np.sqrt(sq_dist(data[x], data[y])):.3f}', end='\n\n')

K(x1, x2) = 49
|| Φ(x1) - Φ(x2) ||² = 95
d(Φ(x1),Φ(x2)) =  √95 = 9.747

K(x1, x3) = 16
|| Φ(x1) - Φ(x3) ||² = 26
d(Φ(x1),Φ(x3)) =  √26 = 5.099

K(x2, x3) = 9
|| Φ(x2) - Φ(x3) ||² = 135
d(Φ(x2),Φ(x3)) =  √135 = 11.619
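
The same pairwise distances can also be obtained in one pass from the Gram matrix. The following is a minimal sketch rather than part of the original solution: it assumes the three points are stacked as rows of a matrix, and the names `X`, `K`, `D2` and `D` are illustrative.

```python
import numpy as np

# Data points from the exercise, one per row.
X = np.array([[1, 2], [3, 1], [0, 1]])

# Gram matrix: K[i, j] = (<x_i, x_j> + 2)^2
K = (X @ X.T + 2) ** 2

# Squared distances: D2[i, j] = K[i, i] + K[j, j] - 2 * K[i, j]
diag = np.diag(K)
D2 = diag[:, None] + diag[None, :] - 2 * K

# Guard against tiny negative values from rounding (not an issue for integer inputs).
D = np.sqrt(np.maximum(D2, 0))
print(np.round(D, 3))
```

The off-diagonal entries of `D` correspond to the three distances printed above, and the diagonal is zero because each point has distance zero to itself.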

