# 4. Constructing kernels
In class, we saw that by choosing a kernel $K(x, z) = \phi(x)^T \phi(z)$, we can implicitly map data to a high-dimensional space, and have the SVM algorithm work in that space. One way to generate kernels is to explicitly define the mapping $\phi$ to a higher-dimensional space, and then work out the corresponding $K$.
In this question, however, we are interested in constructing kernels directly. That is, suppose we have a function $K(x, z)$ that we think gives an appropriate similarity measure for our learning problem, and we are considering plugging $K$ into the SVM as the kernel function. For $K(x,z)$ to be a valid kernel, it must correspond to an inner product in some higher-dimensional space resulting from some feature mapping $\phi$. Mercer's theorem tells us that $K(x,z)$ is a (Mercer) kernel if and only if for any finite set $\{x^{(1)}, \ldots , x^{(m)}\}$, the square matrix $K\in \mathbb{R}^{m\times m}$ whose entries are given by $K_{ij} = K(x^{(i)},x^{(j)})$ is symmetric and positive semidefinite. You can find more details about Mercer’s theorem in the notes, though the description above is sufficient for this problem.
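
The Mercer condition above can be probed numerically on a sample of points. The following is a minimal sketch, assuming NumPy; the helper names `gram_matrix` and `is_symmetric_psd` and the tolerance `tol` are illustrative choices, not part of the problem statement. A failed check on any finite set disproves that a function is a kernel, whereas a passed check on one sample is only supporting evidence, not a proof.

```python
import numpy as np

def gram_matrix(kernel, points):
    """Build the m x m matrix with entries kernel(points[i], points[j])."""
    m = len(points)
    return np.array([[kernel(points[i], points[j]) for j in range(m)]
                     for i in range(m)])

def is_symmetric_psd(G, tol=1e-8):
    """Return True if G is symmetric and its smallest eigenvalue is >= -tol."""
    if not np.allclose(G, G.T, atol=tol):
        return False
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(G)
    return bool(eigenvalues.min() >= -tol)

# Hypothetical sample: 6 random points in R^3.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))

# The linear kernel x^T z is a valid kernel, so its Gram matrix passes the check.
linear = lambda x, z: float(x @ z)
print(is_symmetric_psd(gram_matrix(linear, X)))  # True
```

The tolerance is needed because the Gram matrix here has rank at most 3, so several eigenvalues are zero up to floating-point rounding.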

Now here comes the question: Let $K_1, K_2$ be kernels over $\mathbb{R}^n\times \mathbb{R}^n$, let $a\in \mathbb{R}^+$ be a positive real number, let $f:\mathbb{R}^n\rightarrow \mathbb{R}$ be a real-valued function, let $\phi:\mathbb{R}^n\rightarrow \mathbb{R}^d$ be a function mapping from $\mathbb{R}^n$ to $\mathbb{R}^d$, let $K_3$ be a kernel over $\mathbb{R}^d\times \mathbb{R}^d$, and let $p(x)$ be a polynomial over $x$ with positive coefficients.

For each of the functions $K$ below, state whether it is necessarily a kernel. If you think it is, prove it; if you think it isn't, give a counter-example.


<b>Hint:</b> For part (e), the answer is that K is indeed a kernel. You still have to prove it, though. (This one may be harder than the rest.) This result may also be useful for another part of the problem.

<b>(a)</b> [1 points] $K(x, z) = K_1(x, z) + K_2(x, z)$

### Answer: 
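Yes. By Mercer's theorem, $K(x,z)$ is a kernel: for any finite set of points, the matrices of $K_1$ and $K_2$ are symmetric positive semidefinite, and the sum of two symmetric positive semidefinite matrices is again symmetric positive semidefinite, since $z(K_1+K_2)z^T = zK_1z^T + zK_2z^T \geq 0$ for every $z\in\mathbb{R}^m$.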

<b>(b)</b> [1 points] $K(x, z) = K_1(x, z) − K_2(x, z)$

### Answer:
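Not necessarily. For instance, let $K_1(x, z) = x^Tz$ and $K_2(x, z)= 2K_1(x, z)$, which is a kernel by part (c). Then $K(x, z) = -x^Tz$, so $K(x, x) = -\|x\|_2^2 < 0$ for any $x \neq 0$. The $1\times 1$ matrix built from the single point $x$ is therefore not positive semidefinite, and by Mercer's theorem $K$ is not a kernel.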
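
The counter-example can be confirmed numerically. This is a small sketch assuming NumPy, with the two standard basis vectors of $\mathbb{R}^2$ chosen purely for illustration: the resulting matrix of $K = K_1 - K_2$ is $-I$, whose eigenvalues are both $-1$.

```python
import numpy as np

# Two illustrative points in R^2 (any nonzero points would do).
points = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

k1 = lambda x, z: float(x @ z)        # linear kernel
k2 = lambda x, z: 2.0 * float(x @ z)  # 2 * K_1, also a kernel
k = lambda x, z: k1(x, z) - k2(x, z)  # the candidate K = K_1 - K_2

# Matrix with entries K(x_i, x_j).
G = np.array([[k(p, q) for q in points] for p in points])
eigenvalues = np.linalg.eigvalsh(G)

print(G)            # equals minus the 2 x 2 identity
print(eigenvalues)  # both eigenvalues are negative, so G is not PSD
```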

<b>(c)</b> [1 points] $K(x, z) = aK_1(x, z)$

### Answer: 
Yes. Multiplying a symmetric positive semidefinite matrix by a positive scalar $a$ gives a symmetric positive semidefinite matrix, since $z(aK_1)z^T = a\,zK_1z^T \geq 0$.

<b>(d)</b> [1 points] $K(x, z) = −aK_1(x, z)$

### Answer:
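No, not necessarily. Multiplying a nonzero symmetric positive semidefinite matrix by the negative scalar $-a$ gives a matrix that is not positive semidefinite. For example, with $K_1(x, z) = x^Tz$ we get $K(x, x) = -a\|x\|_2^2 < 0$ for any $x \neq 0$.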
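
Parts (c) and (d) can be compared side by side on one matrix. The sketch below assumes NumPy, the linear kernel for $K_1$, and an arbitrary illustrative value `a = 2.5`; scaling by $a$ scales every eigenvalue by $a$, while scaling by $-a$ flips their signs.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(5, 3))  # hypothetical sample of 5 points in R^3
K1 = X @ X.T                 # matrix of the linear kernel on these points
a = 2.5                      # illustrative positive constant

eig_pos = np.linalg.eigvalsh(a * K1)   # part (c)
eig_neg = np.linalg.eigvalsh(-a * K1)  # part (d)

print(eig_pos.min() >= -1e-8)  # True: a * K1 is PSD up to rounding
print(eig_neg.min() < 0)       # True: -a * K1 has a negative eigenvalue
```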

<b>(e)</b> [5 points] $K(x, z) = K_1(x, z)K_2(x, z)$

### Answer:
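Yes. We must show that, for any finite set $\{x^{(1)}, \ldots , x^{(m)}\}$, the square matrix $K\in \mathbb{R}^{m\times m}$ whose entries are given by $K_{ij} = K(x^{(i)},x^{(j)})$ is symmetric and positive semidefinite. Let $K_1$ and $K_2$ also denote the corresponding matrices of the two given kernels, so that $K = K_1\circ K_2$, where $\circ$ refers to the elementwise product of two matrices (the Hadamard product). $K$ is symmetric because $K_{ij} = (K_1)_{ij}(K_2)_{ij} = (K_1)_{ji}(K_2)_{ji} = K_{ji}$. Since $K_1$ and $K_2$ are symmetric and positive semidefinite, there are $S_1,S_2\in\mathbb{R}^{m\times m}$ such that $K_1=S_1S_1^T$ and $K_2=S_2S_2^T$, that is, $(K_1)_{ij} = \sum_k (S_1)_{ik}(S_1)_{jk}$ and $(K_2)_{ij} = \sum_l (S_2)_{il}(S_2)_{jl}$.
To prove that $K$ is positive semidefinite, it suffices to show that $zKz^T\geq 0$ for each (row vector) $z\in \mathbb{R}^m$.
For a given vector $z\in \mathbb{R}^m$, we have

\begin{align*}
zKz^T
& = \sum_i\sum_j z_i (K_1)_{ij}(K_2)_{ij} z_j\\
& = \sum_i\sum_j z_i z_j \sum_k (S_1)_{ik}(S_1)_{jk} \sum_l (S_2)_{il}(S_2)_{jl}\\
& = \sum_k\sum_l \left(\sum_i z_i (S_1)_{ik}(S_2)_{il}\right)\left(\sum_j z_j (S_1)_{jk}(S_2)_{jl}\right)\\
& = \sum_k\sum_l \left(\sum_i z_i (S_1)_{ik}(S_2)_{il}\right)^2\\
&\geq 0.
\end{align*}
Note that in general $(S_1S_1^T)\circ(S_2S_2^T) \neq (S_1\circ S_2)(S_1\circ S_2)^T$, which is why the expansion runs over all pairs $(k, l)$ of columns rather than over a single index.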

<b>More generally, this is the following theorem.</b>

<b>Theorem (Schur product theorem).</b>
The Hadamard product of two positive (semi)definite matrices is also a positive (semi)definite matrix.
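
The Schur product theorem can be checked numerically on random matrices. This is a sketch assuming NumPy, with random factors `S1` and `S2` standing in for the factorizations $K_1 = S_1S_1^T$ and $K_2 = S_2S_2^T$. It also verifies the identity $K_1\circ K_2 = \sum_{k,l} (u_k\circ v_l)(u_k\circ v_l)^T$, where $u_k$ and $v_l$ are the columns of $S_1$ and $S_2$; this writes the Hadamard product as a sum of rank-one positive semidefinite matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5
S1 = rng.normal(size=(m, m))  # hypothetical factor of K1
S2 = rng.normal(size=(m, m))  # hypothetical factor of K2
K1 = S1 @ S1.T                # symmetric PSD by construction
K2 = S2 @ S2.T                # symmetric PSD by construction

K = K1 * K2                   # elementwise (Hadamard) product
min_eig = np.linalg.eigvalsh(K).min()

# Sum of rank-one terms built from elementwise products of column pairs.
expansion = sum(
    np.outer(S1[:, k] * S2[:, l], S1[:, k] * S2[:, l])
    for k in range(m)
    for l in range(m)
)

print(min_eig >= -1e-8)           # smallest eigenvalue is nonnegative up to rounding
print(np.allclose(K, expansion))  # the rank-one expansion reproduces K1 ∘ K2
```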

__(f)__ [3 points] $K(x,z) = f(x)f(z)$

### <font color=red> Answer:</font>
<font color=blue>
Yes. We must show that, for any finite set $\{x^{(1)}, \ldots , x^{(m)}\}$, the square matrix $K\in \mathbb{R}^{m\times m}$ whose entries are given by $K_{ij} = K(x^{(i)},x^{(j)})$ is symmetric and positive semidefinite. Symmetry is immediate, since $f(x^{(i)})f(x^{(j)}) = f(x^{(j)})f(x^{(i)})$.
To prove that $K$ is positive semidefinite, it suffices to show that $zKz^T\geq 0$ for each  $z\in \mathbb{R}^m$. 
For a given vector $z\in \mathbb{R}^m$, we have 
    
\begin{align*}
zKz^T 
& = \sum_i\sum_j z_iK(x^{(i)},x^{(j)})z_j\\
& = \sum_i\sum_j z_if(x^{(i)})f(x^{(j)})z_j\\
& = \sum_i z_if(x^{(i)})\sum_jf(x^{(j)})z_j\\
& = \left(\sum_i z_if(x^{(i)})\right)^2\\
&\geq 0.\\
\end{align*}
</font>
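
The derivation for part (f) can be mirrored in a few lines. This sketch assumes NumPy and an arbitrary, purely illustrative choice of $f$: the matrix of $K$ is the outer product of the vector of values $f(x^{(i)})$ with itself, so the quadratic form collapses to a single square.

```python
import numpy as np

def f(x):
    # Hypothetical real-valued function, chosen only for illustration.
    return np.sin(x.sum()) + 0.5

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))  # 5 sample points in R^3
z = rng.normal(size=5)       # arbitrary coefficient vector

values = np.array([f(x) for x in X])
K = np.outer(values, values)  # K_ij = f(x_i) * f(x_j)

quad = z @ K @ z              # the quadratic form z K z^T
square = (z @ values) ** 2    # (sum_i z_i f(x_i))^2

print(np.isclose(quad, square))  # the two expressions agree
```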

__(g)__ [3 points] $K(x, z) = K_3(\phi(x), \phi(z))$

### <font color=red> Answer:</font>
<font color=blue>
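Yes. For any finite set $\{x^{(1)}, \ldots , x^{(m)}\}\subset \mathbb{R}^n$, the images $\phi(x^{(1)}), \ldots , \phi(x^{(m)})$ are points in $\mathbb{R}^d$. Since $K_3$ is a kernel over $\mathbb{R}^d\times \mathbb{R}^d$, the square matrix $K\in \mathbb{R}^{m\times m}$ whose entries are given by $K_{ij} = K_3(\phi(x^{(i)}),\phi(x^{(j)}))$ is symmetric and positive semidefinite; it does not matter that these points were obtained through the mapping $\phi(\cdot)$. Equivalently, if $K_3(u, v) = \psi(u)^T\psi(v)$ for some feature mapping $\psi$, then $K(x, z) = \psi(\phi(x))^T\psi(\phi(z))$ is the inner product for the feature mapping $\psi\circ\phi$.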
</font>
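
A numerical illustration of part (g), assuming NumPy: here $K_3$ is taken to be the Gaussian (RBF) kernel on $\mathbb{R}^2$, which is a valid kernel, and `phi` is a hypothetical map from $\mathbb{R}^3$ to $\mathbb{R}^2$ chosen only for this example.

```python
import numpy as np

def phi(x):
    # Hypothetical mapping from R^3 to R^2, for illustration only.
    return np.array([x[0] * x[1], np.tanh(x[2])])

def k3(u, v):
    # Gaussian (RBF) kernel on R^2 with unit bandwidth parameter.
    return np.exp(-np.sum((u - v) ** 2))

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 3))            # sample points in R^n with n = 3
Phi = np.array([phi(x) for x in X])    # their images in R^d with d = 2

# K_ij = K_3(phi(x_i), phi(x_j))
K = np.array([[k3(u, v) for v in Phi] for u in Phi])
min_eig = np.linalg.eigvalsh(K).min()

print(np.allclose(K, K.T))  # symmetric
print(min_eig >= -1e-8)     # positive semidefinite up to rounding
```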

__(h)__ [3 points] $K(x, z) = p(K_1(x, z))$

### <font color=red> Answer:</font>
<font color=blue>
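Yes. Write $p(t) = \sum_{k=0}^{r} c_k t^k$ with positive coefficients $c_k$. For each $k\geq 1$, the power $K_1(x, z)^k$ is a kernel by applying (e) repeatedly, so $c_kK_1(x, z)^k$ is a kernel by (c). The constant term $c_0$ is the kernel $f(x)f(z)$ of part (f) with the constant function $f \equiv \sqrt{c_0}$. Finally, the sum of these terms is a kernel by applying (a) repeatedly.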
</font>
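
The construction in part (h) can be sketched numerically, assuming NumPy, the linear kernel for $K_1$, and an illustrative list of positive coefficients `coeffs`. Powers are taken elementwise, matching $K_1(x, z)^k$, and the $k = 0$ term contributes the constant matrix $c_0\,\mathbf{1}\mathbf{1}^T$.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))  # hypothetical sample of 6 points in R^3
K1 = X @ X.T                 # matrix of the linear kernel on these points

# Illustrative positive coefficients c_0, c_1, c_2, c_3 of p.
coeffs = [0.5, 2.0, 1.0, 3.0]

# p applied entrywise: K_ij = sum_k c_k * (K1_ij)^k, using elementwise powers.
K = sum(c * K1**k for k, c in enumerate(coeffs))
min_eig = np.linalg.eigvalsh(K).min()

print(np.allclose(K, K.T))  # symmetric
print(min_eig >= -1e-8)     # positive semidefinite up to rounding
```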