<div class="title">Non-Linear Models and SVMs: Review</div>
<div class="subtitle">Machine Learning</div>
<div class="author">Carlos María Alaíz Gudín &mdash; Universidad Autónoma de Madrid</div>

---

**Configuration**

This cell loads the stylesheet (`style.css`) that sets the general appearance of the Jupyter Notebook.

In [None]:
%%html
<head><link rel="stylesheet" href="style.css"></head>

This cell imports the packages to be used.

In [None]:
# Standard packages.
import matplotlib
from matplotlib import pyplot as plt
import numpy as np
from sklearn.metrics import mean_squared_error
import sys

# Initialisations.
matplotlib.rc("figure", figsize=(15, 5))
sys.dont_write_bytecode = True

# Review of Non-Linear Models and SVMs

## Key Concepts

* Generalized Linear Model.


* Non-Linear Mapping or Embedding.


* Feature Construction.


* Set of Basis Functions.


* Adaptive Basis Functions.


* Dual Problem.


* Kernel Trick.


* Kernel Function.


* Kernel Ridge Regression.


* Margin.


* Hard-Margin Support Vector Machine.


* Dual Formulation.


* Support Vector.


* Soft-Margin Support Vector Machine.


* Hinge Loss Function.


* Support Vector Regression.


* $\epsilon$-Insensitive Loss.


* RBF or Gaussian Kernel.

## Additional Exercises

### Generalized Linear Model

Consider the generalized linear model $f(\mathbf{x}) = \mathbf{w}^\intercal \boldsymbol{\phi}(\mathbf{x}) + b$ built over the mapping
$$ \boldsymbol{\phi}(x_1, x_2) = (x_1^2, x_2^2, x_1 x_2 + 2) , $$
with parameters $\boldsymbol{\theta} = \{ b = 2, \mathbf{w} = (1, 2, 3)^\intercal \}$, and the following dataset:

| $$x_{i, 1}$$ | $$x_{i, 2}$$ | $$y_i$$ |
|--------|---------|--------|
|    3   |    2    |   40   |
|    1   |    4    |   55   |
|    0   |    0    |   10   |

1. Compute the Mean Squared Error (MSE) of the model over this dataset.

In [None]:
################################################################################
# Insert code.
def map_data(X):
    """Apply the mapping phi(x1, x2) = (x1^2, x2^2, x1 * x2 + 2) to each row of X."""
    x1 = X[:, 0]
    x2 = X[:, 1]
    return np.column_stack((x1**2, x2**2, x1 * x2 + 2))


b = 2
w = np.array([1, 2, 3])

x = np.array([[3, 2], [1, 4], [0, 0]])
y = np.array([40, 55, 10])

# Map the patterns and evaluate the model f(x) = w . phi(x) + b.
Phi = map_data(x)
pred = Phi @ w + b

for i in range(len(x)):
    print("\nx = ", x[i], "\nphi(x) = ", Phi[i], "\nf(x) = ", pred[i])

print("\nMSE: {:.2f}".format(mean_squared_error(y, pred)))
################################################################################
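As a cross-check of the cell above, the following sketch recomputes the MSE directly from the residuals, without scikit-learn. The names `Phi`, `residuals` and `mse` are chosen here for illustration only; the data and parameters are copied from the exercise statement.

```python
import numpy as np

# Dataset and parameters copied from the exercise statement.
X = np.array([[3.0, 2.0], [1.0, 4.0], [0.0, 0.0]])
y = np.array([40.0, 55.0, 10.0])
w = np.array([1.0, 2.0, 3.0])
b = 2.0

# Mapping phi(x1, x2) = (x1^2, x2^2, x1 * x2 + 2), applied row by row.
Phi = np.column_stack((X[:, 0] ** 2, X[:, 1] ** 2, X[:, 0] * X[:, 1] + 2))

# Model outputs f(x) = w . phi(x) + b and residuals y - f(x).
pred = Phi @ w + b
residuals = y - pred

# MSE as the mean of the squared residuals.
mse = np.mean(residuals ** 2)

print("f(x)      =", pred)       # 43, 53 and 8
print("residuals =", residuals)  # -3, 2 and 2
print("MSE       = {:.2f}".format(mse))  # 17 / 3, i.e. 5.67
```

By hand, the squared residuals are $9$, $4$ and $4$, so the MSE is $17 / 3 \approx 5.67$.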

### Margin of a Linear Model

Given a 2-dimensional linear classification model with parameters $\boldsymbol{\theta} = \{ b = -1.25, \mathbf{w} = (1, 1)^\intercal \}$, and given the following dataset:

| $$x_{i, 1}$$ | $$x_{i, 2}$$ |
|--------|---------|
|    1   |    0    |
|    0   |    1    |
|    1   |    1    |

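1. Compute the margin of the linear model over this dataset, i.e. the smallest distance from a pattern to the hyperplane $\mathbf{w}^\intercal \mathbf{x} + b = 0$.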

In [None]:
################################################################################
# Insert code.
b = -1.25
w = np.array([1, 1])

x = np.array([[1, 0], [0, 1], [1, 1]])

# The margin is the smallest distance |w . x + b| / ||w|| over all the patterns.
n_w = np.linalg.norm(w)
margin = np.inf
for i in range(len(x)):
    pred = x[i] @ w + b
    dist = np.abs(pred) / n_w
    margin = min(dist, margin)

    print("x        = ", x[i])
    print("Distance = {:.2f}\n".format(dist))

print("\n\tMargin: {:.2f}".format(margin))

# Plot the hyperplane w . x + b = 0 (dashed line) together with the patterns.
xlim = np.array([-0.5, 1.5])
plt.plot(xlim, (-b - xlim * w[0]) / w[1], "--")
plt.scatter(x[:, 0], x[:, 1])
plt.xlabel("$x_1$")
plt.ylabel("$x_2$")
plt.axis("equal")
plt.show()
################################################################################
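The loop above can also be written in vectorised form. The sketch below is one possible way to do it, not the only one; `distances` and `closest` are names introduced here for illustration, and `closest` lists the patterns that attain the margin.

```python
import numpy as np

# Dataset and parameters copied from the exercise statement.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w = np.array([1.0, 1.0])
b = -1.25

# Distance of every pattern to the hyperplane w . x + b = 0.
distances = np.abs(X @ w + b) / np.linalg.norm(w)

# The margin is the smallest of those distances.
margin = distances.min()

# Indices of the patterns lying exactly at the margin.
closest = np.flatnonzero(np.isclose(distances, margin))

print("Distances:", np.round(distances, 2))
print("Margin: {:.2f}".format(margin))
print("Closest patterns:", X[closest])
```

The first two patterns give $|{-0.25}| / \sqrt{2} \approx 0.18$ and the third gives $0.75 / \sqrt{2} \approx 0.53$, so the margin is attained by the first two patterns.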

### Hinge Loss

Given a 2-dimensional linear classification model with parameters $\boldsymbol{\theta} = \{ b = -2.5, \mathbf{w} = (2, 2)^\intercal \}$, and given the following dataset:

| $$x_{i, 1}$$ | $$x_{i, 2}$$ | $$y_i$$ |
|--------|---------|--------|
|    1   |    0    |   -1   |
|    0   |    1    |    1   |
|    1   |    1    |    1   |

1. Compute the hinge loss of each pattern.

In [None]:
################################################################################
# Insert code.
b = -2.5
w = np.array([2, 2])

x = np.array([[1, 0], [0, 1], [1, 1]])
y = np.array([-1, 1, 1])

for i in range(len(x)):
    # Hinge loss: max(1 - y * f(x), 0), with f(x) = w . x + b.
    pred = x[i] @ w + b
    loss = np.maximum(1 - y[i] * pred, 0)

    print("x     = ", x[i])
    print("Error = {:.2f}\n".format(loss))

# Plot the decision boundary f(x) = 0 (dashed) and the lines f(x) = +1 and f(x) = -1 (dotted).
xlim = np.array([-0.5, 1.5])
plt.plot(xlim, (-b - xlim * w[0]) / w[1], "--")
plt.plot(xlim, (-b - xlim * w[0] + 1) / w[1], ":k")
plt.plot(xlim, (-b - xlim * w[0] - 1) / w[1], ":k")
plt.scatter(x[:, 0], x[:, 1], c=y)
plt.xlabel("$x_1$")
plt.ylabel("$x_2$")
plt.axis("equal")
plt.show()
################################################################################
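To help interpret the values printed above, this sketch computes the hinge loss of all the patterns at once and labels each one according to its loss. The three labels are a reading aid introduced here, not part of the exercise: a zero loss means the pattern is correctly classified with $y f(\mathbf{x}) \ge 1$, a loss in $(0, 1)$ means it is on the correct side but inside the margin, and a loss above $1$ means it is misclassified.

```python
import numpy as np

# Dataset and parameters copied from the exercise statement.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([-1.0, 1.0, 1.0])
w = np.array([2.0, 2.0])
b = -2.5

# Functional margins y * f(x), with f(x) = w . x + b.
scores = y * (X @ w + b)

# Hinge loss max(1 - y * f(x), 0) for every pattern.
losses = np.maximum(1 - scores, 0)

# Illustrative labels (hypothetical wording, chosen for this sketch).
status = np.where(
    losses == 0,
    "correct, outside margin",
    np.where(losses < 1, "correct, inside margin", "misclassified"),
)

for xi, li, si in zip(X, losses, status):
    print("x = {}  loss = {:.2f}  ({})".format(xi, li, si))
```

With these parameters the losses are $0.5$, $1.5$ and $0$: the first pattern is inside the margin, the second is misclassified, and the third contributes no loss.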

### $\epsilon$-Insensitive Loss

Given a 2-dimensional linear regression model with parameters $\boldsymbol{\theta} = \{ b = -2.5, \mathbf{w} = (2, 2)^\intercal \}$, and given the following dataset:

| $$x_{i, 1}$$ | $$x_{i, 2}$$ | $$y_i$$ |
|--------|---------|--------|
|    1   |    0    |   -1   |
|    0   |    1    |    0   |
|    1   |    1    |    1.4 |

1. Compute the $\epsilon$-insensitive loss of each pattern for $\epsilon = 0.2$.

In [None]:
################################################################################
# Insert code.
b = -2.5
w = np.array([2, 2])
epsilon = 0.2

x = np.array([[1, 0], [0, 1], [1, 1]])
y = np.array([-1, 0, 1.4])

for i in range(len(x)):
    # Epsilon-insensitive loss: max(|f(x) - y| - epsilon, 0), with f(x) = w . x + b.
    pred = x[i] @ w + b
    diff = np.abs(pred - y[i])
    loss = np.maximum(diff - epsilon, 0)

    print("x     = ", x[i])
    print("y     = {:.2f}".format(y[i]))
    print("f(x)  = {:.2f}".format(pred))
    print("Error = {:.2f}\n".format(loss))
################################################################################
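The same computation can be sketched in vectorised form, next to the plain absolute error for comparison. The names `abs_err`, `losses` and `inside_tube` are introduced here for illustration; `inside_tube` marks the patterns whose error does not exceed $\epsilon$ and therefore contribute no loss.

```python
import numpy as np

# Dataset and parameters copied from the exercise statement.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([-1.0, 0.0, 1.4])
w = np.array([2.0, 2.0])
b = -2.5
epsilon = 0.2

# Model outputs f(x) = w . x + b and absolute errors |f(x) - y|.
pred = X @ w + b
abs_err = np.abs(pred - y)

# Epsilon-insensitive loss: errors up to epsilon are ignored.
losses = np.maximum(abs_err - epsilon, 0)

# Patterns inside the epsilon-tube have zero loss.
inside_tube = losses == 0

for yi, pi, ei, li in zip(y, pred, abs_err, losses):
    print("y = {:5.2f}  f(x) = {:5.2f}  |error| = {:.2f}  loss = {:.2f}".format(yi, pi, ei, li))
```

Here the absolute errors are $0.5$, $0.5$ and $0.1$, so the losses are $0.3$, $0.3$ and $0$: only the third pattern falls inside the tube.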