# Margin definition
Recall that the signed distance to a point $x$ from a hyperplane $\theta, \theta_{0}$ is $s d\left(x, \theta, \theta_{0}\right)=\frac{\theta^{T} x+\theta_{0}}{\|\theta\|}$.


\begin{exercise}


You start with a hyperplane $\theta, \theta_{0}$ and a point $x$. Suppose a new separator is given, where $\hat{\theta}=-\theta$ and $\hat{\theta}_{0}=-\theta_{0}$.

Which of the following is true?

- the signed distance changes sign but not magnitude

- the signed distance changes magnitude but not sign

- the signed distance stays the same 

- both the sign and the magnitude may change
\end{exercise}

Solution: the signed distance changes sign but not magnitude.

Explanation: for the separator $-\theta, -\theta_{0}$, $s d\left(x, -\theta, -\theta_{0}\right)=\frac{-\theta^{T} x-\theta_{0}}{\|-\theta\|}=-\frac{\theta^{T} x+\theta_{0}}{\|\theta\|}$. The norm is unchanged because $\|-\theta\|=\|\theta\|$, so only the sign flips.
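
As a quick numerical check of this identity, here is a minimal sketch. The helper name `signed_distance` and the sample values of `x`, `th`, and `th0` are assumptions chosen for illustration; they are not part of the exercise.

```python
import numpy as np

def signed_distance(x, th, th0):
    """Signed distance from the hyperplane (th, th0) to the point x."""
    return (th @ x + th0) / np.linalg.norm(th)

# Hypothetical example values, chosen only for illustration.
x = np.array([3.0, 2.0])
th = np.array([1.0, 1.0])
th0 = -4.0

sd_original = signed_distance(x, th, th0)
sd_negated = signed_distance(x, -th, -th0)   # negate both theta and theta_0

print("original:", sd_original)
print("negated :", sd_negated)
```

The two printed values should have the same magnitude and opposite signs.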



\begin{exercise}


You start with a hyperplane $\theta, \theta_{0}$ and a point $x$. Suppose a new separator is given, where $\hat{\theta}=\theta$ and $\hat{\theta}_{0}=-\theta_{0}$.

Which of the following is true?

- the signed distance changes sign but not magnitude

- the signed distance changes magnitude but not sign

- the signed distance stays the same 

- both the sign and the magnitude may change
\end{exercise}

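Solution: both the sign and the magnitude may change.

Explanation: for the separator $\theta, -\theta_{0}$, $s d\left(x, \theta, -\theta_{0}\right)=\frac{\theta^{T} x-\theta_{0}}{\|\theta\|}$. The numerator changes from $\theta^{T} x+\theta_{0}$ to $\theta^{T} x-\theta_{0}$, which in general has a different magnitude and, depending on $x$, possibly a different sign.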
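
To see that the outcome depends on the point, the sketch below negates only the offset for two sample points. The helper `signed_distance`, the separator, and both points are illustrative assumptions.

```python
import numpy as np

def signed_distance(x, th, th0):
    """Signed distance from the hyperplane (th, th0) to the point x."""
    return (th @ x + th0) / np.linalg.norm(th)

# Hypothetical separator used only for this illustration.
th = np.array([1.0, 1.0])
th0 = -4.0

# For this point the sign stays positive but the magnitude changes.
x_a = np.array([3.0, 2.0])
before_a = signed_distance(x_a, th, th0)
after_a = signed_distance(x_a, th, -th0)

# For this point both the sign and the magnitude change.
x_b = np.array([1.0, 1.0])
before_b = signed_distance(x_b, th, th0)
after_b = signed_distance(x_b, th, -th0)

print("x_a:", before_a, "->", after_a)
print("x_b:", before_b, "->", after_b)
```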




\begin{exercise}



The margin of labeled point $(x, y)$ with respect to separator $\theta, \theta_{0}$ is:
$$
\gamma\left(x, y, \theta, \theta_{0}\right)=\frac{y\left(\theta^{T} x+\theta_{0}\right)}{\|\theta\|}
$$
Let sd stand for $s d\left(x, \theta, \theta_{0}\right)$, the signed distance from the separator to $x$. Define the margin in terms of sd and $\mathrm{y}$, the label of $x$. Note that both of these are scalars. Provide an expression in Python syntax.

$$
\gamma\left(x, y, \theta, \theta_{0}\right)=
$$

\end{exercise}

Solution: `y * sd`, that is, $\gamma\left(x, y, \theta, \theta_{0}\right)= y \cdot s d\left(x, \theta, \theta_{0}\right)$.

Explanation: since $s d\left(x, \theta, \theta_{0}\right)=\frac{\theta^{T} x+\theta_{0}}{\|\theta\|}$, multiplying both sides by $y$ gives $y \cdot s d\left(x, \theta, \theta_{0}\right)= \frac{y\left(\theta^{T} x+\theta_{0}\right)}{\|\theta\|}=\gamma\left(x, y, \theta, \theta_{0}\right)$.
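
The relationship can be sketched in code as follows. The function names `signed_distance` and `margin` and the sample point are assumptions for illustration only.

```python
import numpy as np

def signed_distance(x, th, th0):
    """Signed distance from the hyperplane (th, th0) to the point x."""
    return (th @ x + th0) / np.linalg.norm(th)

def margin(x, y, th, th0):
    """Margin of the labeled point (x, y): the label times the signed distance."""
    sd = signed_distance(x, th, th0)
    return y * sd

# Hypothetical labeled point and separator.
x, y = np.array([1.0, 1.0]), -1
th, th0 = np.array([1.0, 1.0]), -4.0

# The margin written out directly from its definition, for comparison.
direct = y * (th @ x + th0) / np.linalg.norm(th)
print(margin(x, y, th, th0), direct)
```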





\begin{exercise}


What is the sign of the signed distance when the prediction is incorrect?

- positive 
- negative 
- could be either
\end{exercise}


Solution: could be either.

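Explanation: the prediction is the sign of the signed distance, so an incorrect prediction can be either a false positive (positive signed distance, label $-1$) or a false negative (negative signed distance, label $+1$).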



\begin{exercise}

What is the sign of the margin when the prediction is incorrect?

- positive 
- negative 
- could be either
\end{exercise}


Solution: negative.

Explanation: when the prediction is incorrect, the label $y$ and the signed distance (whose sign gives the prediction) have opposite signs, so their product, the margin, is negative.
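
The two preceding answers can be checked by enumerating the possible sign combinations. This is a small sketch; the signed-distance values in `cases` are made up for illustration.

```python
# Each pair is (signed distance, true label); the values are illustrative.
cases = [(1.5, 1), (1.5, -1), (-0.7, 1), (-0.7, -1)]

results = []
for sd, y in cases:
    prediction = 1 if sd > 0 else -1   # the classifier predicts the sign of sd
    margin = y * sd                    # margin = label * signed distance
    correct = prediction == y
    results.append((sd, correct, margin))
    print(f"sd={sd:+.1f}  y={y:+d}  correct={correct}  margin={margin:+.1f}")
```

In the incorrect cases the signed distance appears with either sign, while the margin is negative each time.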


# Margin practice

\begin{exercise}
What are the margins of the labeled points $(x, y)=((3,2),+1),((1,1),-1)$, and $((4,2),-1)$ with respect to the separator defined by $\theta=(1,1), \theta_{0}=-4$ ? The situation is illustrated in the figure below.

![](margin.png)
\end{exercise}

Solution: the margins are $\frac{\sqrt{2}}{2}$, $\sqrt{2}$, and $-\sqrt{2}$, respectively.

Explanation: $\gamma\left(x, y, \theta, \theta_{0}\right)=\frac{y\left(\theta^{T} x+\theta_{0}\right)}{\|\theta\|}$

For $(x, y)=((3,2),+1)$: $\gamma\left(\begin{bmatrix} 3 \\ 2 \end{bmatrix}, 1, \begin{bmatrix} 1 \\ 1 \end{bmatrix}, -4\right)=\frac{1 \cdot \left(\begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix}-4\right)}{\sqrt{1^{2}+1^{2}}}=\frac{5-4}{\sqrt{2}}=\frac{\sqrt{2}}{2}$

For $(x, y)=((1,1),-1)$: $\gamma\left(\begin{bmatrix} 1 \\ 1 \end{bmatrix}, -1, \begin{bmatrix} 1 \\ 1 \end{bmatrix}, -4\right)=\frac{-1 \cdot \left(\begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix}-4\right)}{\sqrt{1^{2}+1^{2}}}=\frac{-(2-4)}{\sqrt{2}}=\sqrt{2}$

For $(x, y)=((4,2),-1)$: $\gamma\left(\begin{bmatrix} 4 \\ 2 \end{bmatrix}, -1, \begin{bmatrix} 1 \\ 1 \end{bmatrix}, -4\right)=\frac{-1 \cdot \left(\begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 4 \\ 2 \end{bmatrix}-4\right)}{\sqrt{1^{2}+1^{2}}}=\frac{-(6-4)}{\sqrt{2}}=-\sqrt{2}$

The third margin is negative, so that point is misclassified by this separator.
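
The same three margins can be computed in one vectorized step. This is a sketch using the values from the exercise; the array layout (one point per row) and the variable names are assumptions.

```python
import numpy as np

th = np.array([1.0, 1.0])
th0 = -4.0
points = np.array([[3.0, 2.0],
                   [1.0, 1.0],
                   [4.0, 2.0]])    # one point per row
labels = np.array([1, -1, -1])

# margin = y * (theta^T x + theta_0) / ||theta||, evaluated for every row at once
margins = labels * (points @ th + th0) / np.linalg.norm(th)
print(margins)
```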


# Max Margin Separator

\begin{exercise}

Consider the four points and separator:
$$
\begin{aligned}
&\texttt{data = np.array([[1, 1, 3, 3], [3, 1, 4, 2]])} \\
&\texttt{labels = np.array([[-1, -1, 1, 1]])} \\
&\texttt{th = np.array([[0, 1]]).T} \\
&\texttt{th0 = -3}
\end{aligned}
$$
The situation is shown below:

![](sep1.png)

Enter the four margins in order as a Python list of four numbers.
\end{exercise}


In [1]:
import numpy as np
data = np.array([[1, 1, 3, 3],[3, 1, 4, 2]])
labels = np.array([[-1, -1, 1, 1]])
th = np.array([[0, 1]]).T
th0 = -3

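Solution: `[0, 2, 1, -1]`

Explanation: because $\|\theta\|=1$, the signed distance of each point is $\theta^{T} x+\theta_{0}=x_{2}-3$, and multiplying it by the label gives the margin. The computation below confirms this (NumPy prints the first margin as `-0.`, which equals 0).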



In [19]:
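# Signed distance of every column of data; np.linalg.norm(th) is 1 here because th = [0, 1]^T.
sd = (np.dot(th.T, data) + th0) / np.linalg.norm(th)
print('signed distance :', sd)
# Margin = label * signed distance, computed elementwise.
margin = np.multiply(labels, sd)
print('margin :', margin)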

signed distance : [[ 0. -2.  1. -1.]]
margin : [[-0.  2.  1. -1.]]


\begin{exercise}

A maximum margin separator is a separator that maximizes the minimum margin between that separator and all points in the dataset.
Enter $\theta$ and $\theta_{0}$ for a maximum margin separator as a Python list of three numbers.
Remember that the equation of the separator is $\theta^{T} x+\theta_{0}=0$.
\end{exercise}

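Solution: `[1, 0, -2]` (any positive multiple is also a maximum margin separator).

Explanation: the points labeled $-1$ both have $x_{1}=1$ and the points labeled $+1$ both have $x_{1}=3$, so the vertical line $x_{1}=2$, i.e. $\theta=(1,0)$, $\theta_{0}=-2$, gives every point a margin of 1. No separator can do better, because the two classes are only 2 apart in the $x_{1}$ direction. The separator from the previous exercise, $\theta=(0,1)$, $\theta_{0}=-3$, has a minimum margin of $-1$, so it is not a maximum margin separator.

In [1]:
# a maximum margin separator, written as [theta_1, theta_2, theta_0]
separator = [1, 0, -2]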
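
One way to compare candidate separators is to compute each one's minimum margin on the four points. The sketch below does this for the separator from the previous exercise, `[0, 1, -3]`, and for the vertical line $x_1 = 2$, `[1, 0, -2]`. The helper name `min_margin` is an assumption for illustration.

```python
import numpy as np

data = np.array([[1, 1, 3, 3], [3, 1, 4, 2]])   # one point per column
labels = np.array([[-1, -1, 1, 1]])

def min_margin(separator):
    """Smallest margin over the dataset for separator = [th_1, th_2, th0]."""
    th = np.array(separator[:2], dtype=float).reshape(2, 1)
    th0 = separator[2]
    margins = labels * (th.T @ data + th0) / np.linalg.norm(th)
    return margins.min()

print(min_margin([0, 1, -3]))   # separator from the previous exercise
print(min_margin([1, 0, -2]))   # the vertical line x_1 = 2
```

A separator with a larger minimum margin is preferable under the definition above, and a negative minimum margin means at least one point is misclassified.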

\begin{exercise}

If you scaled this separator by a positive constant $k$ (i.e., replace $\theta$ by $k \theta$, and $\theta_{0}$ by $k \theta_{0}$), would it still be a maximum margin separator?

yes 

no 

Justify your answer
\end{exercise}

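Solution: yes.

Explanation: for $k>0$, $\theta^{T} x+\theta_{0}=0 \Leftrightarrow k\theta^{T} x+k\theta_{0}=0$, so the scaled separator describes the same hyperplane. The margins are also unchanged, since $\frac{y\left(k\theta^{T} x+k\theta_{0}\right)}{\|k\theta\|}=\frac{k\, y\left(\theta^{T} x+\theta_{0}\right)}{k\|\theta\|}=\gamma\left(x, y, \theta, \theta_{0}\right)$.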
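
The invariance under positive scaling can also be checked numerically. The sketch below uses the four points from the exercise and, as an assumed example separator, $\theta = (1, 0)$, $\theta_0 = -2$; the helper name `margins` is hypothetical.

```python
import numpy as np

data = np.array([[1, 1, 3, 3], [3, 1, 4, 2]])   # one point per column
labels = np.array([[-1, -1, 1, 1]])

# Example separator, assumed here for illustration.
th = np.array([[1.0], [0.0]])
th0 = -2.0

def margins(th, th0):
    """Margins of all four points with respect to (th, th0)."""
    return labels * (th.T @ data + th0) / np.linalg.norm(th)

base = margins(th, th0)
for k in (0.5, 2.0, 10.0):
    scaled = margins(k * th, k * th0)
    print(k, np.allclose(scaled, base))
```

The check is restricted to positive `k`; a negative `k` would flip the sign of every margin.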



Enraged by Gossip Girl's most recent social media post, John and George team up to expose Gossip Girl's secret identity. They determine that the suspects consist of people who attended the same party as Gossip Girl.

John and George know of four ATTENDEES and four NO-SHOWS. The GPS locations of the four ATTENDEES ($+$) and four NO-SHOWS ($-$) are plotted below.

![](https://cdn.mathpix.com/snip/images/hP9eR7pYr6sU1ok-SwxPg2aPmSbjvD-53lWe1-09hgQ.original.fullsize.png)

\begin{exercise}

Which of the following kernel functions can John and George use to perfectly classify all people as ATTENDEE or NO-SHOW in the above dataset? Circle all that apply or circle NONE OF THESE if none apply.

LINEAR

QUADRATIC

CUBIC

RADIAL BASIS

NONE OF THESE
\end{exercise}

Your answer: QUADRATIC and RADIAL BASIS should both work.
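
For reference, the kernel families named in the question can be sketched as below. The exact forms are assumptions: the polynomial kernel is written with a $+1$ offset, the radial basis kernel uses a width parameter `sigma`, and the sample vectors are made up. The dataset in the figure is not reproduced here, so this does not verify the answer.

```python
import numpy as np

def linear_kernel(u, v):
    """K(u, v) = u . v"""
    return u @ v

def polynomial_kernel(u, v, degree):
    """K(u, v) = (u . v + 1)^degree; degree 2 is quadratic, degree 3 is cubic."""
    return (u @ v + 1.0) ** degree

def rbf_kernel(u, v, sigma=1.0):
    """K(u, v) = exp(-||u - v||^2 / (2 sigma^2))"""
    return np.exp(-np.sum((u - v) ** 2) / (2.0 * sigma ** 2))

# Hypothetical sample vectors.
u = np.array([1.0, 2.0])
v = np.array([3.0, 0.5])

print(linear_kernel(u, v))
print(polynomial_kernel(u, v, 2), polynomial_kernel(u, v, 3))
print(rbf_kernel(u, v))
```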

\begin{exercise}
John and George apply a polar transformation on the data, $\varphi(\langle x, y\rangle)=\langle r, \theta\rangle$.

Below is a graph of this transformation.

- Describe the SVM decision boundary.
- Describe the SVM gutter lines that correspond to +/-.
- Indicate the support vectors.

![](kernel1.png)
\end{exercise}

Your answer: the best decision boundary is the line $r=2.5$, perpendicular to the $r$ axis. The gutter lines for + and - are the lines $r=2$ and $r=3$, respectively. Support vectors: D, L, M.
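
The polar transformation itself can be sketched as follows. The function name `to_polar` and the sample points are assumptions; note that `np.arctan2` returns angles in $[-\pi, \pi]$, which may differ from the angle convention used in the figure.

```python
import numpy as np

def to_polar(points):
    """Map rows of (x, y) to rows of (r, theta), with theta in [-pi, pi]."""
    x, y = points[:, 0], points[:, 1]
    r = np.hypot(x, y)            # distance from the origin
    theta = np.arctan2(y, x)      # angle measured from the positive x axis
    return np.column_stack([r, theta])

# Hypothetical points, not the ones in the figure.
points = np.array([[2.0, 0.0],
                   [0.0, 3.0],
                   [-1.0, 0.0]])
polar = to_polar(points)
print(polar)
```

After this mapping, points at the same distance from the origin share the same $r$ coordinate, which is why a boundary of the form $r = \text{const}$ becomes a straight line.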


\begin{exercise}
For your decision boundary, calculate the values of the normal vector $\overrightarrow{\mathbf{w}}$ and the offset $\boldsymbol{b}$, given the boundary equation $\overrightarrow{\mathbf{w}} \cdot \overrightarrow{\mathbf{x}}+\boldsymbol{b}=\mathbf{0}$. The classifier must produce an output of $+\mathbf{1}$ for ATTENDEES (+) and an output of $-1$ for NO-SHOWS (-).

Indicate the vector $\overrightarrow{\mathbf{w}}=   $ and  $b=  $ 

\end{exercise}


Your Answer: 
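
One possible way to work this out, assuming the gutters described in the previous answer (output $+1$ on the line $r = 2$ and $-1$ on the line $r = 3$) and a boundary that does not depend on $\theta$, is to solve two linear equations for the $r$ component of $\overrightarrow{\mathbf{w}}$ and for $b$. These gutter positions have not been checked against the figure here, so treat the result as a sketch under those assumptions.

```python
import numpy as np

# Assumed gutter conditions (taken from the previous answer, not from the figure):
#   w_r * 2 + b = +1   on the + gutter, r = 2
#   w_r * 3 + b = -1   on the - gutter, r = 3
A = np.array([[2.0, 1.0],
              [3.0, 1.0]])
outputs = np.array([1.0, -1.0])

w_r, b = np.linalg.solve(A, outputs)
w = np.array([w_r, 0.0])   # no dependence on theta, so its component is 0

print("w =", w, "b =", b)
```

Under these assumptions the boundary $\overrightarrow{\mathbf{w}} \cdot \overrightarrow{\mathbf{x}} + b = 0$ falls at $r = 2.5$, halfway between the two gutters.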


\begin{exercise}
John and George give you the GPS locations in polar coordinates of three people. For each of the three locations listed below, circle the ONE best answer indicating how the SVM would classify the person at that location.
$\begin{array}{llll}(\mathbf{r}=\mathbf{1}, \boldsymbol{\theta}=\boldsymbol{\pi}): & \text { ATTENDEE }(+) & \text { NO-SHOW }(-) & \text { CAN'T TELL } \\ (\mathbf{r}=\mathbf{2 . 5}, \boldsymbol{\theta}=\boldsymbol{\pi} / \mathbf{2}): & \text { ATTENDEE }(+) & \text { NO-SHOW }(-) & \text { CAN'T TELL } \\ (\mathbf{r}=\mathbf{3}, \boldsymbol{\theta}=\mathbf{2} \boldsymbol{\pi}): & \text { ATTENDEE }(+) & \text { NO-SHOW }(-) & \text { CAN'T TELL }\end{array}$
\end{exercise}

your answer here:
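
A sketch of how the three locations could be classified, assuming a classifier with $\overrightarrow{\mathbf{w}} = (-2, 0)$ and $b = 5$ in $(r, \theta)$ coordinates. These values are an assumption consistent with a boundary at $r = 2.5$ and gutters at $r = 2$ and $r = 3$; they are not confirmed by the figure. The helper name `classify` is hypothetical.

```python
import numpy as np

# Assumed classifier in (r, theta) coordinates; not confirmed by the figure.
w = np.array([-2.0, 0.0])
b = 5.0

def classify(r, theta):
    """Return the class of the point (r, theta) from the sign of w . x + b."""
    score = w @ np.array([r, theta]) + b
    if np.isclose(score, 0.0):
        return "CAN'T TELL"          # the point lies on the decision boundary
    return "ATTENDEE (+)" if score > 0 else "NO-SHOW (-)"

queries = [(1.0, np.pi), (2.5, np.pi / 2), (3.0, 2 * np.pi)]
answers = [classify(r, theta) for r, theta in queries]
for (r, theta), answer in zip(queries, answers):
    print(f"r={r}, theta={theta:.3f}: {answer}")
```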