## Minimum-distance classifier as a linear classifier

In [None]:
%matplotlib inline
import numpy as np 
import matplotlib.pyplot as plt


In [None]:
from sklearn.datasets import make_blobs

blobs_data = None
blobs_target = None

train, train_labels = make_blobs(n_samples = 500, n_features=2, centers=2, random_state=1234)

a = 10
xx, yy = np.meshgrid(np.linspace(-a,a,60), np.linspace(-a,a,60))

### Task 1

Compute the class means (the centroid of each class).

In [None]:
class_means = np.array([[-3,  2],[-.19,  3]])
### BEGIN SOLUTION
class_means = np.vstack([ np.mean( train[np.ravel(train_labels==i)], axis=0) for i in np.unique(train_labels) ])
###END SOLUTION

In [None]:
np.testing.assert_allclose(class_means,[[-6.11,  2.40],[-1.19,  5.67]],rtol=1e-2)
assert class_means.shape == (2, 2)
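
The same per-class averaging can be tried on a tiny hand-made data set where the answer is easy to check by eye. This is only a sketch: `toy_points`, `toy_labels` and `toy_means` are hypothetical names and are not part of the exercise data.

```python
import numpy as np

# Hypothetical toy data: two points per class (not the exercise data).
toy_points = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 8.0]])
toy_labels = np.array([0, 0, 1, 1])

# One row per class: the mean of the points carrying that label.
toy_means = np.vstack([toy_points[toy_labels == k].mean(axis=0)
                       for k in np.unique(toy_labels)])
print(toy_means)
```

Class 0 averages to `(1, 0)` and class 1 to `(5, 6)`, so the result has one row per class and one column per feature, the same layout as `class_means`.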

Let's verify the result visually:

In [None]:
plt.scatter(train[:,0], train[:,1],c= np.ravel(train_labels))
plt.plot(class_means[0,0], class_means[0,1], 'bo',markersize=10)
plt.plot(class_means[1,0], class_means[1,1], 'ro',markersize=10)


Decision regions for the minimum-distance classifier, implemented directly from the definition:

In [None]:
xm,ym = class_means[:,0],  class_means[:,1]
d = np.argmin((xx[...,np.newaxis]-xm[np.newaxis,np.newaxis,:])**2+\
          (yy[...,np.newaxis]-ym[np.newaxis,np.newaxis,:])**2,axis=-1)
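
The broadcasting in the cell above can be easier to follow on a very small grid. The sketch below uses hypothetical class means (`means`) and a 3x4 grid (`gx`, `gy`). It stacks the coordinates first instead of handling x and y separately, which is an alternative formulation and not the notebook's own code.

```python
import numpy as np

# Hypothetical class means and a small 3x4 grid (not the exercise data).
means = np.array([[-1.0, 0.0], [1.0, 0.0]])
gx, gy = np.meshgrid(np.linspace(-2, 2, 4), np.linspace(-1, 1, 3))

# Stack the grid into shape (3, 4, 2) and broadcast it against the means:
# diff has shape (3, 4, n_classes, n_coordinates) = (3, 4, 2, 2).
grid = np.stack([gx, gy], axis=-1)
diff = grid[:, :, np.newaxis, :] - means[np.newaxis, np.newaxis, :, :]

sq_dist = (diff ** 2).sum(axis=-1)  # squared distance to each mean, shape (3, 4, 2)
nearest = sq_dist.argmin(axis=-1)   # index of the closest mean, shape (3, 4)
print(nearest)
```

Because the two hypothetical means sit at `x = -1` and `x = 1`, every grid point with negative `x` is assigned to class 0 and every point with positive `x` to class 1.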

Plot of these regions:

In [None]:
plt.imshow(d,origin='lower',extent=[-a,a,-a,a])
plt.colorbar()
plt.scatter(train[:,0], train[:,1],marker='.',c= -np.ravel(train_labels))
plt.plot(class_means[0,0], class_means[0,1], 'bo',markersize=10)
plt.plot(class_means[1,0], class_means[1,1], 'ro',markersize=10)



### Minimum-distance classifier as a linear classifier

A linear classifier has a decision boundary of the form

$$ \mathbf{w}\cdot\mathbf{x} = t$$

The vector $\mathbf{w}$ is parallel to $\mathbf{cm_1} - \mathbf{cm_0}$, where $\mathbf{cm_1}$ and $\mathbf{cm_0}$ are the centroids (means) of the two classes.

We can take $\mathbf{w}=\mathbf{cm_1} - \mathbf{cm_0}$. It remains to fix the constant $t$ (the bias). One choice is the $t$ for which the discriminant function vanishes at the midpoint of the segment $(\mathbf{cm_0}, \mathbf{cm_1})$.
This condition takes the following form:

$$ \mathbf{w}\cdot\mathbf{x_{m}} - t = 0,$$

where $\mathbf{x_{m}}$ is the midpoint of the segment $(\mathbf{cm_0}, \mathbf{cm_1})$.
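
This choice of $t$ is what makes the linear rule agree with the minimum-distance rule. A short derivation, using only the symbols already defined and assuming squared Euclidean distance (as in the grid computation earlier), starts from the condition that $\mathbf{x}$ is closer to $\mathbf{cm_1}$ than to $\mathbf{cm_0}$:

$$
\begin{aligned}
\|\mathbf{x}-\mathbf{cm_1}\|^2 &< \|\mathbf{x}-\mathbf{cm_0}\|^2 \\
-2\,\mathbf{cm_1}\cdot\mathbf{x} + \|\mathbf{cm_1}\|^2 &< -2\,\mathbf{cm_0}\cdot\mathbf{x} + \|\mathbf{cm_0}\|^2 \\
(\mathbf{cm_1}-\mathbf{cm_0})\cdot\mathbf{x} &> \tfrac{1}{2}\left(\|\mathbf{cm_1}\|^2-\|\mathbf{cm_0}\|^2\right) \\
\mathbf{w}\cdot\mathbf{x} &> \mathbf{w}\cdot\tfrac{1}{2}\left(\mathbf{cm_0}+\mathbf{cm_1}\right) = \mathbf{w}\cdot\mathbf{x_{m}} = t
\end{aligned}
$$

The $\|\mathbf{x}\|^2$ terms cancel in the second line, which is why the boundary is linear in $\mathbf{x}$.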

### Task 2

Compute the weights $\mathbf{w}$ and the bias $t$ for the sample data, and verify the decision regions visually.

The regions are given by:

$$ \mathbf{w}\cdot\mathbf{x} - t > 0$$

(`True` is class 1 and `False` is class 0)

In [None]:
t = 0
w = np.array([0,1])
d_lin = np.zeros_like(xx)
### BEGIN SOLUTION

w = class_means[1]-class_means[0]
t = w.dot(0.5*(class_means[0]+class_means[1]))

d_lin = (np.tensordot(w,np.stack([xx,yy]),axes=[0,0])-t)>0

### END SOLUTION

In [None]:
np.testing.assert_allclose(w,[4.92, 3.261],rtol=1e-2)


In [None]:
np.testing.assert_allclose(t,-4.80,rtol=1e-2)


In [None]:
assert d_lin[2,3] ==  False
assert d_lin[44,43] ==  True
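
One way to gain confidence in the linear form is to compare it with the minimum-distance rule on random points. The sketch below does this for hypothetical class means `cm0` and `cm1`. All names in it are illustrative, and it does not use the exercise data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical class means, used for illustration only.
cm0 = np.array([-6.0, 2.0])
cm1 = np.array([-1.0, 6.0])

# Linear form: w points from cm0 to cm1, t places the boundary at the midpoint.
w_demo = cm1 - cm0
t_demo = w_demo @ (0.5 * (cm0 + cm1))

# Random test points in the same square as the plotting grid.
pts = rng.uniform(-10, 10, size=(1000, 2))

# Minimum-distance rule: class 1 when cm1 is strictly closer than cm0.
closer_to_1 = ((pts - cm1) ** 2).sum(axis=1) < ((pts - cm0) ** 2).sum(axis=1)
# Linear rule: class 1 when w.x - t > 0.
linear_1 = pts @ w_demo - t_demo > 0

# Fraction of points on which the two rules agree.
print((closer_to_1 == linear_1).mean())
```

The two rules should agree everywhere except possibly for points that fall on the boundary to within floating-point rounding.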

Plot the decision regions:

In [None]:
# np.float was removed in NumPy 1.24; the builtin float is the replacement.
plt.imshow(d_lin.astype(float), origin='lower',extent=[-a,a,-a,a])
plt.colorbar()
plt.scatter(train[:,0], train[:,1],marker='.',c= -np.ravel(train_labels))
plt.plot(class_means[0,0], class_means[0,1], 'bo',markersize=10)
plt.plot(class_means[1,0], class_means[1,1], 'ro',markersize=10)
