## Exercise 21

__a)__ Dimensions of the quantities, written as dim($\dots$) = (m, n) with m rows and n columns:<br>
 - dim($x_i$) = (M, 1) 
 - dim(C) = (1, 1)
 - dim(W) = (K, M)
 - dim(b) = (K, 1)
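
The shapes listed above can be checked with a small NumPy sketch. The values `K = 2` and `M = 3` and the random entries are assumptions chosen only for illustration; the point is that $W x_i + b$ yields one score per class.

```python
import numpy as np

K, M = 2, 3  # hypothetical number of classes and features

rng = np.random.default_rng(0)
x_i = rng.random((M, 1))   # one example as a column vector, dim (M, 1)
W = rng.random((K, M))     # weight matrix, dim (K, M)
b = rng.random((K, 1))     # bias vector, dim (K, 1)

f_i = W @ x_i + b          # scores for one example, dim (K, 1)
print(f_i.shape)
```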

__d)__ Implementation of the linear classification:

In [None]:
import numpy as np
import pandas as pd
from pandas import DataFrame
import matplotlib.pyplot as plt
np.random.seed(42)

In [None]:
#load data
P_0 = pd.read_hdf('populationen.hdf5', key = 'P_0')
P_1 = pd.read_hdf('populationen.hdf5', key = 'P_1')

In [None]:
#add labels 
P_0['label'] = np.zeros_like(P_0.x)
P_1['label'] = np.ones_like(P_1.x)

In [None]:
#combine data sets
P = pd.concat([P_0, P_1], ignore_index=True)

In [None]:
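#linear classification model: returns the scores with shape (K, N)
def f(x, W, b):
    # convert x to a plain array so that np.matmul does not depend on pandas' operator dispatch
    return np.matmul(W, np.asarray(x).T) + b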

In [None]:
#update matrix and vector b, learning rate h = 0.5
def update(W, grad_W, b, grad_b, h = 0.5):
    W = W - h * grad_W
    b = b - h * grad_b
    return W, b

The function `grad_f` computes the gradient $\nabla_f C$, which is needed to compute the gradients 
$\nabla_W C$ and $\nabla_b C$.

In [None]:
def grad_f(x, f, labels):
    labels = np.asarray(labels)
    m = np.shape(x)[0]  # number of examples
    f = np.asmatrix(f)  # scores, shape (K, m)

    #all different classes (here only 0 and 1)
    classes = np.unique(labels)

    #one-hot matrix of the true classes, shape (K, m):
    #truth[k, i] = 1 if example i belongs to class k
    truth = np.zeros((len(classes), m))
    for k, c in enumerate(classes):
        truth[k, labels == c] = 1

    #calculate the gradient: (softmax(f) - truth) / m
    softmax = np.exp(f) / np.exp(f).sum(axis = 0)

    return (softmax - truth) / m

Functions for computing the gradients $\nabla_W C$ and $\nabla_b C$:

In [None]:
def grad_W(x, f, labels):
    grad_F = grad_f(x, f, labels)  # shape (K, N)
    # chain rule: multiply with the data matrix of shape (N, M) to get shape (K, M)
    return np.matmul(grad_F, np.asarray(x))

In [None]:
def grad_b(x, f, labels):
    grad_F = grad_f(x, f, labels)  # shape (K, N)
    # sum over all examples; for an np.matrix the result keeps the shape (K, 1)
    return grad_F.sum(axis = 1)
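
A common way to validate such gradient functions is a comparison against finite differences. The following is a self-contained sketch, not the notebook's own code: it assumes the mean softmax cross-entropy as cost, and the names `cost`, `analytic_grads` and `numeric_grad` as well as the random data are hypothetical.

```python
import numpy as np

def cost(W, b, X, y):
    """Mean softmax cross-entropy (assumed cost). X: (m, M), y: (m,) integer labels."""
    f = W @ X.T + b                               # scores, shape (K, m)
    f = f - f.max(axis=0, keepdims=True)          # shift for numerical stability
    log_p = f - np.log(np.exp(f).sum(axis=0, keepdims=True))
    return -log_p[y, np.arange(len(y))].mean()

def analytic_grads(W, b, X, y):
    """Chain rule: grad_W = grad_f @ X, grad_b = row sums of grad_f."""
    m = len(y)
    f = W @ X.T + b
    p = np.exp(f - f.max(axis=0, keepdims=True))
    p /= p.sum(axis=0, keepdims=True)             # softmax, shape (K, m)
    truth = np.zeros_like(p)
    truth[y, np.arange(m)] = 1                    # one-hot encoding of the labels
    g_f = (p - truth) / m
    return g_f @ X, g_f.sum(axis=1, keepdims=True)

def numeric_grad(fun, A, eps=1e-6):
    """Central finite differences of fun with respect to each entry of A."""
    G = np.zeros_like(A)
    for idx in np.ndindex(A.shape):
        A_plus, A_minus = A.copy(), A.copy()
        A_plus[idx] += eps
        A_minus[idx] -= eps
        G[idx] = (fun(A_plus) - fun(A_minus)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))       # hypothetical data: 20 examples, 2 features
y = rng.integers(0, 2, size=20)    # hypothetical labels in {0, 1}
W = rng.random((2, 2))
b = rng.random((2, 1))

g_W, g_b = analytic_grads(W, b, X, y)
num_W = numeric_grad(lambda A: cost(A, b, X, y), W)
num_b = numeric_grad(lambda A: cost(W, A, X, y), b)

# both differences should be close to zero
print(np.max(np.abs(g_W - num_W)), np.max(np.abs(g_b - num_b)))
```

If the analytic and numeric gradients disagree noticeably, the one-hot matrix or the axis of the sums is the first place to look.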

Now use these functions to train for 100 epochs:

In [None]:
#choose initial W and b randomly 
W = np.matrix(np.random.rand(2, 2)) 
b = np.matrix(np.random.rand(2, 1))
x = P.drop(columns = 'label')

for i in range(100):
    f_init = f(x, W, b)
    W, b = update(W = W, grad_W = grad_W(x, f_init, P.label), 
                  b = b, grad_b = grad_b(x, f_init, P.label))
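
One way to judge the trained parameters is the fraction of correctly classified points. The sketch below uses a hypothetical helper `predict` and made-up demo values rather than the trained `W` and `b`; it also assumes that row k of `W` belongs to label k.

```python
import numpy as np

def predict(X, W, b):
    """Return the index of the class with the highest score for each row of X."""
    scores = np.asarray(W) @ np.asarray(X).T + np.asarray(b)  # shape (K, m)
    return scores.argmax(axis=0)

# hypothetical parameters and points, not the trained values
W_demo = np.array([[1.0, 0.0], [-1.0, 0.0]])
b_demo = np.array([[0.0], [0.0]])
X_demo = np.array([[2.0, 1.0], [-3.0, 0.5], [0.5, -1.0]])
labels_demo = np.array([0, 1, 1])

pred = predict(X_demo, W_demo, b_demo)
accuracy = (pred == labels_demo).mean()
print(pred, accuracy)
```

With the notebook's variables, the analogous call would be `predict(x, W, b)` compared against `P.label`.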

__e)__ The equation of the line (here only the special case of 2 classes): 
$$
    y(x) = \frac{1}{W_{12} - W_{22}} \left\{(W_{21} - W_{11}) x + b_{2} - b_{1} \right\}
$$
It follows from the condition $f_1 = f_2$ with $f_k = W_{k1} x + W_{k2} y + b_k$, solved for $y$, since the scores of both classes are equal along the line. The line separates the two populations. <br>
Graphical representation of the result:

In [None]:
def lin(x, W, b):
    # b[k, 0] picks a scalar from the (2, 1) bias, so the result has the same shape as x
    return 1 / (W[0, 1] - W[1, 1]) * ((W[1, 0] - W[0, 0])*x + b[1, 0] - b[0, 0])
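
As a quick sanity check of the line equation, the scores of both classes can be evaluated at points on the line; they should coincide. This is a sketch with hypothetical parameters `W_demo` and `b_demo` and a hypothetical helper `boundary_y`, not the trained values.

```python
import numpy as np

def boundary_y(x, W, b):
    """y-coordinate of the line f_1 = f_2 for a two-class model."""
    return ((W[1, 0] - W[0, 0]) * x + b[1, 0] - b[0, 0]) / (W[0, 1] - W[1, 1])

# hypothetical parameters, not the trained values
W_demo = np.array([[0.3, -1.2], [0.8, 0.5]])
b_demo = np.array([[0.1], [-0.4]])

x_vals = np.linspace(-5, 5, 11)
points = np.stack([x_vals, boundary_y(x_vals, W_demo, b_demo)])  # shape (2, 11)
scores = W_demo @ points + b_demo                                 # shape (2, 11)

# largest difference between the two class scores along the line
print(np.max(np.abs(scores[0] - scores[1])))
```

Note that the formula is undefined for $W_{12} = W_{22}$, in which case the boundary is a vertical line.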

In [None]:
plt.figure(figsize=(15, 15))
plt.rcParams.update({'font.size': 16})

plt.scatter(P_0.x, P_0.y, s = 1, label = 'Population 0')
plt.scatter(P_1.x, P_1.y, s = 1, label = 'Population 1')
xplot = np.linspace(-15, 20, 100)
plt.xlim(xplot[0], xplot[-1])
# clip the y-axis to the values of the line at both ends of the x-range
plt.ylim(lin(xplot[0], W, b), lin(xplot[-1], W, b))
plt.plot(xplot, lin(xplot, W, b).T, color = 'r', label = 'Line $y(x)$')
plt.legend()
plt.xlabel('x')
plt.ylabel('y')


plt.show()