# Neural Networks - MiniFramework

Let's build something fun that is easy to experiment with.
We will write a miniature framework for training neural networks.


In [1]:
import numpy as np


## 1. Activation functions

Let's start with the basic building blocks from which we will later assemble the framework.


Since we want to implement both the forward and the backward pass through a neural network, every function that takes part in the computational graph must support both forward and backward evaluation.
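For a single activation feeding a loss, the backward pass is an application of the chain rule. As a sketch (the symbols $z$ for the pre-activation, $g$ for the activation function, $a$ for its output and $L$ for the loss are chosen here for illustration):

```latex
a = g(z), \qquad \frac{\partial L}{\partial z} = \frac{\partial L}{\partial a} \cdot g'(z)
```

Each function therefore needs to provide two things: its value for the forward pass and its local derivative for the backward pass.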


In [2]:
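class ActivationFunction:
    """Base class: subclasses provide the forward value and its derivative."""

    def __call__(self, z):
        # Forward pass: a = g(z), applied element-wise
        raise NotImplementedError

    def derivative(self, z):
        # Backward pass: g'(z), evaluated element-wise
        raise NotImplementedError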

Next, let's implement a few popular activation functions:
- Linear
- ReLU
- Sigmoid

In [3]:
class LinearActivationFunction(ActivationFunction):
    def __call__(self, z):
        return z

    def derivative(self, z):
        return np.ones_like(z)

In [4]:
class ReLUActivationFunction(ActivationFunction):
    def __call__(self, z):
        return np.maximum(z, 0)

    def derivative(self, z):
        return (z > 0).astype(float)

In [5]:
class SigmoidActivationFunction(ActivationFunction):
    def __call__(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def derivative(self, z):
        a = self(z)
        return np.multiply(a, 1-a)
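For large negative `z`, `np.exp(-z)` overflows and NumPy emits a runtime warning, even though the result still rounds to 0. A possible numerically safer variant is sketched below as a standalone function. The name `stable_sigmoid` is hypothetical and not part of the framework. It evaluates two algebraically equivalent forms, each on the part of the input where it is safe:

```python
import numpy as np

def stable_sigmoid(z):
    # Hypothetical helper: same values as 1 / (1 + exp(-z)), without overflow
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the usual formula is safe
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)), where exp(z) < 1
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

z = np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0])
print(stable_sigmoid(z))
```

For moderate inputs the result should agree with `SigmoidActivationFunction`, so swapping it in is optional.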

We can add more activation functions later. We want the framework to be extensible,
so we provide a factory function that creates activation functions by name.


In [6]:
MAP_ACTIVATION_FUNCTIONS = {
    "linear": LinearActivationFunction,
    "relu": ReLUActivationFunction,
    "sigmoid": SigmoidActivationFunction
}

# Create an activation function by name; raise an error for an unknown kind.
def CreateActivationFunction(kind):
    if kind in MAP_ACTIVATION_FUNCTIONS:
        return MAP_ACTIVATION_FUNCTIONS[kind]()
    raise ValueError("Unknown activation function {}".format(kind))
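As an illustration of the extension point, here is a sketch of how a tanh activation could look. The class name `TanhActivationFunction`, the key `"tanh"` and the dictionary `example_map` are assumptions made for this example. The dictionary only stands in for `MAP_ACTIVATION_FUNCTIONS` so that the snippet runs on its own:

```python
import numpy as np

class TanhActivationFunction:
    # In the notebook this class would derive from ActivationFunction
    def __call__(self, z):
        return np.tanh(z)

    def derivative(self, z):
        # d/dz tanh(z) = 1 - tanh(z)^2
        a = np.tanh(z)
        return 1.0 - np.multiply(a, a)

# Stand-in for MAP_ACTIVATION_FUNCTIONS (hypothetical, for illustration only)
example_map = {"tanh": TanhActivationFunction}

g = example_map["tanh"]()
z = np.array([-1.0, 0.0, 1.0])
print('a  = ', g(z))
print('dz = ', g.derivative(z))
```

In the notebook itself, the equivalent step would be adding a `"tanh"` entry to `MAP_ACTIVATION_FUNCTIONS`, after which `CreateActivationFunction("tanh")` would return an instance.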

Let's test how our activation functions behave. In the output, `a` is the activation value g(z) and `dz` is the derivative g'(z).

In [7]:
def test_ActivationFunctions():
    # Create a vector of input values
    z = np.arange(-2.0, 2.0, 0.5)
    print('z: ', z)

    for kind in ['linear', 'relu', 'sigmoid']:
        # create the activation function of the given kind
        g = CreateActivationFunction(kind)
        a = g(z)
        dz = g.derivative(z)

        print('Function: ', kind)
        print('   a  = ', a)
        print('   dz = ', dz)

test_ActivationFunctions()


z:  [-2.  -1.5 -1.  -0.5  0.   0.5  1.   1.5]
Function:  linear
   a  =  [-2.  -1.5 -1.  -0.5  0.   0.5  1.   1.5]
   dz =  [1. 1. 1. 1. 1. 1. 1. 1.]
Function:  relu
   a  =  [0.  0.  0.  0.  0.  0.5 1.  1.5]
   dz =  [0. 0. 0. 0. 0. 1. 1. 1.]
Function:  sigmoid
   a  =  [0.11920292 0.18242552 0.26894142 0.37754067 0.5        0.62245933
 0.73105858 0.81757448]
   dz =  [0.10499359 0.14914645 0.19661193 0.23500371 0.25       0.23500371
 0.19661193 0.14914645]
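A quick way to gain confidence in a `derivative` implementation is to compare it with a finite-difference approximation. The sketch below does this for the sigmoid. The helper `numerical_derivative` and the step size `h` are assumptions for this example, and the sigmoid formulas are repeated so that the snippet runs on its own:

```python
import numpy as np

def numerical_derivative(f, z, h=1e-6):
    # Central difference approximation of f'(z), element-wise
    return (f(z + h) - f(z - h)) / (2.0 * h)

def sigmoid(z):
    # Same formula as SigmoidActivationFunction.__call__
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    # Same formula as SigmoidActivationFunction.derivative
    a = sigmoid(z)
    return a * (1.0 - a)

z = np.arange(-2.0, 2.0, 0.5)
max_error = np.max(np.abs(numerical_derivative(sigmoid, z) - sigmoid_derivative(z)))
print('max abs error: ', max_error)
```

The same check applies to other activations, with one caveat: ReLU has a kink at `z = 0`, where the central difference gives 0.5 while the class above returns 0, so that point should be excluded from the comparison.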


## 2. Loss functions

Like the activation functions, the loss functions must support both the forward and the backward pass.

In [8]:
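class LossFunction:
    """Base class: subclasses provide the loss value and its derivative."""

    def __call__(self, a, y):
        # Forward pass: loss for prediction a and target y, element-wise
        raise NotImplementedError

    def derivative(self, a, y):
        # Backward pass: derivative of the loss with respect to a
        raise NotImplementedError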

Now we can implement the logistic loss function. It is the two-class special case of the cross-entropy loss used for multi-class classification. In machine learning frameworks it is commonly called BinaryCrossEntropy, and we use the same name here.

In [9]:
class BinaryCrossEntropyLossFunction(LossFunction):
    def __call__(self, a, y):
        return -( np.multiply(y, np.log(a)) + np.multiply((1-y), np.log(1-a)) )

    def derivative(self, a, y):
        return -np.divide(y, a) + np.divide((1-y), (1-a))
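`np.log` returns `-inf` at 0, so the loss above becomes infinite or NaN (and the derivative divides by zero) when a prediction is exactly 0 or 1. One common workaround is to clip the predictions into the open interval (0, 1) first. It is sketched here as a standalone function, where the name `clipped_bce` and the value of `eps` are assumptions rather than part of the framework:

```python
import numpy as np

def clipped_bce(a, y, eps=1e-12):
    # Keep predictions strictly inside (0, 1) so that both logarithms are finite
    a = np.clip(a, eps, 1.0 - eps)
    return -(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

y = np.array([0.0, 1.0, 1.0])
a = np.array([0.0, 1.0, 0.0])   # two perfect predictions and one maximally wrong
loss = clipped_bce(a, y)
print('loss: ', loss)
```

The perfect predictions get a loss close to 0 and the wrong one gets a large but finite loss.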


Other popular loss functions can be implemented as an exercise:
- absolute error
- square error
- Huber loss
- ...

The framework provides a single place where the set of supported loss functions can easily be extended.


In [10]:
MAP_LOSS_FUNCTIONS = {
    "bce": BinaryCrossEntropyLossFunction
}

def CreateLossFunction(kind):
    if (kind in MAP_LOSS_FUNCTIONS):
        return MAP_LOSS_FUNCTIONS[kind]()
    raise ValueError("Unknown loss function {}".format(kind))
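As one possible take on the exercise, a square error loss might look like the sketch below. The class name `SquareErrorLossFunction`, the key `"se"` and the dictionary `example_loss_map` are assumptions. The dictionary stands in for `MAP_LOSS_FUNCTIONS` so that the snippet runs on its own:

```python
import numpy as np

class SquareErrorLossFunction:
    # In the notebook this class would derive from LossFunction
    def __call__(self, a, y):
        # Element-wise squared difference between prediction and target
        return np.square(a - y)

    def derivative(self, a, y):
        # d/da (a - y)^2 = 2 (a - y)
        return 2.0 * (a - y)

# Stand-in for MAP_LOSS_FUNCTIONS (hypothetical, for illustration only)
example_loss_map = {"se": SquareErrorLossFunction}

L = example_loss_map["se"]()
a = np.array([0.5, 2.0])
y = np.array([1.0, 0.0])
print('loss: ', L(a, y))
print('da:   ', L.derivative(a, y))
```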

Let's check that our loss function works as expected.


In [11]:
def test_LossFunction():
    # Create the binary cross-entropy loss
    L = CreateLossFunction("bce")

    # Initialize Y and YHat
    Y    = np.array([0,    0,   0,   0,   1,   1,   1,   1], dtype=float)
    YHat = np.array([0.01, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9, 0.99], dtype=float)

    # Compute the loss value and its derivative
    loss = L(YHat, Y)
    da = L.derivative(YHat, Y)

    print('Y: \n', Y, '\n')
    print('YHat: \n', YHat, '\n')
    print('loss: \n', loss, '\n')
    print('da:   \n', da, '\n')

test_LossFunction()


Y: 
 [0. 0. 0. 0. 1. 1. 1. 1.] 

YHat: 
 [0.01 0.1  0.2  0.4  0.6  0.8  0.9  0.99] 

loss: 
 [0.01005034 0.10536052 0.22314355 0.51082562 0.51082562 0.22314355
 0.10536052 0.01005034] 

da:   
 [ 1.01010101  1.11111111  1.25        1.66666667 -1.66666667 -1.25
 -1.11111111 -1.01010101] 
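The two derivatives combine through the chain rule during the backward pass. For a sigmoid output followed by binary cross-entropy, the product of the two derivatives simplifies to `a - y`. The standalone sketch below checks this numerically, with the formulas copied from the classes above and input values chosen only for illustration:

```python
import numpy as np

z = np.array([-2.0, -0.5, 0.5, 2.0])   # pre-activations
y = np.array([0.0, 1.0, 0.0, 1.0])     # targets

a = 1.0 / (1.0 + np.exp(-z))           # sigmoid forward pass
da = -y / a + (1.0 - y) / (1.0 - a)    # BCE derivative dL/da
dz = da * a * (1.0 - a)                # chain rule: dL/dz = dL/da * g'(z)

print('dz:    ', dz)
print('a - y: ', a - y)
```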



## 3. Optimizer

The optimizer is the algorithm that updates the network parameters according to the gradient. We start simple and implement a Gradient Descent optimizer. Later we will extend the framework with more efficient optimizers.

In [12]:
class Optimizer:
    """Base class for algorithms that update parameters from gradients."""

    def backward(self, optimizerContext, dW, db):
        # During the backward pass the optimizer can do its work on the gradients dW, db
        raise NotImplementedError

    def update(self, optimizerContext, W, b):
        # Weight update: one descent step
        raise NotImplementedError


In [13]:
class GradientDescent(Optimizer):
    def __init__(self, learningRate):
        self.learningRate = learningRate

    def backward(self, optimizerContext, dW, db):
        # Store the current values of dW and db in a new context
        return (dW, db)

    def update(self, optimizerContext, W, b):
        # Use the gradients from the context
        dW, db = optimizerContext

        # One descent step
        W = W - self.learningRate*dW
        b = b - self.learningRate*db
        return W, b
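To see the two-step protocol in action, the sketch below minimizes a simple quadratic. The class `GradientDescentSketch` is a stand-in that mirrors `GradientDescent` above so that the snippet runs on its own. The objective, the shapes of `W` and `b`, and the way the context is passed from `backward` to `update` are assumptions made for this example:

```python
import numpy as np

class GradientDescentSketch:
    # Stand-in that mirrors GradientDescent above
    def __init__(self, learningRate):
        self.learningRate = learningRate

    def backward(self, optimizerContext, dW, db):
        return (dW, db)

    def update(self, optimizerContext, W, b):
        dW, db = optimizerContext
        return W - self.learningRate * dW, b - self.learningRate * db

# Hypothetical objective: f(W, b) = sum((W - 3)^2) + (b + 1)^2,
# with its minimum at W = 3, b = -1
W = np.zeros((1, 2))
b = np.zeros((1, 1))
optimizer = GradientDescentSketch(learningRate=0.1)
context = None

for step in range(100):
    dW = 2.0 * (W - 3.0)   # gradient of f with respect to W
    db = 2.0 * (b + 1.0)   # gradient of f with respect to b
    context = optimizer.backward(context, dW, db)
    W, b = optimizer.update(context, W, b)

print('W: ', W)
print('b: ', b)
```

With this learning rate each step shrinks the distance to the minimum by a constant factor, so `W` and `b` approach 3 and -1.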