# Activation Functions

* Problem 1: Linear Regression models (aka. single neurons) cannot be chained 
without any activation function

* Problem 2: Classification problem --> Class separability, results as class predictions or as probability distribution

## Functions
- Sigmoid
- ReLU
- tanh
- etc.

In [1]:
# http://yann.lecun.com/exdb/mnist/
# FashionMNIST: https://github.com/zalandoresearch/fashion-mnist

import gzip
import numpy as np

def open_images(filename):
    with gzip.open(filename, "rb") as file:
        data = file.read()
        return np.frombuffer(data, dtype=np.uint8, offset=16)\
            .reshape(-1, 28, 28)\
            .astype(np.float32)


def open_labels(filename):
    with gzip.open(filename, "rb") as file:
        data = file.read()
        return np.frombuffer(data, dtype=np.uint8, offset=8)

In [2]:
from keras.utils import to_categorical

Using TensorFlow backend.


In [3]:
X_train = open_images("./data/fashion/train-images-idx3-ubyte.gz")
y_train = open_labels("./data/fashion/train-labels-idx1-ubyte.gz")

X_test = open_images("./data/fashion/t10k-images-idx3-ubyte.gz")
y_test = open_labels("./data/fashion/t10k-labels-idx1-ubyte.gz")

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)

In [4]:
ACTIVATION_FUNCTIONS = ["sigmoid", "hard_sigmoid", "relu", "selu", "elu", "tanh"]

In [5]:
from keras.models import Sequential
from keras.layers import Dense

In [6]:
def create_model(a_fn="sigmoid"):
    print("Model ", a_fn)
    model = Sequential()
    model.add(Dense(100, activation=a_fn, input_shape=(784,)))
    model.add(Dense(10, activation="softmax"))
    return model

In [7]:
MODELS = [create_model(act) for act in ACTIVATION_FUNCTIONS]

Model  sigmoid
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model  hard_sigmoid
Model  relu
Model  selu
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Model  elu
Model  tanh


In [8]:
print("Testing %s models" % len(MODELS))

Testing 6 models


In [17]:
def train_model(model):
    model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(
    X_train.reshape(-1, 784),
    y_train,
    epochs=10,
    batch_size=1000)
    results = model.predict(X_test.reshape(-1,784), verbose=0)
    return results

In [18]:
RESULTS = []
for model, a_fn in zip(MODELS, ACTIVATION_FUNCTIONS):
    print("Training model: ", a_fn)
    RESULTS.append(train_model(model))
    print("*"*50)

Training model:  sigmoid
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
**************************************************
Training model:  hard_sigmoid
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
**************************************************
Training model:  relu
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
**************************************************
Training model:  selu
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
**************************************************
Training model:  elu
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
**************************************************
Training model:  tanh
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoc

In [20]:
for activation, result in zip(ACTIVATION_FUNCTIONS, RESULTS):
    print("Activation: %s -> Result: %s" %(activation, result))

Activation: sigmoid -> Result: [[0.00949962 0.00851517 0.01060433 ... 0.16868016 0.02505522 0.56912965]
 [0.02112993 0.01797128 0.66048914 ... 0.00915623 0.03898954 0.02242609]
 [0.01041819 0.9186594  0.00746383 ... 0.0031421  0.00251293 0.00386413]
 ...
 [0.08556654 0.00496031 0.01536833 ... 0.009886   0.64514494 0.02176427]
 [0.01405453 0.7670687  0.01102634 ... 0.00921539 0.00651392 0.00793751]
 [0.00367453 0.01976525 0.01799609 ... 0.5125976  0.06029106 0.04777737]]
Activation: hard_sigmoid -> Result: [[0.01024636 0.01859024 0.01927294 ... 0.10929573 0.0548073  0.6558491 ]
 [0.06395135 0.01440797 0.43126917 ... 0.00424565 0.03318367 0.02673943]
 [0.02033909 0.83876634 0.01112897 ... 0.01501446 0.01193923 0.00351407]
 ...
 [0.08137689 0.04067555 0.01333263 ... 0.02409885 0.3638395  0.01893006]
 [0.03728367 0.774372   0.01516961 ... 0.00910512 0.00901546 0.00645342]
 [0.04698382 0.01140661 0.05859979 ... 0.18832847 0.1197657  0.15880537]]
Activation: relu -> Result: [[0.0000000e+00 1