# <center> Keras </center>
## <center>1.5.1 Activation functions</center>

# Activation functions

Activation functions transform the output of a layer and can be linear or non-linear.
Activation functions are an essential part of artificial neural networks. They determine how strongly a neuron responds to its input, that is, whether the information it receives is passed on to the next layer or suppressed. Non-linear activation functions add non-linearity to the dot product computed by each layer; without them, a stack of layers would collapse into a single linear transformation.<br>
Keras supports the following activation functions by default, and a custom activation function can easily be created as well:
- softmax
- elu
- selu
- softplus
- softsign
- relu
- tanh
- sigmoid
- hard_sigmoid
- linear
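
To illustrate the idea of a custom activation function, here is a minimal sketch in plain Python. The function `leaky_relu` and its parameter `alpha` are hypothetical names chosen for this example, not something defined in this notebook. The sketch works on plain floats; for use in Keras, the same formula would have to be written with tensor operations and the resulting callable passed as the `activation` argument of a layer.

```python
def leaky_relu(x, alpha=0.1):
    """Hypothetical custom activation: identity for x >= 0, slope alpha for x < 0."""
    return x if x >= 0 else alpha * x

# Apply the function element-wise to a few example pre-activation values.
pre_activations = [-2.0, -0.5, 0.0, 1.5]
activated = [leaky_relu(v) for v in pre_activations]
print(activated)
```

Unlike `relu`, this function lets a small part of a negative input through instead of setting it to zero.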

<br>
Some common activation functions are shown below:
<img src="img/activation_functions.JPG" width = "80%" />
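
The formulas behind four of these functions can be sketched in plain Python. This is only an illustration of the math on single values and lists; Keras applies the same formulas to whole tensors, and its internal implementation may differ in detail.

```python
import math

def relu(x):
    """Pass positive values through, replace negative values with 0."""
    return max(0.0, x)

def sigmoid(x):
    """Squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squash any real value into the range (-1, 1)."""
    return math.tanh(x)

def softmax(values):
    """Turn a list of scores into positive values that sum to 1."""
    peak = max(values)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

for x in [-2.0, 0.0, 2.0]:
    print(x, relu(x), sigmoid(x), tanh(x))
print(softmax([1.0, 2.0, 3.0]))
```

Note that `relu`, `sigmoid` and `tanh` act on each value independently, while `softmax` normalizes across all values of a layer at once.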

# Code

Let us run the code from the previous section first.

In [None]:
# Importing the MNIST dataset
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Processing the input data
train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

# Processing the output data
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

Let us now examine the code related to building a network. 

In [None]:
# Build a network
from keras import models
from keras import layers

network = models.Sequential()
network.add(layers.Dense(units=512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(units=10, activation='softmax'))

Here we use `relu` as the activation function of the hidden layer (512 units) and `softmax` as the activation function of the output layer (10 units, one per digit class).
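
What a dense layer with a `relu` activation computes can be sketched in plain Python on a toy example: each unit takes the dot product of the inputs with its weights, adds a bias, and applies the activation. The helper names `dense_relu` and `dense_param_count` and the toy numbers are assumptions made for this illustration only.

```python
def dense_relu(inputs, weights, biases):
    """One dense layer: relu(dot(inputs, weights) + biases).

    weights[i][j] connects input i to unit j.
    """
    outputs = []
    for j in range(len(biases)):
        z = sum(inputs[i] * weights[i][j] for i in range(len(inputs))) + biases[j]
        outputs.append(max(0.0, z))  # relu
    return outputs

def dense_param_count(n_inputs, n_units):
    """Number of trainable values: one weight per connection plus one bias per unit."""
    return n_inputs * n_units + n_units

# Toy layer with 3 inputs and 2 units.
x = [1.0, 2.0, 3.0]
W = [[0.5, -1.0],
     [0.0,  0.5],
     [1.0, -1.0]]
b = [0.5, 0.0]
print(dense_relu(x, W, b))  # [4.0, 0.0]

# Sizes of the two layers used in the network above.
print(dense_param_count(28 * 28, 512))  # 401920
print(dense_param_count(512, 10))       # 5130
```

The second unit receives a negative sum (-3.0), so `relu` sets its output to 0.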

# Best practice
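At the output layer, the softmax function is recommended for multiclass classification because its output values are non-negative and sum to 1, so they can be read as class probabilities and used directly with the categorical cross-entropy loss. For binary classification, a two-unit softmax works the same way; a single `sigmoid` unit trained with binary cross-entropy is a common equivalent choice.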
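
The following plain-Python sketch shows why outputs that sum to 1 are convenient: the cross-entropy loss is simply the negative logarithm of the probability assigned to the true class. The scores in `logits` are made-up values for illustration, and the helper names are assumptions of this example rather than Keras functions.

```python
import math

def softmax(logits):
    """Turn raw scores into positive values that sum to 1."""
    peak = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot_target, probs):
    """Negative log of the probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs) if t > 0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

logits = [2.0, 1.0, 0.1]   # made-up raw scores for three classes
target = [1.0, 0.0, 0.0]   # one-hot label: the first class is correct
probs = softmax(logits)
loss = categorical_cross_entropy(target, probs)
print(probs, sum(probs), loss)

# With two classes, softmax reduces to a sigmoid of the score difference.
two_class = softmax([0.3, -1.2])
print(two_class[0], sigmoid(0.3 - (-1.2)))
```

The last two printed values agree, which is why a single sigmoid unit is often used for two-class problems.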

In [None]:
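import matplotlib.pyplot as plt
def plot_training_history(history):
    # Older Keras versions log accuracy as 'acc', newer ones as 'accuracy'
    acc_key = 'acc' if 'acc' in history.history else 'accuracy'
    #accuracy
    plt.plot(history.history[acc_key])
    plt.plot(history.history['val_' + acc_key])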
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()
    #loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

In [None]:
# Compile the network
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])

# Train the network
history = network.fit(train_images, train_labels, epochs=5, batch_size=128, 
                      verbose=1, validation_data=(test_images, test_labels))

# Plot the training results
plot_training_history(history)

# Task
Change the activation functions and check how the network performs. Note down the results:<br>

# Summary
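In this section we learned what activation functions are, which ones Keras supports by default, and how to set them for each layer through the `activation` argument.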

# Feedback
<a href = "http://goto/ml101_doc/Keras06">Feedback: Activation functions</a> <br>

# Navigation

<div>
<span> <h3 style="display:inline">&lt;&lt; Prev: <a href = "keras05.ipynb">Build a network</a></h3> </span>
<span style="float: right"><h3 style="display:inline">Next: <a href = "keras07.ipynb">Units</a> &gt;&gt; </h3></span>
</div>