### Activation Function
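An activation function determines how strongly a neuron responds to its weighted input. A network applies a nonlinear activation function to some of its layers; without one, stacked layers collapse into a single linear map. Commonly used functions include sigmoid, tanh, softmax, relu, and even linear (no activation).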

When we have a multiclass classification problem, the last layer should be softmax, which outputs one probability per class. When the problem is binary (or multi-label), we can use sigmoid instead. For image classification problems, relu is a common choice for the hidden layers.
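
To make the functions named above concrete, here is a minimal sketch in plain Python. The helper names (`sigmoid`, `relu`, `softmax`) are chosen for illustration and are not the Keras implementations; the example scores are made up.

```python
import math

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Pass positive values through and clip negatives to zero."""
    return max(0.0, z)

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    top = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]   # hypothetical raw outputs of a 3-class last layer
probs = softmax(scores)
print([round(p, 3) for p in probs])
print(sigmoid(0.0), relu(-3.0), math.tanh(0.0))
```

Softmax keeps the ordering of the scores, so the largest score gets the largest probability. That is why the predicted class is usually taken as the index of the maximum.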

### Loss Function
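When we build a neural network, it tries to predict an output as close as possible to the actual value. The loss (cost) function measures how far the predictions are from the actual values, and training tries to minimize it.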

For regression problems, common loss functions include mean squared error (MSE) and mean absolute error (MAE), among others:
`model.compile(optimizer='rmsprop', loss='mse')`

For classification problems, common loss functions are categorical cross-entropy (more than two classes, one-hot labels) and binary cross-entropy (two classes).
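
The losses named above can be written out in a few lines. This is a sketch in plain Python with illustrative function names and made-up numbers. The cross-entropy helpers work on a single sample, whereas a framework would typically average over a batch.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error over a list of targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error over a list of targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """-sum_k t_k * log(p_k) for one sample with a one-hot label."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

def binary_cross_entropy(label, prob, eps=1e-12):
    """-(y*log(p) + (1-y)*log(1-p)) for one sample with label 0 or 1."""
    prob = min(max(prob, eps), 1.0 - eps)  # keep log() away from 0
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))   # (0 + 0 + 4) / 3
print(mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))   # (0 + 0 + 2) / 3
print(categorical_cross_entropy([0, 1, 0], [0.1, 0.8, 0.1]))  # -log(0.8)
print(binary_cross_entropy(1, 0.5))            # log(2)
```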

### Algorithms to Minimize Error
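- Gradient Descent: take small steps downhill, like starting on top of a hill and climbing down. We have a point $x$, move by $\Delta h$ (a step against the gradient, scaled by the learning rate), update our position to $x + \Delta h$, and keep going until we hit the bottom. Plain gradient descent uses a fixed learning rate.

- Stochastic gradient descent (SGD): the same update, but the gradient is estimated from a single sample or a small batch instead of the whole dataset.

- Learning rate: both GD and SGD need a learning rate to scale each weight update. Optimizers that adapt it during training include:
    - RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam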
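
The two update rules in the list above can be sketched as follows. The example assumes a toy objective $f(x) = (x - 3)^2$ and a toy dataset $y = 2x$; the function names, learning rates, and step counts are illustrative choices, not values from any library.

```python
import random

def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly move against the gradient with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        delta_h = -learning_rate * grad(x)  # the step called delta h above
        x = x + delta_h
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)

def sgd_fit_slope(xs, ys, learning_rate=0.01, epochs=200, seed=0):
    """Fit y = w * x, updating w from one sample at a time."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)                 # visit samples in random order
        for i in indices:
            error = w * xs[i] - ys[i]        # prediction error on one sample
            w -= learning_rate * 2.0 * error * xs[i]  # gradient of error**2
    return w

w = sgd_fit_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(x_min, w)
```

Both loops use a fixed learning rate. The adaptive optimizers listed above change how large each step is during training, but the loop structure stays the same.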
    
### Dropout
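Dropout is a regularization technique for neural networks that helps prevent overfitting. During training it randomly sets a fraction of a layer's outputs to zero, which temporarily removes those units' connections.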
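
As a rough illustration, the sketch below implements "inverted" dropout in plain Python: units that survive are scaled up by `1 / (1 - rate)` so the expected activation stays the same, and nothing is dropped at inference time. The function name, the `rng` argument, and the example values are assumptions made for this example.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Zero each activation with probability `rate` during training."""
    if not training or rate == 0.0:
        return list(activations)      # inference: pass values through unchanged
    keep = 1.0 - rate
    # Survivors are divided by `keep` so the expected value is preserved.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)                # seeded so the run is repeatable
hidden = [1.0] * 10                   # hypothetical hidden-layer outputs
dropped = dropout(hidden, rate=0.5, rng=rng)
print(dropped)                        # a mix of 0.0 and 2.0
print(dropout(hidden, rate=0.5, rng=rng, training=False))
```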

### Epoch and Batch
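An epoch is one full pass of the model over the training dataset, so the number of epochs defines how many times the model will see the dataset and update itself. The batch size defines the number of samples per gradient update. The number of epochs is often set somewhere between 10 and 20 as a starting point, though the right value depends on the dataset and the model.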
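
The relationship between epochs, batch size, and the number of gradient updates can be sketched as below. The helper names and the toy dataset of 10 samples are illustrative, and the sketch assumes the final, smaller batch is kept rather than dropped.

```python
import math

def iterate_minibatches(samples, batch_size):
    """Yield consecutive slices of `samples` with at most `batch_size` items."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

def updates_per_epoch(n_samples, batch_size):
    """One gradient update per batch, counting a final partial batch."""
    return math.ceil(n_samples / batch_size)

samples = list(range(10))             # stand-in for 10 training examples
batches = list(iterate_minibatches(samples, batch_size=4))
print(batches)                        # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

epochs = 3
total_updates = updates_per_epoch(len(samples), 4) * epochs
print(total_updates)                  # 3 batches per epoch * 3 epochs = 9
```

With a batch size of 1, every sample triggers its own update, so training makes as many updates per epoch as there are samples.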

### Batch Normalization
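A technique that normalizes the data not only at the input but also at hidden layers: within each mini-batch, a layer's activations are rescaled to roughly zero mean and unit variance, then scaled and shifted by learned parameters.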
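
A minimal sketch of the training-time computation for a single feature is shown below. The names `gamma`, `beta`, and `eps` follow common convention but are assumptions here, and the sketch leaves out the running averages that a real layer keeps for use at inference time.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n   # population variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

activations = [2.0, 4.0, 6.0, 8.0]    # one hidden unit over a batch of 4
normalized = batch_norm(activations)
print(normalized)

# The normalized batch has mean close to 0 and variance close to 1.
norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
print(norm_mean, norm_var)
```

The small constant `eps` only guards against division by zero when a batch has no variance.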

In [2]:
from tensorflow.keras import backend
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras import utils
from sklearn.preprocessing import LabelEncoder
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data, iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

y_train_one_hot = utils.to_categorical(y_train)
y_test_one_hot = utils.to_categorical(y_test)

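# print(y_train_one_hot)

# 4 input features -> 16 hidden units (sigmoid) -> 3 class probabilities (softmax)
model = Sequential()
model.add(Dense(16, input_shape=(4,)))
model.add(Activation('sigmoid'))
model.add(Dense(3))
model.add(Activation('softmax'))
# Categorical cross-entropy matches the one-hot encoded labels
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=["accuracy"])
# batch_size=1 means one weight update per training sample
model.fit(X_train, y_train_one_hot, epochs=100, batch_size=1, verbose=0);
# Evaluate on the held-out 30% test split
loss, accuracy = model.evaluate(X_test, y_test_one_hot, verbose=0)
print("Accuracy = {:.2f}".format(accuracy))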

Accuracy = 0.98


In [3]:
from tensorflow.keras.layers import Lambda, Input
from tensorflow.keras import Model

import numpy as np

inp = Input(shape=(3,))
double = Lambda(lambda x: 2 * x)(inp)

model = Model(inputs=inp, outputs=double)

data = np.array([[5, 12, 1]])
print(model.predict(data))

[[10. 24.  2.]]


In [4]:
iris.target

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [5]:
y_train_one_hot

array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [0., 1., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 1., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [0., 0., 1.],
       [1., 0., 0.],
       [1., 0., 0.],
       [0., 0., 1.],
       [0., 0., 1.],
       [1., 0., 0.],
       [0., 0