# Keras Model Life-Cycle

1. Define Network. -> 2. Compile Network. -> 3. Fit Network. -> 4. Evaluate Network. -> 5. Make Predictions.

## Example Sequential Network

Network Topology:

    A multilayer perceptron that takes 2 inputs, has one hidden layer of 5 nodes, and an output layer with 1 node.
    2 -> [5] -> 1

The first layer defines the number of input attributes. The activation function of the last layer defines the model's output format.

**Output Layer Activation Examples**:

    - Regression: Linear activation function, or 'linear', with the number of neurons matching the number of outputs.
    - Binary Classification (2 class): Logistic activation function, or 'sigmoid', with one neuron in the output layer.
    - Multiclass Classification (>2 class): Softmax activation function, or 'softmax', with one output neuron per class value, assuming a one-hot encoded output pattern.
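
The three output activations listed above can be sketched in plain Python to show what each one does to the raw scores of the last layer. This is an illustrative sketch only, not how Keras implements them; the function names and the `scores` values are assumptions made for the example.

```python
import math

def linear(z):
    """Identity: passes each raw score through unchanged (regression)."""
    return list(z)

def sigmoid(z):
    """Logistic function: squashes one score into (0, 1) (binary classification)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(z):
    """Turns a list of scores into probabilities that sum to 1 (multiclass)."""
    shift = max(z)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]  # hypothetical raw outputs of a final layer

print(linear(scores))    # unchanged values
print(sigmoid(0.0))      # a score of 0 maps to probability 0.5
print(softmax(scores))   # three probabilities, largest for the largest score
```

The softmax output is the reason a multiclass network needs one neuron per class: each neuron contributes one score, and the scores are normalized together.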

**Model Compilation**

Compilation is an efficiency step. It transforms the simple sequence of layers that we defined into a highly efficient series of matrix transforms in a format intended to be executed on your GPU or CPU, depending on how Keras is configured. Compilation requires two things: the optimization algorithm used to train the network, and the loss function that the optimization algorithm minimizes.
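
To make the relationship between optimizer and loss concrete, here is a minimal sketch of gradient descent with momentum minimizing a mean squared error on a one-weight model. It is not Keras code: the toy data, the function names, and the hyperparameter values are all assumptions chosen for illustration, and it uses the full data set on every step rather than mini-batches.

```python
# Toy data for the relationship y = 2x (hypothetical, for illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

def mse(w):
    """Loss: mean squared error of the one-weight model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def mse_grad(w):
    """Derivative of the mean squared error with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def gradient_descent_momentum(w, learning_rate=0.01, momentum=0.9, steps=200):
    """Optimizer: repeatedly move w against the gradient of the loss."""
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - learning_rate * mse_grad(w)
        w += velocity
    return w

initial_loss = mse(0.0)
w_fit = gradient_descent_momentum(0.0)
final_loss = mse(w_fit)

print(initial_loss)  # loss before training, with w = 0
print(final_loss)    # loss after training, close to 0
print(w_fit)         # learned weight, close to 2
```

The loss only scores the current weights; the optimizer decides how to change them. That is why both must be supplied at compile time.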

**Example Loss Functions and Optimizers**

    Loss Functions
    - Regression: Mean Squared Error, or 'mean_squared_error'.
    - Binary Classification (2 class): Logarithmic Loss, also called cross entropy, or 'binary_crossentropy'.
    - Multiclass Classification (>2 class): Multiclass Logarithmic Loss, or 'categorical_crossentropy'.
    
    Optimization Functions
    - Stochastic Gradient Descent, or 'sgd', which requires tuning a learning rate and momentum.
    - Adam, or 'adam', which requires tuning a learning rate.
    - RMSprop, or 'rmsprop', which requires tuning a learning rate.
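
The three loss functions listed above can be written out in a few lines of plain Python. This is a sketch of the underlying formulas, not the Keras implementation; the function names mirror the Keras loss names for readability, and the `eps` clipping value is an assumption added to avoid taking the logarithm of zero.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Log loss for 0/1 targets and predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Log loss for one-hot target rows and rows of class probabilities."""
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        total += -sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(y_true)

mse_value = mean_squared_error([1.0, 2.0], [1.5, 2.0])
bce_value = binary_crossentropy([1, 0], [0.5, 0.5])
cce_value = categorical_crossentropy([[0, 1, 0]], [[0.2, 0.7, 0.1]])

print(mse_value)  # 0.125
print(bce_value)  # ln(2): a 50/50 guess on every sample
print(cce_value)  # -ln(0.7): only the true class's probability counts
```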
    

In [1]:
import warnings
warnings.filterwarnings('ignore')

## Keras Functional Models
The sequential API lets you create models layer by layer and is sufficient for most problems. It is limited in that it cannot express models that share layers or have multiple inputs or outputs. The functional API in Keras is an alternative way of creating models that offers more flexibility: it lets you define models with multiple inputs or outputs, as well as models that share layers.

### Defining Input
The input layer takes a shape argument, a tuple that gives the dimensionality of a single input sample. The mini-batch size is not part of this tuple; Keras leaves room for it by adding a leading dimension of unspecified size, shown as None in model summaries. When the input data is one-dimensional, such as for a Multilayer Perceptron, the shape is therefore written as a one-element tuple with a trailing comma, for example (2,) for two input features.

In [None]:
from keras.layers import Input
visible = Input(shape=(2,))
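
The trailing comma matters because of Python syntax, not Keras. The sketch below shows the difference and mimics how a batch dimension is prepended to a per-sample shape. The helper `with_batch_dimension` is hypothetical and written only for this illustration; it is not a Keras function.

```python
def with_batch_dimension(shape):
    """Prepend an unspecified (None) batch dimension to a per-sample shape."""
    return (None,) + tuple(shape)

not_a_tuple = (2)    # parentheses alone just group: this is the int 2
one_tuple = (2,)     # the trailing comma makes a one-element tuple

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)
print(with_batch_dimension(one_tuple))      # (None, 2)
print(with_batch_dimension((64, 64, 1)))    # (None, 64, 64, 1)
```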

### Connecting Layers
The layers in the model are connected pairwise. This is done by specifying where the input comes from when defining each new layer. A bracket or functional notation is used: after the layer is created, the layer that supplies its input is given in a second pair of parentheses.

In [None]:
from keras.layers import Input, Dense
visible = Input(shape=(2,))
hidden = Dense(2)(visible)
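
The double parentheses are ordinary Python: the first pair constructs a layer object, and the second pair calls that object on its input. The toy classes below are assumptions written for this illustration and are not the Keras implementation; they only track shapes and remember which object fed which.

```python
class ToyTensor:
    """Stand-in for a tensor: remembers its per-sample shape and its origin."""
    def __init__(self, shape, source=None):
        self.shape = tuple(shape)
        self.source = source  # (layer, incoming tensor), or None for an input

class ToyDense:
    """Stand-in for a dense layer: constructed first, then called on an input."""
    def __init__(self, units):
        self.units = units

    def __call__(self, incoming):
        # Calling the layer object is what wires it to the previous tensor.
        new_shape = incoming.shape[:-1] + (self.units,)
        return ToyTensor(new_shape, source=(self, incoming))

visible = ToyTensor(shape=(2,))

layer = ToyDense(5)        # step 1: create the layer
hidden = layer(visible)    # step 2: call it on the tensor that feeds it

one_line = ToyDense(5)(visible)  # the same two steps written together

print(hidden.shape)
print(hidden.source[1] is visible)
```

Because each result remembers where it came from, a model object can later walk back from the outputs to the inputs, which is why only those two ends need to be named.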

### Creating The Model 

Keras provides a Model class that you can use to create a model from your created layers. It only requires that you specify the input and output layers.

In [None]:
from keras.models import Model
from keras.layers import Input, Dense
visible = Input(shape=(2,))
hidden = Dense(2)(visible)
model = Model(inputs=visible, outputs=hidden)

## Standard Network Models

### Multilayer Perceptron: Binary Classification

Network Topology:

    10 -> [ [10] -> [20] -> [10] ] -> 1

In [2]:
from keras.models import Model
from keras.layers import Input, Dense

# Define Model
visible = Input(shape=(10,))
hidden1 = Dense(10, activation='relu')(visible) 
hidden2 = Dense(20, activation='relu')(hidden1) 
hidden3 = Dense(10, activation='relu')(hidden2) 
output = Dense(1, activation='sigmoid')(hidden3) 
model = Model(inputs=visible, outputs=output)

# summarize layers
model.summary()

Using TensorFlow backend.


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 10)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                110       
_________________________________________________________________
dense_2 (Dense)              (None, 20)                220       
_________________________________________________________________
dense_3 (Dense)              (None, 10)                210       
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 11        
Total params: 551
Trainable params: 551
Non-trainable params: 0
_________________________________________________________________
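
The parameter counts in the summary above can be checked by hand: a dense layer has one weight per input-unit pair plus one bias per unit. A short sketch, with `dense_params` being a helper name assumed for this example:

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return (n_inputs + 1) * n_units

# Input width followed by the width of each Dense layer in the model.
layer_sizes = [10, 10, 20, 10, 1]

counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)

print(counts)  # [110, 220, 210, 11]
print(total)   # 551
```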


### Convolutional Neural Network (CNN)

The model receives black and white 64 x 64 images as input, then has a sequence of two convolutional and pooling layers as feature extractors, followed by a fully connected layer to interpret the features and an output layer with a sigmoid activation for two-class predictions.

In [3]:
from keras.models import Model
from keras.layers import Input, Dense, Conv2D, MaxPooling2D

# Define Model
visible = Input(shape=(64,64,1))
conv1 = Conv2D(32, kernel_size=4, activation='relu')(visible)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
conv2 = Conv2D(16, kernel_size=4, activation='relu')(pool1) 
pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
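# Note: no Flatten layer precedes the Dense layers, so each Dense layer is
# applied along the last axis and the model outputs a 13 x 13 map per image
# rather than a single value (see the output shapes in the summary).
hidden1 = Dense(10, activation='relu')(pool2)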
output = Dense(1, activation='sigmoid')(hidden1) 
model = Model(inputs=visible, outputs=output)

# Summarize Layers 
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 64, 64, 1)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 61, 61, 32)        544       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 30, 30, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 27, 27, 16)        8208      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 13, 13, 16)        0         
_________________________________________________________________
dense_5 (Dense)              (None, 13, 13, 10)        170       
_________________________________________________________________
dense_6 (Dense)              (None, 13, 13, 1)         11        
Total para
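
The shapes and parameter counts in the summary above follow from simple arithmetic, sketched below. The helper names are assumptions made for this example. The formulas assume the Keras defaults used in the model: convolutions without padding and with stride 1, and pooling with a stride equal to the pool size. The last two lines are a hypothetical variant that is not in the summary: they show how many features a flattening step would pass to the first dense layer.

```python
def conv2d_valid(size, kernel):
    """Width/height after an unpadded, stride-1 convolution."""
    return size - kernel + 1

def max_pool(size, pool):
    """Width/height after non-overlapping pooling; leftover rows are dropped."""
    return size // pool

def conv2d_params(kernel, in_channels, filters):
    """One kernel x kernel x in_channels weight block plus a bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit."""
    return (n_inputs + 1) * n_units

sizes = [64]
sizes.append(conv2d_valid(sizes[-1], 4))  # conv2d_1
sizes.append(max_pool(sizes[-1], 2))      # max_pooling2d_1
sizes.append(conv2d_valid(sizes[-1], 4))  # conv2d_2
sizes.append(max_pool(sizes[-1], 2))      # max_pooling2d_2
print(sizes)  # [64, 61, 30, 27, 13]

print(conv2d_params(4, 1, 32))   # 544
print(conv2d_params(4, 32, 16))  # 8208
print(dense_params(16, 10))      # 170: Dense sees only the 16 channels
print(dense_params(10, 1))       # 11

# Hypothetical variant: flatten the 13 x 13 x 16 feature maps first.
flat_features = sizes[-1] * sizes[-1] * 16
print(flat_features)                    # 2704
print(dense_params(flat_features, 10))  # 27050
```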

### Recurrent Neural Network (RNN)

The model expects 100 time steps of one feature as input. The model has a single LSTM hidden layer to extract features from the sequence, followed by a fully connected layer to interpret the LSTM output, followed by an output layer for making binary predictions.

In [5]:
from keras.models import Model
from keras.layers import Input, Dense, LSTM

# Define Model
visible = Input(shape=(100, 1))
hidden1 = LSTM(10)(visible)
hidden2 = Dense(10, activation='relu')(hidden1)
output = Dense(1, activation='sigmoid')(hidden2)
model = Model(inputs=visible, outputs=output)

# Summarize Layers
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         (None, 100, 1)            0         
_________________________________________________________________
lstm_1 (LSTM)                (None, 10)                480       
_________________________________________________________________
dense_7 (Dense)              (None, 10)                110       
_________________________________________________________________
dense_8 (Dense)              (None, 1)                 11        
Total params: 601
Trainable params: 601
Non-trainable params: 0
_________________________________________________________________
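
The 480 parameters of the LSTM layer in the summary above can be reproduced by hand. An LSTM layer holds four weight sets (input, forget, and output gates plus the cell candidate), each with input weights, recurrent weights, and a bias. A short sketch, with helper names assumed for this example:

```python
def lstm_params(n_features, n_units):
    """Four weight sets, each with input weights, recurrent weights, and biases."""
    return 4 * (n_features + n_units + 1) * n_units

def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit."""
    return (n_inputs + 1) * n_units

counts = [
    lstm_params(1, 10),    # lstm_1: 1 feature per time step, 10 units
    dense_params(10, 10),  # dense_7
    dense_params(10, 1),   # dense_8
]
total = sum(counts)

print(counts)  # [480, 110, 11]
print(total)   # 601
```

The number of time steps (100) does not appear in the formula, because the same weights are reused at every step of the sequence.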
