
#Installing Keras
`pip3 install keras`

#Neural Network Models in Keras
Neural network models in Keras are defined as graphs of layers. Models in Keras can
be created using either the sequential or the functional API. The sequential API builds
a linear stack of layers, while the functional API can describe arbitrary graphs of layers.

Use the sequential API for simple models
built from simple layers and the functional API for complex models involving branches and
sharing of layers.

##Creating the Keras model
###Sequential API for creating the Keras model
In the sequential API, to create the empty model,

    model = Sequential()

Then you can add more layers to this model.

Alternatively, you can pass all the layers as a list to the constructor,

    model = Sequential([Dense(10, input_shape=(256,)),
                        Activation('tanh'),
                        Dense(10),
                        Activation('softmax')])
                        
###Functional API for creating the Keras model
In the functional API, the model is created as an instance of the `Model` class, which takes `inputs` and `outputs` parameters.

    model = Model(inputs=tensor1, outputs=tensor2)
    
`tensor1` and `tensor2` are either tensors or objects that can be treated like tensors, for example, Keras `layer` objects.

If there is more than one input or output tensor, pass them as lists,

    model = Model(inputs=[i1, i2, i3], outputs=[o1, o2, o3])
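
As a minimal sketch of the multi-input, multi-output case, the snippet below builds a small functional model. The layer sizes, the two output heads, and the use of `concatenate` to join the inputs are illustrative assumptions, not part of the original text.

```python
from keras.layers import Input, Dense, concatenate
from keras.models import Model

# Two hypothetical inputs with different feature sizes
i1 = Input(shape=(16,))
i2 = Input(shape=(8,))

# Join the inputs and pass them through one shared hidden layer
merged = concatenate([i1, i2])
hidden = Dense(32, activation='relu')(merged)

# Two hypothetical output heads built on the shared hidden layer
o1 = Dense(1, activation='sigmoid')(hidden)
o2 = Dense(4, activation='softmax')(hidden)

# Inputs and outputs are passed as lists
model = Model(inputs=[i1, i2], outputs=[o1, o2])
```

This kind of branching graph is the case where the functional API is needed, since a sequential model only supports a single chain of layers.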

#Keras Layers
##Keras core layers
Layer name | Description
--- | ---
`Dense` | This is a simple fully connected neural network layer. This layer produces the output of the following function: **activation((inputs x weights)+bias)** where *activation* refers to the activation function passed to the layer, which is `None` by default.
`Activation` | This layer applies the specified activation function to the output. This layer produces the output of the following function: **activation(inputs)** where activation refers to the *activation* function passed to the layer. The following activation functions are available to instantiate this layer: `softmax, elu, selu, softplus, softsign, relu, tanh, sigmoid, hard_sigmoid,` and `linear`
`Dropout` | This layer applies the dropout regularization to the inputs at a specified dropout rate.
`Flatten` | This layer flattens the input, that is, for a three-dimensional input, it flattens and produces a one-dimensional output.
`Reshape` | This layer converts the input to the specified shape.
`Permute` | This layer reorders the input dimensions as per the specified pattern.
`RepeatVector` | This layer repeats the input by the given number of times. Thus, if the input is a 2D tensor of shape (#samples, #features) and the layer is given n times to repeat, then the output will be a 3D tensor of shape (#samples, n, #features).
`Lambda` | This layer wraps the provided function as a layer. Thus, the inputs are passed through the custom function provided to produce the outputs. This layer provides ultimate extensibility to Keras users to add their own custom functions as layers.
`ActivityRegularization` | This layer applies L1 or L2 regularization, or a combination of both, to its inputs. This layer is applied to the output of an activation layer or to the output of a layer that has an activation function.
`Masking` | This layer masks or skips those time steps in the input tensor where all the values in the input tensor are equal to the mask value provided as an argument to the layer.
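
To make the `Dense` formula and the `RepeatVector` shape rule concrete, here is a NumPy sketch. It is an analogue of what the table describes, not the Keras implementation; the helper name `dense_forward` and the example values are assumptions made for illustration.

```python
import numpy as np

def dense_forward(inputs, weights, bias, activation=None):
    """Compute activation((inputs x weights) + bias); no activation if None."""
    z = inputs @ weights + bias
    return z if activation is None else activation(z)

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])            # shape (#samples, #features) = (2, 2)
w = np.array([[1.0, 0.0, -1.0],
              [0.5, 1.0, 0.0]])       # shape (#features, #units) = (2, 3)
b = np.array([0.0, 1.0, 0.0])         # one bias per unit

linear_out = dense_forward(x, w, b)                    # activation is None
tanh_out = dense_forward(x, w, b, activation=np.tanh)  # activation applied

# RepeatVector analogue: (#samples, #features) -> (#samples, n, #features)
n = 3
repeated = np.repeat(x[:, np.newaxis, :], n, axis=1)
print(linear_out.shape, repeated.shape)
```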

##Keras convolutional layers
Layer Name | Description
--- | ---
`Conv1D` | This layer applies convolutions over a single spatial or temporal dimension to the inputs.
`Conv2D` | This layer applies two-dimensional convolutions to the inputs.
`SeparableConv2D` | This layer applies a depth-wise spatial convolution on each input channel, followed by a pointwise convolution that mixes together the resulting output channels.
`Conv2DTranspose` | This layer reverts the shape of convolutions to the shape of the inputs that produced those convolutions.
`Conv3D` | This layer applies three-dimensional convolutions to the inputs.
`Cropping1D` | This layer crops the input data along the temporal dimension.
`Cropping2D` | This layer crops the input data along the spatial dimensions, such as width and height in the case of an image.
`Cropping3D` | This layer crops the input data along the spatio-temporal dimensions, that is, all three dimensions.
`UpSampling1D` | This layer repeats the input data by specified times along the time axis.
`UpSampling2D` | This layer repeats the row and column dimensions of the input data by specified times along the two dimensions.
`UpSampling3D` | This layer repeats the three dimensions of the input data by specified times along the three dimensions.
`ZeroPadding1D` | This layer adds zeros to the beginning and end of the time dimension.
`ZeroPadding2D` | This layer adds rows and columns of zeros to the top, bottom, left, or right of a 2D tensor.
`ZeroPadding3D` | This layer adds zeros to the three dimensions of a 3D tensor.
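
The one-dimensional upsampling, padding, and cropping operations can be sketched with NumPy as follows. This is only an analogue of the behaviour described in the table, assuming an input laid out as (batch, steps, features); the repeat factor and padding width are arbitrary example values.

```python
import numpy as np

# One sample, three time steps, two features per step
x = np.arange(6, dtype=np.float32).reshape(1, 3, 2)

# UpSampling1D analogue: repeat each time step twice along the time axis
upsampled = np.repeat(x, 2, axis=1)

# ZeroPadding1D analogue: one step of zeros at the beginning and the end
padded = np.pad(x, ((0, 0), (1, 1), (0, 0)), mode='constant')

# Cropping1D analogue: drop one step from each end of the time axis
cropped = padded[:, 1:-1, :]

print(upsampled.shape, padded.shape, cropped.shape)
```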

##Keras pooling layers
Layer Name | Description
--- | ---
`MaxPooling1D` | This layer implements the max pooling operation for one-dimensional input data.
`MaxPooling2D` | This layer implements the max pooling operation for two-dimensional input data.
`MaxPooling3D` |  This layer implements the max pooling operation for three-dimensional input data.
`AveragePooling1D` | This layer implements the average pooling operation for one-dimensional input data.
`AveragePooling2D` | This layer implements the average pooling operation for two-dimensional input data.
`AveragePooling3D` | This layer implements the average pooling operation for three-dimensional input data.
`GlobalMaxPooling1D` | This layer implements the global max pooling operation for one-dimensional input data.
`GlobalAveragePooling1D` | This layer implements the global average pooling operation for one-dimensional input data.
`GlobalMaxPooling2D` |  This layer implements the global max pooling operation for two-dimensional input data.
`GlobalAveragePooling2D` | This layer implements the global average pooling operation for two-dimensional input data.
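
The difference between windowed and global pooling can be illustrated with a small NumPy sketch. It assumes non-overlapping windows (stride equal to the pool size) and a number of steps divisible by the pool size; these are simplifying assumptions for the example, not a statement about the Keras defaults.

```python
import numpy as np

# One sample, six time steps, one feature: shape (batch, steps, features)
x = np.array([[[1.0], [3.0], [2.0], [8.0], [5.0], [4.0]]])

pool_size = 2
batch, steps, features = x.shape

# Group the time axis into non-overlapping windows of length pool_size
windows = x.reshape(batch, steps // pool_size, pool_size, features)

max_pooled = windows.max(axis=2)    # max within each window
avg_pooled = windows.mean(axis=2)   # mean within each window

# Global pooling collapses the whole time axis into a single value per feature
global_max = x.max(axis=1)
global_avg = x.mean(axis=1)
```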

##Keras locally-connected layers
Layer Name | Description
--- | ---
`LocallyConnected1D` | This layer applies convolutions over a single spatial or temporal dimension to the inputs, by applying a different set of filters at each different patch of the input, thus not sharing the weights.
`LocallyConnected2D` | This layer applies convolutions over two dimensions to the inputs, by applying a different set of filters at each different patch of the input, thus not sharing the weights.

##Keras recurrent layers
Layer Name | Description
--- | ---
`SimpleRNN` | This layer implements a fully connected recurrent neural network.
`GRU` | This layer implements a gated recurrent unit network.
`LSTM` | This layer implements a long short-term memory network.

##Keras embedding layers
Layer Name | Description
--- | ---
`Embedding` | This layer takes a 2D tensor of shape (batch_size, sequence_length) consisting of indexes, and produces a tensor consisting of dense vectors of shape (batch_size, sequence_length, output_dim).
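
Conceptually, an embedding is a table lookup. The NumPy sketch below shows the shape transformation described above; the vocabulary size, `output_dim`, and the randomly initialized table are assumptions for illustration (in Keras the table is a trainable weight).

```python
import numpy as np

vocab_size, output_dim = 10, 4
rng = np.random.default_rng(0)

# Hypothetical embedding table: one dense vector per index
embedding_matrix = rng.normal(size=(vocab_size, output_dim))

# Indexes of shape (batch_size, sequence_length) = (2, 3)
indexes = np.array([[1, 5, 2],
                    [7, 0, 0]])

# Lookup yields shape (batch_size, sequence_length, output_dim)
vectors = embedding_matrix[indexes]
print(vectors.shape)
```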

##Keras merge layers
Merge layers combine two or more input tensors and produce a single output tensor by
applying the specific operation that each layer represents:

Layer Name | Description
--- | ---
`Add` | This layer computes the element-wise addition of input tensors.
`Multiply` | This layer computes the element-wise multiplication of input tensors.
`Average` | This layer computes the element-wise average of input tensors.
`Maximum` | This layer computes the element-wise maximum of input tensors.
`Concatenate` | This layer concatenates the input tensors along a specified axis.
`Dot` | This layer computes the dot product between samples in two input tensors.
`add`, `multiply`, `average`, `maximum`, `concatenate`, and `dot` | These functions represent the functional interface to the respective merge layers described in this table.
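
The sketch below contrasts a merge layer class with its functional counterpart. The input sizes and the constant test arrays are assumptions chosen for illustration; both `Add()` and `add` are expected to produce the same element-wise sum.

```python
import numpy as np
from keras.layers import Input, Add, Concatenate, add
from keras.models import Model

a = Input(shape=(8,))
b = Input(shape=(8,))

summed_layer = Add()([a, b])            # merge layer used as a class
summed_fn = add([a, b])                 # functional interface, same operation
joined = Concatenate(axis=-1)([a, b])   # concatenation along the last axis

model = Model(inputs=[a, b], outputs=[summed_layer, summed_fn, joined])

# Hypothetical input data: a batch of ones and a batch of twos
x1 = np.ones((2, 8), dtype='float32')
x2 = np.full((2, 8), 2.0, dtype='float32')
out_layer, out_fn, out_joined = model.predict([x1, x2], verbose=0)
```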

##Keras advanced activation layers
Layer Name | Description
--- | ---
`LeakyReLU` | This layer computes the leaky version of the ReLU activation function.
`PReLU` | This layer computes the parametric ReLU activation function.
`ELU` | This layer computes the exponential linear unit activation function.
`ThresholdedReLU` | This layer computes the thresholded version of the ReLU activation function.

##Keras normalization layers
Layer name | Description
--- | ---
`BatchNormalization` | This layer normalizes the outputs of the previous layer at each batch, such that the output of this layer is approximated to have a mean close to zero and a standard deviation close to 1.
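
The core of the normalization step can be sketched in NumPy. This deliberately leaves out the learned scale and shift parameters and the moving averages used at inference time, and the `epsilon` value is an assumed constant for numerical stability.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of activations with mean 5 and standard deviation 3
batch = rng.normal(loc=5.0, scale=3.0, size=(64, 4))

epsilon = 1e-3  # assumed small constant to avoid division by zero

# Per-feature statistics computed over the batch axis
mean = batch.mean(axis=0)
var = batch.var(axis=0)

# After normalization each feature has mean close to 0 and std close to 1
normalized = (batch - mean) / np.sqrt(var + epsilon)
```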

##Keras noise layers
Noise layers can be added to the model to prevent overfitting by adding noise; they are also
known as regularization layers. These layers serve the same purpose as
the `Dropout()` and `ActivityRegularization()` layers in the core layers section.

Layer name | Description
--- | ---
`GaussianNoise` | This layer applies additive zero-centered Gaussian noise to the inputs.
`GaussianDropout` | This layer applies multiplicative one-centered Gaussian noise to the inputs.
`AlphaDropout` | This layer drops a certain percentage of inputs, such that the mean and variance of the outputs after the dropout match closely with the mean and variance of the inputs.
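
The difference between additive zero-centered noise and multiplicative one-centered noise can be sketched in NumPy. The `stddev` and `rate` values are arbitrary examples, and the snippet only mimics the training-time behaviour described in the table.

```python
import numpy as np

rng = np.random.default_rng(42)
x = np.ones((10000, 1))  # hypothetical inputs, all equal to 1

# GaussianNoise analogue: add noise drawn from N(0, stddev^2)
stddev = 0.1
additive = x + rng.normal(0.0, stddev, size=x.shape)

# GaussianDropout analogue: multiply by noise drawn from N(1, rate / (1 - rate))
rate = 0.2
mult_std = np.sqrt(rate / (1.0 - rate))
multiplicative = x * rng.normal(1.0, mult_std, size=x.shape)

# In both cases the mean of the inputs is preserved on average
print(additive.mean(), multiplicative.mean())
```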

#Adding Layers to the Keras Model
##Sequential API to add layers to the Keras model
Use `model.add()` to add layers

    model = Sequential()
    model.add(Dense(10, input_shape=(256,)))
    model.add(Activation('tanh'))
    model.add(Dense(10))
    model.add(Activation('softmax'))
    
##Functional API to add layers to the Keras Model
    
    inputs = Input(shape=(64,))
    hidden = Dense(10)(inputs)
    hidden = Activation('tanh')(hidden)
    hidden = Dense(10)(hidden)
    output = Activation('tanh')(hidden)
    model = Model(inputs=inputs, outputs=output)

##Compiling the Keras model
After the network is built, the `model.compile()` method must be called before the model can be trained or evaluated.

`compile(self, optimizer, loss, metrics=None, sample_weight_mode=None)`

* `optimizer`: an optimizer name or instance, for example
  * `SGD, RMSprop, Adagrad, Adadelta, Adam, Adamax, Nadam`
* `loss`: a loss function name or callable, for example
  * `mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, mean_squared_logarithmic_error, squared_hinge, hinge, categorical_hinge, categorical_crossentropy, sparse_categorical_crossentropy, binary_crossentropy, poisson, cosine_proximity`
* `metrics`: a list of metrics to report during training and evaluation, for example
  * `binary_accuracy, categorical_accuracy, sparse_categorical_accuracy, top_k_categorical_accuracy, sparse_top_k_categorical_accuracy`

##Training the Keras model
The `fit` method has the following signature:

`fit(self, x, y, batch_size=32, epochs=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)`

For example, 

`model.fit(x_data, y_labels)`

##Predicting with the Keras model
For prediction,

`model.predict()`

The `predict` method has the signature `predict(self, x, batch_size=32, verbose=0)`

For evaluation,

`model.evaluate()`

The `evaluate` method has the signature `evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)`
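
The compile, fit, predict, and evaluate steps can be put together in a small end-to-end sketch. The random data, layer sizes, and hyperparameters below are assumptions chosen only to keep the example fast; they are not meant to produce a useful model.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Hypothetical random data: 64 samples, 8 features, 3 one-hot classes
rng = np.random.RandomState(0)
x_data = rng.rand(64, 8).astype('float32')
y_labels = np.eye(3)[rng.randint(0, 3, size=64)].astype('float32')

# Build a small sequential model
model = Sequential()
model.add(Dense(16, activation='tanh', input_shape=(8,)))
model.add(Dense(3, activation='softmax'))

# Compile with an optimizer, a loss, and a list of metrics
model.compile(optimizer='sgd',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train for a single epoch, then predict and evaluate on the same data
model.fit(x_data, y_labels, batch_size=32, epochs=1, verbose=0)
predictions = model.predict(x_data, batch_size=32, verbose=0)
scores = model.evaluate(x_data, y_labels, batch_size=32, verbose=0)
```

With one metric in the `metrics` list, `scores` holds the loss followed by that metric, and each row of `predictions` is a softmax distribution over the classes.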

##Additional modules in Keras
* `preprocessing`
* `datasets`
* `initializers`, provides functions to set the initial random weight parameters of layers, such as `Zeros, Ones, Constant, RandomNormal, RandomUniform, TruncatedNormal, VarianceScaling, Orthogonal, Identity, lecun_normal, lecun_uniform, glorot_normal, glorot_uniform, he_normal,` and `he_uniform`.
* `models`, provides several functions to restore model architectures and weights, such as `model_from_json, model_from_yaml,` and `load_model`. Also, `model.to_yaml()` and `model.to_json()` are used to save model architectures.
* `applications`, provides pre-built and pre-trained models, such as Xception, VGG16, VGG19, ResNet50, InceptionV3, InceptionResNetV2, and MobileNet.
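
As a sketch of saving and restoring an architecture, the snippet below round-trips a small model through JSON. The model itself is a hypothetical example; note that `to_json()` stores only the architecture, so the restored model starts with freshly initialized weights.

```python
from keras.layers import Input, Dense
from keras.models import Model, model_from_json

# Hypothetical small model: 4 inputs, 2 softmax outputs
inputs = Input(shape=(4,))
outputs = Dense(2, activation='softmax')(inputs)
model = Model(inputs=inputs, outputs=outputs)

# Serialize the architecture (not the weights) to a JSON string
architecture = model.to_json()

# Rebuild a model with the same architecture from the JSON string
restored = model_from_json(architecture)
```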

##Keras sequential model example for MNIST dataset

    # import the keras modules
    import keras
    from keras.datasets import mnist
    from keras.models import Sequential
    from keras.layers import Dense, Dropout
    from keras.optimizers import SGD
    from keras import utils
    import numpy as np

Using TensorFlow backend.


    # define parameters
    batch_size = 100
    n_inputs = 28*28
    n_classes = 10
    n_epochs = 10

    # get data
    (x_train, y_train), (x_test, y_test) = mnist.load_data()

    # reshape 28x28 inputs to a row vector of 784 pixels
    x_train = x_train.reshape(60000, n_inputs)
    x_test = x_test.reshape(10000, n_inputs)

    # convert input values to float32
    x_train = x_train.astype(np.float32)
    x_test = x_test.astype(np.float32)

    # normalization
    x_train /= 255.
    x_test /= 255.

    # convert output to one-hot
    y_train = utils.to_categorical(y_train, n_classes)
    y_test = utils.to_categorical(y_test, n_classes)

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
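
The one-hot conversion in the cell above can be illustrated with a NumPy analogue. This is a sketch of the idea behind `to_categorical`, not its implementation; the four example labels and three classes are assumptions.

```python
import numpy as np

# Hypothetical integer class labels for four samples
labels = np.array([0, 2, 1, 2])
n_classes = 3

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes, dtype=np.float32)[labels]
print(one_hot)
```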


    # sequential model
    model = Sequential()
    # the first layer has to specify the dimensions of the input vector
    model.add(Dense(units=128, activation='sigmoid', input_shape=(n_inputs,)))
    # add dropout layers to reduce overfitting
    model.add(Dropout(0.1))
    model.add(Dense(units=128, activation='sigmoid'))
    model.add(Dropout(0.1))
    # the output layer must have one neuron per class
    model.add(Dense(units=n_classes, activation='softmax'))

    # print the summary
    model.summary()

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    dense_1 (Dense)              (None, 128)               100480    
    _________________________________________________________________
    dropout_1 (Dropout)          (None, 128)               0         
    _________________________________________________________________
    dense_2 (Dense)              (None, 128)               16512     
    _________________________________________________________________
    dropout_2 (Dropout)          (None, 128)               0         
    _________________________________________________________________
    dense_3 (Dense)              (None, 10)                1290      
    Total params: 118,282
    Trainable params: 118,282
    Non-trainable params: 0
    _________________________________________________________________
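
The parameter counts in the summary can be checked by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. The helper name `dense_params` below is introduced only for this sketch.

```python
def dense_params(n_in, n_units):
    """Weights (n_in * n_units) plus one bias per unit."""
    return n_in * n_units + n_units

# Layer widths of the MNIST model: 784 inputs -> 128 -> 128 -> 10
layer_sizes = [784, 128, 128, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)  # [100480, 16512, 1290] 118282
```

The `Dropout` layers contribute no parameters, which is why they show 0 in the summary.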


    # compile model
    model.compile(loss='categorical_crossentropy',
                  optimizer=SGD(),
                  metrics=['accuracy'])

    # train model
    model.fit(x_train, y_train, batch_size=batch_size, epochs=n_epochs)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7fe6d94733c8>

    # evaluate model
    scores = model.evaluate(x_test, y_test)
    print('\n loss:', scores[0])
    print('\n accuracy:', scores[1])


 loss: 0.8117022010803223

 accuracy: 0.802
