# Background
**Keras** is a high-level API to build and train deep learning models. It's used for fast prototyping, advanced research, and production, with three key advantages:
- **User friendly**: Keras has a simple, consistent interface optimized for common use cases. It provides clear and actionable feedback for user errors.
- **Modular and composable**: Keras models are made by connecting configurable building blocks together, with few restrictions.
- **Easy to extend**: Write custom building blocks to express new ideas for research. Create new layers and loss functions, and develop state-of-the-art models.

References:
- https://keras.io/
- https://www.tensorflow.org/guide/keras

**Note**: `tf.keras` is TensorFlow's implementation of the [Keras API specification](https://keras.io/). It is a high-level API to build and train models that includes first-class support for TensorFlow-specific functionality, such as eager execution, `tf.data` pipelines, and Estimators. `tf.keras` makes TensorFlow easier to use without sacrificing flexibility and performance.

To get started, import tf.keras as part of your TensorFlow program setup:

In [1]:
import tensorflow as tf
from tensorflow import keras


`tf.keras` can run any Keras-compatible code, but keep in mind:
- The `tf.keras` version in the latest TensorFlow release might not be the same as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
- When saving a model's weights, `tf.keras` defaults to the checkpoint format. Pass `save_format='h5'` to use HDF5.

In [2]:
print(tf.__version__)
print(tf.keras.__version__)  # latest standalone Keras on PyPI when this was written: 2.2.2

1.10.0
2.1.6-tf
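
Because the bundled version string carries a `-tf` suffix, comparing it with a PyPI release number takes a little parsing. The sketch below is plain Python and does not call TensorFlow; `parse_keras_version` is a hypothetical helper, and the two version strings are simply the ones that appear in this section.

```python
def parse_keras_version(version):
    """Turn a string such as '2.1.6-tf' into (2, 1, 6) for numeric comparison."""
    numeric_part = version.split("-")[0]  # drop a suffix such as '-tf'
    return tuple(int(piece) for piece in numeric_part.split("."))


bundled = parse_keras_version("2.1.6-tf")  # string printed by the cell above
pypi = parse_keras_version("2.2.2")        # release number from the cell's comment
bundled_is_older = bundled < pypi          # tuples compare element by element
print(bundled, pypi, bundled_is_older)
```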


## Build a simple model

### Sequential model
In Keras, you assemble layers to build models. A model is (usually) a graph of layers. The most common type of model is a stack of layers: the `tf.keras.Sequential` model.
To build a simple, fully-connected network (i.e. multi-layer perceptron):

In [3]:
model = keras.Sequential()
# Adds a densely-connected layer with 64 units to the model:
model.add(keras.layers.Dense(64, activation='relu'))
# Add another:
model.add(keras.layers.Dense(64, activation='relu'))
# Add a softmax layer with 10 output units:
model.add(keras.layers.Dense(10, activation='softmax'))
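
To make the two activation names used above concrete, here is a minimal plain-Python sketch of what `relu` and `softmax` compute on a single vector. It is an illustration only, not the TensorFlow implementation, and the input values are made up.

```python
import math


def relu(values):
    """Replace negative entries with zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Map arbitrary scores to positive numbers that sum to one."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


hidden = relu([-1.5, 0.0, 2.0])
probs = softmax([1.0, 2.0, 3.0])
print(hidden, probs, sum(probs))
```

The softmax output can be read as one probability per class, which is why the last layer above has 10 units for a 10-class problem.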

### Configure the layers
There are many `tf.keras.layers` available with some common constructor parameters:
- **activation**: Set the activation function for the layer. This parameter is specified by the name of a built-in function or as a callable object. By default, no activation is applied.
- **kernel_initializer and bias_initializer**: The initialization schemes that create the layer's weights (kernel and bias). This parameter is a name or a callable object. The kernel defaults to the "Glorot uniform" initializer, and the bias defaults to zeros.
- **kernel_regularizer and bias_regularizer**: The regularization schemes applied to the layer's weights (kernel and bias), such as L1 or L2 regularization. By default, no regularization is applied.

The following instantiates `tf.keras.layers.Dense` layers using constructor arguments:

```python
# Create a sigmoid layer:
keras.layers.Dense(64, activation='sigmoid')
# Or:
keras.layers.Dense(64, activation=tf.sigmoid)

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
keras.layers.Dense(64, kernel_regularizer=keras.regularizers.l1(0.01))
# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
keras.layers.Dense(64, bias_regularizer=keras.regularizers.l2(0.01))

# A linear layer with a kernel initialized to a random orthogonal matrix:
keras.layers.Dense(64, kernel_initializer='orthogonal')
# A linear layer with a bias vector initialized to 2.0s:
keras.layers.Dense(64, bias_initializer=keras.initializers.constant(2.0))
```
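
The L1 and L2 regularizers shown above each add a penalty term to the loss. As a rough sketch of the arithmetic (plain Python, with a made-up weight list standing in for a flattened kernel or bias), the L1 penalty is the factor times the sum of absolute values, and the L2 penalty is the factor times the sum of squares:

```python
def l1_penalty(weights, factor):
    """factor * sum(|w|): pushes weights toward exactly zero."""
    return factor * sum(abs(w) for w in weights)


def l2_penalty(weights, factor):
    """factor * sum(w^2): penalizes large weights more strongly."""
    return factor * sum(w * w for w in weights)


weights = [0.5, -1.0, 2.0]      # hypothetical weights, for illustration only
l1 = l1_penalty(weights, 0.01)  # 0.01 * (0.5 + 1.0 + 2.0)
l2 = l2_penalty(weights, 0.01)  # 0.01 * (0.25 + 1.0 + 4.0)
print(l1, l2)
```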

In [4]:
model2 = keras.Sequential()
# Create a sigmoid layer:
model2.add(keras.layers.Dense(64, activation='sigmoid'))
# Or:
model2.add(keras.layers.Dense(64, activation=tf.sigmoid))

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
model2.add(keras.layers.Dense(64, kernel_regularizer=keras.regularizers.l1(0.01)))
# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
model2.add(keras.layers.Dense(64, bias_regularizer=keras.regularizers.l2(0.01)))

# A linear layer with a kernel initialized to a random orthogonal matrix:
model2.add(keras.layers.Dense(64, kernel_initializer='orthogonal'))
# A linear layer with a bias vector initialized to 2.0s:
model2.add(keras.layers.Dense(64, bias_initializer=keras.initializers.constant(2.0)))

for layer in model2.layers:
    print(layer.name, layer.inbound_nodes, layer.outbound_nodes)

# print(model2.summary())

dense_3 [] []
dense_4 [] []
dense_5 [] []
dense_6 [] []
dense_7 [] []
dense_8 [] []
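
Most of the layers above do not set `kernel_initializer`, so their kernels fall back to the Glorot uniform default described earlier. That scheme draws values uniformly from `[-limit, limit]` with `limit = sqrt(6 / (fan_in + fan_out))`. The sketch below reproduces the arithmetic in plain Python; the 32-input, 64-unit shape is an assumption chosen for illustration, and `random.Random` stands in for TensorFlow's own random number generator.

```python
import math
import random


def glorot_uniform_limit(fan_in, fan_out):
    """Half-width of the uniform range used by Glorot (Xavier) initialization."""
    return math.sqrt(6.0 / (fan_in + fan_out))


fan_in, fan_out = 32, 64  # assumed layer shape, for illustration only
limit = glorot_uniform_limit(fan_in, fan_out)

rng = random.Random(0)    # seeded so the sketch is repeatable
kernel = [[rng.uniform(-limit, limit) for _ in range(fan_out)]
          for _ in range(fan_in)]
print(limit, len(kernel), len(kernel[0]))
```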


## Train and evaluate
### Set up training

After the model is constructed, configure its learning process by calling the `compile` method:

In [8]:
model.compile(optimizer=tf.train.AdamOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

`tf.keras.Model.compile` takes three important arguments:
- **optimizer**: This object specifies the training procedure. Pass it optimizer instances from the `tf.train` module, such as `AdamOptimizer`, `RMSPropOptimizer`, or `GradientDescentOptimizer`.
- **loss**: The function to minimize during optimization. Common choices include mean square error (`mse`), `categorical_crossentropy`, and `binary_crossentropy`. Loss functions are specified by name or by passing a callable object from the `tf.keras.losses` module.
- **metrics**: Used to monitor training. These are string names or callables from the `tf.keras.metrics` module.

The following shows a few examples of configuring a model for training:

```python
# Configure a model for mean-squared error regression.
model.compile(optimizer = tf.train.AdamOptimizer(0.01),
              loss = 'mse', # mean squared error
              metrics = ['mae']) # mean absolute error
# Configure a model for categorical classification.
model.compile(optimizer = tf.train.RMSPropOptimizer(0.01),
              loss = keras.losses.categorical_crossentropy,
              metrics = [keras.metrics.categorical_accuracy])
```
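
For intuition about the loss and metric names used above, here is a plain-Python sketch of what `mse`, `mae`, and `categorical_crossentropy` compute for one sample. It assumes the prediction is already a valid probability vector and skips the clipping and batching that a real implementation performs; the sample values are made up.

```python
import math


def mse(y_true, y_pred):
    """Mean of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def categorical_crossentropy(y_true, y_pred):
    """-sum(t * log(p)); only classes with a nonzero target contribute."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


y_true = [0.0, 1.0, 0.0]    # one-hot target: the sample belongs to class 1
y_pred = [0.25, 0.5, 0.25]  # hypothetical softmax output
mse_value = mse(y_true, y_pred)
mae_value = mae(y_true, y_pred)
ce_value = categorical_crossentropy(y_true, y_pred)
print(mse_value, mae_value, ce_value)
```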

In [9]:
print(model.summary())

ValueError: This model has never been called, thus its weights have not yet been created, so no summary can be displayed. Build the model first (e.g. by calling it on some data).

### Input Numpy data
For small datasets, use in-memory [`NumPy`](https://www.numpy.org/) arrays to train and evaluate a model. The model is "fit" to the training data using the `fit` method:

In [10]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

model.fit(data, labels, epochs=10, batch_size=32)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x2d88a26ffd0>

In [11]:
print(model.summary())

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 64)                2112      
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
Total params: 6,922
Trainable params: 6,922
Non-trainable params: 0
_________________________________________________________________
None
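
The `Param #` column above can be checked by hand: a dense layer with `n` inputs and `m` units has `n * m` kernel weights plus `m` biases. A small sketch, where `dense_params` is a hypothetical helper and the layer sizes are taken from the model and the 32-feature input used above:

```python
def dense_params(n_inputs, units):
    """Kernel weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units


layer_sizes = [32, 64, 64, 10]  # input features, then units of each Dense layer
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)
print(counts, total)
```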


`tf.keras.Model.fit` takes three important arguments:
- `epochs`: Training is structured into epochs. An epoch is one iteration over the entire input data (this is done in smaller batches).
- `batch_size`: When passed NumPy data, the model slices the data into smaller batches and iterates over these batches during training. This integer specifies the size of each batch. Be aware that the last batch may be smaller if the total number of samples is not divisible by the batch size.
- `validation_data`: When prototyping a model, you want to easily monitor its performance on some validation data. Passing this argument, a tuple of inputs and labels, allows the model to display the loss and metrics in inference mode for the passed data at the end of each epoch.
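
As a quick check of the batching arithmetic described above, the following plain-Python sketch slices 1000 samples into batches of 32, the same numbers used in the earlier `fit` call. It only counts indices and does not touch TensorFlow.

```python
import math

n_samples, batch_size = 1000, 32

# Number of batches needed to see every sample once.
batches_per_epoch = math.ceil(n_samples / batch_size)

# Size of each batch; only the last one can be smaller.
batch_sizes = [min(batch_size, n_samples - start)
               for start in range(0, n_samples, batch_size)]
last_batch = batch_sizes[-1]
print(batches_per_epoch, last_batch)
```

Since 1000 is not divisible by 32, the final batch holds only the 8 leftover samples.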

Here's an example using `validation_data`:

In [13]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

val_data = np.random.random((100, 32))
val_labels = np.random.random((100, 10))

model.fit(data, labels, epochs=10, batch_size=32,
          validation_data=(val_data, val_labels))
print(model.summary()) 

Train on 1000 samples, validate on 100 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 64)                2112      
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
Total params: 6,922
Trainable params: 6,922
Non-trainable params: 0
_________________________________________________________________
None


### Input tf.data datasets
Use the [Datasets API](https://www.tensorflow.org/guide/datasets) to scale to large datasets or multi-device training. Pass a `tf.data.Dataset` instance to the `fit` method:

In [15]:
# instantiates a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data,labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()
# Don't forget to specify `steps_per_epoch` when calling `fit` on a dataset.
model.fit(dataset, epochs=10, steps_per_epoch=30)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x2d89015a6d8>

Here, the `fit` method uses the `steps_per_epoch` argument, which is the number of training steps the model runs before it moves to the next epoch. Since the `Dataset` yields batches of data, this snippet does not require a `batch_size`.
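
To see what 30 steps over a batched, repeating stream amounts to, the sketch below imitates `batch(32)` followed by `repeat()` with plain Python generators. The helper names are hypothetical and the code only tracks index ranges; it is an analogy for the bookkeeping, not a description of how `tf.data` is implemented.

```python
import itertools


def batch_ranges(n_samples, batch_size):
    """Yield (start, stop) index pairs, one per batch, for a single pass."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)


def repeat_forever(n_samples, batch_size):
    """Restart the pass whenever it is exhausted, giving an endless stream."""
    while True:
        yield from batch_ranges(n_samples, batch_size)


steps_per_epoch = 30
stream = repeat_forever(n_samples=1000, batch_size=32)
first_epoch = list(itertools.islice(stream, steps_per_epoch))
samples_seen = sum(stop - start for start, stop in first_epoch)
print(len(first_epoch), samples_seen, first_epoch[-1])
```

With these numbers an "epoch" of 30 steps covers 960 of the 1000 samples, so it is slightly shorter than a full pass over the data.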

Datasets can also be used for validation:

In [16]:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()

val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()

model.fit(dataset, epochs=10, steps_per_epoch=30,
          validation_data=val_dataset,
          validation_steps=3)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x2d8901a66a0>

### Evaluate and predict
The `tf.keras.Model.evaluate` and `tf.keras.Model.predict` methods can use `NumPy` data and a `tf.data.Dataset`.

To evaluate the inference-mode loss and metrics for the data provided:

```python
model.evaluate(data, labels, batch_size=32)
```

In [17]:
model.evaluate(dataset, steps=30)



[11.39530970255534, 0.23229166666666667]

And to predict the output of the last layer in inference mode for the data provided, as a `NumPy` array:

```python
model.predict(data, batch_size=32)
```

In [18]:
model.predict(dataset, steps=30)

array([[0.08554967, 0.09201679, 0.0918855 , ..., 0.10770091, 0.07330296,
        0.09502364],
       [0.07582473, 0.07689644, 0.12348803, ..., 0.09331267, 0.08332489,
        0.11231889],
       [0.09682094, 0.09555242, 0.1254478 , ..., 0.0874647 , 0.08615924,
        0.09156223],
       ...,
       [0.11067551, 0.11609698, 0.08456248, ..., 0.10321286, 0.09033251,
        0.09206989],
       [0.10804565, 0.09348496, 0.12605694, ..., 0.0802942 , 0.08369122,
        0.11313645],
       [0.11168894, 0.09263443, 0.12812832, ..., 0.09883527, 0.10700932,
        0.09209047]], dtype=float32)
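
Each row of the array above is a softmax output, one probability per class. A common next step is to pick the most likely class per row. The sketch below does this in plain Python on two hypothetical three-class rows (not the truncated values printed above); with a real NumPy array, `np.argmax(predictions, axis=1)` does the same job.

```python
def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=lambda i: row[i])


# Hypothetical predictions: two samples, three classes each.
rows = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
]
predicted_classes = [argmax(row) for row in rows]
print(predicted_classes)
```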

## Build advanced models
### Functional API

The `tf.keras.Sequential` model is a simple stack of layers that cannot represent arbitrary models. Use the Keras functional API to build complex model topologies such as:
- Multi-input models,
- Multi-output models,
- Models with shared layers (the same layer called several times),
- Models with non-sequential data flows (e.g. residual connections).

Building a model with the functional API works like this:
1. A layer instance is callable and returns a tensor.
2. Input tensors and output tensors are used to define a `tf.keras.Model` instance.
3. This model is trained just like the `Sequential` model.
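
The idea behind the first step, that a layer is an object you call on a value to get a new value, can be illustrated without TensorFlow. The following is a plain-Python analogy only: `Scale` and `add` are hypothetical stand-ins for a layer and a merge operation, and lists stand in for tensors. It wires a shared layer and a skip connection, two of the topologies listed above.

```python
class Scale:
    """Toy stand-in for a layer: holds a parameter and is callable."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, values):
        return [self.factor * v for v in values]


def add(left, right):
    """Toy stand-in for an element-wise merge, as in a residual connection."""
    return [a + b for a, b in zip(left, right)]


inputs = [1.0, 2.0, 3.0]
shared = Scale(2.0)            # one "layer" instance
hidden = shared(inputs)        # first call
hidden = shared(hidden)        # second call reuses the same instance (shared layer)
outputs = add(inputs, hidden)  # skip connection around both calls
print(outputs)
```

A `Sequential` stack cannot express the last line, because `inputs` is consumed in two places.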

The following example uses the functional API to build a simple, fully-connected network:

In [20]:
inputs = keras.Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
x = keras.layers.Dense(64, activation='relu')(inputs)
x = keras.layers.Dense(64, activation='relu')(x)
predictions = keras.layers.Dense(10, activation='softmax', name='OutputLayer')(x)

# Instantiate the model given inputs and outputs.
model = keras.Model(inputs=inputs, outputs=predictions)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs
model.fit(data, labels, batch_size=32, epochs=5)

print(model.summary())

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 32)                0         
_________________________________________________________________
dense_12 (Dense)             (None, 64)                2112      
_________________________________________________________________
dense_13 (Dense)             (None, 64)                4160      
_________________________________________________________________
OutputLayer (Dense)          (None, 10)                650       
Total params: 6,922
Trainable params: 6,922
Non-trainable params: 0
_________________________________________________________________
None


### Model subclassing
Build a fully-customizable model by subclassing `tf.keras.Model` and defining your own forward pass. Create layers in the `__init__` method and set them as attributes of the class instance. Define the forward pass in the `call` method.

Model subclassing is particularly useful when eager execution is enabled since the forward pass can be written imperatively.

**Key Point**: Use the right API for the job. While model subclassing offers flexibility, it comes at a cost of greater complexity and more opportunities for user errors. If possible, prefer the functional API.

The following example shows a subclassed `tf.keras.Model` using a custom forward pass:

In [21]:
class MyModel(keras.Model):
    def __init__(self, num_classes=10):
        super(MyModel, self).__init__(name='my_model')
        self.num_classes = num_classes
        # define your layers here.
        self.dense_1 = keras.layers.Dense(32, activation='relu')
        self.dense_2 = keras.layers.Dense(num_classes, activation='sigmoid')

    def call(self, inputs):
        # Define your forward pass here,
        # using layers you previously defined (in `__init__`).
        x = self.dense_1(inputs)
        return self.dense_2(x)        
        
    def compute_output_shape(self, input_shape):
        # You need to override this function if you want to use the subclassed model
        # as part of a functional-style model.
        # Otherwise, this method is optional.
        shape = tf.TensorShape(input_shape).as_list()
        shape[-1] = self.num_classes
        return tf.TensorShape(shape)


# Instantiates the subclassed model.
model = MyModel(num_classes=10)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, labels, batch_size=32, epochs=5)    
print(model.summary())

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_14 (Dense)             multiple                  1056      
_________________________________________________________________
dense_15 (Dense)             multiple                  330       
Total params: 1,386
Trainable params: 1,386
Non-trainable params: 0
_________________________________________________________________
None
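
The `compute_output_shape` method in the class above keeps every dimension of the input shape except the last, which becomes the number of classes. A plain-Python sketch of that logic, using tuples in place of `tf.TensorShape` and `None` for the unknown batch dimension:

```python
def compute_output_shape(input_shape, num_classes):
    """Copy the input shape and replace only the last dimension."""
    shape = list(input_shape)
    shape[-1] = num_classes
    return tuple(shape)


flat = compute_output_shape((None, 32), 10)         # batch of 32-feature vectors
sequence = compute_output_shape((None, 5, 32), 10)  # extra middle dimension is kept
print(flat, sequence)
```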


### Custom layers
Create a custom layer by subclassing `tf.keras.layers.Layer` and implementing the following methods:
- `build`: Create the weights of the layer. Add weights with the `add_weight` method.
- `call`: Define the forward pass.
- `compute_output_shape`: Specify how to compute the output shape of the layer given the input shape.
- Optionally, a layer can be serialized by implementing the `get_config` method and the `from_config` class method.
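
The serialization contract in the last bullet is a round trip: `get_config` returns a dictionary of constructor arguments, and `from_config` rebuilds an equivalent object from it. The sketch below shows the pattern on a hypothetical plain-Python class, `ToyLayer`, which is not a Keras layer.

```python
class ToyLayer:
    """Hypothetical stand-in that follows the get_config/from_config pattern."""

    def __init__(self, output_dim, name="toy"):
        self.output_dim = output_dim
        self.name = name

    def get_config(self):
        # Must return the dictionary; every key is a constructor argument.
        return {"output_dim": self.output_dim, "name": self.name}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


original = ToyLayer(10, name="my_layer")
config = original.get_config()
clone = ToyLayer.from_config(config)
print(config, clone.output_dim, clone.name)
```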

Here's an example of a custom layer that implements a `matmul` of an input with a kernel matrix:

In [23]:
class MyLayer(keras.layers.Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        shape = tf.TensorShape((input_shape[1], self.output_dim))
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=shape,
                                      initializer='uniform',
                                      trainable=True)
        # Be sure to call this at the end
        super(MyLayer, self).build(input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

    def compute_output_shape(self, input_shape):
        shape = tf.TensorShape(input_shape).as_list()
        shape[-1] = self.output_dim
        return tf.TensorShape(shape)

    def get_config(self):
        base_config = super(MyLayer, self).get_config()
        base_config['output_dim'] = self.output_dim
        return base_config

    @classmethod
    def from_config(cls, config):
        return cls(**config)

# Create a model using the custom layer
model = keras.Sequential([MyLayer(10),
                          keras.layers.Activation('softmax')])

# The compile step specifies the training configuration
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, labels, batch_size=32, epochs=5)
print(model.summary())

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
my_layer_1 (MyLayer)         (None, 10)                320       
_________________________________________________________________
activation_1 (Activation)    (None, 10)                0         
Total params: 320
Trainable params: 320
Non-trainable params: 0
_________________________________________________________________
None
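
The 320 parameters reported above are the entries of the kernel alone: 32 input features times 10 outputs, with no bias term. The forward pass is a single matrix product, sketched here in plain Python on deliberately tiny, made-up matrices (`matmul` is a hypothetical helper, not `tf.matmul`).

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


inputs = [[1.0, 2.0, 3.0]]  # batch of 1 sample with 3 features
kernel = [[1.0, 0.0],
          [0.0, 1.0],
          [1.0, 1.0]]       # shape (3, 2): input_dim x output_dim
outputs = matmul(inputs, kernel)

kernel_params = 32 * 10     # kernel size for the layer trained above
print(outputs, kernel_params)
```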

