# Keras

- Keras is a high-level API to build and train deep learning models. It's used for fast prototyping, advanced research, and production, with three key advantages:

    - *User friendly*: Keras has a simple, consistent interface optimized for common use cases. It provides clear and actionable feedback for user errors.
    
    - *Modular and composable*: Keras models are made by connecting configurable building blocks together, with few restrictions.
    
    - *Easy to extend*: Write custom building blocks to express new ideas for research. Create new layers and loss functions, and develop state-of-the-art models.

## Import tf.keras

- `tf.keras` is TensorFlow's implementation of the [Keras API specification](https://keras.io/). This is a high-level API to build and train models that includes first-class support for TensorFlow-specific functionality, such as [eager execution](https://www.tensorflow.org/guide/keras#eager_execution), `tf.data` pipelines, and [Estimators](https://www.tensorflow.org/guide/estimators). `tf.keras` makes TensorFlow easier to use without sacrificing flexibility and performance.

- To get started, import `tf.keras` as part of your TensorFlow program setup:

In [1]:
!pip install -q pyyaml # Required to save models in YAML format

In [1]:
import tensorflow as tf
from tensorflow.keras import layers

print(tf.VERSION)
print(tf.keras.__version__)

1.13.1
2.2.4-tf


- `tf.keras` can run any Keras-compatible code, but keep in mind:

    - The `tf.keras` version in the latest TensorFlow release might not be the same as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
    
    - When [saving a model's weights](https://www.tensorflow.org/guide/keras#weights_only), `tf.keras` defaults to the [checkpoint format](https://www.tensorflow.org/guide/checkpoints). Pass `save_format='h5'` to use HDF5.

## Build a simple model

### Sequential model

- In Keras, you assemble `layers` to build `models`. A model is (usually) a graph of layers. The most common type of model is a stack of layers: the `tf.keras.Sequential` model.

- To build a simple, fully-connected network (i.e. multi-layer perceptron):

In [3]:
model = tf.keras.Sequential()
# Adds a densely-connected layer with 64 units to the model:
model.add(layers.Dense(64, activation='relu'))
# Add another:
model.add(layers.Dense(64, activation='relu'))
# Add a softmax layer with 10 output units:
model.add(layers.Dense(10, activation='softmax'))
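
- To make the layer stack concrete, here is a minimal pure-Python sketch of the forward pass such a network performs on one sample. It is not how `tf.keras` implements `Dense`. The helper names (`dense`, `relu`, `softmax`, `random_layer`), the 32 input features, and the small random weights are all assumptions made for illustration:

```python
import math
import random

def dense(inputs, weights, biases, activation=None):
    """Compute activation(inputs . weights + biases) for a single sample."""
    outputs = []
    for j in range(len(biases)):
        total = biases[j]
        for i, x in enumerate(inputs):
            total += x * weights[i][j]
        outputs.append(total)
    return activation(outputs) if activation else outputs

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(n_in, n_out, rng):
    """Hypothetical helper: small random kernel and a zero bias vector."""
    kernel = [[rng.uniform(-0.1, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return kernel, [0.0] * n_out

rng = random.Random(0)
k1, b1 = random_layer(32, 64, rng)   # assumed 32 input features
k2, b2 = random_layer(64, 64, rng)
k3, b3 = random_layer(64, 10, rng)

sample = [rng.random() for _ in range(32)]
hidden = dense(sample, k1, b1, relu)
hidden = dense(hidden, k2, b2, relu)
probs = dense(hidden, k3, b3, softmax)

print(len(probs))   # one probability per output unit
print(sum(probs))   # softmax outputs sum to (approximately) 1
```

- Each layer consumes the previous layer's output, which is the sense in which `Sequential` is "a stack of layers".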

### Configure the layers

- There are many `tf.keras.layers` available with some common constructor parameters:

    - `activation`: Set the activation function for the layer. This parameter is specified by the name of a built-in function or as a callable object. By default, no activation is applied.
    
    - `kernel_initializer` and `bias_initializer`: The initialization schemes that create the layer's weights (kernel and bias). This parameter is a name or a callable object. The kernel defaults to the `"Glorot uniform"` initializer, and the bias defaults to zeros.
    
    - `kernel_regularizer` and `bias_regularizer`: The regularization schemes that apply to the layer's weights (kernel and bias), such as L1 or L2 regularization. By default, no regularization is applied.
    
- The following instantiates `tf.keras.layers.Dense` layers using constructor arguments:

In [2]:
# Create a sigmoid layer:
layers.Dense(64, activation='sigmoid')
# Or:
layers.Dense(64, activation=tf.sigmoid)

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1(0.01))

# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layers.Dense(64, bias_regularizer=tf.keras.regularizers.l2(0.01))

# A linear layer with a kernel initialized to a random orthogonal matrix:
layers.Dense(64, kernel_initializer='orthogonal')

# A linear layer with a bias vector initialized to 2.0s:
layers.Dense(64, bias_initializer=tf.keras.initializers.constant(2.0))

<tensorflow.python.keras.layers.core.Dense at 0x2261e609550>
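
- As a rough illustration of what these constructor arguments compute, the sketch below reproduces the arithmetic in plain Python. It assumes the standard Glorot uniform bound `sqrt(6 / (fan_in + fan_out))` and the usual Keras penalty definitions (`factor * sum(|w|)` for L1, `factor * sum(w^2)` for L2). The function names, the fan-in of 32, and the toy kernel are hypothetical:

```python
import math

def glorot_uniform_limit(fan_in, fan_out):
    """Weights are drawn uniformly from [-limit, limit]."""
    return math.sqrt(6.0 / (fan_in + fan_out))

def l1_penalty(weights, factor):
    """L1 regularization term added to the loss: factor * sum(|w|)."""
    return factor * sum(abs(w) for row in weights for w in row)

def l2_penalty(weights, factor):
    """L2 regularization term added to the loss: factor * sum(w^2)."""
    return factor * sum(w * w for row in weights for w in row)

# A Dense(64) layer fed 32 features (assumed) has fan_in=32, fan_out=64.
limit = glorot_uniform_limit(32, 64)
print(limit)  # sqrt(6 / 96) = 0.25

# Toy 2x2 kernel, chosen only to keep the sums easy to check by hand.
kernel = [[0.5, -1.0], [2.0, 0.0]]
l1 = l1_penalty(kernel, 0.01)  # 0.01 * (0.5 + 1.0 + 2.0 + 0.0)
l2 = l2_penalty(kernel, 0.01)  # 0.01 * (0.25 + 1.0 + 4.0 + 0.0)
print(l1, l2)
```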

## Train and evaluate

### Set up training

- After the model is constructed, configure its learning process by calling the `compile` method:

In [4]:
model = tf.keras.Sequential([
    # Add a densely-connected layer with 64 units to the model:
    layers.Dense(64, activation='relu', input_shape=(32,)),
    # Add another:
    layers.Dense(64, activation='relu'),
    # Add a softmax layer with 10 output units:
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer=tf.train.AdamOptimizer(0.001), 
              loss='categorical_crossentropy', 
              metrics=['accuracy'])

Instructions for updating:
Colocations handled automatically by placer.


- `tf.keras.Model.compile` takes three important arguments:
    
    - `optimizer`: This object specifies the training procedure. Pass it optimizer instances from the `tf.train` module, such as `tf.train.AdamOptimizer`, `tf.train.RMSPropOptimizer`, or `tf.train.GradientDescentOptimizer`.
    
    - `loss`: The function to minimize during optimization. Common choices include mean squared error (`mse`), `categorical_crossentropy`, and `binary_crossentropy`. Loss functions are specified by name or by passing a callable object from the `tf.keras.losses` module.
    
    - `metrics`: Used to monitor training. These are string names or callables from the `tf.keras.metrics` module.
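
- To show what the named losses and metrics measure, here is a hedged pure-Python sketch for a single sample. These are textbook definitions, not the `tf.keras.losses` implementations (which operate on batched tensors). The function names and example values are assumptions:

```python
import math

def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def categorical_crossentropy(y_true, y_pred, epsilon=1e-7):
    # Clip predictions away from zero so log() stays finite.
    return -sum(t * math.log(max(p, epsilon)) for t, p in zip(y_true, y_pred))

def categorical_accuracy(y_true, y_pred):
    # 1.0 when the most probable predicted class matches the true class.
    return float(y_true.index(max(y_true)) == y_pred.index(max(y_pred)))

y_true = [0.0, 1.0, 0.0]   # one-hot label for class 1
y_pred = [0.1, 0.7, 0.2]   # softmax-style prediction

mse = mean_squared_error(y_true, y_pred)        # (0.01 + 0.09 + 0.04) / 3
mae = mean_absolute_error(y_true, y_pred)       # (0.1 + 0.3 + 0.2) / 3
cce = categorical_crossentropy(y_true, y_pred)  # -log(0.7)
acc = categorical_accuracy(y_true, y_pred)
print(mse, mae, cce, acc)
```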
    
- The following shows a few examples of configuring a model for training:

In [5]:
# Configure a model for mean-squared error regression.
model.compile(optimizer=tf.train.AdamOptimizer(0.01), 
              loss='mse', # mean squared error
              metrics=['mae']) # mean absolute error

# Configure a model for categorical classification.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.01), 
              loss=tf.keras.losses.categorical_crossentropy, 
              metrics=[tf.keras.metrics.categorical_accuracy])

Instructions for updating:
Use tf.cast instead.


### Input NumPy data

- For small datasets, use in-memory NumPy arrays to train and evaluate a model. The model is "fit" to the training data using the `fit` method:

In [6]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

model.fit(data, labels, epochs=10, batch_size=32)

Instructions for updating:
Use tf.cast instead.
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x2261e609208>

- `tf.keras.Model.fit` takes three important arguments:

    - `epochs`: Training is structured into `epochs`. An epoch is one iteration over the entire input data (this is done in smaller batches).
    - `batch_size`: When passed NumPy data, the model slices the data into smaller batches and iterates over these batches during training. This integer specifies the size of each batch. Be aware that the last batch may be smaller if the total number of samples is not divisible by the batch size.
    - `validation_data`: When prototyping a model, you want to easily monitor its performance on some validation data. Passing this argument - a tuple of inputs and labels - allows the model to display the loss and metrics in inference mode for the passed data, at the end of each epoch.
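
- The batch arithmetic described under `batch_size` can be checked with a short sketch. The helper `batch_sizes` is hypothetical and only mirrors the slicing described above, using the 1000-sample, batch-size-32 setup from this section:

```python
def batch_sizes(num_samples, batch_size):
    """Return the size of every batch in one pass over the data."""
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder:
        sizes.append(remainder)  # the last batch holds the leftover samples
    return sizes

sizes = batch_sizes(1000, 32)
print(len(sizes))  # 32 batches per epoch
print(sizes[-1])   # the last batch has only 8 samples
```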
    
- Here's an example using `validation_data`:

In [7]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

val_data = np.random.random((100, 32))
val_labels = np.random.random((100, 10))

model.fit(data, labels, epochs=10, batch_size=32, 
          validation_data=(val_data, val_labels))

Train on 1000 samples, validate on 100 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x22645296128>

### Input tf.data datasets

- Use the [Datasets API](https://www.tensorflow.org/guide/datasets) to scale to large datasets or multi-device training. Pass a `tf.data.Dataset` instance to the `fit` method:

In [8]:
# Instantiates a toy dataset instance:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()

# Don't forget to specify `steps_per_epoch` when calling `fit` on a dataset.
model.fit(dataset, epochs=10, steps_per_epoch=30)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x226452ab630>

In [9]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_9 (Dense)              (None, 64)                2112      
_________________________________________________________________
dense_10 (Dense)             (None, 64)                4160      
_________________________________________________________________
dense_11 (Dense)             (None, 10)                650       
Total params: 6,922
Trainable params: 6,922
Non-trainable params: 0
_________________________________________________________________
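
- The parameter counts in the summary can be reproduced by hand: a `Dense` layer with a bias has `inputs * units` kernel weights plus `units` bias terms. A small sketch, with `dense_params` as a hypothetical helper:

```python
def dense_params(n_inputs, units):
    """Kernel weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units

# (inputs, units) for the three layers: 32 -> 64 -> 64 -> 10
layer_shapes = [(32, 64), (64, 64), (64, 10)]
counts = [dense_params(n_in, units) for n_in, units in layer_shapes]

print(counts)       # [2112, 4160, 650]
print(sum(counts))  # 6922
```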


- Here, the `fit` method uses the `steps_per_epoch` argument, which is the number of training steps the model runs before it moves to the next epoch. Since the `Dataset` yields batches of data, this snippet does not require a `batch_size`.
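
- One consequence worth seeing in numbers: with a repeated dataset, an "epoch" is just `steps_per_epoch` batches drawn from an endless stream, so it need not line up with one full pass over the data. The sketch below simulates only the batch sizes that `batch(32).repeat()` would yield for 1000 samples. The `batch_sizes` helper is hypothetical, and no TensorFlow code is involved:

```python
import itertools

def batch_sizes(num_samples, batch_size):
    """Sizes of the batches in one pass (the last one may be partial)."""
    full, remainder = divmod(num_samples, batch_size)
    return [batch_size] * full + ([remainder] if remainder else [])

# Endless stream of batch sizes, standing in for dataset.batch(32).repeat().
stream = itertools.cycle(batch_sizes(1000, 32))

# Draw 30 steps per epoch for three consecutive epochs.
samples_per_epoch = [sum(itertools.islice(stream, 30)) for _ in range(3)]
print(samples_per_epoch)  # [960, 936, 936]
```

- With these numbers, setting `steps_per_epoch` to 32 would make each epoch cover the 1000 samples exactly once.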

- Datasets can also be used for validation:

In [10]:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()

val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()

model.fit(dataset, epochs=10, steps_per_epoch=30, 
          validation_data=val_dataset, 
          validation_steps=3)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x226fc120898>

### Evaluate and predict

- The `tf.keras.Model.evaluate` and `tf.keras.Model.predict` methods accept either NumPy data or a `tf.data.Dataset`.

- To *evaluate* the inference-mode loss and metrics for the data provided:

In [11]:
data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

model.evaluate(data, labels, batch_size=32)

model.evaluate(dataset, steps=30)



[11.473672898610433, 0.15416667]

- And to *predict* the output of the last layer in inference mode for the data provided, as a NumPy array:

In [12]:
result = model.predict(data, batch_size=32)
print(result.shape)

(1000, 10)
