# Keras

- A high-level API to build and train deep learning models
- Used for fast prototyping, advanced research, and production
- User friendly
- Modular and composable
- Easy to extend

In [1]:
!pip3 install -q pyyaml

## Import tf.keras

`tf.keras`
- A high-level API to build and train models that includes first-class support for TensorFlow-specific functionality, such as eager execution, `tf.data` pipelines, and Estimators
- `tf.keras` makes TensorFlow easier to use without sacrificing flexibility and performance

In [2]:
import tensorflow as tf
from tensorflow.keras import layers

print(tf.VERSION)
print(tf.keras.__version__)

1.12.0
2.1.6-tf


- `tf.keras` version in the latest TensorFlow might not be the same as the latest `keras` version from PyPI
- When saving a model's weights, `tf.keras` defaults to the checkpoint format.

## Build a simple model

- Assemble layers to build a model
- A model is (usually) a graph of layers
- The most common type of model is a stack of layers: the `tf.keras.Sequential` model

In [3]:
# A simple, fully-connected network (i.e. multi-layer perceptron)
model = tf.keras.Sequential()

# Adds a densely-connected layer with 64 units to the model
model.add(layers.Dense(64, activation='relu'))
# Add another
model.add(layers.Dense(64, activation='relu'))
# Add a softmax layer with 10 output units
model.add(layers.Dense(10, activation='softmax'))
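
To make concrete what this stack of layers computes, here is a rough NumPy sketch of the forward pass. It is not how `tf.keras` is implemented; the input width of 32, the batch of 4 samples, and the random weights are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

def dense(x, w, b, activation=None):
    """Affine transform followed by an optional activation."""
    z = x @ w + b
    return activation(z) if activation is not None else z

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical input: a batch of 4 samples with 32 features each.
x = rng.random_sample((4, 32))

# Layer widths mirror the Sequential model: 64 -> 64 -> 10.
sizes = [32, 64, 64, 10]
params = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

h = dense(x, *params[0], activation=relu)
h = dense(h, *params[1], activation=relu)
probs = dense(h, *params[2], activation=softmax)

print(probs.shape)        # one row of 10 class probabilities per sample
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Each `Dense` layer is an affine map plus an activation, and `Sequential` simply feeds the output of one layer into the next.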

## Configure the layers

`tf.keras.layers`
- `activation`
    - Activation function of the layer
    - This parameter is specified by the name of a built-in function or as a callable object
    - By default, no activation is applied
- `kernel_initializer` and `bias_initializer`
    - The initialization schemes that create the layer's weights (kernel and bias)
    - This parameter is a name or a callable object
    - The kernel defaults to the `Glorot uniform` initializer (the bias defaults to zeros)
- `kernel_regularizer` and `bias_regularizer`
    - The regularization schemes that apply to the layer's weights (kernel and bias)
    - Such as L1 or L2 regularization
    - By default, no regularization is applied

In [4]:
# Instantiate tf.keras.layers.Dense layers using constructor arguments

# Create a sigmoid layer:
layers.Dense(64, activation='sigmoid')
# Or:
# layers.Dense(64, activation=tf.sigmoid)

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1(0.01))

# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layers.Dense(64, bias_regularizer=tf.keras.regularizers.l2(0.01))

# A linear layer with a kernel initialized to a random orthogonal matrix:
layers.Dense(64, kernel_initializer='orthogonal')

# A linear layer with a bias vector initialized to 2.0s:
layers.Dense(64, bias_initializer=tf.keras.initializers.constant(2.0))

<tensorflow.python.keras.layers.core.Dense at 0xb33a33c18>
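
The initializer and regularizer arguments can be read as simple formulas. The NumPy sketch below is an illustration under assumptions, not the library's code: the helper names (`glorot_uniform`, `l1_penalty`, `l2_penalty`) and the 32-feature input are hypothetical. Glorot uniform draws weights from a uniform distribution on `[-limit, limit]` with `limit = sqrt(6 / (fan_in + fan_out))`, and each regularizer contributes a penalty term that is added to the training loss.

```python
import numpy as np

rng = np.random.RandomState(0)

def glorot_uniform(fan_in, fan_out):
    """Sample a (fan_in, fan_out) kernel from U(-limit, limit)."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def l1_penalty(weights, factor):
    # L1: factor times the sum of absolute values.
    return factor * np.sum(np.abs(weights))

def l2_penalty(weights, factor):
    # L2: factor times the sum of squares.
    return factor * np.sum(np.square(weights))

kernel = glorot_uniform(32, 64)   # hypothetical 32-feature input, 64 units
bias = np.full(64, 2.0)           # constant 2.0 bias, as in the last example

limit = np.sqrt(6.0 / (32 + 64))  # 0.25 for these layer sizes
print(np.abs(kernel).max() <= limit)
print(l1_penalty(kernel, 0.01))
print(l2_penalty(bias, 0.01))     # 0.01 * 64 * 2.0**2
```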

## Train and evaluate

### Set up training

Configure the learning process by calling the `compile` method:

In [5]:
model = tf.keras.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer=tf.train.AdamOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

`tf.keras.Model.compile` takes three important arguments:
- `optimizer`
    - Specifies the training procedure
    - Pass optimizer instances from the `tf.train` module, such as `tf.train.AdamOptimizer`, `tf.train.RMSPropOptimizer`, or `tf.train.GradientDescentOptimizer`
- `loss`
    - The function to minimize during optimization
    - Common choices include mean squared error (`mse`), `categorical_crossentropy`, and `binary_crossentropy`
    - Loss functions are specified by name or by passing a callable object from the `tf.keras.losses` module
- `metrics`
    - Used to monitor training
    - String names or callables from the `tf.keras.metrics` module

Configuring a model for training

In [6]:
# Configure a model for mean-squared error regression
model.compile(optimizer=tf.train.AdamOptimizer(0.01),
              loss='mse',       # mean squared error
              metrics=['mae'])  # mean absolute error

# Configure a model for categorical classification
model.compile(optimizer=tf.train.RMSPropOptimizer(0.01),
              loss=tf.keras.losses.categorical_crossentropy,
              metrics=[tf.keras.metrics.categorical_accuracy])
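
The loss and metric names used above correspond to short formulas. The following NumPy sketch shows roughly what each one measures; the function bodies are simplified stand-ins written for this note, and the three-sample labels and predictions are made up.

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean(np.square(y_true - y_pred))

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); average the per-sample losses over the batch.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return np.mean(-np.sum(y_true * np.log(y_pred), axis=1))

def categorical_accuracy(y_true, y_pred):
    # Fraction of samples whose most probable class matches the label.
    return np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1))

# Hypothetical one-hot labels and predicted probabilities for 3 samples.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
y_pred = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]])

print(mse(y_true, y_pred))
print(mae(y_true, y_pred))
print(categorical_crossentropy(y_true, y_pred))
print(categorical_accuracy(y_true, y_pred))  # 2 of 3 samples are correct
```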

### Input NumPy Data

The model is "fit" to the training data using the `fit` method

In [7]:
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

model.fit(data, labels, epochs=10, batch_size=32)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0xb33a52c50>

`tf.keras.Model.fit` takes three important arguments:
- `epochs`
    - Training is structured into epochs
    - An epoch is one iteration over the entire input data (this is done in smaller batches)
- `batch_size`
    - The model slices the data into smaller batches and iterates over these batches during training
    - This integer specifies the size of each batch
    - The last batch may be smaller if the total number of samples is not divisible by the batch size
- `validation_data`
    - This parameter is a tuple of inputs and labels
    - This allows the model to display the loss and metrics in inference mode for the passed data at the end of each epoch
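
The relationship between `epochs`, `batch_size`, and the number of samples is plain arithmetic. A small sketch using the numbers from the call above (1000 samples, `batch_size=32`, `epochs=10`); it ignores shuffling and only illustrates how the data is sliced:

```python
import math

num_samples = 1000  # matches the toy data above
batch_size = 32
epochs = 10

# Batches per epoch: the last one is partial when the division is not exact.
batches_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (batches_per_epoch - 1) * batch_size

# Start/stop indices of every batch in one epoch.
slices = [(start, min(start + batch_size, num_samples))
          for start in range(0, num_samples, batch_size)]

print(batches_per_epoch)           # 32
print(last_batch_size)             # 8
print(slices[0], slices[-1])       # (0, 32) (992, 1000)
print(batches_per_epoch * epochs)  # 320 batches processed over the whole run
```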


An example using `validation_data`:

In [8]:
data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

val_data = np.random.random((100, 32))
val_labels = np.random.random((100, 10))

model.fit(data, labels, epochs=10, batch_size=32, validation_data=(val_data, val_labels))

Train on 1000 samples, validate on 100 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0xb33f66eb8>

### Input `tf.data` datasets

In [11]:
# Instantiates a toy dataset instance
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32)
dataset = dataset.repeat()

# Don't forget to specify 'steps_per_epoch' when calling 'fit' on a dataset
model.fit(dataset, epochs=10, steps_per_epoch=30)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0xb34fb89b0>

`steps_per_epoch`
- The number of training steps the model runs before it moves to the next epoch

Since the `Dataset` yields batches of data, this snippet does not require a `batch_size`.
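
One consequence worth spelling out: with a repeated dataset, an "epoch" is just the next `steps_per_epoch` batches from an endless stream, so it need not line up with a full pass over the data. The pure-Python sketch below mimics `batch(32).repeat()` with generators; `batch_bounds` and `repeated` are hypothetical helpers, not `tf.data` APIs.

```python
import itertools

def batch_bounds(num_samples, batch_size):
    """Yield (start, stop) index pairs for one pass, like Dataset.batch."""
    for start in range(0, num_samples, batch_size):
        yield start, min(start + batch_size, num_samples)

def repeated(make_iter):
    """Restart the iterator forever, like Dataset.repeat with no count."""
    while True:
        yield from make_iter()

stream = repeated(lambda: batch_bounds(1000, 32))

steps_per_epoch = 30
results = []
for epoch in range(2):
    # Each epoch takes the next 30 batches from the endless stream.
    batch_sizes = [stop - start
                   for start, stop in itertools.islice(stream, steps_per_epoch)]
    results.append((sum(batch_sizes), min(batch_sizes)))

print(results)  # [(960, 32), (936, 8)]
```

With 30 steps of 32 samples, the first epoch sees 960 of the 1000 samples; the partial batch of 8 only shows up in the second epoch, after which the stream wraps around to the start of the data.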

Datasets can also be used for validation:

In [13]:
dataset = tf.data.Dataset.from_tensor_slices((data, labels))
dataset = dataset.batch(32).repeat()

val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
val_dataset = val_dataset.batch(32).repeat()

model.fit(dataset, epochs=10, steps_per_epoch=30,
          validation_data=val_dataset,
          validation_steps=3)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x10448da20>

### Evaluate and predict

The `tf.keras.Model.evaluate` and `tf.keras.Model.predict` methods can use either NumPy data or a `tf.data.Dataset`.

To evaluate the inference-mode loss and metrics for the data provided:

In [18]:
data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

model.evaluate(data, labels, batch_size=32)
model.evaluate(dataset, steps=30)



[11.427550125122071, 0.203125]

To predict the output of the last layer in inference mode for the data provided, as a NumPy array:

In [19]:
result = model.predict(data, batch_size=32)
print(result.shape)

(1000, 10)
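
Because the last layer is a softmax, each row of `result` can be read as a probability distribution over the 10 classes. A common follow-up step, sketched here with a random stand-in for `result` (an assumption, so that the snippet runs without a trained model), is to take the most probable class per sample:

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in for `result`: random rows normalized to sum to 1, like softmax output.
scores = rng.random_sample((1000, 10))
result = scores / scores.sum(axis=1, keepdims=True)

predicted_class = np.argmax(result, axis=1)  # most probable class per sample
confidence = result.max(axis=1)              # probability assigned to that class

print(predicted_class.shape)  # (1000,)
print(predicted_class[:5], confidence[:5])
```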
