In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

## Introduction
- Keras models (anything that inherits from keras.Model) provide three built-in methods to train, evaluate and predict with the model we have developed:
    - Model.fit() - to train the model
    - Model.evaluate() - to evaluate the model
    - Model.predict() - to run inference on a test example or a test dataset
- When we use these built-in loops for training and evaluation, the process is the same for both Sequential and Functional API models
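
As a minimal sketch of that last point, the snippet below builds a small Sequential model and drives it with the same three methods. The random data, the layer sizes and the variable names are illustrative assumptions and are not part of the MNIST example that follows.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical toy data: 32 samples, 8 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.random((32, 8)).astype("float32")
y = rng.integers(0, 3, size=(32,))

# A Sequential model instead of a Functional one.
seq_model = keras.Sequential(
    [
        keras.Input(shape=(8,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(3, activation="softmax"),
    ]
)
seq_model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["sparse_categorical_accuracy"],
)

# The same built-in loops are used regardless of how the model was built.
seq_model.fit(x, y, batch_size=8, epochs=1, verbose=0)
loss, acc = seq_model.evaluate(x, y, verbose=0)
probs = seq_model.predict(x[:4], verbose=0)
print("loss:", loss, "accuracy:", acc, "predictions shape:", probs.shape)
```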

## A First End-to-End Example
- Data can be fed to the training loops using either
    - NumPy arrays (when the data is small and fits into memory)
    - tf.data Dataset objects

Let's consider the following model for MNIST classification.

A typical end-to-end workflow consists of:
- Training
- Validation on held-out data (generated from the original training data)
- Evaluation on Test Data


In [2]:
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense1")(inputs)
x = layers.Dense(64, activation="relu", name="dense2")(x)
outputs = layers.Dense(10, activation="softmax", name="predictions")(x)

model = keras.Model(inputs=inputs, outputs=outputs)

In [3]:
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Preprocess the data
x_train = x_train.reshape(-1,784).astype("float32")/255
x_test = x_test.reshape(-1,784).astype("float32")/255

y_train = y_train.astype("float32")
y_test = y_test.astype("float32")

# Reserve 10000 samples for validation
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]

# Specify the training configuration (optimizer, loss, metrics)
model.compile(optimizer = keras.optimizers.RMSprop(),
              loss = keras.losses.SparseCategoricalCrossentropy(),
              metrics = [keras.metrics.SparseCategoricalAccuracy()],)

# Train the model with fit(), passing the batch size, the number of epochs and the validation data
print("Training The Model...")
history = model.fit(x_train, y_train, batch_size=64, epochs=2, validation_data=(x_val, y_val))

Training The Model...
Epoch 1/2
Epoch 2/2


In [4]:
# history object holds the metrics and losses for each epoch of both training and validation data
history.history

{'loss': [0.34058651328086853, 0.16712217032909393],
 'sparse_categorical_accuracy': [0.9034000039100647, 0.9503399729728699],
 'val_loss': [0.21763567626476288, 0.13958387076854706],
 'val_sparse_categorical_accuracy': [0.9380000233650208, 0.9599999785423279]}
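
Since history.history is a plain dictionary of per-epoch lists, it can be inspected with ordinary Python. The sketch below picks the epoch with the lowest validation loss; the numbers are copied from the output above, and the variable names are illustrative.

```python
# Values copied from the history.history output shown above.
val_loss = [0.21763567626476288, 0.13958387076854706]
val_acc = [0.9380000233650208, 0.9599999785423279]

# Index (0-based) of the epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i])

print(
    f"best epoch: {best_epoch + 1}, "
    f"val_loss={val_loss[best_epoch]:.4f}, "
    f"val_acc={val_acc[best_epoch]:.4f}"
)
```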

In [5]:
# Evaluate the model on the test data using "evaluate" method
print("Evaluate on Test Data")
results = model.evaluate(x_test, y_test, batch_size=128)
print("test loss, test acc:", results)
# Get predictions on individual images or batch of images using predict method
print("Generate Predictions for 3 samples")
predictions = model.predict(x_test[:3])
print("predictions shape", predictions.shape)

Evaluate on Test Data
test loss, test acc: [0.13907140493392944, 0.9585000276565552]
Generate Predictions for 3 samples
predictions shape (3, 10)
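
Each row of the (3, 10) prediction array holds one probability per digit class, so turning it into class labels is an argmax over the last axis. The sketch below uses made-up probabilities rather than real model output; probs, predicted_labels and confidence are hypothetical names.

```python
import numpy as np

# Hypothetical softmax outputs for 3 samples and 10 classes (each row sums to 1).
probs = np.full((3, 10), 0.01, dtype="float32")
probs[0, 7] = 0.91
probs[1, 2] = 0.91
probs[2, 1] = 0.91

# The predicted class is the index of the largest probability in each row.
predicted_labels = np.argmax(probs, axis=1)
# The probability assigned to that class can serve as a rough confidence score.
confidence = probs.max(axis=1)

print(predicted_labels, confidence)
```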


## Compiling a Model: Loss, Metrics, Optimizer
Before calling fit() to train a model, we need to compile it with the following arguments:
- optimizer - the algorithm that updates the weights from the gradients computed by backpropagation (examples: Adam, RMSprop, Adagrad, etc.)
- loss - the function to minimize during training; if the model has multiple outputs, we can specify a different loss function for each output
- metrics - a list in which we can specify any number of metrics; for a multi-output model we can specify different metrics for each output

If we are happy with the default settings of the optimizer, loss and metrics, we can specify them as strings. If we want to customize them, we need to instantiate the corresponding classes from Keras.


In [6]:
# Model compilation with string identifiers (default settings)
model.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["sparse_categorical_accuracy"],
)

In [7]:
# Model compilation with custom functions
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()],
)

In [8]:
# For later reuse, let's put our model definition and compile step in functions; we will call them several times across different examples in this guide.
def get_uncompiled_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model


def get_compiled_model():
    model = get_uncompiled_model()
    model.compile(
        optimizer="rmsprop",
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],
    )
    return model

### Built-in Optimizers, Losses and Metrics
Optimizers: tf.keras.optimizers.
- SGD() (with or without momentum)
- RMSprop()
- Adam()
- Adagrad()
- Adadelta()
- Adamax()

Losses: tf.keras.losses.
- BinaryCrossentropy()
- CategoricalCrossentropy()
- CategoricalHinge()
- CosineSimilarity()
- Hinge()
- KLDivergence()
- MeanAbsoluteError()
- MeanAbsolutePercentageError()
- MeanSquaredError()
- MeanSquaredLogarithmicError()
- SparseCategoricalCrossentropy()

Metrics: tf.keras.metrics.
- AUC()
- Precision()
- Recall()
- Accuracy()

Apart from these, Keras lets us create custom losses and metrics.

### Custom Loss:
There are two ways to create a custom loss function:
- write a function that takes y_true and y_pred as inputs (it cannot accept any other inputs)
- if we need other parameters in addition to y_true and y_pred, create a custom loss class that inherits from tf.keras.losses.Loss



In [9]:
# The following example shows a loss function that computes the mean squared error between the real data and the predictions:
def custom_mean_squared_error(y_true, y_pred):
    return tf.math.reduce_mean(tf.square(y_true - y_pred))


model = get_uncompiled_model()
model.compile(optimizer=keras.optimizers.Adam(), loss=custom_mean_squared_error)

# We need to one-hot encode the labels to use MSE
y_train_one_hot = tf.one_hot(y_train, depth=10)
model.fit(x_train, y_train_one_hot, batch_size=64, epochs=1)



<keras.callbacks.History at 0x2716fc22af0>

If you need a loss function that takes in parameters besides y_true and y_pred, you can subclass the tf.keras.losses.Loss class and implement the following two methods:
- __init__(self): accept parameters to pass during the call of your loss function
- call(self, y_true, y_pred): use the targets (y_true) and the model predictions (y_pred) to compute the model's loss

Let's say you want to use mean squared error, but with an added term that will de-incentivize prediction values far from 0.5 (we assume that the categorical targets are one-hot encoded and take values between 0 and 1). This creates an incentive for the model not to be too confident, which may help reduce overfitting (we won't know if it works until we try!).

Here's how you would do it:

In [10]:
class CustomMSE(tf.keras.losses.Loss):
    def __init__(self, regularization_factor=0.1, name="custom_mse"):
        super(CustomMSE, self).__init__(name=name)
        self.regularization_factor = regularization_factor
    def call(self, y_true, y_pred):
        mse = tf.reduce_mean(tf.square(y_true-y_pred))
        reg = tf.math.reduce_mean(tf.square(0.5 - y_pred))
        return mse+reg*self.regularization_factor

model = get_uncompiled_model()
model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE())

y_train_one_hot = tf.one_hot(y_train, depth=10)
model.fit(x_train, y_train_one_hot, batch_size=64, epochs=1)




<keras.callbacks.History at 0x2716fa31520>
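
To see what the regularized loss computes, the same formula can be evaluated by hand with NumPy. This is only a sketch of the arithmetic in CustomMSE.call(); the helper name custom_mse_numpy and the two example predictions are made up for illustration.

```python
import numpy as np


def custom_mse_numpy(y_true, y_pred, regularization_factor=0.1):
    # Plain mean squared error between targets and predictions.
    mse = np.mean(np.square(y_true - y_pred))
    # Penalty that grows as predictions move away from 0.5.
    reg = np.mean(np.square(0.5 - y_pred))
    return mse + reg * regularization_factor


y_true = np.array([[0.0, 1.0]])
confident = np.array([[0.1, 0.9]])  # close to the one-hot target
hesitant = np.array([[0.4, 0.6]])   # close to 0.5

loss_confident = custom_mse_numpy(y_true, confident)  # 0.01 + 0.1 * 0.16
loss_hesitant = custom_mse_numpy(y_true, hesitant)    # 0.16 + 0.1 * 0.01
print(loss_confident, loss_hesitant)
```

With the default factor of 0.1 the penalty term is small compared to the MSE term, so the confident prediction still gets the lower total loss; a larger regularization_factor would shift that balance.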

### Custom Metrics:
If we need a metric that is not part of the API, we can create it by subclassing the tf.keras.metrics.Metric class. For this we need to implement 4 methods:
- __init__(self), in which we create the state variables for our metric
- update_state(self, y_true, y_pred, sample_weight=None), which uses y_true and y_pred to update the state variables
- result(self), which uses the state variables to compute the final result
- reset_state(self), which reinitializes the state of the metric

State updates and the result computation are kept separate because computing the result can be very expensive in some cases, so it is only done periodically, whereas the state variables are updated on every batch.

In [11]:
# Here's a simple example showing how to implement a CategoricalTruePositives metric that counts how many samples were correctly classified as belonging to a given class:
class CategoricalTruePositives(tf.keras.metrics.Metric):
    def __init__(self, name="categorical_true_positives", **kwargs):
        super(CategoricalTruePositives, self).__init__(name=name, **kwargs)
        self.true_positives = self.add_weight(name="ctp", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.reshape(tf.argmax(y_pred, axis=1), (-1,1))
        values = tf.cast(y_true, "int32") == tf.cast(y_pred, "int32")
        values = tf.cast(values, "float32")
        if sample_weight is not None:
            sample_weight = tf.cast(sample_weight, "float32")
            values = tf.multiply(values, sample_weight)
        self.true_positives.assign_add(tf.reduce_sum(values))

    def result(self):
        return self.true_positives
    
    def reset_state(self):
        # The state of the metric will be reset at the start of each epoch.
        self.true_positives.assign(0.0)

model = get_uncompiled_model()
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[CategoricalTruePositives()],
)
model.fit(x_train, y_train, batch_size=64, epochs=3)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x27172190e20>
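
The update_state() / result() / reset_state() cycle can also be driven by hand, outside of fit(). The sketch below does this with the built-in SparseCategoricalAccuracy metric on two tiny made-up batches; the batch contents and variable names are assumptions chosen to make the arithmetic easy to follow.

```python
import numpy as np
from tensorflow import keras

metric = keras.metrics.SparseCategoricalAccuracy()

# First batch: 1 of 2 predictions is correct.
metric.update_state(
    np.array([[0], [1]]),
    np.array([[0.9, 0.1], [0.8, 0.2]], dtype="float32"),
)
after_first = float(metric.result())

# Second batch: 1 of 1 correct, so the running accuracy becomes 2 / 3.
metric.update_state(
    np.array([[1]]),
    np.array([[0.3, 0.7]], dtype="float32"),
)
after_second = float(metric.result())

# Clearing the state, as happens at the start of each epoch.
metric.reset_state()
after_reset = float(metric.result())

print(after_first, after_second, after_reset)
```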

### Handling Losses and Metrics that Don't Fit the Standard Signature
- Most losses and metrics can be computed from y_true and y_pred
- But in some cases the loss is not computed from the model output
- For example a regularization loss may only require the activation of a layer, and this activation may not be a model output.
- In such cases, you can call self.add_loss(loss_value) from inside the call method of a custom layer. 
- Losses added in this way get added to the "main" loss during training (the one passed to compile())

In [12]:
class ActivityRegularizationLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        self.add_loss(tf.reduce_sum(inputs)*0.1)
        return inputs

inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)

# Insert activity regularization as a layer
x = ActivityRegularizationLayer()(x)
x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, name="predictions")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# The displayed loss will be much higher than before
# due to the regularization component.
model.fit(x_train, y_train, batch_size=64, epochs=1)



<keras.callbacks.History at 0x2717224dd00>

In [13]:
# You can do the same for logging metric values, using add_metric()
class MetricLoggingLayer(tf.keras.layers.Layer):
    def call(self, inputs):
        # The `aggregation` argument defines
        # how to aggregate the per-batch values
        # over each epoch:
        # in this case we simply average them.
        self.add_metric(
            tf.keras.backend.std(inputs),name="std_of_activation", aggregation="mean"
        )
        return inputs
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)

# Insert std logging as a layer.
x = MetricLoggingLayer()(x)

x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, name="predictions")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(x_train, y_train, batch_size=64, epochs=1)



<keras.callbacks.History at 0x271726b64f0>

In [14]:
# In the Functional API, you can also call model.add_loss(loss_tensor), or model.add_metric(metric_tensor, name, aggregation).
# Here's a simple example:
inputs = keras.Input(shape=(784,), name="digits")
x1 = layers.Dense(64, activation="relu", name="dense_1")(inputs)
x2 = layers.Dense(64, activation="relu", name="dense_2")(x1)
outputs = layers.Dense(10, name="predictions")(x2)
model = keras.Model(inputs=inputs, outputs=outputs)

model.add_loss(tf.reduce_sum(x1) * 0.1)

model.add_metric(keras.backend.std(x1), name="std_of_activation", aggregation="mean")

model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
model.fit(x_train, y_train, batch_size=64, epochs=1)



<keras.callbacks.History at 0x271727f4130>

Note that when you pass losses via add_loss(), it becomes possible to call compile() without a loss function, since the model already has a loss to minimize.

Consider the following LogisticEndpoint layer: it takes as inputs targets & logits, and it tracks a crossentropy loss via add_loss(). It also tracks classification accuracy via add_metric()

In [15]:
class LogisticEndpoint(tf.keras.layers.Layer):
    def __init__(self, name=None):
        super(LogisticEndpoint, self).__init__(name=name)
        self.loss_fn = tf.keras.losses.BinaryCrossentropy(from_logits=True)
        self.accuracy_fn = tf.keras.metrics.BinaryAccuracy()

    def call(self, targets, logits, sample_weights=None):
        # Compute the training-time loss value and add it
        # to the layer using `self.add_loss()`.
        loss = self.loss_fn(targets, logits, sample_weights)
        self.add_loss(loss)
        # Log accuracy as a metric and add it
        # to the layer using `self.add_metric()`.
        acc = self.accuracy_fn(targets, logits, sample_weights)
        self.add_metric(acc, name="accuracy")
        # Return the inference-time prediction tensor (for `.predict()`).
        return tf.nn.softmax(logits)

# You can use it in a model with two inputs (input data & targets), compiled without a loss argument, like this:
inputs = keras.Input(shape=(3,), name="inputs")
targets = keras.Input(shape=(10,), name="targets")
logits = keras.layers.Dense(10)(inputs)
# Pass the arguments in the order expected by call(self, targets, logits)
predictions = LogisticEndpoint(name="predictions")(targets, logits)

model = keras.Model(inputs=[inputs, targets], outputs=predictions)
model.compile(optimizer="adam")  # No loss argument!

data = {
    "inputs": np.random.random((3, 3)),
    "targets": np.random.random((3, 10)),
}
model.fit(data)



<keras.callbacks.History at 0x2717295b490>

### Automatically Setting Apart a Validation Holdout Set
- So far we have passed an explicit validation dataset to validate the model during training
- Instead, we can let fit() hold out part of the training data for validation using the validation_split parameter (for example, validation_split=0.2 reserves 20% of the data)
- But this can be used only when training with NumPy data


In [16]:
model = get_compiled_model()
model.fit(x_train, y_train, batch_size=64, validation_split=0.2, epochs=1)



<keras.callbacks.History at 0x27172a0dca0>
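
According to the fit() documentation, the validation samples are taken from the end of the arrays, before any shuffling. The NumPy sketch below mimics that behavior; the helper name split_like_validation_split is hypothetical, and the exact rounding Keras applies internally is an assumption.

```python
import numpy as np


def split_like_validation_split(x, y, validation_split):
    # Hypothetical helper: keep the first part for training and
    # hold out the last fraction of the samples for validation.
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])


x = np.arange(10).reshape(10, 1)
y = np.arange(10)
(train_x, train_y), (val_x, val_y) = split_like_validation_split(x, y, 0.2)

print("train labels:", train_y.tolist())
print("validation labels:", val_y.tolist())
```

One practical consequence: if the arrays are sorted by class, the held-out tail may not be representative, so it can be worth shuffling the arrays once before calling fit().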

## Training and Evaluation from tf.data Datasets
- So far we have worked with NumPy array datasets
- Now let's look at the case where our data comes in the form of a tf.data.Dataset object.
- The tf.data API is a set of utilities in TensorFlow 2.0 for loading and preprocessing data in a way that's fast and scalable
- We can pass a Dataset instance directly to the methods fit(), evaluate() and predict():

In [18]:
model = get_compiled_model()
# Let's create a Dataset instance from the MNIST data
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
# Shuffle and batch the dataset
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

# Now do the same for test dataset
test_dataset = tf.data.Dataset.from_tensor_slices((x_test, y_test))
test_dataset = test_dataset.batch(64)

# Since the dataset is already batched, there is no need to pass batch_size to fit()
model.fit(train_dataset, epochs=3)

# Let's evaluate on the test dataset
print("Evaluate...")
result = model.evaluate(test_dataset)
dict(zip(model.metrics_names,result))


Epoch 1/3
Epoch 2/3
Epoch 3/3
Evaluate...


{'loss': 0.12527132034301758,
 'sparse_categorical_accuracy': 0.9616000056266785}

- The Dataset is reset at the end of each epoch, so that it can be reused in the next epoch.
- If we want to run only a specific number of batches in each epoch, we can pass the steps_per_epoch argument
- This argument specifies how many training steps the model should run using the Dataset before moving on to the next epoch.
- If we do this, the Dataset is not reset at the end of each epoch; instead we just keep drawing the next batches.
- The dataset will eventually run out of data (unless it is an infinitely-looping dataset)

In [20]:
model = get_compiled_model()
# prepare the training dataset
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

# Only use 100 batches per epoch (that's 64 * 100 samples)
model.fit(train_dataset, epochs=11, steps_per_epoch = 100)

Epoch 1/11
Epoch 2/11
Epoch 3/11
Epoch 4/11
Epoch 5/11
Epoch 6/11
Epoch 7/11
Epoch 8/11


<keras.callbacks.History at 0x2710b4697f0>
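
The log above stops at epoch 8 even though 11 epochs were requested, because the non-repeating dataset runs out of batches. The arithmetic can be checked with a few lines of plain Python; the sample count follows from the 60,000 MNIST training images minus the 10,000 reserved for validation.

```python
import math

num_samples = 50_000   # training samples left after reserving 10,000 for validation
batch_size = 64
steps_per_epoch = 100

# Number of batches a single pass over the dataset yields.
total_batches = math.ceil(num_samples / batch_size)
# Epochs that can be completed before the data is exhausted.
full_epochs = total_batches // steps_per_epoch
# Batches still available when the next epoch starts.
leftover_batches = total_batches % steps_per_epoch

print(total_batches, full_epochs, leftover_batches)
```

Seven epochs complete in full, and the eighth starts with only 82 of the 100 required batches, which is where training is interrupted.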

In [21]:
model = get_compiled_model()
# prepare the training dataset using repeat function
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train)).repeat()
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

# Only use 100 batches per epoch (that's 64 * 100 samples)
model.fit(train_dataset, epochs=11, steps_per_epoch = 100)

Epoch 1/11
Epoch 2/11
Epoch 3/11
Epoch 4/11
Epoch 5/11
Epoch 6/11
Epoch 7/11
Epoch 8/11
Epoch 9/11
Epoch 10/11
Epoch 11/11


<keras.callbacks.History at 0x2710c7a7130>
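
The difference between the two runs comes down to repeat(). A tiny sketch with a 10-element range dataset (an illustrative stand-in for MNIST) shows a finite dataset ending after a few batches while a repeated one keeps yielding data:

```python
import tensorflow as tf

# Finite dataset: 10 elements in batches of 4 gives 3 batches, then it is exhausted.
finite = tf.data.Dataset.range(10).batch(4)
num_finite_batches = sum(1 for _ in finite)

# Repeated dataset: it loops forever, so we explicitly take only 5 batches.
infinite = tf.data.Dataset.range(10).repeat().batch(4)
first_five = [batch.numpy().tolist() for batch in infinite.take(5)]

print(num_finite_batches)
print(first_five)
```

Because repeat() is applied before batch(), batches can straddle the boundary between two passes over the data, as the third batch here does.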

### Using a Validation Dataset
- We can pass a Dataset instance as the validation_data argument in fit()
- At the end of each epoch the model will iterate over the validation dataset and compute the validation loss and validation metrics
- If we want to run validation only on a specific number of batches from this dataset, we can pass the validation_steps argument
- This argument specifies how many validation steps the model should run with the validation dataset before interrupting validation and moving on to the next epoch.
- Note that the validation dataset will be reset after each use even with validation_steps argument (so that you will always be evaluating on the same samples from epoch to epoch).
- The argument validation_split (generating a holdout set from the training data) is not supported when training from Dataset objects, since this feature requires the ability to index the samples of the datasets, which is not possible in general with the Dataset API.

In [22]:
model = get_compiled_model()

# Prepare the training dataset
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

# Prepare the validation dataset
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_dataset = val_dataset.batch(64)

model.fit(train_dataset, epochs=1, validation_data=val_dataset)



<keras.callbacks.History at 0x2710c9aa8b0>

In [23]:
model = get_compiled_model()

# Prepare the training dataset
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

# Prepare the validation dataset
val_dataset = tf.data.Dataset.from_tensor_slices((x_val, y_val))
val_dataset = val_dataset.batch(64)

model.fit(
    train_dataset,
    epochs=1,
    # Only run validation using the first 10 batches of the dataset
    # using the `validation_steps` argument
    validation_data=val_dataset,
    validation_steps=10,
)




<keras.callbacks.History at 0x2710c9aa610>
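
Because validation_split is not available for Dataset inputs, a holdout set has to be carved out manually. One possible approach, sketched here on a made-up 10-sample dataset, is to use take() and skip() before shuffling; the sizes and variable names are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 10 samples with 2 features each.
features = np.arange(20, dtype="float32").reshape(10, 2)
labels = np.arange(10)
full_dataset = tf.data.Dataset.from_tensor_slices((features, labels))

# Split before shuffling so the two subsets stay disjoint across epochs.
num_val = 2
val_dataset = full_dataset.take(num_val).batch(2)
train_dataset = full_dataset.skip(num_val).shuffle(buffer_size=8).batch(2)

num_train_batches = sum(1 for _ in train_dataset)
num_val_batches = sum(1 for _ in val_dataset)
print(num_train_batches, num_val_batches)
```

The resulting datasets can then be passed to fit() as the training data and the validation_data argument, in the same way as in the cells above.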

## Other Input Formats Supported