## BigDL-Nano Resnet example on Stanford Dogs dataset
---
This example illustrates how to apply bigdl-nano optimizations on a image recognition case based on Tensorflow Keras framework. The basic image recognition module is pre-trained EfficientNetB0 from tensorflow.keras.applications and fine-tune it on [Stanford Dogs](http://vision.stanford.edu/aditya86/ImageNetDogs/) image recognition dataset. 

In [1]:
import os
from time import time

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.applications import EfficientNetB0
import tensorflow_datasets as tfds

from bigdl.nano.tf.keras import Model

In [2]:
IMG_SIZE=224
BATCH_SIZE=64
DATASET_NAME="stanford_dogs"

### Loading data
---
Here we load data from tensorflow_datasets (hereafter TFDS). Stanford Dogs dataset is provided in TFDS as stanford_dogs. It features 20,580 images that belong to 120 classes of dog breeds (12,000 for training and 8,580 for testing).

In [3]:
(ds_train, ds_test), ds_info=tfds.load(
    DATASET_NAME, data_dir="tensorflow_datasets/" ,split=["train", "train"], with_info=True, as_supervised=True
)
NUM_CLASSES = ds_info.features["label"].num_classes
STEPS = len(ds_train)/BATCH_SIZE

size = (IMG_SIZE, IMG_SIZE)
ds_train = ds_train.map(lambda image, label: (tf.image.resize(image, size), label))
ds_test = ds_test.map(lambda image, label: (tf.image.resize(image, size), label))

2022-05-24 11:20:42.456226: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


### Data augmentation
---
To augment the dataset it can beneficial to make augmenter functions: a function that receives an image (a tf.Tensor) and a label and returns a new augmented image and label. By defining functions for each augmentation operation we can easily attach them to datasets and control when they are evaluated.

In [4]:
def rotate(image, label):
    return (tf.image.rot90(image, tf.random_uniform_initializer(minval=0, maxval=4)(shape=[], dtype=tf.int32)), label)

def flip(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    return (image, label)

def contrast(image, label):
    return (tf.image.random_contrast(image, 0.7, 1.3), label)

def color(image, label):
    image = tf.image.random_hue(image, 0.08)
    image = tf.image.random_saturation(image, 0.6, 1.6)
    image = tf.image.random_brightness(image, 0.05)
    image = tf.image.random_contrast(image, 0.7, 1.3)
    return (image, label)


In [5]:
ds_train = ds_train.repeat(50)
augmentations = [rotate, flip, contrast, color]

for f in augmentations:
    ds_train = ds_train.map(f, num_parallel_calls=4)

ds_train = ds_train.shuffle(1000)

### Prepare inputs
---
Once we verify the input data and augmentation are working correctly, we prepare dataset for training. The input data are resized to uniform IMG_SIZE. The labels are put into one-hot (a.k.a. categorical) encoding. The dataset is batched.

In [6]:
def input_preprocess(image, label):
    label = tf.one_hot(label, NUM_CLASSES)
    return image, label


ds_train = ds_train.map(
    input_preprocess, num_parallel_calls=tf.data.AUTOTUNE
)
ds_train = ds_train.batch(batch_size=BATCH_SIZE, drop_remainder=True)
ds_train = ds_train.prefetch(tf.data.AUTOTUNE)

ds_test = ds_test.map(input_preprocess)
ds_test = ds_test.batch(batch_size=BATCH_SIZE, drop_remainder=True)

### Transfer learning from pre-trained weights
---
Here we initialize the model with pre-trained ImageNet weights, and we fine-tune it on Stanford Dogs dataset.

In [7]:
def build_model(num_classes, learning_rate=1e-2):
    inputs = layers.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
    model = EfficientNetB0(include_top=False, input_tensor=inputs, weights="imagenet")

    # Freeze the pretrained weights
    model.trainable = False

    # Rebuild top
    x = layers.GlobalAveragePooling2D(name="avg_pool")(model.output)
    x = layers.BatchNormalization()(x)

    top_dropout_rate = 0.2
    x = layers.Dropout(top_dropout_rate, name="top_dropout")(x)
    outputs = layers.Dense(NUM_CLASSES, activation="softmax", name="pred")(x)

    # Compile
    model = Model(inputs, outputs, name="EfficientNet")
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )
    return model

def unfreeze_model(model):
    # We unfreeze the top 20 layers while leaving BatchNorm layers frozen
    for layer in model.layers[-20:]:
        if not isinstance(layer, layers.BatchNormalization):
            layer.trainable = True

    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )

### Train
---
Use Model.fit from bigdl.nano.tf.keras for BigDl-Nano tf.keras.

This function override tf.keras.Model.fit to add more parameters.


Additional parameters:
```
        :param num_processes:  when num_processes is not None, it specifies how many sub-processes
                               to launch to run pseudo-distributed training; when num_processes is None,
                               training will run in the current process.
                               
        :param backend: Use backend 'multiprocessing', 'horovod', 'ray', defaults to None.
                        when num_processes is not None, it specifies which backend to use when
                       launching sub-processes to run psedu-distributed training; 
                       when num_processes is None, this parameter takes no effect.
```

### Single process

In [8]:
model_single = build_model(num_classes=NUM_CLASSES)

start = time()
epochs = 25  # @param {type: "slider", min:8, max:80}
model_single.fit(ds_train, epochs=epochs, steps_per_epoch=STEPS, validation_data=ds_test, verbose=1)

unfreeze_model(model_single)

epochs = 10  # @param {type: "slider", min:8, max:50}
model_single.fit(ds_train, epochs=epochs, steps_per_epoch=STEPS,validation_data=ds_test, verbose=1)
fit_time_model_single = time() - start
acc_model_single = model_single.evaluate(ds_test, verbose=1)

model_single.save("EfficientNetB0.h5")

Epoch 1/25


tcmalloc: large alloc 1073741824 bytes == 0x562b4429a000 @  0x7f787ec64d3f 0x7f787ec9b0c0 0x7f787ec9e082 0x7f787ec9e243 0x7f786f2d0402 0x7f786366aeb0 0x7f786368b0b5 0x7f786368e9ea 0x7f786368ef69 0x7f786368f2d1 0x7f7863683ce3 0x7f785ed49051 0x7f785eba438d 0x7f785e937087 0x7f785e93791e 0x7f785e937b1d 0x7f7864ab2ded 0x7f785ed4ad7c 0x7f785ecd16d5 0x7f785ecc524e 0x7f786a6e7941 0x7f7864097953 0x7f78640941f3 0x7f785f426313 0x7f787ebfb609 0x7f787eb20163




tcmalloc: large alloc 2147483648 bytes == 0x562b8c90a000 @  0x7f787ec64d3f 0x7f787ec9b0c0 0x7f787ec9e082 0x7f787ec9e243 0x7f786f2d0402 0x7f786366aeb0 0x7f786368b0b5 0x7f786368e9ea 0x7f786368ef69 0x7f786368f2d1 0x7f7863683ce3 0x7f785ed49051 0x7f785eba438d 0x7f785e937087 0x7f785e93791e 0x7f785e937b1d 0x7f7864917bfd 0x7f785ed4ad7c 0x7f785ecd4cec 0x7f786409776e 0x7f78640941f3 0x7f785f426313 0x7f787ebfb609 0x7f787eb20163


Epoch 2/25
Epoch 3/25
Epoch 4/25
Epoch 5/25
Epoch 6/25
Epoch 7/25
Epoch 8/25
Epoch 9/25
Epoch 10/25
Epoch 11/25
Epoch 12/25
Epoch 13/25
Epoch 14/25
Epoch 15/25
Epoch 16/25
Epoch 17/25
Epoch 18/25
Epoch 19/25
Epoch 20/25
Epoch 21/25
Epoch 22/25
Epoch 23/25
Epoch 24/25
Epoch 25/25
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


  layer_config = serialize_layer_fn(layer)


### Multiple processes

In [9]:
model_multiple = build_model(num_classes=NUM_CLASSES, learning_rate=1e-2)

start = time()
epochs = 25  # @param {type: "slider", min:8, max:80}
model_multiple.fit(ds_train,
                   epochs=epochs,
                   validation_data=ds_test,
                   steps_per_epoch=STEPS,
                   verbose=1,
                   num_processes=2,
                   backend="multiprocessing")

unfreeze_model(model_multiple)

epochs = 10  # @param {type: "slider", min:8, max:50}
model_multiple.fit(ds_train,
                  epochs=epochs,
                   steps_per_epoch=STEPS,
                  validation_data=ds_test,
                  verbose=1,
                  num_processes=2,
                  backend="multiprocessing")
fit_time_model_multiple = time() - start
acc_model_multiple = model_multiple.evaluate(ds_test, verbose=1)

2022-05-24 12:26:08.107422: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.


INFO:tensorflow:Assets written to: /tmp/tmpbx58adkb/temp_model/assets


INFO:tensorflow:Assets written to: /tmp/tmpbx58adkb/temp_model/assets
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_object(obj)
2022-05-24 12:26:26.770996: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-05-24 12:26:26.779921: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> localhost:53537, 1 -> localhost:49401}
2022-05-24 12:26:26.780396: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:427] Started server with target: grpc://localhost:53537
2022-05-24 12:26:26.965525: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (o

Epoch 1/25

2022-05-24 12:27:46.894594: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
2022-05-24 12:27:46.895898: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
2022-05-24 12:27:46.976849: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:27:46.978746: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 2/25


2022-05-24 12:29:07.292133: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:29:07.295382: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 3/25


2022-05-24 12:29:47.137757: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:29:47.144971: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 4/25


2022-05-24 12:30:26.851172: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:30:26.854465: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 5/25


2022-05-24 12:31:06.375512: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:31:06.378575: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 6/25


2022-05-24 12:31:45.911236: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:31:45.932201: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 7/25


2022-05-24 12:32:25.000469: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:32:25.019323: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 8/25


2022-05-24 12:33:04.777924: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:33:04.803973: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 9/25


2022-05-24 12:33:43.921737: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:33:43.933067: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 10/25


2022-05-24 12:34:23.860942: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:34:23.875494: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 11/25


2022-05-24 12:35:03.595517: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:35:03.610060: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 12/25


2022-05-24 12:35:43.164454: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:35:43.168934: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 13/25


2022-05-24 12:36:22.495723: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:36:22.509572: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 14/25


2022-05-24 12:37:02.363104: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:37:02.386006: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 15/25


2022-05-24 12:37:42.498055: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:37:42.499523: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 16/25


2022-05-24 12:38:22.199365: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:38:22.214211: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 17/25


2022-05-24 12:39:01.702932: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:39:01.725192: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 18/25


2022-05-24 12:39:41.674519: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:39:41.690605: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 19/25


2022-05-24 12:40:21.185033: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:40:21.188523: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 20/25


2022-05-24 12:41:01.052651: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:41:01.053407: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 21/25


2022-05-24 12:41:40.513596: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:41:40.516969: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 22/25


2022-05-24 12:42:20.022391: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:42:20.029482: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 23/25


2022-05-24 12:42:59.196272: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:42:59.196881: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 24/25


2022-05-24 12:43:39.127823: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:43:39.147646: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 25/25


2022-05-24 12:44:18.675124: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:44:18.678041: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:44:27.131151: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2022-05-24 12:44:27.176825: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_object(obj)
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_

INFO:tensorflow:Assets written to: /tmp/tmpl_0jbtkn/temp_model/assets


INFO:tensorflow:Assets written to: /tmp/tmpl_0jbtkn/temp_model/assets
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_object(obj)
2022-05-24 12:45:40.500341: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-05-24 12:45:40.508890: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> localhost:44939, 1 -> localhost:43503}
2022-05-24 12:45:40.509376: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:427] Started server with target: grpc://localhost:44939
2022-05-24 12:45:40.713656: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (o

Epoch 1/10

2022-05-24 12:47:10.852660: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
2022-05-24 12:47:10.861563: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
2022-05-24 12:47:10.944256: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:47:10.952850: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 2/10


2022-05-24 12:48:47.973745: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:48:47.975028: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 3/10


2022-05-24 12:49:36.948485: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:49:36.959055: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 4/10


2022-05-24 12:50:25.119863: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:50:25.128540: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 5/10


2022-05-24 12:51:11.940574: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:51:11.952132: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 6/10


2022-05-24 12:51:59.307261: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:51:59.318497: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 7/10


2022-05-24 12:52:46.267516: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:52:46.278612: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 8/10


2022-05-24 12:53:33.812059: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:53:33.816076: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 9/10


2022-05-24 12:54:21.123385: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:54:21.123483: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.


Epoch 10/10


2022-05-24 12:55:08.532962: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:55:08.545862: W tensorflow/core/framework/dataset.cc:744] Input of GeneratorDatasetOp::Dataset will not be optimized because the dataset does not implement the AsGraphDefInternal() method needed to apply optimizations.
2022-05-24 12:55:16.987224: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
2022-05-24 12:55:17.086375: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_object(obj)
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_



### Multiple processes with horovod

In [10]:
model_multiple_horovod = build_model(num_classes=NUM_CLASSES, learning_rate=1e-2)

start = time()
epochs = 25  # @param {type: "slider", min:8, max:80}
model_multiple_horovod.fit(ds_train,
                   epochs=epochs,
                   validation_data=ds_test,
                   steps_per_epoch=STEPS,
                   verbose=1,
                   num_processes=2,
                   backend="horovod")

unfreeze_model(model_multiple_horovod)

epochs = 10  # @param {type: "slider", min:8, max:50}
model_multiple_horovod.fit(ds_train,
                  epochs=epochs,
                   steps_per_epoch=STEPS,
                  validation_data=ds_test,
                  verbose=1,
                  num_processes=2,
                  backend="horovod")
fit_time_model_multiple_horovod = time() - start
acc_model_multiple_horovod = model_multiple_horovod.evaluate(ds_test, verbose=1)

INFO:tensorflow:Assets written to: /tmp/tmpdm0wxo_k/temp_model/assets


INFO:tensorflow:Assets written to: /tmp/tmpdm0wxo_k/temp_model/assets
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_object(obj)
[1]<stderr>:2022-05-24 12:57:24.548910: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
[1]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[0]<stderr>:2022-05-24 12:57:24.555967: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
[0]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[0]<stderr>:2022-05-24 12:57:34.451471: W tensorflow/core/grappler/optimizers/data/auto_

[0]<stdout>:Epoch 1/25


[1]<stderr>:tcmalloc: large alloc 1073741824 bytes == 0x55720a8ec000 @  0x7fca0a3e5d3f 0x7fca0a41c0c0 0x7fca0a41f082 0x7fca0a41f243 0x7fca043d6402 0x7fc9f8770eb0 0x7fc9f87910b5 0x7fc9f87949ea 0x7fc9f8794f69 0x7fc9f87952d1 0x7fc9f8789ce3 0x7fc9f3e4f051 0x7fc9f3caa38d 0x7fc9f3a3d087 0x7fc9f3a3d91e 0x7fc9f3a3db1d 0x7fc9ff42ebf5 0x7fc9f3e50d7c 0x7fc9f3ddacec 0x7fc9f919d76e 0x7fc9f919a1f3 0x7fc9f452c313 0x7fca0a37c609 0x7fca0a2a1163
[0]<stderr>:tcmalloc: large alloc 1073741824 bytes == 0x55c902706000 @  0x7f9b6a87bd3f 0x7f9b6a8b20c0 0x7f9b6a8b5082 0x7f9b6a8b5243 0x7f9b6486c402 0x7f9b58c06eb0 0x7f9b58c270b5 0x7f9b58c2a9ea 0x7f9b58c2af69 0x7f9b58c2b2d1 0x7f9b58c1fce3 0x7f9b542e5051 0x7f9b5414116a 0x7f9b5e743941 0x7f9b5e744061 0x7f9b5fbb8a1a 0x7f9b54263dbf 0x7f9b5fc83941 0x7f9b59633953 0x7f9b596301f3 0x7f9b549c2313 0x7f9b6a812609 0x7f9b6a737163


[0]<stdout>:Epoch 2/25
[0]<stdout>:Epoch 3/25
[0]<stdout>:Epoch 4/25
[0]<stdout>:Epoch 5/25
[0]<stdout>:Epoch 6/25
[0]<stdout>:Epoch 7/25
[0]<stdout>:Epoch 8/25
[0]<stdout>:Epoch 9/25

[0]<stderr>:tcmalloc: large alloc 2147483648 bytes == 0x55c94a400000 @  0x7f9b6a87bd3f 0x7f9b6a8b20c0 0x7f9b6a8b5082 0x7f9b6a8b5243 0x7f9b6486c402 0x7f9b58c06eb0 0x7f9b58c270b5 0x7f9b58c2a9ea 0x7f9b58c2af69 0x7f9b58c2b2d1 0x7f9b58c1fce3 0x7f9b542e5051 0x7f9b5414038d 0x7f9b53ed3087 0x7f9b53ed391e 0x7f9b53ed3b1d 0x7f9b59eb3bfd 0x7f9b542e6d7c 0x7f9b54270cec 0x7f9b5963376e 0x7f9b596301f3 0x7f9b549c2313 0x7f9b6a812609 0x7f9b6a737163


[0]<stdout>:Epoch 10/25
[0]<stdout>:Epoch 11/25

[1]<stderr>:tcmalloc: large alloc 2147483648 bytes == 0x5572528a4000 @  0x7fca0a3e5d3f 0x7fca0a41c0c0 0x7fca0a41f082 0x7fca0a41f243 0x7fca043d6402 0x7fc9f8770eb0 0x7fc9f87910b5 0x7fc9f87949ea 0x7fc9f8794f69 0x7fc9f87952d1 0x7fc9f8789ce3 0x7fc9f3e4f051 0x7fc9f3caa38d 0x7fc9f3a3d087 0x7fc9f3a3d91e 0x7fc9f3a3db1d 0x7fc9f9a1dbfd 0x7fc9f3e50d7c 0x7fc9f3ddacec 0x7fc9f919d76e 0x7fc9f919a1f3 0x7fc9f452c313 0x7fca0a37c609 0x7fca0a2a1163


[0]<stdout>:Epoch 12/25
[0]<stdout>:Epoch 13/25
[0]<stdout>:Epoch 14/25
[0]<stdout>:Epoch 15/25
[0]<stdout>:Epoch 16/25
[0]<stdout>:Epoch 17/25
[0]<stdout>:Epoch 18/25
[0]<stdout>:Epoch 19/25
[0]<stdout>:Epoch 20/25
[0]<stdout>:Epoch 21/25
[0]<stdout>:Epoch 22/25
[0]<stdout>:Epoch 23/25
[0]<stdout>:Epoch 24/25
[0]<stdout>:Epoch 25/25


[0]<stderr>:2022-05-24 13:28:35.322160: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
[1]<stderr>:2022-05-24 13:28:39.869799: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
[0]<stderr>:  layer_config = serialize_layer_fn(layer)
[0]<stderr>:  return generic_utils.serialize_keras_object(obj)
[1]<stderr>:  layer_config = serialize_layer_fn(layer)
[1]<stderr>:  return generic_utils.serialize_keras_object(obj)


INFO:tensorflow:Assets written to: /tmp/tmpgj57atm9/temp_model/assets


INFO:tensorflow:Assets written to: /tmp/tmpgj57atm9/temp_model/assets
  layer_config = serialize_layer_fn(layer)
  return generic_utils.serialize_keras_object(obj)
[0]<stderr>:2022-05-24 13:29:56.413279: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
[0]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[1]<stderr>:2022-05-24 13:29:56.427813: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
[1]<stderr>:To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
[0]<stderr>:2022-05-24 13:30:05.584748: W tensorflow/core/grappler/optimizers/data/auto_

[0]<stdout>:Epoch 1/10


[1]<stderr>:2022-05-24 13:30:06.323721: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
[1]<stderr>:2022-05-24 13:30:06.341739: W tensorflow/core/grappler/optimizers/data/auto_shard.cc:537] The `assert_cardinality` transformation is currently not handled by the auto-shard rewrite and will be removed.
[0]<stderr>:tcmalloc: large alloc 1073741824 bytes == 0x55de250f2000 @  0x7f745b3c7d3f 0x7f745b3fe0c0 0x7f745b401082 0x7f745b401243 0x7f74553b8402 0x7f7449752eb0 0x7f74497730b5 0x7f74497769ea 0x7f7449776f69 0x7f74497772d1 0x7f744976bce3 0x7f7444e31051 0x7f7444c8c38d 0x7f7444a1f087 0x7f7444a1f91e 0x7f7444a1fb1d 0x7f744dc6f8d2 0x7f744dc7fbda 0x7f7444e32d7c 0x7f7444dbccec 0x7f744a17f76e 0x7f744a17c1f3 0x7f744550e313 0x7f745b35e609 0x7f745b283163
[1]<stderr>:tcmalloc: large alloc 1073741824 bytes == 0x55b0f06d6000 @  0x7f5ac8406d3f 0x7f5ac843d0c0 0x7f5ac8440082 0x7f5ac8

[0]<stdout>:Epoch 2/10
[0]<stdout>:Epoch 3/10
[0]<stdout>:Epoch 4/10
[0]<stdout>:Epoch 5/10

[0]<stderr>:tcmalloc: large alloc 2147483648 bytes == 0x55de6ccce000 @  0x7f745b3c7d3f 0x7f745b3fe0c0 0x7f745b401082 0x7f745b401243 0x7f74553b8402 0x7f7449752eb0 0x7f74497730b5 0x7f74497769ea 0x7f7449776f69 0x7f74497772d1 0x7f744976bce3 0x7f7444e31051 0x7f7444c8c38d 0x7f7444a1f087 0x7f7444a1f91e 0x7f7444a1fb1d 0x7f744a9ffbfd 0x7f7444e32d7c 0x7f7444dbccec 0x7f744a17f76e 0x7f744a17c1f3 0x7f744550e313 0x7f745b35e609 0x7f745b283163


[0]<stdout>:Epoch 6/10
[0]<stdout>:Epoch 7/10
[0]<stdout>:Epoch 8/10
[0]<stdout>:Epoch 9/10
[0]<stdout>:Epoch 10/10


[0]<stderr>:2022-05-24 13:42:05.376975: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
[1]<stderr>:2022-05-24 13:42:06.630141: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
[0]<stderr>:  layer_config = serialize_layer_fn(layer)
[0]<stderr>:  return generic_utils.serialize_keras_object(obj)
[1]<stderr>:  layer_config = serialize_layer_fn(layer)
[1]<stderr>:  return generic_utils.serialize_keras_object(obj)




In [11]:
template = """
|        Precision     | Fit Time(s)       | Accuracy(%) |
|         Single       |       {:5.2f}       |    {:5.2f}    |
|        Multiple      |       {:5.2f}       |    {:5.2f}    |
| Multiple With Horovod|       {:5.2f}       |    {:5.2f}    |
"""
summary = template.format(
    fit_time_model_single, acc_model_single[1] * 100,
    fit_time_model_multiple, acc_model_multiple[1] * 100,
    fit_time_model_multiple_horovod, acc_model_multiple_horovod[1] * 100
)
print(summary)


|        Precision     | Fit Time(s)       | Accuracy(%) |
|         Single       |       3861.91       |    96.60    |
|        Multiple      |       1802.21       |    96.18    |
| Multiple With Horovod|       2760.52       |    95.92    |

