## Setup

This section contains supplementary information, defines helper functions, and installs the required packages.

In [0]:
!pip install tensorflow-gpu==2.0 tensorflow_datasets gpustat -Uq

**About**

<img src="https://upload.wikimedia.org/wikipedia/en/thumb/6/6d/Nvidia_image_logo.svg/200px-Nvidia_image_logo.svg.png" width="90px" align="right" style="margin-right: 0px;">

This notebook was put together by Timothy Liu (`timothyl@nvidia.com`) for the [**PyCon SG**](https://pycon.sg/) 2019 tutorial on [**Improving Deep Learning Performance in TensorFlow**](https://github.com/NVAITC/pycon-sg19-tensorflow-tutorial).

**Acknowledgements**

* This notebook uses some materials adapted from TensorFlow documentation.
* This notebook uses the [Oxford-IIIT Pet Dataset](http://www.robots.ox.ac.uk/~vgg/data/pets/) ([TensorFlow Datasets page](https://www.tensorflow.org/datasets/catalog/oxford_iiit_pet)).

**Dataset Citation**

```
@InProceedings{parkhi12a,
  author       = "Parkhi, O. M. and Vedaldi, A. and Zisserman, A. and Jawahar, C.~V.",
  title        = "Cats and Dogs",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition",
  year         = "2012",
}
```

In [0]:
import multiprocessing

import tensorflow
print("TensorFlow version:", tensorflow.__version__)

import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds

TensorFlow version: 2.0.0


In [0]:
import time

class TimeHistory(tf.keras.callbacks.Callback):
    """Record the wall-clock duration of each training epoch."""
    # logs defaults to None to avoid a shared mutable default argument
    def on_train_begin(self, logs=None):
        self.times = []
    def on_epoch_begin(self, epoch, logs=None):
        self.epoch_time_start = time.time()
    def on_epoch_end(self, epoch, logs=None):
        self.times.append(time.time() - self.epoch_time_start)
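
The timing logic of the callback can be tried without TensorFlow. The sketch below is not part of the original notebook: `EpochTimer` and `run_fake_training` are hypothetical names, and `time.sleep` stands in for the real training work. It only shows the order in which the hooks fire and what ends up in `times`.

```python
import time

class EpochTimer:
    """Framework-free stand-in for the callback hooks (hypothetical name)."""
    def on_train_begin(self):
        self.times = []
    def on_epoch_begin(self):
        self.start = time.perf_counter()
    def on_epoch_end(self):
        self.times.append(time.perf_counter() - self.start)

def run_fake_training(timer, epochs=3, seconds_per_epoch=0.01):
    """Call the hooks in the same order Keras would during fit()."""
    timer.on_train_begin()
    for _ in range(epochs):
        timer.on_epoch_begin()
        time.sleep(seconds_per_epoch)  # placeholder for one epoch of training
        timer.on_epoch_end()
    return timer.times

times = run_fake_training(EpochTimer())
print(len(times), "epochs timed; fastest took", min(times), "seconds")
```

One duration is stored per epoch, so `min(times)` later gives the fastest epoch.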

# Pets Classification with TensorFlow

In [0]:
!gpustat

[1m[37mjupyter-admin      [m  Fri Oct 11 16:56:31 2019  [1m[30m410.104[m
[36m[0][m [34mTesla T4        [m |[1m[31m 66'C[m, [32m  0 %[m | [36m[1m[33m    0[m / [33m15079[m MB |


In [0]:
# enable XLA
tf.config.optimizer.set_jit(True)

# enable AMP
tf.keras.mixed_precision.experimental.set_policy('mixed_float16')

In [0]:
import tensorflow.keras.layers as layers
from tensorflow.keras.applications.resnet50 import ResNet50

def create_model(img_size=(224,224), num_class=2, train_base=True):
    # accept float16 image inputs
    input_layer = layers.Input(shape=(img_size[0],img_size[1],3), dtype=tf.float16)
    base = ResNet50(input_tensor=input_layer,
                    include_top=False,
                    weights="imagenet")
    base.trainable = train_base
    x = base.output
    x = layers.GlobalAveragePooling2D()(x)
    # softmax only accepts float32 here, so set the layer dtype explicitly
    # (likely a bug; a float32 output is also numerically safer than float16)
    preds = layers.Dense(num_class, activation="softmax", dtype=tf.float32)(x)
    return tf.keras.models.Model(inputs=input_layer, outputs=preds)
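
Whether or not the float32 requirement is a bug, float16 has a narrow range and coarse precision, which is a common reason to keep a model's final outputs in float32. The NumPy sketch below is an illustration added here, not something taken from the original notebook; the value `0.0004` is an arbitrary choice that falls below float16's resolution near 1.0.

```python
import numpy as np

# Largest finite value that float16 can represent.
f16_max = float(np.finfo(np.float16).max)

# Adding a small increment to 1.0 is lost in float16 but kept in float32.
small = 0.0004  # arbitrary value, smaller than half the float16 spacing near 1.0
sum16 = np.float16(1.0) + np.float16(small)
sum32 = np.float32(1.0) + np.float32(small)

print("float16 max:", f16_max)
print("float16 keeps the increment:", bool(sum16 > np.float16(1.0)))
print("float32 keeps the increment:", bool(sum32 > np.float32(1.0)))
```

Small probabilities and their sums are exactly the kind of values that suffer from this rounding.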

In [0]:
(train_dataset, test_dataset), info = tfds.load(name="oxford_iiit_pet:3.*.*",
                                                split=["train", "test"],
                                                shuffle_files=True,
                                                as_supervised=True,
                                                with_info=True)

num_class = info.features["label"].num_classes
num_train = info.splits["train"].num_examples
num_test  = info.splits["test"].num_examples

In [0]:
IMG_SIZE = (224, 224)

@tf.function
def format_train_example(image, label):
    image = tf.cast(image, tf.float32)
    image = (image/127.5) - 1
    image = tf.image.resize(image, IMG_SIZE)
    # perform image augmentation with tf.image
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, 0.1)
    # return images as float16
    image = tf.cast(image, tf.float16)
    return image, tf.one_hot(label, num_class)

@tf.function
def format_eval_example(image, label):
    image = tf.cast(image, tf.float32)
    image = (image/127.5) - 1
    image = tf.image.resize(image, IMG_SIZE)
    # return images as float16
    image = tf.cast(image, tf.float16)
    return image, tf.one_hot(label, num_class)
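
Both formatting functions rescale pixels with `(image/127.5) - 1`. As a quick check of what that does, here is a plain-Python sketch (the helper name `scale_pixel` is hypothetical and not in the notebook) applied to the ends and the midpoint of the 8-bit range.

```python
def scale_pixel(value):
    """Map an 8-bit pixel value from [0, 255] to [-1, 1]."""
    return (value / 127.5) - 1

lowest = scale_pixel(0)
middle = scale_pixel(127.5)
highest = scale_pixel(255)
print(lowest, middle, highest)  # → -1.0 0.0 1.0
```

In the training path, the brightness augmentation with a maximum delta of 0.1 is applied after this rescaling, so some training pixels may fall slightly outside [-1, 1].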

In [0]:
BATCH_SIZE = 80
N_THREADS = multiprocessing.cpu_count()
PREFETCH_COUNT = 8

train_dataset = train_dataset.shuffle(1024)
train_dataset = train_dataset.repeat(-1)
train_dataset = train_dataset.map(format_train_example,
                                  num_parallel_calls=N_THREADS)
train_dataset = train_dataset.batch(BATCH_SIZE)
train_dataset = train_dataset.prefetch(PREFETCH_COUNT)

In [0]:
test_dataset = test_dataset.map(format_eval_example,
                                num_parallel_calls=N_THREADS)
test_dataset = test_dataset.repeat(-1)
test_dataset = test_dataset.batch(BATCH_SIZE)

In [0]:
model = create_model(IMG_SIZE, num_class, train_base=True)
opt = tf.keras.optimizers.Adam()

model.compile(loss="categorical_crossentropy",
              optimizer=opt,
              metrics=["acc"])

#model.summary()

In [0]:
steps_per_epoch = num_train//BATCH_SIZE
steps_test = num_test//BATCH_SIZE

time_callback = TimeHistory()
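
Because both datasets repeat indefinitely, the step counts are what define an epoch. The arithmetic can be checked by hand; the split sizes below are assumptions taken from the TFDS catalog entry (3,680 training and 3,669 test images), and the training figure is consistent with the "Train for 46 steps" line in the training output.

```python
BATCH_SIZE = 80
num_train = 3680  # assumed split size from the TFDS catalog
num_test = 3669   # assumed split size from the TFDS catalog

# divmod returns the number of full batches and the images left over.
steps_per_epoch, train_leftover = divmod(num_train, BATCH_SIZE)
steps_test, test_leftover = divmod(num_test, BATCH_SIZE)

print(steps_per_epoch, train_leftover)  # → 46 0
print(steps_test, test_leftover)        # → 45 69
```

Under these assumptions the training split divides evenly, while floor division leaves 69 test images outside the 45 evaluation steps.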

In [0]:
model.fit(train_dataset, steps_per_epoch=steps_per_epoch,
          epochs=5, callbacks=[time_callback], verbose=1)

Train for 46 steps
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x7f6723e10f98>

In [0]:
# There is currently a bug with model.evaluate()
# Follow: https://github.com/tensorflow/tensorflow/issues/33090

#model.evaluate(train_dataset, steps=steps_per_epoch)

In [0]:
epoch_time = min(time_callback.times)
img_per_sec = num_train//epoch_time

print("Peak Img/s:", img_per_sec)

Peak Img/s: 200.0
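
The peak figure uses only the fastest epoch, and the `//` operator floors it to a whole number. A rough sketch of a complementary view follows, using made-up epoch times (the list below is hypothetical, not measured) and an assumed 3,680 images per epoch. The first epoch is excluded from the average because it typically includes one-off costs such as graph tracing and XLA compilation.

```python
num_images = 3680  # assumed images processed per epoch (46 steps x 80)
epoch_times = [61.5, 19.2, 18.6, 18.4, 18.5]  # hypothetical seconds per epoch

# Peak throughput: the fastest epoch, as in the notebook (without flooring).
peak = num_images / min(epoch_times)

# Steady-state throughput: average over every epoch after the first.
steady = epoch_times[1:]
mean_steady = num_images * len(steady) / sum(steady)

# Throughput of the first epoch alone, for comparison.
first = num_images / epoch_times[0]

print("Peak img/s:", round(peak, 1))
print("Steady-state img/s:", round(mean_steady, 1))
print("First-epoch img/s:", round(first, 1))
```

Reporting the steady-state value next to the peak makes it easier to tell a single lucky epoch from sustained performance.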
