# MultiWorker Training

This tutorial demonstrates multi-worker distributed training of a Keras model using the tf.distribute.Strategy API. With strategies designed specifically for multi-worker training, a Keras model written to run on a single worker can run on multiple workers with minimal code changes.

For an overview of the distribution strategies TensorFlow supports, and for a deeper understanding of the tf.distribute.Strategy APIs, see the Distributed Training in TensorFlow guide.

In [2]:
from __future__ import absolute_import, division, print_function, unicode_literals
import tensorflow_datasets as tfds
import tensorflow as tf
tfds.disable_progress_bar()
tf.__version__

'2.0.0'

In [3]:
gpus = tf.config.experimental.list_physical_devices('GPU')
print(gpus)
# Limit GPU memory (memory_limit is given in MB)
MEMORY_LIMIT_CONFIG = [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=5120)]
print(MEMORY_LIMIT_CONFIG)
tf.config.experimental.set_virtual_device_configuration(gpus[0], MEMORY_LIMIT_CONFIG)

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
[VirtualDeviceConfiguration(memory_limit=5120)]


In [4]:
BUFFER_SIZE = 10000
BATCH_SIZE = 64

def make_datasets_unbatched():
    # Scale MNIST pixel values from [0, 255] to [0., 1.]
    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255
        return image, label

    datasets, info = tfds.load(name='mnist',
                            with_info=True,
                            as_supervised=True)

    return datasets['train'].map(scale).cache().shuffle(BUFFER_SIZE)

train_datasets = make_datasets_unbatched().batch(BATCH_SIZE)
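
The pipeline above shuffles with a buffer of `BUFFER_SIZE = 10000` elements, which is smaller than the MNIST training split. The following is a minimal pure-Python sketch of how a fixed-size shuffle buffer behaves conceptually. It is not TensorFlow's implementation, and `buffered_shuffle` is a hypothetical helper written only for this illustration.

```python
import random


def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in pseudo-random order using a fixed-size buffer.

    Conceptual model only (hypothetical helper, not the tf.data code):
    fill the buffer, then repeatedly emit a random buffered element and
    replace it with the next incoming element.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random slot and refill it with the new item.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left, in random order.
    rng.shuffle(buffer)
    yield from buffer


shuffled = list(buffered_shuffle(range(20), buffer_size=5, seed=0))
print(shuffled)
```

In this model the k-th output can only come from the first `k + buffer_size` inputs, so with 60,000 examples and a buffer of 10,000 the shuffle is only approximate. A buffer at least as large as the dataset would be needed for a uniform shuffle.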

In [5]:
img_batch, label_batch = next(iter(train_datasets))
img_batch.shape, label_batch.shape

(TensorShape([64, 28, 28, 1]), TensorShape([64]))
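
The leading dimension of 64 in the shapes above is `BATCH_SIZE`. As a quick arithmetic check, the sketch below estimates how many batches one pass over the data produces. It assumes the standard MNIST training split of 60,000 examples and that `batch()` keeps the final partial batch (its default behavior); `NUM_TRAIN_EXAMPLES` is a name introduced here for illustration only.

```python
import math

NUM_TRAIN_EXAMPLES = 60000  # assumption: size of the standard MNIST train split
BATCH_SIZE = 64             # same value as in the cell above

# batch() keeps the final partial batch by default, so round up.
num_batches = math.ceil(NUM_TRAIN_EXAMPLES / BATCH_SIZE)

# Size of the last, smaller batch.
last_batch_size = NUM_TRAIN_EXAMPLES - (num_batches - 1) * BATCH_SIZE

print(num_batches, last_batch_size)
```

Under these assumptions an epoch has 938 batches, and the last one holds 32 examples rather than 64.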

## Build the Keras model