In this example:

* We first load and preprocess the MNIST dataset.
* We define a simple neural network model using TensorFlow's Keras API.
* We create a tf.distribute.MirroredStrategy object, which mirrors variables across multiple GPUs.
* We create and compile the model within the strategy's scope, so that its variables are created as mirrored copies on every available GPU.
* We create a TensorFlow dataset from the input data and shuffle and batch it.
* We define a training step function that computes gradients with tf.GradientTape and applies them with the optimizer.
* We iterate over the dataset and perform distributed training using strategy.experimental_run_v2 (renamed to strategy.run in TensorFlow 2.2).
* The training loop runs for a specified number of epochs, printing the loss after each epoch.
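* By using tf.distribute.MirroredStrategy, TensorFlow splits each batch across the available GPUs and keeps their variables in sync, which can shorten training time compared to a single GPU. For a model as small as this one, the gain may be limited by the cost of synchronizing gradients between devices.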
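
The steps listed above can be sketched as follows. This is a minimal sketch, not necessarily the exact code the list describes. It assumes TensorFlow 2.2 or later, where strategy.run is available. To run without a download it uses random arrays with MNIST's shapes as stand-in data. The names GLOBAL_BATCH_SIZE, train_step, and distributed_train_step are illustrative choices.

```python
import numpy as np
import tensorflow as tf

# Stand-in data with MNIST's shapes (assumption: swap in the real x_train and
# y_train arrays from datasets.mnist.load_data() for actual training).
rng = np.random.default_rng(0)
x_train = rng.random((512, 28, 28, 1), dtype=np.float32)
y_train = tf.keras.utils.to_categorical(rng.integers(0, 10, size=512), num_classes=10)

# Mirror variables across all visible GPUs (a single replica if there are none).
strategy = tf.distribute.MirroredStrategy()
GLOBAL_BATCH_SIZE = 64 * strategy.num_replicas_in_sync

# Variables created inside the scope are mirrored on every replica.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    optimizer = tf.keras.optimizers.Adam()
    # Keep per-example losses so they can be averaged over the global batch.
    loss_fn = tf.keras.losses.CategoricalCrossentropy(reduction="none")

# Shuffle, batch, and let the strategy split each batch across replicas.
dataset = (
    tf.data.Dataset.from_tensor_slices((x_train, y_train))
    .shuffle(512)
    .batch(GLOBAL_BATCH_SIZE, drop_remainder=True)
)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

def train_step(inputs):
    """Run one forward/backward pass on a single replica."""
    images, labels = inputs
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        per_example_loss = loss_fn(labels, predictions)
        # Scale by the global batch size so summing over replicas gives the mean.
        loss = tf.nn.compute_average_loss(
            per_example_loss, global_batch_size=GLOBAL_BATCH_SIZE
        )
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss

@tf.function
def distributed_train_step(inputs):
    """Run train_step on every replica and sum the per-replica losses."""
    per_replica_losses = strategy.run(train_step, args=(inputs,))
    return strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica_losses, axis=None)

EPOCHS = 2
epoch_losses = []
for epoch in range(EPOCHS):
    total_loss, num_batches = 0.0, 0
    for batch in dist_dataset:
        total_loss += float(distributed_train_step(batch))
        num_batches += 1
    epoch_losses.append(total_loss / num_batches)
    print(f"Epoch {epoch + 1}: loss = {epoch_losses[-1]:.4f}")
```

On a machine without GPUs the strategy runs with one replica, so the same code executes unchanged. Because the stand-in data is random, the printed loss only confirms that the loop runs; it says nothing about model quality.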

In [6]:
import tensorflow as tf
from tensorflow.keras import layers, models, datasets

# Load and preprocess the MNIST dataset
(x_train, y_train), _ = datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
y_train = tf.keras.utils.to_categorical(y_train, num_classes=10)

# Create a simple neural network model
def create_model():
    model = models.Sequential([
        layers.Flatten(input_shape=(28, 28, 1)),
        layers.Dense(256, activation='relu'),
        layers.Dense(10, activation='softmax')
    ])
    return model

# Define the model
model = create_model()

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=5, batch_size=64)

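# Evaluate the model on the training data (the test split was discarded above,
# so this is training accuracy, not held-out accuracy)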
_, accuracy = model.evaluate(x_train, y_train)
print('Accuracy:', accuracy)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Accuracy: 0.9924833178520203
